On Sun, Apr 30, 2017 at 06:09:12AM -0400, Stephen P. Molnar wrote:

[...]
> I would have managed to extract input data from another calculation (not
> a Python program) into the following text file.
>
> LOEWDIN ATOMIC CHARGES
> ----------------------
> 0 C : -0.780631
> 1 H : 0.114577
> 2 Br: 0.309802
> 3 Cl: 0.357316
> 4 F : -0.001065
>
> What I need to do is extract the floating point numbers into a Python file

I don't quite understand your question, but I'll take a guess. I'm going
to assume you have a TEXT file containing literally this text:

# ---- cut here ----
LOEWDIN ATOMIC CHARGES
----------------------
0 C : -0.780631
1 H : 0.114577
2 Br: 0.309802
3 Cl: 0.357316
4 F : -0.001065
# ---- cut here ----

and you want to extract the atomic symbols (C, H, Br, Cl, F) and charges
as floats. For the sake of the exercise, I'll extract them into a
dictionary {'C': -0.780631, 'H': 0.114577, ... } then print them.

Let me start by preparing the text file. Of course I could just use a
text editor, but let's do it with Python:

data = """LOEWDIN ATOMIC CHARGES
----------------------
0 C : -0.780631
1 H : 0.114577
2 Br: 0.309802
3 Cl: 0.357316
4 F : -0.001065
"""

filename = 'datafile.txt'
with open(filename, 'w') as f:
    f.write(data)

(Of course, in real life, it is silly to put your text into Python just
to write it out to a file so you can read it back in. But as a
programming exercise, it's fine.)

Now let's re-read the file, processing each line, and extract the data
we want.

atomic_charges = {}
filename = 'datafile.txt'
with open(filename, 'r') as f:
    # Skip lines until we reach a line made of nothing but ---
    for line in f:
        line = line.strip()  # ignore leading and trailing whitespace
        if set(line) == set('-'):
            break
    # Continue reading lines from where we last got to.
    for line in f:
        line = line.strip()
        if line == '':
            # Skip blank lines.
            continue
        # We expect lines to look like:
        #     1 C : 0.12345
        # where there may or may not be a space between the
        # letter and the colon. That makes it tricky to process,
        # so let's force there to always be at least one space.
        line = line.replace(':', ' :')
        # Split on whitespace into exactly four fields.
        try:
            index, symbol, colon, charge = line.split()
        except ValueError as err:
            print("failed to process line:", line)
            print(err)
            continue  # skip to the next line
        assert colon == ':', 'expected a colon but found something else'
        try:
            charge = float(charge)
        except ValueError:
            # We expected a numeric string like -0.234 or 0.123, but got
            # something else. We could skip this line, or replace it
            # with an out-of-bounds value. I'm going to use an IEEE-754
            # "Not A Number" value as the out-of-bounds value.
            charge = float("NaN")
        atomic_charges[symbol] = charge

# Finished! Let's see what we have:
for sym in sorted(atomic_charges):
    print(sym, atomic_charges[sym])

There may be more efficient ways to process the lines, for example by
using a regular expression. But it's late, and I'm too tired to go
messing about with regular expressions now. Perhaps somebody else will
suggest one.

--
Steve
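
For what it's worth, a regular-expression version might look something
like the sketch below. It assumes every data line has the same layout as
the sample (an integer index, a one- or two-letter symbol, a colon, then
a plain signed decimal with no exponent). The pattern LINE_RE and the
function name parse_charges are made up for illustration, and it has
only been checked against the five sample lines.

```python
import re

# Assumed layout: index, element symbol, optional spaces, colon, signed decimal.
# Exponent notation such as 1.0e-3 is deliberately NOT handled here.
LINE_RE = re.compile(r'^\s*\d+\s+([A-Z][a-z]?)\s*:\s*([-+]?\d*\.?\d+)\s*$')


def parse_charges(text):
    """Return {symbol: charge} for every line of text matching LINE_RE."""
    charges = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            # The header, the row of dashes and blank lines simply
            # fail to match, so they are skipped without special cases.
            symbol, value = match.groups()
            charges[symbol] = float(value)
    return charges


sample = """LOEWDIN ATOMIC CHARGES
----------------------
0 C : -0.780631
1 H : 0.114577
2 Br: 0.309802
3 Cl: 0.357316
4 F : -0.001065
"""

charges = parse_charges(sample)
for sym in sorted(charges):
    print(sym, charges[sym])
```

Lines that don't fit the pattern are silently ignored rather than
reported, which may or may not be what you want. As with the dictionary
approach, a symbol that appears more than once overwrites the earlier
entry.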