On 23/10/19 8:51 PM, joseph pareti wrote:
I am experimenting with this (reproducer) code:
pattern_eur = ['Total amount']
mylines = [] # Declare an empty list.
with open ('tmp0.txt', 'rt') as myfile: # Open tmp.txt for reading text.
    for myline in myfile: # For each line in the file,
        mylines.append(myline.rstrip('\n')) # strip newline and add to list.
for element in mylines: # For each element in the list,
    match_C = re.search(pattern_eur, element)
    if match_C:
        element = element + 2
        print(element)
--------------
the input file being:
$ cat tmp0.txt
line 0
line 1
Total amount
50000.00
linex
...
My intent is to locate the line containing "Total amount", skip the next
line, then print the eur value. The program terminates as follows:
...
Thanks for any insights --
The first observation is that the two for loops are essentially
identical, so why not condense?
However, what is described may be calling for a solution called "a
finite state machine":
state 1: ignore unwanted data, until "Total amount" is found
state 2: skip blank line
state 3: grab the Euro value, and return to state 1
Being a simple-boy, I would avoid any reg-ex, because:
myline[ :12 ] == "Total amount"
is easier (and faster). Similarly, there is no need for rstrip-ping
except at "state 3" (unless there are particular rules for the
formatting of the total).
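
To make the three states concrete, here is a minimal sketch rather than a
definitive implementation. The function name extract_totals, the state
names, and the in-memory sample are invented for illustration. It assumes
exactly one line separates the "Total amount" line from the value, as the
description implies (the sample listing in the question does not show that
line), and it uses str.startswith() in place of the slice comparison.

```python
import io

TOTAL_MARKER = "Total amount"  # marker text taken from the question


def extract_totals(lines):
    """Collect the value line that follows each 'Total amount' marker.

    `lines` is any iterable of text lines, such as an open file.
    """
    SEEKING, SKIPPING, GRABBING = 1, 2, 3  # states 1, 2 and 3
    state = SEEKING
    totals = []
    for line in lines:
        if state == SEEKING:
            # state 1: ignore everything until the marker turns up
            if line.startswith(TOTAL_MARKER):
                state = SKIPPING
        elif state == SKIPPING:
            # state 2: assumption - exactly one line to skip
            state = GRABBING
        else:
            # state 3: only here does the newline need stripping
            totals.append(line.rstrip("\n"))
            state = SEEKING
    return totals


# Hypothetical sample data, with one blank line after the marker.
sample = io.StringIO("line 0\nline 1\nTotal amount\n\n50000.00\nlinex\n")
print(extract_totals(sample))
```

Passing an open file works the same way, eg extract_totals(myfile) inside
the with-block, which also folds the two loops into one.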
Another thought is that the problem is being visualised as a series of
lines, and this may complicate things. If instead a "buffer", or indeed
the entire file, were read in one go (which is effectively what the
current code does, per the first comment above), the str.find() method
could be employed (replacing "state 1"), and then (implicit assumption
about spacing here) "state 2" becomes a matter of moving a few characters
'along' before grabbing the total; rinse and repeat...
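
A rough sketch of that whole-text idea follows, with the function name
find_totals and the sample string made up for illustration. Instead of
stepping a fixed number of characters 'along', it assumes the total is the
first whitespace-delimited token after the marker, which sidesteps the
spacing question but is an assumption about the data all the same.

```python
MARKER = "Total amount"  # marker text taken from the question


def find_totals(text):
    """Return the first token after each occurrence of MARKER in text."""
    totals = []
    pos = text.find(MARKER)  # replaces "state 1"
    while pos != -1:
        after_marker = pos + len(MARKER)
        # split(None, 1) skips leading whitespace (including blank lines)
        # and isolates the next token; assumption: that token is the total.
        fields = text[after_marker:].split(None, 1)
        if fields:
            totals.append(fields[0])
        pos = text.find(MARKER, after_marker)  # rinse and repeat
    return totals


# Hypothetical sample; a real run might use open('tmp0.txt').read()
sample_text = "line 0\nline 1\nTotal amount\n\n50000.00\nlinex\n"
print(find_totals(sample_text))
```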
Web-ref:
https://en.wikipedia.org/wiki/Finite-state_machine
--
Regards =dn