Re: Simple Text Processing Help

2007-10-17 Thread Tim Roberts
[EMAIL PROTECTED] wrote: And now for something completely different... I've been reading up a bit about Python and Excel and I quickly told the program to output to Excel quite easily. However, what if the input file were a Word document? I can't seem to find much information about parsing

Re: Simple Text Processing Help

2007-10-16 Thread Peter Otten
patrick.waldo wrote: manipulation? Also, I conceptually get it, but would you mind walking me through

    for key, group in groupby(instream, unicode.isspace):
        if not key:
            yield "".join(group)

itertools.groupby() splits a sequence into groups with the same key; e.g. to
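A minimal sketch of the groupby() idiom quoted above, written for Python 3, where `str.isspace` takes the place of `unicode.isspace`. The function name `records` and the sample lines are assumptions made for illustration; they are not taken from the thread.

```python
from itertools import groupby


def records(instream):
    """Yield each run of consecutive non-blank lines as one joined string."""
    # groupby() starts a new group whenever the key changes. With
    # str.isspace as the key, runs of blank lines get key True and
    # runs of non-blank lines get key False.
    for key, group in groupby(instream, str.isspace):
        if not key:
            yield "".join(group)


# Hypothetical sample: two records separated by a blank line.
sample = [
    "200-001-8\t50-00-0\n",
    "formaldehyd CH2O\n",
    "\n",
    "200-002-3\t50-01-1\n",
    "guanidínium-chlorid CH5N3.ClH\n",
]

for record in records(sample):
    print(repr(record))
```

The blank line acts purely as a separator here: its group is skipped, so each yielded string holds the lines of exactly one record.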

Re: Simple Text Processing Help

2007-10-16 Thread patrick . waldo
And now for something completely different... I see a lot of COM stuff with Python for excel...and I quickly made the same program output to excel. What if the input file were a Word document? Where is there information about manipulating word documents, or what could I add to make the same

Re: Simple Text Processing Help

2007-10-16 Thread patrick . waldo
And now for something completely different... I've been reading up a bit about Python and Excel and I quickly told the program to output to Excel quite easily. However, what if the input file were a Word document? I can't seem to find much information about parsing Word files. What could I add

Re: Simple Text Processing Help

2007-10-15 Thread patrick . waldo
    lines = open('your_file.txt').readlines()[:4]
    print lines
    print map(len, lines)

gave me:

    ['\xef\xbb\xbf200-720-769-93-2\n', 'kyselina mo\xc4\x8dov \xc3\xa1 C5H4N4O3\n', '\n', '200-001-8\t50-00-0\n']
    [28, 32, 1, 18]

I think it means that I'm still at option 3. I got
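The leading `\xef\xbb\xbf` in that output is the UTF-8 byte order mark (BOM). As a hedged sketch in Python 3, decoding with the standard `utf-8-sig` codec drops the BOM, while plain `utf-8` keeps it as a U+FEFF character. The byte string `raw` below is a made-up example, not the poster's file.

```python
import codecs

# Hypothetical bytes: a UTF-8 BOM followed by one tab-separated line.
raw = b"\xef\xbb\xbf200-001-8\t50-00-0\n"

# codecs.BOM_UTF8 is the three-byte marker b'\xef\xbb\xbf'.
has_bom = raw.startswith(codecs.BOM_UTF8)

# Plain utf-8 keeps the BOM as the character U+FEFF at the start.
with_bom = raw.decode("utf-8")

# utf-8-sig strips a leading BOM if one is present.
text = raw.decode("utf-8-sig")

print(has_bom)
print(repr(with_bom))
print(repr(text))
```

When reading from disk, the same codec name can be passed as the `encoding` argument when opening the file, so the marker never reaches the parsing code.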

Re: Simple Text Processing Help

2007-10-15 Thread Marc 'BlackJack' Rintsch
On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote: my sample input file looks like this (not organized, as you see it):

    200-720-7 69-93-2
    kyselina mocová C5H4N4O3

    200-001-8 50-00-0
    formaldehyd CH2O

    200-002-3 50-01-1
    guanidínium-chlorid CH5N3.ClH

Re: Simple Text Processing Help

2007-10-15 Thread Paul Hankin
On Oct 15, 12:20 pm, Marc 'BlackJack' Rintsch [EMAIL PROTECTED] wrote: On Mon, 15 Oct 2007 10:47:16 +, patrick.waldo wrote: my sample input file looks like this (not organized, as you see it):

    200-720-7 69-93-2
    kyselina mocová C5H4N4O3

    200-001-8 50-00-0
    formaldehyd

Re: Simple Text Processing Help

2007-10-15 Thread Peter Otten
patrick.waldo wrote: my sample input file looks like this (not organized, as you see it):

    200-720-7 69-93-2
    kyselina mocová C5H4N4O3

    200-001-8 50-00-0
    formaldehyd CH2O

    200-002-3 50-01-1
    guanidínium-chlorid CH5N3.ClH

Assuming that the records are always

Re: Simple Text Processing Help

2007-10-15 Thread patrick . waldo
Wow, thank you all. All three work. To output correctly I needed to add: output.write("\r\n") This is really a great help!! Because of my limited Python knowledge, I will need to try to figure out exactly how they work for future text manipulation and for my own knowledge. Could you recommend

Re: Simple Text Processing Help

2007-10-15 Thread Paul Hankin
On Oct 15, 10:08 pm, [EMAIL PROTECTED] wrote: Because of my limited Python knowledge, I will need to try to figure out exactly how they work for future text manipulation and for my own knowledge. Could you recommend some resources for this kind of text manipulation? Also, I conceptually get

Re: Simple Text Processing Help

2007-10-15 Thread Paul McGuire
On Oct 14, 8:48 am, [EMAIL PROTECTED] wrote: Hi all, I started Python just a little while ago and I am stuck on something that is really simple, but I just can't figure out. Essentially I need to take a text document with some chemical information in Czech and organize it into another text

Simple Text Processing Help

2007-10-14 Thread patrick . waldo
Hi all, I started Python just a little while ago and I am stuck on something that is really simple, but I just can't figure out. Essentially I need to take a text document with some chemical information in Czech and organize it into another text file. The information is always EINECS number,

Re: Simple Text Processing Help

2007-10-14 Thread Marc 'BlackJack' Rintsch
On Sun, 14 Oct 2007 13:48:51 +, patrick.waldo wrote: Essentially I need to take a text document with some chemical information in Czech and organize it into another text file. The information is always EINECS number, CAS, chemical name, and formula in tables. I need to organize them

Re: Simple Text Processing Help

2007-10-14 Thread Paul Hankin
On Oct 14, 2:48 pm, [EMAIL PROTECTED] wrote: Hi all, I started Python just a little while ago and I am stuck on something that is really simple, but I just can't figure out. Essentially I need to take a text document with some chemical information in Czech and organize it into another text

Re: Simple Text Processing Help

2007-10-14 Thread patrick . waldo
Thank you both for helping me out. I am still rather new to Python and so I'm probably trying to reinvent the wheel here. When I try to do Paul's response, I get tokens = line.strip().split() [] So I am not quite sure how to read line by line. tokens = input.read().split() gets me all the
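One way to end up with `[]` from `line.strip().split()` is for `line` to be blank: stripping leaves an empty string, and splitting that gives an empty list. The sketch below (Python 3) shows this and one possible way to skip such lines; `tokenize_lines` and the sample data are hypothetical names introduced for illustration, not code from the thread.

```python
def tokenize_lines(lines):
    """Return one list of whitespace-separated tokens per non-blank line."""
    token_lists = []
    for line in lines:
        tokens = line.strip().split()
        if not tokens:
            # A blank line strips to "" and splits to [], so skip it.
            continue
        token_lists.append(tokens)
    return token_lists


# Hypothetical input: two data lines followed by a blank separator line.
sample_lines = ["200-001-8\t50-00-0\n", "formaldehyd CH2O\n", "\n"]

print("\n".strip().split())
print(tokenize_lines(sample_lines))
```

Iterating over the lines one at a time keeps each line's tokens together, whereas calling `read().split()` on the whole input returns a single flat list of every token in the file.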

Re: Simple Text Processing Help

2007-10-14 Thread Marc 'BlackJack' Rintsch
On Sun, 14 Oct 2007 16:57:06 +, patrick.waldo wrote: Thank you both for helping me out. I am still rather new to Python and so I'm probably trying to reinvent the wheel here. When I try to do Paul's response, I get tokens = line.strip().split() [] What is in `line`? Paul wrote this

Re: Simple Text Processing Help

2007-10-14 Thread John Machin
On Oct 14, 11:48 pm, [EMAIL PROTECTED] wrote: Hi all, I started Python just a little while ago and I am stuck on something that is really simple, but I just can't figure out. Essentially I need to take a text document with some chemical information in Czech and organize it into another text