Re: Text file with mixed end-of-line terminations
You can use f.read() to read the entire file's contents into a string, provided the file isn't huge. Then split on "\r" and strip the "\n" wherever it is found. A simple test:

    input_data = "abc\rdef\rghi\r\njkl\r\nmno\r\n"
    first_split = input_data.split("\r")
    for rec in first_split:
        rec = rec.replace("\n", "")
        print rec

-- http://mail.python.org/mailman/listinfo/python-list
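For anyone reading this on Python 3, the same split can probably be had from str.splitlines(), which treats "\r", "\n" and "\r\n" as line boundaries. This is only a sketch under that assumption; the helper name split_mixed_lines is made up for illustration, and note that splitlines() also splits on a few other characters (form feed, vertical tab and some Unicode separators), which may or may not matter for your data.

```python
def split_mixed_lines(text):
    # str.splitlines() recognises "\r", "\n" and "\r\n" (plus a few other
    # boundary characters) and drops the terminators from the result.
    return text.splitlines()


input_data = "abc\rdef\rghi\r\njkl\r\nmno\r\n"
lines = split_mixed_lines(input_data)
for rec in lines:
    print(rec)
```

Unlike the split("\r") version, this does not produce a trailing empty record when the text ends with a terminator.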
Re: Text file with mixed end-of-line terminations
On Wed, Aug 31, 2011 at 12:37 PM, Alex van der Spek wrote:
> I have a text file that uses both '\r' and '\r\n' end-of-line terminations.
>
> The '\r' terminates the first 25 lines or so, the remainder is terminated
> with '\r\n'
> Is there a way to make it read one line at a time, regardless of the line
> termination?

Universal Newline Support
http://www.python.org/dev/peps/pep-0278/
http://docs.python.org/library/functions.html#open (Modes involving "U")

Cheers,
Chris
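The "U" modes mentioned above belong to Python 2. As far as I know, Python 3 text mode does universal newline translation by default (newline=None), so a rough equivalent might look like the sketch below. The temporary file and the sample bytes are assumptions made up for the demonstration.

```python
import os
import tempfile

# Write raw bytes so the mixed terminators reach the disk unchanged.
raw = b"abc\rdef\rghi\r\njkl\r\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name

# newline=None (the default) turns on universal newlines for reading:
# "\r", "\n" and "\r\n" are all translated to "\n" before the caller sees them.
with open(path, "r", encoding="ascii", newline=None) as f:
    lines = [line.rstrip("\n") for line in f]

os.remove(path)
print(lines)
```

Iterating over the file object this way reads one line at a time, so it should also work for files too large to hold in memory.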
Re: text file
In <15d8f853-7c87-427b-8f21-e8537bde8...@x12g2000yql.googlegroups.com> Siboniso Shangase writes:

> i want to type this data in a text file it the same the difference is
> the number that only increase and i cannot write this up myself since
> it up to 5000 samples

> Data\ja1.wav Data\ja1.mfc
> .
> .
> Data\ja(n).wav Data\ja(n).mfc

> Data\ma1.wav Data\ma1.mfc
> .
> Data\ma(n).wav Data\ma(n).mfc

This is a simple python program to do what you want:

    samples = 5000
    for i in range(1, samples):
        print "Data\\ja%d.wav Data\\ja%d.mfc" % (i, i)
    for i in range(1, samples):
        print "Data\\ma%d.wav Data\\ma%d.mfc" % (i, i)

Replace the number 5000 with however many repetitions you want. (The loop will stop at one less than the number, so if you want 5000 exactly, use 5001.)

Then run the program like this from your command line:

    python samples.py > textfile

And it will save the output in "textfile".

-- 
John Gordon                   A is for Amy, who fell down the stairs
gor...@panix.com              B is for Basil, assaulted by bears
                                -- Edward Gorey, "The Gashlycrumb Tinies"
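The off-by-one point about range() is easy to trip over, so here is a small Python 3 sketch that takes the count directly and adds the 1 internally. The function name sample_lines and its prefix parameter are assumptions for illustration, not part of the program above.

```python
def sample_lines(prefix, count):
    # range(1, count + 1) yields 1..count inclusive, so count=5000
    # produces exactly 5000 lines.
    return [
        "Data\\{0}{1}.wav Data\\{0}{1}.mfc".format(prefix, i)
        for i in range(1, count + 1)
    ]


# Small count here so the output stays readable; use 5000 for the real file.
lines = sample_lines("ja", 3) + sample_lines("ma", 3)
print("\n".join(lines))
```

Redirecting the output to a file from the command line works the same way as described above.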
Re: text file
On 01/07/2011 01:19, Siboniso Shangase wrote:
> Hi i m very new to python and i need help plz!!
>
> i want to type this data in a text file it the same the difference is
> the number that only increase and i cannot write this up myself since
> it up to 5000 samples
>
> Data\ja1.wav Data\ja1.mfc
> Data\ja2.wav Data\ja2.mfc
> Data\ja3.wav Data\ja3.mfc
> Data\ja4.wav Data\ja4.mfc
> .
> .
> .
> .
> Data\ja(n).wav Data\ja(n).mfc
>
> Data\ma1.wav Data\ma1.mfc
> Data\ma2.wav Data\ma2.mfc
> Data\ma3.wav Data\ma3.mfc
> Data\ma4.wav Data\ma4.mfc
> .
> .
> .
> Data\ma(n).wav Data\ma(n).mfc

This should give you a start:

    path_of_samples_file = "samples.txt"
    with open(path_of_samples_file, "w") as samples_file:
        for index in range(1, 101):
            samples_file.write("Data\\ja{0}.wav Data\\ja{0}.mfc\n".format(index))
Re: text file
    lst = []
    for x in xrange(1, 5001):
        lst.append(r"Data\ma{0}.wav Data\ma{0}.mfc".format(x))
        lst.insert(x-1, r"Data\ja{0}.wav Data\ja{0}.mfc".format(x))

    with open("filename.txt", "w") as fd:
        # Join with "\n" rather than os.linesep: the file is opened in text
        # mode, which already translates "\n" to the platform line ending.
        fd.write("\n".join(lst))

On Thu, Jun 30, 2011 at 5:19 PM, Siboniso Shangase wrote:
> Hi
> i m very new to python and i need help plz!!
>
> i want to type this data in a text file it the same the difference is
> the number that only increase and i cannot write this up myself since
> it up to 5000 samples
>
> Data\ja1.wav Data\ja1.mfc
> Data\ja2.wav Data\ja2.mfc
> Data\ja3.wav Data\ja3.mfc
> Data\ja4.wav Data\ja4.mfc
> .
> .
> .
> .
> Data\ja(n).wav Data\ja(n).mfc
>
> Data\ma1.wav Data\ma1.mfc
> Data\ma2.wav Data\ma2.mfc
> Data\ma3.wav Data\ma3.mfc
> Data\ma4.wav Data\ma4.mfc
> .
> .
> .
> Data\ma(n).wav Data\ma(n).mfc
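The append/insert loop in the reply above is a bit of a puzzle at first glance: each pass appends the next "ma" entry at the end and inserts the next "ja" entry just after the previous "ja" entries, so the list ends up as all "ja" lines followed by all "ma" lines. The Python 3 sketch below checks that against a plainer two-list version; both function names and the shortened strings are assumptions for illustration.

```python
def interleaved_build(n):
    # Mirrors the append/insert loop: "ma" goes to the end, "ja" goes to
    # index x - 1, i.e. right after the "ja" entries already placed.
    lst = []
    for x in range(1, n + 1):
        lst.append("ma{0}".format(x))
        lst.insert(x - 1, "ja{0}".format(x))
    return lst


def straightforward_build(n):
    # Same result, built as two separate runs and concatenated.
    ja = ["ja{0}".format(x) for x in range(1, n + 1)]
    ma = ["ma{0}".format(x) for x in range(1, n + 1)]
    return ja + ma


result = interleaved_build(4)
print(result)
```

The two-list version is likely the easier one to maintain, and it avoids list.insert(), which has to shift the tail of the list on every call.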
Re: text file reformatting
On Nov 1, 09:59, "cbr...@cbrownsystems.com" wrote:
> The line
>
> (varname, slice(int(start), int(start)+int(size)), width)
>
> should instead be
>
> (varname, slice(int(start), int(start)+int(size)), int(width))
>
> although you give an example where there is no width - what does that
> imply? In the above case, it will throw an exception.
>
> Anyway, I think you'll find there's something a bit off in the output
> loop with the parameter passed to ljust() as well. The value given in
> your csv seems to be the absolute position, but as it's implemented by
> Tim, it acts as the relative position.
>
> Given Tim's parsing into the list fields, I have a feeling that what
> you really want instead of
>
> for varname, slc, width in fields:
>     out.write(row[slc].ljust(width))
> out.write('\n')
>
> is to have
>
> s = ''
> for varname, slc, width in fields:
>     s += " "*(width - len(s)) + row[slc]
> out.write(s+'\n')
>
> And if that is what you want, then you will surely want to globally
> replace the name 'width' with for example 'start_column', because then
> it all makes sense :).
>
> Cheers - Chas

Yes, it's meant to be the absolute column position in the new file, as you said.

I used your changes to the csv reading because they seem more flexible, but the end of the code is still not working. Here's where I stand now:

    import re

    parse_columns = re.compile(r'\s*;\s*')

    f = file('def.csv', 'rb')
    f.readline() # discard the header row
    r = (parse_columns.split(line.strip()) for line in f)
    fields = [
        (varname, slice(int(start), int(start)+int(size), int(width) if width else 0))
        for varname, start, size, width in r
    ]
    f.close()
    p
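One thing that stands out in the fields expression as posted is where slice() is closed: int(width) sits inside the slice() call, so it becomes the slice step and each entry is a 2-tuple rather than the 3-tuple the output loop unpacks. Whether that is the whole problem is a guess, since the rest of the script is cut off. The Python 3 sketch below just shows the difference for the CaseID row; the variable names are made up for illustration.

```python
row = "PRJ01001 4 00100END"
start, size, width = 6, 3, 10  # CaseID values from the definition file

# As posted: width lands inside slice() as the step, giving a 2-tuple.
as_posted = ("CaseID", slice(start, start + size, width))

# Probable intent: slice() closed after the stop value, giving a 3-tuple.
intended = ("CaseID", slice(start, start + size), width)

try:
    varname, slc, col = as_posted
    unpack_error = None
except ValueError as exc:
    unpack_error = exc  # a 2-tuple cannot be unpacked into three names

varname, slc, col = intended
print(len(as_posted), len(intended))
print(repr(row[as_posted[1]]), repr(row[slc]))
```

Note that even the intended slice picks up "01 " rather than "001" here, because the definition file counts columns from 1 while Python counts from 0, which is the fence-post issue Tim warned about.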
Re: text file reformatting
On Oct 31, 11:46 pm, iwawi wrote:
> Hi,
>
> Thanks for your reply.
>
> Def.csv could be modified so that every line has the same structure:
> variable name, field start, field size and new place and would be
> separated with semicolons as you mentioned.
>
> I tried your script (which seems quite logical) but I get this
>
> Traceback (most recent call last):
>   File "testing.py", line 16, in <module>
>     out.write (row[slc].ljust(width))
> TypeError: an integer is required
>
> Yes - you said it was untested, but I can't figure out how to
> proceed...

The line

    (varname, slice(int(start), int(start)+int(size)), width)

should instead be

    (varname, slice(int(start), int(start)+int(size)), int(width))

although you give an example where there is no width - what does that imply? In the above case, it will throw an exception.

Anyway, I think you'll find there's something a bit off in the output loop with the parameter passed to ljust() as well. The value given in your csv seems to be the absolute position, but as it's implemented by Tim, it acts as the relative position.

Given Tim's parsing into the list fields, I have a feeling that what you really want instead of

    for varname, slc, width in fields:
        out.write(row[slc].ljust(width))
    out.write('\n')

is to have

    s = ''
    for varname, slc, width in fields:
        s += " "*(width - len(s)) + row[slc]
    out.write(s+'\n')

And if that is what you want, then you will surely want to globally replace the name 'width' with for example 'start_column', because then it all makes sense :).

Cheers - Chas
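To make the absolute-position idea concrete, here is a Python 3 sketch of the padding loop with the 1-based columns from the definition file converted to 0-based offsets. The function name place_fields, the tuple layout, and the "End" entry (start 17, new column 20) are assumptions added for the example; they are not in the original def.csv.

```python
def place_fields(row, fields):
    # Build one output line, putting each field at its 1-based start column.
    out = ""
    for _name, start, size, start_column in fields:
        value = row[start - 1:start - 1 + size]  # 1-based start -> 0-based slice
        # Pad with spaces up to the column just before start_column.
        out += " " * (start_column - 1 - len(out)) + value
    return out


fields = [
    ("ProjID", 1, 5, 1),
    ("CaseID", 6, 3, 10),
    ("Zipcode", 12, 5, 15),
    ("End", 17, 3, 20),  # hypothetical entry so the END marker is copied too
]

result = place_fields("PRJ01001 4 00100END", fields)
print(result)
```

If a start column is smaller than the current length of the line, the padding is simply empty and the field is appended directly, so overlapping definitions would need an explicit check.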
Re: text file reformatting
On Oct 31, 21:48, Tim Chase wrote:
> If instead, you could specify the
> width to which you want to pad it and omit variables you don't
> want in the output, ordering the variables in the same order you
> want them in the output:
>
> Variable; Start; Size; Width
> ProjID; 1; 5; 10
> CaseID; 6; 3; 10
> Zipcode; 12; 5; 5
> End; 16; 3; 3
>
> (note that I lazily use the same method to copy the END from the
> source to the destination, rather than coding specially for it)
> you could do something like this (untested)
>
> import csv
> f = file('def.csv', 'rb')
> f.next() # discard the header row
> r = csv.reader(f, delimiter=';')
> fields = [
>     (varname, slice(int(start), int(start)+int(size)), width)
>     for varname, start, size, width
>     in r
> ]
> f.close()
> out = file('out.txt', 'w')
> try:
>     for row in file('data.txt'):
>         for varname, slc, width in fields:
>             out.write(row[slc].ljust(width))
>         out.write('\n')
> finally:
>     out.close()
>
> -tkc

Hi,

Thanks for your reply.

Def.csv could be modified so that every line has the same structure: variable name, field start, field size and new place, separated with semicolons as you mentioned.

I tried your script (which seems quite logical) but I get this

    Traceback (most recent call last):
      File "testing.py", line 16, in <module>
        out.write (row[slc].ljust(width))
    TypeError: an integer is required

Yes - you said it was untested, but I can't figure out how to proceed...
Re: text file reformatting
On Oct 31, 12:48 pm, Tim Chase wrote:
> How flexible is the def.csv format? The difficulty I see with
> your def.csv format is that it leaves undefined gaps (presumably
> to be filled in with spaces) and that you also have a blank "new
> place in new file" value.
>
> If you can't modify the def.csv format, then things are a bit
> more complex and I'd almost be tempted to write a script to try
> and convert your existing def.csv format into something simpler
> to process like what I describe.
>
> -tkc

To your point about the non-standard csv formatting in the def.csv file, you could use a regular expression instead of the csv module to solve that:

    import re

    parse_columns = re.compile(r'\s*;\s*')

    f = file('def.csv', 'rb')
    f.readline() # discard the header row
    r = (parse_columns.split(line.strip()) for line in f)
    fields = [
        # slice() is closed after the stop value so that width stays a
        # separate third element of the tuple
        (varname, slice(int(start), int(start)+int(size)), int(width) if width else 0)
        for varname, start, size, width in r
    ]
    f.close()

which given the OP's csv produces for fields:

    [('ProjID', slice(1, 6, None), 1), ('CaseID', slice(6, 9, None), 10), ('UselessV', slice(10, 11, None), 0), ('Zipcode', slice(12, 17, None), 15)]

and that should work with the remainder of your original code; although perhaps the OP wants something else to happen when width is omitted from the csv...

Cheers - Chas
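For reference, here is how the regular-expression split behaves on the OP's definition lines in Python 3, including the row whose last column is empty. This is a sketch only: the lines are held in a list instead of being read from def.csv, and using None (rather than 0) to mark an omitted column is my own assumption about what would be convenient downstream.

```python
import re

# Split on a semicolon with optional whitespace on either side.
parse_columns = re.compile(r"\s*;\s*")

def_lines = [
    "ProjID ; 1 ; 5 ; 1",
    "CaseID ; 6 ; 3 ; 10",
    "UselessV ; 10 ; 1 ;",
    "Zipcode ; 12 ; 5 ; 15",
]

fields = []
for line in def_lines:
    varname, start, size, new_place = parse_columns.split(line.strip())
    # An empty last column means the variable has no place in the new file.
    fields.append(
        (varname, int(start), int(size), int(new_place) if new_place else None)
    )

for field in fields:
    print(field)
```

Keeping None for the omitted column makes it easy to skip such variables later with a simple "is not None" test, instead of confusing them with a real column 0.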
RE: text file reformatting
Sorry, to clarify: I was having issues getting this to work. I'm relatively new to Python. Sorry for the miscommunication.

> Date: Sun, 31 Oct 2010 16:13:42 -0500
> From: python.l...@tim.thechases.com
> To: brad...@hotmail.com
> CC: python-list@python.org
> Subject: Re: text file reformatting
>
> On 10/31/10 14:52, Braden Faulkner wrote:
> > I also am having issues with this.
>
> [top-posting fixed -- it's generally frowned upon in this
> newsgroup/mailing-list and adherence to the preferences will tend
> to get you a wider audience]
>
> Are your issues with my code, or with the topic at hand? If it's
> my code, note my comment about it being untested. If it's the
> topic at hand, I recommend trying my code (or a variation
> there-of after you've tested it).
>
> -tkc
Re: text file reformatting
On 10/31/10 14:52, Braden Faulkner wrote:
>> import csv
>> f = file('def.csv', 'rb')
>> f.next() # discard the header row
>> r = csv.reader(f, delimiter=';')
>> fields = [
>>     (varname, slice(int(start), int(start)+int(size)), width)
>>     for varname, start, size, width
>>     in r
>> ]
>> f.close()
>> out = file('out.txt', 'w')
>> try:
>>     for row in file('data.txt'):
>>         for varname, slc, width in fields:
>>             out.write(row[slc].ljust(width))
>>         out.write('\n')
>> finally:
>>     out.close()
>
> I also am having issues with this.

[top-posting fixed -- it's generally frowned upon in this newsgroup/mailing-list and adherence to the preferences will tend to get you a wider audience]

Are your issues with my code, or with the topic at hand? If it's my code, note my comment about it being untested. If it's the topic at hand, I recommend trying my code (or a variation there-of after you've tested it).

-tkc
RE: text file reformatting
I also am having issues with this.

> Date: Sun, 31 Oct 2010 14:48:09 -0500
> From: python.l...@tim.thechases.com
> To: iwawi...@gmail.com
> Subject: Re: text file reformatting
> CC: python-list@python.org
>
> you could do something like this (untested)
>
> import csv
> f = file('def.csv', 'rb')
> f.next() # discard the header row
> r = csv.reader(f, delimiter=';')
> fields = [
>     (varname, slice(int(start), int(start)+int(size)), width)
>     for varname, start, size, width
>     in r
> ]
> f.close()
> out = file('out.txt', 'w')
> try:
>     for row in file('data.txt'):
>         for varname, slc, width in fields:
>             out.write(row[slc].ljust(width))
>         out.write('\n')
> finally:
>     out.close()
>
> -tkc
Re: text file reformatting
> PRJ01001 4 00100END
> PRJ01002 3 00110END
>
> I would like to pick only some columns to a new file and put them to a
> certain places (to match previous data) - definition file (def.csv)
> could be something like this:
>
> VARIABLE ; FIELD STARTS ; FIELD SIZE ; NEW PLACE IN NEW DATA FILE
> ProjID ; 1 ; 5 ; 1
> CaseID ; 6 ; 3 ; 10
> UselessV ; 10 ; 1 ;
> Zipcode ; 12 ; 5 ; 15
>
> So the new datafile should look like this:
>
> PRJ01001 00100END
> PRJ01002 00110END

How flexible is the def.csv format? The difficulty I see with your def.csv format is that it leaves undefined gaps (presumably to be filled in with spaces) and that you also have a blank "new place in new file" value. If instead you could specify the width to which you want to pad each field, omit the variables you don't want in the output, and list the variables in the order you want them in the output:

Variable; Start; Size; Width
ProjID; 1; 5; 10
CaseID; 6; 3; 10
Zipcode; 12; 5; 5
End; 17; 3; 3

(note that I lazily use the same method to copy the END from the source to the destination, rather than coding specially for it) you could do something like this (untested):

import csv
f = file('def.csv', 'rb')
f.next() # discard the header row
r = csv.reader(f, delimiter=';')
fields = [
    # Start is 1-based in def.csv, so shift it to Python's 0-based offsets;
    # width must be an int for ljust()
    (varname, slice(int(start) - 1, int(start) - 1 + int(size)), int(width))
    for varname, start, size, width
    in r
    ]
f.close()
out = file('out.txt', 'w')
try:
    for row in file('data.txt'):
        for varname, slc, width in fields:
            out.write(row[slc].ljust(width))
        out.write('\n')
finally:
    out.close()

Hope that's fairly easy to follow and makes sense. There might be some fence-posting errors (particularly your use of "1" as the initial offset, while python uses "0" as the initial offset for strings).

If you can't modify the def.csv format, then things are a bit more complex and I'd almost be tempted to write a script to try and convert your existing def.csv format into something simpler to process like what I describe.

-tkc
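For the case where def.csv cannot be changed, here is a rough Python 3 sketch that works from the original four-column layout directly. It assumes the "new place" column is a 1-based output column, that rows with a blank "new place" are dropped, and that gaps are filled with spaces. The names load_fields and reformat_line are made up for this example, the "End" row is an assumption added so the END marker gets copied, and the sample data uses the single-space layout shown in the archive, which may not match the real file.

```python
import csv
import io

# Definition rows in the original poster's layout:
# name ; 1-based start ; size ; 1-based output column.
# The "End" row is an assumption, not part of the posted def.csv.
DEF_TEXT = """\
ProjID ; 1 ; 5 ; 1
CaseID ; 6 ; 3 ; 10
UselessV ; 10 ; 1 ;
Zipcode ; 12 ; 5 ; 15
End ; 17 ; 3 ; 20
"""


def load_fields(def_file):
    """Return (source_slice, output_offset) pairs, skipping rows with no output column."""
    fields = []
    for name, start, size, place in csv.reader(def_file, delimiter=";"):
        if not place.strip():
            continue  # blank "new place": this variable is not copied
        begin = int(start) - 1  # the definition counts from 1, Python from 0
        fields.append((slice(begin, begin + int(size)), int(place) - 1))
    return fields


def reformat_line(line, fields):
    """Put each selected field at its output column, padding gaps with spaces."""
    out = []
    for src, offset in fields:
        value = line[src]
        if len(out) < offset:
            out.extend(" " * (offset - len(out)))
        out[offset:offset + len(value)] = value
    return "".join(out)


fields = load_fields(io.StringIO(DEF_TEXT))
print(reformat_line("PRJ01001 4 00100END", fields))  # PRJ01    001  00100END
```

Reading the real files would only mean passing open('def.csv', newline='') to load_fields (after skipping any header row) and looping over the lines of the data file.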
Re: Text file to XML representation
On Wed, 2009-10-21, kak...@gmail.com wrote:
> Hello,
> I would like to make a program that takes a text file with the
> following representation:
>
> outlook = sunny
> | humidity <= 70: yes (2.0)
> | humidity > 70: no (3.0)
> outlook = overcast: yes (4.0)
> outlook = rainy
> | windy = TRUE: no (2.0)
> | windy = FALSE: yes (3.0)
>
> and convert it to xml file for example:
>
> ...
>
> Is there a way to do it?

No. Impossible.

No, of course it is possible. I'd think of it as a problem of (a) making clear to yourself what the input format (language) is, (b) writing a parser for it (which transforms it into Python data structures), and (c) writing code to dump the data structures according to some DTD (or whatever the XML people call it these days). (c) seems to be the easy part.

/Jorgen

-- 
// Jorgen Grahn
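Steps (a) to (c) could look roughly like the Python 3 sketch below, using only the standard library. It assumes the depth of a line is given by the number of leading "|" markers and that a leaf ends in ": class (count)". The target XML was snipped from the thread, so the element name "test" and the attribute names are purely hypothetical; tree_to_xml is likewise a made-up name.

```python
import re
import xml.etree.ElementTree as ET

TEXT = """\
outlook = sunny
| humidity <= 70: yes (2.0)
| humidity > 70: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
| windy = TRUE: no (2.0)
| windy = FALSE: yes (3.0)
"""

# Depth markers, then "attribute operator value", then an optional ": class (count)".
LINE = re.compile(
    r"^(?P<bars>(?:\|\s*)*)"
    r"(?P<attr>\w+)\s*(?P<op><=|>=|=|<|>)\s*(?P<value>[^:\s]+)"
    r"(?::\s*(?P<label>\w+)\s*\((?P<count>[\d.]+)\))?\s*$"
)


def tree_to_xml(text):
    """Parse the indented tree text and return an ElementTree root element."""
    root = ET.Element("tree")
    stack = [root]  # stack[d] is the parent of a node at depth d
    for line in text.splitlines():
        if not line.strip():
            continue
        m = LINE.match(line)
        if m is None:
            raise ValueError("unrecognised line: %r" % line)
        depth = m.group("bars").count("|")
        node = ET.SubElement(stack[depth], "test", attribute=m.group("attr"),
                             operator=m.group("op"), value=m.group("value"))
        if m.group("label"):
            # "class" is a Python keyword, so it cannot be passed as a keyword argument
            node.set("class", m.group("label"))
            node.set("count", m.group("count"))
        del stack[depth + 1:]  # forget deeper nodes from earlier branches
        stack.append(node)
    return root


root = tree_to_xml(TEXT)
print(ET.tostring(root, encoding="unicode"))
```

A regular expression is enough here because every line has the same shape; for a richer input language a real parser would be the safer choice.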
Re: Text file to XML representation
kak...@gmail.com wrote:
> Hello,
> I would like to make a program that takes a text file with the
> following representation:
>
> outlook = sunny
> | humidity <= 70: yes (2.0)
> | humidity > 70: no (3.0)
> outlook = overcast: yes (4.0)
> outlook = rainy
> | windy = TRUE: no (2.0)
> | windy = FALSE: yes (3.0)
>
> and convert it to xml file for example:
(snip xml)
> Is there a way to do it?

More than one. But I'd strongly suggest something like PyParsing + ElementTree.

PyParsing: http://pyparsing.wikispaces.com/
ElementTree: is now in the stdlib, so refer to the FineManual
Re: text file
[EMAIL PROTECTED] wrote:
> HI all,
> i have some problem with the code belove, i have a list of servers in
> a textfile (elencopc.txt) i would to retrieve informations via WMI
> ( cicle for ), but i don't understand if the code is correct:

Try this, using http://timgolden.me.uk/python/wmi.html :

import wmi

#
# For the test to work
#
open ("elencopc.txt", "w").write ("localhost")

for server in open ("elencopc.txt").read ().splitlines ():
  c = wmi.WMI (server)
  print "SERVER:", server
  for item in c.Win32_QuickFixEngineering ():
    print item # or print item.Caption, etc.
  print
  print

If you get RPC Server unavailable, it usually means that the WMI service isn't running on that machine. Usually.

TJG
Re: text file
On Wed, 01 Oct 2008 07:19:44 -0700, yqyq22 wrote:
> My problem is how to translate this vbs in python:
>
> Dim fso
> Dim strComputer
> Set fso = CreateObject("Scripting.FileSystemObject")
> Set ElencoPC = fso.OpenTextFile("elencoPC.txt" , 1, False)
> Do Until ElencoPC.AtEndOfStream
> strComputer = ElencoPC.ReadLine
>
> thanks

try this:

fso = open('elencoPC.txt', 'r')
for line in fso:
    # ReadLine in VBS drops the line ending, so strip it here too
    strComputer = line.rstrip('\n')
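A slightly fuller Python 3 sketch of the same loop follows. The script posted earlier in this thread passes the open file object itself to ConnectServer, so the point here is only to show one clean name being produced per line. read_computers is a made-up helper name, and the temporary file merely stands in for elencoPC.txt; no WMI call is made.

```python
import os
import tempfile


def read_computers(path):
    """Return the non-empty lines of the file at path, without line endings."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


# Demonstration with a throwaway file standing in for elencoPC.txt.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "elencoPC.txt")
    with open(path, "w") as handle:
        handle.write("localhost\nserver01\n\n")
    computers = read_computers(path)

for computer in computers:
    # Each name, not the file object, is what a per-server call would receive.
    print("would query:", computer)
```

Skipping blank lines is a choice made for this sketch; the VBS loop would hand an empty string to whatever comes next.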
Re: text file
On Oct 1, 4:03 pm, [EMAIL PROTECTED] wrote:
> HI all,
> i have some problem with the code belove, i have a list of servers in
> a textfile (elencopc.txt) i would to retrieve informations via WMI
> ( cicle for ), but i don't understand if the code is correct:
>
> import win32com.client
> import string
> import sys
> listserver = open('c:\\elencopc.txt','r')
> objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
> objSWbemServices = objWMIService.ConnectServer(listserver,"root\cimv2")
> colItems = objSWbemServices.ExecQuery("Select * from Win32_QuickFixEngineering")
> for objItem in colItems:
>     print "Caption: ", objItem.Caption
>     print "Description: ", objItem.Description
>     print "Fix Comments: ", objItem.FixComments
>     print "HotFix ID: ", objItem.HotFixID
>     print "Install Date: ", objItem.InstallDate
>     print "Installed By: ", objItem.InstalledBy
>     print "Installed On: ", objItem.InstalledOn
>     print "Name: ", objItem.Name
>     print "Service Pack In Effect: ", objItem.ServicePackInEffect
>     print "Status: ", objItem.Status
>
> I receive the error :
> ile "C:\Python25\Lib\site-packages\win32com\client\dynamic.py", line 258, in _ApplyTypes_
>     result = self._oleobj_.InvokeTypes(*(dispid, LCID, wFlags, retType, argTypes) + args)
> com_error: (-2147352567, 'Exception occurred.', (0, 'SWbemLocator', 'The RPC server is unavailable. ', None, 0, -2147023174), None)
>
> MY big dubt is if the code is correct... because if i use vbscript all
> works fine..
> thanks a lot in advance

My problem is how to translate this vbs in python:

Dim fso
Dim strComputer
Set fso = CreateObject("Scripting.FileSystemObject")
Set ElencoPC = fso.OpenTextFile("elencoPC.txt" , 1, False)
Do Until ElencoPC.AtEndOfStream
strComputer = ElencoPC.ReadLine

thanks
Re: text file vs. cPickle vs sqlite a design question
Dag wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
> lists containing anywhere from 3 to over 3 tuples. The actions I
> need to do is either append a new tuple to the end of the list, display
> all the tuples or display all the tuples where the last element is a 1
>
> Basically what I'm wondering is the best way to store these data stuctures
> to disc. As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables. I can see pros and cons
> with each approach. Does anybody have any advice as to whether any of
> these approaches is obviously better than any other?

Seems that so far, you get as many different opinions as answers - not sure this will help much :-/
Re: text file vs. cPickle vs sqlite a design question
On Apr 11, 5:40 pm, Dag <[EMAIL PROTECTED]> wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
> lists containing anywhere from 3 to over 3 tuples. The actions I
> need to do is either append a new tuple to the end of the list, display
> all the tuples or display all the tuples where the last element is a 1
>
> Basically what I'm wondering is the best way to store these data stuctures
> to disc. As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables. I can see pros and cons
> with each approach. Does anybody have any advice as to whether any of
> these approaches is obviously better than any other? On one hand I like
> the text file approach since it lets me append without loading
> everything into memory, on the other hand the sqlite approach makes it
> easy to select stuff with SELECT * FROM foo WHERE... which could be
> handy if ever need to add more advanced filtering.
>
> Dag

If you have enough resources to keep all the lists comfortably in memory, and you have enough disk space, then I would save your data as python text. Something like:

print "# "
print "all_lists = []"
for i,l in enumerate(all_lists):
  print "all_lists.append( [ #", i
  for tpl in l:
    print " ", tpl, ","
  print " ]) #", i

You would then have your data saved in a format that could easily be re-used by other programs at a later date, and that can be examined in any text editor.

- Paddy.
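A possible Python 3 take on the "python text" idea is sketched below. It differs from the snippet above in one respect: instead of emitting statements to be executed, it writes a single literal and reads it back with ast.literal_eval, which only accepts plain data. dump_lists and load_lists are names invented for this example.

```python
import ast
import io


def dump_lists(all_lists, out):
    """Write all_lists to the file-like object out as a Python literal, one tuple per line."""
    out.write("[\n")
    for i, tuples in enumerate(all_lists):
        out.write("  [  # list %d\n" % i)
        for tpl in tuples:
            out.write("    %r,\n" % (tpl,))
        out.write("  ],\n")
    out.write("]\n")


def load_lists(text):
    """Rebuild the lists from text written by dump_lists."""
    return ast.literal_eval(text)


all_lists = [
    [(1, "text", "more text", 1), (2, "text", "more text", 0)],
    [(3, "text", "more text", 1)],
]
buffer = io.StringIO()
dump_lists(all_lists, buffer)
print(buffer.getvalue())
restored = load_lists(buffer.getvalue())
```

The output stays readable in any text editor, as suggested above, but appending a tuple still means rewriting the whole file.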
Re: text file vs. cPickle vs sqlite a design question
John Machin wrote:
(snip)
> ... and a few more cents:
>
> There are *two* relations/tables involved (at least): a "tuple" table
> and a "list" table.

Mmm... From a purely technical POV, not necessarily. If there's no need for anything else than distinguishing between different lists, a single table with a compound key (list_id, tuple_id) could be enough...
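To make the single-table idea concrete, here is a small Python 3 sketch with the standard sqlite3 module. The table name entries, the column names and the helper functions are all invented for illustration; the only thing taken from the thread is the tuple shape (id_nr, 'text', 'more text', 1 or 0) and the compound key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # an on-disk path would be used in practice
conn.execute(
    """CREATE TABLE entries (
           list_id INTEGER NOT NULL,
           id_nr   INTEGER NOT NULL,
           text_a  TEXT,
           text_b  TEXT,
           flag    INTEGER NOT NULL CHECK (flag IN (0, 1)),
           PRIMARY KEY (list_id, id_nr)
       )"""
)


def append(list_id, tpl):
    """Add one (id_nr, text, more text, flag) tuple to the given list."""
    conn.execute("INSERT INTO entries VALUES (?, ?, ?, ?, ?)", (list_id,) + tuple(tpl))


def fetch(list_id, only_flagged=False):
    """Return the tuples of one list in insertion order, optionally only those ending in 1."""
    sql = "SELECT id_nr, text_a, text_b, flag FROM entries WHERE list_id = ?"
    if only_flagged:
        sql += " AND flag = 1"
    return conn.execute(sql + " ORDER BY rowid", (list_id,)).fetchall()


append(1, (1, "text", "more text", 1))
append(1, (2, "text", "more text", 0))
append(2, (1, "text", "more text", 1))
print(fetch(1))
print(fetch(1, only_flagged=True))
```

With the compound primary key, inserting the same id_nr twice into one list raises sqlite3.IntegrityError, while the same id_nr in two different lists is fine.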
Re: text file vs. cPickle vs sqlite a design question
On Apr 12, 7:09 am, Bruno Desthuilliers <[EMAIL PROTECTED]> wrote:
> Dag wrote:
> > I have an application which works with lists of tuples of the form
> > (id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
> > lists containing anywhere from 3 to over 3 tuples. The actions I
> > need to do is either append a new tuple to the end of the list, display
> > all the tuples or display all the tuples where the last element is a 1
> >
> > Basically what I'm wondering is the best way to store these data stuctures
> > to disc. As the subject mentioned I've basically got three approaches.
> > Store each list as a text file, pickle each list to file or shove the
> > whole thing into a bunch of database tables. I can see pros and cons
> > with each approach. Does anybody have any advice as to whether any of
> > these approaches is obviously better than any other? On one hand I like
> > the text file approach since it lets me append without loading
> > everything into memory, on the other hand the sqlite approach makes it
> > easy to select stuff with SELECT * FROM foo WHERE... which could be
> > handy if ever need to add more advanced filtering.

s/if/when/

> Given your specs, I'd go for SQLite without any hesitation. Your data
> structure is obviously relational (a list of tuples is a pretty good
> definition of a relation), so a relational DBMS is the obvious solution,
> and you'll get lots of other benefits from it (SQL being only one of
> them - you can also think about free optimization, scalability, and
> interoperability). And if you don't like raw SQL and prefer something
> more pythonic, then you have SQLAlchemy and Elixir.
>
> My 2 cents...

... and a few more cents:

There are *two* relations/tables involved (at least): a "tuple" table and a "list" table. The 20-50 or so lists need a unique name or number each, and other attributes of a list are sure to come out of the woodwork later. Each tuple will need a column containing the ID of the list it belongs to.

It's a bit boggling that (1) each tuple has an id_nr but there's no requirement to query on it (2) req. only to "append" new tuples w/o checking id_nr already exists (3) req. to "display" all of 30,000 tuples ...
Re: text file vs. cPickle vs sqlite a design question
Dag wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
> lists containing anywhere from 3 to over 3 tuples. The actions I
> need to do is either append a new tuple to the end of the list, display
> all the tuples or display all the tuples where the last element is a 1
>
> Basically what I'm wondering is the best way to store these data stuctures
> to disc. As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables. I can see pros and cons
> with each approach. Does anybody have any advice as to whether any of
> these approaches is obviously better than any other? On one hand I like
> the text file approach since it lets me append without loading
> everything into memory, on the other hand the sqlite approach makes it
> easy to select stuff with SELECT * FROM foo WHERE... which could be
> handy if ever need to add more advanced filtering.

Given your specs, I'd go for SQLite without any hesitation. Your data structure is obviously relational (a list of tuples is a pretty good definition of a relation), so a relational DBMS is the obvious solution, and you'll get lots of other benefits from it (SQL being only one of them - you can also think about free optimization, scalability, and interoperability). And if you don't like raw SQL and prefer something more pythonic, then you have SQLAlchemy and Elixir.

My 2 cents...
Re: text file vs. cPickle vs sqlite a design question
On Wed, 11 Apr 2007 13:40:02 -0300, Dag <[EMAIL PROTECTED]> wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0). I'll have maybe 20-50 or so of these
> lists containing anywhere from 3 to over 3 tuples. The actions I
> need to do is either append a new tuple to the end of the list, display
> all the tuples or display all the tuples where the last element is a 1
>
> Basically what I'm wondering is the best way to store these data stuctures
> to disc. As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables. I can see pros and cons
> with each approach. Does anybody have any advice as to whether any of

From your description, none of these three approaches is obviously better. Try to isolate the data from its storage, and use the easiest way now (pickle perhaps?). This way you can change it later easily - maybe to use sqlite if you need more difficult queries.

-- 
Gabriel Genellina
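One way to read "isolate the data from its storage" is to hide the file format behind a small class, so the rest of the program only sees append and display operations. The Python 3 sketch below does this with pickle; PickleStore and its method names are invented for the example, and a sqlite-backed class with the same two methods could replace it later without touching the callers.

```python
import os
import pickle
import tempfile


class PickleStore:
    """Keeps every list in one pickle file; callers never see the file format."""

    def __init__(self, path):
        self.path = path

    def _load(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, "rb") as handle:
            return pickle.load(handle)

    def append(self, list_name, tpl):
        """Add one tuple to the end of the named list and rewrite the file."""
        data = self._load()
        data.setdefault(list_name, []).append(tuple(tpl))
        with open(self.path, "wb") as handle:
            pickle.dump(data, handle)

    def tuples(self, list_name, only_flagged=False):
        """Return all tuples of a list, or only those whose last element is 1."""
        rows = self._load().get(list_name, [])
        return [row for row in rows if row[-1] == 1] if only_flagged else rows


with tempfile.TemporaryDirectory() as folder:
    store = PickleStore(os.path.join(folder, "lists.pickle"))
    store.append("first", (1, "text", "more text", 1))
    store.append("first", (2, "text", "more text", 0))
    everything = store.tuples("first")
    flagged = store.tuples("first", only_flagged=True)

print(everything)
print(flagged)
```

Every append rewrites the whole file here, which is the simple option rather than the fast one.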
Re: text file parsing (awk -> python)
Peter Otten, your solution is very nice; it uses groupby to split on empty lines, so it doesn't need to read the whole file into memory.

But Daniel Nogradi says:
> But the names of the fields (node, x, y) keeps changing from file to
> file, even their number is not fixed, sometimes it is (node, x, y, z).

Your version with the converters dict fails to convert the numbers of the node, z fields, etc. (in general such a converters dict is an elegant solution, since it allows you to define string, float, etc. fields):

> converters = dict(
> x=int,
> y=int
> )

I have created a version with a RE, but it's probably too rigid; it doesn't handle files with the z field, etc:

data = """node 10
y 1
x -1

node 11
x -2
y 1
z 5

node 12
x -3
y 1
z 6"""

import re
unpack = re.compile(r"(\D+) \s+ ([-+]? \d+) \s+" * 3, re.VERBOSE)

result = []
for obj in unpack.finditer(data):
    block = obj.groups()
    d = dict((block[i], int(block[i+1])) for i in xrange(0, 6, 2))
    result.append(d)

print result

So I have just modified and simplified your quite nice solution (I have removed the pprint, but it's the same):

def open(filename):
    from cStringIO import StringIO
    return StringIO(data)

from itertools import groupby

records = []
for empty, record in groupby(open("records.txt"), key=str.isspace):
    if not empty:
        pairs = ([k, int(v)] for k,v in map(str.split, record))
        records.append(dict(pairs))

print records

Bye,
bearophile
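For readers on Python 3, the groupby approach can be sketched as below. It makes no assumption about field names or their number; the one choice added here is that a value is turned into an int when that works and otherwise kept as a stripped string, which sits between the converters dict and the int-only version. parse_records and convert are names made up for this example.

```python
import io
from itertools import groupby

DATA = """\
node 10
x -1
y 1

node 11
x -2
y 1
z 5
"""


def convert(value):
    """Return value as an int when it looks like one, else as a stripped string."""
    value = value.strip()
    try:
        return int(value)
    except ValueError:
        return value


def parse_records(lines):
    """Turn blank-line separated 'name value' lines into a list of dicts."""
    records = []
    for blank, group in groupby(lines, key=str.isspace):
        if not blank:
            pairs = (line.split(None, 1) for line in group)
            records.append({name: convert(value) for name, value in pairs})
    return records


records = parse_records(io.StringIO(DATA))
print(records)
# [{'node': 10, 'x': -1, 'y': 1}, {'node': 11, 'x': -2, 'y': 1, 'z': 5}]
```

Passing an open file instead of the StringIO object keeps the line-by-line behaviour, so the whole file is still never held in memory at once.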
Re: text file parsing (awk -> python)
> > I have an awk program that parses a text file which I would like to > > rewrite in python. The text file has multi-line records separated by > > empty lines and each single-line field has two subfields: > > > > node 10 > > x -1 > > y 1 > > > > node 11 > > x -2 > > y 1 > > > > node 12 > > x -3 > > y 1 > > > > and this I would like to parse into a list of dictionaries like so: > > > > mydict[0] = { 'node':10, 'x':-1, 'y':1 } > > mydict[1] = { 'node':11, 'x':-2, 'y':1 } > > mydict[2] = { 'node':12, 'x':-3', 'y':1 } > > > > But the names of the fields (node, x, y) keeps changing from file to > > file, even their number is not fixed, sometimes it is (node, x, y, z). > > > > What would be the simples way to do this? > > data = """node 10 > x -1 > y 1 > > node 11 > x -2 > y 1 > > node 12 > x -3 > y 1 > """ > > def open(filename): > from cStringIO import StringIO > return StringIO(data) > > converters = dict( > x=int, > y=int > ) > > def name_value(line): > name, value = line.split(None, 1) > return name, converters.get(name, str.rstrip)(value) > > if __name__ == "__main__": > from itertools import groupby > records = [] > > for empty, record in groupby(open("records.txt"), key=str.isspace): > if not empty: > records.append(dict(name_value(line) for line in record)) > > import pprint > pprint.pprint(records) Thanks very much, that's exactly what I had in mind. Thanks again, Daniel -- http://mail.python.org/mailman/listinfo/python-list
Re: text file parsing (awk -> python)
Daniel Nogradi wrote:
> I have an awk program that parses a text file which I would like to
> rewrite in python. The text file has multi-line records separated by
> empty lines and each single-line field has two subfields:
>
> node 10
> x -1
> y 1
>
> node 11
> x -2
> y 1
>
> node 12
> x -3
> y 1
>
> and this I would like to parse into a list of dictionaries like so:
>
> mydict[0] = { 'node':10, 'x':-1, 'y':1 }
> mydict[1] = { 'node':11, 'x':-2, 'y':1 }
> mydict[2] = { 'node':12, 'x':-3', 'y':1 }
>
> But the names of the fields (node, x, y) keeps changing from file to
> file, even their number is not fixed, sometimes it is (node, x, y, z).
>
> What would be the simples way to do this?

data = """node 10
x -1
y 1

node 11
x -2
y 1

node 12
x -3
y 1
"""

def open(filename):
    from cStringIO import StringIO
    return StringIO(data)

converters = dict(
    x=int,
    y=int
)

def name_value(line):
    name, value = line.split(None, 1)
    return name, converters.get(name, str.rstrip)(value)

if __name__ == "__main__":
    from itertools import groupby
    records = []

    for empty, record in groupby(open("records.txt"), key=str.isspace):
        if not empty:
            records.append(dict(name_value(line) for line in record))

    import pprint
    pprint.pprint(records)