Re: Text file with mixed end-of-line terminations

2011-09-01 Thread woooee
You can use f.read() to read the entire file's contents into a string,
provided the file isn't huge.  Then split on "\r" and strip any "\n"
left over from the "\r\n" pairs.
A simple test:
input_data = "abc\rdef\rghi\r\njkl\r\nmno\r\n"
first_split = input_data.split("\r")
for rec in first_split:
    rec = rec.replace("\n", "")
    print rec
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Text file with mixed end-of-line terminations

2011-08-31 Thread Chris Rebert
On Wed, Aug 31, 2011 at 12:37 PM, Alex van der Spek  wrote:
> I have a text file that uses both '\r' and '\r\n' end-of-line terminations.
>
> The '\r' terminates the first 25 lines or so, the remainder is terminated
> with '\r\n'

> Is there a way to make it read one line at a time, regardless of the line
> termination?

Universal Newline Support
http://www.python.org/dev/peps/pep-0278/

http://docs.python.org/library/functions.html#open
(Modes involving "U")

Cheers,
Chris
-- 
http://mail.python.org/mailman/listinfo/python-list
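
A minimal, untested sketch of the "U" suggestion above (the filename is made
up; in Python 3 this translation is the default behaviour of text mode):

# Universal newlines: '\r', '\n' and '\r\n' all terminate a line,
# and each line comes back ending in a plain '\n'.
f = open('mixed.txt', 'rU')
try:
    for line in f:
        print repr(line)
finally:
    f.close()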


Re: text file

2011-06-30 Thread John Gordon
In <15d8f853-7c87-427b-8f21-e8537bde8...@x12g2000yql.googlegroups.com> Siboniso 
Shangase  writes:

> I want to type this data into a text file. It's all the same; the only
> difference is the number, which increases, and I cannot write this up
> myself since it goes up to 5000 samples.

> Data\ja1.wav Data\ja1.mfc
> .
> .
> Data\ja(n).wav Data\ja(n).mfc

> Data\ma1.wav Data\ma1.mfc
> .
> Data\ma(n).wav Data\ma(n).mfc

This is a simple python program to do what you want:

  samples = 5000
  for i in range(1, samples):
      print "Data\\ja%d.wav Data\\ja%d.mfc" % (i, i)
  for i in range(1, samples):
      print "Data\\ma%d.wav Data\\ma%d.mfc" % (i, i)

Replace the number 5000 with however many repetitions you want.  (The
loop will stop at one less than the number, so if you want 5000 exactly,
use 5001.)

Then run the program like this from your command line:

  python samples.py > textfile

And it will save the output in "textfile".

-- 
John Gordon   A is for Amy, who fell down the stairs
gor...@panix.com  B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file

2011-06-30 Thread MRAB

On 01/07/2011 01:19, Siboniso Shangase wrote:

Hi
I'm very new to Python and I need help, please!

I want to type this data into a text file. It's all the same; the only
difference is the number, which increases, and I cannot write this up
myself since it goes up to 5000 samples.

Data\ja1.wav Data\ja1.mfc
Data\ja2.wav Data\ja2.mfc
Data\ja3.wav Data\ja3.mfc
Data\ja4.wav Data\ja4.mfc
.
.
.
.
Data\ja(n).wav Data\ja(n).mfc

Data\ma1.wav Data\ma1.mfc
Data\ma2.wav Data\ma2.mfc
Data\ma3.wav Data\ma3.mfc
Data\ma4.wav Data\ma4.mfc
.
.
.
Data\ma(n).wav Data\ma(n).mfc


This should give you a start:

path_of_samples_file = "samples.txt"

with open(path_of_samples_file, "w") as samples_file:
    for index in range(1, 101):
        samples_file.write("Data\\ja{0}.wav Data\\ja{0}.mfc\n".format(index))

--
http://mail.python.org/mailman/listinfo/python-list


Re: text file

2011-06-30 Thread Josh Benner
import os

lst = []
for x in xrange(1, 5001):
    lst.append(r"Data\ma{0}.wav Data\ma{0}.mfc".format(x))
    lst.insert(x-1, r"Data\ja{0}.wav Data\ja{0}.mfc".format(x))

with open("filename.txt", "w") as fd:
    sep = os.linesep
    fd.write(sep.join(lst))


On Thu, Jun 30, 2011 at 5:19 PM, Siboniso Shangase
wrote:

> Hi
> I'm very new to Python and I need help, please!
>
> I want to type this data into a text file. It's all the same; the only
> difference is the number, which increases, and I cannot write this up
> myself since it goes up to 5000 samples.
>
> Data\ja1.wav Data\ja1.mfc
> Data\ja2.wav Data\ja2.mfc
> Data\ja3.wav Data\ja3.mfc
> Data\ja4.wav Data\ja4.mfc
> .
> .
> .
> .
> Data\ja(n).wav Data\ja(n).mfc
>
> Data\ma1.wav Data\ma1.mfc
> Data\ma2.wav Data\ma2.mfc
> Data\ma3.wav Data\ma3.mfc
> Data\ma4.wav Data\ma4.mfc
> .
> .
> .
> Data\ma(n).wav Data\ma(n).mfc
>
> --
> http://mail.python.org/mailman/listinfo/python-list
>
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file reformatting

2010-11-02 Thread iwawi
On Nov 1, 6:50 pm, "cbr...@cbrownsystems.com"
 wrote:
> On Nov 1, 1:58 am, iwawi  wrote:
>
>
>
>
>
> > On 1 marras, 09:59, "cbr...@cbrownsystems.com"
>
> >  wrote:
> > > On Oct 31, 11:46 pm, iwawi  wrote:
>
> > > > On 31 loka, 21:48, Tim Chase  wrote:
>
> > > > > > PRJ01001 4 00100END
> > > > > > PRJ01002 3 00110END
>
> > > > > > I would like to pick only some columns to a new file and put them 
> > > > > > to a
> > > > > > certain places (to match previous data) - definition file (def.csv)
> > > > > > could be something like this:
>
> > > > > > VARIABLE   FIELDSTARTS     FIELD SIZE      NEW PLACE IN NEW DATA 
> > > > > > FILE
> > > > > > ProjID     ;       1       ;       5       ;       1
> > > > > > CaseID     ;       6       ;       3       ;       10
> > > > > > UselessV  ;        10      ;       1       ;
> > > > > > Zipcode    ;       12      ;       5       ;       15
>
> > > > > > So the new datafile should look like this:
>
> > > > > > PRJ01    001       00100END
> > > > > > PRJ01    002       00110END
>
> > > > > How flexible is the def.csv format?  The difficulty I see with
> > > > > your def.csv format is that it leaves undefined gaps (presumably
> > > > > to be filled in with spaces) and that you also have a blank "new
> > > > > place in new file" value.  If instead, you could specify the
> > > > > width to which you want to pad it and omit variables you don't
> > > > > want in the output, ordering the variables in the same order you
> > > > > want them in the output:
>
> > > > >   Variable; Start; Size; Width
> > > > >   ProjID; 1; 5; 10
> > > > >   CaseID; 6; 3; 10
> > > > >   Zipcode; 12; 5; 5
> > > > >   End; 16; 3; 3
>
> > > > > (note that I lazily use the same method to copy the END from the
> > > > > source to the destination, rather than coding specially for it)
> > > > > you could do something like this (untested)
>
> > > > >    import csv
> > > > >    f = file('def.csv', 'rb')
> > > > >    f.next() # discard the header row
> > > > >    r = csv.reader(f, delimiter=';')
> > > > >    fields = [
> > > > >      (varname, slice(int(start), int(start)+int(size)), width)
> > > > >      for varname, start, size, width
> > > > >      in r
> > > > >      ]
> > > > >    f.close()
> > > > >    out = file('out.txt', 'w')
> > > > >    try:
> > > > >      for row in file('data.txt'):
> > > > >        for varname, slc, width in fields:
> > > > >          out.write(row[slc].ljust(width))
> > > > >        out.write('\n')
> > > > >    finally:
> > > > >      out.close()
>
> > > > > Hope that's fairly easy to follow and makes sense.  There might
> > > > > be some fence-posting errors (particularly your use of "1" as the
> > > > > initial offset, while python uses "0" as the initial offset for
> > > > > strings)
>
> > > > > If you can't modify the def.csv format, then things are a bit
> > > > > more complex and I'd almost be tempted to write a script to try
> > > > > and convert your existing def.csv format into something simpler
> > > > > to process like what I describe.
>
> > > > > -tkc
>
> > > > Hi,
>
> > > > Thanks for your reply.
>
> > > > Def.csv could be modified so that every line has the same structure:
> > > > variable name, field start, field size and new place and would be
> > > > separated with semicolomns as you mentioned.
>
> > > > I tried your script (which seems quite logical) but I get this
>
> > > > Traceback (most recent call last):
> > > >   File "testing.py", line 16, in 
> > > >     out.write (row[slc].ljust(width))
> > > > TypeError: an integer is required
>
> > > > Yes - you said it was untested, but I can't figure out how to
> > > > proceed...
>
> > > The line
>
> > >     (varname, slice(int(start), int(start)+int(size)), width)
>
> > > should instead be
>
> > >     (varname, slice(int(start), int(start)+int(size)), int(width))
>
> > > although you give an example where there is no width - what does that
> > > imply? In the above case, it will throw an exception.
>
> > > Anyway, I think you'll find there's something a bit off in the output
> > > loop with the parameter passed to ljust() as well. The value given in
> > > your csv seems to be the absolute position, but as it's implemented by
> > > Tim, it acts as the relative position.
>
> > > Given Tim's parsing into the list fields, I have a feeling that what
> > > you really want instead of
>
> > >     for varname, slc, width in fields:
> > >         out.write(row[slc].ljust(width))
> > >     out.write('\n')
>
> > > is to have
>
> > >     s = ''
> > >     for varname, slc, width in fields:
> > >         s += " "*(width - len(s)) + row[slc]
> > >     out.write(s+'\n')
>
> > > And if that is what you want, then you will surely want to globally
> > > replace the name 'width' with for example 'start_column', because then
> > > it all makes sense :).
>
> > > Cheers - Chas
>
> > Yes, it's meant to be the absolute column position in a new file like
> > you said.

Re: text file reformatting

2010-11-01 Thread cbr...@cbrownsystems.com
On Nov 1, 1:58 am, iwawi  wrote:
> On 1 marras, 09:59, "cbr...@cbrownsystems.com"
>
>
>
>  wrote:
> > On Oct 31, 11:46 pm, iwawi  wrote:
>
> > > On 31 loka, 21:48, Tim Chase  wrote:
>
> > > > > PRJ01001 4 00100END
> > > > > PRJ01002 3 00110END
>
> > > > > I would like to pick only some columns to a new file and put them to a
> > > > > certain places (to match previous data) - definition file (def.csv)
> > > > > could be something like this:
>
> > > > > VARIABLE   FIELDSTARTS     FIELD SIZE      NEW PLACE IN NEW DATA FILE
> > > > > ProjID     ;       1       ;       5       ;       1
> > > > > CaseID     ;       6       ;       3       ;       10
> > > > > UselessV  ;        10      ;       1       ;
> > > > > Zipcode    ;       12      ;       5       ;       15
>
> > > > > So the new datafile should look like this:
>
> > > > > PRJ01    001       00100END
> > > > > PRJ01    002       00110END
>
> > > > How flexible is the def.csv format?  The difficulty I see with
> > > > your def.csv format is that it leaves undefined gaps (presumably
> > > > to be filled in with spaces) and that you also have a blank "new
> > > > place in new file" value.  If instead, you could specify the
> > > > width to which you want to pad it and omit variables you don't
> > > > want in the output, ordering the variables in the same order you
> > > > want them in the output:
>
> > > >   Variable; Start; Size; Width
> > > >   ProjID; 1; 5; 10
> > > >   CaseID; 6; 3; 10
> > > >   Zipcode; 12; 5; 5
> > > >   End; 16; 3; 3
>
> > > > (note that I lazily use the same method to copy the END from the
> > > > source to the destination, rather than coding specially for it)
> > > > you could do something like this (untested)
>
> > > >    import csv
> > > >    f = file('def.csv', 'rb')
> > > >    f.next() # discard the header row
> > > >    r = csv.reader(f, delimiter=';')
> > > >    fields = [
> > > >      (varname, slice(int(start), int(start)+int(size)), width)
> > > >      for varname, start, size, width
> > > >      in r
> > > >      ]
> > > >    f.close()
> > > >    out = file('out.txt', 'w')
> > > >    try:
> > > >      for row in file('data.txt'):
> > > >        for varname, slc, width in fields:
> > > >          out.write(row[slc].ljust(width))
> > > >        out.write('\n')
> > > >    finally:
> > > >      out.close()
>
> > > > Hope that's fairly easy to follow and makes sense.  There might
> > > > be some fence-posting errors (particularly your use of "1" as the
> > > > initial offset, while python uses "0" as the initial offset for
> > > > strings)
>
> > > > If you can't modify the def.csv format, then things are a bit
> > > > more complex and I'd almost be tempted to write a script to try
> > > > and convert your existing def.csv format into something simpler
> > > > to process like what I describe.
>
> > > > -tkc
>
> > > Hi,
>
> > > Thanks for your reply.
>
> > > Def.csv could be modified so that every line has the same structure:
> > > variable name, field start, field size and new place and would be
> > > separated with semicolomns as you mentioned.
>
> > > I tried your script (which seems quite logical) but I get this
>
> > > Traceback (most recent call last):
> > >   File "testing.py", line 16, in 
> > >     out.write (row[slc].ljust(width))
> > > TypeError: an integer is required
>
> > > Yes - you said it was untested, but I can't figure out how to
> > > proceed...
>
> > The line
>
> >     (varname, slice(int(start), int(start)+int(size)), width)
>
> > should instead be
>
> >     (varname, slice(int(start), int(start)+int(size)), int(width))
>
> > although you give an example where there is no width - what does that
> > imply? In the above case, it will throw an exception.
>
> > Anyway, I think you'll find there's something a bit off in the output
> > loop with the parameter passed to ljust() as well. The value given in
> > your csv seems to be the absolute position, but as it's implemented by
> > Tim, it acts as the relative position.
>
> > Given Tim's parsing into the list fields, I have a feeling that what
> > you really want instead of
>
> >     for varname, slc, width in fields:
> >         out.write(row[slc].ljust(width))
> >     out.write('\n')
>
> > is to have
>
> >     s = ''
> >     for varname, slc, width in fields:
> >         s += " "*(width - len(s)) + row[slc]
> >     out.write(s+'\n')
>
> > And if that is what you want, then you will surely want to globally
> > replace the name 'width' with for example 'start_column', because then
> > it all makes sense :).
>
> > Cheers - Chas
>
> Yes, it's meant to be the absolute column position in a new file like
> you said.
>
> I used your changes to the csv reading because it seems more flexible,
> but the end of the code is still not working. Here's where I stand now:
>
> import re
>
> parse_columns = re.compile(r'\s*;\s*')
>
> f = file('def.csv'

Re: text file reformatting

2010-11-01 Thread iwawi
On 1 Nov, 09:59, "cbr...@cbrownsystems.com"
 wrote:
> On Oct 31, 11:46 pm, iwawi  wrote:
>
>
>
>
>
> > On 31 loka, 21:48, Tim Chase  wrote:
>
> > > > PRJ01001 4 00100END
> > > > PRJ01002 3 00110END
>
> > > > I would like to pick only some columns to a new file and put them to a
> > > > certain places (to match previous data) - definition file (def.csv)
> > > > could be something like this:
>
> > > > VARIABLE   FIELDSTARTS     FIELD SIZE      NEW PLACE IN NEW DATA FILE
> > > > ProjID     ;       1       ;       5       ;       1
> > > > CaseID     ;       6       ;       3       ;       10
> > > > UselessV  ;        10      ;       1       ;
> > > > Zipcode    ;       12      ;       5       ;       15
>
> > > > So the new datafile should look like this:
>
> > > > PRJ01    001       00100END
> > > > PRJ01    002       00110END
>
> > > How flexible is the def.csv format?  The difficulty I see with
> > > your def.csv format is that it leaves undefined gaps (presumably
> > > to be filled in with spaces) and that you also have a blank "new
> > > place in new file" value.  If instead, you could specify the
> > > width to which you want to pad it and omit variables you don't
> > > want in the output, ordering the variables in the same order you
> > > want them in the output:
>
> > >   Variable; Start; Size; Width
> > >   ProjID; 1; 5; 10
> > >   CaseID; 6; 3; 10
> > >   Zipcode; 12; 5; 5
> > >   End; 16; 3; 3
>
> > > (note that I lazily use the same method to copy the END from the
> > > source to the destination, rather than coding specially for it)
> > > you could do something like this (untested)
>
> > >    import csv
> > >    f = file('def.csv', 'rb')
> > >    f.next() # discard the header row
> > >    r = csv.reader(f, delimiter=';')
> > >    fields = [
> > >      (varname, slice(int(start), int(start)+int(size)), width)
> > >      for varname, start, size, width
> > >      in r
> > >      ]
> > >    f.close()
> > >    out = file('out.txt', 'w')
> > >    try:
> > >      for row in file('data.txt'):
> > >        for varname, slc, width in fields:
> > >          out.write(row[slc].ljust(width))
> > >        out.write('\n')
> > >    finally:
> > >      out.close()
>
> > > Hope that's fairly easy to follow and makes sense.  There might
> > > be some fence-posting errors (particularly your use of "1" as the
> > > initial offset, while python uses "0" as the initial offset for
> > > strings)
>
> > > If you can't modify the def.csv format, then things are a bit
> > > more complex and I'd almost be tempted to write a script to try
> > > and convert your existing def.csv format into something simpler
> > > to process like what I describe.
>
> > > -tkc
>
> > Hi,
>
> > Thanks for your reply.
>
> > Def.csv could be modified so that every line has the same structure:
> > variable name, field start, field size and new place and would be
> > separated with semicolomns as you mentioned.
>
> > I tried your script (which seems quite logical) but I get this
>
> > Traceback (most recent call last):
> >   File "testing.py", line 16, in 
> >     out.write (row[slc].ljust(width))
> > TypeError: an integer is required
>
> > Yes - you said it was untested, but I can't figure out how to
> > proceed...
>
> The line
>
>     (varname, slice(int(start), int(start)+int(size)), width)
>
> should instead be
>
>     (varname, slice(int(start), int(start)+int(size)), int(width))
>
> although you give an example where there is no width - what does that
> imply? In the above case, it will throw an exception.
>
> Anyway, I think you'll find there's something a bit off in the output
> loop with the parameter passed to ljust() as well. The value given in
> your csv seems to be the absolute position, but as it's implemented by
> Tim, it acts as the relative position.
>
> Given Tim's parsing into the list fields, I have a feeling that what
> you really want instead of
>
>     for varname, slc, width in fields:
>         out.write(row[slc].ljust(width))
>     out.write('\n')
>
> is to have
>
>     s = ''
>     for varname, slc, width in fields:
>         s += " "*(width - len(s)) + row[slc]
>     out.write(s+'\n')
>
> And if that is what you want, then you will surely want to globally
> replace the name 'width' with for example 'start_column', because then
> it all makes sense :).
>
> Cheers - Chas

Yes, it's meant to be the absolute column position in a new file like
you said.

I used your changes to the csv reading because it seems more flexible,
but the end of the code is still not working. Here's where I stand now:

import re

parse_columns = re.compile(r'\s*;\s*')

f = file('def.csv', 'rb')
f.readline() # discard the header row
r = (parse_columns.split(line.strip()) for line in f)
fields = [
    (varname, slice(int(start), int(start)+int(size), int(width) if width else 0))
    for varname, start, size, width in r
    ]
f.close()
p

Re: text file reformatting

2010-11-01 Thread cbr...@cbrownsystems.com
On Oct 31, 11:46 pm, iwawi  wrote:
> On 31 loka, 21:48, Tim Chase  wrote:
>
>
>
> > > PRJ01001 4 00100END
> > > PRJ01002 3 00110END
>
> > > I would like to pick only some columns to a new file and put them to a
> > > certain places (to match previous data) - definition file (def.csv)
> > > could be something like this:
>
> > > VARIABLE   FIELDSTARTS     FIELD SIZE      NEW PLACE IN NEW DATA FILE
> > > ProjID     ;       1       ;       5       ;       1
> > > CaseID     ;       6       ;       3       ;       10
> > > UselessV  ;        10      ;       1       ;
> > > Zipcode    ;       12      ;       5       ;       15
>
> > > So the new datafile should look like this:
>
> > > PRJ01    001       00100END
> > > PRJ01    002       00110END
>
> > How flexible is the def.csv format?  The difficulty I see with
> > your def.csv format is that it leaves undefined gaps (presumably
> > to be filled in with spaces) and that you also have a blank "new
> > place in new file" value.  If instead, you could specify the
> > width to which you want to pad it and omit variables you don't
> > want in the output, ordering the variables in the same order you
> > want them in the output:
>
> >   Variable; Start; Size; Width
> >   ProjID; 1; 5; 10
> >   CaseID; 6; 3; 10
> >   Zipcode; 12; 5; 5
> >   End; 16; 3; 3
>
> > (note that I lazily use the same method to copy the END from the
> > source to the destination, rather than coding specially for it)
> > you could do something like this (untested)
>
> >    import csv
> >    f = file('def.csv', 'rb')
> >    f.next() # discard the header row
> >    r = csv.reader(f, delimiter=';')
> >    fields = [
> >      (varname, slice(int(start), int(start)+int(size)), width)
> >      for varname, start, size, width
> >      in r
> >      ]
> >    f.close()
> >    out = file('out.txt', 'w')
> >    try:
> >      for row in file('data.txt'):
> >        for varname, slc, width in fields:
> >          out.write(row[slc].ljust(width))
> >        out.write('\n')
> >    finally:
> >      out.close()
>
> > Hope that's fairly easy to follow and makes sense.  There might
> > be some fence-posting errors (particularly your use of "1" as the
> > initial offset, while python uses "0" as the initial offset for
> > strings)
>
> > If you can't modify the def.csv format, then things are a bit
> > more complex and I'd almost be tempted to write a script to try
> > and convert your existing def.csv format into something simpler
> > to process like what I describe.
>
> > -tkc
>
> Hi,
>
> Thanks for your reply.
>
> Def.csv could be modified so that every line has the same structure:
> variable name, field start, field size and new place and would be
> separated with semicolons as you mentioned.
>
> I tried your script (which seems quite logical) but I get this
>
> Traceback (most recent call last):
>   File "testing.py", line 16, in 
>     out.write (row[slc].ljust(width))
> TypeError: an integer is required
>
> Yes - you said it was untested, but I can't figure out how to
> proceed...

The line

(varname, slice(int(start), int(start)+int(size)), width)

should instead be

(varname, slice(int(start), int(start)+int(size)), int(width))

although you give an example where there is no width - what does that
imply? In the above case, it will throw an exception.

Anyway, I think you'll find there's something a bit off in the output
loop with the parameter passed to ljust() as well. The value given in
your csv seems to be the absolute position, but as it's implemented by
Tim, it acts as the relative position.

Given Tim's parsing into the list fields, I have a feeling that what
you really want instead of

for varname, slc, width in fields:
    out.write(row[slc].ljust(width))
out.write('\n')

is to have

s = ''
for varname, slc, width in fields:
    s += " "*(width - len(s)) + row[slc]
out.write(s+'\n')

And if that is what you want, then you will surely want to globally
replace the name 'width' with for example 'start_column', because then
it all makes sense :).

Cheers - Chas

-- 
http://mail.python.org/mailman/listinfo/python-list
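
Putting Tim's csv parsing together with Chas's absolute-column padding, an
untested sketch of the whole script could look like the following. It assumes
the four-column def.csv layout discussed above, drops any field whose output
position is blank (like UselessV), and treats the Start values as 0-based
slice offsets, so the off-by-one adjustment Tim mentions may still be needed:

import csv

f = file('def.csv', 'rb')
f.next()                                 # discard the header row
fields = []
for rec in csv.reader(f, delimiter=';'):
    varname, start, size, new_start = [c.strip() for c in rec]
    if not new_start:                    # no output position given: skip the field
        continue
    fields.append((varname,
                   slice(int(start), int(start) + int(size)),
                   int(new_start)))
f.close()

out = file('out.txt', 'w')
try:
    for line in file('data.txt'):
        row = line.rstrip('\r\n')
        s = ''
        # pad with spaces up to each field's absolute start column
        for varname, slc, start_column in fields:
            s += ' ' * (start_column - len(s)) + row[slc]
        out.write(s + '\n')
finally:
    out.close()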


Re: text file reformatting

2010-10-31 Thread iwawi
On 31 Oct, 21:48, Tim Chase  wrote:
> > PRJ01001 4 00100END
> > PRJ01002 3 00110END
>
> > I would like to pick only some columns to a new file and put them to a
> > certain places (to match previous data) - definition file (def.csv)
> > could be something like this:
>
> > VARIABLE   FIELDSTARTS     FIELD SIZE      NEW PLACE IN NEW DATA FILE
> > ProjID     ;       1       ;       5       ;       1
> > CaseID     ;       6       ;       3       ;       10
> > UselessV  ;        10      ;       1       ;
> > Zipcode    ;       12      ;       5       ;       15
>
> > So the new datafile should look like this:
>
> > PRJ01    001       00100END
> > PRJ01    002       00110END
>
> How flexible is the def.csv format?  The difficulty I see with
> your def.csv format is that it leaves undefined gaps (presumably
> to be filled in with spaces) and that you also have a blank "new
> place in new file" value.  If instead, you could specify the
> width to which you want to pad it and omit variables you don't
> want in the output, ordering the variables in the same order you
> want them in the output:
>
>   Variable; Start; Size; Width
>   ProjID; 1; 5; 10
>   CaseID; 6; 3; 10
>   Zipcode; 12; 5; 5
>   End; 16; 3; 3
>
> (note that I lazily use the same method to copy the END from the
> source to the destination, rather than coding specially for it)
> you could do something like this (untested)
>
>    import csv
>    f = file('def.csv', 'rb')
>    f.next() # discard the header row
>    r = csv.reader(f, delimiter=';')
>    fields = [
>      (varname, slice(int(start), int(start)+int(size)), width)
>      for varname, start, size, width
>      in r
>      ]
>    f.close()
>    out = file('out.txt', 'w')
>    try:
>      for row in file('data.txt'):
>        for varname, slc, width in fields:
>          out.write(row[slc].ljust(width))
>        out.write('\n')
>    finally:
>      out.close()
>
> Hope that's fairly easy to follow and makes sense.  There might
> be some fence-posting errors (particularly your use of "1" as the
> initial offset, while python uses "0" as the initial offset for
> strings)
>
> If you can't modify the def.csv format, then things are a bit
> more complex and I'd almost be tempted to write a script to try
> and convert your existing def.csv format into something simpler
> to process like what I describe.
>
> -tkc

Hi,

Thanks for your reply.

Def.csv could be modified so that every line has the same structure:
variable name, field start, field size and new place and would be
separated with semicolons as you mentioned.

I tried your script (which seems quite logical) but I get this

Traceback (most recent call last):
  File "testing.py", line 16, in 
out.write (row[slc].ljust(width))
TypeError: an integer is required

Yes - you said it was untested, but I can't figure out how to
proceed...
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file reformatting

2010-10-31 Thread cbr...@cbrownsystems.com
On Oct 31, 12:48 pm, Tim Chase  wrote:
> > PRJ01001 4 00100END
> > PRJ01002 3 00110END
>
> > I would like to pick only some columns to a new file and put them to a
> > certain places (to match previous data) - definition file (def.csv)
> > could be something like this:
>
> > VARIABLE   FIELDSTARTS     FIELD SIZE      NEW PLACE IN NEW DATA FILE
> > ProjID     ;       1       ;       5       ;       1
> > CaseID     ;       6       ;       3       ;       10
> > UselessV  ;        10      ;       1       ;
> > Zipcode    ;       12      ;       5       ;       15
>
> > So the new datafile should look like this:
>
> > PRJ01    001       00100END
> > PRJ01    002       00110END
>
> How flexible is the def.csv format?  The difficulty I see with
> your def.csv format is that it leaves undefined gaps (presumably
> to be filled in with spaces) and that you also have a blank "new
> place in new file" value.  If instead, you could specify the
> width to which you want to pad it and omit variables you don't
> want in the output, ordering the variables in the same order you
> want them in the output:
>
>   Variable; Start; Size; Width
>   ProjID; 1; 5; 10
>   CaseID; 6; 3; 10
>   Zipcode; 12; 5; 5
>   End; 16; 3; 3
>
> (note that I lazily use the same method to copy the END from the
> source to the destination, rather than coding specially for it)
> you could do something like this (untested)
>
>    import csv
>    f = file('def.csv', 'rb')
>    f.next() # discard the header row
>    r = csv.reader(f, delimiter=';')
>    fields = [
>      (varname, slice(int(start), int(start)+int(size)), width)
>      for varname, start, size, width
>      in r
>      ]
>    f.close()
>    out = file('out.txt', 'w')
>    try:
>      for row in file('data.txt'):
>        for varname, slc, width in fields:
>          out.write(row[slc].ljust(width))
>        out.write('\n')
>    finally:
>      out.close()
>
> Hope that's fairly easy to follow and makes sense.  There might
> be some fence-posting errors (particularly your use of "1" as the
> initial offset, while python uses "0" as the initial offset for
> strings)
>
> If you can't modify the def.csv format, then things are a bit
> more complex and I'd almost be tempted to write a script to try
> and convert your existing def.csv format into something simpler
> to process like what I describe.
>
> -tkc

To your point about the non-standard csv encoding in the defs.csv file,
you could use a regular expression instead of the csv module to solve that:

import re

parse_columns = re.compile(r'\s*;\s*')

f = file('defs.csv', 'rb')
f.readline() # discard the header row
r = (parse_columns.split(line.strip()) for line in f)
fields = [
    (varname, slice(int(start), int(start)+int(size), int(width) if width else 0))
    for varname, start, size, width in r
    ]
f.close()

which given the OP's csv produces for fields:

[('ProjID', slice(1, 6, 1)), ('CaseID', slice(6, 9, 10)), ('UselessV',
slice(10, 11, 0)), ('Zipcode', slice(12, 17, 15))]

and that should work with the remainder of your original code;
although perhaps the OP wants something else to happen when width is
omitted from the csv...

Cheers - Chas

-- 
http://mail.python.org/mailman/listinfo/python-list


RE: text file reformatting

2010-10-31 Thread Braden Faulkner

Sorry, to clarify: I was having issues getting this to work. I'm relatively new
to Python. Sorry for the miscommunication.

> Date: Sun, 31 Oct 2010 16:13:42 -0500
> From: python.l...@tim.thechases.com
> To: brad...@hotmail.com
> CC: python-list@python.org
> Subject: Re: text file reformatting
> 
> On 10/31/10 14:52, Braden Faulkner wrote:
> >> import csv
> >> f = file('def.csv', 'rb')
> >> f.next() # discard the header row
> >> r = csv.reader(f, delimiter=';')
> >> fields = [
> >>   (varname, slice(int(start), int(start)+int(size)), width)
> >>   for varname, start, size, width
> >>   in r
> >>   ]
> >> f.close()
> >> out = file('out.txt', 'w')
> >> try:
> >>   for row in file('data.txt'):
> >> for varname, slc, width in fields:
> >>   out.write(row[slc].ljust(width))
> >> out.write('\n')
> >> finally:
> >>   out.close()
> >
> > I also am having issues with this.
> 
> [top-posting fixed -- it's generally frowned upon in this 
> newsgroup/mailing-list and adherence to the preferences will tend 
> to get you a wider audience]
> 
> Are your issues with my code, or with the topic at hand?  If it's 
> my code, note my comment about it being untested.  If it's the 
> topic at hand, I recommend trying my code (or a variation 
> there-of after you've tested it).
> 
> -tkc
> 
> 
  -- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file reformatting

2010-10-31 Thread Tim Chase

On 10/31/10 14:52, Braden Faulkner wrote:

import csv
f = file('def.csv', 'rb')
f.next() # discard the header row
r = csv.reader(f, delimiter=';')
fields = [
  (varname, slice(int(start), int(start)+int(size)), width)
  for varname, start, size, width
  in r
  ]
f.close()
out = file('out.txt', 'w')
try:
  for row in file('data.txt'):
for varname, slc, width in fields:
  out.write(row[slc].ljust(width))
out.write('\n')
finally:
  out.close()


I also am having issues with this.


[top-posting fixed -- it's generally frowned upon in this 
newsgroup/mailing-list and adherence to the preferences will tend 
to get you a wider audience]


Are your issues with my code, or with the topic at hand?  If it's 
my code, note my comment about it being untested.  If it's the 
topic at hand, I recommend trying my code (or a variation 
there-of after you've tested it).


-tkc


--
http://mail.python.org/mailman/listinfo/python-list


RE: text file reformatting

2010-10-31 Thread Braden Faulkner

I also am having issues with this.

> Date: Sun, 31 Oct 2010 14:48:09 -0500
> From: python.l...@tim.thechases.com
> To: iwawi...@gmail.com
> Subject: Re: text file reformatting
> CC: python-list@python.org
> 
> > PRJ01001 4 00100END
> > PRJ01002 3 00110END
> >
> > I would like to pick only some columns to a new file and put them to a
> > certain places (to match previous data) - definition file (def.csv)
> > could be something like this:
> >
> > VARIABLE   FIELD STARTS   FIELD SIZE   NEW PLACE IN NEW DATA FILE
> > ProjID     ;   1          ;   5        ;   1
> > CaseID     ;   6          ;   3        ;   10
> > UselessV   ;   10         ;   1        ;
> > Zipcode    ;   12         ;   5        ;   15
> >
> > So the new datafile should look like this:
> >
> > PRJ01    001       00100END
> > PRJ01    002       00110END
> 
> 
> How flexible is the def.csv format?  The difficulty I see with 
> your def.csv format is that it leaves undefined gaps (presumably 
> to be filled in with spaces) and that you also have a blank "new 
> place in new file" value.  If instead, you could specify the 
> width to which you want to pad it and omit variables you don't 
> want in the output, ordering the variables in the same order you 
> want them in the output:
> 
>   Variable; Start; Size; Width
>   ProjID; 1; 5; 10
>   CaseID; 6; 3; 10
>   Zipcode; 12; 5; 5
>   End; 16; 3; 3
> 
> (note that I lazily use the same method to copy the END from the 
> source to the destination, rather than coding specially for it) 
> you could do something like this (untested)
> 
>import csv
>f = file('def.csv', 'rb')
>f.next() # discard the header row
>r = csv.reader(f, delimiter=';')
>fields = [
>  (varname, slice(int(start), int(start)+int(size)), width)
>  for varname, start, size, width
>  in r
>  ]
>f.close()
>out = file('out.txt', 'w')
>try:
>  for row in file('data.txt'):
>for varname, slc, width in fields:
>  out.write(row[slc].ljust(width))
>out.write('\n')
>finally:
>  out.close()
> 
> Hope that's fairly easy to follow and makes sense.  There might 
> be some fence-posting errors (particularly your use of "1" as the 
> initial offset, while python uses "0" as the initial offset for 
> strings)
> 
> If you can't modify the def.csv format, then things are a bit 
> more complex and I'd almost be tempted to write a script to try 
> and convert your existing def.csv format into something simpler 
> to process like what I describe.
> 
> -tkc
> 
> 
> -- 
> http://mail.python.org/mailman/listinfo/python-list
  -- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file reformatting

2010-10-31 Thread Tim Chase

PRJ01001 4 00100END
PRJ01002 3 00110END

I would like to pick only some columns to a new file and put them to a
certain places (to match previous data) - definition file (def.csv)
could be something like this:

VARIABLE   FIELD STARTS   FIELD SIZE   NEW PLACE IN NEW DATA FILE
ProjID     ;   1          ;   5        ;   1
CaseID     ;   6          ;   3        ;   10
UselessV   ;   10         ;   1        ;
Zipcode    ;   12         ;   5        ;   15

So the new datafile should look like this:

PRJ01    001       00100END
PRJ01    002       00110END



How flexible is the def.csv format?  The difficulty I see with 
your def.csv format is that it leaves undefined gaps (presumably 
to be filled in with spaces) and that you also have a blank "new 
place in new file" value.  If instead, you could specify the 
width to which you want to pad it and omit variables you don't 
want in the output, ordering the variables in the same order you 
want them in the output:


 Variable; Start; Size; Width
 ProjID; 1; 5; 10
 CaseID; 6; 3; 10
 Zipcode; 12; 5; 5
 End; 16; 3; 3

(note that I lazily use the same method to copy the END from the 
source to the destination, rather than coding specially for it) 
you could do something like this (untested)


  import csv
  f = file('def.csv', 'rb')
  f.next() # discard the header row
  r = csv.reader(f, delimiter=';')
  fields = [
(varname, slice(int(start), int(start)+int(size)), width)
for varname, start, size, width
in r
]
  f.close()
  out = file('out.txt', 'w')
  try:
for row in file('data.txt'):
  for varname, slc, width in fields:
out.write(row[slc].ljust(width))
  out.write('\n')
  finally:
out.close()

Hope that's fairly easy to follow and makes sense.  There might 
be some fence-posting errors (particularly your use of "1" as the 
initial offset, while python uses "0" as the initial offset for 
strings)


If you can't modify the def.csv format, then things are a bit 
more complex and I'd almost be tempted to write a script to try 
and convert your existing def.csv format into something simpler 
to process like what I describe.


-tkc


--
http://mail.python.org/mailman/listinfo/python-list


Re: Text file to XML representation

2009-10-22 Thread Jorgen Grahn
On Wed, 2009-10-21, kak...@gmail.com wrote:
> Hello,
> I would like to make a program that takes a text file with the
> following representation:
>
> outlook = sunny
> |   humidity <= 70: yes (2.0)
> |   humidity > 70: no (3.0)
> outlook = overcast: yes (4.0)
> outlook = rainy
> |   windy = TRUE: no (2.0)
> |   windy = FALSE: yes (3.0)
>
> and convert it to xml file for example:
> 
> 
...
> 
>
> Is there a way to do it?

No. Impossible.

No, of course it is possible.  I'd think of it as a problem of (a)
making clear to yourself what the input format (language) is, (b)
writing a parser for it (which transforms it into Python data structures)
and (c) writing code to dump the data structures according to some DTD
(or whatever the XML people call it these days).

(c) seems to be the easy part.

/Jorgen

-- 
  // Jorgen GrahnO  o   .
-- 
http://mail.python.org/mailman/listinfo/python-list
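
For what it's worth, a minimal untested sketch of Jorgen's steps (b) and (c),
assuming the '|' count encodes each line's depth and using the stdlib's
xml.etree.ElementTree; the tag and attribute names here ("tree", "node",
"test", "leaf") are invented for illustration, not taken from the thread:

import xml.etree.ElementTree as ET

sample = """\
outlook = sunny
|   humidity <= 70: yes (2.0)
|   humidity > 70: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
|   windy = TRUE: no (2.0)
|   windy = FALSE: yes (3.0)"""

def to_xml(text):
    root = ET.Element("tree")
    stack = [(-1, root)]                   # (depth, element) pairs
    for line in text.splitlines():
        depth = line.count("|")            # step (a): '|' marks encode the depth
        body = line.replace("|", "").strip()
        test, _, leaf = body.partition(":")
        while stack[-1][0] >= depth:       # step (b): climb back up to the parent
            stack.pop()
        node = ET.SubElement(stack[-1][1], "node", test=test.strip())
        if leaf.strip():
            node.set("leaf", leaf.strip()) # step (c): leaf labels become attributes
        stack.append((depth, node))
    return root

print ET.tostring(to_xml(sample))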


Re: Text file to XML representation

2009-10-21 Thread Bruno Desthuilliers

kak...@gmail.com wrote:

Hello,
I would like to make a program that takes a text file with the
following representation:

outlook = sunny
|   humidity <= 70: yes (2.0)
|   humidity > 70: no (3.0)
outlook = overcast: yes (4.0)
outlook = rainy
|   windy = TRUE: no (2.0)
|   windy = FALSE: yes (3.0)

and convert it to xml file for example:


(snip xml)


Is there a way to do it?


More than one. But I'd stronly suggest something like PyParsing + 
ElementTree.


PyParsing : http://pyparsing.wikispaces.com/

ElementTree : is now in the stdlib, so refer to the FineManual


--
http://mail.python.org/mailman/listinfo/python-list


Re: text file

2008-10-01 Thread Tim Golden

[EMAIL PROTECTED] wrote:

Hi all,
I have a problem with the code below. I have a list of servers in
a text file (elencopc.txt) and I would like to retrieve information via WMI
(a for loop), but I don't understand if the code is correct:



Try this, using http://timgolden.me.uk/python/wmi.html :


import wmi

#
# For the test to work
#
open ("elencopc.txt", "w").write ("localhost")

for server in open ("elencopc.txt").read ().splitlines ():
  c = wmi.WMI (server)
  print "SERVER:", server
  for item in c.Win32_QuickFixEngineering ():
    print item # or print item.Caption, etc.
  print
  print



If you get RPC Server unavailable, it usually means that
the WMI service isn't running on that machine. Usually.

TJG
--
http://mail.python.org/mailman/listinfo/python-list
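
A small, untested variation on the loop above that just reports servers it
cannot reach and carries on; a broad except is used here rather than guessing
the wmi module's exact exception class:

import wmi

for server in open("elencopc.txt").read().splitlines():
    try:
        c = wmi.WMI(server)
    except Exception, e:
        # most likely the RPC/WMI service is unreachable on that machine
        print "SERVER:", server, "unreachable:", e
        continue
    print "SERVER:", server
    for item in c.Win32_QuickFixEngineering():
        print item.HotFixID, item.Description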


Re: text file

2008-10-01 Thread Lie Ryan
On Wed, 01 Oct 2008 07:19:44 -0700, yqyq22 wrote:
> My problem is how to translate this vbs in python:
> 
> Dim fso
> Dim strComputer
> Set fso = CreateObject("Scripting.FileSystemObject") Set ElencoPC =
> fso.OpenTextFile("elencoPC.txt" , 1, False) Do Until
> ElencoPC.AtEndOfStream
> strComputer = ElencoPC.ReadLine
> 
> thanks

try this:

fso = open('elencoPC.txt', 'r')
for line in fso:
    strComputer = line.rstrip('\n')   # equivalent of ReadLine: drop the newline


--
http://mail.python.org/mailman/listinfo/python-list


Re: text file

2008-10-01 Thread yqyq22
On Oct 1, 4:03 pm, [EMAIL PROTECTED] wrote:
> Hi all,
> I have a problem with the code below. I have a list of servers in
> a text file (elencopc.txt) and I would like to retrieve information via WMI
> (a for loop), but I don't understand if the code is correct:
>
> import win32com.client
> import string
> import sys
> listserver = open('c:\\elencopc.txt','r')
> objWMIService = win32com.client.Dispatch("WbemScripting.SWbemLocator")
> objSWbemServices = objWMIService.ConnectServer(listserver,"root
> \cimv2")
> colItems = objSWbemServices.ExecQuery("Select * from
> Win32_QuickFixEngineering")
> for objItem in colItems:
>     print "Caption: ", objItem.Caption
>     print "Description: ", objItem.Description
>     print "Fix Comments: ", objItem.FixComments
>     print "HotFix ID: ", objItem.HotFixID
>     print "Install Date: ", objItem.InstallDate
>     print "Installed By: ", objItem.InstalledBy
>     print "Installed On: ", objItem.InstalledOn
>     print "Name: ", objItem.Name
>     print "Service Pack In Effect: ", objItem.ServicePackInEffect
>     print "Status: ", objItem.Status
>
> I receive the error :
> ile "C:\Python25\Lib\site-packages\win32com\client\dynamic.py", line
> 258, in _ApplyTypes_
>     result = self._oleobj_.InvokeTypes(*(dispid, LCID, wFlags,
> retType, argTypes) + args)
> com_error: (-2147352567, 'Exception occurred.', (0, 'SWbemLocator',
> 'The RPC server is unavailable. ', None, 0, -2147023174), None)
>
> My big doubt is whether the code is correct... because if I use VBScript
> everything works fine.
> Thanks a lot in advance.

My problem is how to translate this VBScript into Python:

Dim fso
Dim strComputer
Set fso = CreateObject("Scripting.FileSystemObject")
Set ElencoPC = fso.OpenTextFile("elencoPC.txt" , 1, False)
Do Until ElencoPC.AtEndOfStream
strComputer = ElencoPC.ReadLine

thanks
--
http://mail.python.org/mailman/listinfo/python-list
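
An untested sketch of that VBScript loop in Python, reusing the win32com query
quoted above; the point is that ConnectServer needs one server name per call,
rather than the open file object the original code passed in:

import win32com.client

locator = win32com.client.Dispatch("WbemScripting.SWbemLocator")

for line in open('c:\\elencopc.txt'):
    server = line.strip()                  # one server name per line
    if not server:
        continue
    services = locator.ConnectServer(server, "root\\cimv2")
    items = services.ExecQuery("Select * from Win32_QuickFixEngineering")
    for item in items:
        print server, item.HotFixID, item.Description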


Re: text file vs. cPickle vs sqlite a design question

2007-04-11 Thread Bruno Desthuilliers
Dag wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0).  I'll have maybe 20-50 or so of these 
> lists containing anywhere from 3 to over 3 tuples.  The actions I
> need to do is either append a new tuple to the end of the list, display 
> all the tuples or display all the tuples where the last element is a 1
> 
> Basically what I'm wondering is the best way to store these data stuctures 
> to disc.  As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables.  I can see pros and cons
> with each approach.  Does anybody have any advice as to whether any of
> these approaches is obviously better than any other?  

Seems that so far, you get as many different opinions as answers - not
sure this will help much :-/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file vs. cPickle vs sqlite a design question

2007-04-11 Thread Paddy
On Apr 11, 5:40 pm, Dag <[EMAIL PROTECTED]> wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0).  I'll have maybe 20-50 or so of these
> lists containing anywhere from 3 to over 3 tuples.  The actions I
> need to do is either append a new tuple to the end of the list, display
> all the tuples or display all the tuples where the last element is a 1
>
> Basically what I'm wondering is the best way to store these data stuctures
> to disc.  As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables.  I can see pros and cons
> with each approach.  Does anybody have any advice as to whether any of
> these approaches is obviously better than any other?  On one hand I like
> the text file approach since it lets me append without loading
> everything into memory, on the other hand the sqlite approach makes it
> easy to select stuff with SELECT * FROM foo WHERE... which could be
> handy if ever need to add more advanced filtering.
>
> Dag

If you have enough resources to keep all the lists comfortably in
memory, and you have enough disk space then I would save your data as
python text. Something like:

print "# "
print "all_lists = []"
for i,l in enumerate(all_lists):
  print "all_lists.append( [  #", i
  for tpl in l:
print " ", tpl, ","
  print " ])  #", i

You would then have your data saved in a format that could easily
be re-used by other programs at a later date, and that can be
examined in any text editor.

- Paddy.

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file vs. cPickle vs sqlite a design question

2007-04-11 Thread Bruno Desthuilliers
John Machin wrote:
(snip)
> ... and a few more cents:
> 
> There are *two* relations/tables involved (at least): a "tuple" table
> and a "list" table.


Mmm... From a purely technical POV, not necessarily. If there's no need 
for anything else than distinguishing between different lists, a single 
table with a compound key (list_id, tuple_id) could be enough...


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file vs. cPickle vs sqlite a design question

2007-04-11 Thread John Machin
On Apr 12, 7:09 am, Bruno Desthuilliers
<[EMAIL PROTECTED]> wrote:
> Dag a écrit :
>
>
>
> > I have an application which works with lists of tuples of the form
> > (id_nr,'text','more text',1 or 0).  I'll have maybe 20-50 or so of these
> > lists containing anywhere from 3 to over 3 tuples.  The actions I
> > need to do is either append a new tuple to the end of the list, display
> > all the tuples or display all the tuples where the last element is a 1
>
> > Basically what I'm wondering is the best way to store these data stuctures
> > to disc.  As the subject mentioned I've basically got three approaches.
> > Store each list as a text file, pickle each list to file or shove the
> > whole thing into a bunch of database tables.  I can see pros and cons
> > with each approach.  Does anybody have any advice as to whether any of
> > these approaches is obviously better than any other?  On one hand I like
> > the text file approach since it lets me append without loading
> > everything into memory, on the other hand the sqlite approach makes it
> > easy to select stuff with SELECT * FROM foo WHERE... which could be
> > handy if ever need to add more advanced filtering.

s/if/when/

>
> Given your specs, I'd go for SQLite without any hesitation. Your data
> structure is obviously relational (a list of tuples is a pretty good
> definition of a relation), so a relational DBMS is the obvious solution,
> and you'll get lots of other benefits from it (SQL being only one of
> them - you can also think about free optimization, scalability, and
> interoperability). And if you don't like raw SQL and prefer something
> more pythonic, then you have SQLAlchemy and Elixir.
>
> My 2 cents...

... and a few more cents:

There are *two* relations/tables involved (at least): a "tuple" table
and a "list" table. The 20-50 or so lists need a unique name or number
each, and other attributes of a list are sure to come out of the
woodwork later. Each tuple will need a column containing the ID of the
list it belongs to. It's a bit boggling that (1) each tuple has an
id_nr but there's no requirement to query on it  (2) req. only to
"append" new tuples w/o checking id_nr already exists (3) req. to
"display" all of 30,000 tuples ...


-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file vs. cPickle vs sqlite a design question

2007-04-11 Thread Bruno Desthuilliers
Dag wrote:
> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0).  I'll have maybe 20-50 or so of these 
> lists containing anywhere from 3 to over 3 tuples.  The actions I
> need to do is either append a new tuple to the end of the list, display 
> all the tuples or display all the tuples where the last element is a 1
> 
> Basically what I'm wondering is the best way to store these data stuctures 
> to disc.  As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables.  I can see pros and cons
> with each approach.  Does anybody have any advice as to whether any of
> these approaches is obviously better than any other?  On one hand I like
> the text file approach since it lets me append without loading
> everything into memory, on the other hand the sqlite approach makes it
> easy to select stuff with SELECT * FROM foo WHERE... which could be
> handy if ever need to add more advanced filtering.

Given your specs, I'd go for SQLite without any hesitation. Your data 
structure is obviously relational (a list of tuples is a pretty good 
definition of a relation), so a relational DBMS is the obvious solution, 
and you'll get lots of other benefits from it (SQL being only one of 
them - you can also think about free optimization, scalability, and 
interoperability). And if you don't like raw SQL and prefer something 
more pythonic, then you have SQLAlchemy and Elixir.

My 2 cents...
-- 
http://mail.python.org/mailman/listinfo/python-list
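
A minimal sqlite3 sketch of that suggestion; the table and column names are
invented for illustration, with one row per tuple and a list name to tell the
20-50 lists apart:

import sqlite3

conn = sqlite3.connect("lists.db")
conn.execute("""CREATE TABLE IF NOT EXISTS items (
                    list_name TEXT,
                    id_nr     INTEGER,
                    text1     TEXT,
                    text2     TEXT,
                    flag      INTEGER)""")

# append a new tuple to the end of one named list
conn.execute("INSERT INTO items VALUES (?, ?, ?, ?, ?)",
             ("mylist", 42, "some text", "more text", 1))
conn.commit()

# display all the tuples where the last element is a 1
for row in conn.execute("SELECT id_nr, text1, text2, flag FROM items"
                        " WHERE list_name = ? AND flag = 1", ("mylist",)):
    print row

conn.close()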


Re: text file vs. cPickle vs sqlite a design question

2007-04-11 Thread Gabriel Genellina
On Wed, 11 Apr 2007 13:40:02 -0300, Dag <[EMAIL PROTECTED]> wrote:

> I have an application which works with lists of tuples of the form
> (id_nr,'text','more text',1 or 0).  I'll have maybe 20-50 or so of these
> lists containing anywhere from 3 to over 3 tuples.  The actions I
> need to do is either append a new tuple to the end of the list, display
> all the tuples or display all the tuples where the last element is a 1
>
> Basically what I'm wondering is the best way to store these data  
> stuctures
> to disc.  As the subject mentioned I've basically got three approaches.
> Store each list as a text file, pickle each list to file or shove the
> whole thing into a bunch of database tables.  I can see pros and cons
> with each approach.  Does anybody have any advice as to whether any of

From your description, none of these three approaches is obviously better.
Try to isolate the data from its storage, and use the easiest way now  
(pickle perhaps?).
This way you can change it later easily - maybe to use sqlite if you need  
more difficult queries.

-- 
Gabriel Genellina
-- 
http://mail.python.org/mailman/listinfo/python-list
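
A tiny sketch of that "isolate the data from its storage" idea, using cPickle
for now; the two helper functions (names invented here) are the only place
that would change if the storage later moved to sqlite:

import cPickle as pickle

def save_lists(all_lists, filename):
    # dump the whole structure in one go
    f = open(filename, 'wb')
    try:
        pickle.dump(all_lists, f, pickle.HIGHEST_PROTOCOL)
    finally:
        f.close()

def load_lists(filename):
    f = open(filename, 'rb')
    try:
        return pickle.load(f)
    finally:
        f.close()

lists = {'mylist': [(1, 'some text', 'more text', 1)]}
save_lists(lists, 'lists.pkl')
print load_lists('lists.pkl')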


Re: text file parsing (awk -> python)

2006-11-22 Thread bearophileHUGS
Peter Otten, your solution is very nice: it uses groupby, splitting on
empty lines, so it doesn't need to read the whole file into memory.

But Daniel Nogradi says:
> But the names of the fields (node, x, y) keeps changing from file to
> file, even their number is not fixed, sometimes it is (node, x, y, z).

Your version with the converters dict fails to convert the values of the
node and z fields, etc. (generally such a converters dict is an
elegant solution; it lets you define string, float, etc. fields):

> converters = dict(
> x=int,
> y=int
> )


I have created a version with a regular expression, but it's probably too
rigid; it doesn't handle files with the z field, etc.:

data = """node 10
y 1
x -1

node 11
x -2
y 1
z 5

node 12
x -3
y 1
z 6"""

import re
unpack = re.compile(r"(\D+)   \s+  ([-+]?  \d+) \s+" * 3, re.VERBOSE)

result = []
for obj in unpack.finditer(data):
    block = obj.groups()
    d = dict((block[i], int(block[i+1])) for i in xrange(0, 6, 2))
    result.append(d)

print result


So I have just modified and simplified your quite nice solution (I have
removed the pprint, but it's the same):

def open(filename):
    from cStringIO import StringIO
    return StringIO(data)

from itertools import groupby

records = []
for empty, record in groupby(open("records.txt"), key=str.isspace):
    if not empty:
        pairs = ([k, int(v)] for k, v in map(str.split, record))
        records.append(dict(pairs))

print records

Bye,
bearophile

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file parsing (awk -> python)

2006-11-22 Thread Daniel Nogradi
> > I have an awk program that parses a text file which I would like to
> > rewrite in python. The text file has multi-line records separated by
> > empty lines and each single-line field has two subfields:
> >
> > node 10
> > x -1
> > y 1
> >
> > node 11
> > x -2
> > y 1
> >
> > node 12
> > x -3
> > y 1
> >
> > and this I would like to parse into a list of dictionaries like so:
> >
> > mydict[0] = { 'node':10, 'x':-1, 'y':1 }
> > mydict[1] = { 'node':11, 'x':-2, 'y':1 }
> > mydict[2] = { 'node':12, 'x':-3', 'y':1 }
> >
> > But the names of the fields (node, x, y) keeps changing from file to
> > file, even their number is not fixed, sometimes it is (node, x, y, z).
> >
> > What would be the simples way to do this?
>
> data = """node 10
> x -1
> y 1
>
> node 11
> x -2
> y 1
>
> node 12
> x -3
> y 1
> """
>
> def open(filename):
>     from cStringIO import StringIO
>     return StringIO(data)
>
> converters = dict(
>     x=int,
>     y=int
> )
>
> def name_value(line):
>     name, value = line.split(None, 1)
>     return name, converters.get(name, str.rstrip)(value)
>
> if __name__ == "__main__":
>     from itertools import groupby
>     records = []
>
>     for empty, record in groupby(open("records.txt"), key=str.isspace):
>         if not empty:
>             records.append(dict(name_value(line) for line in record))
>
>     import pprint
>     pprint.pprint(records)


Thanks very much, that's exactly what I had in mind.

Thanks again,
Daniel
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: text file parsing (awk -> python)

2006-11-22 Thread Peter Otten
Daniel Nogradi wrote:

> I have an awk program that parses a text file which I would like to
> rewrite in python. The text file has multi-line records separated by
> empty lines and each single-line field has two subfields:
> 
> node 10
> x -1
> y 1
> 
> node 11
> x -2
> y 1
> 
> node 12
> x -3
> y 1
> 
> and this I would like to parse into a list of dictionaries like so:
> 
> mydict[0] = { 'node':10, 'x':-1, 'y':1 }
> mydict[1] = { 'node':11, 'x':-2, 'y':1 }
> mydict[2] = { 'node':12, 'x':-3, 'y':1 }
> 
> But the names of the fields (node, x, y) keeps changing from file to
> file, even their number is not fixed, sometimes it is (node, x, y, z).
> 
> What would be the simplest way to do this?

data = """node 10
x -1
y 1

node 11
x -2
y 1

node 12
x -3
y 1
"""

def open(filename):
    from cStringIO import StringIO
    return StringIO(data)

converters = dict(
    x=int,
    y=int
)

def name_value(line):
    name, value = line.split(None, 1)
    return name, converters.get(name, str.rstrip)(value)

if __name__ == "__main__":
    from itertools import groupby
    records = []

    for empty, record in groupby(open("records.txt"), key=str.isspace):
        if not empty:
            records.append(dict(name_value(line) for line in record))

    import pprint
    pprint.pprint(records)


-- 
http://mail.python.org/mailman/listinfo/python-list