Re: parsing a file for analysis

2011-02-26 Thread Rita
Yes, Yes :-). I was using awk to do all of this. It does work, but I find myself repeatedly reading the same data because awk does not support complex data structures. Plus, the code is getting ugly. I was told about Orange (http://orange.biolab.si/). Does anyone have experience with it? On Sat,
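
For the "repeatedly reading the same data" problem, one option is to gather several statistics in a single pass using nested dictionaries and sets. The following is only a sketch: the field order (user, host, time) comes from the original post, while the function name collect_stats and the two statistics it gathers are hypothetical choices made for illustration.

```python
from collections import defaultdict


def collect_stats(rows):
    """Gather several statistics from INFO rows in one pass.

    Assumes whitespace-separated rows with user and host as the first
    two fields, as in the original post. The statistics chosen here are
    illustrative only.
    """
    # user -> host -> number of INFO rows
    counts = defaultdict(lambda: defaultdict(int))
    # user -> set of distinct hosts seen for that user
    hosts = defaultdict(set)

    for row in rows:
        if "INFO" not in row:
            continue
        fields = row.split()
        if len(fields) < 3:
            continue  # skip short or malformed rows
        user, host = fields[0], fields[1]
        counts[user][host] += 1
        hosts[user].add(host)

    return counts, hosts
```

Since a file object is itself an iterable of lines, collect_stats(open(path)) would stream the 4GB file without loading it into memory; only the aggregated structures are kept.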

Re: parsing a file for analysis

2011-02-26 Thread Rita
Thanks Andrea. I was thinking that too, but I was wondering if there were any other clever ways of doing this. I also thought I could build a filesystem structure based on __time. For example, for January 01, 2011, I would create /tmp/data/20110101/data. This way I can have a fast index of the data. A
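
The date-based directory layout described above could be sketched roughly as follows. The thread never shows the format of __time, so this assumes (purely for illustration) that the third field starts with an ISO date such as 2011-01-01T10:00:00; the helper names date_key and partition_by_date are hypothetical as well.

```python
import os


def date_key(timestamp):
    """Turn '2011-01-01T10:00:00' into '20110101'.

    The ISO timestamp format is an assumption, not taken from the thread.
    """
    return timestamp[:10].replace("-", "")


def partition_by_date(rows, base_dir):
    """Append each INFO row to <base_dir>/<YYYYMMDD>/data.

    Returns the sorted list of day directories that were written to.
    """
    handles = {}  # one open file per day seen so far
    try:
        for row in rows:
            if "INFO" not in row:
                continue
            fields = row.split()
            if len(fields) < 3:
                continue  # skip short or malformed rows
            day = date_key(fields[2])
            if day not in handles:
                day_dir = os.path.join(base_dir, day)
                os.makedirs(day_dir, exist_ok=True)
                handles[day] = open(os.path.join(day_dir, "data"), "a")
            # Make sure every stored row ends with exactly one newline.
            handles[day].write(row if row.endswith("\n") else row + "\n")
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)
```

Calling partition_by_date(open(path), "/tmp/data") would produce the /tmp/data/20110101/data layout from the message. One file stays open per distinct day, so a log spanning a very large number of days might need the handles closed and reopened in batches.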

Re: parsing a file for analysis

2011-02-26 Thread Martin Gregorie
On Sat, 26 Feb 2011 16:29:54 +0100, Andrea Crotti wrote:
> On 26 Feb 2011, at 06:45, Rita wrote:
>
>> I have a large text (4GB) which I am parsing.
>>
>> I am reading the file to collect stats on certain items.
>>
>> My approach has been simple,
>>
>> for row in open(file):
>

Re: parsing a file for analysis

2011-02-26 Thread Andrea Crotti
On 26 Feb 2011, at 06:45, Rita wrote:
> I have a large text (4GB) which I am parsing.
>
> I am reading the file to collect stats on certain items.
>
> My approach has been simple,
>
> for row in open(file):
>     if "INFO" in row:
>         line=row.split()
>         user=line[0]
>

parsing a file for analysis

2011-02-25 Thread Rita
I have a large text (4GB) which I am parsing.

I am reading the file to collect stats on certain items.

My approach has been simple,

for row in open(file):
    if "INFO" in row:
        line=row.split()
        user=line[0]
        host=line[1]
        __time=line[2]
        ...

I was wondering if there is a framewor
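
The loop in the question can be wrapped in a generator so that the parsing is written once and reused by whatever statistics are needed. This is a minimal sketch rather than a recommendation of any framework: the field order follows the post, while the Record type, the function names, and the sample rows are made up for illustration.

```python
from collections import Counter, namedtuple

# Hypothetical record type; field order follows the post (user, host, time).
Record = namedtuple("Record", ["user", "host", "time"])


def parse_rows(rows):
    """Yield a Record for every row that contains "INFO"."""
    for row in rows:
        if "INFO" not in row:
            continue
        fields = row.split()
        if len(fields) < 3:
            continue  # skip short or malformed rows
        yield Record(fields[0], fields[1], fields[2])


def count_by_user(rows):
    """Count INFO rows per user without holding the input in memory."""
    return Counter(rec.user for rec in parse_rows(rows))


# Made-up sample rows; the real log format is not shown in the thread.
sample = [
    "alice host1 2011-01-01T10:00:00 INFO login",
    "bob host2 2011-01-01T10:01:00 DEBUG noise",
    "alice host1 2011-01-02T09:00:00 INFO logout",
]
for user, n in count_by_user(sample).most_common():
    print(user, n)
```

Because parse_rows accepts any iterable of lines, passing an open file object instead of the sample list processes the file one row at a time.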