Yes, yes :-). I was using awk to do all of this. It works, but I find
myself repeatedly reading the same data because awk does not support complex
data structures. Plus, the code is getting ugly.
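For comparison, here is a minimal sketch of the kind of two-level structure that is awkward in awk: per-user, per-host counts built in a single pass. The sample lines and the field layout (user, host, time) are assumptions based on the loop quoted further down.

```python
from collections import defaultdict

# Invented sample lines for illustration; the real file is 4GB.
sample = [
    "alice host1 20110101 INFO started",
    "alice host2 20110101 INFO finished",
    "bob host1 20110102 INFO started",
]

# Nested dict: counts[user][host] -> number of INFO lines, one pass over the data.
counts = defaultdict(lambda: defaultdict(int))
for row in sample:
    if "INFO" in row:
        fields = row.split()
        user, host = fields[0], fields[1]
        counts[user][host] += 1

print(counts["alice"]["host1"])  # 1
```

Because the nested dict holds everything from one pass, there is no need to re-read the file for each question you want to ask of it.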
I was told about Orange (http://orange.biolab.si/). Does anyone have
experience with it?
Thanks Andrea. I was thinking that too but I was wondering if there were any
other clever ways of doing this.
I also thought I could build a filesystem structure keyed on the __time.
So, for January 01, 2011, I would create /tmp/data/20110101/data. This way
I can have a fast index into the data.
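The date-partitioned layout above could be sketched like this. The base path and the YYYYMMDD directory naming follow the example given; the helper name is made up.

```python
import os

def data_path(base, yyyymmdd):
    """Return (and create) the per-day data file path, e.g. /tmp/data/20110101/data."""
    path = os.path.join(base, yyyymmdd)
    os.makedirs(path, exist_ok=True)  # idempotent: fine if the day already exists
    return os.path.join(path, "data")

p = data_path("/tmp/data", "20110101")
print(p)  # /tmp/data/20110101/data
```

Records for one day can then be appended to that file as they are parsed, so later queries for a date range only touch the matching directories instead of rescanning the whole 4GB file.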
On Sat, 26 Feb 2011 16:29:54 +0100, Andrea Crotti wrote:
Il giorno 26/feb/2011, alle ore 06.45, Rita ha scritto:
> I have a large text (4GB) which I am parsing.
>
> I am reading the file to collect stats on certain items.
>
> My approach has been simple,
>
> for row in open(file):
>     if "INFO" in row:
>         line = row.split()
>         user = line[0]
>
I have a large text file (4GB) which I am parsing.
I am reading the file to collect stats on certain items.
My approach has been simple:
for row in open(file):
    if "INFO" in row:
        line = row.split()
        user = line[0]
        host = line[1]
        __time = line[2]
        ...
I was wondering if there is a framework for this.
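Short of a full framework, chained generators give a pipeline feel in plain Python while still reading the file once, line by line. The field positions follow the loop above; the stage names and sample data are invented for illustration.

```python
from collections import Counter

def interesting(lines):
    # Stage 1: keep only the INFO lines.
    return (l for l in lines if "INFO" in l)

def parsed(lines):
    # Stage 2: split each kept line into (user, host, time) fields.
    for l in lines:
        fields = l.split()
        yield fields[0], fields[1], fields[2]

# Invented sample standing in for open(file); a real run would pass the
# file object so the 4GB is streamed, never loaded whole.
sample = [
    "alice host1 20110101 INFO started",
    "bob host1 20110102 DEBUG noise",
    "alice host2 20110101 INFO finished",
]

per_user = Counter(user for user, host, time in parsed(interesting(sample)))
print(per_user["alice"])  # 2
```

Each stage is lazy, so memory stays flat regardless of file size, and new stages (filters, per-host tallies) can be slotted into the chain without touching the others.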