Steve, that's actually an interesting idea. I'll look into that. Thanks!

2008/12/15 Daniel Kersten <[email protected]>:
> Hi Juan. That would probably be the best way to do it alright. It has
> the added side effect that I can display stats before I have read the
> complete dataset too, which would be useful.
>
> Thanks everyone else too! You gave me some nice ideas and got my
> brain working again, which is pretty much what I was hoping for.
>
> Thanks again,
> Dan.
>
> 2008/12/15 Juan Hernandez Gomez <[email protected]>:
>> Some time ago I started writing a custom stats analyzer. The idea was not to
>> load the full log (it was huge) but to update the stats as I read the log.
>>
>> I registered <individual stats analyzers> within the analyzer, so for every
>> line read each <individual analyzer> did its own calculation (some kind of
>> aggregation of data), so they didn't consume too much memory either.
>>
>> I think you can do something similar, and add as many analyzers as you have
>> different criteria.
>> For example, you can have one analyzer that reads the transaction ID and
>> keeps it in a dictionary until all the different types of messages have been
>> found. It all depends on what output you want to get (missing
>> messages, repeated messages, ...).
>>
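A minimal sketch of the streaming-analyzer idea described above. The log format ("<type> <txid>" per line), the class names, and the `feed` method are all assumptions made for illustration; the real analyzer in this thread was never posted.

```python
import io
from collections import Counter


class TypeCounter:
    """Counts messages per type; memory grows only with the number of types."""

    def __init__(self):
        self.counts = Counter()

    def feed(self, msg):
        self.counts[msg["type"]] += 1


class UnmatchedRequests:
    """Remembers type-A transaction IDs until a matching type-B arrives."""

    def __init__(self):
        self.pending = {}

    def feed(self, msg):
        if msg["type"] == "A":
            self.pending[msg["txid"]] = msg
        elif msg["type"] == "B":
            # Drop the entry once its partner has been seen.
            self.pending.pop(msg["txid"], None)


def parse_line(line):
    # Hypothetical log format: "<type> <txid>"
    msg_type, txid = line.split()
    return {"type": msg_type, "txid": txid}


def run(lines, analyzers):
    # Read one line at a time, so the full log is never held in memory.
    for line in lines:
        if not line.strip():
            continue
        msg = parse_line(line)
        for analyzer in analyzers:
            analyzer.feed(msg)


# Stand-in for a real log file; any iterable of lines works.
log = io.StringIO("A 1\nA 2\nB 1\nA 3\nB 3\n")
type_counter = TypeCounter()
unmatched = UnmatchedRequests()
run(log, [type_counter, unmatched])

print(type_counter.counts)        # per-type totals so far
print(sorted(unmatched.pending))  # transaction IDs still waiting for a B
```

Because each analyzer keeps only its own aggregate, the stats can be printed at any point while the log is still being read.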
>> Hope you get the idea too.
>>
>>
>> Daniel Kersten wrote:
>>
>> PS: So, yes, they are read-only.
>>
>> 2008/12/15 Daniel Kersten <[email protected]>:
>>
>>
>> Ok, I'll give a few more details as to what I'm doing.
>>
>> Basically, I have a little Python app which analyses log files (these
>> log files are large; I have one here that's incomplete and is already
>> 200MB). Each entry contains a number of fields which I package into
>> convenient little objects.
>>
>> The objects represent "messages" and the fields are addresses of those
>> messages (transaction IDs etc.), and I need to verify that if I get a
>> message of type A, I then receive a message of type B with a
>> matching transaction ID and address X in range Y... you get the
>> idea, I hope.
>>
>> I could use sqlite for this (and that might even be a good solution),
>> though I'd like to keep it in plain Python, if possible, since it's
>> meant to just be a little script which I can run over the log files on
>> whatever machine it happens to be on. I may settle for using
>> sqlite if there's no alternative.
>>
>>
>> 2008/12/15 Juan Hernandez Gomez <[email protected]>:
>>
>>
>> Hi,
>>
>> You could create an SQLite table with the properties you want to filter on as
>> columns and an extra column with the index of the object in your large list
>> (if not the full object).
>> Then you have the full power of SQL and you can create indexes as needed. It is
>> quite flexible.
>>
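A rough sketch of the SQLite suggestion above, using only the standard-library `sqlite3` module so it still runs as a plain script. The `Message` type, its fields `a`, `b`, `c`, and the table and index names are made up for illustration.

```python
import sqlite3
from collections import namedtuple

# Hypothetical stand-in for the real message objects.
Message = namedtuple("Message", "a b c")

objects = [
    Message("A", "B", 1),
    Message("A", "B", 5),
    Message("A", "X", 1),
    Message("Z", "B", 2),
]

# In-memory database: one row per object, keyed by its position in the list.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE msg (idx INTEGER PRIMARY KEY, a TEXT, b TEXT, c INTEGER)"
)
conn.executemany(
    "INSERT INTO msg (idx, a, b, c) VALUES (?, ?, ?, ?)",
    ((i, o.a, o.b, o.c) for i, o in enumerate(objects)),
)
# Index the columns that are filtered on most often.
conn.execute("CREATE INDEX msg_a_b ON msg (a, b)")

# o.a == "A" and o.b == "B" and o.c in [1, 2]
wanted_c = [1, 2]
placeholders = ", ".join("?" for _ in wanted_c)
rows = conn.execute(
    "SELECT idx FROM msg WHERE a = ? AND b = ? AND c IN (%s) ORDER BY idx"
    % placeholders,
    ["A", "B"] + wanted_c,
)

# Map the stored positions back to the original objects.
matches = [objects[idx] for (idx,) in rows]
print(matches)
```

Only the filterable fields and the list position go into the table, so the objects themselves stay ordinary Python objects.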
>> You haven't said whether the list of objects can be updated or is read-only.
>>
>> Juan
>>
>>
>> Daniel Kersten wrote:
>>
>> Hi all,
>>
>> I have a large list of objects which I'd like to filter on various criteria.
>> For example, I'd like to do something like:
>>   give me all objects o where o.a == "A" and o.b == "B" and o.c in [...]
>>
>> I thought of storing references to these objects in dictionaries, so
>> that I can look them up by their values (e.g. dict_of_a would map each
>> value of 'a' to the objects that have it, so that dict_of_a[o.a] gives
>> back [o], or more elements if other objects have the same value). I would
>> then look up each field and perform a set union to get all objects which
>> match the desired criteria (though this doesn't work for the `in`
>> operator). I hope that made sense.
>>
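One way the dictionary approach described above might look in plain Python. This is only a sketch: the `Message` type, its fields, and the helper names are assumptions. Note that combining criteria with `and` calls for set intersection rather than union, while a union over several keys is one way to cover the `in` case.

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for the real message objects.
Message = namedtuple("Message", "a b c")

objects = [
    Message("A", "B", 1),
    Message("A", "B", 5),
    Message("A", "X", 1),
    Message("Z", "B", 2),
]


def build_index(items, field):
    """Map each value of `field` to the set of list positions that have it."""
    index = defaultdict(set)
    for pos, item in enumerate(items):
        index[getattr(item, field)].add(pos)
    return index


def lookup_any(index, values):
    """Positions whose field equals any of `values` (the `in` operator)."""
    found = set()
    for value in values:
        found |= index.get(value, set())
    return found


index_a = build_index(objects, "a")
index_b = build_index(objects, "b")
index_c = build_index(objects, "c")

# o.a == "A" and o.b == "B" and o.c in [1, 2]
hits = (
    index_a.get("A", set())
    & index_b.get("B", set())
    & lookup_any(index_c, [1, 2])
)
matches = [objects[pos] for pos in sorted(hits)]
print(matches)
```

Storing list positions instead of the objects themselves means the objects do not need to be hashable, and equal-looking objects are not collapsed into one.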
>> The problem is that I have a large list of these objects (well over
>> 100k), and I was wondering if there is a better way of doing this.
>> Perhaps a super-efficient built-in query object? Anything?
>>
>> I'm probably doing it wrong anyway, so any tips or ideas to push me
>> towards a proper solution would be greatly appreciated.
>>
>> Thanks,
>> Dan.
>>
>>
>>
>>
>>
>> --
>> Daniel Kersten.
>> Leveraging dynamic paradigms since the synergies of 1985.

-- 
Daniel Kersten.
Leveraging dynamic paradigms since the synergies of 1985.
