Daniel Kersten wrote:
> PS: So, yes, they are read-only.
>
> 2008/12/15 Daniel Kersten <[email protected]>:
>   
>> Ok, I'll give a few more details as to what I'm doing.
>>
>> Basically, I have a little Python app which analyses log files (these
>> log files are large; I have one here that's incomplete and is already
>> 200MB). Each entry contains a number of fields which I package into
>> convenient little objects.
>>
>> The objects represent "messages" and the fields are addresses of those
>> messages (transaction IDs, etc.), and I need to verify that if I get a
>> message of type A, I then receive a message of type B with
>> matching transaction IDs and address X in range Y... you get the
>> idea, I hope.
>>
>> I could use sqlite for this (and that might even be a good solution),
>> though I'd like to keep it in plain Python, if possible, since it's
>> meant to just be a little script which I can run over the log files on
>> whatever machine it happens to be on, though I may settle for using
>> sqlite if there's no alternative.
>>
>>
>> 2008/12/15 Juan Hernandez Gomez <[email protected]>:
>>     
>>> Hi,
>>>
>>> you could create an SQLite table with the properties you want to filter as
>>> columns and an extra column with the index of the object in your large list
>>> (if not the full object).
>>> Then you have the full power of SQL and you can create indexes as needed.
>>> It is quite flexible.
>>>
>>> You haven't said if the list of objects can be updated or is just read-only.
>>>
>>> Juan
>>>
>>>
>>> Daniel Kersten wrote:
>>>
>>> Hi all,
>>>
>>> I have a large list of objects which I'd like to filter on various criteria.
>>> For example, I'd like to do something like:
>>>   give me all objects o where o.a == "A" and o.b == "B" and o.c in [...]
>>>
>>> I thought of storing references to these objects in dictionaries, so
>>> that I can look them up by their values (e.g. dict_of_a would map each
>>> value of 'a' to the list of objects having that value, so that
>>> dict_of_a[o.a] gives back [o], or more elements if other objects share
>>> the same value). I would then look up each field and perform a set
>>> intersection to get all objects which match the desired criteria
>>> (though this doesn't work directly for the `in` operator). I hope that
>>> made sense.
>>>
>>> The problem is that I have a large list of these objects (well over
>>> 100k) and I was wondering if there was a better way of doing this?
>>> Perhaps a super-efficient built-in query object? Anything?
>>>
>>> I'm probably doing it wrong anyway, so any tips or ideas to push me
>>> towards a proper solution would be greatly appreciated.
>>>
>>> Thanks,
>>> Dan.
>>>
>>>
>>>
>>>       
>>
>> --
>> Daniel Kersten.
>> Leveraging dynamic paradigms since the synergies of 1985.
>>
>>     
>
>
>
>   
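For reference, the dictionary-index idea from the quoted post might look roughly like the sketch below. The Message class, the build_index and lookup helpers, and the sample data are all made up for illustration and are not from the original script. Since every condition has to hold at once, the per-field results are intersected, and the `in` case is handled by taking the union of the buckets for each allowed value first.

```python
from collections import defaultdict


class Message:
    """Hypothetical stand-in for the parsed log objects."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c


def build_index(objects, field):
    """Map each value of `field` to the set of list positions holding it."""
    index = defaultdict(set)
    for pos, obj in enumerate(objects):
        index[getattr(obj, field)].add(pos)
    return index


def lookup(index, values):
    """Positions whose indexed field equals any of `values` (the `in` case)."""
    found = set()
    for value in values:
        # .get() avoids creating empty buckets for values never seen.
        found |= index.get(value, set())
    return found


# Made-up sample data.
messages = [
    Message("A", "B", 1),
    Message("A", "X", 2),
    Message("A", "B", 3),
    Message("Z", "B", 1),
]
by_a = build_index(messages, "a")
by_b = build_index(messages, "b")
by_c = build_index(messages, "c")

# o.a == "A" and o.b == "B" and o.c in [1, 2]
hits = lookup(by_a, ["A"]) & lookup(by_b, ["B"]) & lookup(by_c, [1, 2])
matches = [messages[i] for i in sorted(hits)]
```

Building each index takes one pass over the list, which suits read-only data. Each query then costs roughly the size of the buckets involved rather than a full scan.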
Could you use a chain of generators to handle the filtering process? So 
if your objects (logobj) were instantiated from a line of the log file, 
then:

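fp = open(yourlog, 'r')  # a file object yields its lines lazily
events = (logobj(x) for x in fp)  # build one logobj per line
match_one = (e for e in events if e.a == 'A')  # keep entries with a == 'A'
match_two = (m for m in match_one if m.b == 'B')  # then those with b == 'B'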

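would result in the generator match_two, which lazily yields objects with 
o.a == 'A' and o.b == 'B', so the whole file never has to sit in memory. 
You can reorder or recombine the filters as necessary. Note that each 
generator can only be iterated once, so a second pass needs the file 
re-opened and the chain rebuilt.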
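As a self-contained illustration of the same idea, here is a minimal sketch that can be run as-is. The LogObj class, the comma-separated line format and the where() helper are assumptions made up for the example, and io.StringIO stands in for the real log file.

```python
import io


class LogObj:
    """Hypothetical record: one comma-separated line becomes fields a, b, c."""

    def __init__(self, line):
        self.a, self.b, self.c = line.strip().split(",")


def where(items, **criteria):
    """Lazily yield the items whose attributes equal every given value."""
    return (
        item for item in items
        if all(getattr(item, name) == value for name, value in criteria.items())
    )


# io.StringIO stands in for open(yourlog, 'r'); both yield one line at a time.
fp = io.StringIO("A,B,1\nA,X,2\nZ,B,3\nA,B,4\n")
events = (LogObj(line) for line in fp)
match_one = where(events, a="A")
match_two = where(match_one, b="B")

# Nothing is read until the last generator is consumed.
results = [m.c for m in match_two]
# A second pass yields nothing: the chain is already exhausted.
leftover = list(match_two)
```

Wrapping the filter in a small helper keeps each stage to one line. The trade-off is the same as above: every query is a full scan of the file, which may be acceptable for a one-off script but is slower than an index for repeated lookups.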

Padraig

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Python Ireland" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.ie/group/pythonireland?hl=en
-~----------~----~----~----~------~----~------~--~---
