Steve, that's actually an interesting idea. I'll look into that. Thanks!

2008/12/15 Daniel Kersten <[email protected]>:
> Hi Juan. That would probably be the best way to do it, alright. It has
> the added side effect that I can display stats before I have read the
> complete dataset too, which would be useful.
>
> Thanks everyone else too! You gave me some nice ideas and got my
> brain working again, which is pretty much what I was hoping for.
>
> Thanks again,
> Dan.
>
> 2008/12/15 Juan Hernandez Gomez <[email protected]>:
>> Some time ago I started writing a custom stats analyzer, and the idea was
>> not to load the full log (it was huge) but to update the stats as I read
>> the log.
>>
>> I registered <individual stats analyzers> within the analyzer, so for every
>> line read, each <individual analyzer> did its own calculation (some kind of
>> aggregation of data); that way they didn't consume too much memory either.
>>
>> I think you can do something similar, adding as many analyzers as you have
>> different criteria.
>> For example, you can have one analyzer that reads the transaction id and
>> keeps it in a dictionary until all the different types of messages are
>> found... it all depends on what output you want (like missing
>> messages, repeated messages, ...)
>>
>> Hope you get the idea too.
>>
>>
>> Daniel Kersten wrote:
>>
>> PS: So, yes, they are read-only.
>>
>> 2008/12/15 Daniel Kersten <[email protected]>:
>>
>> Ok, I'll give a few more details as to what I'm doing.
>>
>> Basically, I have a little Python app which analyses log files (these
>> log files are large; I have one here that's incomplete and is already
>> 200MB). Each entry contains a number of fields which I package into
>> convenient little objects.
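[Editor's note: Juan's registered-analyzer idea above can be sketched roughly as follows. All class and method names here are invented for illustration; none come from the code discussed in the thread.]

```python
# Minimal sketch of the per-line analyzer pattern: the log is read once,
# and each registered analyzer keeps only a small running aggregate.
# All names (LogAnalyzer, CountByType, feed, report) are hypothetical.

class CountByType:
    """Counts messages per type; memory stays proportional to the
    number of distinct types, not the log size."""
    def __init__(self):
        self.counts = {}

    def feed(self, line):
        msg_type = line.split()[0]  # assume the type is the first field
        self.counts[msg_type] = self.counts.get(msg_type, 0) + 1

    def report(self):
        return self.counts


class LogAnalyzer:
    """Pushes every line to every registered analyzer in one pass."""
    def __init__(self):
        self.analyzers = []

    def register(self, analyzer):
        self.analyzers.append(analyzer)

    def run(self, lines):
        for line in lines:
            for a in self.analyzers:
                a.feed(line)


log = ["A tx1", "B tx1", "A tx2"]
analyzer = LogAnalyzer()
counter = CountByType()
analyzer.register(counter)
analyzer.run(log)
print(counter.report())  # {'A': 2, 'B': 1}
```

Because each analyzer sees lines as they arrive, stats can be reported before the full dataset is read, as noted above.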
>>
>> The objects represent "messages" and the fields are addresses of those
>> messages (transaction ids etc.), and I need to verify that if I get a
>> message of type A, I then receive a message of type B with
>> matching transaction ids and address X in range Y... you get the
>> idea, I hope.
>>
>> I could use sqlite for this (and that might even be a good solution),
>> though I'd like to keep it in plain Python if possible, since it's
>> meant to be just a little script which I can run over the log files on
>> whatever machine they happen to be on, though I may settle for using
>> sqlite if there's no alternative.
>>
>>
>> 2008/12/15 Juan Hernandez Gomez <[email protected]>:
>>
>> Hi,
>>
>> You could create an SQLite table with the properties you want to filter on
>> as columns, plus an extra column holding the index of the object in your
>> large list (if not the full object).
>> Then you have the full power of SQL, and you can create indexes as needed.
>> It's quite flexible.
>>
>> You haven't said whether the list of objects can be updated or is read-only.
>>
>> Juan
>>
>>
>> Daniel Kersten wrote:
>>
>> Hi all,
>>
>> I have a large list of objects which I'd like to filter on various criteria.
>> For example, I'd like to do something like:
>> give me all objects o where o.a == "A" and o.b == "B" and o.c in [...]
>>
>> I thought of storing references to these objects in dictionaries, so
>> that I can look them up by their values (e.g. dict_of_a would map each
>> value of 'a' to the objects having that value, so dict_of_a[o.a] gives
>> back [o], or more elements if other objects share the same value), then
>> look up each field and perform a set intersection to get all objects
>> which match the desired criteria (though this doesn't work for the `in`
>> operator). I hope that made sense.
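[Editor's note: Juan's SQLite suggestion can be tried with Python's built-in sqlite3 module, so it stays a "plain Python" script. Table and column names below are invented for the example.]

```python
import sqlite3

# In-memory table: one column per filterable property, plus the position
# of the object in the original list. All names here are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (a TEXT, b TEXT, c INTEGER, list_idx INTEGER)")
conn.execute("CREATE INDEX idx_a ON messages (a)")  # index a filtered column

objects = [("A", "B", 1), ("A", "B", 7), ("X", "B", 1)]
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?, ?)",
    [(a, b, c, i) for i, (a, b, c) in enumerate(objects)],
)

# The example query from the thread:
# o.a == "A" and o.b == "B" and o.c in [1, 2, 3]
rows = conn.execute(
    "SELECT list_idx FROM messages WHERE a = ? AND b = ? AND c IN (1, 2, 3)",
    ("A", "B"),
).fetchall()
matches = [objects[idx] for (idx,) in rows]
print(matches)  # [('A', 'B', 1)]
```

Storing only the list index keeps the table small while SQL handles the `AND` and `IN` conditions.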
>>
>> The problem is that I have a large list of these objects (well over
>> 100k), and I was wondering if there is a better way of doing this?
>> Perhaps a super-efficient built-in query object? Anything?
>>
>> I'm probably doing it wrong anyway, so any tips or ideas to push me
>> towards a proper solution would be greatly appreciated.
>>
>> Thanks,
>> Dan.
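[Editor's note: the dictionary-index idea from the original question works in plain Python if the per-field lookups return sets of list positions. Intersecting the sets handles the `and` conditions, and a union over several keys covers the `o.c in [...]` case. The `Msg` class and index names are made up for the sketch.]

```python
from collections import defaultdict

class Msg:
    """Stand-in for the 'convenient little objects' from the thread."""
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

objects = [Msg("A", "B", 1), Msg("A", "B", 7), Msg("X", "B", 1)]

# One index per field, mapping value -> set of positions in `objects`.
by_a, by_b, by_c = defaultdict(set), defaultdict(set), defaultdict(set)
for i, o in enumerate(objects):
    by_a[o.a].add(i)
    by_b[o.b].add(i)
    by_c[o.c].add(i)

# o.a == "A" and o.b == "B" and o.c in [1, 2, 3]:
# the `in` clause becomes a union of key lookups, the `and`s an intersection.
c_matches = set().union(*(by_c[v] for v in [1, 2, 3]))
hits = by_a["A"] & by_b["B"] & c_matches
result = [objects[i] for i in sorted(hits)]
print([(o.a, o.b, o.c) for o in result])  # [('A', 'B', 1)]
```

Building the indexes is one pass over the list; each query after that only touches the sets, which stays fast even for 100k+ objects.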
--
Daniel Kersten.
Leveraging dynamic paradigms since the synergies of 1985.

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "Python Ireland" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.ie/group/pythonireland?hl=en
-~----------~----~----~----~------~----~------~--~---
