On Wed, Jan 12, 2011 at 1:05 PM, Scott McCarty <scott.mcca...@gmail.com> wrote:
> Sorry to ask this question. I have searched the list archives and googled, but
> I don't even know what words to use to find what I am looking for. I am just
> looking for a little kick in the right direction.
>
> I have a Python based log analysis program called petit
> (http://crunchtools.com/petit). I am trying to modify it to manage the main
> object types to and from disk.
>
> Essentially, I have one object which is a list of a bunch of "Entry"
> objects. The Entry objects have date, time, etc. fields which I use for
> analysis techniques. At the very beginning I build up the list of objects,
> then would like to start pickling it while building to save memory. I want
> to be able to process more entries than I have memory. With a straight list it
> looks like I could build from xreadlines(), but once you turn it into a more
> complex object, I don't quite know where to go.
>
> I understand how to pickle the entire data structure, but I need something
> that will manage the memory/disk allocation. Any thoughts?

You could subclass `list` and use sys.getsizeof()
[http://docs.python.org/library/sys.html#sys.getsizeof] to keep track of the
size of the elements, and then start pickling them to disk once the total size
reaches some preset limit.

But like MRAB said, using a proper database, e.g. SQLite
(http://docs.python.org/library/sqlite3.html), wouldn't be a bad idea either.

Cheers,
Chris
--
http://blog.rebertia.com
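
P.S. Here is a minimal sketch of the `list` subclass idea, in case it helps. It is only an illustration under a few assumptions: the class name `SpillingList`, the `max_bytes` and `directory` parameters, and the `chunk-NNNNNN.pkl` file naming are all made up for this example and are not part of petit or the standard library. Note also that sys.getsizeof() is shallow, so for objects like your Entry instances it counts the object itself but not the strings its attributes point to; treat the limit as a rough threshold rather than a real memory budget. Only `append()` is tracked here, so `extend()`, `insert()` and friends would need the same treatment.

```python
import os
import pickle
import sys
import tempfile


class SpillingList(list):
    """List that pickles its contents to disk once they pass a size limit.

    Hypothetical helper written for illustration only.
    """

    def __init__(self, max_bytes=1024 * 1024, directory=None):
        super().__init__()
        self.max_bytes = max_bytes
        # Assumption: chunks live in a fresh temp directory unless one is given.
        self.directory = directory or tempfile.mkdtemp(prefix="spill-")
        self.chunk_paths = []  # pickle files written so far, oldest first
        self._in_memory = 0    # running getsizeof() total of resident items

    def append(self, item):
        super().append(item)
        # Shallow size only: referenced attributes are not counted.
        self._in_memory += sys.getsizeof(item)
        if self._in_memory >= self.max_bytes:
            self.flush()

    def flush(self):
        """Pickle the resident items to a new chunk file and drop them."""
        if len(self) == 0:
            return
        name = "chunk-%06d.pkl" % len(self.chunk_paths)
        path = os.path.join(self.directory, name)
        with open(path, "wb") as handle:
            pickle.dump(list(self), handle, pickle.HIGHEST_PROTOCOL)
        self.chunk_paths.append(path)
        self.clear()
        self._in_memory = 0

    def iter_all(self):
        """Yield every item in insertion order, loading one chunk at a time."""
        for path in self.chunk_paths:
            with open(path, "rb") as handle:
                for item in pickle.load(handle):
                    yield item
        # Items not yet flushed are still in the list itself.
        for item in list.__iter__(self):
            yield item
```

You would build it up with `entries.append(entry)` while reading the log, then analyse with `for entry in entries.iter_all(): ...`, which keeps at most one chunk plus the unflushed tail in memory at a time. If you need random access or queries by date, that is where SQLite starts to look like the better fit.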