Scott McCarty wrote:
> Sorry to ask this question. I have searched the list archives and googled,
> but I don't even know what words to use to find what I am looking for; I am
> just looking for a little kick in the right direction.
>
> I have a Python-based log analysis program called petit
> (http://crunchtools.com/petit). I am trying to modify it to manage the main
> object types to and from disk.
>
> Essentially, I have one object which is a list of a bunch of "Entry"
> objects. The Entry objects have date, time, etc. fields which I use for
> analysis techniques. At the very beginning I build up the list of objects,
> then would like to start pickling it while building to save memory. I want
> to be able to process more entries than I have memory. With a straight
> list it looks like I could build from xreadlines(), but once you turn it
> into a more complex object, I don't quite know where to go.
>
> I understand how to pickle the entire data structure, but I need something
> that will manage the memory/disk allocation. Any thoughts?
You can write multiple pickled objects into a single file:

    import cPickle as pickle

    def dump(filename, items):
        with open(filename, "wb") as out:
            dump = pickle.Pickler(out).dump
            for item in items:
                dump(item)

    def load(filename):
        with open(filename, "rb") as instream:
            load = pickle.Unpickler(instream).load
            while True:
                try:
                    item = load()
                except EOFError:
                    break
                yield item

    if __name__ == "__main__":
        filename = "tmp.pickle"
        from collections import namedtuple
        T = namedtuple("T", "alpha beta")
        dump(filename, (T(a, b) for a, b in zip("abc", [1, 2, 3])))
        for item in load(filename):
            print item

To get random access you'd have to maintain a list containing the offsets
of the entries in the file. However, a simple database like SQLite is
probably sufficient for the kind of entries you have in mind, and it allows
operations like aggregation, sorting and grouping out of the box.

Peter
--
http://mail.python.org/mailman/listinfo/python-list
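The offset-index idea above can be sketched roughly like this (not from the original post; a minimal illustration using the module-level `pickle.dump`/`pickle.load` functions, which create a fresh pickler per record so each record can be unpickled independently of the others):

```python
import pickle

def dump_indexed(filename, items):
    """Pickle items one per record, recording each record's byte offset."""
    offsets = []
    with open(filename, "wb") as out:
        for item in items:
            offsets.append(out.tell())
            # pickle.dump() uses a fresh Pickler each call, so every
            # record is self-contained and seekable on its own
            pickle.dump(item, out)
    return offsets

def load_at(filename, offsets, index):
    """Seek straight to the recorded offset and unpickle one record."""
    with open(filename, "rb") as instream:
        instream.seek(offsets[index])
        return pickle.load(instream)
```

The offsets list itself stays small (one integer per entry), so it can live in memory, or be pickled to a companion file alongside the data.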
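As for the SQLite suggestion, here is a rough sketch with the standard-library sqlite3 module; the column names (date, time, host, message) are invented stand-ins for petit's actual Entry fields:

```python
import sqlite3

def count_by_field(rows, field):
    """Load (date, time, host, message) tuples into an in-memory SQLite
    table and count entries grouped by the given column, busiest first."""
    # column names can't be bound as SQL parameters, so whitelist them
    assert field in ("date", "time", "host", "message")
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entries (date TEXT, time TEXT, host TEXT, message TEXT)")
    conn.executemany("INSERT INTO entries VALUES (?, ?, ?, ?)", rows)
    query = ("SELECT %s, COUNT(*) FROM entries "
             "GROUP BY %s ORDER BY COUNT(*) DESC" % (field, field))
    return list(conn.execute(query))
```

A file-backed database (pass a filename instead of ":memory:") would keep the working set on disk, which is exactly the more-entries-than-memory situation described in the question.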