On Fri, Aug 19, 2011 at 8:31 AM, Forafo San <ppv.g...@gmail.com> wrote:
> Folks,
> What might be a good replacement for the shelve module, but one that
> can handle a few gigs of data. I'm doing some calculations on daily
> stock prices and the result is a nested list like:
>
> [[date_1, floating result 1],
> [date_2, floating result 2],
> ...
> [date_n, floating result n]]
>
> However, there are about 5,000 lists like that, one for each stock
> symbol. Using the shelve module I could easily save them to a file
> ( myshelvefile['symbol_1'] = symbol_1_list) and likewise retrieve the
> data. But shelve is deprecated AND when a lot of data is written
> shelve was acting weird (refusing to write, filesizes reported with an
> "ls" did not make sense, etc.).
>

I'd probably use cachedb, though perhaps I'm biased since I wrote it:
http://stromberg.dnsalias.org/~dstromberg/cachedb.html

It'll allow you to specify functions for serializing and deserializing
values (but not keys), and cache a user-specified number of values in
virtual memory. IOW, once you instantiate the class, you pretty much
get caching and serializing/deserializing as freebies, without the
details of same getting scattered throughout your code.

It wraps something like gdbm.
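To give a rough idea of the pattern (this is NOT cachedb's actual API,
just a minimal sketch built on the stdlib dbm and pickle modules): the
class name CachedStore, its parameters, and the sample dates and values
below are all made up for illustration. It keeps a bounded number of
recently used values in memory and pushes everything else through
user-supplied serialize/deserialize functions.

```python
import dbm
import os
import pickle
import tempfile
from collections import OrderedDict


class CachedStore:
    """Hypothetical dict-like wrapper over a dbm file (not cachedb's API)."""

    def __init__(self, path, serialize=pickle.dumps,
                 deserialize=pickle.loads, max_cached=100):
        self._db = dbm.open(path, "c")  # "c": create the file if missing
        self._serialize = serialize
        self._deserialize = deserialize
        self._max_cached = max_cached
        self._cache = OrderedDict()  # least recently used entry comes first

    def _remember(self, key, value):
        # Mark the key as most recently used, then evict the oldest
        # entry if the cache has grown past its limit.
        self._cache[key] = value
        self._cache.move_to_end(key)
        if len(self._cache) > self._max_cached:
            self._cache.popitem(last=False)

    def __setitem__(self, key, value):
        # Values are serialized; keys are only encoded to bytes.
        self._db[key.encode("utf-8")] = self._serialize(value)
        self._remember(key, value)

    def __getitem__(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)
            return self._cache[key]
        value = self._deserialize(self._db[key.encode("utf-8")])
        self._remember(key, value)
        return value

    def close(self):
        self._db.close()


# Usage with placeholder data: one nested list per stock symbol.
with tempfile.TemporaryDirectory() as tmp:
    store = CachedStore(os.path.join(tmp, "prices"), max_cached=2)
    store["symbol_1"] = [["2011-08-18", 1.5], ["2011-08-19", 1.75]]
    store["symbol_2"] = [["2011-08-18", 0.25]]
    symbol_1_list = store["symbol_1"]
    store.close()
```

With about 5,000 symbols you would set max_cached to however many
lists comfortably fit in memory; the rest stay on disk until asked for.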
-- http://mail.python.org/mailman/listinfo/python-list