Estimating memory use?

2005-11-27 Thread Roy Smith
I've got a large text processing task to attack (it's actually a genomics task: matching DNA probes against bacterial genomes). I've got roughly 200,000 probes, each of which is a 25-character text string. My first thought is to compile these into 200,000 regexes, but before I launch

Re: Estimating memory use?

2005-11-27 Thread Tim N. van der Leeuw
Hi, What is your 'static' data (database), and what is your input data? Are those 200,000 probes your database? Perhaps they can be stored as pickled compiled regexes and loaded in pickled form; then you don't need to keep them all in memory at once -- if you fear that memory usage will

Re: Estimating memory use?

2005-11-27 Thread Alex Martelli
Roy Smith wrote: ... Is there any easy way to find out how much memory a Python object takes? No, but there are a few early attempts out there at supplying SOME ways (not necessarily easy, but SOME). For example, PySizer, at http://pysizer.8325.org/. Alex
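For readers on a later Python than the one in use when this thread was written: the standard library has since gained `sys.getsizeof()` (Python 2.6) and `tracemalloc` (Python 3.4). Below is a minimal sketch, not anything posted in the thread, of one way to get a rough figure: compile a small sample and extrapolate. The sample probes are made up, `measure_compile_cost` is a hypothetical helper name, and the result is only an approximation (it includes the list holding the regexes and whatever the `re` module caches).

```python
import re
import sys
import tracemalloc


def measure_compile_cost(patterns):
    """Compile each pattern; return (compiled_list, bytes_allocated)."""
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    compiled = [re.compile(p) for p in patterns]
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return compiled, after - before


# Hypothetical sample: 1,000 distinct made-up 25-character "probes".
bases = "ACGT"
sample = [
    "".join(bases[(i >> (2 * k)) & 3] for k in range(25))
    for i in range(1000)
]

compiled, used = measure_compile_cost(sample)
per_regex = used / len(compiled)

# getsizeof() is shallow: it reports the string object only.
print("one probe string:", sys.getsizeof(sample[0]), "bytes")
print("approx. bytes per compiled regex:", round(per_regex))
print("extrapolated to 200,000 probes:",
      round(per_regex * 200_000 / 2**20), "MiB")
```

The extrapolation assumes memory grows roughly linearly with the number of probes, which is worth checking by running the measurement at two or three sample sizes.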

Re: Estimating memory use?

2005-11-27 Thread Roy Smith
Alex Martelli wrote: Roy Smith wrote: ... Is there any easy way to find out how much memory a Python object takes? No, but there are a few early attempts out there at supplying SOME ways (not necessarily easy, but

Re: Estimating memory use?

2005-11-27 Thread Fredrik Lundh
Roy Smith wrote: I've already discovered one (very) surprising thing -- if I build a dict containing all my regexes (takes about 3 minutes on my PowerBook) and pickle them to a file, re-loading the pickle takes just about as long as compiling them did in the first place. the internal RE byte
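The behaviour described here (pickle keeping the pattern source rather than compiled byte code) can be checked directly. This is a small sketch using only the standard `re` and `pickle` modules; the probe string is made up for illustration.

```python
import pickle
import re

probe = "ACGTACGTACGTACGTACGTACGTA"  # made-up 25-character probe
compiled = re.compile(probe)

data = pickle.dumps(compiled)
restored = pickle.loads(data)

# The pickle carries the pattern text, so loading has to compile it again
# (or fetch it from the re module's cache when run in the same process).
print("pattern text present in pickle:", probe.encode("ascii") in data)
print("same pattern after reload:", restored.pattern == probe)
print("same flags after reload:", restored.flags == compiled.flags)
```

This is consistent with the reload taking about as long as the original compilation: unpickling in a fresh process recompiles every pattern.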

Re: Estimating memory use?

2005-11-27 Thread MrJean1
There is a function mx_sizeof() in the mx.Tools module from eGenix which may be helpful. More at http://www.egenix.com/files/python/eGenix-mx-Extensions.html#mxTools /Jean Brouwers PS) This is an approximation for memory usage which is useful in certain, simple cases. Each built-in type has

Re: Estimating memory use?

2005-11-27 Thread MrJean1
The name of the function in mx.Tools is sizeof() and not mx_sizeof(). My apologies. Also, it turns out that the return value of the mx.Tools.sizeof() function is non-aligned. For example, mx.Tools.sizeof("abcde") returns 29, which is fine, but not entirely accurate. /Jean Brouwers
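To illustrate what "non-aligned" means for the figure above: an allocator typically hands out memory in multiples of some alignment, so the space actually consumed is the reported size rounded up. A minimal sketch follows, assuming an 8-byte alignment purely for illustration (the real granularity depends on the platform and the Python build); `round_up` is a hypothetical helper, not part of mx.Tools.

```python
def round_up(size, alignment=8):
    """Round size up to the next multiple of alignment (assumed value)."""
    return (size + alignment - 1) // alignment * alignment


# The 29 bytes reported in the message above, under the assumed alignment.
print(round_up(29))  # 32
```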

Re: Estimating memory use?

2005-11-27 Thread François Pinard
[Fredrik Lundh] the internal RE byte code format is version dependent, so pickle stores the patterns instead. Oh! Nice to know. That explains why, when I was learning Python, my initial experiment with pickles left me with the (probably wrong) feeling that they were not worth the trouble.