Re: Memory usage per top 10x usage per heapy

2012-09-27 Thread bryanjugglercryptographer
MrsEntity wrote: > Based on heapy, a db based solution would be serious overkill. I've embraced overkill and my life is better for it. Don't confuse overkill with cost. Overkill is your friend. The facts of the case: You need to save some derived strings for each of 2M input lines. Even half th

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Oscar Benjamin
On 26 September 2012 00:35, Tim Chase wrote: > On 09/25/12 17:55, Oscar Benjamin wrote: > > On 25 September 2012 23:10, Tim Chase > wrote: > >> If tuples provide a savings but you find them opaque, you might also > >> consider named-tuples for clarity. > > > > Do they have the same memory usage?

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Tim Chase
On 09/25/12 17:55, Oscar Benjamin wrote: > On 25 September 2012 23:10, Tim Chase wrote: >> If tuples provide a savings but you find them opaque, you might also >> consider named-tuples for clarity. > > Do they have the same memory usage? > > Since tuples don't have a per-instance __dict__, I'd e
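
One rough way to answer the memory question is to measure it. What follows is a minimal sketch, assuming CPython; the two-field layout (`filename`, `line_number`) and the class names `PlainFileContext` and `TupleFileContext` are made up for illustration, since the thread does not show the real FileContext class. `sys.getsizeof` is shallow, so the plain object's `__dict__` is added by hand.

```python
import sys
from collections import namedtuple


class PlainFileContext(object):
    """Ordinary class (hypothetical): each instance carries its own attribute storage."""

    def __init__(self, filename, line_number):
        self.filename = filename
        self.line_number = line_number


# Hypothetical named-tuple equivalent: fields are read by name but stored like a tuple.
TupleFileContext = namedtuple("TupleFileContext", ["filename", "line_number"])

plain = PlainFileContext("data.txt", 42)
named = TupleFileContext("data.txt", 42)
bare = ("data.txt", 42)

# Shallow sizes only: the referenced str and int objects are not counted.
plain_size = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
named_size = sys.getsizeof(named)
bare_size = sys.getsizeof(bare)

print("plain object + __dict__:", plain_size)
print("named tuple:            ", named_size)
print("bare tuple:             ", bare_size)
```

The exact byte counts depend on the interpreter version and platform, so treat the printed numbers as indicative rather than definitive.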

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Oscar Benjamin
On 25 September 2012 23:10, Tim Chase wrote: > On 09/25/12 16:17, Oscar Benjamin wrote: > > I don't know whether it would be better or worse but it might be > > worth seeing what happens if you replace the FileContext objects > > with tuples. > > If tuples provide a savings but you find them opaq

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Oscar Benjamin
On 25 September 2012 23:09, Ian Kelly wrote: > On Tue, Sep 25, 2012 at 12:17 PM, Oscar Benjamin > wrote: > > Also I think lambda functions might be able to keep the frame alive. Are > > they by any chance being created in a function that is called in a loop? > > I'm pretty sure they don't. Clos

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Ian Kelly
On Tue, Sep 25, 2012 at 12:17 PM, Oscar Benjamin wrote: > Also I think lambda functions might be able to keep the frame alive. Are > they by any chance being created in a function that is called in a loop? I'm pretty sure they don't. Closures don't keep a reference to the calling frame, only to
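
The point about closures can be inspected directly. This is a small sketch, assuming CPython; `make_scaler`, `factor`, and `scratch` are hypothetical names. It lists the free variables a lambda closes over and the cells that hold their values; the unused local `scratch` does not appear among them.

```python
def make_scaler(factor):
    scratch = [0] * 1000  # local that the lambda never mentions
    return lambda x: x * factor


scale = make_scaler(3)

# Names the lambda closes over, and the cell objects holding their values.
free_names = scale.__code__.co_freevars
cell_values = [cell.cell_contents for cell in scale.__closure__]

print(free_names)
print(cell_values)
print(scale(5))
```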

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Tim Chase
On 09/25/12 16:17, Oscar Benjamin wrote: > I don't know whether it would be better or worse but it might be > worth seeing what happens if you replace the FileContext objects > with tuples. If tuples provide a savings but you find them opaque, you might also consider named-tuples for clarity. -tk

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Junkshops
On 9/25/2012 2:17 PM, Oscar Benjamin wrote: I don't know whether it would be better or worse but it might be worth seeing what happens if you replace the FileContext objects with tuples. I originally used a string, and it was slightly better since you don't have the object overhead, but I wanted

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Oscar Benjamin
On 25 September 2012 21:26, Junkshops wrote: > On 9/25/2012 11:17 AM, Oscar Benjamin wrote: > > On 25 September 2012 19:08, Junkshops wrote: > >> >> In [38]: mpef._ustore._store >> Out[38]: defaultdict(, {'Measurement': >> {'8991c2dc67a49b909918477ee4efd767': >> , >> '7b38b429230f00fe4731e60419

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Junkshops
On 9/25/2012 11:50 AM, Dave Angel wrote: I suspect that heapy has some limitation in its reporting, and that's what the discrepancy is. That would be my first suspicion as well - except that heapy's results agree so well with what I expect, and I can't think of any reason I'd be using 10x more m

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Junkshops
On 9/25/2012 11:17 AM, Oscar Benjamin wrote: On 25 September 2012 19:08, Junkshops wrote: In [38]: mpef._ustore._store Out[38]: defaultdict(, {'Measurement': {'8991c2dc67a49b909918477ee4efd767': , '7b38b429230f00fe4731e60419e92346': , 'b53531471

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Dave Angel
On 09/25/2012 01:39 PM, Junkshops wrote: Procedural point: I know you're trying to conform to the standard that this mailing list uses, but you're off a little, and it's distracting. It's also probably more work for you, and certainly for us. You need an attribution in front of the quoted portio

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Oscar Benjamin
On 25 September 2012 19:08, Junkshops wrote: > > Can you give an example of how these data structures look after reading > only the first 5 lines? > > Sure, here you go: > > In [38]: mpef._ustore._store > Out[38]: defaultdict(, {'Measurement': > {'8991c2dc67a49b909918477ee4efd767': > , > '7b38b4

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Junkshops
Can you give an example of how these data structures look after reading only the first 5 lines? Sure, here you go: In [38]: mpef._ustore._store Out[38]: defaultdict(, {'Measurement': {'8991c2dc67a49b909918477ee4efd767': , '7b38b429230f00fe4731e60419e92346': , 'b53531471b261c44d52f651add64

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Junkshops
I'm a bit surprised you aren't beyond the 2gb limit, just with the structures you describe for the file. You do realize that each object has quite a few bytes of overhead, so it's not surprising to use several times the size of a file, to store the file in an organized way. I did some back of the
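
The per-object overhead mentioned above is easy to see on a small scale. Here is a rough sketch, assuming CPython; the three sample strings are only a hypothetical stand-in for the strings derived from one input line. It compares the raw character count with the shallow in-memory size reported by `sys.getsizeof`.

```python
import sys

# Hypothetical derived strings for one input line.
derived = ["Measurement", "8991c2dc67a49b909918477ee4efd767", "42"]

payload = sum(len(s) for s in derived)              # characters actually stored
footprint = sum(sys.getsizeof(s) for s in derived)  # shallow size of each str object
container = sys.getsizeof(derived)                  # the list holding the references

overhead_ratio = (footprint + container) / float(payload)

print(payload, footprint, container)
print("in-memory / raw text: %.1fx" % overhead_ratio)
```

The ratio varies with string length and interpreter build, and dict entries and wrapper objects add more on top, so this is only a lower-bound style estimate.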

Re: gracious responses (was: Memory usage per top 10x usage per heapy)

2012-09-25 Thread alex23
On Sep 25, 9:39 pm, Tim Chase wrote: > Mostly instigated by one person with a > particularly quick trigger, vitriolic tongue, and a disregard for > pythonic code. I'm sorry. I'll get me coat. -- http://mail.python.org/mailman/listinfo/python-list

Re: gracious responses (was: Memory usage per top 10x usage per heapy)

2012-09-25 Thread Tim Chase
On 09/25/12 06:10, Mark Lawrence wrote: > On 25/09/2012 11:51, Tim Chase wrote: >> If only other unnamed persons on the list were so gracious rather >> than turning the flame-dial to 11. >> > > Oh heck what have I said this time? You'd *like* to take credit? ;-) Nah, not you or any of the regul

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Oscar Benjamin
On 25 September 2012 00:58, Junkshops wrote: > Hi Tim, thanks for the response. > > > - check how you're reading the data: are you iterating over >> the lines a row at a time, or are you using >> .read()/.readlines() to pull in the whole file and then >> operate on that? >> > I'm using

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Mark Lawrence
On 25/09/2012 11:51, Tim Chase wrote: [snip] If only other unnamed persons on the list were so gracious rather than turning the flame-dial to 11. Oh heck what have I said this time? -tkc -- Cheers. Mark Lawrence.

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Dave Angel
On 09/25/2012 12:21 AM, Junkshops wrote: >> Just curious; which is it, two million lines, or half a million bytes? > > Sorry, that should've been a 500Mb, 2M line file. > >> which machine is 2gb, the Windows machine, or the VM? > VM. Winders is 4gb. > >> ...but I would point out that just beca

Re: Memory usage per top 10x usage per heapy

2012-09-25 Thread Tim Chase
On 09/24/12 23:41, Dennis Lee Bieber wrote: > On Mon, 24 Sep 2012 14:59:47 -0700 (PDT), MrsEntity > declaimed the following in > gmane.comp.python.general: > >> Hi all, >> >> I'm working on some code that parses a 500kb, 2M line file line by line and >> saves, per line, some derived strings > >

Re: Memory usage per top 10x usage per heapy

2012-09-24 Thread Junkshops
Just curious; which is it, two million lines, or half a million bytes? I have, in fact, this very afternoon, invented a means of writing a carriage return character using only 2 bits of information. I am prepared to sell licenses to this revolutionary technology for the low price of $29.95 plu

Re: Memory usage per top 10x usage per heapy

2012-09-24 Thread Dave Angel
On 09/24/2012 05:59 PM, MrsEntity wrote: > Hi all, > > I'm working on some code that parses a 500kb, 2M line file Just curious; which is it, two million lines, or half a million bytes? > line by line and saves, per line, some derived strings into various data > structures. I thus expect that m

Re: Memory usage per top 10x usage per heapy

2012-09-24 Thread Junkshops
Hi Tim, thanks for the response. - check how you're reading the data: are you iterating over the lines a row at a time, or are you using .read()/.readlines() to pull in the whole file and then operate on that? I'm using enumerate() on an iterable input (which in this case is the fileh
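
For reference, the line-at-a-time pattern being discussed looks roughly like this. It is a sketch only: `derive` and `process` are hypothetical helpers, not the poster's code, and `io.StringIO` stands in for the real file handle so the example runs without a file on disk.

```python
import io


def derive(line):
    """Hypothetical per-line work: strip the newline and upper-case the text."""
    return line.rstrip("\n").upper()


def process(handle):
    """Walk the handle one line at a time, keeping only the derived strings."""
    results = {}
    for line_number, line in enumerate(handle, start=1):
        results[line_number] = derive(line)
    return results


# io.StringIO stands in for an open file object.
fake_file = io.StringIO(u"alpha\nbeta\ngamma\n")
results = process(fake_file)
print(results)
```

Iterating this way holds only one raw line in memory at a time, whereas `.read()` or `.readlines()` would pull in the whole file before any processing starts. The derived strings kept in `results` still accumulate, of course.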

Re: Memory usage per top 10x usage per heapy

2012-09-24 Thread Tim Chase
On 09/24/12 16:59, MrsEntity wrote: > I'm working on some code that parses a 500kb, 2M line file line > by line and saves, per line, some derived strings into various > data structures. I thus expect that memory use should > monotonically increase. Currently, the program is taking up so > much memo

Memory usage per top 10x usage per heapy

2012-09-24 Thread MrsEntity
Hi all, I'm working on some code that parses a 500kb, 2M line file line by line and saves, per line, some derived strings into various data structures. I thus expect that memory use should monotonically increase. Currently, the program is taking up so much memory - even on 1/2 sized files - tha