On Wednesday 15 August 2007, David Worrall wrote:
> On 15/08/2007, at 5:36 PM, [EMAIL PROTECTED] wrote:
> > On Wed, Aug 15, 2007 at 08:40:28AM +1000, David Worrall wrote:
> >> My RAM is 256 MB on OSX 10.4.10 (the latest).
> >
> > Mmmm, not so good ;)  I think what is happening is that your
> > computer is running out of memory.  A Table node should use around
> > 1 MB of memory, and the number of slots in the PyTables node cache
> > (NODE_MAX_SLOTS) is 256 by default.  This means that with 256 MB
> > you are completely filling it (i.e. the system will start using
> > swap), and that could be the source of the problem you are
> > experiencing.  My advice is to try a machine with at least 512 MB
> > or, better yet, to try lowering NODE_MAX_SLOTS to, say, 64.
>
> Arrrrrh! I made a mistake - it is actually 512 MB - sorry.

Mmm, in that case memory should not be the problem.  In my tests on a 
64-bit machine running Linux, your script consumes no more than 340 MB 
to complete (and it completes all 100,000 leaves without problems, as 
I said in a previous message), and on a 32-bit machine it should be 
less than that.  So I have no clue what is happening on your MacOSX 
box.

> >> Also, is there any reason why I can't/shouldn't use 2 or more .h5
> >> files in parallel (split the 3500+ top-level groups into two DBs)?
> >
> > As Ivan has already said, this is perfectly fine with PyTables.
> > However, it will only worsen the memory problem, as each opened
> > file has its own node cache, raising your memory needs still
> > further.  So, if you are dealing with a very large number of nodes
> > and don't have much memory available, it is better to keep only
> > one file open at a time.
>
> I understand that; however, it means I might be able to find a way to
> meaningfully segment the DB so each segment-DB is smaller.
> As I think I mentioned earlier, I did succeed in generating a
> complete DB by loading 1/2 of it, closing the file, quitting Python
> and then starting again with the same file, adding the 2nd 1/2.

Ah, ok.  I understand now what you are trying to do.
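
Just to be sure we are talking about the same thing, here is a rough 
sketch of what I understand you are doing.  The file name, group names 
and Record layout are invented for illustration; this uses the 
PyTables 2.x API (openFile/createGroup/createTable):

import tables

class Record(tables.IsDescription):
    # minimal row layout, just for the example
    value = tables.Float64Col()

def load_groups(filename, mode, first, last):
    """Create groups g<first>..g<last-1>, each holding one Table."""
    h5file = tables.openFile(filename, mode=mode)
    try:
        for i in range(first, last):
            group = h5file.createGroup("/", "g%06d" % i)
            h5file.createTable(group, "data", Record)
        h5file.flush()
    finally:
        h5file.close()

# First run: create the file and load the first half of the groups.
load_groups("segmented.h5", "w", 0, 1750)

# Second run (a fresh Python process): reopen the very same file in
# append mode and add the remaining half.
load_groups("segmented.h5", "a", 1750, 3500)

That is a perfectly valid way to build the file in stages.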

> All that aside,
> 1. it is a worry that there is no exception generated, by Python at
> least; don't you agree?

Yes, I do.

> 2. Is there a way of purging memory after a flush, or are we up
> against Python's memory management?

I think the first thing to try would be lowering NODE_MAX_SLOTS.  Have 
you tried this already?
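
For instance (a minimal sketch; it assumes your PyTables version reads 
NODE_MAX_SLOTS from the tables.parameters module when a file is 
opened -- if yours doesn't, you can edit the default directly in 
tables/parameters.py instead):

import tables
import tables.parameters

# Shrink the node cache *before* opening the file.  At roughly 1 MB
# per Table node, 64 slots caps the cache at about 64 MB instead of
# the ~256 MB that the default of 256 slots can reach.
tables.parameters.NODE_MAX_SLOTS = 64

h5file = tables.openFile("segmented.h5", mode="a")
# ... work with the file as usual ...
h5file.close()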

Cheers,

-- 
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
