>> I'm using PyTables to build a persistent binary tree containing from
>> 10^3 to 10^8 (maybe 10^10 in the future) integers. The structure
>> mirrors an earlier binary tree built entirely in memory.
>> For now, traversing the tree in the HDF5 file is 10 times slower than
>> traversing the tree in memory...
>
> How much data are you putting into the leaves?  Also, how many leaves do you
> have in your binary tree?

For now the maximum leaf size is 300 (that is, leaves are Arrays
containing 300 or fewer integers); the number of nodes depends on the
size of the dataset. I'm currently using datasets of between 1000 and
10^8 elements. The tree is built by splitting every node (= PyTables
Group) into two children until the number of elements is <= 300.
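
To make the splitting rule concrete, and to get a rough answer to the
"how many leaves" question, here is a minimal in-memory sketch. It is
not the actual PyTables code: it assumes each node is split into two
halves of (nearly) equal size, which the description above does not
state, and the names build_tree and leaf_count are made up for
illustration. In the HDF5 version every inner dict would be a Group and
every leaf list an Array.

```python
from functools import lru_cache

MAX_LEAF_SIZE = 300  # leaf size limit quoted in the message


def build_tree(values, max_leaf_size=MAX_LEAF_SIZE):
    """Split the sorted values in half until every leaf is small enough.

    Assumption: a node is split at its midpoint. Inner nodes are dicts
    with "left"/"right" keys, leaves are dicts with a "leaf" list.
    """
    def split(chunk):
        if len(chunk) <= max_leaf_size:
            return {"leaf": chunk}
        mid = len(chunk) // 2
        return {"left": split(chunk[:mid]), "right": split(chunk[mid:])}

    return split(sorted(values))


@lru_cache(maxsize=None)
def leaf_count(n, max_leaf_size=MAX_LEAF_SIZE):
    """Number of leaves produced for n elements, without building the tree."""
    if n <= max_leaf_size:
        return 1
    half = n // 2
    return leaf_count(half, max_leaf_size) + leaf_count(n - half, max_leaf_size)
```

Under this assumed halving rule, leaf_count(10**8) gives 524288 leaves
(and one fewer inner node), so a full traversal touches roughly a
million HDF5 nodes.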

>> Is it possible to load the entire tree into memory before traversing
>> the nodes, instead of using the read/caching method, so that traversal
>> is faster?
>
> Without knowing more about the number of nodes you have, I'd say that a
> slowdown of 10x relative to memory is expected.  But you can play with the
> internal cache and try to load everything in memory.  You may want to
> experiment with a large negative value for the ``NODE_CACHE_SLOTS``
> parameter of PyTables.  See Appendix C of the User's Guide for how to
> change this setting:
>
> http://www.pytables.org/docs/manual/apc.html
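
A minimal sketch of the suggestion above, assuming a recent PyTables
(3.x), where the file is opened with tables.open_file and parameters
such as NODE_CACHE_SLOTS can be overridden per file by passing them as
keyword arguments. The function name, the default of -1000000 slots and
the traversal itself are only illustrative; a negative value asks
PyTables to keep every touched node in an internal dictionary rather
than evicting least recently used ones.

```python
import tables


def count_elements(path, slots=-1_000_000):
    """Walk every Array leaf in the file and return the total element count.

    NODE_CACHE_SLOTS is overridden for this file handle only; the
    default value of `slots` is an arbitrary large negative number.
    """
    with tables.open_file(path, mode="r", NODE_CACHE_SLOTS=slots) as h5:
        total = 0
        # Visit all Array nodes below the root group, at any depth.
        for leaf in h5.walk_nodes("/", classname="Array"):
            total += len(leaf.read())
        return total
```

The first traversal still has to read each node from disk; any gain
from the larger cache should show up on repeated visits to the same
nodes, so it is worth timing both cases.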

Thank you very much!

I will try this!

brunetto
