Hello Valeriy,

For better or worse, this is exactly the performance I would expect.  The
thing that you are running up against is that every HDF5 data set has 64 KB
of header space for meta information.  There is no way of changing this
without invalidating the HDF5 spec.  The fact that you are seeing an
average of 70 KB per data set is consistent with this, since data sets
don't need to be stored contiguously.
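
As a rough sanity check on those numbers, here is a small sketch of the
arithmetic.  It takes the 64 KB per-data-set header figure above as given;
the function names are made up for illustration and are not part of
PyTables or HDF5.

```python
KB_PER_MB = 1024


def header_total_mb(n_datasets, header_kb=64):
    """Total space taken by per-data-set headers, in MB.

    header_kb defaults to the 64 KB figure quoted in this thread.
    """
    return n_datasets * header_kb / KB_PER_MB


def overhead_percent(payload_kb, header_kb=64):
    """Header size as a percentage of the payload stored in one data set."""
    return header_kb / payload_kb * 100


# 200 file nodes, each holding about 2 KB of real data
print(header_total_mb(200))   # headers alone, in MB
print(overhead_percent(2))    # header overhead relative to a 2 KB payload
```

With 200 nodes the headers alone come to 12.5 MB, which is in the same
range as the 14 MB file you measured, and 64 KB of header on a 2 KB
payload works out to 3200%.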

I would suggest that you use a VLArray [1] of length-1 string atoms.
You'll lose the filenode interface but you'll also lose the 3200%
overhead =).
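
A minimal sketch of what that could look like, assuming PyTables 3.x and
NumPy are installed.  The helper names (store_blobs, load_blob) and the
node name "files" are hypothetical, chosen only for this example; each
small file becomes one variable-length row in a single VLArray, so the
data set header is paid once rather than once per file.

```python
import numpy as np
import tables


def store_blobs(h5_path, blobs):
    """Write each bytes object in `blobs` as one row of a single VLArray."""
    with tables.open_file(h5_path, mode="w") as h5:
        # One data set for all files; every row is a variable-length
        # sequence of length-1 string atoms (one atom per byte).
        vla = h5.create_vlarray(
            h5.root, "files", atom=tables.StringAtom(itemsize=1)
        )
        for blob in blobs:
            # View the bytes as an array of 1-byte strings, then copy so
            # PyTables receives an ordinary writable array.
            vla.append(np.frombuffer(blob, dtype="S1").copy())


def load_blob(h5_path, index):
    """Read row `index` back as a bytes object."""
    with tables.open_file(h5_path, mode="r") as h5:
        # Indexing a VLArray with an int returns that row as a NumPy array.
        return h5.root.files[index].tobytes()
```

You would need to keep track of which row holds which file yourself (for
example with a separate Table of names), since the filenode interface is
gone.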

Be Well
Anthony

1. http://pytables.github.io/usersguide/libref/homogenous_storage.html#the-vlarray-class


On Wed, Jul 17, 2013 at 3:14 PM, Valeriy Sokolov
<sokolov.val...@gmail.com> wrote:

> Not sure if the quoted message was delivered to the list (maybe because I
> was not registered on this list), so reposting it this way...
>
> On Fri, Jul 12, 2013 at 5:40 PM, Valeriy Sokolov <sokolov.val...@gmail.com
> > wrote:
>
>> Hi,
>>
>> I am trying to store lots of small (~2 KB) files in PyTables file nodes,
>> and I have run into trouble with the size overhead.
>>
>> 200 such files, which take ~2 MB in total on the filesystem, take 14 MB
>> in the .h5 file produced by PyTables. My experiments show that if I create
>> 200 file nodes and store 1 byte in each, the .h5 file is 14 MB. The file
>> size starts to grow linearly at roughly 200 KB per file node, i.e.
>> 400 KB per node leads to 89 MB, and 800 KB per node leads to 164 MB.
>>
>> But I would like to store ~2 KB per node, and the current overhead (about
>> 70 KB per file node) is pretty huge.
>>
>> Could you please help me with a workaround for this issue?
>>
>> Thank you in advance.
>>
>> --
>> Best regards,
>> Valeriy Sokolov.
>>
>
>
>
> --
> Best regards,
> Valeriy Sokolov.
>
>
> ------------------------------------------------------------------------------
> See everything from the browser to the database with AppDynamics
> Get end-to-end visibility with application monitoring from AppDynamics
> Isolate bottlenecks and diagnose root cause in seconds.
> Start your free trial of AppDynamics Pro today!
> http://pubads.g.doubleclick.net/gampad/clk?id=48808831&iu=/4140/ostg.clktrk
> _______________________________________________
> Pytables-users mailing list
> Pytables-users@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pytables-users
>
>