Thank you, Anthony,

I will try VLArray as you suggested =)
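
For the archive, here is a minimal sketch of how I read that suggestion: put all the small files into a single VLArray, one file per row, so the per-dataset cost is paid once rather than once per file. It assumes the PyTables 3.x names (`open_file`, `create_vlarray`) and NumPy. The node name `files` and the helpers `pack_files` / `unpack_file` are made up for illustration, and I have not measured the resulting file size.

```python
import numpy as np
import tables  # assumes PyTables 3.x naming (open_file, create_vlarray)


def pack_files(h5_path, blobs):
    """Store each bytes object in `blobs` as one row of a single VLArray."""
    with tables.open_file(h5_path, mode="w") as h5:
        # One dataset holds every file; "files" is a hypothetical node name.
        vla = h5.create_vlarray(
            h5.root,
            "files",
            atom=tables.StringAtom(itemsize=1),  # length-1 string atoms
            title="small files, one per row",
        )
        for blob in blobs:
            # frombuffer gives a read-only view of the bytes as 1-byte
            # strings; copy() yields an array that owns its data.
            vla.append(np.frombuffer(blob, dtype="S1").copy())


def unpack_file(h5_path, index):
    """Return row `index` of the VLArray as a bytes object."""
    with tables.open_file(h5_path, mode="r") as h5:
        # tobytes() returns the raw buffer, so NUL bytes are preserved.
        return h5.root.files[index].tobytes()
```

Rows are addressed by position only, so a real replacement for filenodes would probably also need a small table mapping file names to row indices.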


On Thu, Jul 18, 2013 at 3:39 AM, Anthony Scopatz <scop...@gmail.com> wrote:

> Hello Valeriy,
>
> For better or worse, this is exactly the performance I would expect.  The
> thing that you are running up against is that every HDF5 data set has 64 Kb
> of header space for meta information.  There is no way of changing this
> without invalidating the HDF5 spec.  The fact that you are seeing an
> average of 70 Kb per data set is consistent since data sets don't need to
> be contiguously stored.
>
> I would suggest that you use a VLArray [1] of length-1 string atoms.
>  You'll lose the filenode interface but you'll also lose the 3200%
> overhead =).
>
> Be Well
> Anthony
>
> 1.
> http://pytables.github.io/usersguide/libref/homogenous_storage.html#the-vlarray-class
>
>
> On Wed, Jul 17, 2013 at 3:14 PM, Valeriy Sokolov <sokolov.val...@gmail.com> wrote:
>
>> Not sure if the quoted message was delivered to the list (maybe because I
>> was not registered on this list), so reposting it this way...
>>
>> On Fri, Jul 12, 2013 at 5:40 PM, Valeriy Sokolov <
>> sokolov.val...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am trying to store lots of small (~2Kb) files in PyTables filenodes,
>>> and I have run into trouble with the size overhead.
>>>
>>> 200 such files, which consume ~2Mb in total on the filesystem, take 14Mb
>>> in the .h5 file produced by PyTables. My experiments show that if I create
>>> 200 file nodes and store 1 byte in each, I get an .h5 file of 14Mb. The
>>> size only starts to grow linearly from about 200Kb per file node,
>>> i.e. 400Kb per node leads to 89Mb, and 800Kb per node leads to 164Mb.
>>>
>>> But I would like to store ~2Kb per node, and the current overhead (about
>>> 70Kb per file node) is pretty huge.
>>>
>>> Could you please help me with a work-around for this issue?
>>>
>>> Thank you in advance.
>>>
>>> --
>>> Best regards,
>>> Valeriy Sokolov.
>>>
>>
>>
>>
>> --
>> Best regards,
>> Valeriy Sokolov.
>>
>>
>> _______________________________________________
>> Pytables-users mailing list
>> Pytables-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/pytables-users
>>
>>
>


-- 
Best regards,
Valeriy Sokolov.