Re: [Pytables-users] Storing small files in the filenodes

2013-07-18 Thread Valeriy Sokolov
Thank you, Anthony,

I will try VLArray as you suggested =)


On Thu, Jul 18, 2013 at 3:39 AM, Anthony Scopatz scop...@gmail.com wrote:

 Hello Valeriy,

 For better or worse, this is exactly the performance I would expect.  The
 thing that you are running up against is that every HDF5 data set has 64 KB
 of header space for meta information.  There is no way of changing this
 without invalidating the HDF5 spec.  The fact that you are seeing an
 average of 70 KB per data set is consistent since data sets don't need to
 be contiguously stored.

 I would suggest that you use a VLArray [1] of length-1 string atoms.
 You'll lose the filenode interface but you'll also lose the 3200%
 overhead =).

 Be Well
 Anthony

 1.
 http://pytables.github.io/usersguide/libref/homogenous_storage.html#the-vlarray-class


 On Wed, Jul 17, 2013 at 3:14 PM, Valeriy Sokolov sokolov.val...@gmail.com
  wrote:

 Not sure if the quoted message was delivered to the list (maybe because I
 was not registered on this list), so reposting it this way...

 On Fri, Jul 12, 2013 at 5:40 PM, Valeriy Sokolov 
 sokolov.val...@gmail.com wrote:

 Hi,

 I am trying to store lots of small (~2 KB) files in PyTables filenodes,
 and I ran into trouble with the size overhead.

 200 such files, which consume ~2 MB in total on the filesystem, take 14 MB
 in the .h5 file produced by PyTables. My experiments show that if I create
 200 file nodes and store 1 byte in each, I get an .h5 file of 14 MB.
 Starting from roughly 200 KB per file node, the size grows linearly:
 400 KB per node leads to 89 MB, and 800 KB per node leads to 164 MB.

 But I would like to store ~2 KB per node, and the current overhead (about
 70 KB per file node) is huge.

 Could you please help me find a workaround for this issue?

 Thank you in advance.

 --
 Best regards,
 Valeriy Sokolov.




 --
 Best regards,
 Valeriy Sokolov.


 ___
 Pytables-users mailing list
 Pytables-users@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/pytables-users





-- 
Best regards,
Valeriy Sokolov.


Re: [Pytables-users] Storing small files in the filenodes

2013-07-17 Thread Valeriy Sokolov
Not sure if the quoted message was delivered to the list (maybe because I
was not registered on this list), so reposting it this way...

On Fri, Jul 12, 2013 at 5:40 PM, Valeriy Sokolov
sokolov.val...@gmail.com wrote:

 Hi,

 I am trying to store lots of small (~2 KB) files in PyTables filenodes,
 and I ran into trouble with the size overhead.

 200 such files, which consume ~2 MB in total on the filesystem, take 14 MB
 in the .h5 file produced by PyTables. My experiments show that if I create
 200 file nodes and store 1 byte in each, I get an .h5 file of 14 MB.
 Starting from roughly 200 KB per file node, the size grows linearly:
 400 KB per node leads to 89 MB, and 800 KB per node leads to 164 MB.

 But I would like to store ~2 KB per node, and the current overhead (about
 70 KB per file node) is huge.
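
 For reference, the per-node figure follows directly from the numbers above.
 This is only a back-of-the-envelope check, assuming decimal units
 (1 MB = 1000 KB); the variable names are illustrative.

```python
# Figures reported in this message (decimal units assumed: 1 MB = 1000 KB).
n_nodes = 200
h5_size_kb = 14 * 1000   # 14 MB file holding 200 one-byte file nodes
payload_kb = 2           # intended size of each stored file

per_node_kb = h5_size_kb / n_nodes          # space taken per file node
overhead_ratio = per_node_kb / payload_kb   # relative to the real payload

print(f"{per_node_kb:.0f} KB per node, {overhead_ratio:.0f}x the payload")
# prints "70 KB per node, 35x the payload"
```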

 Could you please help me find a workaround for this issue?

 Thank you in advance.

 --
 Best regards,
 Valeriy Sokolov.




-- 
Best regards,
Valeriy Sokolov.