More on this subject.  Apparently, the HDF5 crew is going to optimize 
this use case:

----------  Forwarded message  ----------

Subject: Re: [hdf-forum] Reading across multiple chunks is very slow
Date: Thursday 02 April 2009
From: Francesc Alted <fal...@pytables.org>
To: hdf-fo...@hdfgroup.org

On Wednesday 01 April 2009, Neil Fortner wrote:
[clip]
> > As can be seen, the scatter process completes in around 40 + 40 =
> > 80 ms for my chunk size.  Hence, I'd expect the second read case to
> > complete in around 0.17 + 0.08 = 0.25 seconds, while the actual
> > time is still 5x slower.  So, unless I'm missing something, my
> > guess is that the scatter code in the HDF5 library could be made a
> > lot faster.
> >
> > Thanks,
>
> Thanks for the investigation.  The routines currently used for
> partial I/O are very general and there is a substantial amount of
> overhead when the sequences are very short (4 bytes in this case).  I
> have filed an enhancement bug to add optimized routines to handle
> this case, and hopefully we will get to this soon.
>
> The optimized code would be used whenever:
> - There is at most one hyperslab selection in the source and
> destination (and no points), and:
> - Either:
>     - The selections in the source and destination have the same
> dimensions (but not necessarily offsets), or:
>     - The selection in either the source or destination is contiguous
> (in the serialized form)
>
> For chunked datasets this would of course be applied on a per-chunk
> basis.  Does this seem like it would fit most of your use cases?

I think so, yes.  These requirements would cover a very important use 
case.  In my experience, when one asks for performance, accessing data 
with the above restrictions is very reasonable and common practice.
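
To make the conditions Neil lists a bit more concrete, here is a rough
sketch of them as a small Python predicate.  This is only an illustration:
the `Hyperslab` class and the `is_contiguous` and `fast_path_applies`
functions are hypothetical names of mine, not part of HDF5 or PyTables.
The sketch assumes that each side has exactly one regular block selection
and no point selections (the first condition), and it assumes that
"contiguous in the serialized form" means one unbroken run in C
(row-major) order.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Hyperslab:
    """Hypothetical description of a single regular block selection."""
    offset: Tuple[int, ...]  # start index along each dimension
    count: Tuple[int, ...]   # number of elements selected along each dimension


def is_contiguous(sel: Hyperslab, shape: Tuple[int, ...]) -> bool:
    """True if the block is one unbroken run in C (row-major) order.

    Assumption: this is what "contiguous in the serialized form" means.
    """
    for i, n in enumerate(sel.count):
        if n > 1:
            # Every dimension after the first one that spans more than
            # one index must be selected in full.
            return all(c == s for c, s in zip(sel.count[i + 1:], shape[i + 1:]))
    return True  # a single element is trivially contiguous


def fast_path_applies(src: Hyperslab, src_shape: Tuple[int, ...],
                      dst: Hyperslab, dst_shape: Tuple[int, ...]) -> bool:
    """Check the second condition; the first (one hyperslab per side,
    no points) is assumed to hold by construction."""
    same_dims = src.count == dst.count  # offsets are allowed to differ
    return (same_dims
            or is_contiguous(src, src_shape)
            or is_contiguous(dst, dst_shape))


# Example: one column of a 1000x1000 dataset read into a 1-D buffer.
column = Hyperslab(offset=(0, 5), count=(1000, 1))
buffer_1d = Hyperslab(offset=(0,), count=(1000,))
print(fast_path_applies(column, (1000, 1000), buffer_1d, (1000,)))
```

In the example the source selection is strided in the file, but the
destination buffer is contiguous, so under these assumptions the read
would qualify for the optimized routines.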

Thanks!

-- 
Francesc Alted

"One would expect people to feel threatened by the 'giant
brains or machines that think'.  In fact, the frightening
computer becomes less frightening if it is used only to
simulate a familiar noncomputer."

-- Edsger W. Dijkstra
   "On the cruelty of really teaching computer science"

------------------------------------------------------------------------------
_______________________________________________
Pytables-users mailing list
Pytables-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/pytables-users
