Hi Leigh,

On Mar 7, 2011, at 3:01 PM, Leigh Orf wrote:
> On Mon, Mar 7, 2011 at 9:16 AM, Quincey Koziol <[email protected]> wrote:
>> Hi Leigh,
>>
>>> Chunk in Z only, so my chunk dimensions would be something like
>>> 28x21x30 (it's never been clear to me what chunk size to pick to
>>> optimize I/O).
>>>
>>> And keep the other parameters the same (1 stripe, and 3,000 files per
>>> directory).
>>>
>>> I guess what I'm mostly looking for is assurance that I will get
>>> faster I/O going down this kind of route than the current way I am
>>> doing unformatted I/O.
>>
>> This looks like a fruitful direction to go in. Do you really need
>> chunking though?
>
> Not sure. It's never been super clear to me what chunking gets you
> beyond (1) the ability to do compression and (2) faster seeking through
> large datasets when you want to access space towards the end of the
> file. I may just forgo chunking and see where that gets me first.

Chunking is required if you want to have unlimited dimensions on your
dataset's dataspace.

I would rephrase (2) above as "faster I/O when your selection is a good
match for the chunk size", where a good match could be an exact match or
a selection that is a well-aligned multiple or fraction of the chunk size.

If you aren't using compression, don't need unlimited dimensions, and
aren't performing I/O on selections of the dataset, contiguous storage
is probably a better fit.

	Quincey
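To make the contiguous-versus-chunked distinction above concrete, here is a
minimal sketch using the HDF5 C API. The file name, dataset names, float
datatype, and the 28x21x30 extents are assumptions for illustration only,
not anything taken from Leigh's actual code; the point is simply that
chunking is an opt-in property on the dataset creation property list, while
contiguous storage is what you get by default.

#include "hdf5.h"

int main(void)
{
    /* Hypothetical file name for illustration. */
    hid_t file = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    /* A 3-D dataspace with fixed extents; 28x21x30 is just the example
       size discussed above, not a recommendation. */
    hsize_t dims[3] = {28, 21, 30};
    hid_t space = H5Screate_simple(3, dims, NULL);

    /* Contiguous layout: the default, so H5P_DEFAULT for the dataset
       creation property list is all that is needed. No compression,
       no unlimited dimensions. */
    hid_t dset_contig = H5Dcreate2(file, "/w_contiguous", H5T_NATIVE_FLOAT,
                                   space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* Chunked layout: required for compression or unlimited dimensions.
       Here the chunk equals the whole dataset, so each full write touches
       exactly one chunk. */
    hid_t dcpl = H5Pcreate(H5P_DATASET_CREATE);
    hsize_t chunk[3] = {28, 21, 30};
    H5Pset_chunk(dcpl, 3, chunk);
    /* Optional: gzip compression, only available on chunked datasets. */
    /* H5Pset_deflate(dcpl, 4); */

    hid_t dset_chunked = H5Dcreate2(file, "/w_chunked", H5T_NATIVE_FLOAT,
                                    space, H5P_DEFAULT, dcpl, H5P_DEFAULT);

    H5Dclose(dset_chunked);
    H5Dclose(dset_contig);
    H5Pclose(dcpl);
    H5Sclose(space);
    H5Fclose(file);
    return 0;
}

As written, the chunked variant only earns its extra bookkeeping if
compression, unlimited dimensions, or chunk-aligned partial I/O are
actually on the table; otherwise the contiguous dataset above behaves the
same and is simpler.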
