Hi Mark,
On Feb 23, 2011, at 5:43 PM, Mark Miller wrote:
> On Wed, 2011-02-23 at 14:41, Quincey Koziol wrote:
>>
>> Ah, yes, that may be a good segue into this two-pass feature. I've
>> been thinking about this feature and wondering about how to implement
>> it. Something that occurs to me would be to construct it like a
>> "transaction", where the application opens a transaction, the HDF5
>> library just records those operations performed with API routines,
>> then when the application closes the transaction, they are replayed
>> twice: once to record the results of all the operations, and then a
>> second pass that actually performs all the I/O. That would also help
>> to reduce the overhead from collective metadata modifications.
>>
>> BTW, if we go down this "transaction" path, it allows the HDF5
>> library to push the fault tolerance up to the application level - the
>> library could guarantee that the unit of atomicity "visible" in the
>> file was an entire checkpoint, rather than a single API call.
>
> Hmm. That's only true if 'transaction' is whole-file scope, right? I
> mean, aren't you going to allow the application to decide what
> 'granularity' a transaction should be: a single dataset, a bunch of
> datasets in a group in the file, etc.?
Yes, it would be whole-file scope. (Although the modifications
within the transaction could be limited to changes to a single dataset, of
course.)
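To make the idea concrete, here's a rough sketch of what the
application code might look like. (The H5TR* routines don't exist - those
names and signatures are just placeholders for this discussion; the H5F/H5D
calls are the real ones.)

    #include "hdf5.h"

    double buf[100] = {0};

    hid_t file_id = H5Fopen("checkpoint.h5", H5F_ACC_RDWR, H5P_DEFAULT);

    /* Hypothetical: open a whole-file transaction */
    hid_t tr_id = H5TRbegin(file_id);

    /* Normal HDF5 calls; the library would record these operations
     * instead of executing them immediately */
    hid_t dset_id = H5Dopen2(file_id, "/state", H5P_DEFAULT);
    H5Dwrite(dset_id, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
    H5Dclose(dset_id);

    /* Hypothetical: closing the transaction replays the recorded
     * operations twice - pass 1 computes sizes and metadata changes,
     * pass 2 performs the actual I/O - then commits atomically */
    H5TRend(tr_id);
    H5Fclose(file_id);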
> If scope of 'transaction' is only a whole-file, then...
>
> I may be misunderstanding your notions here, but I don't think you'd
> want to design this around the assumption that a 'transaction' could
> embody something that included all buffer pointers passed into HDF5 by
> the caller, and that HDF5 could then automagically FINISH the
> transaction on behalf of the application without returning control back
> to the application.
>
> I think there are going to be too many situations where applications
> unwind their own internal data structures, placing data into temporary
> buffers that are then handed off to HDF5 for I/O and freed. And, for a
> given HDF5 file, this likely happens again and again as different parts
> of the application's internal data are spit out to HDF5. But, not to
> worry.
Hmm, I think you are saying that the application would re-use the
buffer(s) passed to HDF5 for more than one call to H5Dwrite() - is that the
case? If so, that would complicate the transaction idea. Hmm... Perhaps a
callback for the application to free/re-allocate the memory buffers? Maybe we
could use the transaction idea for just the metadata modifications, and the
two-pass idea for the raw data writes?
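For the callback idea, I'm picturing something along these lines (the
callback type and the H5TRset_buffer_release_callback() routine are invented
for the sake of discussion, as is the tr_id transaction handle from the
earlier sketch):

    #include <stdlib.h>    /* for free() */

    /* Hypothetical callback type: the library would invoke this once
     * it's finished with a buffer the application handed to H5Dwrite(),
     * so the application knows it's safe to free or re-fill it */
    typedef void (*H5TR_buf_release_cb_t)(const void *buf, void *udata);

    static void
    my_release(const void *buf, void *udata)
    {
        (void)udata;            /* unused in this sketch */
        free((void *)buf);      /* safe to reclaim the buffer now */
    }

    /* Hypothetical registration on the open transaction */
    H5TRset_buffer_release_callback(tr_id, my_release, NULL);

That way the library could defer the actual write until the transaction
close without copying, as long as the application doesn't touch the buffer
until the callback fires.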
> My idea included the notion that the application would have to
> re-engage in all such 'data prep for I/O' processes a second time. I
> assume the time to complete such a process, relative to actual I/O
> time, is small enough that it doesn't matter to the application that it
> has to do it twice. I think for most applications that would be true,
> and it would be relatively easy to engineer the work to happen in two
> passes.
Yes, running through the whole process of copying data into the
temporary buffers would be necessary if the temporary buffers are re-used.
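From the application side, I imagine it would look roughly like this
(H5TRstart_pass() is made up, and pack_part() stands in for the application's
own data-prep code):

    /* Hypothetical two-pass loop: the application re-runs its own
     * "unwind internal structures into temp buffers" code once per
     * pass; the library records operations on pass 0 and only does
     * the real I/O on pass 1 */
    for (int pass = 0; pass < 2; pass++) {
        H5TRstart_pass(tr_id, pass);            /* hypothetical */
        for (int i = 0; i < n_parts; i++) {
            double *tmp = pack_part(i);         /* app's own data prep */
            H5Dwrite(dsets[i], H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                     H5P_DEFAULT, tmp);
            free(tmp);                          /* temp buffer reclaimed */
        }
    }
    H5TRend(tr_id);                             /* hypothetical commit */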
Something else to consider - I'm just wrapping up an exascale-oriented
workshop this morning, and after listening to the presentations and talking to
people here, I'm concerned that the exascale-class machines are not going to
want to double-copy/compress the data they are writing. Yes, they will have
more compute than I/O bandwidth, but the cost of moving data through memory is
going to be very high... :-/
Quincey