Hi, I'm using HDF5 in my company to store hierarchically organized log data from a control system. Up to now we had the following setup:
There is one control PC which operates the machine and continuously stores the data in a local HDF5 file. Several departments in our company need the data for analysis and visualization, so we also have to provide read access to this file for a number of client PCs. Requirements are:

.) The data provided to the clients should be reasonably up to date. It doesn't have to be live or real-time data, but it should be no more than ~1 h old.

.) Since an HDF5 file that is open in a writer process cannot be read consistently, we cannot give the clients direct access to this file.

Currently we have a cronjob that periodically (once every 15 min) copies the HDF5 file to the fileserver. Clients use this copy for read-only analysis and visualization. Although this setup more or less satisfies our needs, there are several issues we suffer from and want to solve better in the next project:

1) HDF5 files cannot be read (and therefore not copied) consistently while another process writes to them. As our control PC continuously writes new data, this is quite painful when the cronjob tries to copy the file to the server, because a simple "cp" is not sufficient. We have implemented a primitive non-blocking-write protocol between the control process and our custom file copy tool, so that the file is only copied when the write process has flushed the file and not yet started a new write operation. If the copy finishes before the writer starts writing again, everything is fine; otherwise it is retried. (A sketch of this flush-then-copy handshake follows below.)

2) The copy on the server is effectively read-only, since any changes would be overwritten by the next sync copy (and multiple non-synchronized clients must be able to open the file anyway). Sometimes it would be nice if additional data (mostly comments) could be saved to the file during analysis.

3) Whenever the write process on the control PC crashes (and that happens from time to time, as it is under ongoing development), the HDF5 file may be corrupted, losing all previously saved data. We have therefore invested some shell-script effort to detect such corruption and restore the last good file from the server to keep the data loss small, but it is far from perfect and sometimes still means losing a whole day of data (since the last fileserver backup). (A sketch of the kind of consistency check I mean also follows below.)

To solve issues 1 and 2, we plan a different setup for the next project: a single HDF5 server that provides read and write access to the file via some kind of RPC protocol. (We already have such a protocol for other tasks in our control system and it performs well, so we will use that, as there seems to be no existing HDF5 server implementation that also supports remote write access.) Concurrency should then be no problem, as all clients are handled by a single server process which synchronizes its threads via read-write locks (a sketch of that locking scheme follows below as well); multithread support is also enabled in the HDF5 library. In effect this would be a kind of database server, except that we want to keep HDF5 as the file format because of its performance with large data sets and its hierarchical structure. A first implementation seems to work quite well, except that we still suffer from the problem that a crash of the server (or even of a single server thread) corrupts the whole file. Of course, one could argue that a server should be stable enough to prevent such situations, but it is unrealistic to assume that there will never be a bug in the software or some other failure (power loss, ...) that causes such a corruption.
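To make the handshake from issue 1 a bit more concrete, here is a stripped-down sketch of the writer side. This is not our real code; LOGFILE, SENTINEL and write_next_batch() are placeholder names, and it assumes the log file already exists. The idea is simply: write a batch, H5Fflush() it, signal "safe to copy" via a sentinel file, and remove the sentinel before the next batch starts.

    /* Sketch of the writer side of the flush-then-copy handshake.
     * Placeholder names: LOGFILE, SENTINEL, write_next_batch(). */
    #include <hdf5.h>
    #include <stdio.h>
    #include <unistd.h>

    #define LOGFILE  "machine_log.h5"
    #define SENTINEL "machine_log.h5.safe-to-copy"

    /* stands in for the real data-acquisition/writing code */
    static void write_next_batch(hid_t file) { (void)file; }

    int main(void)
    {
        hid_t file = H5Fopen(LOGFILE, H5F_ACC_RDWR, H5P_DEFAULT);
        if (file < 0) return 1;

        while (1) {
            remove(SENTINEL);                  /* tell the copy tool to hold off   */
            write_next_batch(file);            /* append the next chunk of data    */
            H5Fflush(file, H5F_SCOPE_GLOBAL);  /* push data and metadata to disk   */

            FILE *flag = fopen(SENTINEL, "w"); /* signal: file is consistent now   */
            if (flag) fclose(flag);

            sleep(1);                          /* wait until the next batch is due */
        }
    }

The copy tool only starts copying while the sentinel exists and checks it again after the copy has finished; if the writer removed it in the meantime, the copy is discarded and retried.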
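For the corruption detection in issue 3, what I have in mind is a small standalone checker that the restore script can call; its exit status decides whether we fall back to the last good copy. Again only a sketch (HDF5 1.8 API, names made up): it opens the file read-only and visits every object, and any failure makes it exit non-zero.

    /* Sketch of a standalone consistency check (HDF5 1.8 API):
     * open read-only and visit every object; exit non-zero on any
     * failure so a restore script can fall back to the last good copy. */
    #include <hdf5.h>
    #include <stdio.h>

    static herr_t visit_cb(hid_t obj, const char *name,
                           const H5O_info_t *info, void *op_data)
    {
        (void)obj; (void)name; (void)info; (void)op_data;
        return 0;                              /* keep iterating */
    }

    int main(int argc, char **argv)
    {
        if (argc != 2) { fprintf(stderr, "usage: %s FILE\n", argv[0]); return 2; }

        H5Eset_auto2(H5E_DEFAULT, NULL, NULL); /* silence the HDF5 error stack */

        if (H5Fis_hdf5(argv[1]) <= 0) return 1;   /* not a readable HDF5 file */

        hid_t file = H5Fopen(argv[1], H5F_ACC_RDONLY, H5P_DEFAULT);
        if (file < 0) return 1;

        /* walk the whole object tree; corrupted metadata usually fails here */
        herr_t rc = H5Ovisit(file, H5_INDEX_NAME, H5_ITER_NATIVE, visit_cb, NULL);

        H5Fclose(file);
        return (rc < 0) ? 1 : 0;
    }

This obviously only catches corruption that makes the metadata unreadable, so it is a complement to, not a replacement for, proper journaling.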
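And to clarify what I mean by the server synchronizing its threads via read-write locks, here is a stripped-down sketch of the request-handler side; the RPC transport is omitted and handle_read_request()/handle_write_request() are invented names. All HDF5 access goes through one process-wide pthread_rwlock_t: read RPCs take it shared, anything that modifies the file takes it exclusively.

    /* Sketch of the RPC handler side of the planned HDF5 server.
     * The RPC transport is omitted; handler names are illustrative.
     * All HDF5 access goes through one process-wide rwlock. */
    #include <hdf5.h>
    #include <pthread.h>

    static pthread_rwlock_t h5_lock = PTHREAD_RWLOCK_INITIALIZER;
    static hid_t            h5_file = -1;    /* opened once at server start */

    /* read RPC: e.g. "read a 1-D double dataset into the caller's buffer" */
    int handle_read_request(const char *dset_name, double *buf)
    {
        pthread_rwlock_rdlock(&h5_lock);           /* shared: many readers  */
        hid_t dset = H5Dopen2(h5_file, dset_name, H5P_DEFAULT);
        herr_t rc = (dset < 0) ? -1
                  : H5Dread(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                            H5P_DEFAULT, buf);
        if (dset >= 0) H5Dclose(dset);
        pthread_rwlock_unlock(&h5_lock);
        return (rc < 0) ? -1 : 0;
    }

    /* write RPC: anything that modifies the file takes the lock exclusively */
    int handle_write_request(const char *dset_name, const double *buf)
    {
        pthread_rwlock_wrlock(&h5_lock);           /* exclusive: one writer */
        hid_t dset = H5Dopen2(h5_file, dset_name, H5P_DEFAULT);
        herr_t rc = (dset < 0) ? -1
                  : H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL,
                             H5P_DEFAULT, buf);
        if (dset >= 0) H5Dclose(dset);
        H5Fflush(h5_file, H5F_SCOPE_LOCAL);        /* narrow the data-loss window on a crash */
        pthread_rwlock_unlock(&h5_lock);
        return (rc < 0) ? -1 : 0;
    }

As far as I know, the thread-safe build of HDF5 serializes all library calls through a single global lock anyway, so the rwlock mainly keeps our own per-request state consistent rather than giving real read parallelism inside the library; and, as described above, it does not help against a crash corrupting the file.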
As we have stricter requirements concerning the reliability and dependability of the log data for the next project than for the current one (a loss of a few hours up to one day is currently acceptable, but won't be acceptable then), this might turn out to be a showstopper (or at least mean a lot of additional effort to implement a more sophisticated custom detection and recovery framework). Now that I have told (or spammed ;-) ) you enough about what we do with HDF5 and what we plan, I have some concrete questions:

.) You announced support for a single-writer/multiple-reader approach in version 1.10. Do you already have any detailed information about how you plan to implement it and what the limitations will be? This feature might address most of our issues, so it has a great impact on how much effort we should invest in our server solution (it does not make sense to implement features that we can then get natively from HDF5).

.) Is there an existing project that offers remote access to HDF5 files in read and write mode that I did not find? (I only found read-only or write-local/read-remote implementations.)

.) In a post in August you mentioned that you are already finishing the implementation of metadata journaling. This is pretty much a must-have for our project, so I'm really interested in when it will be available. Is there any chance that it will be backported to 1.8, or do we have to wait for 1.10? Looking at the current 1.9 snapshot, I cannot find any hints of this journaling feature, so how much is still missing?

Thanks
chris
