Hello Michaël!

04.12.2017 21:23, Michaël Melchiore wrote:
I am building an application which operates on NetCDF data using Big Data technologies.

My design aims at avoiding unnecessarily writing data to disk. Instead, I want to operate as much as possible in memory. The challenge is data (de)serialization for distributed communications between computing nodes.

Since NetCDF4 and HDF5 already provide a portable data format, a simple and efficient design would simply access and then exchange the raw binary data over the network.

So far, I have not managed to access this buffer without creating files. I am investigating the use of the Apache Commons VFS RAM file system to trick NetCDF into working in memory.

However, a suggestion on the NetCDF Java mailing list (see ticket MQO-415619) was to build an alternative to the core driver. I feel this is the more desirable course of action, as it improves the existing solutions instead of working around their limitations.

Do you think this approach is feasible? Any starting pointers would be appreciated!
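
To make the in-memory idea above concrete, here is a minimal sketch in Python with h5py. It is not the NetCDF Java approach discussed in the ticket, only an illustration of what the existing HDF5 core driver and file image call already allow: build a file in RAM, take its raw bytes, and reopen those bytes elsewhere without creating a file. The helper names `dataset_to_bytes` and `bytes_to_array` and the file label `in_memory.h5` are made up for this example, and opening from a file-like object assumes h5py 2.9 or newer.

```python
import io

import h5py
import numpy as np


def dataset_to_bytes(name, array):
    """Build an HDF5 file entirely in memory and return its raw bytes."""
    # driver="core" keeps the file in RAM; backing_store=False means nothing
    # is written to disk on close. The file name is only a label here.
    with h5py.File("in_memory.h5", "w", driver="core", backing_store=False) as f:
        f.create_dataset(name, data=array)
        f.flush()
        # get_file_image() returns a copy of the file contents as bytes.
        return f.id.get_file_image()


def bytes_to_array(name, image):
    """Open an HDF5 image received as bytes, again without touching disk."""
    # Passing a file-like object is an assumption about the h5py version
    # (2.9 or newer).
    with h5py.File(io.BytesIO(image), "r") as f:
        return f[name][...]
```

The bytes returned by `dataset_to_bytes` could be sent between nodes with whatever transport the application already uses, and `bytes_to_array` would be called on the receiving side. Whether the same can be reached from Java depends on what the bindings expose, which this sketch does not address.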

I am no HDF5 expert, but I would suggest having a look at
https://www.hdfgroup.org/downloads/spark-connector/
It would be great if you could share your experience, and whether the Spark connector helped you implement in-memory processing.

Best wishes,
Andrey Paramonov



_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
