Hello Michaël!
04.12.2017 21:23, Michaël Melchiore wrote:
I am building an application that operates on NetCDF data using Big Data
technologies.
My design aims at avoiding unnecessarily writing data to disk. Instead,
I want to operate as much as possible in memory. The challenge is data
(de)serialization for distributed communications between computing nodes.
Since NetCDF4 and HDF5 already provide a portable data format, a simple
and efficient design would access the raw binary data and exchange it
directly over the network.
Currently, I cannot access this buffer without creating files. I am
investigating the use of the Apache Commons VFS RAM file system to trick
NetCDF into working in memory.
However, a suggestion on the NetCDF Java mailing list (see ticket
MQO-415619) was to build an alternative to the core driver. I feel this
is the more desirable course of action, as it improves the
existing solutions instead of working around their limitations.
Do you think this approach is feasible? Any starting pointers would be
appreciated!
I am no distinguished HDF5 expert, but I would suggest checking out
https://www.hdfgroup.org/downloads/spark-connector/
It would be great if you could share your experience and say whether the
Spark connector helped you implement in-memory processing.
Best wishes,
Andrey Paramonov
_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@lists.hdfgroup.org
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5