You may want to take a look at com.datatorrent.lib.fileaccess.DTFileReader in malhar-library, though I am not sure whether it supports reading the whole file into memory.
Also, there is a library called Megh at https://github.com/DataTorrent/Megh where you might find some useful operators, such as com.datatorrent.contrib.hdht.hfile.HFileImpl.

From: Roger F <[email protected]>
Reply-To: <[email protected]>
Date: Sunday, January 22, 2017 at 9:32 PM
To: <[email protected]>
Subject: One-time Initialization of in-memory data using a data file

Hi,

I have a use case where application business data needs to be migrated from a legacy system (such as a mainframe) into HDFS and then loaded for use by an Apex application. To get this done, one approach being considered is to perform a one-time initialization of the data from HDFS into application memory. This data will then be queried by various business logic functions of the application.

Once the data is loaded, this operator/module (?) should no longer perform any further function except acting as the master of this data and supporting operations to query the data (via a key).

Any pointers on how this can be done? I was looking for an operator or any other entity which can load this data at startup (Activation or Setup) and then allow queries to be submitted to it via an input port.

-R
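
For what it is worth, the load-once, query-by-key pattern described in the question can be pictured with the plain-Java sketch below. It is only an illustration under stated assumptions: KeyedLookupStore, its setup() and query() methods, and the "key,value" line format are all hypothetical and are not part of Apex, Malhar, or Megh. In an actual Apex operator the load would presumably happen in the operator's setup() call and the lookup in an input port's process() method, with the Reader opened on the HDFS file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: NOT an Apex/Malhar class, just the shape of the idea.
class KeyedLookupStore {
    // All records are held in memory after the one-time load.
    private final Map<String, String> records = new HashMap<>();
    private boolean loaded = false;

    // Plays the role of an operator's one-time setup step.
    // Assumes each line has the form "key,value" (an assumption, not a given).
    void setup(Reader source) {
        if (loaded) {
            return; // already initialized: later calls do nothing
        }
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                int comma = line.indexOf(',');
                if (comma <= 0) {
                    continue; // skip blank lines and lines without a key
                }
                String key = line.substring(0, comma).trim();
                String value = line.substring(comma + 1).trim();
                records.put(key, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        loaded = true;
    }

    // Plays the role of an input port receiving a key and returning a result.
    Optional<String> query(String key) {
        return Optional.ofNullable(records.get(key));
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        KeyedLookupStore store = new KeyedLookupStore();
        // A StringReader stands in for the data file on HDFS.
        store.setup(new StringReader("C001,Alice\nC002,Bob\n"));
        System.out.println(store.query("C001").orElse("<missing>"));
        System.out.println(store.query("C999").orElse("<missing>"));
    }
}
```

Keeping the load in a single guarded setup step means the store does no further work after startup other than answering lookups, which is the behaviour asked for above.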
