I skimmed through HdfsIO, and I think it is essentially HadoopInputFormatIO with FileInputFormat. I would move most of the code into HadoopInputFormatIO and just make HdfsIO a specific instance of HIF_IO.
On Wed, Feb 15, 2017 at 9:15 AM, Dipti Kulkarni <dipti_dkulka...@persistent.com> wrote:

> Hello there!
> I am working on writing a Read IO for Hadoop InputFormat. This will enable
> reading from any datasource which supports Hadoop InputFormat, i.e.
> provides source to read from InputFormat for integration with Hadoop.
> It makes sense for the HadoopInputFormatIO to share some code with the
> HdfsIO - WritableCoder in particular, but also some helper classes like
> SerializableSplit etc. I was wondering if we could move HDFS and
> HadoopInputFormat into a shared module for Hadoop IO in general instead of
> maintaining them separately.
> Do let me know on what you think, please let me know if you can think of
> any other ideas too.
>
> Thanks,
> Dipti