Stephen Sisk created BEAM-1592:
----------------------------------

             Summary: Unify HdfsIO and HadoopInputFormatIO
                 Key: BEAM-1592
                 URL: https://issues.apache.org/jira/browse/BEAM-1592
             Project: Beam
          Issue Type: Bug
          Components: sdk-java-core
            Reporter: Stephen Sisk
            Assignee: Davor Bonaci


HIFIO (HadoopInputFormatIO) is currently in PR 
(https://github.com/apache/beam/pull/1994). As per the discussion in 
https://lists.apache.org/thread.html/803857877804165e798cf31edf079e6603eb9682b7690d52124c31e7@%3Cdev.beam.apache.org%3E,
we'd like to check HIFIO in as-is, then unify HdfsIO and HIFIO, since they 
share a lot of code. 

[[email protected]] has mentioned: "the FileInputFormat reader gets to call 
some special APIs that the generic InputFormat reader cannot -- so they are not 
completely redundant. Specifically, FileInputFormat reader can do size-based 
splitting." 

Dan recommended: "See if we can "inline" the FileInputFormat specific parts of 
HdfsIO inside of HadoopInputFormatIO via reflection. If so, we can get the best 
of both worlds with shared code." 

This seems reasonable to me. 
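
To make the reflection idea concrete, here is a minimal sketch of the pattern 
only. It does not use the Hadoop or Beam APIs: GenericFormat, FileLikeFormat, 
getTotalBytes and splitCount are all hypothetical stand-ins, chosen so the 
example runs on its own. The shared reader probes for a size-reporting method 
via reflection and does size-based splitting only when that method exists; 
otherwise it falls back to a single split.

```java
import java.lang.reflect.Method;

// Sketch only: every class and method name here is a hypothetical stand-in,
// not part of Hadoop or Beam.
final class ReflectiveSplitSketch {

    // Stand-in for a generic InputFormat that exposes no size information.
    static class GenericFormat {
    }

    // Stand-in for a FileInputFormat-like class that can report its total size.
    static class FileLikeFormat extends GenericFormat {
        private final long totalBytes;

        FileLikeFormat(long totalBytes) {
            this.totalBytes = totalBytes;
        }

        public long getTotalBytes() {
            return totalBytes;
        }
    }

    // Returns how many splits to create for the given format.
    // If the format exposes getTotalBytes(), split by size (ceiling division);
    // otherwise fall back to one split, as a generic reader would.
    static long splitCount(Object format, long desiredBundleBytes) {
        if (desiredBundleBytes <= 0) {
            throw new IllegalArgumentException("desiredBundleBytes must be positive");
        }
        try {
            Method sizeMethod = format.getClass().getMethod("getTotalBytes");
            long total = (Long) sizeMethod.invoke(format);
            long splits = (total + desiredBundleBytes - 1) / desiredBundleBytes;
            return Math.max(1L, splits);
        } catch (NoSuchMethodException e) {
            // No size API available: behave like the generic code path.
            return 1L;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("size lookup failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(splitCount(new FileLikeFormat(1000L), 100L));
        System.out.println(splitCount(new GenericFormat(), 100L));
    }
}
```

In a real unification the probe would presumably target the actual 
FileInputFormat-specific calls rather than a made-up getTotalBytes(), and the 
fallback would be whatever HadoopInputFormatIO already does for formats 
without size information.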



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
