yihua commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r802049943



##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
  *   <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
  *   <li>External mode: reading non-Hudi partitions</li>
  * </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
  */
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>

Review comment:
       I agree that the `Realtime` naming can be removed.
   
   My angle is that the InputFormat classes could be named after the Hudi file 
layout, i.e., file groups made up of file slices, each containing a base file 
and a set of log files.  "CopyOnWrite" and "MergeOnRead" describe the 
write/read logic one layer above that layout.  They are cleaner than the 
previous naming, but I think "BaseFile" and "BaseAndLogFile" prefixes may be a 
better fit here.
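   
   To make the layout-based naming concrete, here is a rough sketch. The 
types, method names, and file names below are hypothetical and only 
illustrate the idea; they are not Hudi's actual classes or the proposed 
implementation. Each reader is named after the files of a slice it consumes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model of a file slice: one base file plus a set of log files.
// Illustration only; not Hudi's actual FileSlice class.
final class FileSliceSketch {
    final String baseFile;
    final List<String> logFiles;

    FileSliceSketch(String baseFile, List<String> logFiles) {
        this.baseFile = baseFile;
        this.logFiles = logFiles;
    }
}

final class InputFormatNamingSketch {
    // A "BaseFile"-prefixed format would consume only the base file of a slice.
    static List<String> filesReadByBaseFileFormat(FileSliceSketch slice) {
        return Collections.singletonList(slice.baseFile);
    }

    // A "BaseAndLogFile"-prefixed format would consume the base file and its log files.
    static List<String> filesReadByBaseAndLogFileFormat(FileSliceSketch slice) {
        List<String> files = new ArrayList<>();
        files.add(slice.baseFile);
        files.addAll(slice.logFiles);
        return files;
    }

    public static void main(String[] args) {
        // Made-up file names, chosen only for the example.
        FileSliceSketch slice =
            new FileSliceSketch("base.parquet", Arrays.asList("log.1", "log.2"));
        System.out.println(filesReadByBaseFileFormat(slice));       // prints [base.parquet]
        System.out.println(filesReadByBaseAndLogFileFormat(slice)); // prints [base.parquet, log.1, log.2]
    }
}
```

   With this naming, the class name states which files of the slice are read 
rather than which table type the reader is usually associated with.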
   
   Since these changes are fundamental, I would prefer that the naming be 
finalized in 0.11.0 and then left unchanged for some time, so we need 
consensus here.



