yihua commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r802049943
##########
File path:
hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
* <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
* <li>External mode: reading non-Hudi partitions</li>
* </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
*/
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>
Review comment:
I agree that the `Realtime` naming can be removed.
My angle is that the InputFormat classes could be named after the Hudi file
layout, i.e., file groups made up of file slices, each containing a base file
and a set of log files. The "CopyOnWrite" and "MergeOnRead" names sit one
layer above that, at the level of write/read logic. They are cleaner than the
previous naming, but I think "BaseFile" and "BaseAndLogFile" prefixes may be a
better fit here.
Since these changes are fundamental, I'd prefer that the naming be finalized
in 0.11.0 and then left unchanged for some time, so we need consensus here.
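To make the layout-based naming concrete, here is a rough sketch of the
distinction I have in mind. `FileSlice`, `LayoutNaming`, and both method names
are hypothetical and exist only for illustration; they are not Hudi's actual
classes, and the real InputFormats do much more than list files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a file slice: one base file plus the log files
// written against it. Not Hudi's real FileSlice class.
final class FileSlice {
    final String baseFile;
    final List<String> logFiles;

    FileSlice(String baseFile, List<String> logFiles) {
        this.baseFile = baseFile;
        this.logFiles = List.copyOf(logFiles);
    }
}

// Hypothetical helper showing what each proposed prefix would imply.
final class LayoutNaming {
    // A "BaseFile"-prefixed InputFormat would read only the base file.
    static List<String> baseFileInputs(FileSlice slice) {
        List<String> inputs = new ArrayList<>();
        inputs.add(slice.baseFile);
        return inputs;
    }

    // A "BaseAndLogFile"-prefixed InputFormat would read the base file
    // and then every log file in the slice.
    static List<String> baseAndLogFileInputs(FileSlice slice) {
        List<String> inputs = baseFileInputs(slice);
        inputs.addAll(slice.logFiles);
        return inputs;
    }
}
```

With names like these, the class name says which files of a slice get read,
independent of whether the table type is copy-on-write or merge-on-read.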