alexeykudinkin commented on a change in pull request #4667:
URL: https://github.com/apache/hudi/pull/4667#discussion_r807182343
##########
File path: hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/HoodieCopyOnWriteTableInputFormat.java
##########
@@ -65,12 +71,32 @@
* <li>Incremental mode: reading table's state as of particular timestamp (or instant, in Hudi's terms)</li>
* <li>External mode: reading non-Hudi partitions</li>
* </ul>
+ *
+ * NOTE: This class is invariant of the underlying file-format of the files being read
*/
-public abstract class HoodieFileInputFormatBase extends FileInputFormat<NullWritable, ArrayWritable>
+public class HoodieCopyOnWriteTableInputFormat extends FileInputFormat<NullWritable, ArrayWritable>
Review comment:
Yeah, I've crossed the same path recently, realizing that this dichotomy doesn't line up well with the read path. I think the crux of the problem is that COW is purely a write-side semantic, so saying "COW" on the read path doesn't really make sense.
I'm touching up the sibling hierarchy on the Spark side and will think about better terminology there; afterwards we can carry it over here as well.
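To make the hierarchy being discussed concrete, here is a minimal sketch of the read-path shape the rename implies. All class names, methods, and file names below are illustrative assumptions, not Hudi's actual API: the point is that the base class lists files in a file-format-agnostic way, and the merge-on-read variant layers log-file merging on top, so "copy-on-write" need not describe the base read-path class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a file-format-agnostic read-path base class.
abstract class TableInputFormat {
    // Base (e.g. parquet) files visible as of the latest commit.
    abstract List<String> listBaseFiles();
}

// "COW" input format: reads base files only (illustrative file name).
class CowTableInputFormat extends TableInputFormat {
    @Override
    List<String> listBaseFiles() {
        return List.of("part-0001.parquet");
    }
}

// "MOR" input format: reuses the same base-file listing and adds
// log files to merge at read time (illustrative log-file name).
class MorTableInputFormat extends CowTableInputFormat {
    List<String> listFilesToMerge() {
        List<String> files = new ArrayList<>(listBaseFiles());
        files.add(".part-0001.log.1");
        return files;
    }
}

public class ReadPathSketch {
    public static void main(String[] args) {
        System.out.println(new CowTableInputFormat().listBaseFiles());
        System.out.println(new MorTableInputFormat().listFilesToMerge());
    }
}
```

Under this shape, the awkwardness the comment describes is visible: the subclass relation captures "merge logs on top of base files", not a write-side copy-on-write semantic, which is why a read-path-oriented name may fit better.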
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]