n3nash commented on a change in pull request #2927:
URL: https://github.com/apache/hudi/pull/2927#discussion_r629686478
##########
File path:
hudi-utilities/src/main/java/org/apache/hudi/utilities/UtilHelpers.java
##########
@@ -429,8 +432,55 @@ public static SchemaProvider createRowBasedSchemaProvider(StructType structType,
return wrapSchemaProviderWithPostProcessor(rowSchemaProvider, cfg, jssc, null);
}
+ /**
+ * Create latest schema provider for Target schema.
+ * @param structType spark data type of incoming batch.
+ * @param jssc instance of {@link JavaSparkContext}.
+ * @param fs instance of {@link FileSystem}.
+ * @param basePath base path of the table.
+ * @return the schema provider where target schema refers to latest schema(either incoming schema or table schema).
+ */
+ public static SchemaProvider createLatestSchemaProvider(StructType structType,
+ JavaSparkContext jssc, FileSystem fs, String basePath) {
+ SchemaProvider rowSchemaProvider = new RowBasedSchemaProvider(structType);
+ Schema incomingSchema = rowSchemaProvider.getTargetSchema();
+ Schema latestSchema = incomingSchema;
+
+ try {
+ if (fs.exists(new Path(basePath + "/" + HoodieTableMetaClient.METAFOLDER_NAME))) {
Review comment:
Is there a better way to check whether the table is present at the basePath?
Maybe add a static method to HoodieTableMetaClient? I would like to confine
explicit use of `fs.` to specific classes as much as possible.
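
To illustrate the shape of such a helper, here is a rough sketch only. The class name `TableExistenceSketch` and the method `tableExists` are hypothetical and are not existing Hudi APIs. The sketch uses `java.nio.file` as a stand-in for Hadoop's `FileSystem` so that it runs without Hudi on the classpath; the constant mirrors `HoodieTableMetaClient.METAFOLDER_NAME`.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical stand-in for a static helper that could live on HoodieTableMetaClient.
// Assumption: java.nio.file replaces Hadoop's FileSystem purely for illustration.
final class TableExistenceSketch {
  // Mirrors HoodieTableMetaClient.METAFOLDER_NAME.
  static final String METAFOLDER_NAME = ".hoodie";

  private TableExistenceSketch() {
    // Static utility holder; not meant to be instantiated.
  }

  /** Returns true if the metadata folder exists as a directory under basePath. */
  static boolean tableExists(String basePath) {
    return Files.isDirectory(Paths.get(basePath, METAFOLDER_NAME));
  }
}
```

With something like this in place, the call site in `createLatestSchemaProvider` would ask the helper whether the table exists instead of building the metadata path and calling `fs.exists` itself.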