the-other-tim-brown commented on code in PR #12417:
URL: https://github.com/apache/hudi/pull/12417#discussion_r1869637309
##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/DataSourceUtils.java:
##########
@@ -279,15 +279,15 @@ public static HoodieRecord createHoodieRecord(GenericRecord gr, HoodieKey hKey,
/**
* Drop records already present in the dataset.
*
- * @param jssc JavaSparkContext
+ * @param engineContext the spark engine context
* @param incomingHoodieRecords HoodieRecords to deduplicate
* @param writeConfig HoodieWriteConfig
*/
@SuppressWarnings("unchecked")
- public static JavaRDD<HoodieRecord> dropDuplicates(JavaSparkContext jssc, JavaRDD<HoodieRecord> incomingHoodieRecords,
+ public static JavaRDD<HoodieRecord> dropDuplicates(HoodieSparkEngineContext engineContext, JavaRDD<HoodieRecord> incomingHoodieRecords,
HoodieWriteConfig writeConfig) {
try {
- SparkRDDReadClient client = new SparkRDDReadClient<>(new HoodieSparkEngineContext(jssc), writeConfig);
+ SparkRDDReadClient client = new SparkRDDReadClient<>(engineContext, writeConfig);
Review Comment:
It's just cosmetic. We use a custom extension of the engine context that adds
more context to the names of steps, so we can debug more easily when looking at
the Spark UI or the history server UI.
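
To illustrate why accepting the engine context as a parameter matters here, the following is a minimal, self-contained sketch of the pattern. All class and method names in it (`BaseEngineContext`, `LabeledEngineContext`, `DedupUtil`, `pipelineName`) are hypothetical stand-ins and not Hudi's actual classes; the real extension mentioned above is not shown in this thread.

```java
// Hypothetical stand-in for an engine context; NOT Hudi's real class.
class BaseEngineContext {
    private String lastJobGroup = "";
    private String lastDescription = "";

    // A Spark-backed context would forward these labels to the UI;
    // this sketch only records them so the effect can be inspected.
    public void setJobStatus(String activeModule, String activityDescription) {
        this.lastJobGroup = activeModule;
        this.lastDescription = activityDescription;
    }

    public String lastJobGroup() {
        return lastJobGroup;
    }

    public String lastDescription() {
        return lastDescription;
    }
}

// Hypothetical custom extension that adds extra context to every step name.
class LabeledEngineContext extends BaseEngineContext {
    private final String pipelineName;

    LabeledEngineContext(String pipelineName) {
        this.pipelineName = pipelineName;
    }

    @Override
    public void setJobStatus(String activeModule, String activityDescription) {
        // Prefix the description so the step is easy to identify when debugging.
        super.setJobStatus(activeModule, "[" + pipelineName + "] " + activityDescription);
    }
}

// Hypothetical utility that takes the context as a parameter instead of
// constructing a fresh one internally, so a caller's subclass is preserved.
class DedupUtil {
    static void dropDuplicates(BaseEngineContext engineContext) {
        engineContext.setJobStatus("DedupUtil", "Drop duplicates");
        // ... deduplication work would run here ...
    }
}
```

If the utility built its own context from a raw Spark context, as the removed line did, a caller's subclass would be bypassed and the extra labels would be lost. Passing the context through keeps them.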