wangxianghu commented on a change in pull request #1827:
URL: https://github.com/apache/hudi/pull/1827#discussion_r485058261
##########
File path:
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/SparkWorkloadProfile.java
##########
@@ -22,49 +22,22 @@
import org.apache.hudi.common.model.HoodieRecordLocation;
import org.apache.hudi.common.model.HoodieRecordPayload;
import org.apache.hudi.common.util.Option;
-
import org.apache.spark.api.java.JavaRDD;
+import scala.Tuple2;
-import java.io.Serializable;
-import java.util.HashMap;
import java.util.Map;
-import java.util.Set;
-
-import scala.Tuple2;
/**
- * Information about incoming records for upsert/insert obtained either via sampling or introspecting the data fully.
- * <p>
- * TODO(vc): Think about obtaining this directly from index.tagLocation
+ * Spark implementation of {@link BaseWorkloadProfile}.
+ * @param <T>
*/
-public class WorkloadProfile<T extends HoodieRecordPayload> implements Serializable {
-
- /**
- * Input workload.
- */
- private final JavaRDD<HoodieRecord<T>> taggedRecords;
-
- /**
- * Computed workload profile.
- */
- private final HashMap<String, WorkloadStat> partitionPathStatMap;
-
- /**
- * Global workloadStat.
- */
- private final WorkloadStat globalStat;
-
- public WorkloadProfile(JavaRDD<HoodieRecord<T>> taggedRecords) {
- this.taggedRecords = taggedRecords;
- this.partitionPathStatMap = new HashMap<>();
- this.globalStat = new WorkloadStat();
- buildProfile();
+public class SparkWorkloadProfile<T extends HoodieRecordPayload> extends BaseWorkloadProfile<JavaRDD<HoodieRecord<T>>> {
Review comment:
> we can actually try and keep this generic and just pass in what we need from `taggedRecords` to constructor instead of the entire thing
Done.
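For readers following the thread, one way to read the quoted suggestion is sketched below. This is only an illustration under stated assumptions, not the code in this PR: the class name `WorkloadProfileSketch`, its methods, and the choice of per-partition insert/update counts as the constructor input are all hypothetical. The idea is that the engine-specific caller reduces `taggedRecords` to plain counts, so the profile itself no longer holds a `JavaRDD`.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, engine-agnostic profile (not Hudi's actual API).
// It receives pre-aggregated counts instead of the whole taggedRecords RDD.
final class WorkloadProfileSketch {
  // partitionPath -> number of incoming inserts (assumed input shape)
  private final Map<String, Long> insertsByPartition;
  // partitionPath -> number of incoming updates (assumed input shape)
  private final Map<String, Long> updatesByPartition;

  WorkloadProfileSketch(Map<String, Long> insertsByPartition,
                        Map<String, Long> updatesByPartition) {
    // Defensive copies so the profile stays immutable after construction.
    this.insertsByPartition = Collections.unmodifiableMap(new HashMap<>(insertsByPartition));
    this.updatesByPartition = Collections.unmodifiableMap(new HashMap<>(updatesByPartition));
  }

  // All partition paths that appear in either map, in sorted order.
  Set<String> partitionPaths() {
    Set<String> paths = new TreeSet<>(insertsByPartition.keySet());
    paths.addAll(updatesByPartition.keySet());
    return paths;
  }

  long insertsFor(String partitionPath) {
    return insertsByPartition.getOrDefault(partitionPath, 0L);
  }

  long updatesFor(String partitionPath) {
    return updatesByPartition.getOrDefault(partitionPath, 0L);
  }

  long totalInserts() {
    return insertsByPartition.values().stream().mapToLong(Long::longValue).sum();
  }

  long totalUpdates() {
    return updatesByPartition.values().stream().mapToLong(Long::longValue).sum();
  }
}
```

With a shape like this, a Spark caller would aggregate the tagged records into the two maps before calling the constructor, and another engine could do the same with its own collection type.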