wangxianghu commented on a change in pull request #1827:
URL: https://github.com/apache/hudi/pull/1827#discussion_r485058261



##########
File path: hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/table/SparkWorkloadProfile.java
##########
@@ -22,49 +22,22 @@
 import org.apache.hudi.common.model.HoodieRecordLocation;
 import org.apache.hudi.common.model.HoodieRecordPayload;
 import org.apache.hudi.common.util.Option;
-
 import org.apache.spark.api.java.JavaRDD;
+import scala.Tuple2;
 
-import java.io.Serializable;
-import java.util.HashMap;
 import java.util.Map;
-import java.util.Set;
-
-import scala.Tuple2;
 
 /**
- * Information about incoming records for upsert/insert obtained either via sampling or introspecting the data fully.
- * <p>
- * TODO(vc): Think about obtaining this directly from index.tagLocation
+ * Spark implementation of {@link BaseWorkloadProfile}.
+ * @param <T>
  */
-public class WorkloadProfile<T extends HoodieRecordPayload> implements Serializable {
-
-  /**
-   * Input workload.
-   */
-  private final JavaRDD<HoodieRecord<T>> taggedRecords;
-
-  /**
-   * Computed workload profile.
-   */
-  private final HashMap<String, WorkloadStat> partitionPathStatMap;
-
-  /**
-   * Global workloadStat.
-   */
-  private final WorkloadStat globalStat;
-
-  public WorkloadProfile(JavaRDD<HoodieRecord<T>> taggedRecords) {
-    this.taggedRecords = taggedRecords;
-    this.partitionPathStatMap = new HashMap<>();
-    this.globalStat = new WorkloadStat();
-    buildProfile();
+public class SparkWorkloadProfile<T extends HoodieRecordPayload> extends BaseWorkloadProfile<JavaRDD<HoodieRecord<T>>> {

Review comment:
       > we can actually try and keep this generic and just pass in what we need from `taggedRecords` to constructor instead of the entire thing
   
   done
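
   For readers following the thread, a minimal sketch of what the quoted suggestion could look like: the profile class stays engine-agnostic and receives only precomputed per-partition counts, while the engine-specific code derives those counts from the tagged records. Every name below (`TaggedRecordSketch`, `StatSketch`, `GenericProfileSketch`, `ProfileBuilderSketch`) is hypothetical and is not the code in this PR; a plain `List` stands in for the `JavaRDD`, on the assumption that the real counting would run on Spark.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a tagged record: a partition path plus a flag
// saying whether the record already has a location (update) or not (insert).
final class TaggedRecordSketch {
  final String partitionPath;
  final boolean hasLocation;

  TaggedRecordSketch(String partitionPath, boolean hasLocation) {
    this.partitionPath = partitionPath;
    this.hasLocation = hasLocation;
  }
}

// Hypothetical per-partition counters (not Hudi's WorkloadStat).
final class StatSketch {
  long numInserts;
  long numUpdates;
}

// Engine-agnostic profile: it only receives precomputed stats and never
// holds a reference to the record collection itself.
final class GenericProfileSketch {
  private final Map<String, StatSketch> partitionPathStatMap;
  private final StatSketch globalStat = new StatSketch();

  GenericProfileSketch(Map<String, StatSketch> partitionPathStatMap) {
    this.partitionPathStatMap = partitionPathStatMap;
    // Derive the global stat by summing the per-partition counters.
    for (StatSketch stat : partitionPathStatMap.values()) {
      globalStat.numInserts += stat.numInserts;
      globalStat.numUpdates += stat.numUpdates;
    }
  }

  StatSketch getGlobalStat() {
    return globalStat;
  }

  StatSketch getStat(String partitionPath) {
    return partitionPathStatMap.get(partitionPath);
  }
}

// Engine-specific step. Assumption: with Spark this would be computed from
// the JavaRDD of tagged records; a List is used here to stay self-contained.
final class ProfileBuilderSketch {
  static Map<String, StatSketch> countByPartition(List<TaggedRecordSketch> records) {
    Map<String, StatSketch> stats = new HashMap<>();
    for (TaggedRecordSketch record : records) {
      StatSketch stat = stats.computeIfAbsent(record.partitionPath, p -> new StatSketch());
      if (record.hasLocation) {
        stat.numUpdates++;
      } else {
        stat.numInserts++;
      }
    }
    return stats;
  }
}
```

   With this split, only `ProfileBuilderSketch` would need an engine-specific variant; the profile class itself carries no Spark types in its signature.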



