alexeykudinkin commented on code in PR #5462:
URL: https://github.com/apache/hudi/pull/5462#discussion_r861404133
##########
hudi-spark-datasource/hudi-spark-common/src/main/java/org/apache/hudi/HoodieDatasetBulkInsertHelper.java:
##########
@@ -57,18 +61,18 @@ public class HoodieDatasetBulkInsertHelper {
/**
* Prepares input hoodie spark dataset for bulk insert. It does the following steps.
- * 1. Uses KeyGenerator to generate hoodie record keys and partition path.
- * 2. Add hoodie columns to input spark dataset.
- * 3. Reorders input dataset columns so that hoodie columns appear in the beginning.
- * 4. Sorts input dataset by hoodie partition path and record key
+ * 1. Uses KeyGenerator to generate hoodie record keys and partition path.
+ * 2. Add hoodie columns to input spark dataset.
+ * 3. Reorders input dataset columns so that hoodie columns appear in the beginning.
+ * 4. Sorts input dataset by hoodie partition path and record key
*
* @param sqlContext SQL Context
- * @param config Hoodie Write Config
- * @param rows Spark Input dataset
+ * @param config Hoodie Write Config
+ * @param rows Spark Input dataset
* @return hoodie dataset which is ready for bulk insert.
*/
public static Dataset<Row> prepareHoodieDatasetForBulkInsert(SQLContext sqlContext,
- HoodieWriteConfig config, Dataset<Row> rows, String structName, String recordNamespace,
+                                                            HoodieWriteConfig config, Dataset<Row> rows, String structName, String recordNamespace,
Review Comment:
But we'll have to share both of them; otherwise we will fix only half of the problems.
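For illustration, the four preparation steps described in the Javadoc above can be sketched in plain Java. This is a hypothetical stand-in, not Hudi's actual implementation: the real helper operates on a Spark `Dataset<Row>` and delegates key generation to a configured `KeyGenerator`, while here a list of maps and hard-coded `id`/`region` source fields play those roles.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the four steps: generate keys, add hoodie
// columns, reorder so hoodie columns come first, then sort.
public class BulkInsertPrepSketch {

  public static List<Map<String, String>> prepare(List<Map<String, String>> rows) {
    return rows.stream()
        .map(r -> {
          // Steps 1-3: derive record key and partition path (stand-in for a
          // KeyGenerator), and put the hoodie columns at the beginning by
          // inserting them first into an order-preserving map.
          Map<String, String> out = new LinkedHashMap<>();
          out.put("_hoodie_record_key", r.get("id"));         // assumed key field
          out.put("_hoodie_partition_path", r.get("region")); // assumed partition field
          out.putAll(r);
          return out;
        })
        // Step 4: sort by partition path, then record key.
        .sorted(Comparator
            .comparing((Map<String, String> r) -> r.get("_hoodie_partition_path"))
            .thenComparing(r -> r.get("_hoodie_record_key")))
        .collect(Collectors.toList());
  }
}
```

Sorting by partition path first is what lets a bulk insert write each partition's records contiguously, which is the point of step 4 in the quoted doc.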
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]