LinMingQiang opened a new issue, #6390:
URL: https://github.com/apache/hudi/issues/6390

   Caused by: java.io.FileNotFoundException: File file:/hudi_tbl/2/.00000000-575e-4905-85e4-fb62d29a4bea_20220813205548259.log.1_0-1-0 does not exist
   
   In HoodieCompactor.generateCompactionPlan:
         List<HoodieCompactionOperation> operations = context.flatMap(partitionPaths, partitionPath -> fileSystemView
           .getLatestFileSlices(partitionPath)
           .filter(slice -> !fgIdsInPendingCompactionAndClustering.contains(slice.getFileGroupId()))
           .map(s ->
             // Bug: inside this lambda, partitionPath.equals(s.getPartitionPath()) evaluates to false,
             // so the partition ends up paired with a file id that belongs to a different partition.
           );
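   One way to confirm this locally (a debugging sketch only, not code that exists in Hudi; it assumes the class's LOG logger is in scope) is to compare the two values inside the lambda and log any divergence:

         .map(s -> {
           // Local debugging sketch: if the slice reports a different partition
           // than the one currently being iterated, log both so the mismatch is visible.
           if (!partitionPath.equals(s.getPartitionPath())) {
             LOG.warn("Compaction plan partition mismatch: iterating " + partitionPath
                 + " but slice reports " + s.getPartitionPath()
                 + " for file group " + s.getFileGroupId());
           }
           ...
         })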
     
   
   How to reproduce
    Flink execution mode: RuntimeExecutionMode.BATCH
        public static String srcTbl = "create table test_data (\n" +
                        "  msg string,\n" +
                        "  name string,\n" +
                        "  rowtime string,\n" +
                        "  dt STRING\n" +
                        ") with (\n" +
                        "  'connector' = 'filesystem',\n" +
                        "  'path' = '"+srcPath+"',\n" +
                        "  'format' = 'csv'\n" +
                        ")";
   
         public static String hudiTbl = "CREATE TABLE hudi_tbl(\n" +
                        "    msg STRING PRIMARY KEY NOT ENFORCED,\n" +
                        "   `name` STRING,\n" +
                        "    rowtime BIGINT,\n" +
                        "    `dt` STRING\n" +
                        ") PARTITIONED BY (`dt`) WITH (\n" +
                        "    'connector' = 'hudi',\n" +
                        "    'table.type' = 'MERGE_ON_READ',\n" +
                        "    'path' = '"+path+"',\n" +
                        "    'write.payload.class' = 
'org.apache.hudi.common.model.OverwriteNonDefaultsWithLatestAvroPayload',\n" +
                        "    'write.precombine' = 'true',\n" +
                        "    'write.precombine.field' = 'rowtime',\n" +
                        "    'compaction.delta_commits' = '1',\n" +
                        "    'compaction.schedule.enable' = 'true',\n" +
                        "    'compaction.async.enabled' = 'true',\n" +
                        "    'changelog.enabled' = 'false',\n" +
                        "    'index.type' = 'BUCKET', \n" +
                        "    'hoodie.bucket.index.num.buckets'='1', \n" +
                        "    'write.tasks' = '1')\n";
   
                String insertSql = "insert into hudi_tbl " +
                                " select msg,name," +
                                " UNIX_TIMESTAMP(DATE_FORMAT(rowtime,'yyyy-MM-dd HH:mm:ss')) as rowtime," +
                                " dt from test_data";
   
   How to create data:

         FileWriter s = new FileWriter(f);
         Random r = new Random();
         for (int i = 0; i < 10000; i++) {
             s.write("msg" + r.nextInt(100) + ",name,2022-01-03 11:11:11," + r.nextInt(10));
             s.write("\n");
         }
         s.close();

   Each generated line looks like this:
   `msg19,name,2022-01-03 11:11:11,1`
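
   To check whether the missing log file was actually written under a different partition directory, the log files can be listed per partition after the job finishes. A minimal sketch (assuming the local table path /hudi_tbl from the stack trace; the class name is only illustrative):

        import java.io.IOException;
        import java.nio.file.Files;
        import java.nio.file.Path;
        import java.nio.file.Paths;
        import java.util.stream.Stream;

        public class ListHudiLogFiles {
            public static void main(String[] args) throws IOException {
                // Table base path taken from the stack trace above; adjust to the actual location.
                Path basePath = Paths.get("/hudi_tbl");
                try (Stream<Path> paths = Files.walk(basePath)) {
                    paths.filter(p -> p.getFileName().toString().contains(".log."))
                         .forEach(p -> System.out.println(
                             // print "partition dir -> log file" so file ids that show up
                             // under an unexpected partition are easy to spot
                             p.getParent().getFileName() + " -> " + p.getFileName()));
                }
            }
        }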
   
   
   
   
   **Environment Description**
   
   * Hudi version : master or release-0.12.0
   
   * Spark version :
   
   * Hive version :
   
   * Hadoop version :
   
   * Storage (HDFS/S3/GCS..) :
   
   * Running on Docker? (yes/no) :
   
   
   **Additional context**
   
   Add any other context about the problem here.
   
   **Stacktrace**
   
   ```Add the stacktrace of the error.```
   
   

