KnightChess commented on issue #6194:
URL: https://github.com/apache/hudi/issues/6194#issuecomment-1200433614

   @ehurheap 
   ```shell
   22/07/22 19:18:58 ERROR SparkMain: Fail to execute commandString
   org.apache.spark.sql.AnalysisException: cannot resolve '_hoodie_record_key' given input columns: []; line 5 pos 15;
   'UnresolvedHaving ('dupe_cnt > 1)
   +- 'Aggregate ['_hoodie_record_key], ['_hoodie_record_key AS dupe_key#0, count(1) AS dupe_cnt#1L]
      +- SubqueryAlias htbl_1658517533303
         +- View (htbl_1658517533303, [])
            +- LocalRelation <empty>
   ```
   The plan ends in `+- LocalRelation <empty>` and reports `given input columns: []`, so it looks like the path you provided does not contain complete files. Can you share a more detailed log from `org.apache.hudi.cli.DedupeSparkJob`?
   The INFO line to look for is: `List of files under partition: xxx => yyyy`
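
If the full log is large, a small script can pull out just those lines. The following is a minimal sketch, not part of Hudi: the helper name `find_partition_listings` is hypothetical, and the regular expression assumes the line has exactly the shape of the template quoted above (`List of files under partition: <partition> => <files>`, with files separated by whitespace). The sample input reuses the `xxx`/`yyyy` placeholders and is not real log output.

```python
import re

# Assumed shape of the INFO line, taken from the template quoted in the comment:
#   List of files under partition: <partition> => <files>
LINE_RE = re.compile(
    r"List of files under partition:\s*(?P<partition>.*?)\s*=>\s*(?P<files>.*)$"
)


def find_partition_listings(log_text):
    """Return (partition, files) pairs for every matching log line.

    `files` is the text after `=>` split on whitespace, so an empty
    list means no files were listed for that partition.
    """
    results = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            results.append((match.group("partition"), match.group("files").split()))
    return results


# Placeholder input built from the template above, not real log output.
sample = (
    "List of files under partition: xxx => yyyy\n"
    "List of files under partition: zzz => \n"
)

for partition, files in find_partition_listings(sample):
    status = f"{len(files)} file(s)" if files else "no files listed"
    print(f"{partition}: {status}")
```

A partition reported with no files listed would be consistent with the empty `LocalRelation` in the plan above.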


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
