yihua commented on issue #8977:
URL: https://github.com/apache/hudi/issues/8977#issuecomment-1592371897
Hi @soumilshah1995, thanks for the question. I took your PySpark script, ran it
locally, and found two issues:
(1) For a commit adding records to a new partition (or to a non-partitioned
table), the Spark job throws the following exception:
```
Caused by: java.util.concurrent.CompletionException:
org.apache.spark.sql.AnalysisException: cannot resolve 'message' given input
columns: []; line 1 pos 50;
'Aggregate [unresolvedalias(count(1), None)]
+- 'Filter ('message = null)
+- SubqueryAlias staged_table_7_before
+- View (`staged_table_7_before`, [])
+- LocalRelation <empty>
```
(2) For new records added to a non-partitioned table, the pre-commit validator
fails to identify the inequality.
In either case, the job throws an exception on my end. The reason you didn't hit
an exception is likely the Glue environment.
I'll look into the two issues above.
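For context, the SQL-equality pre-commit validator is wired up through Hudi write options along these lines (a minimal sketch; the table name, key fields, and query below are illustrative placeholders, not taken from the original script — the filter mirrors the `'Filter ('message = null)` seen in the stack trace above):

```python
# Sketch of enabling Hudi's SqlQueryEqualityPreCommitValidator in PySpark.
# The option keys and validator class name are Hudi's; all values are
# placeholders for illustration.
hudi_options = {
    "hoodie.table.name": "example_table",                 # placeholder
    "hoodie.datasource.write.recordkey.field": "id",      # placeholder
    "hoodie.datasource.write.precombine.field": "ts",     # placeholder
    # Validator that runs the same SQL against the pre- and post-commit
    # table state and fails the commit if the results differ:
    "hoodie.precommit.validators":
        "org.apache.hudi.client.validator.SqlQueryEqualityPreCommitValidator",
    # <TABLE_NAME> is Hudi's literal placeholder, substituted at run time
    # with the staged before/after views (e.g. staged_table_7_before):
    "hoodie.precommit.validators.equality.sql.queries":
        "select count(*) from <TABLE_NAME> where message = null",
}

# Typical usage (path is a placeholder):
# df.write.format("hudi").options(**hudi_options).mode("append").save("/tmp/example_table")
```

Issue (1) above arises because, on the first commit to a partition, the "before" view is an empty relation with no columns, so the query's `message` column cannot be resolved.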
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]