BBency commented on issue #9094:
URL: https://github.com/apache/hudi/issues/9094#issuecomment-1620270322
@ad1happy2go Hi Aditya,
I triggered the run again today. Sharing the timestamp and error message
for each approach below. Let me know if you need more details.
### Approach 1 Error
**2023-07-04 13:33:04,217** ERROR [task-result-getter-2] scheduler.TaskSetManager (Logging.scala:logError(77)): task 0.0 in stage 23.0 (TID 183) had a not serializable result: org.apache.avro.generic.GenericData$Record
Serialization stack:
- object not serializable (class: org.apache.avro.generic.GenericData$Record, value:
### Approach 2 Error
**2023-07-04 13:20:38,915** ERROR [main] glue.ProcessLauncher (Logging.scala:logError(77)): Error from Python:Traceback (most recent call last):
File "/tmp/eec-aws-uk-ukidcibatchanalytics-hudi-clustering-job.py", line 54, in <module>
main()
File "/tmp/eec-aws-uk-ukidcibatchanalytics-hudi-clustering-job.py", line 47, in main
spark_df_run_clustering = spark.sql(query_run_clustering)
File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/session.py", line 1034, in sql
return DataFrame(self._jsparkSession.sql(sqlQuery), self)
File "/opt/amazon/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
return_value = get_return_value(
File "/opt/amazon/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 190, in deco
return f(*a, **kw)
File "/opt/amazon/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/protocol.py", line 326, in get_return_value
raise Py4JJavaError(
py4j.protocol.Py4JJavaError: An error occurred while calling o97.sql.
: org.apache.hudi.exception.HoodieClusteringException: Clustering failed to write to files:61b699e1-9a0b-4a23-8102-f66ab5b46fc8-0,3c2c597d-f807-4c71-aede-7b7c8b95ef19-0,69751dea-eddb-4cc5-97da-507ae885512b-0,ab443ddc-bfad-49a8-aadc-be47150f3c43-0,d9e692e6-cdb1-47d9-b7dd-63e288f42c44-0,0597f0dc-f9dd-4e17-a030-098eaa70860e-0,026645e2-9c06-4804-ac60-13dd9cde28ee-0,a61b0dc0-3762-4dc0-9379-53d8f36d988c-0,82107955-04a4-43dd-b62a-3e89860c2924-0,7526b140-d4b2-4e48-8f49-0df8018feddf-0,83fb889e-bf1d-43ce-bdb9-f68b9d4ee43a-0,d989f110-fbd0-4882-ad67-ca16f1179681-0,127c70f3-ba55-4dbc-8fe9-69c14c857e10-0,0a0b5c3b-40f8-4ff3-aac2-05d575eecd3b-0,a592daf4-5b43-42e8-a85b-f82b318cb76a-0,2dadec1c-520e-4c18-9e01-c4d4de3c42e4-0,f918e9f2-4254-49ce-9026-459545075a6c-0,b4be2ac2-8238-475f-9e4c-736a778299f1-0,6ec7b26a-c5d9-43a0-82e3-487d1a440565-0,ddf44fbd-b537-463b-af28-cc4f45e9f447-0,1cfbd8bd-06c8-4b77-9d7a-52efa2dc59a0-0,19934523-c4ea-4285-acd2-b9077dd0f028-0,8efd2ea5-d96b-4010-9aa6-beeaa0d0026f-0,1b82307e-a481-4f67-abfd-271f8fc700d7-0,ad7f46dd-8ca4-4cb4-8fde-add11565c58c-0,63fd0f37-4c8c-439d-a7eb-571786c9d88c-0,a9abb2e6-7d50-4b80-baf3-4fb23b130741-0,f3a19847-e1ea-49bf-99b1-aaec2b0a21b0-0,ce89f9cf-86a6-46fd-b1b4-327fed85d8c4-0,e38269b8-8d55-4cc5-a9be-3edf156cc81a-0
at org.apache.hudi.client.SparkRDDWriteClient.completeClustering(SparkRDDWriteClient.java:381)