[GitHub] [hudi] BBency commented on issue #9094: Async Clustering failing with errors for MOR table

via GitHub Thu, 17 Aug 2023 16:38:16 -0700


BBency commented on issue #9094:
URL: https://github.com/apache/hudi/issues/9094#issuecomment-1683116028


   I was able to get clustering working on a test job, but it fails when I 
apply the same clustering configs to the production table. It fails with the 
following error: 
   py4j.protocol.Py4JJavaError: An error occurred while calling o97.sql.
   : org.apache.hudi.exception.HoodieClusteringException: **Clustering failed to write to files:**3b43f625-3095-4834-ab45-beade1dbbfa5-0
        at org.apache.hudi.client.SparkRDDWriteClient.completeClustering(SparkRDDWriteClient.java:381)
        at org.apache.hudi.client.SparkRDDWriteClient.completeTableService(SparkRDDWriteClient.java:468)
   What should I consider when choosing values for 
hoodie.clustering.plan.strategy.max.num.groups, 
hoodie.clustering.plan.strategy.small.file.limit, 
hoodie.clustering.plan.strategy.target.file.max.bytes and 
hoodie.clustering.plan.strategy.max.bytes.per.group? Could you please provide 
some guidance?
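
   For reference, here is a minimal sketch of how the four settings in question might be assembled as string-valued options. The helper name `clustering_options`, the MB-based arguments, and the numbers passed in are placeholders for illustration only; they are not recommendations and not the values used on the production table. The ordering check is an assumption about what a sensible combination looks like, not a rule that Hudi is known to enforce.

```python
MB = 1024 * 1024  # the byte-valued settings are expressed in MB for readability


def clustering_options(max_num_groups, small_file_limit_mb,
                       target_file_max_mb, max_bytes_per_group_mb):
    """Build the four clustering plan-strategy settings as an options dict.

    Hypothetical helper: only the config keys come from the question above.
    """
    # Assumed sanity check (not a documented Hudi constraint): small files
    # should be smaller than the target file size, and a group should be
    # able to hold at least one target-sized file.
    if not small_file_limit_mb <= target_file_max_mb <= max_bytes_per_group_mb:
        raise ValueError(
            "expected small.file.limit <= target.file.max.bytes "
            "<= max.bytes.per.group"
        )
    prefix = "hoodie.clustering.plan.strategy."
    return {
        prefix + "max.num.groups": str(max_num_groups),
        prefix + "small.file.limit": str(small_file_limit_mb * MB),
        prefix + "target.file.max.bytes": str(target_file_max_mb * MB),
        prefix + "max.bytes.per.group": str(max_bytes_per_group_mb * MB),
    }


# Placeholder values, chosen only to make the example concrete.
opts = clustering_options(30, 300, 1024, 2048)
for key in sorted(opts):
    print(f"{key}={opts[key]}")
```

   A dict like this could then be passed along with the rest of the writer options, which makes it easy to compare the test job's values against the production table's values side by side.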


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

