kirillklimenko opened a new issue, #12182:
URL: https://github.com/apache/hudi/issues/12182

   **Description**
   
   After upgrading our Amazon EMR cluster from `7.2.0` to `7.3.0` ([release notes](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-730-release.html)), which also moved the bundled Apache Hudi from `0.14.1` to `0.15.0` ([release notes](https://hudi.apache.org/releases/release-0.15.0/)), we noticed that Hudi metrics stopped reporting to CloudWatch. The following error is logged:
   
   ```
   ERROR ScheduledReporter: Exception thrown from CloudWatchReporter#report. Exception was suppressed.
   java.lang.ArrayIndexOutOfBoundsException: Index 1 out of bounds for length 1
        at org.apache.hudi.aws.cloudwatch.CloudWatchReporter.stageMetricDatum(CloudWatchReporter.java:281) ~[hudi-aws-bundle-0.15.0-amzn-0.jar:0.15.0-amzn-0]
        at org.apache.hudi.aws.cloudwatch.CloudWatchReporter.lambda$processGauge$2(CloudWatchReporter.java:250) ~[hudi-aws-bundle-0.15.0-amzn-0.jar:0.15.0-amzn-0]
        at java.util.Optional.ifPresent(Optional.java:178) ~[?:?]
        at org.apache.hudi.aws.cloudwatch.CloudWatchReporter.processGauge(CloudWatchReporter.java:250) ~[hudi-aws-bundle-0.15.0-amzn-0.jar:0.15.0-amzn-0]
        at org.apache.hudi.aws.cloudwatch.CloudWatchReporter.report(CloudWatchReporter.java:189) ~[hudi-aws-bundle-0.15.0-amzn-0.jar:0.15.0-amzn-0]
        at org.apache.hudi.com.codahale.metrics.ScheduledReporter.report(ScheduledReporter.java:237) ~[hudi-spark3-bundle_2.12-0.15.0-amzn-0.jar:0.15.0-amzn-0]
        at org.apache.hudi.com.codahale.metrics.ScheduledReporter.lambda$start$0(ScheduledReporter.java:177) ~[hudi-spark3-bundle_2.12-0.15.0-amzn-0.jar:0.15.0-amzn-0]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539) [?:?]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
        at java.lang.Thread.run(Thread.java:840) [?:?]
   ```
   
   **Environment Description**
   
   - Amazon EMR Version: 7.3.0
   - Hudi Version: 0.15.0
   - Spark Version: 3.5.1
   - Previous EMR Version (working): 7.2.0 with Hudi 0.14.1
   - AWS CloudWatch Setup: Default settings for Hudi metrics reporting
   
   **Steps to Reproduce**
   
   1. Upgrade an Amazon EMR cluster from version 7.2.0 to 7.3.0.
   2. Enable Hudi metrics reporting to CloudWatch for the MOR table:
   ```python
   {
       "hoodie.metrics.on": True,
       "hoodie.metrics.reporter.type": "CLOUDWATCH",
       "hoodie.metrics.cloudwatch.namespace": "ULH",
   }
   ```
   3. Write to the MOR table and watch the logs for `CloudWatchReporter` errors.
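
For reference, steps 2 and 3 can be sketched roughly as below. This is a minimal illustration, not our exact job: `write_with_metrics`, `base_path`, and `table_options` are hypothetical names, and append mode is an assumption. `df` is expected to be a Spark DataFrame.

```python
from typing import Any, Dict

# Metrics settings copied from step 2.
METRICS_OPTIONS: Dict[str, Any] = {
    "hoodie.metrics.on": True,
    "hoodie.metrics.reporter.type": "CLOUDWATCH",
    "hoodie.metrics.cloudwatch.namespace": "ULH",
}


def write_with_metrics(df: Any, base_path: str, table_options: Dict[str, Any]) -> None:
    """Write `df` to a Hudi table with CloudWatch metrics enabled.

    `base_path` and `table_options` are placeholders for the table location
    and the remaining Hudi options; append mode is assumed for illustration.
    """
    # Metrics options are merged last so they cannot be overridden by accident.
    options = {**table_options, **METRICS_OPTIONS}
    df.write.format("hudi").options(**options).mode("append").save(base_path)
```

With these options set, the reporter error appears in the logs once the write runs long enough for a reporting interval to elapse.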
   
   **Observed Behavior**
   
   After the upgrade, Hudi stopped sending metrics to CloudWatch. An `ArrayIndexOutOfBoundsException` is thrown in the `CloudWatchReporter.stageMetricDatum` method during each reporting interval.
   
   **Expected Behavior**
   
   Metrics should be reported to CloudWatch without errors.
   
   **Additional Context**
   
   The message `Index 1 out of bounds for length 1` indicates that `stageMetricDatum` reads the second element of a one-element array, which suggests an array-handling bug in how `CloudWatchReporter` processes or formats metric data for CloudWatch. The issue only appeared after upgrading to Hudi `0.15.0`, as bundled in EMR `7.3.0`.
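
To illustrate the kind of failure the exception message points to, here is a minimal Python sketch. It assumes, without confirmation from the Hudi source, that the reporter splits each metric name on the first `.` into a prefix and a remainder and then reads the second part. `split_metric_name`, `try_split`, and the sample metric names are hypothetical.

```python
from typing import Tuple


def split_metric_name(metric_name: str) -> Tuple[str, str]:
    """Hypothetical stand-in for the name handling in stageMetricDatum."""
    parts = metric_name.split(".", 1)  # yields at most two parts
    # A name without "." yields a one-element list, so parts[1] fails.
    # This is the Python analogue of "Index 1 out of bounds for length 1".
    return parts[0], parts[1]


def try_split(metric_name: str) -> str:
    """Return a short description of whether the split succeeded."""
    try:
        prefix, remainder = split_metric_name(metric_name)
        return f"ok: prefix={prefix!r} remainder={remainder!r}"
    except IndexError as exc:
        return f"failed: {exc}"


if __name__ == "__main__":
    print(try_split("bronze.commit.totalRecordsWritten"))
    print(try_split("metricWithoutPrefix"))
```

If something like this is what happens, a metric whose name lacks the expected separator would be enough to break every report call, which would match what we see.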
   
   Could you confirm if this is a known issue or if there is a workaround? Any 
insight or suggested fixes would be appreciated.
   
   Full Hudi config for insert:
   
   ```python
   _TABLE_OPTIONS = {
       "hoodie.database.name": "ulh",
       "hoodie.table.name": "bronze",
       "hoodie.index.type": "BUCKET",
       "hoodie.index.bucket.engine": "CONSISTENT_HASHING",
       "hoodie.bucket.index.num.buckets": 32,
       "hoodie.enable.data.skipping": True,
       "hoodie.datasource.query.type": "read_optimized",
   }
   
   _WRITE_OPTIONS = {
       "hoodie.datasource.write.table.type": "MERGE_ON_READ",
       "hoodie.datasource.write.recordkey.field": "sha512",
       "hoodie.datasource.write.partitionpath.field": "year,month,day,data_origin",
       "hoodie.datasource.write.precombine.field": "updated_dt",
       "hoodie.datasource.write.hive_style_partitioning": True,
       "hoodie.datasource.write.keygenerator.class": "org.apache.hudi.keygen.ComplexKeyGenerator",
   }
   
   _METADATA_OPTIONS = {
       "hoodie.metadata.enable": True,
       "hoodie.parquet.compression.codec": "zstd",
       "hoodie.storage.layout.partitioner.class": "org.apache.hudi.table.action.commit.SparkBucketIndexPartitioner",
   }
   
   _METRICS_OPTIONS = {
       "hoodie.metrics.on": True,
       "hoodie.metrics.reporter.type": "CLOUDWATCH",
       "hoodie.metrics.cloudwatch.namespace": "ULH",
   }
   
   INSERT_OPTIONS = {
       **_TABLE_OPTIONS,
       **_METADATA_OPTIONS,
       **_METRICS_OPTIONS,
       **_WRITE_OPTIONS,
       "hoodie.datasource.write.operation": "insert",
       "hoodie.datasource.write.payload.class": "org.apache.hudi.common.model.OverwriteWithLatestAvroPayload",
   }
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
