SuneethaYamani opened a new issue, #11168: URL: https://github.com/apache/hudi/issues/11168
Hi , I am following the blog https://hudi.apache.org/blog/2021/08/23/s3-events-source/ Used below command spark-submit \ --jars "s3://codebucket/jars/hudi-spark3.4-bundle_2.12-0.14.0.jar,s3://codebucket/jars/aws-msk-iam-auth-1.1.1-all.jar,s3://codebucket/jars/spark-avro_2.12-3.5.0.jar,s3://codebucket/jars/aws-java-sdk-sqs-1.12.715.jar,s3://codebucket/jars/hudi-utilities-slim-bundle_2.12-0.14.0.jar" \ --master yarn \ --deploy-mode client \ --conf spark.dynamicAllocation.enabled=true \ --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \ s3://codebucket/jars/hudi-utilities-slim-bundle_2.12-0.14.0.jar \ --table-type COPY_ON_WRITE \ --source-ordering-field eventTime \ --target-base-path s3://stage-data/iot-raw-s3/firehose_meta_table_3 \ --target-table firehose_xxxx_meta_table_3 \ --min-sync-interval-seconds 10 \ --hoodie-conf hoodie.datasource.write.recordkey.field="s3.object.key,eventName" \ --hoodie-conf hoodie.datasource.write.keygenerator.class=org.apache.hudi.keygen.ComplexKeyGenerator \ --hoodie-conf hoodie.datasource.write.partitionpath.field=s3.bucket.name \ --enable-hive-sync \ --hoodie-conf hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeysValueExtractor \ --hoodie-conf hoodie.database.name=iot_raw \ --hoodie-conf hoodie.datasource.hive_sync.enable=true \ --hoodie-conf hoodie.datasource.hive_sync.database=iot_raw \ --hoodie-conf hoodie.metadata.record.index.enable=true \ --hoodie-conf hoodie.index.type=RECORD_INDEX \ --hoodie-conf hoodie.datasource.write.hive_style_partitioning=true \ --hoodie-conf hoodie.datasource.hive_sync.table=firehose_meta_table_3 \ --hoodie-conf hoodie.datasource.hive_sync.partition_fields=bucket \ --source-class org.apache.hudi.utilities.sources.S3EventsSource \ --hoodie-conf hoodie.streamer.s3.source.queue.url=https://sqs.us-east-1.amazonaws.com/XXXX/xxx-parquet \ --hoodie-conf hoodie.deltastreamer.s3.source.queue.region=us-east-1 Meta table got created ,but I am getting Exception in thread "main" software.amazon.awssdk.services.sqs.model.EmptyBatchRequestException: There should be at least one DeleteMessageBatchRequestEntry in the request. (Service: Sqs, Status Code: 400, Request ID: 7a37b3c4-5ff5-5e40-b5d4-ed70519d6d1a) at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleErrorResponse(CombinedResponseHandler.java:125) at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handleResponse(CombinedResponseHandler.java:82) at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:60) at software.amazon.awssdk.core.internal.http.CombinedResponseHandler.handle(CombinedResponseHandler.java:41) at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:50) at software.amazon.awssdk.core.internal.http.pipeline.stages.HandleResponseStage.execute(HandleResponseStage.java:38) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:72) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptTimeoutTrackingStage.execute(ApiCallAttemptTimeoutTrackingStage.java:42) at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:78) at software.amazon.awssdk.core.internal.http.pipeline.stages.TimeoutExceptionHandlingStage.execute(TimeoutExceptionHandlingStage.java:40) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:55) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallAttemptMetricCollectionStage.execute(ApiCallAttemptMetricCollectionStage.java:39) at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:81) at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:56) at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:36) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:80) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:60) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:42) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50) at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37) at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26) at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:198) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:171) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:82) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:179) at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:76) at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45) at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:56) at software.amazon.awssdk.services.sqs.DefaultSqsClient.deleteMessageBatch(DefaultSqsClient.java:723) at org.apache.hudi.utilities.sources.helpers.CloudObjectsSelector.deleteBatchOfMessages(CloudObjectsSelector.java:213) at org.apache.hudi.utilities.sources.helpers.CloudObjectsSelector.deleteProcessedMessages(CloudObjectsSelector.java:238) at org.apache.hudi.utilities.sources.S3EventsSource.onCommit(S3EventsSource.java:104) at org.apache.hudi.utilities.streamer.StreamSync.writeToSink(StreamSync.java:844) at org.apache.hudi.utilities.streamer.StreamSync.syncOnce(StreamSync.java:446) at org.apache.hudi.utilities.streamer.HoodieStreamer$StreamSyncService.ingestOnce(HoodieStreamer.java:840) at org.apache.hudi.utilities.ingestion.HoodieIngestionService.startIngestion(HoodieIngestionService.java:72) at org.apache.hudi.common.util.Option.ifPresent(Option.java:97) at org.apache.hudi.utilities.streamer.HoodieStreamer.sync(HoodieStreamer.java:205) at org.apache.hudi.utilities.streamer.HoodieStreamer.main(HoodieStreamer.java:584) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52) at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1066) at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:192) at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:215) at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91) at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1158) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1167) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
