dyang108 opened a new issue, #7524: URL: https://github.com/apache/hudi/issues/7524
I hope I can get some help with a problem I’ve been seeing in Deltastreamer, running on Mesos with an image built on the [docker.io/apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:latest](http://docker.io/apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:latest) image from Docker Hub. The job runs successfully for a few hours, but then it consistently fails during delta-sync. I’m reading from an Avro Kafka topic.

Is there another image I should be using to ensure that Deltastreamer functions properly? Any hints on what I might be configuring wrong? I’m new to Hudi and open to all help!

I don't see anything above the async compaction failure stacktrace to indicate that anything went wrong earlier, but I am also not sure what to look for.

**To Reproduce**

Steps to reproduce the behavior:

Built a docker image

```
FROM docker.io/apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:latest

ARG spark_home=/opt/spark

RUN mkdir ${spark_home}/work-dir

RUN apt-get update -y && \
    apt-get install -y libsvn1 libcurl4-nss-dev libevent-dev libsasl2-modules python libnss3 curl net-tools libopenblas-dev jnettop awscli zip jq procps && \
    apt-get autoremove -y

RUN touch /usr/local/bin/systemctl && chmod +x /usr/local/bin/systemctl && \
    wget http://repos.mesosphere.com/debian/pool/main/m/mesos/mesos_1.5.0-2.0.1.debian9_amd64.deb && \
    dpkg -i --ignore-depends=default-jre,libcurl3 mesos_1.5.0-2.0.1.debian9_amd64.deb && rm mesos_1.5.0-2.0.1.debian9_amd64.deb

ENV MESOS_NATIVE_JAVA_LIBRARY /usr/local/lib/libmesos.so

RUN rm /var/lib/dpkg/info/mesos.list
RUN apt-get purge mesos -y

RUN curl https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar -o /opt/spark/jars/aws-java-sdk-1.7.4.jar
RUN curl https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.3/hadoop-aws-2.7.3.jar -o /opt/spark/jars/hadoop-aws-2.7.3.jar
RUN curl https://repo1.maven.org/maven2/org/apache/hudi/hudi-utilities-bundle_2.11/0.12.1/hudi-utilities-bundle_2.11-0.12.1.jar -o ${spark_home}/work-dir/hudi-utilities-bundle_2.11-0.12.1.jar

ADD run.sh /

ENTRYPOINT /run.sh
```

Running with Hadoop-aws and trying to write to S3, with the following command in the run.sh

```
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  --master mesos://zk://10.0.1.0:2181,10.0.5.0:2181,10.0.9.0:2181/mesos/test/test \
  --deploy-mode client \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension \
  --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.hudi.catalog.HoodieCatalog \
  --conf spark.executor.extraJavaOptions -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dbfs/cluster-logs/heap-dumps/ \
  --conf spark.driver.extraJavaOptions -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dbfs/cluster-logs/heap-dumps/ \
  --conf spark.master.rest.enabled=true \
  --conf spark.mesos.uris=s3a://some-bucket/spark/job-service-confs/22b1f337c68605de3e90c90190197de6.conf \
  --conf spark.mesos.executor.docker.image=docker.io/apachehudi/hudi-hadoop_2.8.4-hive_2.3.3-sparkadhoc_2.4.4:latest \
  --conf spark.mesos.executor.home=/opt/spark \
  --conf spark.cores.max=48 \
  --conf spark.executor.cores=8 \
  --conf spark.executor.memory=16G \
  --conf spark.driver.memory=16G \
  --jars /opt/spark/jars/* /opt/spark/work-dir/hudi-utilities-bundle_2.11-0.12.1.jar \
  --props /mnt/mesos/sandbox/kafka-source.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
  --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
  --target-base-path "s3a://mlmodels/photo-activity-data13/hudi" \
  --target-table "photo_activity_data" \
  --op "UPSERT" \
  --source-ordering-field "ts" \
  --table-type "MERGE_ON_READ" \
  --source-limit 1000 \
  --continuous
```

**Expected behavior**

A clear and concise description of what you expected to happen.
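
A note on the `spark-submit` command above: as pasted, the two `extraJavaOptions` settings are written as `--conf key value`, whereas `spark-submit` documents the form `--conf PROP=VALUE`, quoted when the value contains spaces. This may only be a copy-and-paste artifact of the report. The following is a minimal sketch of the quoted form, not the command that was actually run; `HEAP_DUMP_OPTS` and `print_conf_args` are hypothetical names introduced here, and the helper only prints the arguments so that the quoting can be inspected without launching Spark.

```shell
#!/bin/sh
# Sketch: pass JVM options to spark-submit as one key=value argument.
# HEAP_DUMP_OPTS is a hypothetical variable name; its value is taken
# from the command in the report.
HEAP_DUMP_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/dbfs/cluster-logs/heap-dumps/"

# print_conf_args is a hypothetical helper. It prints each argument on
# its own line, so a value containing spaces is visibly kept together.
print_conf_args() {
  printf '%s\n' \
    --conf "spark.executor.extraJavaOptions=${HEAP_DUMP_OPTS}" \
    --conf "spark.driver.extraJavaOptions=${HEAP_DUMP_OPTS}"
}

print_conf_args
```

Each `--conf` is followed by exactly one argument, and the double quotes keep both `-XX` flags inside that single argument.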
**Environment Description**

* Hudi version : 0.12.1
* Spark version : 2.4.4
* Hive version : 2.3.3
* Hadoop version : 2.8.4
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : yes

**Additional context**

Add any other context about the problem here.

**Stacktrace**

```
22/12/19 21:00:54 ERROR deltastreamer.HoodieDeltaStreamer: Shutting down delta-sync due to exception
org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$1(HoodieDeltaStreamer.java:712)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
22/12/19 21:00:54 INFO deltastreamer.HoodieDeltaStreamer: Delta Sync shutdown. Error ?true
22/12/19 21:00:54 WARN deltastreamer.HoodieDeltaStreamer: Gracefully shutting down compactor
22/12/19 21:00:54 INFO deltastreamer.HoodieDeltaStreamer: DeltaSync shutdown. Closing write client. Error?true
22/12/19 21:00:54 ERROR async.HoodieAsyncService: Service shutdown with error
java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:193)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:190)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:571)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$1(HoodieDeltaStreamer.java:746)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$1(HoodieDeltaStreamer.java:712)
    ... 4 more
22/12/19 21:00:54 INFO lock.LockManager: Released connection created for acquiring lock
22/12/19 21:00:54 INFO transaction.TransactionManager: Transaction manager closed
22/12/19 21:00:54 INFO deltastreamer.DeltaSync: Shutting down embedded timeline server
22/12/19 21:00:54 INFO embedded.EmbeddedTimelineService: Closing Timeline server
22/12/19 21:00:54 INFO service.TimelineService: Closing Timeline Service
22/12/19 21:00:54 INFO javalin.Javalin: Stopping Javalin ...
22/12/19 21:00:54 INFO server.AbstractConnector: Stopped Spark@65bb9029{HTTP/1.1,[http/1.1]}{0.0.0.0:8090}
22/12/19 21:00:54 INFO ui.SparkUI: Stopped Spark web UI at http://ip-10-0-17-57.ec2.internal:8090/
22/12/19 21:00:54 ERROR javalin.Javalin: Javalin failed to stop gracefully
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1326)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at org.apache.hudi.org.eclipse.jetty.server.AbstractConnector.doStop(AbstractConnector.java:333)
    at org.apache.hudi.org.eclipse.jetty.server.AbstractNetworkConnector.doStop(AbstractNetworkConnector.java:88)
    at org.apache.hudi.org.eclipse.jetty.server.ServerConnector.doStop(ServerConnector.java:248)
    at org.apache.hudi.org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
    at org.apache.hudi.org.eclipse.jetty.server.Server.doStop(Server.java:450)
    at org.apache.hudi.org.eclipse.jetty.util.component.AbstractLifeCycle.stop(AbstractLifeCycle.java:89)
    at io.javalin.Javalin.stop(Javalin.java:195)
    at org.apache.hudi.timeline.service.TimelineService.close(TimelineService.java:325)
    at org.apache.hudi.client.embedded.EmbeddedTimelineService.stop(EmbeddedTimelineService.java:141)
    at org.apache.hudi.utilities.deltastreamer.DeltaSync.close(DeltaSync.java:907)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.close(HoodieDeltaStreamer.java:848)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.onDeltaSyncShutdown(HoodieDeltaStreamer.java:227)
    at org.apache.hudi.async.HoodieAsyncService.lambda$shutdownCallback$0(HoodieAsyncService.java:171)
    at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
    at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
22/12/19 21:00:54 INFO javalin.Javalin: Javalin has stopped
22/12/19 21:00:54 INFO service.TimelineService: Closed Timeline Service
22/12/19 21:00:54 INFO embedded.EmbeddedTimelineService: Closed Timeline server
22/12/19 21:00:54 INFO mesos.MesosCoarseGrainedSchedulerBackend: Shutting down all executors
22/12/19 21:00:54 INFO cluster.CoarseGrainedSchedulerBackend$DriverEndpoint: Asking each executor to shut down
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 4 is now TASK_FINISHED
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 2 is now TASK_FINISHED
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_FINISHED
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 3 is now TASK_FINISHED
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 0 is now TASK_FINISHED
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: Mesos task 5 is now TASK_FINISHED
I1219 21:00:55.627799 65 sched.cpp:2009] Asked to stop the driver
I1219 21:00:55.627866 147 sched.cpp:1191] Stopping framework 9dcd7947-26f3-4eb3-9ace-349103ab1b14-73491
22/12/19 21:00:55 INFO mesos.MesosCoarseGrainedSchedulerBackend: driver.run() returned with code DRIVER_STOPPED
22/12/19 21:00:55 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/12/19 21:00:55 INFO memory.MemoryStore: MemoryStore cleared
22/12/19 21:00:55 INFO storage.BlockManager: BlockManager stopped
22/12/19 21:00:55 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
22/12/19 21:00:55 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/12/19 21:00:55 INFO spark.SparkContext: Successfully stopped SparkContext
Exception in thread "main" org.apache.hudi.exception.HoodieException: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:195)
    at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:190)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:571)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.util.concurrent.ExecutionException: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:103)
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:193)
    ... 15 more
Caused by: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$1(HoodieDeltaStreamer.java:746)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hudi.exception.HoodieException: Async compaction failed. Shutting down Delta Sync...
    at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$1(HoodieDeltaStreamer.java:712)
    ... 4 more
22/12/19 21:00:55 INFO util.ShutdownHookManager: Shutdown hook called
22/12/19 21:00:55 INFO util.ShutdownHookManager: Deleting directory /mnt/mesos/sandbox/spark-ffb8d4f5-9422-47a3-809a-efaba70f3a51
22/12/19 21:00:55 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-905af617-d555-4481-89bf-b68482bfb16e
```
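
The trace above shows only the wrapper exception (`Async compaction failed. Shutting down Delta Sync...`) and no underlying cause. If the compaction error was logged at all, it would appear earlier in the driver or executor logs. The following is a minimal sketch, assuming the driver output was saved to a file and uses the same `yy/MM/dd HH:mm:ss LEVEL logger: message` layout as the lines above. It lists every `WARN` and `ERROR` line that precedes the first wrapper message. The function name `errors_before_failure` and the log path are hypothetical.

```shell
#!/bin/sh
# Sketch: list WARN/ERROR log lines that precede the first
# "Async compaction failed" message in a saved driver log.
# Assumption: log lines look like "22/12/19 21:00:54 ERROR logger: message".

errors_before_failure() {
  # $1 is the path to the saved driver log (hypothetical, e.g. driver.log).
  awk '
    /Async compaction failed/ { exit }               # stop at the first wrapper exception
    / (ERROR|WARN) /          { print NR ": " $0 }   # report earlier warnings and errors
  ' "$1"
}

# Runs only when a log path is passed on the command line.
if [ "$#" -ge 1 ]; then
  errors_before_failure "$1"
fi
```

Each reported line is prefixed with its line number in the log, so the surrounding context can be opened directly. Executor logs in the Mesos sandbox may be worth checking the same way, since a failed compaction task could be logged there rather than on the driver.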
