[ 
https://issues.apache.org/jira/browse/HUDI-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388023#comment-17388023
 ] 

ASF GitHub Bot commented on HUDI-2230:
--------------------------------------

davehagman commented on pull request #3345:
URL: https://github.com/apache/hudi/pull/3345#issuecomment-887476514


   @yanghua jira link: https://issues.apache.org/jira/browse/HUDI-2230
   
   If you need any more info, let me know! FYI, this is preventing Graphite 
metrics reporting on 0.9.0, but I was unsure whether to label it a blocker. 
It's currently at `Major`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

> "Task not serializable" exception due to non-serializable Codahale Timers
> -------------------------------------------------------------------------
>
>                 Key: HUDI-2230
>                 URL: https://issues.apache.org/jira/browse/HUDI-2230
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.9.0
>            Reporter: Dave Hagman
>            Priority: Major
>             Fix For: 0.9.0
>
>
> Steps to reproduce:
>  * Enable graphite metrics via props file. Example:
> {noformat}
> hoodie.metrics.on=true
> hoodie.metrics.reporter.type=GRAPHITE
> hoodie.metrics.graphite.host=<host>
> hoodie.metrics.graphite.port=<port>
> hoodie.metrics.graphite.metric.prefix=<metrics_prefix>
> {noformat}
>  * Run the Deltastreamer
>  * Note the following exception:
> {noformat}
> Exception in thread "main" org.apache.hudi.exception.HoodieException: 
> org.apache.hudi.exception.HoodieException: Task not serializable
>       at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:165)
>       at org.apache.hudi.common.util.Option.ifPresent(Option.java:96)
>       at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:160)
>       at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:501)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
>       at 
> org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:959)
>       at 
> org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
>       at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
>       at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
>       at 
> org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1038)
>       at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1047)
>       at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.util.concurrent.ExecutionException: 
> org.apache.hudi.exception.HoodieException: Task not serializable
>       at 
> java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
>       at 
> java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
>       at 
> org.apache.hudi.async.HoodieAsyncService.waitForShutdown(HoodieAsyncService.java:90)
>       at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$1(HoodieDeltaStreamer.java:163)
>       ... 15 more
> Caused by: org.apache.hudi.exception.HoodieException: Task not serializable
>       at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:649)
>       at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1604)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.spark.SparkException: Task not serializable
>       at 
> org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:416)
>       at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:406)
>       at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:162)
>       at org.apache.spark.SparkContext.clean(SparkContext.scala:2502)
>       at org.apache.spark.rdd.RDD.$anonfun$map$1(RDD.scala:422)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>       at 
> org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>       at org.apache.spark.rdd.RDD.withScope(RDD.scala:414)
>       at org.apache.spark.rdd.RDD.map(RDD.scala:421)
>       at org.apache.spark.api.java.JavaRDDLike.map(JavaRDDLike.scala:93)
>       at org.apache.spark.api.java.JavaRDDLike.map$(JavaRDDLike.scala:92)
>       at 
> org.apache.spark.api.java.AbstractJavaRDDLike.map(JavaRDDLike.scala:45)
>       ...
>       at 
> org.apache.hudi.utilities.sources.RowSource.fetchNewData(RowSource.java:43)
>       at org.apache.hudi.utilities.sources.Source.fetchNext(Source.java:76)
>       at 
> org.apache.hudi.utilities.deltastreamer.SourceFormatAdapter.fetchNewDataInAvroFormat(SourceFormatAdapter.java:69)
>       at 
> org.apache.hudi.utilities.deltastreamer.DeltaSync.readFromSource(DeltaSync.java:399)
>       at 
> org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:271)
>       at 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer$DeltaSyncService.lambda$startService$0(HoodieDeltaStreamer.java:625)
>       ... 4 more
> Caused by: java.io.NotSerializableException: 
> org.apache.hudi.com.codahale.metrics.Timer
> Serialization stack:
>       - object not serializable (class: 
> org.apache.hudi.com.codahale.metrics.Timer, value: 
> org.apache.hudi.com.codahale.metrics.Timer@e7bcb5c)
>       - field (class: 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics, name: 
> overallTimer, type: class org.apache.hudi.com.codahale.metrics.Timer)
>       - object (class 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics, 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics@6704c405)
>       ...
>       - element of array (index: 0)
>       - array (class [Ljava.lang.Object;, size 1)
>       - field (class: java.lang.invoke.SerializedLambda, name: capturedArgs, 
> type: class [Ljava.lang.Object;)
>       - object (class java.lang.invoke.SerializedLambda, 
> SerializedLambda[capturingClass=..., 
> functionalInterfaceMethod=org/apache/spark/api/java/function/Function.call:
> {noformat}
> The important line is:
> {noformat}
> - field (class: 
> org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamerMetrics, name: 
> overallTimer, type: class org.apache.hudi.com.codahale.metrics.Timer)
> {noformat}
> *FIX:*
> Mark the `Timer` member variables in the `HoodieDeltaStreamerMetrics` class 
> as `transient` so that Java serialization skips them when Spark serializes 
> the closure that captures the metrics object.
>  
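
The proposed fix can be illustrated with a minimal, JDK-only sketch. Everything in it is hypothetical and for illustration only: `FakeTimer` stands in for the non-serializable Codahale `Timer`, and `BrokenMetrics` / `FixedMetrics` stand in for `HoodieDeltaStreamerMetrics` before and after the change. The lazy re-creation in `getOverallTimer()` is an assumption about how a `transient` field could be handled after deserialization (it comes back as `null`), not necessarily what the actual patch does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientTimerSketch {

  // Hypothetical stand-in for a metrics timer that is NOT Serializable.
  static class FakeTimer {
    long count;
    void update() { count++; }
  }

  // Before the fix: a Serializable holder with a non-transient timer field.
  static class BrokenMetrics implements Serializable {
    private static final long serialVersionUID = 1L;
    private final FakeTimer overallTimer = new FakeTimer();
  }

  // After the fix: the timer field is transient, so serialization skips it.
  static class FixedMetrics implements Serializable {
    private static final long serialVersionUID = 1L;
    private transient FakeTimer overallTimer;

    // Assumption: re-create the timer lazily, since a transient field
    // is null on a deserialized copy.
    FakeTimer getOverallTimer() {
      if (overallTimer == null) {
        overallTimer = new FakeTimer();
      }
      return overallTimer;
    }
  }

  // Returns false if Java serialization rejects the object graph.
  static boolean canSerialize(Object o) {
    try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
      out.writeObject(o);
      return true;
    } catch (NotSerializableException e) {
      return false;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Serializes and deserializes an object, roughly what happens when a
  // captured object is shipped with a task.
  static Object roundTrip(Object o) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(o);
      }
      try (ObjectInputStream in =
               new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println("BrokenMetrics serializable: " + canSerialize(new BrokenMetrics()));

    FixedMetrics fixed = new FixedMetrics();
    fixed.getOverallTimer().update();
    FixedMetrics copy = (FixedMetrics) roundTrip(fixed);
    System.out.println("FixedMetrics serializable: " + canSerialize(fixed));
    System.out.println("Timer count on deserialized copy: " + copy.getOverallTimer().count);
  }
}
```

One consequence of this design is that the deserialized copy starts with a fresh timer, so any state recorded on the original is not carried across. That is usually acceptable when the timers are only meaningful on the driver.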



--
This message was sent by Atlassian Jira
(v8.3.4#803005)