[https://issues.apache.org/jira/browse/SPARK-41013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17629764#comment-17629764]
yutiantian commented on SPARK-41013:
------------------------------------
[~yumwang] Testing with spark-3.3.1 appears to produce the same error.
Are there any parameters that need to be configured?
The error log is as follows:
19:09:45.882 [map-output-dispatcher-0] ERROR org.apache.spark.MapOutputTrackerMaster - null
java.lang.reflect.InvocationTargetException: null
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_181]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_181]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_181]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_181]
    at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:87) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.MapOutputTracker$.serializeOutputStatuses(MapOutputTracker.scala:1492) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:338) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:77) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:335) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.MapOutputTrackerMaster$MessageLoop.handleStatusMessage(MapOutputTracker.scala:729) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:746) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: java.lang.ExceptionInInitializerError: Cannot unpack libzstd-jni-1.5.2-1: No such file or directory
    at java.io.UnixFileSystem.createFileExclusively(Native Method) ~[?:1.8.0_181]
    at java.io.File.createTempFile(File.java:2024) ~[?:1.8.0_181]
    at com.github.luben.zstd.util.Native.load(Native.java:99) ~[zstd-jni-1.5.2-1.jar:1.5.2-1]
    at com.github.luben.zstd.util.Native.load(Native.java:55) ~[zstd-jni-1.5.2-1.jar:1.5.2-1]
    at com.github.luben.zstd.ZstdOutputStreamNoFinalizer.<clinit>(ZstdOutputStreamNoFinalizer.java:18) ~[zstd-jni-1.5.2-1.jar:1.5.2-1]
    at com.github.luben.zstd.RecyclingBufferPool.<clinit>(RecyclingBufferPool.java:18) ~[zstd-jni-1.5.2-1.jar:1.5.2-1]
    at org.apache.spark.io.ZStdCompressionCodec.<init>(CompressionCodec.scala:221) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    ... 15 more
19:09:45.883 [map-output-dispatcher-2] ERROR org.apache.spark.MapOutputTrackerMaster - null
java.lang.reflect.InvocationTargetException: null
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_181]
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_181]
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_181]
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_181]
    at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:87) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.MapOutputTracker$.serializeOutputStatuses(MapOutputTracker.scala:1492) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:338) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) ~[scala-library-2.12.15.jar:?]
    at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:77) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:335) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.MapOutputTrackerMaster$MessageLoop.handleStatusMessage(MapOutputTracker.scala:729) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:746) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
Caused by: java.lang.NoClassDefFoundError: Could not initialize class com.github.luben.zstd.RecyclingBufferPool
    at org.apache.spark.io.ZStdCompressionCodec.<init>(CompressionCodec.scala:221) ~[spark-core_2.12-3.3.1.jar:3.3.1]
    ... 15 more
19:09:45.883 [map-output-dispatcher-1] ERROR org.apache.spark.MapOutputTrackerMaster - null
> spark-3.1.2 job submitted in cluster mode fails with Could not initialize
> class com.github.luben.zstd.ZstdOutputStream
> --------------------------------------------------------------------------------------------
>
> Key: SPARK-41013
> URL: https://issues.apache.org/jira/browse/SPARK-41013
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.2
> Reporter: yutiantian
> Priority: Major
> Labels: libzstd-jni, spark.shuffle.mapStatus.compression.codec,
> zstd
>
> When submitting a job with spark-3.1.2 in cluster mode, it fails with
> Could not initialize class com.github.luben.zstd.ZstdOutputStream. The log is as follows:
> Exception in thread "map-output-dispatcher-0" Exception in thread "map-output-dispatcher-2" java.lang.ExceptionInInitializerError: Cannot unpack libzstd-jni: No such file or directory
>     at java.io.UnixFileSystem.createFileExclusively(Native Method)
>     at java.io.File.createTempFile(File.java:2024)
>     at com.github.luben.zstd.util.Native.load(Native.java:97)
>     at com.github.luben.zstd.util.Native.load(Native.java:55)
>     at com.github.luben.zstd.ZstdOutputStream.<clinit>(ZstdOutputStream.java:16)
>     at org.apache.spark.io.ZStdCompressionCodec.compressedOutputStream(CompressionCodec.scala:223)
>     at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:910)
>     at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:233)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:72)
>     at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:230)
>     at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:466)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Exception in thread "map-output-dispatcher-7" Exception in thread "map-output-dispatcher-5" java.lang.NoClassDefFoundError: Could not initialize class com.github.luben.zstd.ZstdOutputStream
>     at org.apache.spark.io.ZStdCompressionCodec.compressedOutputStream(CompressionCodec.scala:223)
>     at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:910)
>     at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:233)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:72)
>     at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:230)
>     at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:466)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Exception in thread "map-output-dispatcher-4" Exception in thread "map-output-dispatcher-3" java.lang.NoClassDefFoundError: Could not initialize class com.github.luben.zstd.ZstdOutputStream
>     at org.apache.spark.io.ZStdCompressionCodec.compressedOutputStream(CompressionCodec.scala:223)
>     at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:910)
>     at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:233)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:72)
>     at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:230)
>     at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:466)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> java.lang.NoClassDefFoundError: Could not initialize class com.github.luben.zstd.ZstdOutputStream
>     at org.apache.spark.io.ZStdCompressionCodec.compressedOutputStream(CompressionCodec.scala:223)
>     at org.apache.spark.MapOutputTracker$.serializeMapStatuses(MapOutputTracker.scala:910)
>     at org.apache.spark.ShuffleStatus.$anonfun$serializedMapStatus$2(MapOutputTracker.scala:233)
>     at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>     at org.apache.spark.ShuffleStatus.withWriteLock(MapOutputTracker.scala:72)
>     at org.apache.spark.ShuffleStatus.serializedMapStatus(MapOutputTracker.scala:230)
>     at org.apache.spark.MapOutputTrackerMaster$MessageLoop.run(MapOutputTracker.scala:466)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> However, the same code runs normally when submitted in client mode.
> A temporary workaround for cluster mode is to set
> spark.shuffle.mapStatus.compression.codec lz4 in spark-defaults.conf, after which the job submits normally.
> I would like to ask: in cluster mode, why does zstd compression fail during the shuffle phase?
> Any ideas would be greatly appreciated.
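> For reference, the workaround described above can be written as a small spark-defaults.conf fragment. The lz4 line is exactly what the reporter used; the java.io.tmpdir override is only an assumption about the root cause, based on the "Cannot unpack libzstd-jni: No such file or directory" failure in File.createTempFile (zstd-jni unpacks its native library into the JVM temp directory, which may not exist in the cluster-mode container):

```
# Workaround from this report: compress map statuses with lz4 instead of zstd
spark.shuffle.mapStatus.compression.codec  lz4

# Assumed alternative fix (untested here): point the driver/executor JVMs
# at a temp directory that exists in the container, so zstd-jni can unpack
# its native library successfully
spark.driver.extraJavaOptions    -Djava.io.tmpdir=/tmp
spark.executor.extraJavaOptions  -Djava.io.tmpdir=/tmp
```

> The same settings can also be passed per job via `spark-submit --conf`.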
--
This message was sent by Atlassian Jira
(v8.20.10#820010)