LucaCanali opened a new pull request #23898: [SPARK-26995][K8S] Running Spark 
in Docker image with Alpine Linux 3.9.0 throws errors when using snappy
URL: https://github.com/apache/spark/pull/23898
 
 
   Running Spark in Docker image with Alpine Linux 3.9.0 throws errors when 
using snappy. 
   
   The issue can be reproduced for example as follows: 
`Seq(1,2).toDF("id").write.format("parquet").save("DELETEME1")` 
   The key part of the error stack is as follows: `SparkException: Task failed 
while writing rows. .... Caused by: java.lang.UnsatisfiedLinkError: 
/tmp/snappy-1.1.7-2b4872f1-7c41-4b84-bda1-dbcb8dd0ce4c-libsnappyjava.so: Error 
loading shared library ld-linux-x86-64.so.2: No such file or directory (needed by 
/tmp/snappy-1.1.7-2b4872f1-7c41-4b84-bda1-dbcb8dd0ce4c-libsnappyjava.so)` 
   
   The source of the error appears to be that libsnappyjava.so needs 
ld-linux-x86-64.so.2 and looks for it in /lib, while in Alpine Linux 3.9.0 with 
libc6-compat version 1.1.20-r3 ld-linux-x86-64.so.2 is located in /lib64.
   Note: this issue is not present with Alpine Linux 3.8 and libc6-compat 
version 1.1.19-r10 
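
   A quick way to confirm the diagnosis inside a container is to check which 
of the two directories actually holds the loader. The following is only a 
minimal sketch: the helper name `find_loader` and its optional root-prefix 
argument are hypothetical, introduced here for illustration.

```shell
# Print each of /lib and /lib64 that contains ld-linux-x86-64.so.2.
# The optional first argument is a hypothetical root prefix, useful for
# inspecting an unpacked image or a scratch directory instead of "/".
find_loader() {
    root="${1:-}"
    for dir in /lib /lib64; do
        if [ -e "${root}${dir}/ld-linux-x86-64.so.2" ]; then
            echo "${dir}"
        fi
    done
}

# Inspect the running system (or container).
find_loader
```

   If the description above holds, an affected image would list only /lib64, 
while an unaffected one would also list /lib.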
   
   ## What changes were proposed in this pull request?
   
   A possible workaround proposed with this PR is to modify the Dockerfile by 
adding a symbolic link between /lib and /lib64 so that ld-linux-x86-64.so.2 can 
be found in /lib. This is probably not the cleanest solution, but I have 
observed that the same arrangement is already in place when using Alpine Linux 
3.8.1 (a version of Alpine Linux which is not affected by the issue reported 
here).
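
   To illustrate the idea, the sketch below creates such a link inside a 
scratch directory rather than the real filesystem. It shows one possible form 
of the workaround (linking the single loader file) and is not necessarily the 
exact command added to the Dockerfile by this PR. The `ROOT` variable and the 
empty stand-in file are assumptions made so the example can be tried safely.

```shell
# Hypothetical scratch root, so the real /lib and /lib64 are left untouched.
ROOT="$(mktemp -d)"
mkdir -p "$ROOT/lib" "$ROOT/lib64"

# Empty stand-in for the loader that libc6-compat places in /lib64.
touch "$ROOT/lib64/ld-linux-x86-64.so.2"

# Make the loader reachable under lib through a relative symbolic link,
# but only if nothing with that name exists there already.
if [ ! -e "$ROOT/lib/ld-linux-x86-64.so.2" ]; then
    ln -s ../lib64/ld-linux-x86-64.so.2 "$ROOT/lib/ld-linux-x86-64.so.2"
fi

# Show the resulting link.
ls -l "$ROOT/lib"
```

   In a Dockerfile the equivalent step would run against the image's real 
/lib and /lib64 as part of a `RUN` instruction.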
   
   ## How was this patch tested?
   
   Manually tested by running a simple workload with spark-shell, using docker 
on a client machine and using Spark on a Kubernetes cluster.
   The test workload is: 
`Seq(1,2).toDF("id").write.format("parquet").save("DELETEME1")` 
   
