Hi Spark Community,

I am using the official Docker image `apache/spark-py:v3.4.0` and installing `pyspark==3.4.0` on top of it. However, I have encountered multiple security vulnerabilities related to outdated dependencies in the base image.

Issues:

1. Security Concerns:
   - Prisma scan reports 89 high/critical CVEs/PRISMAs in the base image.
   - Some vulnerabilities are related to outdated system libraries and dependencies.

### CVE issues from the Prisma scan

| S.No | CVE ID | Severity | Packages | Package Version | Fix Status | Package Path |
|-----:|:-------|:---------|:---------|:----------------|:-----------|:-------------|
| 1 | CVE-2022-1471 | critical | org.yaml_snakeyaml | 1.33 | fixed in 2.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/snakeyaml-1.33.jar |
| 2 | CVE-2018-7489 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.5, 2.8.11.1, 2.7.9.3 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 3 | CVE-2019-17267 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 4 | CVE-2019-20330 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.2 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 5 | CVE-2020-10650 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.5 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 6 | CVE-2020-24616 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.6 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 7 | CVE-2020-24750 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.6 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 8 | CVE-2020-35490 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 9 | CVE-2020-35491 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 10 | CVE-2020-36179 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 11 | CVE-2020-36180 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 12 | CVE-2020-36181 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 13 | CVE-2020-36182 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 14 | CVE-2020-36183 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 15 | CVE-2020-36184 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 16 | CVE-2020-36185 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 17 | CVE-2020-36186 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 18 | CVE-2020-36187 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 19 | CVE-2020-36188 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 20 | CVE-2020-36189 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.8 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 21 | CVE-2020-36518 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.12.6.1, 2.13.2.1 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 22 | CVE-2020-8840 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.3, 2.8.11.5, 2.7.9.7 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 23 | CVE-2020-9547 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.4 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 24 | CVE-2020-9548 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.4 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 25 | CVE-2021-20190 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10.7 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 26 | CVE-2022-42003 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.13.4.1, 2.12.7.1 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 27 | CVE-2022-42004 | high | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.13.4 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 28 | CVE-2024-47554 | high | commons-io_commons-io | 2.8.0 | fixed in 2.14.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 29 | CVE-2024-47561 | critical | org.apache.avro_avro | 1.7.7 | fixed in 1.11.4 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 30 | CVE-2023-39410 | high | org.apache.avro_avro | 1.7.7 | fixed in 1.11.3 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 31 | CVE-2022-42003 | high | com.fasterxml.jackson.core_jackson-databind | 2.12.7 | fixed in 2.13.4.1, 2.12.7.1 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 32 | CVE-2022-42004 | high | com.fasterxml.jackson.core_jackson-databind | 2.12.7 | fixed in 2.13.4 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 33 | CVE-2023-52428 | high | com.nimbusds_nimbus-jose-jwt | 9.8.1 | fixed in 9.37.2 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 34 | CVE-2024-23945 | high | org.apache.spark_spark-hive-thriftserver_2.12 | 3.4.0 | fixed in 3.4.2 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/spark-hive-thriftserver_2.12-3.4.0.jar |
| 35 | CVE-2024-47554 | high | commons-io_commons-io | 2.11.0 | fixed in 2.14.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/commons-io-2.11.0.jar |
| 36 | GHSA-xpw8-rcwv-8f8p | high | io.netty_netty-codec-http2 | 4.1.87.Final | fixed in 4.1.100.Final | /usr/local/lib/python3.10/dist-packages/pyspark/jars/netty-codec-http2-4.1.87.Final.jar |
| 37 | CVE-2023-44487 | high | io.netty_netty-codec-http2 | 4.1.87.Final | fixed in 4.1.100.Final | /usr/local/lib/python3.10/dist-packages/pyspark/jars/netty-codec-http2-4.1.87.Final.jar |
| 38 | CVE-2022-31159 | high | com.amazonaws_aws-java-sdk-s3 | 1.11.1026 | fixed in 1.12.261 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 39 | CVE-2018-1330 | high | org.apache.mesos_mesos | 1.4.3 | fixed in 1.6.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/mesos-1.4.3-shaded-protobuf.jar |
| 40 | CVE-2024-7254 | high | com.google.protobuf_protobuf-java | 3.7.1 | fixed in 4.28.2, 4.27.5, 3.25.5 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 41 | CVE-2021-22569 | high | com.google.protobuf_protobuf-java | 3.7.1 | fixed in 3.19.2, 3.18.2, 3.16.1 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 42 | CVE-2021-22570 | high | com.google.protobuf_protobuf-java | 3.7.1 | fixed in 3.15.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 43 | CVE-2022-3509 | high | com.google.protobuf_protobuf-java | 3.7.1 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 44 | CVE-2022-3510 | high | com.google.protobuf_protobuf-java | 3.7.1 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 45 | CVE-2021-37136 | high | io.netty_netty-codec | 4.1.61.Final | fixed in 4.1.68.Final | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 46 | CVE-2021-37137 | high | io.netty_netty-codec | 4.1.61.Final | fixed in 4.1.68.Final | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 47 | CVE-2023-44981 | critical | org.apache.zookeeper_zookeeper | 3.6.3 | fixed in 3.9.1, 3.8.3, 3.7.2 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/zookeeper-3.6.3.jar |
| 48 | CVE-2022-2048 | high | org.eclipse.jetty_jetty-io | 9.4.43.v20210629 | fixed in 11.0.9, 10.0.9, 9.4.47 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 49 | CVE-2023-36478 | high | org.eclipse.jetty_jetty-io | 9.4.43.v20210629 | fixed in 11.0.16, 10.0.16, 9.4.53 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 50 | CVE-2023-44487 | high | org.eclipse.jetty_jetty-io | 9.4.43.v20210629 | fixed in 12.0.2, 11.0.17, 10.0.17,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 51 | CVE-2024-22201 | high | org.eclipse.jetty_jetty-io | 9.4.43.v20210629 | fixed in 12.0.6, 11.0.20, 10.0.20,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 52 | PRISMA-2023-0067 | high | com.fasterxml.jackson.core_jackson-core | 2.12.7 | fixed in 2.15.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 53 | PRISMA-2023-0067 | high | com.fasterxml.jackson.core_jackson-core | 2.13.2 | fixed in 2.15.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/parquet-jackson-1.12.3.jar |
| 54 | CVE-2021-31684 | high | net.minidev_json-smart | 1.3.2 | fixed in 2.4.4, 1.3.3 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 55 | CVE-2023-1370 | high | net.minidev_json-smart | 1.3.2 | fixed in 2.4.9 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 56 | CVE-2023-36478 | high | org.eclipse.jetty_jetty-io | 9.4.50.v20221201 | fixed in 11.0.16, 10.0.16, 9.4.53 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/spark-core_2.12-3.4.0.jar |
| 57 | CVE-2023-44487 | high | org.eclipse.jetty_jetty-io | 9.4.50.v20221201 | fixed in 12.0.2, 11.0.17, 10.0.17,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/spark-core_2.12-3.4.0.jar |
| 58 | CVE-2024-22201 | high | org.eclipse.jetty_jetty-io | 9.4.50.v20221201 | fixed in 12.0.6, 11.0.20, 10.0.20,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/spark-core_2.12-3.4.0.jar |
| 59 | CVE-2022-25647 | high | gson | 2.2.4 | fixed in 2.8.9 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/gson-2.2.4.jar |
| 60 | CVE-2024-7254 | high | com.google.protobuf_protobuf-java | 3.3.0 | fixed in 4.28.2, 4.27.5, 3.25.5 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/mesos-1.4.3-shaded-protobuf.jar |
| 61 | CVE-2021-22569 | high | com.google.protobuf_protobuf-java | 3.3.0 | fixed in 3.19.2, 3.18.2, 3.16.1 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/mesos-1.4.3-shaded-protobuf.jar |
| 62 | CVE-2021-22570 | high | com.google.protobuf_protobuf-java | 3.3.0 | fixed in 3.15.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/mesos-1.4.3-shaded-protobuf.jar |
| 63 | CVE-2022-3509 | high | com.google.protobuf_protobuf-java | 3.3.0 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/mesos-1.4.3-shaded-protobuf.jar |
| 64 | CVE-2022-3510 | high | com.google.protobuf_protobuf-java | 3.3.0 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/mesos-1.4.3-shaded-protobuf.jar |
| 65 | CVE-2024-47561 | critical | org.apache.avro_avro | 1.11.1 | fixed in 1.11.4 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/avro-1.11.1.jar |
| 66 | CVE-2023-39410 | high | org.apache.avro_avro | 1.11.1 | fixed in 1.11.3 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/avro-1.11.1.jar |
| 67 | PRISMA-2023-0067 | high | com.fasterxml.jackson.core_jackson-core | 2.6.7 | fixed in 2.15.0 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 68 | PRISMA-2023-0067 | high | com.fasterxml.jackson.core_jackson-core | 2.14.2 | fixed in 2.15.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/jackson-core-2.14.2.jar |
| 69 | CVE-2020-28491 | high | com.fasterxml.jackson.dataformat_jackson-dataformat-cbor | 2.6.7 | fixed in 2.12.1, 2.11.4 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 70 | CVE-2024-36114 | high | io.airlift_aircompressor | 0.21 | fixed in 0.27 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/aircompressor-0.21.jar |
| 71 | CVE-2024-7254 | high | com.google.protobuf_protobuf-java | 3.21.12 | fixed in 4.28.2, 4.27.5, 3.25.5 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/spark-core_2.12-3.4.0.jar |
| 72 | CVE-2024-7254 | high | com.google.protobuf_protobuf-java | 3.17.3 | fixed in 4.28.2, 4.27.5, 3.25.5 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/orc-mapreduce-1.8.3-shaded-protobuf.jar |
| 73 | CVE-2022-3509 | high | com.google.protobuf_protobuf-java | 3.17.3 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/orc-mapreduce-1.8.3-shaded-protobuf.jar |
| 74 | CVE-2022-3510 | high | com.google.protobuf_protobuf-java | 3.17.3 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/orc-mapreduce-1.8.3-shaded-protobuf.jar |
| 75 | CVE-2023-2976 | high | com.google.guava_guava | 30.1.1-jre | fixed in 32.0.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-shaded-guava-1.1.1.jar |
| 76 | CVE-2022-42003 | high | com.fasterxml.jackson.core_jackson-databind | 2.13.2.2 | fixed in 2.13.4.1, 2.12.7.1 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/parquet-jackson-1.12.3.jar |
| 77 | CVE-2022-42004 | high | com.fasterxml.jackson.core_jackson-databind | 2.13.2.2 | fixed in 2.13.4 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/parquet-jackson-1.12.3.jar |
| 78 | CVE-2024-7254 | high | com.google.protobuf_protobuf-java | 2.5.0 | fixed in 4.28.2, 4.27.5, 3.25.5 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/protobuf-java-2.5.0.jar |
| 79 | CVE-2021-22569 | high | com.google.protobuf_protobuf-java | 2.5.0 | fixed in 3.19.2, 3.18.2, 3.16.1 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/protobuf-java-2.5.0.jar |
| 80 | CVE-2021-22570 | high | com.google.protobuf_protobuf-java | 2.5.0 | fixed in 3.15.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/protobuf-java-2.5.0.jar |
| 81 | CVE-2022-3509 | high | com.google.protobuf_protobuf-java | 2.5.0 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/protobuf-java-2.5.0.jar |
| 82 | CVE-2022-3510 | high | com.google.protobuf_protobuf-java | 2.5.0 | fixed in 3.21.7, 3.20.3, 3.19.6,... | /usr/local/lib/python3.10/dist-packages/pyspark/jars/protobuf-java-2.5.0.jar |
| 83 | CVE-2019-0205 | high | libthrift | 0.12.0 | fixed in 0.13.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/libthrift-0.12.0.jar |
| 84 | CVE-2019-0210 | high | libthrift | 0.12.0 | fixed in 0.13.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/libthrift-0.12.0.jar |
| 85 | CVE-2020-13949 | high | libthrift | 0.12.0 | fixed in 0.14.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/libthrift-0.12.0.jar |
| 86 | CVE-2024-25638 | high | dnsjava_dnsjava | 2.1.7 | fixed in 3.6.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 87 | CVE-2023-34455 | high | org.xerial.snappy_snappy-java | 1.1.9.1 | fixed in 1.1.10.1 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/snappy-java-1.1.9.1.jar |
| 88 | CVE-2023-43642 | high | org.xerial.snappy_snappy-java | 1.1.9.1 | fixed in 1.1.10.4 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/snappy-java-1.1.9.1.jar |
| 89 | CVE-2023-2976 | high | com.google.guava_guava | 14.0.1 | fixed in 32.0.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/spark-network-common_2.12-3.4.0.jar |

Questions:

- Can the Spark team provide guidance on securely building a Spark 3.4.0 image with updated dependencies?
- I tried upgrading the problematic jars to their fixed versions, but after that I get compatibility issues when running jobs.

Environment Details:

- Base Image: `apache/spark-py:v3.4.0`
- Installed PySpark Version: `3.4.0`
- Python version: 3.10
- Issue: Security vulnerabilities in outdated dependencies

Let me know if there are any workarounds.

Best regards,
Ejas Ali
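
For triage, the findings in the scan table above can be grouped by jar file to see which artifacts account for most of the rows. The following is a minimal sketch only, assuming the scan rows are available as pipe-delimited text in the same column order as the table. The `ROWS` sample (copied from rows 1, 2, 3, 28, 29 and 35) and the `count_by_jar` helper are illustrative names, not part of Prisma's output or of any Spark tooling.

```python
from collections import Counter

# Sample rows copied from the scan table (same seven-column, pipe-delimited layout).
ROWS = """
| S.No | CVE ID | Severity | Packages | Package Version | Fix Status | Package Path |
|-----:|:-------|:---------|:---------|:----------------|:-----------|:-------------|
| 1 | CVE-2022-1471 | critical | org.yaml_snakeyaml | 1.33 | fixed in 2.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/snakeyaml-1.33.jar |
| 2 | CVE-2018-7489 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.5, 2.8.11.1, 2.7.9.3 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 3 | CVE-2019-17267 | critical | com.fasterxml.jackson.core_jackson-databind | 2.6.7.4 | fixed in 2.9.10 | /opt/spark/jars/aws-java-sdk-bundle-1.11.1026.jar |
| 28 | CVE-2024-47554 | high | commons-io_commons-io | 2.8.0 | fixed in 2.14.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 29 | CVE-2024-47561 | critical | org.apache.avro_avro | 1.7.7 | fixed in 1.11.4 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/hadoop-client-runtime-3.3.4.jar |
| 35 | CVE-2024-47554 | high | commons-io_commons-io | 2.11.0 | fixed in 2.14.0 | /usr/local/lib/python3.10/dist-packages/pyspark/jars/commons-io-2.11.0.jar |
"""


def count_by_jar(table_text):
    """Count findings per jar file name (last component of the Package Path column)."""
    counts = Counter()
    for line in table_text.strip().splitlines():
        # Drop the outer pipes, then split the row into its cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip the header row, the alignment row, and anything malformed.
        if len(cells) != 7 or not cells[0].isdigit():
            continue
        jar = cells[6].rsplit("/", 1)[-1]
        counts[jar] += 1
    return counts


if __name__ == "__main__":
    for jar, n in count_by_jar(ROWS).most_common():
        print(f"{n:3d}  {jar}")
```

Run over the full table, this kind of grouping shows that a large share of the rows point at two bundled jars, `aws-java-sdk-bundle-1.11.1026.jar` and `hadoop-client-runtime-3.3.4.jar`, rather than at many independent dependencies.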