Greetings,

We are setting up an Apache Beam cluster that uses Spark as the backend to run
Python code. This is currently a toy example with four virtual machines running
CentOS (a client, a Spark main node, and two Spark workers).
We are running into version issues (details below) and would appreciate
guidance on which versions to set up.
We are currently trying spark-2.4.8-bin-hadoop2.7, with the pip package
apache-beam 2.37.0 on the client, and a job server to create the Docker image.
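For context, the client-side submission looks roughly like the sketch below.
This is a minimal illustration of the portable-runner flags, not our exact
invocation; the job-server host and port are placeholders.

```python
# Sketch of the client-side submission flags (Beam 2.37.0, Python SDK).
# "localhost:8099" is a placeholder for the Spark job-server endpoint.
pipeline_args = [
    "--runner=PortableRunner",
    "--job_endpoint=localhost:8099",  # gRPC endpoint of the Beam job server
    "--environment_type=DOCKER",      # workers run the Beam SDK harness container
]

# The pipeline itself is then constructed as usual, e.g.:
#   import apache_beam as beam
#   from apache_beam.options.pipeline_options import PipelineOptions
#   with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
#       ...
```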


I saw here https://beam.apache.org/blog/beam-2.33.0/ that "Spark 2.x users will 
need to update Spark's Jackson runtime dependencies (spark.jackson.version) to 
at least version 2.9.2, due to Beam updating its dependencies."

However, it looks like the jackson-core version in the job server is 2.13.0,
whereas the jars in spark-2.4.8-bin-hadoop2.7/jars are:

jackson-annotations-2.6.7.jar
jackson-core-2.6.7.jar
jackson-core-asl-1.9.13.jar
jackson-databind-2.6.7.3.jar
jackson-dataformat-yaml-2.6.7.jar
jackson-jaxrs-1.9.13.jar
jackson-mapper-asl-1.9.13.jar
jackson-module-jaxb-annotations-2.6.7.jar
jackson-module-paranamer-2.7.9.jar
jackson-module-scala_2.11-2.6.7.1.jar

There must be something to update, but I am not sure how to update these jar
files along with their dependencies, nor whether doing so would get us very far.
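In case it clarifies the question, this is roughly the kind of swap we
understand the release note to imply. It is only a sketch: the target version
and the set of artifacts to replace are our guesses, not something we have
verified, and the Spark jars path is a placeholder for our install.

```shell
# Sketch of the Jackson jar swap suggested by the Beam 2.33.0 release note.
# JACKSON_VERSION is a guess ("at least 2.9.2" per the note); the path is a
# placeholder. Note: jackson-module-scala must match jackson-databind's version.
JACKSON_VERSION=2.9.10
SPARK_JARS=/opt/spark-2.4.8-bin-hadoop2.7/jars

for artifact in jackson-core jackson-annotations jackson-databind; do
  # We have not actually run this fetch; shown only to make the question concrete.
  echo "would replace ${SPARK_JARS}/${artifact}-2.6.7*.jar" \
       "with com.fasterxml.jackson.core:${artifact}:${JACKSON_VERSION}"
done
```

Even if this is the right mechanism, it is unclear to us whether Spark 2.4.8
itself tolerates Jackson 2.9.x across all of its own transitive dependencies.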

Would you have a list of binaries that are known to work together, or a running
CI setup from the Apache Software Foundation similar to what we are trying to
achieve?
