> On 28 Mar 2022, at 20:58, Mihai Alexe <mihai.al...@ecmwf.int> wrote:
>
> the jackson runtime dependencies should be updated manually (at least to
> 2.9.2) in case of using Spark 2.x
>
> yes - that is exactly what we are looking to achieve, any hints about how to
> do that? We’re not Java experts. Do you happen to have a CI recipe or binary
> list for this particular configuration? Thank you!

For our testing pipelines, which run on Spark 2.4.7, we just build them with a
recent version of the jackson libs [1]. Though, iiuc, you don’t build any Java
code. So what exactly is the issue that you are facing?

> use Spark 3.x if possible since it already provides jackson jars of version
> 2.10.0.
>
> we tried this too but ran into other compatibility problems. Seems that the
> Beam Spark runner (in v 2.37.0) only supports the Spark 2.x branch, as per
> the Beam docs https://beam.apache.org/documentation/runners/spark/

Which Spark 3.x version exactly did you try? AFAICT, Beam 2.37.0 supports and
was tested with Spark 3.1.2 / Scala 2.12 artifacts.

--
Alexey

[1] https://github.com/Talend/beam-samples/blob/9288606495b9ba8f77383cd9709ed9b5783deeb8/pom.xml#L66

> any ideas?
>
> On 2022/03/28 17:38:13 Alexey Romanenko wrote:
> > Well, it’s caused by the recent jackson version update in Beam [1], so the
> > jackson runtime dependencies should be updated manually (at least to 2.9.2)
> > when using Spark 2.x.
> >
> > Alternatively, use Spark 3.x if possible, since it already provides jackson
> > jars of version 2.10.0.
> >
> > [1] https://github.com/apache/beam/commit/9694f70df1447e96684b665279679edafec13a0c
> >
> > --
> > Alexey
> >
> > > On 28 Mar 2022, at 14:15, Florian Pinault <fl...@ecmwf.int> wrote:
> > >
> > > Greetings,
> > >
> > > We are setting up an Apache Beam cluster using Spark as a backend to run
> > > Python code. This is currently a toy example with 4 virtual machines
> > > running CentOS (a client, a Spark main, and two Spark workers).
> > > We are running into version issues (details below) and would need help on
> > > which versions to set up.
> > > We are currently trying spark-2.4.8-bin-hadoop2.7, with the pip package
> > > beam 2.37.0 on the client, and using a job-server to create the Docker image.
> > >
> > > I saw here https://beam.apache.org/blog/beam-2.33.0/ that "Spark 2.x users
> > > will need to update Spark's Jackson runtime dependencies
> > > (spark.jackson.version) to at least version 2.9.2, due to Beam updating
> > > its dependencies."
> > > But it looks like the jackson-core version in the job-server is 2.13.0
> > > whereas the jars in spark-2.4.8-bin-hadoop2.7/jars are
> > > -. 1 mluser mluser 46986 May 8 2021 jackson-annotations-2.6.7.jar
> > > -. 1 mluser mluser 258919 May 8 2021 jackson-core-2.6.7.jar
> > > -. 1 mluser mluser 232248 May 8 2021 jackson-core-asl-1.9.13.jar
> > > -. 1 mluser mluser 1166637 May 8 2021 jackson-databind-2.6.7.3.jar
> > > -. 1 mluser mluser 320444 May 8 2021 jackson-dataformat-yaml-2.6.7.jar
> > > -. 1 mluser mluser 18336 May 8 2021 jackson-jaxrs-1.9.13.jar
> > > -. 1 mluser mluser 780664 May 8 2021 jackson-mapper-asl-1.9.13.jar
> > > -. 1 mluser mluser 32612 May 8 2021 jackson-module-jaxb-annotations-2.6.7.jar
> > > -. 1 mluser mluser 42858 May 8 2021 jackson-module-paranamer-2.7.9.jar
> > > -. 1 mluser mluser 515645 May 8 2021 jackson-module-scala_2.11-2.6.7.1.jar
> > >
> > > There must be something to update, but I am not sure how to update these
> > > jar files with their dependencies, and not sure if this would get us very
> > > far.
> > >
> > > Would you have a list of binaries that work together or some running CI
> > > from the Apache Foundation similar to what we are trying to achieve?
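
To make the version gap in the quoted listing concrete, here is a minimal Python sketch that flags which Jackson 2.x jars fall below the 2.9.2 minimum named in the Beam 2.33.0 release note. It is only an illustration: the function name `outdated_jackson_jars`, the regex, and the constant names are hypothetical and not part of Beam or Spark, and it assumes the 1.9.13 jars (the `-asl` and `jaxrs` ones) belong to the separate Jackson 1.x line and so are not covered by the requirement. It only reports the mismatch; it does not replace any jars.

```python
import re

# Minimum Jackson 2.x version quoted from the Beam 2.33.0 release note.
MIN_VERSION = (2, 9, 2)

# Matches e.g. "jackson-databind-2.6.7.3.jar" or
# "jackson-module-scala_2.11-2.6.7.1.jar" (artifact name, then version).
JAR_RE = re.compile(r"^(jackson-[a-z0-9_.\-]+?)-(\d+(?:\.\d+)+)\.jar$")

# Jar names copied from the spark-2.4.8-bin-hadoop2.7/jars listing above.
SPARK_248_JARS = [
    "jackson-annotations-2.6.7.jar",
    "jackson-core-2.6.7.jar",
    "jackson-core-asl-1.9.13.jar",
    "jackson-databind-2.6.7.3.jar",
    "jackson-dataformat-yaml-2.6.7.jar",
    "jackson-jaxrs-1.9.13.jar",
    "jackson-mapper-asl-1.9.13.jar",
    "jackson-module-jaxb-annotations-2.6.7.jar",
    "jackson-module-paranamer-2.7.9.jar",
    "jackson-module-scala_2.11-2.6.7.1.jar",
]


def outdated_jackson_jars(jar_names, minimum=MIN_VERSION):
    """Return (artifact, version) pairs for Jackson 2.x jars older than `minimum`."""
    outdated = []
    for name in jar_names:
        match = JAR_RE.match(name)
        if not match:
            continue  # not a Jackson jar, or an unexpected file name
        artifact, version_text = match.groups()
        version = tuple(int(part) for part in version_text.split("."))
        if version[0] != 2:
            continue  # assumption: Jackson 1.x jars are a separate library line
        # Compare numerically so that 2.13.0 counts as newer than 2.9.2.
        if version[:3] < minimum:
            outdated.append((artifact, version_text))
    return outdated


if __name__ == "__main__":
    for artifact, version in outdated_jackson_jars(SPARK_248_JARS):
        print(f"{artifact} {version} is older than 2.9.2")
```

To check a real installation, the same function could be fed `os.listdir()` of the Spark `jars` directory instead of the hard-coded list.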