The root cause was actually "java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable", which I eventually fixed by including hadoop-common as a dep for my pipeline (below). Should hadoop-common be listed as a dep of ParquetIO in the Beam repo itself?

implementation "org.apache.hadoop:hadoop-common:3.2.4"

On Fri, Apr 21, 2023 at 10:38 AM Evan Galpin <egal...@apache.org> wrote:

> Oops, I was looking at the "bootleg" mvnrepository search engine, which
> shows `compileOnly` in the copy-pastable dependency installation
> prompts[1]. When I received the "ClassNotFound" error, my thought was that
> the dep should be installed in "implementation" mode. When I tried that, I
> get other more strange errors when I try to run my pipeline:
> "java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.beam.sdk.coders.CoderRegistry".
>
> My deps are like so:
> implementation "org.apache.beam:beam-sdks-java-core:${beamVersion}"
> implementation "org.apache.beam:beam-sdks-java-io-parquet:${beamVersion}"
> ...
>
> Not sure why the CoderRegistry error comes up at runtime when both of the
> above deps are included.
>
> [1]
> https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-io-parquet/2.46.0
>
> On Fri, Apr 21, 2023 at 2:34 AM Alexey Romanenko <aromanenko....@gmail.com>
> wrote:
>
>> Just curious. where it was documented like this?
>>
>> I briefly checked it on Maven Central [1] and the provided code snippet
>> for Gradle uses “implementation” scope.
>>
>> --
>> Alexey
>>
>> [1]
>> https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-parquet/2.46.0/jar
>>
>> > On 21 Apr 2023, at 01:52, Evan Galpin <egal...@apache.org> wrote:
>> >
>> > Hi all,
>> >
>> > I'm trying to make use of ParquetIO. Based on what's documented in
>> > maven central, I'm including the artifact in "compileOnly" mode (or in
>> > maven parlance, 'provided' scope). I can successfully compile my pipeline,
>> > but when I run it I (intuitively?) am met with a ClassNotFound exception
>> > for ParquetIO.
>> >
>> > Is 'compileOnly' still the desired way to include ParquetIO as a
>> > pipeline dependency?
>> >
>> > Thanks,
>> > Evan