The root cause was actually "java.lang.ClassNotFoundException:
org.apache.hadoop.io.Writable", which I eventually fixed by including
hadoop-common as a dep for my pipeline (below).  Should hadoop-common be
listed as a dep of ParquetIO in the Beam repo itself?

implementation "org.apache.hadoop:hadoop-common:3.2.4"
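
Put together with the Beam deps quoted further down, the dependencies block
would look roughly like the sketch below.  The `java` plugin, the
`mavenCentral()` repository, and the `beamVersion` value of 2.46.0 (taken
from the links in this thread) are assumptions, not confirmed build settings.

```groovy
// Sketch only: the plugin, repository, and beamVersion value are assumptions.
plugins {
    id 'java'
}

repositories {
    mavenCentral()
}

// Assumed version, taken from the 2.46.0 links in this thread.
def beamVersion = '2.46.0'

dependencies {
    implementation "org.apache.beam:beam-sdks-java-core:${beamVersion}"
    implementation "org.apache.beam:beam-sdks-java-io-parquet:${beamVersion}"
    // Supplies org.apache.hadoop.io.Writable at runtime.
    implementation "org.apache.hadoop:hadoop-common:3.2.4"
}
```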

On Fri, Apr 21, 2023 at 10:38 AM Evan Galpin <egal...@apache.org> wrote:

> Oops, I was looking at the "bootleg" mvnrepository search engine, which
> shows `compileOnly` in the copy-pastable dependency installation
> prompts[1].  When I received the "ClassNotFound" error, my thought was that
> the dep should be installed in "implementation" mode.  When I tried that, I
> got other, stranger errors when I tried to run my pipeline:
> "java.lang.NoClassDefFoundError: Could not initialize class
> org.apache.beam.sdk.coders.CoderRegistry".
>
> My deps are like so:
>     implementation "org.apache.beam:beam-sdks-java-core:${beamVersion}"
>     implementation "org.apache.beam:beam-sdks-java-io-parquet:${beamVersion}"
>     ...
>
> Not sure why the CoderRegistry error comes up at runtime when both of the
> above deps are included.
>
> [1]
> https://mvnrepository.com/artifact/org.apache.beam/beam-sdks-java-io-parquet/2.46.0
>
> On Fri, Apr 21, 2023 at 2:34 AM Alexey Romanenko <aromanenko....@gmail.com>
> wrote:
>
>> Just curious: where was it documented like this?
>>
>> I briefly checked it on Maven Central [1] and the provided code snippet
>> for Gradle uses “implementation” scope.
>>
>> —
>> Alexey
>>
>> [1]
>> https://search.maven.org/artifact/org.apache.beam/beam-sdks-java-io-parquet/2.46.0/jar
>>
>> > On 21 Apr 2023, at 01:52, Evan Galpin <egal...@apache.org> wrote:
>> >
>> > Hi all,
>> >
>> > I'm trying to make use of ParquetIO.  Based on what's documented in
>> Maven Central, I'm including the artifact in "compileOnly" mode (or in
>> Maven parlance, 'provided' scope).  I can successfully compile my pipeline,
>> but when I run it I (intuitively?) am met with a ClassNotFound exception
>> for ParquetIO.
>> >
>> > Is 'compileOnly' still the desired way to include ParquetIO as a
>> pipeline dependency?
>> >
>> > Thanks,
>> > Evan
>>
>>
