For what it's worth, we believe we're able to work around this issue by
adding the following line to our flink-conf.yaml:

classloader.parent-first-patterns.additional: javax.xml.;org.apache.xerces.
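
For context: our understanding (from the Flink class-loading docs) is that
this makes classes matching those package prefixes resolve parent-first,
i.e. from the parent classpath, instead of through Flink's default
child-first user-code classloader. As a quick sanity check, here's a
minimal sketch (plain Java, not Flink API; the class name is the one from
the stack trace below) to see which classloader actually serves the class:

    // Prints the classloader that serves the class; null means the JVM
    // bootstrap loader, i.e. rt.jar on Java 8.
    public class WhoLoadsIt {
        public static void main(String[] args) throws Exception {
            Class<?> c = Class.forName("javax.xml.bind.DatatypeConverterImpl");
            System.out.println(c.getClassLoader());
        }
    }

Run standalone on java 8 this prints null (bootstrap loader); run inside a
job it shows which loader wins after the config change.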


On Thu, Dec 6, 2018 at 2:28 AM Chesnay Schepler <ches...@apache.org> wrote:

> Small correction: Flink 1.7 does not support jdk9; we only fixed some of
> the issues, not all of them.
>
> On 06.12.2018 07:13, Mike Mintz wrote:
>
> Hi Flink developers,
>
> We're running some new DataStream jobs on Flink 1.7.0 using the shaded
> Hadoop S3 file system, and we're running into frequent errors saving
> checkpoints and savepoints to S3. I'm not sure what the underlying cause
> of the error is, but we often fail with the following stack trace, which
> appears to be due to the javax.xml.bind.DatatypeConverterImpl class being
> missing in an error-handling path of AmazonS3Client.
>
> java.lang.NoClassDefFoundError: Could not initialize class
> javax.xml.bind.DatatypeConverterImpl
>     at
> javax.xml.bind.DatatypeConverter.initConverter(DatatypeConverter.java:140)
>     at
> javax.xml.bind.DatatypeConverter.printBase64Binary(DatatypeConverter.java:611)
>     at
> org.apache.flink.fs.s3base.shaded.com.amazonaws.util.Base64.encodeAsString(Base64.java:62)
>     at
> org.apache.flink.fs.s3base.shaded.com.amazonaws.util.Md5Utils.md5AsBase64(Md5Utils.java:104)
>     at
> org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1647)
>     at
> org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.putObjectDirect(S3AFileSystem.java:1531)
>
> I uploaded the full stack trace at
> https://gist.github.com/mikemintz/4769fc7bc3320c84ac97061e951041a0
>
> For reference, we're running flink from the "Apache 1.7.0 Flink only Scala
> 2.11" binary tgz, we've copied flink-s3-fs-hadoop-1.7.0.jar from opt/ to
> lib/, we're not defining HADOOP_CLASSPATH, and we're running java 8
> (openjdk version "1.8.0_191") on Ubuntu 18.04 x86_64.
>
> Presumably there are two issues: 1) some intermittent error talking to
> S3, and 2) some classpath / class-loading issue with
> javax.xml.bind.DatatypeConverterImpl that's preventing the original error
> from being displayed. I'm more curious about the latter issue.
>
> This is super puzzling since javax/xml/bind/DatatypeConverterImpl.class is
> included in our rt.jar, and lsof confirms we're reading that rt.jar, so I
> suspect it's something tricky with custom class loaders or the way the
> shaded S3 jar works. Note that this class is not included in
> flink-s3-fs-hadoop-1.7.0.jar (which we are using), but it is included in
> flink-shaded-hadoop2-uber-1.7.0.jar (which we are not using).
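>
> One general JVM detail that may be relevant here (not Flink-specific):
> "NoClassDefFoundError: Could not initialize class X" typically means the
> class *was* found, but its static initializer already failed once, and the
> original ExceptionInInitializerError (carrying the real cause) was
> swallowed somewhere earlier. A minimal sketch with a made-up class, just
> to illustrate the behavior:
>
> public class InitFailureDemo {
>     // Static initializer that always fails; the if (true) guard is needed
>     // because javac rejects an initializer that cannot complete normally.
>     static class Broken {
>         static { if (true) throw new RuntimeException("the real cause"); }
>     }
>     public static void main(String[] args) {
>         try { new Broken(); } catch (Throwable t) {
>             System.out.println(t); // ExceptionInInitializerError, cause attached
>         }
>         try { new Broken(); } catch (Throwable t) {
>             System.out.println(t); // NoClassDefFoundError: Could not initialize class ...
>         }
>     }
> }
>
> So the interesting failure may have happened earlier in the job, and only
> this later, cause-less error is what surfaces in our logs.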
>
> Another thing that jumped out at us is that Flink 1.7 can now be built
> with JDK 9, but Java 9 deprecates the javax.xml.bind libraries, which now
> require explicit inclusion as a module [0]. We also saw that direct
> references to javax.xml.bind were removed from flink-core for 1.7 [1].
>
> Some things we tried, without success:
>
>    - Building flink from source on a computer with java 8 installed. We
>    still got NoClassDefFoundError.
>    - Using the binary version of Flink on machines with java 9 installed.
>    We got a NullPointerException in ClosureCleaner.
>    - Downloading the jaxb-api jar [2], which contains
>    javax/xml/bind/DatatypeConverterImpl.class, and setting HADOOP_CLASSPATH
>    to include that jar (see the sketch after this list). We still got
>    NoClassDefFoundError.
>    - Using iptables to completely block S3 traffic, hoping this would
>    make the failure easier to reproduce. Those connection errors were
>    displayed properly, so they must go down a different error-handling path.
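>
> To follow up on the jaxb-api attempt, here's the kind of check that might
> narrow things down (a sketch in plain Java; the choice of loaders to probe
> is our guess): asking different classloaders whether they can see the
> class file at all:
>
> // Prints a jar:file: URL if the loader can see the class file, else null.
> public class WhoSeesIt {
>     public static void main(String[] args) {
>         String res = "javax/xml/bind/DatatypeConverterImpl.class";
>         System.out.println(ClassLoader.getSystemClassLoader().getResource(res));
>         System.out.println(Thread.currentThread().getContextClassLoader().getResource(res));
>     }
> }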
>
> Would love to hear any ideas about what might be happening, or further
> ideas we can try.
>
> Thanks!
> Mike
>
> [0]
> http://cr.openjdk.java.net/~iris/se/9/java-se-9-fr-spec/#APIs-proposed-for-removal
>
> [1] https://github.com/apache/flink/pull/6801
>
> [2] https://mvnrepository.com/artifact/javax.xml.bind/jaxb-api/2.3.1
>
>
>
