Fix SDK deps and enable strict enforcement

After this change, the analysis by `mvn dependency:analyze` for the SDK
will fail if there are any transitive dependency violations.

To enable this, we take the following scope interpretations and adjust
the pom.xml accordingly:

A "compile" scope optional dependency is required for build but left out
of transitive dependencies. We have one of these:

 - org.codehaus.woodstox:stax2-api is an API that the SDK references
   directly, but it will not throw a class loading error until XmlSource
   is used.

A "runtime" scope optional dependency is never referenced directly from
the SDK, but is an implementation referenced indirectly and loaded
lazily. We have two of these:

 - org.tukaani:xz is an optional dependency of Apache Commons, which we
   use to support xz-compressed streams at runtime, though the SDK does
   not reference the xz library directly. Our use of Apache Commons is
   an implementation detail, so xz is really our dependency.

 - org.codehaus.woodstox:woodstox-core-asl is an implementation of
   stax2-api. Such an implementation must be present on the classpath
   when XmlSource is used, but the SDK deliberately does not reference
   a specific implementation.
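
For illustration only, here is a rough sketch of what these scope
interpretations could mean for a downstream project that wants to use
XmlSource. Because both Woodstox artifacts are optional in the SDK, they
are not inherited transitively and would need to be declared by the
consumer. The versions shown are the ones named in the Javadoc text that
this change removes (stax2-api 3.1.1, woodstox-core-asl 4.1.2); treat
them as an assumption, not a recommendation.

```xml
<!-- Hypothetical consumer pom.xml fragment; versions are assumptions. -->
<dependencies>
  <!-- API referenced at compile time, so default (compile) scope. -->
  <dependency>
    <groupId>org.codehaus.woodstox</groupId>
    <artifactId>stax2-api</artifactId>
    <version>3.1.1</version>
  </dependency>

  <!-- One possible stax2 implementation, needed only on the runtime classpath. -->
  <dependency>
    <groupId>org.codehaus.woodstox</groupId>
    <artifactId>woodstox-core-asl</artifactId>
    <version>4.1.2</version>
    <scope>runtime</scope>
  </dependency>
</dependencies>
```

Any other stax2-api implementation could presumably stand in for
woodstox-core-asl here, since the SDK does not reference a specific one.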

----Release Notes----
[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=115604996

Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/89e62414
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/89e62414
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/89e62414

Branch: refs/heads/master
Commit: 89e624141c589be950d792508705966b56f2bc61
Parents: 3111646
Author: klk <[email protected]>
Authored: Thu Feb 25 14:30:23 2016 -0800
Committer: Davor Bonaci <[email protected]>
Committed: Thu Feb 25 23:58:29 2016 -0800

----------------------------------------------------------------------
 sdk/pom.xml                                     | 25 +++++++++++++++++---
 .../google/cloud/dataflow/sdk/io/XmlSource.java | 18 ++++++++------
 2 files changed, 33 insertions(+), 10 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/89e62414/sdk/pom.xml
----------------------------------------------------------------------
diff --git a/sdk/pom.xml b/sdk/pom.xml
index 1f15b02..4995da0 100644
--- a/sdk/pom.xml
+++ b/sdk/pom.xml
@@ -141,6 +141,19 @@
         <artifactId>maven-compiler-plugin</artifactId>
       </plugin>
 
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals><goal>analyze-only</goal></goals>
+            <configuration>
+              <failOnWarning>true</failOnWarning>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+
       <!-- Run CheckStyle pass on transforms, as they are release in
            source form. -->
       <plugin>
@@ -661,8 +674,11 @@
     </dependency>
 
     <!--
-    To use com.google.cloud.dataflow.io.XmlSource, please explicitly declare
-    the following two dependencies.
+    To use com.google.cloud.dataflow.io.XmlSource:
+
+    1. Explicitly declare the following dependency for the stax2 API.
+    2. Include a stax2 implementation on the classpath. One example
+       is given below as an optional runtime dependency on woodstox-core-asl
     -->
     <dependency>
       <groupId>org.codehaus.woodstox</groupId>
@@ -675,6 +691,7 @@
       <groupId>org.codehaus.woodstox</groupId>
       <artifactId>woodstox-core-asl</artifactId>
       <version>${woodstox.version}</version>
+      <scope>runtime</scope>
       <optional>true</optional>
       <exclusions>
         <!-- javax.xml.stream:stax-api is included in JDK 1.6+ -->
@@ -687,12 +704,14 @@
 
     <!--
     To use com.google.cloud.dataflow.io.AvroSource with XZ-encoded files,
-    please explicitly declare this dependency.
+    please explicitly declare this dependency to include org.tukaani:xz on
+    the classpath at runtime.
     -->
     <dependency>
       <groupId>org.tukaani</groupId>
       <artifactId>xz</artifactId>
       <version>1.5</version>
+      <scope>runtime</scope>
       <optional>true</optional>
     </dependency>
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/89e62414/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
----------------------------------------------------------------------
diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
index d684d22..1ead391 100644
--- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
+++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
@@ -17,6 +17,7 @@ package com.google.cloud.dataflow.sdk.io;
 import com.google.cloud.dataflow.sdk.coders.Coder;
 import com.google.cloud.dataflow.sdk.coders.JAXBCoder;
 import com.google.cloud.dataflow.sdk.options.PipelineOptions;
+import com.google.cloud.dataflow.sdk.runners.PipelineRunner;
 import com.google.common.base.Preconditions;
 
 import org.codehaus.stax2.XMLInputFactory2;
@@ -94,18 +95,21 @@ import javax.xml.stream.XMLStreamReader;
  * <p>Currently, only XML files that use single-byte characters are supported. Using a file that
  * contains multi-byte characters may result in data loss or duplication.
  *
- * <p>To use {@code XmlSource}, explicitly declare dependencies on following two jars from Woodstox
- * StAX XML parser.
- * (1) stax2-api-3.1.1.jar
- * (2) woodstox-core-asl-4.1.2.jar
- * These dependencies have been declared as optional in Maven sdk/pom.xml file of Google Cloud
- * Dataflow.
+ * <p>To use {@link XmlSource}:
+ * <ol>
+ *   <li>Explicitly declare a dependency on org.codehaus.woodstox:stax2-api</li>
+ *   <li>Include a compatible implementation on the classpath at run-time,
+ *       such as org.codehaus.woodstox:woodstox-core-asl</li>
+ * </ol>
+ *
+ * <p>These dependencies have been declared as optional in Maven sdk/pom.xml file of
+ * Google Cloud Dataflow.
  *
  * <p><h3>Permissions</h3>
  * Permission requirements depend on the
  * {@link com.google.cloud.dataflow.sdk.runners.PipelineRunner PipelineRunner} that is
  * used to execute the Dataflow job. Please refer to the documentation of corresponding
- * {@code PipelineRunner}s for more details.
+ * {@link PipelineRunner PipelineRunners} for more details.
  *
  * @param <T> Type of the objects that represent the records of the XML file. The
  *       {@code PCollection} generated by this source will be of this type.
