Fix SDK deps and enable strict enforcement

After this change, the `mvn dependency:analyze` check for the SDK will fail
if there are any transitive dependency violations, such as a dependency
that the code uses but the pom.xml does not declare directly.

To enable this, we take the following scope interpretations and adjust
the pom.xml accordingly:

A "compile" scope optional dependency is required to build the SDK but is
left out of the transitive dependencies seen by consumers. We have one of
these:

 - org.codehaus.woodstox:stax2-api is an API that the SDK references
   directly, but its absence will not cause a class loading error until
   XmlSource is used.
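
As a rough illustration of why a missing optional jar only matters once the
class that needs it is touched, here is a minimal sketch of a classpath
probe. The class OptionalDependencyProbe and the method isOnClasspath are
hypothetical names invented for this example and are not part of the SDK;
only org.codehaus.stax2.XMLInputFactory2 comes from the diff below.

```java
// Hypothetical helper, not part of the SDK: reports whether a class can be
// located by name without initializing it.
final class OptionalDependencyProbe {
    private OptionalDependencyProbe() {}

    /** Returns true if the named class can be found by this class's loader. */
    static boolean isOnClasspath(String className) {
        try {
            // initialize=false: locate and load the class, but do not run its static initializers.
            Class.forName(className, false, OptionalDependencyProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The jar is absent (or broken); nothing fails until someone asks for the class.
            return false;
        }
    }

    public static void main(String[] args) {
        // XMLInputFactory2 is the stax2-api type imported by XmlSource.
        String stax2 = "org.codehaus.stax2.XMLInputFactory2";
        System.out.println(stax2 + " present: " + isOnClasspath(stax2));
    }
}
```

A pipeline that never uses XmlSource never asks the JVM for the stax2 types,
so it is unaffected when the jar is missing.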

A "runtime" scope optional dependency is never referenced directly from the
SDK, but is an implementation referenced indirectly and loaded lazily.
We have two of these:

 - org.tukaani:xz is an optional dependency that supports the xz-compressed
   streams we use from Apache commons at runtime, though the SDK does not
   use the library directly. Our use of Apache commons is an implementation
   detail, so xz is really our dependency.
 - org.codehaus.woodstox:woodstox-core-asl is an implementation of stax2-api.
   Such an implementation must be present on the classpath when XmlSource is
   used, but the SDK deliberately does not reference a specific
   implementation.
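
To illustrate code that compiles against an API only and picks up whatever
implementation is on the classpath at runtime, here is a minimal sketch. It
uses the JDK's standard javax.xml.stream factory lookup as a stand-in and
is not the code in XmlSource, which imports the stax2 XMLInputFactory2
type. The class StaxLookupSketch and its methods are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch, not SDK code: only javax.xml.stream API types are
// referenced; the concrete parser is chosen by the factory at runtime.
final class StaxLookupSketch {
    private StaxLookupSketch() {}

    /** Name of whichever StAX implementation the factory lookup selected. */
    static String implementationName() {
        return XMLInputFactory.newInstance().getClass().getName();
    }

    /** Counts START_ELEMENT events in the given XML text. */
    static int countStartElements(String xml) {
        try {
            XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            int count = 0;
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        count++;
                    }
                }
            } finally {
                reader.close();
            }
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // The printed class name depends on what is on the classpath.
        System.out.println("StAX implementation: " + implementationName());
        System.out.println("Start elements: " + countStartElements("<a><b/><b/></a>"));
    }
}
```

Swapping the implementation jar changes the class name that is printed
without any change to this source, which is the property that motivates the
"runtime" scope above.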

----Release Notes----

[]
-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=115604996


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/89e62414
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/89e62414
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/89e62414

Branch: refs/heads/master
Commit: 89e624141c589be950d792508705966b56f2bc61
Parents: 3111646
Author: klk <[email protected]>
Authored: Thu Feb 25 14:30:23 2016 -0800
Committer: Davor Bonaci <[email protected]>
Committed: Thu Feb 25 23:58:29 2016 -0800

----------------------------------------------------------------------
 sdk/pom.xml                                     | 25 +++++++++++++++++---
 .../google/cloud/dataflow/sdk/io/XmlSource.java | 18 ++++++++------
 2 files changed, 33 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/89e62414/sdk/pom.xml
----------------------------------------------------------------------
diff --git a/sdk/pom.xml b/sdk/pom.xml
index 1f15b02..4995da0 100644
--- a/sdk/pom.xml
+++ b/sdk/pom.xml
@@ -141,6 +141,19 @@
         <artifactId>maven-compiler-plugin</artifactId>
       </plugin>
 
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-dependency-plugin</artifactId>
+        <executions>
+          <execution>
+            <goals><goal>analyze-only</goal></goals>
+            <configuration>
+              <failOnWarning>true</failOnWarning>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>
+
       <!-- Run CheckStyle pass on transforms, as they are release in
            source form. -->
       <plugin>
@@ -661,8 +674,11 @@
     </dependency>
 
     <!--
-    To use com.google.cloud.dataflow.io.XmlSource, please explicitly declare
-    the following two dependencies.
+    To use com.google.cloud.dataflow.io.XmlSource:
+
+    1. Explicitly declare the following dependency for the stax2 API.
+    2. Include a stax2 implementation on the classpath. One example
+       is given below as an optional runtime dependency on woodstox-core-asl
     -->
     <dependency>
       <groupId>org.codehaus.woodstox</groupId>
@@ -675,6 +691,7 @@
       <groupId>org.codehaus.woodstox</groupId>
       <artifactId>woodstox-core-asl</artifactId>
       <version>${woodstox.version}</version>
+      <scope>runtime</scope>
       <optional>true</optional>
       <exclusions>
         <!-- javax.xml.stream:stax-api is included in JDK 1.6+ -->
@@ -687,12 +704,14 @@
 
     <!--
     To use com.google.cloud.dataflow.io.AvroSource with XZ-encoded files,
-    please explicitly declare this dependency.
+    please explicitly declare this dependency to include org.tukaani:xz on
+    the classpath at runtime.
     -->
     <dependency>
       <groupId>org.tukaani</groupId>
       <artifactId>xz</artifactId>
       <version>1.5</version>
+      <scope>runtime</scope>
       <optional>true</optional>
     </dependency>
 

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/89e62414/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
----------------------------------------------------------------------
diff --git a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
index d684d22..1ead391 100644
--- a/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
+++ b/sdk/src/main/java/com/google/cloud/dataflow/sdk/io/XmlSource.java
@@ -17,6 +17,7 @@ package com.google.cloud.dataflow.sdk.io;
 import com.google.cloud.dataflow.sdk.coders.Coder;
 import com.google.cloud.dataflow.sdk.coders.JAXBCoder;
 import com.google.cloud.dataflow.sdk.options.PipelineOptions;
+import com.google.cloud.dataflow.sdk.runners.PipelineRunner;
 import com.google.common.base.Preconditions;
 
 import org.codehaus.stax2.XMLInputFactory2;
@@ -94,18 +95,21 @@ import javax.xml.stream.XMLStreamReader;
  * <p>Currently, only XML files that use single-byte characters are supported. Using a file that
  * contains multi-byte characters may result in data loss or duplication.
  *
- * <p>To use {@code XmlSource}, explicitly declare dependencies on following two jars from Woodstox
- * StAX XML parser.
- * (1) stax2-api-3.1.1.jar
- * (2) woodstox-core-asl-4.1.2.jar
- * These dependencies have been declared as optional in Maven sdk/pom.xml file of Google Cloud
- * Dataflow.
+ * <p>To use {@link XmlSource}:
+ * <ol>
+ *   <li>Explicitly declare a dependency on org.codehaus.woodstox:stax2-api</li>
+ *   <li>Include a compatible implementation on the classpath at run-time,
+ *       such as org.codehaus.woodstox:woodstox-core-asl</li>
+ * </ol>
+ *
+ * <p>These dependencies have been declared as optional in Maven sdk/pom.xml file of
+ * Google Cloud Dataflow.
  *
  * <p><h3>Permissions</h3>
  * Permission requirements depend on the
  * {@link com.google.cloud.dataflow.sdk.runners.PipelineRunner PipelineRunner} that is
  * used to execute the Dataflow job. Please refer to the documentation of corresponding
- * {@code PipelineRunner}s for more details.
+ * {@link PipelineRunner PipelineRunners} for more details.
  *
  * @param <T> Type of the objects that represent the records of the XML file. The
  *        {@code PCollection} generated by this source will be of this type.
