[GitHub] incubator-beam pull request #519: avoid failing because of the Windows \r on EO...

2016-06-22 Thread rmannibucau
GitHub user rmannibucau opened a pull request:

https://github.com/apache/incubator-beam/pull/519

avoid failing because of the Windows \r on EOL + fixing copyOne from Fil…

A lighter version of https://github.com/apache/incubator-beam/pull/496
(basically with the Hadoop workaround removed).

The build still doesn't fully pass out of the box (i.e. without setting up
Hadoop locally), but it no longer fails on simple issues such as EOL
differences.
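
As a minimal sketch of the EOL idea (the helper below is hypothetical, not
code from this PR): text written with "\r\n" on Windows compared against
expected output using "\n" breaks assertions, so normalizing both sides
before asserting keeps such tests platform-agnostic.

import org.junit.Assert;

// Hypothetical helper, not part of this PR: normalize line endings so tests
// comparing text output don't fail on Windows "\r\n" vs Unix "\n".
public final class EolNormalizer {
    private EolNormalizer() {
    }

    /** Replaces "\r\n" and lone "\r" with "\n". */
    public static String normalizeEol(String text) {
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    /** EOL-agnostic equality assertion for test code. */
    public static void assertSameText(String expected, String actual) {
        Assert.assertEquals(normalizeEol(expected), normalizeEol(actual));
    }
}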

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmannibucau/incubator-beam BEAM-357_path-handling-fails-on-windows

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/519.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #519


commit 2d3c9fee013c04fbad7013c3e334a5fae73aa53b
Author: Romain manni-Bucau 
Date:   2016-06-22T08:42:45Z

avoid failing because of the Windows \r on EOL + fixing copyOne from
FileBasedSink, which fails on the Windows hard-drive syntax (colon), and
avoiding failures when a folder can't be deleted in WriteSinkITCase - this
easily happens on Windows because of its file-locking mechanism
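
As a minimal sketch of the drive-letter issue (a hypothetical demo class,
not the actual FileBasedSink code): a Windows spec such as C:\tmp\out
contains a colon, so any route that parses it as a URI reads "C" as a
scheme and fails, while new File(...).toPath() keeps it a plain local path.

import java.io.File;
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical demo, not the FileBasedSink code: why a Windows drive letter
// breaks URI-style path handling while File#toPath() copes with it.
public class WindowsColonDemo {
    public static void main(String[] args) {
        String spec = "C:\\tmp\\out";

        // URI parsing treats "C:" as a scheme (and rejects the backslashes),
        // so this route fails for a Windows-style spec.
        try {
            Path viaUri = Paths.get(URI.create(spec));
            System.out.println("via URI: " + viaUri);
        } catch (RuntimeException e) {
            System.out.println("URI route failed: " + e);
        }

        // Treating the spec as a plain file name sidesteps the colon issue.
        Path viaFile = new File(spec).toPath();
        System.out.println("via File#toPath(): " + viaFile);
    }
}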




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[incubator-beam] Git Push Summary

2016-06-21 Thread rmannibucau
Repository: incubator-beam
Updated Branches:
  refs/heads/BEAM-357_windows-build-fails [deleted] 460d21cb7


[1/2] incubator-beam git commit: fixing build on windows

2016-06-21 Thread rmannibucau
Repository: incubator-beam
Updated Branches:
  refs/heads/BEAM-357_windows-build-fails [created] 460d21cb7


fixing build on windows


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/41883300
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/41883300
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/41883300

Branch: refs/heads/BEAM-357_windows-build-fails
Commit: 418833001fe6dd581f42f7fcc3c35ef36f292007
Parents: 0e4d0a9
Author: Romain manni-Bucau 
Authored: Sun Jun 19 21:19:57 2016 +0200
Committer: Romain manni-Bucau 
Committed: Sun Jun 19 21:19:57 2016 +0200

--
 .../beam/runners/flink/WriteSinkITCase.java |  13 +
 .../beam/runners/spark/SimpleWordCountTest.java |   8 +
 .../beam/runners/spark/io/AvroPipelineTest.java |   7 +
 .../beam/runners/spark/io/NumShardsTest.java|   7 +
 .../io/hadoop/HadoopFileFormatPipelineTest.java |   7 +
 .../translation/TransformTranslatorTest.java|   7 +
 .../src/main/resources/beam/checkstyle.xml  |   4 +-
 .../org/apache/beam/sdk/io/FileBasedSink.java   |   7 +-
 .../beam/sdk/testing/HadoopWorkarounds.java | 129 +
 sdks/java/io/hdfs/pom.xml   |   9 +
 .../beam/sdk/io/hdfs/HDFSFileSourceTest.java| 264 ++-
 sdks/java/maven-archetypes/starter/pom.xml  |   3 +
 12 files changed, 334 insertions(+), 131 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/41883300/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
--
diff --git 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
index 36d3aef..1a56350 100644
--- 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
+++ 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
@@ -35,6 +35,7 @@ import org.apache.flink.core.fs.Path;
 import org.apache.flink.test.util.JavaProgramTestBase;
 
 import java.io.File;
+import java.io.IOException;
 import java.io.PrintWriter;
 import java.net.URI;
 
@@ -75,6 +76,18 @@ public class WriteSinkITCase extends JavaProgramTestBase {
     p.run();
   }
 
+
+  @Override
+  public void stopCluster() throws Exception {
+    try {
+      super.stopCluster();
+    } catch (final IOException ioe) {
+      if (ioe.getMessage().startsWith("Unable to delete file")) {
+        // that's ok for the test itself, just the OS playing with us on cleanup phase
+      }
+    }
+  }
+
   /**
    * Simple custom sink which writes to a file.
    */

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/41883300/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
--
diff --git 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
index 2b4464d..4980995 100644
--- 
a/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
+++ 
b/runners/spark/src/test/java/org/apache/beam/runners/spark/SimpleWordCountTest.java
@@ -25,6 +25,7 @@ import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.io.TextIO;
 import org.apache.beam.sdk.options.PipelineOptionsFactory;
+import org.apache.beam.sdk.testing.HadoopWorkarounds;
 import org.apache.beam.sdk.testing.PAssert;
 import org.apache.beam.sdk.transforms.Aggregator;
 import org.apache.beam.sdk.transforms.Count;
@@ -40,11 +41,13 @@ import com.google.common.collect.ImmutableSet;
 import com.google.common.collect.Sets;
 
 import org.apache.commons.io.FileUtils;
+import org.junit.BeforeClass;
 import org.junit.Rule;
 import org.junit.Test;
 import org.junit.rules.TemporaryFolder;
 
 import java.io.File;
+import java.io.IOException;
 import java.util.Arrays;
 import java.util.List;
 import java.util.Set;
@@ -61,6 +64,11 @@ public class SimpleWordCountTest {
   private static final Set<String> EXPECTED_COUNT_SET =
       ImmutableSet.of("hi: 5", "there: 1", "sue: 2", "bob: 2");
 
+  @BeforeClass
+  public static void initWin() throws IOException {
+    HadoopWorkarounds.winTests();
+  }
+
   @Test
   public void testInMem() throws Exception {
     SparkPipelineOptions options = PipelineOptionsFactory.as(SparkPipelineOptions.class);

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/41883300/runners/spark/src/test/java/org/apache/beam/runners/spark/io/AvroPipelineTest.java
--
dif

[2/2] incubator-beam git commit: better comments for win workaround and basic sanity checks for winutils.exe

2016-06-21 Thread rmannibucau
better comments for win workaround and basic sanity checks for winutils.exe
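
As a minimal sketch of what such a sanity check can look like (hypothetical
names, not necessarily the exact checks in this commit): verify the
downloaded winutils.exe exists and is non-empty before handing its folder
to Hadoop.

import java.io.File;
import java.io.IOException;

// Hypothetical sketch of a basic winutils.exe sanity check.
final class WinutilsSanity {
    private WinutilsSanity() {
    }

    /** Fails fast if the download produced a missing or empty winutils.exe. */
    static void checkDownloaded(File winutils) throws IOException {
        if (!winutils.isFile() || winutils.length() == 0) {
            throw new IOException("winutils.exe missing or empty at " + winutils
                    + "; the download probably failed");
        }
    }
}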


Project: http://git-wip-us.apache.org/repos/asf/incubator-beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-beam/commit/460d21cb
Tree: http://git-wip-us.apache.org/repos/asf/incubator-beam/tree/460d21cb
Diff: http://git-wip-us.apache.org/repos/asf/incubator-beam/diff/460d21cb

Branch: refs/heads/BEAM-357_windows-build-fails
Commit: 460d21cb7070603f789da9d13e12668194c91e9b
Parents: 4188330
Author: Romain manni-Bucau 
Authored: Tue Jun 21 10:37:05 2016 +0200
Committer: Romain manni-Bucau 
Committed: Tue Jun 21 10:37:05 2016 +0200

--
 .../beam/runners/flink/WriteSinkITCase.java |   2 +-
 .../beam/sdk/testing/HadoopWorkarounds.java | 109 +--
 sdks/java/io/hdfs/pom.xml   |   9 --
 sdks/java/maven-archetypes/starter/pom.xml  |   6 +-
 4 files changed, 104 insertions(+), 22 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/460d21cb/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
--
diff --git 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
index 1a56350..bb3778d 100644
--- 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
+++ 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/WriteSinkITCase.java
@@ -54,7 +54,7 @@ public class WriteSinkITCase extends JavaProgramTestBase {
 
   @Override
   protected void preSubmit() throws Exception {
-    resultPath = getTempDirPath("result");
+    resultPath = getTempDirPath("result-" + System.nanoTime());
   }
 
   @Override

http://git-wip-us.apache.org/repos/asf/incubator-beam/blob/460d21cb/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/HadoopWorkarounds.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/HadoopWorkarounds.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/HadoopWorkarounds.java
index ee2e135..1c2aa20 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/HadoopWorkarounds.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/HadoopWorkarounds.java
@@ -17,6 +17,8 @@
  */
 package org.apache.beam.sdk.testing;
 
+import static java.util.Arrays.asList;
+
 import org.apache.commons.compress.utils.IOUtils;
 
 import java.io.File;
@@ -26,15 +28,21 @@ import java.io.InputStream;
 import java.io.OutputStream;
 import java.net.MalformedURLException;
 import java.net.URL;
+import java.nio.file.Files;
+import java.util.Arrays;
 
 /**
  * A simple class ensure winutils.exe can be found in the JVM.
+ * 
+ * See http://wiki.apache.org/hadoop/WindowsProblems for details.
+ * 
+ * Note: don't forget to add org.bouncycastle:bcpg-jdk16 dependency to use it.
  */
 public class HadoopWorkarounds {
 /**
  * In practise this method only needs to be called once by JVM
  * since hadoop uses static variables to store it.
- *
+ * 
  * Note: ensure invocation is done before hadoop reads it
  * and ensure this folder survives tests
  * (avoid temporary folder usage since tests can share it).
@@ -51,6 +59,8 @@ public class HadoopWorkarounds {
         // hadoop doesn't have winutils.exe :(: https://issues.apache.org/jira/browse/HADOOP-10051
         // so use this github repo temporarly then just use the main tar.gz
         /*
+        // note this commented code requires commons-compress dependency (to add if we use that)
+
         String hadoopVersion = VersionInfo.getVersion();
         final URL url = new URL("https://archive.apache.org/dist/hadoop/common/hadoop-"
             + hadoopVersion + "/hadoop-" + hadoopVersion + ".tar.gz");
@@ -97,19 +107,49 @@
                 + "-Dhadoop.home.dir so we'll download winutils.exe");
 
         new File(hadoopHome, "bin").mkdirs();
-        final URL url;
-        try {
-            url = new URL("https://github.com/steveloughran/winutils/"
-                + "raw/master/hadoop-2.7.1/bin/winutils.exe");
-        } catch (final MalformedURLException e) { // unlikely
-            throw new IllegalArgumentException(e);
+        final File winutils = new File(hadoopHome, "bin/winutils.exe");
+
+        for (final String suffix : asList("", ".asc")) {
+            final URL url;
+            try {
+                // this code is not a random URL - read HADOOP-10051
+                // it is provided and signed with an ASF gpg key.
+
+                // note: 2.6.3 cause 2.6.4, 2.7.1 don't have .asc
+                url = new UR

[GitHub] incubator-beam pull request #496: [BEAM-357] fixing build on windows

2016-06-19 Thread rmannibucau
GitHub user rmannibucau opened a pull request:

https://github.com/apache/incubator-beam/pull/496

[BEAM-357] fixing build on windows

Small fixes and workarounds for Windows.

Main issues were:

- Paths.get(...): Windows paths contain ":", which makes it fail; using new
File(...).toPath() fixes it
- A workaround for Hadoop, which requires winutils.exe on Windows (see
https://issues.apache.org/jira/browse/BEAM-357 for a comment on it; a sketch
of the idea follows this list)
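
As a minimal sketch of the winutils idea (hypothetical class and method
names; the PR itself ships a fuller HadoopWorkarounds class that can also
download winutils.exe): Hadoop resolves winutils.exe from the
hadoop.home.dir system property during static initialization, so the
property must point at a folder containing bin/winutils.exe before the
first Hadoop class loads.

import java.io.File;

// Hypothetical sketch: point hadoop.home.dir at a folder containing
// bin/winutils.exe before any Hadoop class reads the property.
public final class WinutilsSetup {
    private WinutilsSetup() {
    }

    public static void ensureHadoopHome(File hadoopHome) {
        if (!System.getProperty("os.name", "").toLowerCase().contains("win")) {
            return; // only needed on Windows
        }
        File winutils = new File(hadoopHome, "bin/winutils.exe");
        if (!winutils.isFile()) {
            throw new IllegalStateException("Expected " + winutils
                    + "; download it or pass -Dhadoop.home.dir explicitly");
        }
        // Hadoop reads this system property once (statically), so set it
        // before the first Hadoop class is loaded.
        System.setProperty("hadoop.home.dir", hadoopHome.getAbsolutePath());
    }
}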

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rmannibucau/incubator-beam BEAM-357_windows-build-fails

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-beam/pull/496.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #496


commit 418833001fe6dd581f42f7fcc3c35ef36f292007
Author: Romain manni-Bucau 
Date:   2016-06-19T19:19:57Z

fixing build on windows



