[GitHub] beam pull request #3870: Marks TikaIO as experimental to allow backward-inco...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3870


---


[2/2] beam git commit: This closes #3870

2017-09-19 Thread jbonofre
This closes #3870


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/64123e9d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/64123e9d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/64123e9d

Branch: refs/heads/master
Commit: 64123e9d3c1fd37515184d40ec03fadbf0b412fb
Parents: e8a5282 7d7baf7
Author: Jean-Baptiste Onofré 
Authored: Wed Sep 20 08:56:18 2017 +0200
Committer: Jean-Baptiste Onofré 
Committed: Wed Sep 20 08:56:18 2017 +0200

--
 .../tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java  | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
--




[1/2] beam git commit: Marks TikaIO as experimental to allow backward-incompatible changes

2017-09-19 Thread jbonofre
Repository: beam
Updated Branches:
  refs/heads/master e8a52827e -> 64123e9d3


Marks TikaIO as experimental to allow backward-incompatible changes


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7d7baf7d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7d7baf7d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7d7baf7d

Branch: refs/heads/master
Commit: 7d7baf7d419c873ecb58545c7feaaea25a862c23
Parents: cfbdb61
Author: Eugene Kirpichov 
Authored: Tue Sep 19 19:30:45 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Sep 19 19:30:45 2017 -0700

--
 .../tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java  | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/7d7baf7d/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java
--
diff --git 
a/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java 
b/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java
index 5d6eea7..4876dcf 100644
--- a/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java
+++ b/sdks/java/io/tika/src/main/java/org/apache/beam/sdk/io/tika/TikaIO.java
@@ -21,7 +21,7 @@ import static 
com.google.common.base.Preconditions.checkNotNull;
 import com.google.auto.value.AutoValue;
 
 import javax.annotation.Nullable;
-
+import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.coders.Coder;
 import org.apache.beam.sdk.coders.StringUtf8Coder;
 import org.apache.beam.sdk.io.Read.Bounded;
@@ -53,7 +53,10 @@ import org.apache.tika.metadata.Metadata;
  * // A simple Read of a local PDF file (only runs locally):
 * PCollection<String> content = p.apply(TikaInput.from("/local/path/to/file.pdf"));
  * }
+ *
+ * Warning: the API of this IO is likely to change in the next release.
  */
+@Experimental(Experimental.Kind.SOURCE_SINK)
 public class TikaIO {
 
   /**

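For readers unfamiliar with the pattern in the diff above: `@Experimental` is a plain runtime annotation that marks a public class as subject to backward-incompatible change, and tooling or users can inspect it reflectively. The following self-contained sketch uses a simplified stand-in for Beam's `org.apache.beam.sdk.annotations.Experimental` (the real one lives in the Beam SDK and carries more `Kind` values); class and annotation names here are illustrative only.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class ExperimentalDemo {
    // Simplified stand-in for Beam's @Experimental(Experimental.Kind.SOURCE_SINK).
    // RUNTIME retention makes the marker visible via reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Experimental {
        Kind value();
        enum Kind { SOURCE_SINK }
    }

    // Analogous to annotating TikaIO: the class stays public and usable,
    // but the marker signals its API may change incompatibly.
    @Experimental(Experimental.Kind.SOURCE_SINK)
    static class TikaIoLike {}

    public static void main(String[] args) {
        Experimental marker = TikaIoLike.class.getAnnotation(Experimental.class);
        // prints SOURCE_SINK
        System.out.println(marker != null ? marker.value() : "stable");
    }
}
```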


Build failed in Jenkins: beam_PerformanceTests_JDBC #212

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Fix typo in variable name: window_fn --> windowfn

[iemejia] [BEAM-2764] Change document size range to fix flakiness on SolrIO 
tests

[lcwik] Add fn API progress reporting protos

[iemejia] [BEAM-2787] Fix MongoDbIO exception when trying to write empty bundle

[lcwik] Upgrade snappy-java to version 1.1.4

[lcwik] Upgrade slf4j to version 1.7.25

[chamikara] Updates bigtable.version to 1.0.0-pre3.

[lcwik] Add ThrowingBiConsumer to the set of functional interfaces

[altay] Added snippet tags for documentation

[altay] Added concrete example for CoGroupByKey snippet

[jbonofre] Add license-maven-plugin and some default license merges

[relax] BEAM-934 Fixed build by fixing firebug error.

[relax] BEAM-934 Enabled firebug after fixing the bug.

[relax] BEAM-934 Fixed code after review.

[jbonofre] [BEAM-2328] Add TikaIO

[mingmxu] Add DSLs module

[mingmxu] [BEAM-301] Initial skeleton for Beam SQL

[mingmxu] checkstyle and rename package

[mingmxu] redesign BeamSqlExpression to execute Calcite SQL expression.

[mingmxu] Fix inconsistent mapping for SQL FLOAT

[mingmxu] [BEAM-2158] Implement the arithmetic operators

[mingmxu] [BEAM-2161] Add support for String operators

[mingmxu] [BEAM-2079] Support TextIO as SQL source/sink

[mingmxu] [BEAM-2006] window support Add support for aggregation: global, HOP,

[mingmxu] Support common-used aggregation functions in SQL, including:  

[mingmxu] [BEAM-2195] Implement conditional operator (CASE)

[mingmxu] [BEAM-2234] Change return type of buildBeamPipeline to

[mingmxu] [BEAM-2149] Improved kafka table implemention.

[mingmxu] support UDF

[mingmxu] update JavaDoc.

[mingmxu] [BEAM-2255] Implement ORDER BY

[mingmxu] [BEAM-2288] Refine DSL interface as design doc of BEAM-2010: 1. rename

[mingmxu] [BEAM-2292] Add BeamPCollectionTable to create table from

[mingmxu] fix NoSuchFieldException

[mingmxu] [BEAM-2309] Implement VALUES and add support for data type CHAR (to be

[mingmxu] [BEAM-2310] Support encoding/decoding of TIME and new DECIMAL data 
type

[mingmxu] DSL interface for Beam SQL

[mingmxu] [BEAM-2329] Add ABS and SQRT math functions

[mingmxu] upgrade to version 2.1.0-SNAPSHOT

[mingmxu] rename SQL to Sql in class name

[mingmxu] [BEAM-2247] Implement date functions in SQL DSL

[mingmxu] [BEAM-2325] Support Set operator: intersect & except

[mingmxu] Add ROUND function on DSL_SQL branch.

[mingmxu] register table for both BeamSql.simpleQuery and BeamSql.query

[mingmxu] Add NOT operator on DSL_SQL branch (plus some refactoring)

[mingmxu] [BEAM-2444] BeamSql: use java standard exception

[mingmxu] [BEAM-2442] BeamSql surface api test.

[mingmxu] [BEAM-2440] BeamSql: reduce visibility

[mingmxu] Remove unused BeamPipelineCreator class

[mingmxu] [BEAM-2443] apply AutoValue to BeamSqlRecordType

[mingmxu] Update filter/project/aggregation tests to use BeamSql

[mingmxu] Remove UnsupportedOperationVisitor, which is currently just a no-op

[mingmxu] restrict the scope of BeamSqlEnv

[mingmxu] Add ACOS, ASIN, ATAN, COS, COT, DEGREES, RADIANS, SIN, TAN, SIGN, LN,

[mingmxu] [BEAM-2477] BeamAggregationRel should use Combine.perKey instead of

[mingmxu] use static table name PCOLLECTION in BeamSql.simpleQuery.

[mingmxu] Small fixes to make the example run in a runner agnostic way: - Add

[mingmxu] [BEAM-2193] Implement FULL, INNER, and OUTER JOIN: - FULL and INNER

[mingmxu] UDAF support: - Adds an abstract class BeamSqlUdaf for defining 
Calcite

[mingmxu] BeamSql: refactor the MockedBeamSqlTable and related tests

[mingmxu] MockedBeamSqlTable -> MockedBoundedTable

[mingmxu] Test unsupported/invalid cases in DSL tests.

[mingmxu] [BEAM-2550] add UnitTest for JOIN in DSL

[mingmxu] support TUMBLE/HOP/SESSION _START function

[mingmxu] Test queries on unbounded PCollections with BeamSql DSL API. Also add

[mingmxu] [BEAM-2564] add integration test for string functions

[mingmxu] CAST operator supporting numeric, date and timestamp types

[mingmxu] POWER function

[mingmxu] support UDF/UDAF in BeamSql

[mingmxu] upgrade pom to 2.2.0-SNAPSHOT

[mingmxu] [BEAM-2560] Add integration test for arithmetic operators.

[mingmxu] cleanup BeamSqlRow

[mingmxu] proposal for new UDF

[mingmxu] [BEAM-2562] Add integration test for logical operators

[mingmxu] [BEAM-2384] CEIL, FLOOR, TRUNCATE, PI, ATAN2 math functions

[mingmxu] [BEAM-2561] add integration test for date functions

[mingmxu] refactor the datetime test to use ExpressionChecker and fix

[mingmxu] [BEAM-2565] add integration test for conditional functions

[mingmxu] rebased, add RAND/RAND_INTEGER

[mingmxu] [BEAM-2621] BeamSqlRecordType -> BeamSqlRowType

[mingmxu] [BEAM-2563] Add integration test for math operators Misc: 1. no SQRT 
in

[mingmxu] [BEAM-2613] add integration test for comparison operators

[mingmxu] remove README.md and update usages in BeamSqlExample

[mingmxu] update pom.xml to

[jira] [Commented] (BEAM-2826) Need to generate a single XML file when write is performed on small amount of data

2017-09-19 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172818#comment-16172818
 ] 

Eugene Kirpichov commented on BEAM-2826:


This will be addressed as part of the FileIO.write() effort. However, what Luke 
suggests above will also work in practice as a workaround.

> Need to generate a single XML file when write is performed on small amount of 
> data
> --
>
> Key: BEAM-2826
> URL: https://issues.apache.org/jira/browse/BEAM-2826
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Affects Versions: 2.0.0
>Reporter: Balajee Venkatesh
>Assignee: Eugene Kirpichov
>
> I'm trying to write an XML file where the source is a text file stored in 
> GCS. The code is running fine but instead of a single XML file, it is 
> generating multiple XML files. (The number of XML files seems to follow the total number of 
> records present in source text file). I have observed this scenario while 
> using 'DataflowRunner'.
> When I run the same code in local then two files get generated. First one 
> contains all the records with proper elements and the second one contains 
> only opening and closing root element.
> As I learnt, it is expected that it may produce multiple files: e.g. if the 
> runner chooses to process your data parallelizing it into 3 tasks 
> ("bundles"), you'll get 3 files. Some of the parts may turn out empty in some 
> cases, but the total data written will always add up to the expected data.
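The bundle behaviour described in the quoted issue (one output shard per bundle the runner chose) can be illustrated with a small self-contained simulation. This sketch has no Beam dependency; the `writeSharded` helper and the shard-style file names are hypothetical stand-ins for a runner plus file-based sink, used only to show why 3 bundles yield 3 files while forcing 1 bundle yields 1.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BundleWriteDemo {
    // Simulates a runner splitting records into N bundles and a file-based
    // sink writing one shard (file) per bundle.
    static Map<String, List<String>> writeSharded(List<String> records, int bundles) {
        Map<String, List<String>> files = new LinkedHashMap<>();
        for (int i = 0; i < records.size(); i++) {
            String shard = "output-" + (i % bundles) + "-of-" + bundles;
            files.computeIfAbsent(shard, k -> new ArrayList<>()).add(records.get(i));
        }
        return files;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c", "d", "e");
        // Runner parallelizes into 3 bundles -> 3 files, contents split across them.
        Map<String, List<String>> parallel = writeSharded(records, 3);
        // Forcing a single bundle/shard -> exactly 1 file with all records.
        Map<String, List<String>> single = writeSharded(records, 1);
        System.out.println(parallel.size() + " files vs " + single.size() + " file");
    }
}
```

In real pipelines the analogous knob is the sink's shard-count setting (where the IO exposes one), which trades parallel write throughput for a single output file.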



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: beam_PerformanceTests_JDBC #211

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Updates.

--
[...truncated 115.30 KB...]

2017-09-20 05:20:55,034 69bade6f MainThread beam_integration_benchmark(1/1) 
ERROR Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2017-09-20 05:21:15,694 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Running: /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

2017-09-20 05:21:15,709 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Ran /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

 Got return code (1).
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2017-09-20 05:21:15,709 69bade6f MainThread beam_integration_benchmark(1/1) 
ERROR Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2017-09-20 05:21:38,263 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Running: /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

2017-09-20 05:21:38,277 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Ran /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

 Got return code (1).
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2017-09-20 05:21:38,278 69bade6f MainThread beam_integration_benchmark(1/1) 
ERROR Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2017-09-20 05:22:07,472 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Running: /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

2017-09-20 05:22:07,488 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Ran /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

 Got return code (1).
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2017-09-20 05:22:07,488 69bade6f MainThread beam_integration_benchmark(1/1) 
ERROR Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2017-09-20 05:22:31,795 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Running: /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

2017-09-20 05:22:31,810 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Ran /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

 Got return code (1).
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2017-09-20 05:22:31,811 69bade6f MainThread beam_integration_benchmark(1/1) 
ERROR Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2017-09-20 05:22:47,270 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Running: /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

2017-09-20 05:22:47,288 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Ran /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

 Got return code (1).
STDOUT: 
STDERR: error: stat /home/jenkins/.kube/config: no such file or directory

2017-09-20 05:22:47,289 69bade6f MainThread beam_integration_benchmark(1/1) 
ERROR Retrying exception running IssueRetryableCommand: Command returned a 
non-zero exit code.

2017-09-20 05:23:04,920 69bade6f MainThread beam_integration_benchmark(1/1) 
INFO Running: /usr/lib/google-cloud-sdk/bin/kubectl 
--kubeconfig=/home/jenkins/.kube/config delete -f 

2017-09-20 0

Build failed in Jenkins: beam_PerformanceTests_JDBC #210

2017-09-19 Thread Apache Jenkins Server
See 


--
GitHub pull request #3668 of commit 8adb779ec7f8b3726928ae3a7ead86bd429bc8bb, 
no merge conflicts.
Setting status of 8adb779ec7f8b3726928ae3a7ead86bd429bc8bb to PENDING with url 
https://builds.apache.org/job/beam_PerformanceTests_JDBC/210/ and message: 
'Build started sha1 is merged.'
Using context: Jenkins: JDBC Performance Test
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/3668/*:refs/remotes/origin/pr/3668/*
 > git rev-parse refs/remotes/origin/pr/3668/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3668/merge^{commit} # timeout=10
Checking out Revision 2198ab86b66cc731d10511516f0f66c662fd3151 
(refs/remotes/origin/pr/3668/merge)
Commit message: "Merge 8adb779ec7f8b3726928ae3a7ead86bd429bc8bb into 
e8a52827e792c9cbce5832e5edcd0399428c24f3"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 2198ab86b66cc731d10511516f0f66c662fd3151
 > git rev-list 2198ab86b66cc731d10511516f0f66c662fd3151 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins1181724838664805286.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins2205535997086746501.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins2435976276380033742.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins2015970367744934407.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://

:66:
 UserWa

Build failed in Jenkins: beam_PerformanceTests_JDBC #209

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Updates.

--
GitHub pull request #3668 of commit 8adb779ec7f8b3726928ae3a7ead86bd429bc8bb, 
no merge conflicts.
Setting status of 8adb779ec7f8b3726928ae3a7ead86bd429bc8bb to PENDING with url 
https://builds.apache.org/job/beam_PerformanceTests_JDBC/209/ and message: 
'Build started sha1 is merged.'
Using context: Jenkins: JDBC Performance Test
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/3668/*:refs/remotes/origin/pr/3668/*
 > git rev-parse refs/remotes/origin/pr/3668/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3668/merge^{commit} # timeout=10
Checking out Revision 2198ab86b66cc731d10511516f0f66c662fd3151 
(refs/remotes/origin/pr/3668/merge)
Commit message: "Merge 8adb779ec7f8b3726928ae3a7ead86bd429bc8bb into 
e8a52827e792c9cbce5832e5edcd0399428c24f3"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 2198ab86b66cc731d10511516f0f66c662fd3151
 > git rev-list 5e22c97cbaa48ad54c40d3cf091a2d2900d8afbf # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins4743726029780413796.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins6888028383971305726.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins3764398704742416428.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins7144182252056535729.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://



Build failed in Jenkins: beam_PerformanceTests_JDBC #208

2017-09-19 Thread Apache Jenkins Server
See 


--
[...truncated 29.48 KB...]
[INFO] Excluding org.slf4j:slf4j-api:jar:1.7.25 from the shaded jar.
[INFO] Excluding com.google.auto.service:auto-service:jar:1.0-rc2 from the 
shaded jar.
[INFO] Excluding com.google.auto:auto-common:jar:0.3 from the shaded jar.
[INFO] Excluding com.google.api.grpc:grpc-google-common-protos:jar:0.1.9 from 
the shaded jar.
[INFO] Replacing original artifact with shaded artifact.
[INFO] Replacing 

 with 

[INFO] Replacing original test artifact with shaded test artifact.
[INFO] Replacing 

 with 

[INFO] Dependency-reduced POM written at: 

[INFO] 
[INFO] --- maven-failsafe-plugin:2.20:integration-test (integration-test) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-dependency-plugin:3.0.1:analyze-only (default) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] No dependency problems found
[INFO] 
[INFO] --- maven-failsafe-plugin:2.20:verify (integration-test) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Tests are skipped.
[INFO] 
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ 
beam-runners-google-cloud-dataflow-java ---
[INFO] Installing 

 to 
/home/jenkins/.m2/repository/org/apache/beam/beam-runners-google-cloud-dataflow-java/2.2.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.2.0-SNAPSHOT.jar
[INFO] Installing 

 to 
/home/jenkins/.m2/repository/org/apache/beam/beam-runners-google-cloud-dataflow-java/2.2.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.2.0-SNAPSHOT.pom
[INFO] Installing 

 to 
/home/jenkins/.m2/repository/org/apache/beam/beam-runners-google-cloud-dataflow-java/2.2.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.2.0-SNAPSHOT-tests.jar
[INFO] Installing 

 to 
/home/jenkins/.m2/repository/org/apache/beam/beam-runners-google-cloud-dataflow-java/2.2.0-SNAPSHOT/beam-runners-google-cloud-dataflow-java-2.2.0-SNAPSHOT-tests.jar
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 45.507 s
[INFO] Finished at: 2017-09-20T04:26:02Z
[INFO] Final Memory: 93M/1378M
[INFO] 
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins4161867277442800029.sh
+ /home/jenkins/tools/maven/latest/bin/mvn -B -e verify -pl sdks/java/io/jdbc 
-Dio-it-suite 
-DpkbLocation=
 -DmvnBinary=/home/jenkins/tools/maven/latest/bin/mvn 
-Dkubectl=/usr/lib/google-cloud-sdk/bin/kubectl 
'-DintegrationTestPipelineOptions=[ "--project=apache-beam-testing", 
"--tempRoot=gs://temp-storage-for-end-to-end-tests" ]'
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO] 
[INFO] Detecting the operating system and CPU architecture
[INFO] 
[INFO] os.detected.name: linux
[INFO] os.detected.arch: x86_64
[INFO] os.detected.version: 3.19
[INFO] os.detected.version.major: 3
[INFO] os.detected.version.minor: 19
[INFO] os.detected.release: ubuntu
[INFO] os.detected.release.version: 14.04
[INFO] os.detected.release.like.ubuntu: true
[INFO] os.detected.release.like.debian: true
[INFO] os.detected.classifier: linux-x86_64
[INFO] 
[INFO] ---

Build failed in Jenkins: beam_PerformanceTests_JDBC #205

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Updates.

--
GitHub pull request #3668 of commit e8fcca5c3306da162cd94f4b8f41a1b322285482, 
no merge conflicts.
Setting status of e8fcca5c3306da162cd94f4b8f41a1b322285482 to PENDING with url 
https://builds.apache.org/job/beam_PerformanceTests_JDBC/205/ and message: 
'Build started sha1 is merged.'
Using context: Jenkins: JDBC Performance Test
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/3668/*:refs/remotes/origin/pr/3668/*
 > git rev-parse refs/remotes/origin/pr/3668/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3668/merge^{commit} # timeout=10
Checking out Revision ddfa58dc1147d691b6f20df45fb2aef9b988b8c0 
(refs/remotes/origin/pr/3668/merge)
Commit message: "Merge e8fcca5c3306da162cd94f4b8f41a1b322285482 into 
e8a52827e792c9cbce5832e5edcd0399428c24f3"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f ddfa58dc1147d691b6f20df45fb2aef9b988b8c0
 > git rev-list 6c420e87156ac89abe3c805d17b0793c70936d43 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins4679550101265306948.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins6459407781373107578.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins3245958149723676082.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins3485054189503520599.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://



Build failed in Jenkins: beam_PerformanceTests_JDBC #204

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[chamikara] Updates.

--
GitHub pull request #3668 of commit 59a80e4a72f61dfe9122105e7995b6f4fa834cf3, 
no merge conflicts.
Setting status of 59a80e4a72f61dfe9122105e7995b6f4fa834cf3 to PENDING with url 
https://builds.apache.org/job/beam_PerformanceTests_JDBC/204/ and message: 
'Build started sha1 is merged.'
Using context: Jenkins: JDBC Performance Test
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/3668/*:refs/remotes/origin/pr/3668/*
 > git rev-parse refs/remotes/origin/pr/3668/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3668/merge^{commit} # timeout=10
Checking out Revision 6c420e87156ac89abe3c805d17b0793c70936d43 
(refs/remotes/origin/pr/3668/merge)
Commit message: "Merge 59a80e4a72f61dfe9122105e7995b6f4fa834cf3 into 
e8a52827e792c9cbce5832e5edcd0399428c24f3"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 6c420e87156ac89abe3c805d17b0793c70936d43
 > git rev-list 9cb15fe295696602f52b632fb0f88913963e58f7 # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins2037062492842766894.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins6430110395004666852.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins1379165088352268310.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins2648523205540860549.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://



Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4004

2017-09-19 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PostCommit_Python_Verify #3169

2017-09-19 Thread Apache Jenkins Server
See 


--
[...truncated 43.98 KB...]
Collecting crcmod<2.0,>=1.7 (from apache-beam==2.2.0.dev0)
Collecting dill==0.2.6 (from apache-beam==2.2.0.dev0)
Requirement already satisfied: grpcio<2.0,>=1.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
apache-beam==2.2.0.dev0)
Collecting httplib2<0.10,>=0.8 (from apache-beam==2.2.0.dev0)
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: six>=1.5.2 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.5-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-CPHesU-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmpzpnp4ppip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "", line 1, in 
File "/tmp/pip-CPHesU-build/setup.py", line 200, in 
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-CPHesU-build/setup.py", line 140, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, pbr, funcsigs, 
mock, pyasn1, pyasn1-modules, rsa, oauth2client, protobuf, pyyaml, typing, 
apache-beam
  Found existing installation: protobuf 3.4
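The traceback above shows setup.py delegating to gen_protos.generate_proto_files(), which refuses to run when the proto sources are not available. A minimal sketch of that fail-fast guard (hypothetical; the function body and path handling are illustrative, not Beam's actual gen_protos.py):

```python
import os

def generate_proto_files(proto_dir):
    # Illustrative guard: fail fast when the .proto sources are missing,
    # e.g. when building from a copied tree rather than the Apache git
    # checkout, instead of producing a broken wheel.
    if not os.path.isdir(proto_dir):
        raise RuntimeError(
            'Not in apache git tree; unable to find proto definitions.')
    # ...otherwise protoc would be invoked on the files under proto_dir...

# Reproducing the failure mode seen in the log:
try:
    generate_proto_files('/nonexistent/model/proto')
except RuntimeError as e:
    print(e)  # Not in apache git tree; unable to find proto definitions.
```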

[jira] [Commented] (BEAM-2271) Release guide or pom.xml needs update to avoid releasing Python binary artifacts

2017-09-19 Thread Kenneth Knowles (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172651#comment-16172651
 ] 

Kenneth Knowles commented on BEAM-2271:
---

It looks like the PR was closed but not merged. I expect that you'll find a 
bunch of extraneous Python files in the release artifacts that you'll need to 
manually remove.

> Release guide or pom.xml needs update to avoid releasing Python binary 
> artifacts
> 
>
> Key: BEAM-2271
> URL: https://issues.apache.org/jira/browse/BEAM-2271
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Daniel Halperin
>Assignee: Sourabh Bajaj
> Fix For: 2.2.0
>
>
> The following directories (and children) were discovered in 2.0.0-RC2 and 
> were present in 0.6.0.
> {code}
> sdks/python: build   dist.eggs   nose-1.3.7-py2.7.egg  (and child 
> contents)
> {code}
> Ideally, these artifacts, which are created during setup and testing, would 
> get created in the {{sdks/python/target/}} subfolder where they will 
> automatically get ignored. More info below.
> For 2.0.0, we will manually remove these files from the source release RC3+. 
> This should be fixed before the next release.
> Here is a list of other paths that get excluded, should they be useful.
> {code}
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/).*${project.build.directory}.*]
> 
> 
>  
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?maven-eclipse\.xml]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.project]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.classpath]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iws]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.idea(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?out(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.ipr]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iml]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.settings(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.externalToolBuilders(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.deployables(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.wtpmodules(/.*)?]
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?cobertura\.ser]
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?pom\.xml\.releaseBackup]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?release\.properties]
>   
> {code}
> This list is stored inside of this jar, which you can find by tracking 
> maven-assembly-plugin from the root apache pom: 
> https://mvnrepository.com/artifact/org.apache.apache.resources/apache-source-release-assembly-descriptor/1.0.6
> http://svn.apache.org/repos/asf/maven/pom/tags/apache-18/pom.xml



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
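The %regex excludes quoted above all share one shape: a negative lookahead that lets the exclusion fire only when no chain of leading path components reaches a src/ directory first, so checked-in sources are never dropped. A small testable sketch of the same pattern, with a literal "target" standing in for ${project.build.directory}:

```python
import re

# Simplified form of the assembly exclude:
#   %regex[(?!((?!target/)[^/]+/)*src/).*target.*]
# The lookahead rejects any path whose leading components reach src/
# before hitting target/, so only build output gets excluded.
exclude = re.compile(r'(?!((?!target/)[^/]+/)*src/).*target.*')

print(bool(exclude.match('sdks/python/target/build/out.txt')))  # True: excluded
print(bool(exclude.match('src/target/fixtures/data.txt')))      # False: kept
```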


[GitHub] beam pull request #3870: Marks TikaIO as experimental to allow backward-inco...

2017-09-19 Thread jkff
GitHub user jkff opened a pull request:

https://github.com/apache/beam/pull/3870

Marks TikaIO as experimental to allow backward-incompatible changes

R: @jbonofre 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jkff/incubator-beam tikaio-experimental

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3870.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3870


commit 7d7baf7d419c873ecb58545c7feaaea25a862c23
Author: Eugene Kirpichov 
Date:   2017-09-20T02:30:45Z

Marks TikaIO as experimental to allow backward-incompatible changes




---


Build failed in Jenkins: beam_PerformanceTests_JDBC #202

2017-09-19 Thread Apache Jenkins Server
See 


--
GitHub pull request #3668 of commit b76e566d26be4ab9eefe83308402116225ff49bc, 
no merge conflicts.
Setting status of b76e566d26be4ab9eefe83308402116225ff49bc to PENDING with url 
https://builds.apache.org/job/beam_PerformanceTests_JDBC/202/ and message: 
'Build started sha1 is merged.'
Using context: Jenkins: JDBC Performance Test
[EnvInject] - Loading node environment variables.
Building remotely on beam1 (beam) in workspace 

 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url https://github.com/apache/beam.git # timeout=10
Fetching upstream changes from https://github.com/apache/beam.git
 > git --version # timeout=10
 > git fetch --tags --progress https://github.com/apache/beam.git 
 > +refs/heads/*:refs/remotes/origin/* 
 > +refs/pull/3668/*:refs/remotes/origin/pr/3668/*
 > git rev-parse refs/remotes/origin/pr/3668/merge^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/pr/3668/merge^{commit} # timeout=10
Checking out Revision 902706dd52d5a2945f380d001f9c7ef0d91e58ad 
(refs/remotes/origin/pr/3668/merge)
Commit message: "Merge b76e566d26be4ab9eefe83308402116225ff49bc into 
e8a52827e792c9cbce5832e5edcd0399428c24f3"
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 902706dd52d5a2945f380d001f9c7ef0d91e58ad
 > git rev-list 902706dd52d5a2945f380d001f9c7ef0d91e58ad # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
[EnvInject] - Executing scripts and injecting environment variables after the 
SCM step.
[EnvInject] - Injecting as environment variables the properties content 
SPARK_LOCAL_IP=127.0.0.1

[EnvInject] - Variables injected successfully.
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins663335186045730112.sh
+ rm -rf PerfKitBenchmarker
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins6026221639858784193.sh
+ git clone https://github.com/GoogleCloudPlatform/PerfKitBenchmarker.git
Cloning into 'PerfKitBenchmarker'...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins3703854871620553873.sh
+ pip install --user -r PerfKitBenchmarker/requirements.txt
Requirement already satisfied (use --upgrade to upgrade): python-gflags==3.1.1 
in /home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 14))
Requirement already satisfied (use --upgrade to upgrade): jinja2>=2.7 in 
/usr/local/lib/python2.7/dist-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 15))
Requirement already satisfied (use --upgrade to upgrade): setuptools in 
/usr/lib/python2.7/dist-packages (from -r PerfKitBenchmarker/requirements.txt 
(line 16))
Requirement already satisfied (use --upgrade to upgrade): 
colorlog[windows]==2.6.0 in /home/jenkins/.local/lib/python2.7/site-packages 
(from -r PerfKitBenchmarker/requirements.txt (line 17))
  Installing extra requirements: 'windows'
Requirement already satisfied (use --upgrade to upgrade): blinker>=1.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 18))
Requirement already satisfied (use --upgrade to upgrade): futures>=3.0.3 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 19))
Requirement already satisfied (use --upgrade to upgrade): PyYAML==3.12 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 20))
Requirement already satisfied (use --upgrade to upgrade): pint>=0.7 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 21))
Requirement already satisfied (use --upgrade to upgrade): numpy in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 22))
Requirement already satisfied (use --upgrade to upgrade): functools32 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 23))
Requirement already satisfied (use --upgrade to upgrade): contextlib2>=0.5.1 in 
/home/jenkins/.local/lib/python2.7/site-packages (from -r 
PerfKitBenchmarker/requirements.txt (line 24))
Cleaning up...
[beam_PerformanceTests_JDBC] $ /bin/bash -xe /tmp/jenkins2836497723944081920.sh
+ pip install --user -e 'sdks/python/[gcp,test]'
Obtaining 
file://
  Running setup.py 
(path:
 egg_info for package from 
file://

:66:
 UserWar

[jira] [Updated] (BEAM-2345) Version configuration of plugins / dependencies in root pom.xml is inconsistent

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2345:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Version configuration of plugins / dependencies in root pom.xml is 
> inconsistent
> ---
>
> Key: BEAM-2345
> URL: https://issues.apache.org/jira/browse/BEAM-2345
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Jason Kuster
>Assignee: Jason Kuster
>Priority: Minor
> Fix For: 2.3.0
>
>
> Versioning in root pom.xml in some places is controlled by the properties 
> section, sometimes is just inline. Move all versioning of plugins / 
> dependencies to properties section.



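The inconsistency described above is between plugin versions pinned inline and versions centralized in the properties section; the proposed fix moves everything to the latter pattern. A sketch of that pattern (artifact and property names here are only illustrative):

```xml
<properties>
  <surefire.plugin.version>2.20</surefire.plugin.version>
</properties>

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-plugin</artifactId>
      <!-- version resolved from the properties section, not inlined -->
      <version>${surefire.plugin.version}</version>
    </plugin>
  </plugins>
</build>
```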


[jira] [Commented] (BEAM-2956) DataflowRunner incorrectly reports the user agent for the Dataflow distribution

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172626#comment-16172626
 ] 

Reuven Lax commented on BEAM-2956:
--

Can this be closed?

> DataflowRunner incorrectly reports the user agent for the Dataflow 
> distribution
> ---
>
> Key: BEAM-2956
> URL: https://issues.apache.org/jira/browse/BEAM-2956
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: 2.2.0
>
>
> The DataflowRunner when distributed with the Dataflow SDK distribution may 
> incorrectly submit a user agent and properties from the Apache Beam 
> distribution.
> This occurs when the Apache Beam jars appear on the classpath before the 
> Dataflow SDK distribution. The fix is to not have two files at the same path 
> but to use two different paths, where the lack of the second path means that 
> we are using the Apache Beam distribution and its existence implies we are 
> using the Dataflow distribution.



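The fix described above replaces two files at one classpath path (where ordering decides the winner) with two distinct paths, so presence alone identifies the distribution. A sketch of that presence check (resource names are illustrative, not Beam's actual paths):

```python
# With distinct paths, classpath ordering no longer matters: the
# Dataflow-specific resource can only come from the Dataflow
# distribution, so seeing it is unambiguous.
def distribution_name(classpath_resources):
    if 'dataflow-distribution.properties' in classpath_resources:
        return 'Dataflow'
    return 'Apache Beam'

print(distribution_name({'beam-sdk.properties'}))  # Apache Beam
print(distribution_name({'beam-sdk.properties',
                         'dataflow-distribution.properties'}))  # Dataflow
```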


[jira] [Updated] (BEAM-2604) Delegate beam metrics to runners

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2604:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Delegate beam metrics to runners
> 
>
> Key: BEAM-2604
> URL: https://issues.apache.org/jira/browse/BEAM-2604
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, runner-spark
>Reporter: Cody
>Assignee: Aljoscha Krettek
> Fix For: 2.3.0
>
>
> Delegate beam metrics to runners to avoid forwarding updates, i.e., extract 
> updates from beam metrics and commit updates in runners.
> For Flink/Spark runners, we can reference metrics within runner's metrics 
> system in beam pipelines and update them directly without forwarding.



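The delegation idea above — write through to the runner's native metric object instead of accumulating updates in the SDK and forwarding them — can be sketched as follows (all class names hypothetical):

```python
class RunnerCounter:
    """Stands in for a native Flink/Spark counter."""
    def __init__(self):
        self.value = 0

    def inc(self, n=1):
        self.value += n

class DelegatingCounter:
    """A Beam-facing counter that writes through to the runner's metric
    object, so no separate update-forwarding step is needed."""
    def __init__(self, delegate):
        self._delegate = delegate

    def inc(self, n=1):
        self._delegate.inc(n)

native = RunnerCounter()
counter = DelegatingCounter(native)
counter.inc(3)
print(native.value)  # 3: the runner sees the update directly
```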


[jira] [Updated] (BEAM-2603) Add Meter in beam metrics

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2603:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Add Meter in beam metrics
> -
>
> Key: BEAM-2603
> URL: https://issues.apache.org/jira/browse/BEAM-2603
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-core, sdk-java-core
>Reporter: Cody
>Assignee: Cody
> Fix For: 2.3.0
>
>
> 1. Add Meter interface and implementation
> 2. Add MeterData, MeterResult. Include MeterData in metric updates, and 
> MeterResult in metric query results.
> 3. Add corresponding changes regarding MeterResult and MeterData.



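A dropwizard-style Meter, as proposed above, counts events and reports rates. A minimal illustrative sketch (the real proposal also covers MeterData/MeterResult and exponentially weighted moving rates):

```python
import time

class Meter:
    # Count marked events and report a mean rate over elapsed time.
    # The clock is injectable so the rate is testable deterministically.
    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.count = 0

    def mark(self, n=1):
        self.count += n

    def mean_rate(self):
        elapsed = self._clock() - self._start
        return self.count / elapsed if elapsed > 0 else 0.0

m = Meter()
m.mark(10)
print(m.count)  # 10
```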


[jira] [Updated] (BEAM-2602) Fully support dropwizard metrics in beam and runners

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2602:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Fully support dropwizard metrics in beam and runners
> 
>
> Key: BEAM-2602
> URL: https://issues.apache.org/jira/browse/BEAM-2602
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-core, runner-flink, runner-spark, sdk-java-core
>Affects Versions: Not applicable
>Reporter: Cody
>Assignee: Cody
> Fix For: 2.3.0
>
>
> As proposed at 
> https://docs.google.com/document/d/1-35iyCIJ9P4EQONlakgXBFRGUYoOLanq2Uf2sw5EjJw/edit?usp=sharing
>  , I'd like to add full support of dropwizard metrics by delegating beam 
> metrics to runners.
> The proposal involves a few subtasks, as far as I see, including:
> 1. add Meter interface in sdk-java-core and extend Distribution to support 
> quantiles
> 2. add MeterData, extend DistributionData. Merge 
> {Counter/Meter/Gauge/Distribution}Data instead of 
> Counter/Meter/Gauge/Distribution at accumulators.
> 3. Runner changes over improved metrics. 
> I will create subtasks later if there's no objection.



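Subtask 1 above mentions extending Distribution to support quantiles. A minimal illustrative sketch of what that extension could look like (the actual design is in the linked proposal doc):

```python
class Distribution:
    # Record values and answer nearest-rank quantile queries.
    def __init__(self):
        self._values = []

    def update(self, value):
        self._values.append(value)

    def quantile(self, q):
        ordered = sorted(self._values)
        idx = min(len(ordered) - 1, int(q * len(ordered)))
        return ordered[idx]

d = Distribution()
for v in [5, 1, 9, 3, 7]:
    d.update(v)
print(d.quantile(0.5))  # 5, the median of the recorded values
```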


[jira] [Commented] (BEAM-2576) Move non-core transform payloads out of Runner API proto

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172624#comment-16172624
 ] 

Reuven Lax commented on BEAM-2576:
--

Moving to 2.3.0

> Move non-core transform payloads out of Runner API proto
> 
>
> Key: BEAM-2576
> URL: https://issues.apache.org/jira/browse/BEAM-2576
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: portability
> Fix For: 2.3.0
>
>
> The presence of e.g. WriteFilesPayload in beam_runner_api.proto makes it 
> appear as though this is a core part of the model. While it is a very
> important transform, this is actually just a payload for a composite, like 
> any other, and should not be treated so specially.





[jira] [Updated] (BEAM-2576) Move non-core transform payloads out of Runner API proto

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2576:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Move non-core transform payloads out of Runner API proto
> 
>
> Key: BEAM-2576
> URL: https://issues.apache.org/jira/browse/BEAM-2576
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>  Labels: portability
> Fix For: 2.3.0
>
>
> The presence of e.g. WriteFilesPayload in beam_runner_api.proto makes it 
> appear as though this is a core part of the model. While it is a very
> important transform, this is actually just a payload for a composite, like 
> any other, and should not be treated so specially.





[jira] [Commented] (BEAM-2271) Release guide or pom.xml needs update to avoid releasing Python binary artifacts

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172623#comment-16172623
 ] 

Reuven Lax commented on BEAM-2271:
--

Can this bug be closed now?

> Release guide or pom.xml needs update to avoid releasing Python binary 
> artifacts
> 
>
> Key: BEAM-2271
> URL: https://issues.apache.org/jira/browse/BEAM-2271
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Daniel Halperin
>Assignee: Sourabh Bajaj
> Fix For: 2.2.0
>
>
> The following directories (and children) were discovered in 2.0.0-RC2 and 
> were present in 0.6.0.
> {code}
> sdks/python: build   dist.eggs   nose-1.3.7-py2.7.egg  (and child 
> contents)
> {code}
> Ideally, these artifacts, which are created during setup and testing, would 
> get created in the {{sdks/python/target/}} subfolder where they will 
> automatically get ignored. More info below.
> For 2.0.0, we will manually remove these files from the source release RC3+. 
> This should be fixed before the next release.
> Here is a list of other paths that get excluded, should they be useful.
> {code}
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/).*${project.build.directory}.*]
> 
> 
>  
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?maven-eclipse\.xml]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.project]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.classpath]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iws]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.idea(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?out(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.ipr]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?[^/]*\.iml]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.settings(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.externalToolBuilders(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.deployables(/.*)?]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?\.wtpmodules(/.*)?]
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?cobertura\.ser]
> 
> 
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?pom\.xml\.releaseBackup]
> 
> %regex[(?!((?!${project.build.directory}/)[^/]+/)*src/)(.*/)?release\.properties]
>   
> {code}
> This list is stored inside of this jar, which you can find by tracking 
> maven-assembly-plugin from the root apache pom: 
> https://mvnrepository.com/artifact/org.apache.apache.resources/apache-source-release-assembly-descriptor/1.0.6
> http://svn.apache.org/repos/asf/maven/pom/tags/apache-18/pom.xml





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4829

2017-09-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1868) CreateStreamTest testMultiOutputParDo is flaky on the Spark runner

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172620#comment-16172620
 ] 

Reuven Lax commented on BEAM-1868:
--

This still hasn't been worked on AFAICT, so bumping to 2.3.0.

> CreateStreamTest testMultiOutputParDo is flaky on the Spark runner
> --
>
> Key: BEAM-1868
> URL: https://issues.apache.org/jira/browse/BEAM-1868
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Thomas Groh
>  Labels: flake
> Fix For: 2.3.0
>
>
> Example excerpt from a Jenkins failure:
> {code}
> Expected: iterable over [<1>, <2>, <3>] in any order
>  but: No item matches: <1>, <2>, <3> in []
>   at 
> org.apache.beam.sdk.testing.PAssert$PAssertionSite.capture(PAssert.java:170)
>   at org.apache.beam.sdk.testing.PAssert.that(PAssert.java:366)
>   at org.apache.beam.sdk.testing.PAssert.that(PAssert.java:358)
>   at 
> org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testMultiOutputParDo(CreateStreamTest.java:387)
> {code}



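The PAssert in the excerpt checks elements in any order; a minimal equivalent (illustrative, not Beam's implementation) compares multisets, which is exactly the check that fails when the stream produces [] instead of [1, 2, 3]:

```python
from collections import Counter

def contains_in_any_order(actual, expected):
    # Multiset equality: same elements with same multiplicities,
    # regardless of order.
    return Counter(actual) == Counter(expected)

print(contains_in_any_order([3, 1, 2], [1, 2, 3]))  # True
print(contains_in_any_order([], [1, 2, 3]))         # False: the flake's symptom
```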


[jira] [Updated] (BEAM-1868) CreateStreamTest testMultiOutputParDo is flaky on the Spark runner

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-1868:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> CreateStreamTest testMultiOutputParDo is flaky on the Spark runner
> --
>
> Key: BEAM-1868
> URL: https://issues.apache.org/jira/browse/BEAM-1868
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Reporter: Thomas Groh
>  Labels: flake
> Fix For: 2.3.0
>
>
> Example excerpt from a Jenkins failure:
> {code}
> Expected: iterable over [<1>, <2>, <3>] in any order
>  but: No item matches: <1>, <2>, <3> in []
>   at 
> org.apache.beam.sdk.testing.PAssert$PAssertionSite.capture(PAssert.java:170)
>   at org.apache.beam.sdk.testing.PAssert.that(PAssert.java:366)
>   at org.apache.beam.sdk.testing.PAssert.that(PAssert.java:358)
>   at 
> org.apache.beam.runners.spark.translation.streaming.CreateStreamTest.testMultiOutputParDo(CreateStreamTest.java:387)
> {code}





Build failed in Jenkins: beam_PerformanceTests_JDBC #201

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[robertwb] Add option for toProto/fromProto translations in DirectRunner, 
disabled

[robertwb] Fix typo in variable name: window_fn --> windowfn

[iemejia] [BEAM-2764] Change document size range to fix flakiness on SolrIO 
tests

[lcwik] Add fn API progress reporting protos

[iemejia] [BEAM-2787] Fix MongoDbIO exception when trying to write empty bundle

[lcwik] Upgrade snappy-java to version 1.1.4

[lcwik] Upgrade slf4j to version 1.7.25

[chamikara] Updates bigtable.version to 1.0.0-pre3.

[lcwik] Add ThrowingBiConsumer to the set of functional interfaces

[altay] Added snippet tags for documentation

[altay] Added concrete example for CoGroupByKey snippet

[jbonofre] Add license-maven-plugin and some default license merges

[relax] BEAM-934 Fixed build by fixing firebug error.

[relax] BEAM-934 Enabled firebug after fixing the bug.

[relax] BEAM-934 Fixed code after review.

[jbonofre] [BEAM-2328] Add TikaIO

[mingmxu] Add DSLs module

[mingmxu] [BEAM-301] Initial skeleton for Beam SQL

[mingmxu] checkstyle and rename package

[mingmxu] redesign BeamSqlExpression to execute Calcite SQL expression.

[mingmxu] Fix inconsistent mapping for SQL FLOAT

[mingmxu] [BEAM-2158] Implement the arithmetic operators

[mingmxu] [BEAM-2161] Add support for String operators

[mingmxu] [BEAM-2079] Support TextIO as SQL source/sink

[mingmxu] [BEAM-2006] window support Add support for aggregation: global, HOP,

[mingmxu] Support common-used aggregation functions in SQL, including:  

[mingmxu] [BEAM-2195] Implement conditional operator (CASE)

[mingmxu] [BEAM-2234] Change return type of buildBeamPipeline to

[mingmxu] [BEAM-2149] Improved kafka table implemention.

[mingmxu] support UDF

[mingmxu] update JavaDoc.

[mingmxu] [BEAM-2255] Implement ORDER BY

[mingmxu] [BEAM-2288] Refine DSL interface as design doc of BEAM-2010: 1. rename

[mingmxu] [BEAM-2292] Add BeamPCollectionTable to create table from

[mingmxu] fix NoSuchFieldException

[mingmxu] [BEAM-2309] Implement VALUES and add support for data type CHAR (to be

[mingmxu] [BEAM-2310] Support encoding/decoding of TIME and new DECIMAL data 
type

[mingmxu] DSL interface for Beam SQL

[mingmxu] [BEAM-2329] Add ABS and SQRT math functions

[mingmxu] upgrade to version 2.1.0-SNAPSHOT

[mingmxu] rename SQL to Sql in class name

[mingmxu] [BEAM-2247] Implement date functions in SQL DSL

[mingmxu] [BEAM-2325] Support Set operator: intersect & except

[mingmxu] Add ROUND function on DSL_SQL branch.

[mingmxu] register table for both BeamSql.simpleQuery and BeamSql.query

[mingmxu] Add NOT operator on DSL_SQL branch (plus some refactoring)

[mingmxu] [BEAM-2444] BeamSql: use java standard exception

[mingmxu] [BEAM-2442] BeamSql surface api test.

[mingmxu] [BEAM-2440] BeamSql: reduce visibility

[mingmxu] Remove unused BeamPipelineCreator class

[mingmxu] [BEAM-2443] apply AutoValue to BeamSqlRecordType

[mingmxu] Update filter/project/aggregation tests to use BeamSql

[mingmxu] Remove UnsupportedOperationVisitor, which is currently just a no-op

[mingmxu] restrict the scope of BeamSqlEnv

[mingmxu] Add ACOS, ASIN, ATAN, COS, COT, DEGREES, RADIANS, SIN, TAN, SIGN, LN,

[mingmxu] [BEAM-2477] BeamAggregationRel should use Combine.perKey instead of

[mingmxu] use static table name PCOLLECTION in BeamSql.simpleQuery.

[mingmxu] Small fixes to make the example run in a runner agnostic way: - Add

[mingmxu] [BEAM-2193] Implement FULL, INNER, and OUTER JOIN: - FULL and INNER

[mingmxu] UDAF support: - Adds an abstract class BeamSqlUdaf for defining 
Calcite

[mingmxu] BeamSql: refactor the MockedBeamSqlTable and related tests

[mingmxu] MockedBeamSqlTable -> MockedBoundedTable

[mingmxu] Test unsupported/invalid cases in DSL tests.

[mingmxu] [BEAM-2550] add UnitTest for JOIN in DSL

[mingmxu] support TUMBLE/HOP/SESSION _START function

[mingmxu] Test queries on unbounded PCollections with BeamSql DSL API. Also add

[mingmxu] [BEAM-2564] add integration test for string functions

[mingmxu] CAST operator supporting numeric, date and timestamp types

[mingmxu] POWER function

[mingmxu] support UDF/UDAF in BeamSql

[mingmxu] upgrade pom to 2.2.0-SNAPSHOT

[mingmxu] [BEAM-2560] Add integration test for arithmetic operators.

[mingmxu] cleanup BeamSqlRow

[mingmxu] proposal for new UDF

[mingmxu] [BEAM-2562] Add integration test for logical operators

[mingmxu] [BEAM-2384] CEIL, FLOOR, TRUNCATE, PI, ATAN2 math functions

[mingmxu] [BEAM-2561] add integration test for date functions

[mingmxu] refactor the datetime test to use ExpressionChecker and fix

[mingmxu] [BEAM-2565] add integration test for conditional functions

[mingmxu] rebased, add RAND/RAND_INTEGER

[mingmxu] [BEAM-2621] BeamSqlRecordType -> BeamSqlRowType

[mingmxu] [BEAM-2563] Add integration test for math operators Misc: 1. no SQRT 
in

[mingmxu] [BEAM-2613] add integration test for comparison operators

[mingm

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4828

2017-09-19 Thread Apache Jenkins Server
See 




[jira] [Closed] (BEAM-2837) Writing To Spanner From Google Cloud DataFlow - Failure

2017-09-19 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-2837.
--
   Resolution: Fixed
Fix Version/s: 2.2.0

> Writing To Spanner From Google Cloud DataFlow - Failure
> ---
>
> Key: BEAM-2837
> URL: https://issues.apache.org/jira/browse/BEAM-2837
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.1.0
> Environment: Google Cloud DataFlow
>Reporter: Al Yaros
>Assignee: Mairbek Khadikov
> Fix For: 2.2.0
>
>
> Simple Java Program That reads from Pub/Sub and Writes to Spanner Fails with 
> cryptic error message.
> Simple Program to Demonstrate the Error:
> [https://github.com/alyaros/ExamplePubSubToSpannerViaDataFlow]
> {code:java}
> *Caused by: org.apache.beam.sdk.util.UserCodeException: 
> java.lang.NoClassDefFoundError: Could not initialize class 
> com.google.cloud.spanner.spi.v1.SpannerErrorInterceptor
> 
> org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:36)
> org.apache.beam.sdk.io.
> gcp.spanner.SpannerWriteGroupFn$DoFnInvoker.invokeSetup(Unknown Source)
> 
> com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.deserializeCopy(DoFnInstanceManagers.java:66)
> 
> com.google.cloud.dataflow.worker.DoFnInstanceManagers$ConcurrentQueueInstanceManager.peek(DoFnInstanceManagers.java:48)
> 
> com.google.cloud.dataflow.worker.UserParDoFnFactory.create(UserParDoFnFactory.java:104)
> 
> com.google.cloud.dataflow.worker.DefaultParDoFnFactory.create(DefaultParDoFnFactory.java:66)
> 
> com.google.cloud.dataflow.worker.MapTaskExecutorFactory.createParDoOperation(MapTaskExecutorFactory.java:360)
> 
> com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:271)
> 
> com.google.cloud.dataflow.worker.MapTaskExecutorFactory$3.typedApply(MapTaskExecutorFactory.java:253)
> 
> com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:55)
> 
> com.google.cloud.dataflow.worker.graph.Networks$TypeSafeNodeFunction.apply(Networks.java:43)
> 
> com.google.cloud.dataflow.worker.graph.Networks.replaceDirectedNetworkNodes(Networks.java:78)
> 
> com.google.cloud.dataflow.worker.MapTaskExecutorFactory.create(MapTaskExecutorFactory.java:142)
> 
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:925)
> 
> com.google.cloud.dataflow.worker.StreamingDataflowWorker.access$800(StreamingDataflowWorker.java:133)
> 
> com.google.cloud.dataflow.worker.StreamingDataflowWorker$7.run(StreamingDataflowWorker.java:771)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)*
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2837) Writing To Spanner From Google Cloud DataFlow - Failure

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172549#comment-16172549
 ] 

ASF GitHub Bot commented on BEAM-2837:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3862


> Writing To Spanner From Google Cloud DataFlow - Failure
> ---
>
> Key: BEAM-2837
> URL: https://issues.apache.org/jira/browse/BEAM-2837
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.1.0
> Environment: Google Cloud DataFlow
>Reporter: Al Yaros
>Assignee: Mairbek Khadikov





[2/2] beam git commit: This closes #3862: [BEAM-2837] Updated grpc-google-pubsub-v1 dependency

2017-09-19 Thread jkff
This closes #3862: [BEAM-2837] Updated grpc-google-pubsub-v1 dependency


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e8a52827
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e8a52827
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e8a52827

Branch: refs/heads/master
Commit: e8a52827e792c9cbce5832e5edcd0399428c24f3
Parents: cfbdb61 c6bedf1
Author: Eugene Kirpichov 
Authored: Tue Sep 19 17:01:33 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Sep 19 17:01:33 2017 -0700

--
 pom.xml| 10 --
 sdks/java/io/google-cloud-platform/pom.xml |  7 ++-
 2 files changed, 14 insertions(+), 3 deletions(-)
--




[GitHub] beam pull request #3862: [BEAM-2837] Updated grpc-google-pubsub-v1 dependenc...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3862


---


[1/2] beam git commit: Updates grpc-google-pubsub-v1 to grpc-google-cloud-pubsub-v1

2017-09-19 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master cfbdb6115 -> e8a52827e


Updates grpc-google-pubsub-v1 to grpc-google-cloud-pubsub-v1

This is good because it's a newer version, and because it gets
rid of a dependency conflict with Spanner.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c6bedf11
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c6bedf11
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c6bedf11

Branch: refs/heads/master
Commit: c6bedf11f5a44c944ee4d2a94ddb6c1c81f8c822
Parents: cfbdb61
Author: Mairbek Khadikov 
Authored: Mon Sep 18 15:03:51 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Sep 19 17:00:26 2017 -0700

--
 pom.xml| 10 --
 sdks/java/io/google-cloud-platform/pom.xml |  7 ++-
 2 files changed, 14 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/c6bedf11/pom.xml
--
diff --git a/pom.xml b/pom.xml
index a2d6aae..236645c 100644
--- a/pom.xml
+++ b/pom.xml
@@ -110,7 +110,7 @@
 v2-rev355-1.22.0
 1.0.0-pre3
 v1-rev6-1.22.0
-0.1.0
+0.1.18
 v2-rev8-1.22.0
 v1b3-rev198-1.22.0
 0.5.160222
@@ -888,7 +888,7 @@
 
   
 com.google.api.grpc
-grpc-google-pubsub-v1
+grpc-google-cloud-pubsub-v1
 ${pubsubgrpc.version}
 
   

[jira] [Updated] (BEAM-2970) Add comparator function to equal_to

2017-09-19 Thread Sarah Walters (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarah Walters updated BEAM-2970:

Description: 
The equal_to function provided by testing/util.py 
(https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
 assumes that the actual and expected lists can be sorted using Python's sorted 
method (which relies on the < operator) and compared using the == operator.

If this isn't the case, equal_to sometimes reports False incorrectly, when the 
expected and actual lists are in different orders.

Add a comparator function to equal_to in order to allow callers to define a 
total order.

  was:
The equal_to function provided by testing/util.py 
(https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
 assumes that the actual and expected lists can be sorted using Python's sorted 
method (which relies on the < operator) and compared using the == operator.

If this isn't the case, equal_to sometimes reports False incorrectly, when the 
expected and actual lists are in different orders.

Add a comparator function to equal_to in order to allow callers to define 
equality.


> Add comparator function to equal_to
> ---
>
> Key: BEAM-2970
> URL: https://issues.apache.org/jira/browse/BEAM-2970
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Sarah Walters
>Assignee: Ahmet Altay
>Priority: Minor
>
> The equal_to function provided by testing/util.py 
> (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
>  assumes that the actual and expected lists can be sorted using Python's 
> sorted method (which relies on the < operator) and compared using the == 
> operator.
> If this isn't the case, equal_to sometimes reports False incorrectly, when 
> the expected and actual lists are in different orders.
> Add a comparator function to equal_to in order to allow callers to define a 
> total order.





[jira] [Updated] (BEAM-2970) Add comparator function to equal_to

2017-09-19 Thread Sarah Walters (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sarah Walters updated BEAM-2970:

Description: 
The equal_to function provided by testing/util.py 
(https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
 assumes that the actual and expected lists can be sorted using Python's sorted 
method (which relies on the < operator) and compared using the == operator.

If this isn't the case, equal_to sometimes reports False incorrectly, when the 
expected and actual lists are in different orders.

Add a comparator function to equal_to in order to allow callers to define 
equality.

  was:
The [equal_to function provided by 
testing/util.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
 assumes that the actual and expected lists can be sorted using Python's sorted 
method (which relies on the < operator) and compared using the == operator.

If this isn't the case, equal_to sometimes reports False incorrectly, when the 
expected and actual lists are in different orders.

Add a comparator function to equal_to in order to allow callers to define 
equality.


> Add comparator function to equal_to
> ---
>
> Key: BEAM-2970
> URL: https://issues.apache.org/jira/browse/BEAM-2970
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-py-core
>Reporter: Sarah Walters
>Assignee: Ahmet Altay
>Priority: Minor
>
> The equal_to function provided by testing/util.py 
> (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
>  assumes that the actual and expected lists can be sorted using Python's 
> sorted method (which relies on the < operator) and compared using the == 
> operator.
> If this isn't the case, equal_to sometimes reports False incorrectly, when 
> the expected and actual lists are in different orders.
> Add a comparator function to equal_to in order to allow callers to define 
> equality.





[jira] [Created] (BEAM-2970) Add comparator function to equal_to

2017-09-19 Thread Sarah Walters (JIRA)
Sarah Walters created BEAM-2970:
---

 Summary: Add comparator function to equal_to
 Key: BEAM-2970
 URL: https://issues.apache.org/jira/browse/BEAM-2970
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Sarah Walters
Assignee: Ahmet Altay
Priority: Minor


The [equal_to function provided by 
testing/util.py](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/testing/util.py#L54)
 assumes that the actual and expected lists can be sorted using Python's sorted 
method (which relies on the < operator) and compared using the == operator.

If this isn't the case, equal_to sometimes reports False incorrectly, when the 
expected and actual lists are in different orders.

Add a comparator function to equal_to in order to allow callers to define 
equality.
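
The proposal above can be sketched as a variant of equal_to that takes an
optional sort key supplying a total order (shown as a key function rather
than a pairwise comparator for brevity; this is an illustration of the
proposal, not Beam's actual testing/util.py implementation):

```python
# Hypothetical sketch of the proposal: equal_to with an optional `key`
# supplying a total order for elements that don't support `<` directly.
# Illustration only; not Beam's testing/util.py implementation.
def equal_to(expected, key=None):
    expected = list(expected)

    def _matcher(actual):
        # Sort both sides under the caller-supplied total order, then
        # compare with ==, mirroring the existing equal_to behavior.
        sorted_expected = sorted(expected, key=key)
        sorted_actual = sorted(actual, key=key)
        if sorted_expected != sorted_actual:
            raise AssertionError(
                'Failed assert: %r == %r' % (sorted_expected, sorted_actual))
    return _matcher
```

With a key, unorderable elements such as dicts can still be matched
regardless of order.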





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4827

2017-09-19 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4826

2017-09-19 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3869: Remove any_param field from the Runner API

2017-09-19 Thread tgroh
GitHub user tgroh opened a pull request:

https://github.com/apache/beam/pull/3869

Remove any_param field from the Runner API

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tgroh/beam remove_any_param

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3869.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3869


commit 5644cf25e172d3044d02ea330561c22b54e098c6
Author: Thomas Groh 
Date:   2017-09-19T23:39:44Z

Remove any_param field from the Runner API




---


[GitHub] beam pull request #3868: Add Nullable Metric getter to MetricsContainerImpl.

2017-09-19 Thread pedapudi
GitHub user pedapudi opened a pull request:

https://github.com/apache/beam/pull/3868

Add Nullable Metric getter to MetricsContainerImpl.

MetricsContainerImpl currently creates a Counter with a given name when 
attempting to read that Counter. In order to read potentially non-existent 
Counters, a getter that doesn't create a Counter on behalf of the reader is 
necessary. 'tryGetCounter' introduces a way to retrieve null Counter values.
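
The distinction described above can be sketched in a few lines (a
hypothetical Python analogue; the actual change is to the Java
MetricsContainerImpl class):

```python
# Hypothetical Python analogue of the Java change described above:
# get_counter() creates a counter on read, try_get_counter() does not.
class MetricsContainer(object):
    def __init__(self):
        self._counters = {}

    def get_counter(self, name):
        # Create-on-read: reading a missing counter materializes it at 0.
        return self._counters.setdefault(name, 0)

    def try_get_counter(self, name):
        # Nullable getter: returns None without creating the counter,
        # so readers can probe for potentially non-existent counters.
        return self._counters.get(name)
```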

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pedapudi/beam nullable-metric-getter

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3868.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3868


commit ffeb889e46ac9bb9e4f55de880835a4e6b38d7df
Author: Sunil Pedapudi 
Date:   2017-09-19T21:55:43Z

MetricsContainerImpl: Add Nullable getter.




---


[jira] [Commented] (BEAM-2966) Allow subclasses of tuple, list, dict as pvalues.

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2966?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172441#comment-16172441
 ] 

ASF GitHub Bot commented on BEAM-2966:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3831


> Allow subclasses of tuple, list, dict as pvalues.
> -
>
> Key: BEAM-2966
> URL: https://issues.apache.org/jira/browse/BEAM-2966
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Reporter: Robert Bradshaw
>Assignee: Robert Bradshaw
>






[GitHub] beam pull request #3831: [BEAM-2966] Allow subclasses of tuple, list, and di...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3831


---


[2/3] beam git commit: Support multiple materializations of the same pvalue.

2017-09-19 Thread robertwb
Support multiple materializations of the same pvalue.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/07c08cc8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/07c08cc8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/07c08cc8

Branch: refs/heads/master
Commit: 07c08cc80135ff37c1224e0e659306f65481ad65
Parents: 72960b3
Author: Robert Bradshaw 
Authored: Fri Sep 15 15:40:10 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:23:28 2017 -0700

--
 sdks/python/apache_beam/runners/runner.py| 19 ++-
 sdks/python/apache_beam/transforms/ptransform.py | 17 +++--
 .../apache_beam/transforms/ptransform_test.py|  9 +
 3 files changed, 30 insertions(+), 15 deletions(-)
--
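
The decref flag added to get_pvalue in the hunk below can be illustrated
with a standalone sketch (a simplified, assumed analogue of PValueCache,
not the real class):

```python
# Simplified, assumed analogue of PValueCache: each cached value carries
# a refcount; a read normally consumes one count (evicting at zero), but
# decref=False lets callers peek repeatedly without using the value up.
class RefcountedCache(object):
    def __init__(self):
        self._cache = {}  # key -> [value, refcount]

    def cache_value(self, key, value, refcount):
        self._cache[key] = [value, refcount]

    def get(self, key, decref=True):
        entry = self._cache[key]
        if decref:
            entry[1] -= 1
            if entry[1] <= 0:
                # Last permitted consuming read: evict the entry.
                del self._cache[key]
        return entry[0]
```

This is what allows the same pvalue to be materialized more than once:
peeking reads no longer exhaust the refcount.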


http://git-wip-us.apache.org/repos/asf/beam/blob/07c08cc8/sdks/python/apache_beam/runners/runner.py
--
diff --git a/sdks/python/apache_beam/runners/runner.py 
b/sdks/python/apache_beam/runners/runner.py
index 131d54f..bdabd81 100644
--- a/sdks/python/apache_beam/runners/runner.py
+++ b/sdks/python/apache_beam/runners/runner.py
@@ -247,17 +247,18 @@ class PValueCache(object):
 self._cache[
 self.to_cache_key(transform, tag)] = [value, transform.refcounts[tag]]
 
-  def get_pvalue(self, pvalue):
+  def get_pvalue(self, pvalue, decref=True):
 """Gets the value associated with a PValue from the cache."""
 self._ensure_pvalue_has_real_producer(pvalue)
 try:
   value_with_refcount = self._cache[self.key(pvalue)]
-  value_with_refcount[1] -= 1
-  logging.debug('PValue computed by %s (tag %s): refcount: %d => %d',
-pvalue.real_producer.full_label, self.key(pvalue)[1],
-value_with_refcount[1] + 1, value_with_refcount[1])
-  if value_with_refcount[1] <= 0:
-self.clear_pvalue(pvalue)
+  if decref:
+value_with_refcount[1] -= 1
+logging.debug('PValue computed by %s (tag %s): refcount: %d => %d',
+  pvalue.real_producer.full_label, self.key(pvalue)[1],
+  value_with_refcount[1] + 1, value_with_refcount[1])
+if value_with_refcount[1] <= 0:
+  self.clear_pvalue(pvalue)
   return value_with_refcount[0]
 except KeyError:
   if (pvalue.tag is not None
@@ -268,8 +269,8 @@ class PValueCache(object):
   else:
 raise
 
-  def get_unwindowed_pvalue(self, pvalue):
-return [v.value for v in self.get_pvalue(pvalue)]
+  def get_unwindowed_pvalue(self, pvalue, decref=True):
+return [v.value for v in self.get_pvalue(pvalue, decref)]
 
   def clear_pvalue(self, pvalue):
 """Removes a PValue from the cache."""

http://git-wip-us.apache.org/repos/asf/beam/blob/07c08cc8/sdks/python/apache_beam/transforms/ptransform.py
--
diff --git a/sdks/python/apache_beam/transforms/ptransform.py 
b/sdks/python/apache_beam/transforms/ptransform.py
index f630977..7cf1441 100644
--- a/sdks/python/apache_beam/transforms/ptransform.py
+++ b/sdks/python/apache_beam/transforms/ptransform.py
@@ -75,10 +75,12 @@ class _PValueishTransform(object):
   """
   def visit_nested(self, node, *args):
 if isinstance(node, (tuple, list)):
-  # namedtuples require unpacked arguments in their constructor,
-  # but do have a _make method that takes a sequence.
-  return getattr(node.__class__, '_make', node.__class__)(
-  [self.visit(x, *args) for x in node])
+  args = [self.visit(x, *args) for x in node]
+  if isinstance(node, tuple) and hasattr(node.__class__, '_make'):
+# namedtuples require unpacked arguments in their constructor
+return node.__class__(*args)
+  else:
+return node.__class__(args)
 elif isinstance(node, dict):
   return node.__class__(
   {key: self.visit(value, *args) for (key, value) in node.items()})
@@ -102,7 +104,9 @@ class _MaterializedDoOutputsTuple(pvalue.DoOutputsTuple):
 self._pvalue_cache = pvalue_cache
 
   def __getitem__(self, tag):
-return self._pvalue_cache.get_unwindowed_pvalue(self._deferred[tag])
+# Simply accessing the value should not use it up.
+return self._pvalue_cache.get_unwindowed_pvalue(
+self._deferred[tag], decref=False)
 
 
 class _MaterializePValues(_PValueishTransform):
@@ -111,7 +115,8 @@ class _MaterializePValues(_PValueishTransform):
 
   def visit(self, node):
 if isinstance(node, pvalue.PValue):
-  return self._pvalue_cache.get_unwindowed_pvalue(node)
+  # Simply accessing the value should not use it up.
+  return self._pvalue_cache.get_unwindowed_pvalue(node, decref=False)
 elif isinstance(node, pvalue.DoOutputsTuple):
   return _Mater

[1/3] beam git commit: Closes #3831

2017-09-19 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master 28d4f0989 -> cfbdb6115


Closes #3831


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cfbdb611
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cfbdb611
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cfbdb611

Branch: refs/heads/master
Commit: cfbdb6115eca25fd3f512355aaf4e9be80f5ed94
Parents: 28d4f09 07c08cc
Author: Robert Bradshaw 
Authored: Tue Sep 19 15:23:28 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:23:28 2017 -0700

--
 sdks/python/apache_beam/pipeline.py |  2 +-
 sdks/python/apache_beam/runners/runner.py   | 19 +++---
 .../python/apache_beam/transforms/ptransform.py | 71 +++-
 .../apache_beam/transforms/ptransform_test.py   | 25 +++
 4 files changed, 74 insertions(+), 43 deletions(-)
--




[3/3] beam git commit: Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs.

2017-09-19 Thread robertwb
Allow subclasses of tuple, list, and dict as pvaluish inputs/outputs.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/72960b31
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/72960b31
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/72960b31

Branch: refs/heads/master
Commit: 72960b31843d1dcdf2b43a55db0797a15f48ef18
Parents: 28d4f09
Author: Robert Bradshaw 
Authored: Fri Sep 8 17:53:09 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:23:28 2017 -0700

--
 sdks/python/apache_beam/pipeline.py |  2 +-
 .../python/apache_beam/transforms/ptransform.py | 62 ++--
 .../apache_beam/transforms/ptransform_test.py   | 16 +
 3 files changed, 48 insertions(+), 32 deletions(-)
--
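
The namedtuple special case handled by visit_nested in this commit can be
shown in isolation (rebuild is a hypothetical helper written for this
sketch, not Beam code):

```python
from collections import namedtuple

def rebuild(node, fn):
    """Rebuilds nested tuples/lists/dicts, applying fn to leaf values.

    namedtuples need special handling: their constructor takes unpacked
    fields, while list/tuple constructors take a sequence. namedtuples
    expose a _make classmethod that accepts a sequence, so we prefer it.
    """
    if isinstance(node, (tuple, list)):
        children = [rebuild(x, fn) for x in node]
        return getattr(node.__class__, '_make', node.__class__)(children)
    elif isinstance(node, dict):
        return node.__class__(
            {k: rebuild(v, fn) for k, v in node.items()})
    return fn(node)
```

Subclasses of tuple, list, and dict are reconstructed via their own class,
which is the point of the change.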


http://git-wip-us.apache.org/repos/asf/beam/blob/72960b31/sdks/python/apache_beam/pipeline.py
--
diff --git a/sdks/python/apache_beam/pipeline.py 
b/sdks/python/apache_beam/pipeline.py
index 1ebd099..c670978 100644
--- a/sdks/python/apache_beam/pipeline.py
+++ b/sdks/python/apache_beam/pipeline.py
@@ -438,7 +438,7 @@ class Pipeline(object):
 if type_options is not None and type_options.pipeline_type_check:
   transform.type_check_outputs(pvalueish_result)
 
-for result in ptransform.GetPValues().visit(pvalueish_result):
+for result in ptransform.get_nested_pvalues(pvalueish_result):
   assert isinstance(result, (pvalue.PValue, pvalue.DoOutputsTuple))
 
   # Make sure we set the producer only for a leaf node in the transform 
DAG.

http://git-wip-us.apache.org/repos/asf/beam/blob/72960b31/sdks/python/apache_beam/transforms/ptransform.py
--
diff --git a/sdks/python/apache_beam/transforms/ptransform.py 
b/sdks/python/apache_beam/transforms/ptransform.py
index eccaccd..f630977 100644
--- a/sdks/python/apache_beam/transforms/ptransform.py
+++ b/sdks/python/apache_beam/transforms/ptransform.py
@@ -73,27 +73,25 @@ class _PValueishTransform(object):
 
  This visits a PValueish, constructing a (possibly mutated) copy.
   """
-  def visit(self, node, *args):
-return getattr(
-self,
-'visit_' + node.__class__.__name__,
-lambda x, *args: x)(node, *args)
-
-  def visit_list(self, node, *args):
-return [self.visit(x, *args) for x in node]
-
-  def visit_tuple(self, node, *args):
-return tuple(self.visit(x, *args) for x in node)
-
-  def visit_dict(self, node, *args):
-return {key: self.visit(value, *args) for (key, value) in node.items()}
+  def visit_nested(self, node, *args):
+if isinstance(node, (tuple, list)):
+  # namedtuples require unpacked arguments in their constructor,
+  # but do have a _make method that takes a sequence.
+  return getattr(node.__class__, '_make', node.__class__)(
+  [self.visit(x, *args) for x in node])
+elif isinstance(node, dict):
+  return node.__class__(
+  {key: self.visit(value, *args) for (key, value) in node.items()})
+else:
+  return node
 
 
 class _SetInputPValues(_PValueishTransform):
   def visit(self, node, replacements):
 if id(node) in replacements:
   return replacements[id(node)]
-return super(_SetInputPValues, self).visit(node, replacements)
+else:
+  return self.visit_nested(node, replacements)
 
 
 class _MaterializedDoOutputsTuple(pvalue.DoOutputsTuple):
@@ -116,22 +114,25 @@ class _MaterializePValues(_PValueishTransform):
   return self._pvalue_cache.get_unwindowed_pvalue(node)
 elif isinstance(node, pvalue.DoOutputsTuple):
   return _MaterializedDoOutputsTuple(node, self._pvalue_cache)
-return super(_MaterializePValues, self).visit(node)
+else:
+  return self.visit_nested(node)
 
 
-class GetPValues(_PValueishTransform):
-  def visit(self, node, pvalues=None):
-if pvalues is None:
-  pvalues = []
-  self.visit(node, pvalues)
-  return pvalues
-elif isinstance(node, (pvalue.PValue, pvalue.DoOutputsTuple)):
+class _GetPValues(_PValueishTransform):
+  def visit(self, node, pvalues):
+if isinstance(node, (pvalue.PValue, pvalue.DoOutputsTuple)):
   pvalues.append(node)
 else:
-  super(GetPValues, self).visit(node, pvalues)
+  self.visit_nested(node, pvalues)
+
+
+def get_nested_pvalues(pvalueish):
+  pvalues = []
+  _GetPValues().visit(pvalueish, pvalues)
+  return pvalues
 
 
-class _ZipPValues(_PValueishTransform):
+class _ZipPValues(object):
   """Pairs each PValue in a pvalueish with a value in a parallel out sibling.
 
   Sibling should have the same nested structure as pvalueish.  Leaves in
@@ -153,10 +154,12 @@ class _ZipPValues(_PValueishTransform):
   return pairs
 elif isinstance(pvalueish, (pvalue.PVal

[GitHub] beam pull request #3847: Refactor fn api runner into a universal local runne...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3847


---


[2/6] beam git commit: Refactor fn api runner into universal local runner.

2017-09-19 Thread robertwb
Refactor fn api runner into universal local runner.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/bd115898
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/bd115898
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/bd115898

Branch: refs/heads/master
Commit: bd115898f8ba1f1f4f08d34df574ee42de467a6a
Parents: e7eefdd
Author: Robert Bradshaw 
Authored: Fri Sep 8 12:25:42 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:20:07 2017 -0700

--
 .../portability/universal_local_runner.py   | 169 +++
 .../portability/universal_local_runner_main.py  |  39 +
 .../portability/universal_local_runner_test.py  |  68 
 sdks/python/apache_beam/runners/runner.py   |   1 +
 4 files changed, 277 insertions(+)
--
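
The runner added below drives a job service and tracks job state. A
schematic of waiting for a terminal state might look like this (the
polling loop and all names other than TERMINAL_STATES are assumptions for
illustration, not part of the commit):

```python
import time

# Mirrors the TERMINAL_STATES idea in the new runner; everything else
# here is an assumed sketch of how a client might wait on job state.
TERMINAL_STATES = frozenset(['DONE', 'STOPPED', 'FAILED', 'CANCELLED'])

def wait_until_finish(get_state, timeout=None, poll_interval=0.01):
    """Polls get_state() until a terminal state or the timeout elapses."""
    start = time.time()
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if timeout is not None and time.time() - start > timeout:
            raise RuntimeError('Job did not reach a terminal state in time')
        time.sleep(poll_interval)
```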


http://git-wip-us.apache.org/repos/asf/beam/blob/bd115898/sdks/python/apache_beam/runners/portability/universal_local_runner.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/universal_local_runner.py 
b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
new file mode 100644
index 000..21f196b
--- /dev/null
+++ b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
@@ -0,0 +1,169 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+from concurrent import futures
+import time
+import threading
+import traceback
+
+import grpc
+
+from apache_beam.portability.api import beam_job_api_pb2
+from apache_beam.portability.api import beam_job_api_pb2_grpc
+from apache_beam.runners import runner
+from apache_beam.runners.portability import fn_api_runner
+
+
+TERMINAL_STATES = [
+beam_job_api_pb2.JobState.DONE,
+beam_job_api_pb2.JobState.STOPPED,
+beam_job_api_pb2.JobState.FAILED,
+beam_job_api_pb2.JobState.CANCELLED,
+]
+
+
+class UniversalLocalRunner(runner.PipelineRunner):
+
+  def __init__(self, timeout=None, use_grpc=True, use_subprocesses=False):
+super(UniversalLocalRunner, self).__init__()
+self._timeout = use_grpc
+self._use_grpc = use_grpc
+self._use_subprocesses = use_subprocesses
+
+  def run(self, pipeline):
+if self._use_subprocesses:
+  raise NotImplementedError
+else:
+  handle = JobServicer().start(use_grpc=self._use_grpc)
+prepare_response = handle.Prepare(
+beam_job_api_pb2.PrepareJobRequest(
+job_name='job',
+pipeline=pipeline.to_runner_api()))
+run_response = handle.Run(beam_job_api_pb2.RunJobRequest(
+preparation_id=prepare_response.preparation_id))
+return PipelineResult(handle, run_response.job_id, self._timeout)
+
+
+class PipelineResult(runner.PipelineResult):
+  def __init__(self, handle, job_id, timeout):
+super(PipelineResult, self).__init__(beam_job_api_pb2.JobState.UNKNOWN)
+self._handle = handle
+self._job_id = job_id
+self._timeout = timeout
+
+  def cancel(self):
+self._handle.Cancel()
+
+  @property
+  def state(self):
+runner_api_state = self._handle.GetState(
+beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)).state
+self._state = self._runner_api_state_to_pipeline_state(runner_api_state)
+return self._state
+
+  @staticmethod
+  def _runner_api_state_to_pipeline_state(runner_api_state):
+return getattr(
+runner.PipelineState,
+beam_job_api_pb2.JobState.JobStateType.Name(runner_api_state))
+
+  @staticmethod
+  def _pipeline_state_to_runner_api_state(pipeline_state):
+return beam_job_api_pb2.JobState.JobStateType.Value(pipeline_state)
+
+  def wait_until_finish(self):
+start = time.time()
+sleep_interval = 0.01
+while self._pipeline_state_to_runner_api_state(
+self.state) not in TERMINAL_STATES:
+  if self._timeout and time.time() - start > self._timeout:
+raise RuntimeError(
+"Pipeline %s timed out in state %s." % (self._job_id, self._state))
+  time.sleep(sleep_interval)
+if self._state != runner.PipelineState.DONE:
+  raise RuntimeError(
+
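The `wait_until_finish` loop above follows a common poll-with-timeout pattern. A generic, self-contained sketch of the same shape (the helper name `wait_until` is an assumption for illustration, not Beam code):

```python
import time

def wait_until(predicate, timeout=None, sleep_interval=0.01):
    # Poll until predicate() becomes true, raising if the optional
    # timeout elapses first -- the same shape as wait_until_finish above.
    start = time.time()
    while not predicate():
        if timeout is not None and time.time() - start > timeout:
            raise RuntimeError("timed out after %s seconds" % timeout)
        time.sleep(sleep_interval)
```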

[4/6] beam git commit: Implement ULR subprocess mode.

2017-09-19 Thread robertwb
Implement ULR subprocess mode.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2999ec9d
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2999ec9d
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2999ec9d

Branch: refs/heads/master
Commit: 2999ec9d233760b02cdae365f2ef58593c391c03
Parents: bd11589
Author: Robert Bradshaw 
Authored: Fri Sep 8 15:31:17 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:20:07 2017 -0700

--
 .../portability/universal_local_runner.py   | 106 ++-
 .../portability/universal_local_runner_main.py  |  10 +-
 .../portability/universal_local_runner_test.py  |  31 --
 3 files changed, 112 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2999ec9d/sdks/python/apache_beam/runners/portability/universal_local_runner.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/universal_local_runner.py 
b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
index 21f196b..71ac4f8 100644
--- a/sdks/python/apache_beam/runners/portability/universal_local_runner.py
+++ b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
@@ -16,9 +16,14 @@
 #
 
 from concurrent import futures
+import logging
+import socket
+import subprocess
+import sys
 import time
 import threading
 import traceback
+import uuid
 
 import grpc
 
@@ -38,17 +43,72 @@ TERMINAL_STATES = [
 
 class UniversalLocalRunner(runner.PipelineRunner):
 
-  def __init__(self, timeout=None, use_grpc=True, use_subprocesses=False):
+  def __init__(self, timeout=None, use_grpc=True, use_subprocesses=True):
 super(UniversalLocalRunner, self).__init__()
 self._timeout = use_grpc
 self._use_grpc = use_grpc
 self._use_subprocesses = use_subprocesses
 
+self._handle = None
+self._subprocess = None
+
+  def __del__(self):
+if self._subprocess:
+  self._subprocess.kill()
+
+  def _get_handle(self):
+if not self._handle:
+  if self._use_subprocesses:
+if self._subprocess:
+  # Kill the old one if it exists.
+  self._subprocess.kill()
+# TODO(robertwb): Consider letting the subprocess pick one and
+# communicate it back...
+port = _pick_unused_port()
+logging.info("Starting server on port %d.", port)
+self._subprocess = subprocess.Popen([
+sys.executable,
+'-m',
+'apache_beam.runners.portability.universal_local_runner_main',
+'-p',
+str(port)])
+handle = beam_job_api_pb2_grpc.JobServiceStub(
+grpc.insecure_channel('localhost:%d' % port))
+logging.info("Waiting for server to be ready...")
+start = time.time()
+timeout = 30
+while True:
+  time.sleep(0.1)
+  if self._subprocess.poll() is not None:
+raise RuntimeError(
+"Subprocess terminated unexpectedly with exit code %d." %
+self._subprocess.returncode)
+  elif time.time() - start > timeout:
+raise RuntimeError(
+"Pipeline timed out waiting for job service subprocess.")
+  else:
+try:
+  handle.GetState(
+  beam_job_api_pb2.GetJobStateRequest(job_id='[fake]'))
+  break
+except grpc.RpcError as exn:
+  if exn.code != grpc.StatusCode.UNAVAILABLE:
+break
+logging.info("Server ready.")
+self._handle = handle
+
+  elif self._use_grpc:
+self._servicer = JobServicer()
+self._handle = beam_job_api_pb2_grpc.JobServiceStub(
+grpc.insecure_channel('localhost:%d' % self._servicer.start_grpc()))
+
+  else:
+self._handle = JobServicer()
+
+return self._handle
+
   def run(self, pipeline):
-if self._use_subprocesses:
-  raise NotImplementedError
-else:
-  handle = JobServicer().start(use_grpc=self._use_grpc)
+handle = self._get_handle()
 prepare_response = handle.Prepare(
 beam_job_api_pb2.PrepareJobRequest(
 job_name='job',
@@ -129,27 +189,16 @@ class JobServicer(beam_job_api_pb2.JobServiceServicer):
 self._worker_command_line = worker_command_line
 self._jobs = {}
 
-  def start(self, use_grpc, port=0):
-if use_grpc:
-  self._server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
-  self._port = self._server.add_insecure_port('[::]:%d' % port)
-  beam_job_api_pb2_grpc.add_JobServiceServicer_to_server(self, self._server)
-  self._server.start()
-
-  channel = grpc.insecure_channel('[::]:%d' % self._port)
-  return beam_job_api_pb2_grpc.JobServiceStub(channel)
-else:

[6/6] beam git commit: Closes #3847

2017-09-19 Thread robertwb
Closes #3847


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/28d4f098
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/28d4f098
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/28d4f098

Branch: refs/heads/master
Commit: 28d4f0989de630c7e19b895a144a5fe1db706196
Parents: e7eefdd a1abccd
Author: Robert Bradshaw 
Authored: Tue Sep 19 15:20:24 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:20:24 2017 -0700

--
 .../runners/portability/fn_api_runner.py|  14 +-
 .../portability/universal_local_runner.py   | 409 +++
 .../portability/universal_local_runner_main.py  |  44 ++
 .../portability/universal_local_runner_test.py  |  85 
 sdks/python/apache_beam/runners/runner.py   |   1 +
 .../apache_beam/runners/worker/data_plane.py|   9 +-
 .../apache_beam/runners/worker/sdk_worker.py|   6 +-
 .../runners/worker/sdk_worker_main.py   |  26 +-
 .../runners/worker/sdk_worker_test.py   |   3 +-
 9 files changed, 573 insertions(+), 24 deletions(-)
--




[3/6] beam git commit: Allow worker to be started as a subprocess.

2017-09-19 Thread robertwb
Allow worker to be started as a subprocess.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/cb778748
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/cb778748
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/cb778748

Branch: refs/heads/master
Commit: cb778748ba5fc437da58e3054b6a73d81eabeca5
Parents: 2999ec9
Author: Robert Bradshaw 
Authored: Fri Sep 8 18:25:10 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:20:07 2017 -0700

--
 .../runners/portability/fn_api_runner.py| 14 +++--
 .../portability/universal_local_runner.py   | 66 +---
 .../portability/universal_local_runner_test.py  |  4 ++
 .../apache_beam/runners/worker/data_plane.py|  9 ++-
 .../apache_beam/runners/worker/sdk_worker.py|  6 +-
 .../runners/worker/sdk_worker_main.py   | 26 
 .../runners/worker/sdk_worker_test.py   |  3 +-
 7 files changed, 96 insertions(+), 32 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/cb778748/sdks/python/apache_beam/runners/portability/fn_api_runner.py
--
diff --git a/sdks/python/apache_beam/runners/portability/fn_api_runner.py 
b/sdks/python/apache_beam/runners/portability/fn_api_runner.py
index 30bfe7b..b0faa38 100644
--- a/sdks/python/apache_beam/runners/portability/fn_api_runner.py
+++ b/sdks/python/apache_beam/runners/portability/fn_api_runner.py
@@ -150,10 +150,13 @@ class _GroupingBuffer(object):
 
 class FnApiRunner(maptask_executor_runner.MapTaskExecutorRunner):
 
-  def __init__(self, use_grpc=False):
+  def __init__(self, use_grpc=False, sdk_harness_factory=None):
 super(FnApiRunner, self).__init__()
 self._last_uid = -1
 self._use_grpc = use_grpc
+if sdk_harness_factory and not use_grpc:
+  raise ValueError('GRPC must be used if a harness factory is provided.')
+self._sdk_harness_factory = sdk_harness_factory
 
   def has_metrics_support(self):
 return False
@@ -625,7 +628,7 @@ class FnApiRunner(maptask_executor_runner.MapTaskExecutorRunner):
   def run_stages(self, pipeline_components, stages, safe_coders):
 
 if self._use_grpc:
-  controller = FnApiRunner.GrpcController()
+  controller = FnApiRunner.GrpcController(self._sdk_harness_factory)
 else:
   controller = FnApiRunner.DirectController()
 
@@ -1029,7 +1032,8 @@ class FnApiRunner(maptask_executor_runner.MapTaskExecutorRunner):
   class GrpcController(object):
 """An grpc based controller for fn API control, state and data planes."""
 
-def __init__(self):
+def __init__(self, sdk_harness_factory=None):
+  self.sdk_harness_factory = sdk_harness_factory
   self.state_handler = FnApiRunner.SimpleState()
   self.control_server = grpc.server(
   futures.ThreadPoolExecutor(max_workers=10))
@@ -1052,8 +1056,8 @@ class FnApiRunner(maptask_executor_runner.MapTaskExecutorRunner):
   self.data_server.start()
   self.control_server.start()
 
-  self.worker = sdk_worker.SdkHarness(
-  grpc.insecure_channel('localhost:%s' % self.control_port))
+  self.worker = (self.sdk_harness_factory or sdk_worker.SdkHarness)(
+  'localhost:%s' % self.control_port)
   self.worker_thread = threading.Thread(target=self.worker.run)
   logging.info('starting worker')
   self.worker_thread.start()

http://git-wip-us.apache.org/repos/asf/beam/blob/cb778748/sdks/python/apache_beam/runners/portability/universal_local_runner.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/universal_local_runner.py 
b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
index 71ac4f8..0ddcda3 100644
--- a/sdks/python/apache_beam/runners/portability/universal_local_runner.py
+++ b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
@@ -16,7 +16,9 @@
 #
 
 from concurrent import futures
+import functools
 import logging
+import os
 import socket
 import subprocess
 import sys
@@ -26,7 +28,9 @@ import traceback
 import uuid
 
 import grpc
+from google.protobuf import text_format
 
+from apache_beam.portability.api import beam_fn_api_pb2
 from apache_beam.portability.api import beam_job_api_pb2
 from apache_beam.portability.api import beam_job_api_pb2_grpc
 from apache_beam.runners import runner
@@ -45,7 +49,7 @@ class UniversalLocalRunner(runner.PipelineRunner):
 
   def __init__(self, timeout=None, use_grpc=True, use_subprocesses=True):
 super(UniversalLocalRunner, self).__init__()
-self._timeout = use_grpc
+self._timeout = timeout
 self._use_grpc = use_grpc
 self._use_subprocesses = use_subprocesses
 
@@ -53,8 +57,13 @@ class UniversalLocalRunner(runner.PipelineRunner)

[1/6] beam git commit: Streaming Job API.

2017-09-19 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master e7eefddea -> 28d4f0989


Streaming Job API.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/8a62ceae
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/8a62ceae
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/8a62ceae

Branch: refs/heads/master
Commit: 8a62ceae111e71a5a369ccd96344ad79f907f865
Parents: cb77874
Author: Robert Bradshaw 
Authored: Fri Sep 8 18:36:00 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:20:07 2017 -0700

--
 .../portability/universal_local_runner.py   | 144 ---
 .../portability/universal_local_runner_test.py  |   5 +-
 2 files changed, 124 insertions(+), 25 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/8a62ceae/sdks/python/apache_beam/runners/portability/universal_local_runner.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/universal_local_runner.py 
b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
index 0ddcda3..8a47213 100644
--- a/sdks/python/apache_beam/runners/portability/universal_local_runner.py
+++ b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
@@ -19,6 +19,7 @@ from concurrent import futures
 import functools
 import logging
 import os
+import Queue as queue
 import socket
 import subprocess
 import sys
@@ -47,9 +48,8 @@ TERMINAL_STATES = [
 
 class UniversalLocalRunner(runner.PipelineRunner):
 
-  def __init__(self, timeout=None, use_grpc=True, use_subprocesses=True):
+  def __init__(self, use_grpc=True, use_subprocesses=True):
 super(UniversalLocalRunner, self).__init__()
-self._timeout = timeout
 self._use_grpc = use_grpc
 self._use_subprocesses = use_subprocesses
 
@@ -127,15 +127,15 @@ class UniversalLocalRunner(runner.PipelineRunner):
 pipeline=pipeline.to_runner_api()))
 run_response = handle.Run(beam_job_api_pb2.RunJobRequest(
 preparation_id=prepare_response.preparation_id))
-return PipelineResult(handle, run_response.job_id, self._timeout)
+return PipelineResult(handle, run_response.job_id)
 
 
 class PipelineResult(runner.PipelineResult):
-  def __init__(self, handle, job_id, timeout):
+  def __init__(self, handle, job_id):
 super(PipelineResult, self).__init__(beam_job_api_pb2.JobState.UNKNOWN)
 self._handle = handle
 self._job_id = job_id
-self._timeout = timeout
+self._messages = []
 
   def cancel(self):
 self._handle.Cancel()
@@ -158,14 +158,18 @@ class PipelineResult(runner.PipelineResult):
 return beam_job_api_pb2.JobState.JobStateType.Value(pipeline_state)
 
   def wait_until_finish(self):
-start = time.time()
-sleep_interval = 0.01
-while self._pipeline_state_to_runner_api_state(
-self.state) not in TERMINAL_STATES:
-  if self._timeout and time.time() - start > self._timeout:
-raise RuntimeError(
-"Pipeline %s timed out in state %s." % (self._job_id, self._state))
-  time.sleep(sleep_interval)
+def read_messages():
+  for message in self._handle.GetMessageStream(
+  beam_job_api_pb2.JobMessagesRequest(job_id=self._job_id)):
+self._messages.append(message)
+threading.Thread(target=read_messages).start()
+
+for state_response in self._handle.GetStateStream(
+beam_job_api_pb2.GetJobStateRequest(job_id=self._job_id)):
+  self._state = self._runner_api_state_to_pipeline_state(
+  state_response.state)
+  if state_response.state in TERMINAL_STATES:
+break
 if self._state != runner.PipelineState.DONE:
   raise RuntimeError(
   "Pipeline %s failed in state %s." % (self._job_id, self._state))
@@ -180,19 +184,46 @@ class BeamJob(threading.Thread):
 self._pipeline_proto = pipeline_proto
 self._use_grpc = use_grpc
 self._sdk_harness_factory = sdk_harness_factory
+self._log_queue = queue.Queue()
+self._state_change_callbacks = [
+lambda new_state: self._log_queue.put(
+beam_job_api_pb2.JobMessagesResponse(
+state_response=
+beam_job_api_pb2.GetJobStateResponse(state=new_state)))
+]
+self._state = None
 self.state = beam_job_api_pb2.JobState.STARTING
 self.daemon = True
 
+  def add_state_change_callback(self, f):
+self._state_change_callbacks.append(f)
+
+  @property
+  def log_queue(self):
+return self._log_queue
+
+  @property
+  def state(self):
+return self._state
+
+  @state.setter
+  def state(self, new_state):
+for state_change_callback in self._state_change_callbacks:
+  state_change_callback(new_state)
+self._state = new_state
+
   def run(self):
-try:
-  fn_api_runner.FnApiRunner(
- 
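The `BeamJob` hunk above routes every state transition through a property setter that notifies registered callbacks (this is how state changes are streamed out as `JobMessagesResponse`s). A distilled, hypothetical sketch of that callback pattern, with the class name assumed:

```python
class JobStateNotifier(object):
    # Setting .state notifies every registered callback before recording
    # the new value, mirroring the state.setter in BeamJob above.
    def __init__(self):
        self._state_change_callbacks = []
        self._state = None

    def add_state_change_callback(self, f):
        self._state_change_callbacks.append(f)

    @property
    def state(self):
        return self._state

    @state.setter
    def state(self, new_state):
        for callback in self._state_change_callbacks:
            callback(new_state)
        self._state = new_state
```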

[5/6] beam git commit: Lint and documentation.

2017-09-19 Thread robertwb
Lint and documentation.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a1abccda
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a1abccda
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a1abccda

Branch: refs/heads/master
Commit: a1abccda312320c00e08a67c0c60dd0d7e907162
Parents: 8a62cea
Author: Robert Bradshaw 
Authored: Wed Sep 13 10:34:53 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 15:20:08 2017 -0700

--
 .../portability/universal_local_runner.py   | 184 +++
 .../portability/universal_local_runner_main.py  |   1 -
 .../portability/universal_local_runner_test.py  |   1 +
 3 files changed, 109 insertions(+), 77 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a1abccda/sdks/python/apache_beam/runners/portability/universal_local_runner.py
--
diff --git 
a/sdks/python/apache_beam/runners/portability/universal_local_runner.py 
b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
index 8a47213..844b3a8 100644
--- a/sdks/python/apache_beam/runners/portability/universal_local_runner.py
+++ b/sdks/python/apache_beam/runners/portability/universal_local_runner.py
@@ -15,7 +15,6 @@
 # limitations under the License.
 #
 
-from concurrent import futures
 import functools
 import logging
 import os
@@ -23,10 +22,11 @@ import Queue as queue
 import socket
 import subprocess
 import sys
-import time
 import threading
+import time
 import traceback
 import uuid
+from concurrent import futures
 
 import grpc
 from google.protobuf import text_format
@@ -37,7 +37,6 @@ from apache_beam.portability.api import beam_job_api_pb2_grpc
 from apache_beam.runners import runner
 from apache_beam.runners.portability import fn_api_runner
 
-
 TERMINAL_STATES = [
 beam_job_api_pb2.JobState.DONE,
 beam_job_api_pb2.JobState.STOPPED,
@@ -47,16 +46,27 @@ TERMINAL_STATES = [
 
 
 class UniversalLocalRunner(runner.PipelineRunner):
+  """A BeamRunner that executes Python pipelines via the Beam Job API.
 
-  def __init__(self, use_grpc=True, use_subprocesses=True):
+  By default, this runner executes in process but still uses GRPC to communicate
+  pipeline and worker state.  It can also be configured to use inline calls
+  rather than GRPC (for speed) or launch completely separate subprocesses for
+  the runner and worker(s).
+  """
+
+  def __init__(self, use_grpc=True, use_subprocesses=False):
+if use_subprocesses and not use_grpc:
+  raise ValueError("GRPC must be used with subprocesses")
 super(UniversalLocalRunner, self).__init__()
 self._use_grpc = use_grpc
 self._use_subprocesses = use_subprocesses
 
-self._handle = None
+self._job_service = None
+self._job_service_lock = threading.Lock()
 self._subprocess = None
 
   def __del__(self):
+# Best effort to not leave any dangling processes around.
 self.cleanup()
 
   def cleanup(self):
@@ -65,84 +75,90 @@ class UniversalLocalRunner(runner.PipelineRunner):
   time.sleep(0.1)
 self._subprocess = None
 
-  def _get_handle(self):
-if not self._handle:
-  if self._use_subprocesses:
-if self._subprocess:
-  # Kill the old one if it exists.
-  self._subprocess.kill()
-# TODO(robertwb): Consider letting the subprocess pick one and
-# communicate it back...
-port = _pick_unused_port()
-logging.info("Starting server on port %d.", port)
-self._subprocess = subprocess.Popen([
-sys.executable,
-'-m',
-'apache_beam.runners.portability.universal_local_runner_main',
-'-p',
-str(port),
-'--worker_command_line',
-'%s -m apache_beam.runners.worker.sdk_worker_main' % sys.executable
-])
-handle = beam_job_api_pb2_grpc.JobServiceStub(
-grpc.insecure_channel('localhost:%d' % port))
-logging.info("Waiting for server to be ready...")
-start = time.time()
-timeout = 30
-while True:
-  time.sleep(0.1)
-  if self._subprocess.poll() is not None:
-raise RuntimeError(
-"Subprocess terminated unexpectedly with exit code %d." %
-self._subprocess.returncode)
-  elif time.time() - start > timeout:
-raise RuntimeError(
-"Pipeline timed out waiting for job service subprocess.")
-  else:
-try:
-  handle.GetState(
-  beam_job_api_pb2.GetJobStateRequest(job_id='[fake]'))
-  break
-except grpc.RpcError as exn:
-  if exn.code != grpc.StatusCode.UNAVAILABLE:
-break
-logging.info("Server ready.")
-   
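The docstring above names three configurations (inline calls, in-process GRPC, and separate subprocesses), and the constructor guards against the invalid combination. A small illustrative sketch of the valid combinations and that guard (the names `CONFIGS` and `validate` are assumptions, not Beam APIs):

```python
# The three modes described in the UniversalLocalRunner docstring above.
CONFIGS = {
    'inline': dict(use_grpc=False, use_subprocesses=False),
    'in_process': dict(use_grpc=True, use_subprocesses=False),  # the default
    'subprocesses': dict(use_grpc=True, use_subprocesses=True),
}

def validate(use_grpc, use_subprocesses):
    # Mirrors the constructor's check: subprocess mode requires GRPC,
    # since a separate process can only be reached over the wire.
    if use_subprocesses and not use_grpc:
        raise ValueError("GRPC must be used with subprocesses")
    return (use_grpc, use_subprocesses)
```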

Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4825

2017-09-19 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-2969) BigQueryIO fails when reading then writing timestamps before 1970

2017-09-19 Thread Kevin Peterson (JIRA)
Kevin Peterson created BEAM-2969:


 Summary: BigQueryIO fails when reading then writing timestamps 
before 1970
 Key: BEAM-2969
 URL: https://issues.apache.org/jira/browse/BEAM-2969
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-gcp
Reporter: Kevin Peterson
Assignee: Chamikara Jayalath


I have a batch pipeline which reads from BigQuery (via standard sql query), 
does a small transform, and writes the data back to BigQuery.

This fails if any timestamps are present in the BQ data from before 1970:
{{"message" : "JSON parsing error in row starting at position 0: Couldn't 
convert value to timestamp: Could not parse '1969-12-28 02:52:54.-484 UTC' as a 
timestamp. Required format is YYYY-MM-DD HH:MM[:SS[.SSSSSS]] Field: 
observed_timestamp; Value: 1969-12-28 02:52:54.-484 UTC",}}

It appears the TableRow coder doesn't handle negative timestamps properly, 
using a negative number for the fractions of a second, which BQ considers 
invalid.
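The negative fractional second (".-484") described above is what truncating division produces when a pre-1970 epoch offset is split into whole and fractional seconds; floored division keeps the fraction non-negative. A small illustrative sketch (not Beam or BigQuery code):

```python
millis = -1484  # an instant 1.484 s before the epoch

# Truncating division (as Java's / and % behave) leaves a negative
# remainder, which formats as an invalid ".-484" fractional second:
q = int(millis / 1000)   # -1
r = millis - q * 1000    # -484

# Floored division keeps the fractional part non-negative:
q2, r2 = divmod(millis, 1000)  # (-2, 516), i.e. a valid ".516"
```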



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #4000

2017-09-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-407) Inconsistent synchronization in OffsetRangeTracker.copy

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172326#comment-16172326
 ] 

ASF GitHub Bot commented on BEAM-407:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3827


> Inconsistent synchronization in OffsetRangeTracker.copy
> ---
>
> Key: BEAM-407
> URL: https://issues.apache.org/jira/browse/BEAM-407
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Justin Tumale
>Priority: Minor
>  Labels: findbugs, newbie, starter
> Fix For: 2.2.0
>
>
> [FindBugs 
> IS2_INCONSISTENT_SYNC|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L148]:
>  Inconsistent synchronization
> Applies to: 
> [OffsetRangeTracker.copy|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java#L263].
>  Its mutating methods are all marked synchronized otherwise.
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.





[jira] [Closed] (BEAM-407) Inconsistent synchronization in OffsetRangeTracker.copy

2017-09-19 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-407.
-
   Resolution: Fixed
Fix Version/s: 2.2.0

> Inconsistent synchronization in OffsetRangeTracker.copy
> ---
>
> Key: BEAM-407
> URL: https://issues.apache.org/jira/browse/BEAM-407
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Justin Tumale
>Priority: Minor
>  Labels: findbugs, newbie, starter
> Fix For: 2.2.0
>
>
> [FindBugs 
> IS2_INCONSISTENT_SYNC|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L148]:
>  Inconsistent synchronization
> Applies to: 
> [OffsetRangeTracker.copy|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java#L263].
>  Its mutating methods are all marked synchronized otherwise.
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.





[GitHub] beam pull request #3827: [BEAM-407] fixes inconsistent synchronization in Of...

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3827


---


[1/2] beam git commit: [BEAM-407] Fixes findbugs warnings in OffsetRangeTracker

2017-09-19 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 5dab9e109 -> e7eefddea


[BEAM-407] Fixes findbugs warnings in OffsetRangeTracker


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/517192f7
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/517192f7
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/517192f7

Branch: refs/heads/master
Commit: 517192f749e2581a4c97a7bd5be75960818a17cc
Parents: 5dab9e1
Author: Justin Tumale 
Authored: Fri Sep 8 15:14:06 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Sep 19 13:37:29 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml | 36 
 .../beam/sdk/io/range/OffsetRangeTracker.java   | 22 
 2 files changed, 16 insertions(+), 42 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/517192f7/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
--
diff --git a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml 
b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
index 0c9080d..e54cd0b 100644
--- a/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
+++ b/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml
@@ -311,42 +311,6 @@
 
   
   
-    [36 deleted lines: six findbugs <Match> entries for OffsetRangeTracker whose XML tags were stripped by the archive's HTML escaping]
 
 
 

http://git-wip-us.apache.org/repos/asf/beam/blob/517192f7/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java
index 8f0083e..7b4b331 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/range/OffsetRangeTracker.java
@@ -56,6 +56,8 @@ public class OffsetRangeTracker implements RangeTracker {
 this.stopOffset = stopOffset;
   }
 
+  private OffsetRangeTracker() { }
+
   public synchronized boolean isStarted() {
 // done => started: handles the case when the reader was empty.
 return (offsetOfLastSplitPoint != -1) || done;
@@ -262,11 +264,19 @@ public class OffsetRangeTracker implements RangeTracker {
*/
   @VisibleForTesting
   OffsetRangeTracker copy() {
-OffsetRangeTracker res = new OffsetRangeTracker(startOffset, stopOffset);
-res.offsetOfLastSplitPoint = this.offsetOfLastSplitPoint;
-res.lastRecordStart = this.lastRecordStart;
-res.done = this.done;
-res.splitPointsSeen = this.splitPointsSeen;
-return res;
+synchronized (this) {
+  OffsetRangeTracker res = new OffsetRangeTracker();
+  // This synchronized is not really necessary, because there's no concurrent access to "res",
+  // however it is necessary to prevent findbugs from complaining about unsynchronized access.
+  synchronized (res) {
+res.startOffset = this.startOffset;
+res.stopOffset = this.stopOffset;
+res.offsetOfLastSplitPoint = this.offsetOfLastSplitPoint;
+res.lastRecordStart = this.lastRecordStart;
+res.done = this.done;
+res.splitPointsSeen = this.splitPointsSeen;
+  }
+  return res;
+}
   }
 }
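The fix above takes both objects' monitors while copying every field, so the snapshot is consistent even under concurrent mutation. A Python analogue of the same idea (a hypothetical sketch, not Beam code), where the source tracker's lock guards the read:

```python
import threading

class Tracker(object):
    # Reading all fields under the source's lock yields a consistent
    # snapshot, analogous to the synchronized copy() in the Java fix above.
    def __init__(self, start_offset, stop_offset):
        self._lock = threading.Lock()
        self.start_offset = start_offset
        self.stop_offset = stop_offset
        self.done = False

    def try_return_record_at(self, offset):
        with self._lock:
            self.done = offset >= self.stop_offset
            return not self.done

    def copy(self):
        with self._lock:
            res = Tracker(self.start_offset, self.stop_offset)
            res.done = self.done
            return res
```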



[2/2] beam git commit: This closes #3827: [BEAM-407] fixes inconsistent synchronization in OffsetRangeTracker.copy

2017-09-19 Thread jkff
This closes #3827: [BEAM-407] fixes inconsistent synchronization in 
OffsetRangeTracker.copy


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/e7eefdde
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/e7eefdde
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/e7eefdde

Branch: refs/heads/master
Commit: e7eefddea925e9b5359468a74a5dffe75b4b55e7
Parents: 5dab9e1 517192f
Author: Eugene Kirpichov 
Authored: Tue Sep 19 13:52:54 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Sep 19 13:52:54 2017 -0700

--
 .../src/main/resources/beam/findbugs-filter.xml | 36 
 .../beam/sdk/io/range/OffsetRangeTracker.java   | 22 
 2 files changed, 16 insertions(+), 42 deletions(-)
--




Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #3999

2017-09-19 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4824

2017-09-19 Thread Apache Jenkins Server
See 




Jenkins build is unstable: beam_PostCommit_Java_MavenInstall #4823

2017-09-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1934) Code examples for CoGroupByKey

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172216#comment-16172216
 ] 

ASF GitHub Bot commented on BEAM-1934:
--

GitHub user davidcavazos opened a pull request:

https://github.com/apache/beam/pull/3867

Included immediate results after CoGroupByKey for better readability in docs

R: @melap 
R: @aaltay 

This makes the outputs show the data structures after the `CoGroupByKey` 
transform. The `formatted_results` shows how the data is affected after using 
the data. This will make for a clearer explanation in 
[BEAM-1934](https://github.com/apache/beam-site/pull/302)

- Changed `snippets_test.py:model_group_by_key_cogroupbykey_tuple_outputs` 
to show the data structures.
- Added 
`snippets_test.py:model_group_by_key_cogroupbykey_tuple_formatted_outputs` to 
show the data after using it in another transform.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidcavazos/beam cgbk_snippet

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3867.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3867


commit 8bf2c02fa929f916fce05164606b085158c87761
Author: David Cavazos 
Date:   2017-09-19T19:15:38Z

Included immediate results after CoGroupByKey for better readability in docs
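The intermediate data structure the snippet changes above are meant to surface can be sketched in plain Python. This is a hand-rolled emulation of what a `CoGroupByKey` over two keyed collections produces (illustration only, not the Beam API; the sample names and values are made up):

```python
from collections import defaultdict

def cogroup_by_key(emails, phones):
    # For each key, collect a tuple holding the list of matching values
    # from each input collection, mirroring CoGroupByKey's per-key output.
    grouped = defaultdict(lambda: ([], []))
    for name, email in emails:
        grouped[name][0].append(email)
    for name, phone in phones:
        grouped[name][1].append(phone)
    return dict(grouped)

emails = [('amy', 'amy@example.com'), ('carl', 'carl@example.com')]
phones = [('amy', '111-222-3333'), ('james', '222-333-4444')]
results = cogroup_by_key(emails, phones)

# A key present in only one input still appears, with an empty list on the
# other side: 'james' has a phone but no email, 'carl' the reverse.
```

Showing both the raw grouped structure and a formatted rendering of it, as the PR does, makes it clear which side of the join each value came from.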




> Code examples for CoGroupByKey
> --
>
> Key: BEAM-1934
> URL: https://issues.apache.org/jira/browse/BEAM-1934
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Aviem Zur
>Assignee: Melissa Pashniak
>
> Add code examples for usage of {{CoGroupByKey}}.
> Also, it would probably be wise to give introductions to the components of a 
> {{CoGroupByKey}} such as {{KeyedPCollectionTuple}} and {{TupleTag}} to help 
> users understand how to use it correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3867: Included immediate results after CoGroupByKey for b...

2017-09-19 Thread davidcavazos
GitHub user davidcavazos opened a pull request:

https://github.com/apache/beam/pull/3867

Included immediate results after CoGroupByKey for better readability in docs

R: @melap 
R: @aaltay 

This makes the outputs show the data structures produced by the `CoGroupByKey` 
transform. The `formatted_results` show how the data looks after it is used in 
another transform. This will make for a clearer explanation in 
[BEAM-1934](https://github.com/apache/beam-site/pull/302)

- Changed `snippets_test.py:model_group_by_key_cogroupbykey_tuple_outputs` 
to show the data structures.
- Added 
`snippets_test.py:model_group_by_key_cogroupbykey_tuple_formatted_outputs` to 
show the data after using it in another transform.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/davidcavazos/beam cgbk_snippet

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3867.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3867


commit 8bf2c02fa929f916fce05164606b085158c87761
Author: David Cavazos 
Date:   2017-09-19T19:15:38Z

Included immediate results after CoGroupByKey for better readability in docs




---


[jira] [Resolved] (BEAM-2829) Add ability to set job labels in DataflowPipelineOptions

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax resolved BEAM-2829.
--
Resolution: Fixed

> Add ability to set job labels in DataflowPipelineOptions
> 
>
> Key: BEAM-2829
> URL: https://issues.apache.org/jira/browse/BEAM-2829
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Zongwei Zhou
>Assignee: Zongwei Zhou
>Priority: Minor
> Fix For: 2.2.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Enable setting job labels --labels in DataflowPipelineOptions (earlier 
> Dataflow SDK 1.x supports this)
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
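The feature above concerns a Java pipeline option, but the shape of the flag is easy to illustrate. A hedged sketch, with a hypothetical helper rather than the actual DataflowPipelineOptions API, of parsing a `--labels` value of the form `key1=value1,key2=value2` into a label map:

```python
def parse_labels(flag_value):
    """Parse 'k1=v1,k2=v2' into a dict. Hypothetical illustration only,
    not the Dataflow options parser."""
    labels = {}
    for pair in flag_value.split(','):
        # partition tolerates a missing '=' and extra whitespace around keys.
        key, _, value = pair.partition('=')
        labels[key.strip()] = value.strip()
    return labels

job_labels = parse_labels('env=prod, team=data')
```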


[jira] [Closed] (BEAM-2829) Add ability to set job labels in DataflowPipelineOptions

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax closed BEAM-2829.


> Add ability to set job labels in DataflowPipelineOptions
> 
>
> Key: BEAM-2829
> URL: https://issues.apache.org/jira/browse/BEAM-2829
> Project: Beam
>  Issue Type: New Feature
>  Components: runner-dataflow
>Affects Versions: 2.0.0, 2.1.0
>Reporter: Zongwei Zhou
>Assignee: Zongwei Zhou
>Priority: Minor
> Fix For: 2.2.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Enable setting job labels --labels in DataflowPipelineOptions (earlier 
> Dataflow SDK 1.x supports this)
> https://github.com/apache/beam/blob/master/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/options/DataflowPipelineOptions.java



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2273) mvn clean doesn't fully clean up archetypes.

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2273:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> mvn clean doesn't fully clean up archetypes.
> 
>
> Key: BEAM-2273
> URL: https://issues.apache.org/jira/browse/BEAM-2273
> Project: Beam
>  Issue Type: Bug
>  Components: build-system
>Reporter: Jason Kuster
>Assignee: Jason Kuster
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2298) Java WordCount doesn't work in Windows OS for glob expressions or file: prefixed paths

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172175#comment-16172175
 ] 

Reuven Lax commented on BEAM-2298:
--

is this fixed?

> Java WordCount doesn't work in Windows OS for glob expressions or file: 
> prefixed paths
> -
>
> Key: BEAM-2298
> URL: https://issues.apache.org/jira/browse/BEAM-2298
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Flavio Fiszman
> Fix For: 2.2.0
>
>
> I am not able to build beam repo in Windows OS, so I copied the jar file from 
> my Mac.
> WordCount failed with the following cmd:
> java -cp beam-examples-java-2.0.0-jar-with-dependencies.jar
>  org.apache.beam.examples.WordCount --inputFile=input.txt --output=counts
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource 
> getEstimatedSizeB
> ytes
> INFO: Filepattern input.txt matched 1 files with total size 0
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource 
> expandFilePattern
> INFO: Matched 1 files for pattern input.txt
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.FileBasedSource split
> INFO: Splitting filepattern input.txt into bundles of size 0 took 0 ms and 
> produ
> ced 1 files and 0 bundles
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.WriteFiles$2 processElement
> INFO: Finalizing write operation 
> TextWriteOperation{tempDirectory=C:\Users\Pei\D
> esktop\.temp-beam-2017-05-135_13-09-48-1\, windowedWrites=false}.
> May 15, 2017 6:09:48 AM org.apache.beam.sdk.io.WriteFiles$2 processElement
> INFO: Creating 1 empty output shards in addition to 0 written for a total of 
> 1.
> Exception in thread "main" 
> org.apache.beam.sdk.Pipeline$PipelineExecutionExcepti
> on: java.lang.IllegalStateException: Unable to find registrar for c
> at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.wait
> UntilFinish(DirectRunner.java:322)
> at 
> org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.wait
> UntilFinish(DirectRunner.java:292)
> at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:200
> )
> at 
> org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:63)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:295)
> at org.apache.beam.sdk.Pipeline.run(Pipeline.java:281)
> at org.apache.beam.examples.WordCount.main(WordCount.java:184)
> Caused by: java.lang.IllegalStateException: Unable to find registrar for c
> at 
> org.apache.beam.sdk.io.FileSystems.getFileSystemInternal(FileSystems.
> java:447)
> at org.apache.beam.sdk.io.FileSystems.match(FileSystems.java:111)
> at 
> org.apache.beam.sdk.io.FileSystems.matchResources(FileSystems.java:17
> 4)
> at 
> org.apache.beam.sdk.io.FileSystems.filterMissingFiles(FileSystems.jav
> a:367)
> at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:251)
> at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles
> (FileBasedSink.java:641)
> at 
> org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBase
> dSink.java:529)
> at 
> org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:59
> 2)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
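The "Unable to find registrar for c" error in the log above is the classic Windows drive-letter pitfall: if everything before the first ':' in a path is treated as a FileSystem scheme, the "C" in `C:\Users\...` looks like a scheme named "c", for which no registrar exists. A minimal sketch of that failure mode (not Beam's actual parsing code):

```python
def naive_scheme(path):
    # Treat everything before the first ':' as the scheme, defaulting to
    # 'file' when no colon is present. This is exactly the logic that
    # misreads a Windows drive letter as a URI scheme.
    head, sep, _ = path.partition(':')
    return head.lower() if sep else 'file'
```

A correct implementation has to special-case single-letter "schemes" on Windows (or require an explicit `file:` prefix) so that drive letters fall through to the local filesystem.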


[jira] [Updated] (BEAM-2523) GCP IO exposes protobuf on its API surface, causing user pain

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2523:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> GCP IO exposes protobuf on its API surface, causing user pain
> -
>
> Key: BEAM-2523
> URL: https://issues.apache.org/jira/browse/BEAM-2523
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Kenneth Knowles
> Fix For: 2.3.0
>
>
> Putting the SDK, DataflowRunner, and GCP IO on the same classpath results in 
> (at least) three versions of protobuf getting pulled in. These should be made 
> to converge. We should consider using maven enforcer, which I think can check 
> this.
> {code}
> [INFO] com.example:foo:jar:0.1
> [INFO] +- org.apache.beam:beam-sdks-java-core:jar:2.0.0:compile
> [INFO] +- 
> org.apache.beam:beam-sdks-java-io-google-cloud-platform:jar:2.0.0:compile
> [INFO] |  +- 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:2.0.0:compile
> [INFO] |  |  \- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - 
> omitted for duplicate)
> [INFO] |  +- com.google.api.grpc:grpc-google-pubsub-v1:jar:0.1.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - 
> omitted for conflict with 3.2.0)
> [INFO] |  |  \- com.google.api.grpc:grpc-google-iam-v1:jar:0.1.0:compile
> [INFO] |  | \- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - 
> omitted for conflict with 3.2.0)
> [INFO] |  +- 
> com.google.cloud.datastore:datastore-v1-proto-client:jar:1.4.0:compile
> [INFO] |  |  +- 
> (com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile - omitted 
> for duplicate)
> [INFO] |  |  +- (com.google.http-client:google-http-client:jar:1.20.0:compile 
> - omitted for conflict with 1.22.0)
> [INFO] |  |  +- 
> com.google.http-client:google-http-client-protobuf:jar:1.20.0:compile
> [INFO] |  |  |  +- 
> (com.google.http-client:google-http-client:jar:1.20.0:compile - omitted for 
> conflict with 1.22.0)
> [INFO] |  |  |  \- (com.google.protobuf:protobuf-java:jar:2.4.1:compile - 
> omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.datastore:datastore-v1-protos:jar:1.3.0:compile
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.0.0:compile - 
> omitted for conflict with 3.2.0)
> [INFO] |  +- com.google.cloud.bigtable:bigtable-protos:jar:0.9.6.2:compile
> [INFO] |  |  +- (com.google.code.findbugs:jsr305:jar:3.0.1:compile - omitted 
> for duplicate)
> [INFO] |  |  +- (com.google.protobuf:protobuf-java:jar:3.2.0:compile - 
> omitted for duplicate)
> {code}
> Incidentally, the dependency plugin stopped supporting the verbose tree, so 
> we can't even visually inspect this except by downgrading.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (BEAM-2299) Beam repo build fails in Windows OS

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172173#comment-16172173
 ] 

Reuven Lax commented on BEAM-2299:
--

bumping this to 2.3.0

> Beam repo build fails in Windows OS
> ---
>
> Key: BEAM-2299
> URL: https://issues.apache.org/jira/browse/BEAM-2299
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Jason Kuster
> Fix For: 2.3.0
>
>
> Need to run unit tests in Windows OS.
> Currently, many unit tests fail when doing "mvn clean install" in Windows OS.
> [ERROR] Errors:
> [ERROR]   AvroSourceTest.testCreationWithSchema:403 ? IllegalState Unable to 
> fin
> d regist...
> [ERROR]   AvroSourceTest.testGetCurrentFromUnstartedReader:305 ? IllegalState 
> Un
> able to ...
> [ERROR]   AvroSourceTest.testGetProgressFromUnstartedReader:204 ? 
> IllegalState U
> nable to...
> [ERROR]   AvroSourceTest.testMultipleFiles:390 ? IllegalState Unable to find 
> reg
> istrar f...
> [ERROR]   AvroSourceTest.testProgress:225 ? IllegalState Unable to find 
> registra
> r for c
> [ERROR]   AvroSourceTest.testProgressEmptySource:278 ? IllegalState Unable to 
> fi
> nd regis...
> [ERROR]   AvroSourceTest.testReadMetadataWithCodecs:676 ? IllegalState Unable 
> to
>  find re...
> [ERROR]   AvroSourceTest.testReadSchemaString:688 ? IllegalState Unable to 
> find
> registra...
> [ERROR]   AvroSourceTest.testReadWithDifferentCodecs:158 ? IllegalState 
> Unable t
> o find r...
> [ERROR]   AvroSourceTest.testSchemaIsInterned:460 ? IllegalState Unable to 
> find
> registra...
> [ERROR]   AvroSourceTest.testSchemaStringIsInterned:441 ? IllegalState Unable 
> to
>  find re...
> [ERROR]   AvroSourceTest.testSchemaUpdate:425 ? IllegalState Unable to find 
> regi
> strar fo...
> [ERROR]   AvroSourceTest.testSplitAtFraction:176 ? IllegalState Unable to 
> find r
> egistrar...
> [ERROR]   AvroSourceTest.testSplitAtFractionExhaustive:322 ? IllegalState 
> Unable
>  to find...
> [ERROR]   AvroSourceTest.testSplitsWithSmallBlocks:341 ? IllegalState Unable 
> to
> find reg...
> [ERROR]   CompressedSourceTest.testEmptyGzipProgress:646 ? IllegalState 
> Unable t
> o find r...
> [ERROR]   CompressedSourceTest.testGzipProgress:673 ? IllegalState Unable to 
> fin
> d regist...
> [ERROR]   CompressedSourceTest.testSplittableProgress:739 ? IllegalState 
> Unable
> to find ...
> [ERROR]   CompressedSourceTest.testUncompressedFileIsSplittable:333 ? 
> IllegalSta
> te Unabl...
> [ERROR]   CompressedSourceTest.testUnsplittable:715 ? IllegalState Unable to 
> fin
> d regist...
> [ERROR]   FileBasedSinkTest.testCopyToOutputFiles:301 ? IllegalState Unable 
> to f
> ind regi...
> [ERROR]   
> FileBasedSinkTest.testFinalize:154->generateTemporaryFilesForFinalize:
> 189 ? IO
> [ERROR]   
> FileBasedSinkTest.testFinalizeMultipleCalls:161->generateTemporaryFile
> sForFinalize:189 ? IO
> [ERROR]   
> FileBasedSinkTest.testFinalizeWithIntermediateState:171->generateTempo
> raryFilesForFinalize:189 ? IO
> [ERROR]   FileBasedSourceTest.testCloseUnstartedFilePatternReader:390 ? 
> IllegalS
> tate Una...
> [ERROR]   FileBasedSourceTest.testEstimatedSizeOfFile:746 ? IllegalState 
> Unable
> to find ...
> [ERROR]   FileBasedSourceTest.testEstimatedSizeOfFilePattern:772 ? 
> IllegalState
> Unable t...
> [ERROR]   FileBasedSourceTest.testFractionConsumedWhenReadingFilepattern:422 
> ? I
> llegalState
> [ERROR]   FileBasedSourceTest.testFullyReadFilePattern:370 ? IllegalState 
> Unable
>  to find...
> [ERROR]   FileBasedSourceTest.testFullyReadFilePatternFirstRecordEmpty:461 ? 
> Ill
> egalState
> [ERROR]   FileBasedSourceTest.testFullyReadSingleFile:346 ? IllegalState 
> Unable
> to find ...
> [ERROR]   FileBasedSourceTest.testReadAllSplitsOfFilePattern:792 ? 
> IllegalState
> Unable t...
> [ERROR]   FileBasedSourceTest.testReadAllSplitsOfSingleFile:681 ? 
> IllegalState U
> nable to...
> [ERROR]   FileBasedSourceTest.testReadEverythingFromFileWithSplits:502 ? 
> Illegal
> State Un...
> [ERROR]   FileBasedSourceTest.testReadFileWithSplitsWithEmptyRange:578 ? 
> Illegal
> State Un...
> [ERROR]   FileBasedSourceTest.testReadRangeAtEnd:659 ? IllegalState Unable to 
> fi
> nd regis...
> [ERROR]   FileBasedSourceTest.testReadRangeAtMiddle:637 ? IllegalState Unable 
> to
>  find re...
> [ERROR]   FileBasedSourceTest.testReadRangeAtStart:472 ? IllegalState Unable 
> to
> find reg...
> [ERROR]   FileBasedSourceTest.testReadRangeFromFileWithSplitsFromMiddle:546 ? 
> Il
> legalState
> [ERROR]   
> FileBasedSourceTest.testReadRangeFromFileWithSplitsFromMiddleOfHeader:
> 615 ? IllegalState
> [ERROR]   FileBasedSourceTest.testReadRangeFromFileWithSplitsFromStart:517 ? 
> Ill
> egalState
> [ERROR]   FileBasedSourceTest.testSplitAtFraction:815 ? IllegalState Unable 
> to f
> ind regi...
> [ERR

[jira] [Updated] (BEAM-2299) Beam repo build fails in Windows OS

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2299:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Beam repo build fails in Windows OS
> ---
>
> Key: BEAM-2299
> URL: https://issues.apache.org/jira/browse/BEAM-2299
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Jason Kuster
> Fix For: 2.3.0
>
>
> Need to run unit tests in Windows OS.
> Currently, many unit tests fail when doing "mvn clean install" in Windows OS.
> [ERROR] Errors:
> [ERROR]   AvroSourceTest.testCreationWithSchema:403 ? IllegalState Unable to 
> fin
> d regist...
> [ERROR]   AvroSourceTest.testGetCurrentFromUnstartedReader:305 ? IllegalState 
> Un
> able to ...
> [ERROR]   AvroSourceTest.testGetProgressFromUnstartedReader:204 ? 
> IllegalState U
> nable to...
> [ERROR]   AvroSourceTest.testMultipleFiles:390 ? IllegalState Unable to find 
> reg
> istrar f...
> [ERROR]   AvroSourceTest.testProgress:225 ? IllegalState Unable to find 
> registra
> r for c
> [ERROR]   AvroSourceTest.testProgressEmptySource:278 ? IllegalState Unable to 
> fi
> nd regis...
> [ERROR]   AvroSourceTest.testReadMetadataWithCodecs:676 ? IllegalState Unable 
> to
>  find re...
> [ERROR]   AvroSourceTest.testReadSchemaString:688 ? IllegalState Unable to 
> find
> registra...
> [ERROR]   AvroSourceTest.testReadWithDifferentCodecs:158 ? IllegalState 
> Unable t
> o find r...
> [ERROR]   AvroSourceTest.testSchemaIsInterned:460 ? IllegalState Unable to 
> find
> registra...
> [ERROR]   AvroSourceTest.testSchemaStringIsInterned:441 ? IllegalState Unable 
> to
>  find re...
> [ERROR]   AvroSourceTest.testSchemaUpdate:425 ? IllegalState Unable to find 
> regi
> strar fo...
> [ERROR]   AvroSourceTest.testSplitAtFraction:176 ? IllegalState Unable to 
> find r
> egistrar...
> [ERROR]   AvroSourceTest.testSplitAtFractionExhaustive:322 ? IllegalState 
> Unable
>  to find...
> [ERROR]   AvroSourceTest.testSplitsWithSmallBlocks:341 ? IllegalState Unable 
> to
> find reg...
> [ERROR]   CompressedSourceTest.testEmptyGzipProgress:646 ? IllegalState 
> Unable t
> o find r...
> [ERROR]   CompressedSourceTest.testGzipProgress:673 ? IllegalState Unable to 
> fin
> d regist...
> [ERROR]   CompressedSourceTest.testSplittableProgress:739 ? IllegalState 
> Unable
> to find ...
> [ERROR]   CompressedSourceTest.testUncompressedFileIsSplittable:333 ? 
> IllegalSta
> te Unabl...
> [ERROR]   CompressedSourceTest.testUnsplittable:715 ? IllegalState Unable to 
> fin
> d regist...
> [ERROR]   FileBasedSinkTest.testCopyToOutputFiles:301 ? IllegalState Unable 
> to f
> ind regi...
> [ERROR]   
> FileBasedSinkTest.testFinalize:154->generateTemporaryFilesForFinalize:
> 189 ? IO
> [ERROR]   
> FileBasedSinkTest.testFinalizeMultipleCalls:161->generateTemporaryFile
> sForFinalize:189 ? IO
> [ERROR]   
> FileBasedSinkTest.testFinalizeWithIntermediateState:171->generateTempo
> raryFilesForFinalize:189 ? IO
> [ERROR]   FileBasedSourceTest.testCloseUnstartedFilePatternReader:390 ? 
> IllegalS
> tate Una...
> [ERROR]   FileBasedSourceTest.testEstimatedSizeOfFile:746 ? IllegalState 
> Unable
> to find ...
> [ERROR]   FileBasedSourceTest.testEstimatedSizeOfFilePattern:772 ? 
> IllegalState
> Unable t...
> [ERROR]   FileBasedSourceTest.testFractionConsumedWhenReadingFilepattern:422 
> ? I
> llegalState
> [ERROR]   FileBasedSourceTest.testFullyReadFilePattern:370 ? IllegalState 
> Unable
>  to find...
> [ERROR]   FileBasedSourceTest.testFullyReadFilePatternFirstRecordEmpty:461 ? 
> Ill
> egalState
> [ERROR]   FileBasedSourceTest.testFullyReadSingleFile:346 ? IllegalState 
> Unable
> to find ...
> [ERROR]   FileBasedSourceTest.testReadAllSplitsOfFilePattern:792 ? 
> IllegalState
> Unable t...
> [ERROR]   FileBasedSourceTest.testReadAllSplitsOfSingleFile:681 ? 
> IllegalState U
> nable to...
> [ERROR]   FileBasedSourceTest.testReadEverythingFromFileWithSplits:502 ? 
> Illegal
> State Un...
> [ERROR]   FileBasedSourceTest.testReadFileWithSplitsWithEmptyRange:578 ? 
> Illegal
> State Un...
> [ERROR]   FileBasedSourceTest.testReadRangeAtEnd:659 ? IllegalState Unable to 
> fi
> nd regis...
> [ERROR]   FileBasedSourceTest.testReadRangeAtMiddle:637 ? IllegalState Unable 
> to
>  find re...
> [ERROR]   FileBasedSourceTest.testReadRangeAtStart:472 ? IllegalState Unable 
> to
> find reg...
> [ERROR]   FileBasedSourceTest.testReadRangeFromFileWithSplitsFromMiddle:546 ? 
> Il
> legalState
> [ERROR]   
> FileBasedSourceTest.testReadRangeFromFileWithSplitsFromMiddleOfHeader:
> 615 ? IllegalState
> [ERROR]   FileBasedSourceTest.testReadRangeFromFileWithSplitsFromStart:517 ? 
> Ill
> egalState
> [ERROR]   FileBasedSourceTest.testSplitAtFraction:815 ? IllegalState Unable 
> to f
> ind regi...
> [ERROR]   FileBasedSour

[jira] [Commented] (BEAM-2604) Delegate beam metrics to runners

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172172#comment-16172172
 ] 

Reuven Lax commented on BEAM-2604:
--

Does this need to block the 2.2.0 cut?

> Delegate beam metrics to runners
> 
>
> Key: BEAM-2604
> URL: https://issues.apache.org/jira/browse/BEAM-2604
> Project: Beam
>  Issue Type: Sub-task
>  Components: runner-flink, runner-spark
>Reporter: Cody
>Assignee: Aljoscha Krettek
> Fix For: 2.2.0
>
>
> Delegate beam metrics to runners to avoid forwarding updates, i.e., extract 
> updates from beam metrics and commit updates in runners.
> For Flink/Spark runners, we can reference metrics within the runner's metrics 
> system in Beam pipelines and update them directly without forwarding.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (BEAM-2865) Implement FileIO.write()

2017-09-19 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax updated BEAM-2865:
-
Fix Version/s: (was: 2.2.0)
   2.3.0

> Implement FileIO.write()
> 
>
> Key: BEAM-2865
> URL: https://issues.apache.org/jira/browse/BEAM-2865
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
> Fix For: 2.3.0
>
>
> Design doc: http://s.apache.org/fileio-write
> Discussion: 
> https://lists.apache.org/thread.html/cc543556cc709a44ed92262207215eaa0e43a0f573c630b6360d4edc@%3Cdev.beam.apache.org%3E



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : beam_PostCommit_Python_Verify #3164

2017-09-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2870) BQ Partitioned Table Write Fails When Destination has Partition Decorator

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172150#comment-16172150
 ] 

ASF GitHub Bot commented on BEAM-2870:
--

GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/3866

[BEAM-2870] Strip table decorators before creating tables.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam BEAM-2870

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3866.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3866


commit 287488914f63e63303dbc3ce9f5371078a8be00c
Author: Reuven Lax 
Date:   2017-09-19T18:38:13Z

Strip table decorators before creating tables.




> BQ Partitioned Table Write Fails When Destination has Partition Decorator
> -
>
> Key: BEAM-2870
> URL: https://issues.apache.org/jira/browse/BEAM-2870
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
> Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500gb 
> SSD)
>Reporter: Steven Jon Anderson
>Assignee: Reuven Lax
>  Labels: bigquery, dataflow, google, google-cloud-bigquery, 
> google-dataflow
> Fix For: 2.2.0
>
>
> Dataflow Job ID: 
> https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816
> Tagging [~reuvenlax] as I believe he built the time partitioning integration 
> that was merged into master.
> *Background*
> Our production pipeline ingests millions of events per day and routes events 
> into our clients' numerous tables. To keep costs down, all of our tables are 
> partitioned. However, this requires that we create the tables before we allow 
> events to be processed, as creating partitioned tables isn't supported in 2.1.0. 
> We've been looking forward to [~reuvenlax]'s partition table write feature 
> ([#3663|https://github.com/apache/beam/pull/3663]) to get merged into master 
> for some time now as it'll allow us to launch our client platforms much, much 
> faster. Today we got around to testing the 2.2.0 nightly and discovered this 
> bug.
> *Issue*
> Our pipeline writes to a table with a decorator. When attempting to write to 
> an existing partitioned table with a decorator, the write succeeds. When 
> using a partitioned table destination that doesn't exist without a decorator, 
> the write succeeds. *However, when writing to a partitioned table that 
> doesn't exist with a decorator, the write fails*. 
> *Example Implementation*
> {code:java}
> BigQueryIO.writeTableRows()
>   .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>   .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>   .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
>   .to(new DynamicDestinations() {
> @Override
> public String getDestination(ValueInSingleWindow element) {
>   return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
> }
> @Override
> public TableDestination getTable(String destination) {
>   TimePartitioning DAY_PARTITION = new TimePartitioning().setType("DAY");
>   return new TableDestination(destination, null, DAY_PARTITION);
> }
> @Override
> public TableSchema getSchema(String destination) {
>   return TABLE_SCHEMA;
> }
>   })
> {code}
> *Relevant Logs & Errors in StackDriver*
> {code:none}
> 23:06:26.790 
> Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
> 23:06:26.873 
> Invalid table ID \"TABLE_ID$20170902\". Table IDs must be alphanumeric (plus 
> underscores) and must be at most 1024 characters long. Also, Table decorators 
> cannot be used.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
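The fix in #3866 ("strip table decorators before creating tables") can be sketched as follows. This is a hypothetical Python helper mirroring the idea, not the Java code from the PR: drop a trailing `$YYYYMMDD` partition decorator before issuing the table-creation request, while the write itself keeps the decorated ID so rows land in the right partition.

```python
import re

def strip_partition_decorator(table_id):
    # Remove a trailing '$<digits>' partition decorator, if present, so the
    # creation request sees a plain table ID that passes BigQuery's
    # alphanumeric-plus-underscores validation.
    return re.sub(r'\$\d+$', '', table_id)

create_target = strip_partition_decorator('PROJECT_ID:DATASET_ID.TABLE_ID$20170902')
```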


[jira] [Commented] (BEAM-2870) BQ Partitioned Table Write Fails When Destination has Partition Decorator

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172151#comment-16172151
 ] 

Reuven Lax commented on BEAM-2870:
--

Can you try out this PR and let us know if it fixes the issue you are seeing?

> BQ Partitioned Table Write Fails When Destination has Partition Decorator
> -
>
> Key: BEAM-2870
> URL: https://issues.apache.org/jira/browse/BEAM-2870
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
> Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500gb 
> SSD)
>Reporter: Steven Jon Anderson
>Assignee: Reuven Lax
>  Labels: bigquery, dataflow, google, google-cloud-bigquery, 
> google-dataflow
> Fix For: 2.2.0
>
>
> Dataflow Job ID: 
> https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816
> Tagging [~reuvenlax] as I believe he built the time partitioning integration 
> that was merged into master.
> *Background*
> Our production pipeline ingests millions of events per day and routes events 
> into our clients' numerous tables. To keep costs down, all of our tables are 
> partitioned. However, this requires that we create the tables before we allow 
> events to be processed, as creating partitioned tables isn't supported in 2.1.0. 
> We've been looking forward to [~reuvenlax]'s partition table write feature 
> ([#3663|https://github.com/apache/beam/pull/3663]) to get merged into master 
> for some time now as it'll allow us to launch our client platforms much, much 
> faster. Today we got around to testing the 2.2.0 nightly and discovered this 
> bug.
> *Issue*
> Our pipeline writes to a table with a decorator. When attempting to write to 
> an existing partitioned table with a decorator, the write succeeds. When 
> using a partitioned table destination that doesn't exist without a decorator, 
> the write succeeds. *However, when writing to a partitioned table that 
> doesn't exist with a decorator, the write fails*. 
> *Example Implementation*
> {code:java}
> BigQueryIO.writeTableRows()
>   .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>   .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>   .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
>   .to(new DynamicDestinations() {
> @Override
> public String getDestination(ValueInSingleWindow element) {
>   return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
> }
> @Override
> public TableDestination getTable(String destination) {
>   TimePartitioning DAY_PARTITION = new TimePartitioning().setType("DAY");
>   return new TableDestination(destination, null, DAY_PARTITION);
> }
> @Override
> public TableSchema getSchema(String destination) {
>   return TABLE_SCHEMA;
> }
>   })
> {code}
> *Relevant Logs & Errors in StackDriver*
> {code:none}
> 23:06:26.790 
> Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
> 23:06:26.873 
> Invalid table ID \"TABLE_ID$20170902\". Table IDs must be alphanumeric (plus 
> underscores) and must be at most 1024 characters long. Also, Table decorators 
> cannot be used.
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3866: [BEAM-2870] Strip table decorators before creating ...

2017-09-19 Thread reuvenlax
GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/3866

[BEAM-2870] Strip table decorators before creating tables.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam BEAM-2870

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3866.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3866


commit 287488914f63e63303dbc3ce9f5371078a8be00c
Author: Reuven Lax 
Date:   2017-09-19T18:38:13Z

Strip table decorators before creating tables.




---


[2/2] beam git commit: [BEAM-2964] Exclude incompatible six release.

2017-09-19 Thread robertwb
[BEAM-2964] Exclude incompatible six release.

Upstream bugs being tracked at https://github.com/google/apitools/issues/175 
and https://github.com/benjaminp/six/issues/210


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b3a5b67b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b3a5b67b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b3a5b67b

Branch: refs/heads/master
Commit: b3a5b67b25de7e98292d86484aaca1c978952ff0
Parents: c25bead
Author: Robert Bradshaw 
Authored: Tue Sep 19 10:41:43 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 10:57:08 2017 -0700

--
 sdks/python/setup.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b3a5b67b/sdks/python/setup.py
--
diff --git a/sdks/python/setup.py b/sdks/python/setup.py
index c13da8e..2bc2e99 100644
--- a/sdks/python/setup.py
+++ b/sdks/python/setup.py
@@ -114,7 +114,9 @@ REQUIRED_SETUP_PACKAGES = [
 REQUIRED_TEST_PACKAGES = [
 'pyhamcrest>=1.9,<2.0',
 # Six required by nose plugins management.
-'six>=1.9',
+# Six 1.11.0 incompatible with apitools.
+# TODO(BEAM-2964): Remove the upper bound.
+'six>=1.9,<1.11',
 ]
 
 GCP_REQUIREMENTS = [
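As a side note (not part of the patch), the new `six>=1.9,<1.11` specifier keeps the 1.9.x and 1.10.x lines while excluding the 1.11.0 release that breaks apitools. A rough sketch of the bound check pip performs, using simplified integer-tuple versions:

```python
def parse(version):
    # Simplified: treat a version as a dot-separated tuple of integers
    # (real pip matching follows PEP 440, which also handles pre-releases).
    return tuple(int(part) for part in version.split("."))

def satisfies(version):
    # The setup.py constraint six>=1.9,<1.11, expressed as tuple bounds.
    return parse("1.9") <= parse(version) < parse("1.11")

for candidate in ("1.9.0", "1.10.0", "1.11.0"):
    print(candidate, "allowed" if satisfies(candidate) else "excluded")
```

This only illustrates why 1.10.0 still installs while 1.11.0 is excluded until the upstream apitools/six bugs are resolved.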



[jira] [Commented] (BEAM-2964) Latest six (1.11.0) produces "metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases"

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172095#comment-16172095
 ] 

ASF GitHub Bot commented on BEAM-2964:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3865


> Latest six (1.11.0) produces "metaclass conflict: the metaclass of a derived 
> class must be a (non-strict) subclass of the metaclasses of all its bases"
> ---
>
> Key: BEAM-2964
> URL: https://issues.apache.org/jira/browse/BEAM-2964
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.1.0
> Environment: Python 2.7 in a virtualenv on MacOS
>Reporter: Steven Normore
>Assignee: Ahmet Altay
>
> $ virtualenv venv
> [...]
> $ source venv/bin/activate
> [...]
> $ pip install apache-beam[gcp]==2.1.0 six==1.11.0
> [...]
> $ python -c 'import apache_beam'
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/__init__.py",
>  line 78, in 
> from apache_beam import io
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/__init__.py",
>  line 21, in 
> from apache_beam.io.avroio import *
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/avroio.py",
>  line 29, in 
> from apache_beam.io import filebasedsource
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/filebasedsource.py",
>  line 33, in 
> from apache_beam.io.filesystems import FileSystems
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/filesystems.py",
>  line 31, in 
> from apache_beam.io.gcp.gcsfilesystem import GCSFileSystem
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/gcp/gcsfilesystem.py",
>  line 27, in 
> from apache_beam.io.gcp import gcsio
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/gcp/gcsio.py",
>  line 36, in 
> from apache_beam.utils import retry
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/utils/retry.py",
>  line 38, in 
> from apitools.base.py.exceptions import HttpError
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apitools/base/py/__init__.py",
>  line 21, in 
> from apitools.base.py.base_api import *
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 31, in 
> from apitools.base.protorpclite import message_types
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apitools/base/protorpclite/message_types.py",
>  line 25, in 
> from apitools.base.protorpclite import messages
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apitools/base/protorpclite/messages.py",
>  line 1165, in 
> class Field(six.with_metaclass(_FieldMeta, object)):
> TypeError: Error when calling the metaclass bases
> metaclass conflict: the metaclass of a derived class must be a 
> (non-strict) subclass of the metaclasses of all its bases
> (venv)
> $ pip install six==1.10.0
> [...]
> $ python -c 'import apache_beam'
> [success]
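The quoted TypeError is ordinary Python behavior rather than anything six-specific: a class cannot derive from bases whose metaclasses are unrelated. A minimal reproduction (Python 3 syntax for brevity; the report above uses Python 2.7):

```python
class MetaA(type):
    pass

class MetaB(type):
    pass

class A(metaclass=MetaA):
    pass

class B(metaclass=MetaB):
    pass

# A subclass of both A and B would need a metaclass that subclasses
# both MetaA and MetaB; none exists, so Python raises the same
# "metaclass conflict" TypeError quoted in the traceback above.
try:
    class C(A, B):
        pass
except TypeError as err:
    print(err)
```

In the apitools case the conflict arises because six 1.11.0 changed how `with_metaclass` constructs its temporary metaclass, which is why pinning six below 1.11 sidesteps the error.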





[1/2] beam git commit: Closes #3865

2017-09-19 Thread robertwb
Repository: beam
Updated Branches:
  refs/heads/master c25bead55 -> 5dab9e109


Closes #3865


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/5dab9e10
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/5dab9e10
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/5dab9e10

Branch: refs/heads/master
Commit: 5dab9e109cffefb69ac6be22b2d902d5579797e1
Parents: c25bead b3a5b67
Author: Robert Bradshaw 
Authored: Tue Sep 19 10:57:08 2017 -0700
Committer: Robert Bradshaw 
Committed: Tue Sep 19 10:57:08 2017 -0700

--
 sdks/python/setup.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--




[GitHub] beam pull request #3865: [BEAM-2964] Exclude incompatible six release.

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3865


---


[jira] [Commented] (BEAM-2870) BQ Partitioned Table Write Fails When Destination has Partition Decorator

2017-09-19 Thread Reuven Lax (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172081#comment-16172081
 ] 

Reuven Lax commented on BEAM-2870:
--

I think the problem is that we need to strip the partition decorator off the 
table name before attempting to create the table. I believe that this bug 
affects the streaming-insert path only.
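As a rough sketch of the fix Reuven describes (in Python, with hypothetical names; the actual change is in Beam's Java BigQuery sink), the `$YYYYMMDD` decorator is dropped from the table reference before the create-table call, while the decorated reference still routes the streaming insert:

```python
def strip_partition_decorator(table_id):
    """Drop a trailing '$<suffix>' partition decorator from a table id.

    Table creation rejects decorated ids ("Table decorators cannot be
    used"), so the decorator must be removed before the create call.
    """
    base, _sep, _decorator = table_id.partition("$")
    return base

# The decorated id is still what routes rows to the right partition;
# only the create-table request uses the stripped id.
print(strip_partition_decorator("PROJECT_ID:DATASET_ID.TABLE_ID$20170902"))
```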

> BQ Partitioned Table Write Fails When Destination has Partition Decorator
> -
>
> Key: BEAM-2870
> URL: https://issues.apache.org/jira/browse/BEAM-2870
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.2.0
> Environment: Dataflow Runner, Streaming, 10 x (n1-highmem-8 & 500gb 
> SDD)
>Reporter: Steven Jon Anderson
>Assignee: Reuven Lax
>  Labels: bigquery, dataflow, google, google-cloud-bigquery, 
> google-dataflow
> Fix For: 2.2.0
>
>
> Dataflow Job ID: 
> https://console.cloud.google.com/dataflow/job/2017-09-08_23_03_14-14637186041605198816
> Tagging [~reuvenlax] as I believe he built the time partitioning integration 
> that was merged into master.
> *Background*
> Our production pipeline ingests millions of events per day and routes events 
> into our clients' numerous tables. To keep costs down, all of our tables are 
> partitioned. However, this requires that we create the tables before we allow 
> events to process as creating partitioned tables isn't supported in 2.1.0. 
> We've been looking forward to [~reuvenlax]'s partition table write feature 
> ([#3663|https://github.com/apache/beam/pull/3663]) to get merged into master 
> for some time now as it'll allow us to launch our client platforms much, much 
> faster. Today we got around to testing the 2.2.0 nightly and discovered this 
> bug.
> *Issue*
> Our pipeline writes to a table with a decorator. Writing to an existing 
> partitioned table with a decorator succeeds, and writing to a non-existent 
> partitioned table without a decorator also succeeds. *However, writing to a 
> non-existent partitioned table with a decorator fails*. 
> *Example Implementation*
> {code:java}
> BigQueryIO.writeTableRows()
>   .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
>   .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
>   .withFailedInsertRetryPolicy(InsertRetryPolicy.alwaysRetry())
>   .to(new DynamicDestinations() {
> @Override
> public String getDestination(ValueInSingleWindow element) {
>   return "PROJECT_ID:DATASET_ID.TABLE_ID$20170902";
> }
> @Override
> public TableDestination getTable(String destination) {
>   TimePartitioning DAY_PARTITION = new TimePartitioning().setType("DAY");
>   return new TableDestination(destination, null, DAY_PARTITION);
> }
> @Override
> public TableSchema getSchema(String destination) {
>   return TABLE_SCHEMA;
> }
>   })
> {code}
> *Relevant Logs & Errors in StackDriver*
> {code:none}
> 23:06:26.790 
> Trying to create BigQuery table: PROJECT_ID:DATASET_ID.TABLE_ID$20170902
> 23:06:26.873 
> Invalid table ID \"TABLE_ID$20170902\". Table IDs must be alphanumeric (plus 
> underscores) and must be at most 1024 characters long. Also, Table decorators 
> cannot be used.
> {code}





[jira] [Commented] (BEAM-2964) Latest six (1.11.0) produces "metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases"

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172077#comment-16172077
 ] 

ASF GitHub Bot commented on BEAM-2964:
--

GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/3865

[BEAM-2964] Exclude incompatible six release.

Upstream bugs being tracked at 
https://github.com/google/apitools/issues/175 and 
https://github.com/benjaminp/six/issues/210

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam BEAM-2964-six

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3865.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3865


commit 414cfbbd7a8d032ceff314cf6d07946f3269f05a
Author: Robert Bradshaw 
Date:   2017-09-19T17:41:43Z

[BEAM-2964] Exclude incompatible six release.

Upstream bugs being tracked at 
https://github.com/google/apitools/issues/175 and 
https://github.com/benjaminp/six/issues/210




> Latest six (1.11.0) produces "metaclass conflict: the metaclass of a derived 
> class must be a (non-strict) subclass of the metaclasses of all its bases"
> ---
>
> Key: BEAM-2964
> URL: https://issues.apache.org/jira/browse/BEAM-2964
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py-core
>Affects Versions: 2.1.0
> Environment: Python 2.7 in a virtualenv on MacOS
>Reporter: Steven Normore
>Assignee: Ahmet Altay
>
> $ virtualenv venv
> [...]
> $ source venv/bin/activate
> [...]
> $ pip install apache-beam[gcp]==2.1.0 six==1.11.0
> [...]
> $ python -c 'import apache_beam'
> Traceback (most recent call last):
>   File "", line 1, in 
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/__init__.py",
>  line 78, in 
> from apache_beam import io
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/__init__.py",
>  line 21, in 
> from apache_beam.io.avroio import *
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/avroio.py",
>  line 29, in 
> from apache_beam.io import filebasedsource
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/filebasedsource.py",
>  line 33, in 
> from apache_beam.io.filesystems import FileSystems
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/filesystems.py",
>  line 31, in 
> from apache_beam.io.gcp.gcsfilesystem import GCSFileSystem
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/gcp/gcsfilesystem.py",
>  line 27, in 
> from apache_beam.io.gcp import gcsio
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/io/gcp/gcsio.py",
>  line 36, in 
> from apache_beam.utils import retry
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apache_beam/utils/retry.py",
>  line 38, in 
> from apitools.base.py.exceptions import HttpError
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apitools/base/py/__init__.py",
>  line 21, in 
> from apitools.base.py.base_api import *
>   File 
> "/Users/snormore/Workspace/beam-six/venv/lib/python2.7/site-packages/apitools/base/py/base_api.py",
>  line 31, in 
> from apitools.base.protorpclite import message_types
>   

[GitHub] beam pull request #3865: [BEAM-2964] Exclude incompatible six release.

2017-09-19 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/3865

[BEAM-2964] Exclude incompatible six release.

Upstream bugs being tracked at 
https://github.com/google/apitools/issues/175 and 
https://github.com/benjaminp/six/issues/210

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam BEAM-2964-six

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3865.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3865


commit 414cfbbd7a8d032ceff314cf6d07946f3269f05a
Author: Robert Bradshaw 
Date:   2017-09-19T17:41:43Z

[BEAM-2964] Exclude incompatible six release.

Upstream bugs being tracked at 
https://github.com/google/apitools/issues/175 and 
https://github.com/benjaminp/six/issues/210




---


[jira] [Commented] (BEAM-2834) NullPointerException @ BigQueryServicesImpl.java:759

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16172075#comment-16172075
 ] 

ASF GitHub Bot commented on BEAM-2834:
--

GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/3864

[BEAM-2834] Ensure that we never pass in a null retry policy

R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam BEAM-2834

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3864.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3864


commit 2bc43a2e13981d39eca1d096a82a504a651e646d
Author: Reuven Lax 
Date:   2017-09-19T17:40:12Z

Make sure that we default to alwaysRetry instead of passing in a null retry 
policy.




> NullPointerException @ BigQueryServicesImpl.java:759
> 
>
> Key: BEAM-2834
> URL: https://issues.apache.org/jira/browse/BEAM-2834
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 2.1.0
>Reporter: Andy Barron
>Assignee: Reuven Lax
> Fix For: 2.2.0
>
>
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:759)
>   at 
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryServicesImpl$DatasetServiceImpl.insertAll(BigQueryServicesImpl.java:809)
>   at 
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.flushRows(StreamingWriteFn.java:126)
>   at 
> org.apache.beam.sdk.io.gcp.bigquery.StreamingWriteFn.finishBundle(StreamingWriteFn.java:96)
> {code}
> Going through the stack trace, the likely culprit is a null {{retryPolicy}} 
> in {{StreamingWriteFn}}.
> For context, this error showed up about 70 times between 1 am and 1 pm 
> Pacific time (2017-08-31) on a Dataflow streaming job.
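The defaulting behavior the fix introduces can be sketched as follows (Python pseudocode with hypothetical names; the real change is in Beam's Java `StreamingWriteFn`): when no retry policy is supplied, fall back to an always-retry policy instead of carrying a null through to `insertAll`.

```python
ALWAYS_RETRY = "alwaysRetry"  # stand-in for InsertRetryPolicy.alwaysRetry()

def resolve_retry_policy(policy):
    # Defaulting at construction time guarantees the insert path never
    # sees None, avoiding the NullPointerException in insertAll.
    return policy if policy is not None else ALWAYS_RETRY

print(resolve_retry_policy(None))
print(resolve_retry_policy("retryTransientErrors"))
```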





[GitHub] beam pull request #3864: [BEAM-2834] Ensure that we never pass in a null ret...

2017-09-19 Thread reuvenlax
GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/3864

[BEAM-2834] Ensure that we never pass in a null retry policy

R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam BEAM-2834

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3864.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3864


commit 2bc43a2e13981d39eca1d096a82a504a651e646d
Author: Reuven Lax 
Date:   2017-09-19T17:40:12Z

Make sure that we default to alwaysRetry instead of passing in a null retry 
policy.




---


[GitHub] beam pull request #3863: [Beam-2858] Ensure that file deletion happens consi...

2017-09-19 Thread reuvenlax
GitHub user reuvenlax opened a pull request:

https://github.com/apache/beam/pull/3863

[Beam-2858] Ensure that file deletion happens consistently, and only after 
table loads are completed

We do this by adding a GroupByKey between the BQ load and the file 
deletion. Once the @StableReplay proposal is implemented, we will replace this 
with a solution that uses it instead.
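Roughly speaking, the GroupByKey acts as a barrier here: grouped output is only emitted once all upstream input has been processed, so the deletion step cannot observe a load that has not finished. A toy Python sketch of the idea, with hypothetical names:

```python
from collections import defaultdict

def group_by_key(pairs):
    # Grouped results are only produced after every input pair has been
    # consumed, which is what makes GroupByKey act as a barrier.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)

# Upstream stage: each finished BQ load emits (load_job, temp_file).
load_results = [("job-1", "gs://tmp/a"), ("job-1", "gs://tmp/b")]

# Downstream stage: temp files are deleted only once the group is complete.
for job, files in group_by_key(load_results).items():
    print("load", job, "finished; deleting", files)
```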

R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/reuvenlax/incubator-beam BEAM-2858

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3863.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3863


commit c203331efdd24551e5f7270c55507791fdc9367b
Author: Reuven Lax 
Date:   2017-09-19T17:10:18Z

Fix bug.

commit 6cfacb8a6324d9b5a69c7d1dae82e6c1581d85af
Author: Reuven Lax 
Date:   2017-09-19T17:17:51Z

Some fixups.




---


[jira] [Commented] (BEAM-2500) Add support for S3 as a Apache Beam FileSystem

2017-09-19 Thread Jacob Marble (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171952#comment-16171952
 ] 

Jacob Marble commented on BEAM-2500:


Thanks, Steve. Fixed.

> Add support for S3 as a Apache Beam FileSystem
> --
>
> Key: BEAM-2500
> URL: https://issues.apache.org/jira/browse/BEAM-2500
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Luke Cwik
>Priority: Minor
> Attachments: hadoop_fs_patch.patch
>
>
> Note that this is for providing direct integration with S3 as an Apache Beam 
> FileSystem.
> There is already support for using the Hadoop S3 connector by depending on 
> the Hadoop File System module[1], configuring HadoopFileSystemOptions[2] with 
> a S3 configuration[3].
> 1: https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system
> 2: 
> https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.java#L53
> 3: https://wiki.apache.org/hadoop/AmazonS3





[jira] [Commented] (BEAM-2500) Add support for S3 as a Apache Beam FileSystem

2017-09-19 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171891#comment-16171891
 ] 

Steve Loughran commented on BEAM-2500:
--

Looking at the code, the main thing I'd highlight is that using 
{{Regions.fromName(region)}} means that you can't support any region which isn't 
explicitly handled in the version of the SDK you build & ship, or handle 
non-AWS implementations. Best just to let people declare the endpoint, and let 
them deal with the stack traces if they get it wrong.
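Steve's point, sketched in Python with hypothetical names: an enum-style region lookup can only accept regions baked into the shipped SDK, while a user-declared endpoint passes through unchanged and so also covers new regions and non-AWS S3-compatible services:

```python
KNOWN_REGIONS = {"us-east-1", "us-west-2", "eu-west-1"}  # frozen at build time

def endpoint_from_region(region):
    # Enum-style lookup: any region added after the SDK shipped, or any
    # non-AWS S3-compatible service, is simply unreachable.
    if region not in KNOWN_REGIONS:
        raise ValueError("unknown region: " + region)
    return "s3.%s.amazonaws.com" % region

def endpoint_from_config(endpoint):
    # Declared endpoint: passed through untouched, so new regions and
    # third-party implementations work without an SDK update.
    return endpoint

print(endpoint_from_config("s3.eu-north-1.amazonaws.com"))
```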

> Add support for S3 as a Apache Beam FileSystem
> --
>
> Key: BEAM-2500
> URL: https://issues.apache.org/jira/browse/BEAM-2500
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Luke Cwik
>Priority: Minor
> Attachments: hadoop_fs_patch.patch
>
>
> Note that this is for providing direct integration with S3 as an Apache Beam 
> FileSystem.
> There is already support for using the Hadoop S3 connector by depending on 
> the Hadoop File System module[1], configuring HadoopFileSystemOptions[2] with 
> a S3 configuration[3].
> 1: https://github.com/apache/beam/tree/master/sdks/java/io/hadoop-file-system
> 2: 
> https://github.com/apache/beam/blob/master/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemOptions.java#L53
> 3: https://wiki.apache.org/hadoop/AmazonS3





Build failed in Jenkins: beam_PostCommit_Python_Verify #3163

2017-09-19 Thread Apache Jenkins Server
See 


--
[...truncated 45.40 KB...]
Collecting mock<3.0.0,>=1.0.1 (from apache-beam==2.2.0.dev0)
  Using cached mock-2.0.0-py2.py3-none-any.whl
Collecting oauth2client<4.0.0,>=2.0.1 (from apache-beam==2.2.0.dev0)
Collecting protobuf<=3.3.0,>=3.2.0 (from apache-beam==2.2.0.dev0)
  Using cached protobuf-3.3.0-cp27-cp27mu-manylinux1_x86_64.whl
Collecting pyyaml<4.0.0,>=3.12 (from apache-beam==2.2.0.dev0)
Collecting typing<3.7.0,>=3.6.0 (from apache-beam==2.2.0.dev0)
  Using cached typing-3.6.2-py2-none-any.whl
Requirement already satisfied: enum34>=1.0.4 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: six>=1.5.2 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Requirement already satisfied: futures>=2.2.0 in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
grpcio<2.0,>=1.0->apache-beam==2.2.0.dev0)
Collecting funcsigs>=1; python_version < "3.3" (from 
mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached funcsigs-1.0.2-py2.py3-none-any.whl
Collecting pbr>=0.11 (from mock<3.0.0,>=1.0.1->apache-beam==2.2.0.dev0)
  Using cached pbr-3.1.1-py2.py3-none-any.whl
Collecting pyasn1>=0.1.7 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1-0.3.5-py2.py3-none-any.whl
Collecting pyasn1-modules>=0.0.5 (from 
oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached pyasn1_modules-0.1.4-py2.py3-none-any.whl
Collecting rsa>=3.1.4 (from oauth2client<4.0.0,>=2.0.1->apache-beam==2.2.0.dev0)
  Using cached rsa-3.4.2-py2.py3-none-any.whl
Requirement already satisfied: setuptools in 
./target/.tox/py27cython/lib/python2.7/site-packages (from 
protobuf<=3.3.0,>=3.2.0->apache-beam==2.2.0.dev0)
Building wheels for collected packages: apache-beam
  Running setup.py bdist_wheel for apache-beam: started
  Running setup.py bdist_wheel for apache-beam: finished with status 'error'
  Complete output from command 

 -u -c "import setuptools, 
tokenize;__file__='/tmp/pip-KnWc7E-build/setup.py';f=getattr(tokenize, 'open', 
open)(__file__);code=f.read().replace('\r\n', 
'\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d 
/tmp/tmp6P3z6Ppip-wheel- --python-tag cp27:
  
:351:
 UserWarning: Normalizing '2.2.0.dev' to '2.2.0.dev0'
normalized_version,
  running bdist_wheel
  running build
  running build_py
  Traceback (most recent call last):
File "", line 1, in 
File "/tmp/pip-KnWc7E-build/setup.py", line 198, in 
  'test': generate_protos_first(test),
File "/usr/lib/python2.7/distutils/core.py", line 151, in setup
  dist.run_commands()
File "/usr/lib/python2.7/distutils/dist.py", line 953, in run_commands
  self.run_command(cmd)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File 
"
 line 204, in run
  self.run_command('build')
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/usr/lib/python2.7/distutils/command/build.py", line 128, in run
  self.run_command(cmd_name)
File "/usr/lib/python2.7/distutils/cmd.py", line 326, in run_command
  self.distribution.run_command(command)
File "/usr/lib/python2.7/distutils/dist.py", line 972, in run_command
  cmd_obj.run()
File "/tmp/pip-KnWc7E-build/setup.py", line 138, in run
  gen_protos.generate_proto_files()
File "gen_protos.py", line 65, in generate_proto_files
  'Not in apache git tree; unable to find proto definitions.')
  RuntimeError: Not in apache git tree; unable to find proto definitions.
  
  
  Failed building wheel for apache-beam
  Running setup.py clean for apache-beam
Failed to build apache-beam
Installing collected packages: avro, crcmod, dill, httplib2, funcsigs, pbr, 
mock, pyasn1, pyasn1-modules, rsa, oauth2client, protobuf, pyyaml, typing, 
apache-beam
  Found existing installation: protobuf 3.4.0
Uninstalling protobuf-3.4.0:
  Successfully uninstalled protobuf-3.4.0
  Running setup.py install for apache-beam: started
Running setup.py install for apache-beam: finished with status 'error'
Complete output from command 


Build failed in Jenkins: beam_PostCommit_Python_Verify #3162

2017-09-19 Thread Apache Jenkins Server
See 


Changes:

[klk] Move DoFnInfo to SDK util

[klk] Update Dataflow container version to 20170918

--
[...truncated 457.23 KB...]
test_getitem_duplicates_ignored 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_getitem_must_be_valid_type_param 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_getitem_must_be_valid_type_param_cant_be_object_instance 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_getitem_nested_unions_flattened 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_nested_compatibility 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_union_hint_compatibility 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_union_hint_enforcement_composite_type_in_union 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_union_hint_enforcement_not_part_of_union 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_union_hint_enforcement_part_of_union 
(apache_beam.typehints.typehints_test.UnionHintTestCase) ... ok
test_union_hint_repr (apache_beam.typehints.typehints_test.UnionHintTestCase) 
... ok
test_deprecated_with_since_current 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_deprecated_with_since_current_message 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_deprecated_without_current 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_deprecated_without_since_should_fail 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_experimental_with_current 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_experimental_with_current_message 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_experimental_without_current 
(apache_beam.utils.annotations_test.AnnotationTests) ... ok
test_frequency (apache_beam.utils.annotations_test.AnnotationTests)
Tests that the filter 'once' is sufficient to print once per ... ok
test_equal_objects (apache_beam.utils.counters_test.CounterNameTest) ... ok
test_hash_two_objects (apache_beam.utils.counters_test.CounterNameTest) ... ok
test_method_forwarding_not_windows (apache_beam.utils.processes_test.Exec) ... 
ok
test_method_forwarding_windows (apache_beam.utils.processes_test.Exec) ... ok
test_call_two_objects (apache_beam.utils.retry_test.RetryStateTest) ... ok
test_single_failure (apache_beam.utils.retry_test.RetryStateTest) ... ok
test_two_failures (apache_beam.utils.retry_test.RetryStateTest) ... ok
test_log_calls_for_permanent_failure (apache_beam.utils.retry_test.RetryTest) 
... ok
test_log_calls_for_transient_failure (apache_beam.utils.retry_test.RetryTest) 
... ok
test_with_default_number_of_retries (apache_beam.utils.retry_test.RetryTest) 
... ok
test_with_explicit_decorator (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_explicit_initial_delay (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_explicit_number_of_retries (apache_beam.utils.retry_test.RetryTest) 
... ok
test_with_http_error_that_should_be_retried 
(apache_beam.utils.retry_test.RetryTest) ... SKIP: google-apitools is not 
installed
test_with_http_error_that_should_not_be_retried 
(apache_beam.utils.retry_test.RetryTest) ... SKIP: google-apitools is not 
installed
test_with_no_retry_decorator (apache_beam.utils.retry_test.RetryTest) ... ok
test_with_real_clock (apache_beam.utils.retry_test.RetryTest) ... ok
test_arithmetic (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_of (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_precision (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_str (apache_beam.utils.timestamp_test.DurationTest) ... ok
test_arithmetic (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_of (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_precision (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_sort_order (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_str (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_utc_timestamp (apache_beam.utils.timestamp_test.TimestampTest) ... ok
test_equality (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_hash (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_pickle (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_timestamps (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_with_value (apache_beam.utils.windowed_value_test.WindowedValueTest) ... ok
test_retry_fork_graph (apache_beam.pipeline_test.DirectRunnerRetryTests) ... ok
test_element (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_no_tag (apache_beam.pipeline_test.DoFnTest) ... ok
test_side_input_tagged (apache_beam.pipeline_test.DoFnTest) ... ok
test_t
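The RetryTest cases in the log above exercise `apache_beam.utils.retry`, a Python utility that retries an operation with a configurable number of retries and initial delay, backing off exponentially. As a hedged, language-neutral illustration of those behaviors (sketched here in Java; the method and parameter names are illustrative, not the real module's API):

```java
import java.util.concurrent.Callable;

public class RetrySketch {
  // Retry op up to numRetries times after the first failure, sleeping
  // initialDelayMs before the first retry and doubling the delay each time.
  static <T> T withRetries(Callable<T> op, int numRetries, long initialDelayMs)
      throws Exception {
    long delay = initialDelayMs;
    for (int attempt = 0; ; attempt++) {
      try {
        return op.call();
      } catch (Exception e) {
        if (attempt >= numRetries) {
          throw e;  // retries exhausted: surface the last failure
        }
        Thread.sleep(delay);
        delay *= 2;  // exponential backoff
      }
    }
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Succeeds on the third call; the first two failures are retried.
    String result = withRetries(() -> {
      if (++calls[0] < 3) {
        throw new RuntimeException("transient failure " + calls[0]);
      }
      return "ok";
    }, 5, 1);
    System.out.println(result + " after " + calls[0] + " calls");  // ok after 3 calls
  }
}
```

The skipped HTTP-error cases in the log depend on google-apitools to classify which status codes are transient; the core loop is independent of that classification.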

[jira] [Commented] (BEAM-2884) Dataflow runs portable pipelines

2017-09-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171788#comment-16171788
 ] 

ASF GitHub Bot commented on BEAM-2884:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3858


> Dataflow runs portable pipelines
> 
>
> Key: BEAM-2884
> URL: https://issues.apache.org/jira/browse/BEAM-2884
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Henning Rohde
>  Labels: portability
>
> Dataflow should run pipelines using the full portability API as currently 
> defined:
> https://s.apache.org/beam-fn-api 
> https://s.apache.org/beam-runner-api
> https://s.apache.org/beam-job-api
> https://s.apache.org/beam-fn-api-container-contract
> This issue tracks its adoption of the portability framework. New Fn API and 
> other features will be tracked separately.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3858: [BEAM-2884] Move DoFnInfo to SDK util

2017-09-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3858


---


[2/3] beam git commit: Update Dataflow container version to 20170918

2017-09-19 Thread kenn
Update Dataflow container version to 20170918


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/056b720c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/056b720c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/056b720c

Branch: refs/heads/master
Commit: 056b720c6c131d9b5dd9329b23697a4f062c5876
Parents: f3f3254
Author: Kenneth Knowles 
Authored: Mon Sep 18 20:35:13 2017 -0700
Committer: Kenneth Knowles 
Committed: Mon Sep 18 21:22:51 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/056b720c/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 4d55209..eb490cb 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-beam-master-20170825
+beam-master-20170918
 
1
 
6
   



[3/3] beam git commit: This closes #3858: [BEAM-2884] Move DoFnInfo to SDK util

2017-09-19 Thread kenn
This closes #3858: [BEAM-2884] Move DoFnInfo to SDK util

  Update Dataflow container version to 20170918
  Move DoFnInfo to SDK util


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/c25bead5
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/c25bead5
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/c25bead5

Branch: refs/heads/master
Commit: c25bead5598d574fe5a277f4e67ab6e275492f47
Parents: c65aca0 056b720
Author: Kenneth Knowles 
Authored: Tue Sep 19 07:19:28 2017 -0700
Committer: Kenneth Knowles 
Committed: Tue Sep 19 07:19:28 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |   2 +-
 .../dataflow/DataflowPipelineTranslator.java|   2 +-
 .../beam/runners/dataflow/util/DoFnInfo.java| 104 ---
 .../DataflowPipelineTranslatorTest.java |   4 +-
 .../java/org/apache/beam/sdk/util/DoFnInfo.java | 104 +++
 .../apache/beam/fn/harness/FnApiDoFnRunner.java |   2 +-
 .../beam/fn/harness/FnApiDoFnRunnerTest.java|   2 +-
 7 files changed, 110 insertions(+), 110 deletions(-)
--




[1/3] beam git commit: Move DoFnInfo to SDK util

2017-09-19 Thread kenn
Repository: beam
Updated Branches:
  refs/heads/master c65aca07f -> c25bead55


Move DoFnInfo to SDK util

Previously, the DoFnInfo wrapped things just enough for Dataflow to execute a
DoFn without much context. The Java SDK harness has the same need, and relies
on DoFnInfo. Effectively, DoFnInfo is the UDF that the Java SDK harness
understands.
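The commit message above describes DoFnInfo as the serializable bundle a harness needs in order to execute a user's DoFn without further context. As a hypothetical stdlib sketch of that idea (class and field names here are illustrative, not the real org.apache.beam.sdk.util.DoFnInfo API, which per the imports in the diff below also carries coders, windowing strategy, and side-input views):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Illustrative holder: bundles a user function with the minimal context a
// worker harness needs, and is itself Serializable so a runner can ship it.
public class DoFnInfoSketch<FnT extends Serializable> implements Serializable {
  private final FnT doFn;                     // the user's function
  private final Map<Long, String> sideInputs; // side-input views, keyed by id
  private final String mainOutputTag;         // which output is the main one

  public DoFnInfoSketch(FnT doFn, Map<Long, String> sideInputs, String mainOutputTag) {
    this.doFn = doFn;
    this.sideInputs = sideInputs;
    this.mainOutputTag = mainOutputTag;
  }

  public FnT getDoFn() { return doFn; }
  public String getMainOutputTag() { return mainOutputTag; }

  public static void main(String[] args) throws Exception {
    DoFnInfoSketch<String> info =
        new DoFnInfoSketch<>("toUpperCase", new HashMap<>(), "mainOut");
    // Round-trip through Java serialization, as a runner does when shipping
    // the function to a worker harness.
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    oos.writeObject(info);
    oos.flush();
    ObjectInputStream ois =
        new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
    @SuppressWarnings("unchecked")
    DoFnInfoSketch<String> back = (DoFnInfoSketch<String>) ois.readObject();
    System.out.println(back.getDoFn() + " / " + back.getMainOutputTag());
  }
}
```

Moving such a class out of the Dataflow runner and into SDK util is what lets the portable Java SDK harness depend on it without depending on the runner.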


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/f3f32549
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/f3f32549
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/f3f32549

Branch: refs/heads/master
Commit: f3f325499e33f82eb8873a4e877c56dbc9928043
Parents: fdd9965
Author: Kenneth Knowles 
Authored: Sat Sep 16 15:16:56 2017 -0700
Committer: Kenneth Knowles 
Committed: Mon Sep 18 21:22:50 2017 -0700

--
 .../dataflow/DataflowPipelineTranslator.java|   2 +-
 .../beam/runners/dataflow/util/DoFnInfo.java| 104 ---
 .../DataflowPipelineTranslatorTest.java |   4 +-
 .../java/org/apache/beam/sdk/util/DoFnInfo.java | 104 +++
 .../apache/beam/fn/harness/FnApiDoFnRunner.java |   2 +-
 .../beam/fn/harness/FnApiDoFnRunnerTest.java|   2 +-
 6 files changed, 109 insertions(+), 109 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/f3f32549/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
index 2bed6be..4f9b939 100644
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
+++ 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java
@@ -67,7 +67,6 @@ import 
org.apache.beam.runners.dataflow.TransformTranslator.TranslationContext;
 import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;
 import org.apache.beam.runners.dataflow.util.CloudObject;
 import org.apache.beam.runners.dataflow.util.CloudObjects;
-import org.apache.beam.runners.dataflow.util.DoFnInfo;
 import org.apache.beam.runners.dataflow.util.OutputReference;
 import org.apache.beam.runners.dataflow.util.PropertyNames;
 import org.apache.beam.sdk.Pipeline;
@@ -92,6 +91,7 @@ import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
 import org.apache.beam.sdk.transforms.windowing.DefaultTrigger;
 import org.apache.beam.sdk.transforms.windowing.Window;
 import org.apache.beam.sdk.util.AppliedCombineFn;
+import org.apache.beam.sdk.util.DoFnInfo;
 import org.apache.beam.sdk.util.WindowedValue;
 import org.apache.beam.sdk.util.common.ReflectHelpers;
 import org.apache.beam.sdk.values.KV;

http://git-wip-us.apache.org/repos/asf/beam/blob/f3f32549/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DoFnInfo.java
--
diff --git 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DoFnInfo.java
 
b/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DoFnInfo.java
deleted file mode 100644
index 4a26795..000
--- 
a/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/util/DoFnInfo.java
+++ /dev/null
@@ -1,104 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.runners.dataflow.util;
-
-import java.io.Serializable;
-import java.util.Map;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.transforms.DoFn;
-import org.apache.beam.sdk.values.PCollectionView;
-import org.apache.beam.sdk.values.TupleTag;
-import org.apache.beam.sdk.values.WindowingStrategy;
-
-/**
- * Wrapper class holding the necessary information to serialize 

[jira] [Commented] (BEAM-2856) Update Nexmark Query 10 to use AvroIO

2017-09-19 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16171713#comment-16171713
 ] 

Etienne Chauchot commented on BEAM-2856:


Updating this issue to apply what [~reuvenlax] suggested here: 
https://lists.apache.org/thread.html/61a151c4898afa8495c4a59f622c7cc1b1e9a97813f0ed3407d075f1@%3Cdev.beam.apache.org%3E

> Update Nexmark Query 10 to use AvroIO
> -
>
> Key: BEAM-2856
> URL: https://issues.apache.org/jira/browse/BEAM-2856
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Ismaël Mejía
>
> Nexmark's Query 10 tested writing to sharded files on Google Storage; it used 
> some Google-specific APIs and 'manually' ensured sharding. I suppose we can 
> update this to support the other filesystems and withSharding, or, if not, 
> redefine the use case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
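The issue above proposes replacing Query 10's 'manual' Google-specific sharding with Beam's portable sharded writes. As a hedged stdlib sketch of what fixed sharding means (record-to-shard assignment plus the conventional shard-file naming; an illustration of the concept, not Beam's AvroIO implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class ShardingSketch {
  // Conventional shard file name: prefix-SSSSS-of-NNNNN.
  static String shardName(String prefix, int shard, int numShards) {
    return String.format("%s-%05d-of-%05d", prefix, shard, numShards);
  }

  // Deterministically assign each record to one of numShards buckets,
  // which is the effect a fixed shard count has on a sharded write.
  static List<List<String>> partition(List<String> records, int numShards) {
    List<List<String>> shards = new ArrayList<>();
    for (int i = 0; i < numShards; i++) {
      shards.add(new ArrayList<>());
    }
    for (String r : records) {
      shards.get(Math.floorMod(r.hashCode(), numShards)).add(r);
    }
    return shards;
  }

  public static void main(String[] args) {
    // prints "query10/output-00000-of-00010"
    System.out.println(shardName("query10/output", 0, 10));
  }
}
```

With a filesystem-agnostic IO handling this, the same write works against any Beam filesystem rather than only Google Storage.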


[jira] [Updated] (BEAM-2856) Update Nexmark Query 10 to use AvroIO

2017-09-19 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-2856:
---
Summary: Update Nexmark Query 10 to use AvroIO  (was: Fix Nexmark Query 10 
'log to sharded files' to be less Google specific)

> Update Nexmark Query 10 to use AvroIO
> -
>
> Key: BEAM-2856
> URL: https://issues.apache.org/jira/browse/BEAM-2856
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Ismaël Mejía
>
> Nexmark's Query 10 tested writing to sharded files on Google Storage; it used 
> some Google-specific APIs and 'manually' ensured sharding. I suppose we can 
> update this to support the other filesystems and withSharding, or, if not, 
> redefine the use case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

