Jenkins build is unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #4060

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2995) can't read/write hdfs in Flink CLUSTER(Standalone)

2017-09-27 Thread huangjianhuang (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183553#comment-16183553
 ] 

huangjianhuang commented on BEAM-2995:
--

by the way, my pom.xml is:

{code:java}
<groupId>com.joe</groupId>
<artifactId>flinkBeam</artifactId>
<version>2.2.0-SNAPSHOT</version>

<dependencies>
  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-runners-flink_2.10</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-core</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-kafka</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-hadoop-file-system</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-io-google-cloud-platform</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-extensions-google-cloud-platform-core</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>2.8.1</version>
  </dependency>

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-hdfs</artifactId>
    <version>2.8.1</version>
  </dependency>

  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.8.1</version>
  </dependency>

  <dependency>
    <groupId>org.apache.beam</groupId>
    <artifactId>beam-sdks-java-extensions-protobuf</artifactId>
    <version>${project.version}</version>
  </dependency>

  <dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>3.2.0</version>
  </dependency>

  <dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java-util</artifactId>
    <version>3.2.0</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <plugin>
      <groupId>org.codehaus.mojo</groupId>
      <artifactId>exec-maven-plugin</artifactId>
      <version>1.4.0</version>
      <configuration>
        <cleanupDaemonThreads>false</cleanupDaemonThreads>
      </configuration>
    </plugin>

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <configuration>
        <createDependencyReducedPom>false</createDependencyReducedPom>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
      </configuration>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <shadedArtifactAttached>true</shadedArtifactAttached>
            <shadedClassifierName>shaded</shadedClassifierName>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
{code}

run with flink1.3.2, hadoop2.8.1
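For HDFS access on a Flink cluster, constructing a Hadoop {{Configuration}} locally is typically not enough; the configuration has to be registered on the pipeline options so Beam's filesystem registrar resolves hdfs:// paths on the workers. A minimal sketch, assuming the HadoopFileSystemOptions API from the beam-sdks-java-io-hadoop-file-system module (this is a suggested wiring, not the reporter's code):

```java
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.io.hdfs.HadoopFileSystemOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.hadoop.conf.Configuration;

public class FlinkWithHdfsSketch {
  public static void main(String[] args) {
    // Register the HDFS configuration with the pipeline options so that
    // hdfs:// paths resolve on the cluster, not just in the local JVM.
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://localhost:9000");

    HadoopFileSystemOptions options =
        PipelineOptionsFactory.fromArgs(args).as(HadoopFileSystemOptions.class);
    options.setHdfsConfiguration(Collections.singletonList(conf));

    Pipeline p = Pipeline.create(options);
    p.apply("ReadLines", TextIO.read().from("hdfs://localhost:9000/tmp/words"))
     .apply(TextIO.write().to("hdfs://localhost:9000/tmp/hdfsout"));
    p.run();
  }
}
```

The shaded jar passed via --filesToStage would still need the hadoop-file-system module and Hadoop client libraries on its classpath.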


> can't read/write hdfs in Flink CLUSTER(Standalone)
> --
>
> Key: BEAM-2995
> URL: https://issues.apache.org/jira/browse/BEAM-2995
> Project: Beam
>  Issue Type: Bug
>  Components: runner-flink
>Affects Versions: 2.2.0
>Reporter: huangjianhuang
>Assignee: Aljoscha Krettek
>
> I just wrote a simple demo like:
> {code:java}
> Configuration conf = new Configuration();
> conf.set("fs.default.name", "hdfs://localhost:9000");
> //other codes
> p.apply("ReadLines", 
> TextIO.read().from("hdfs://localhost:9000/tmp/words"))
> 
> .apply(TextIO.write().to("hdfs://localhost:9000/tmp/hdfsout"));
> {code}
> it works in Flink local mode with the cmd:
> {code:java}
> mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner 
> -Dexec.args="--runner=FlinkRunner 
> --filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar"
> {code}
> but it does not work in CLUSTER mode:
> {code:java}
> mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner 
> -Dexec.args="--runner=FlinkRunner 
> --filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar 
> --flinkMaster=localhost:6123 "
> {code}
> it seems the Flink cluster regards HDFS as the local file system. 
> The input log from flink-jobmanager.log is:
> {code:java}
> 2017-09-27 20:17:37,962 INFO  org.apache.flink.runtime.jobmanager.JobManager  
>   - Successfully ran initialization on master in 136 ms.
> 2017-09-27 20:17:37,968 INFO  org.apache.beam.sdk.io.FileBasedSource  
>   - {color:red}Filepattern hdfs://localhost:9000/tmp/words2 
> matched 0 files with total size 0{color}
> 2017-09-27 20:17:37,968 INFO  org.apache.beam.sdk.io.FileBasedSource  
>   - Splitting filepattern hdfs://localhost:9000/tmp/words2 into 
> bundles of size 0 took 0 ms and produced 0 files a
> nd 0 bundles
> {code}
> The output error message is:
> {code:java}
> Caused by: java.lang.ClassCastException: 
> {color:red}org.apache.beam.sdk.io.hdfs.HadoopResourceId cannot be cast to 
> 

[jira] [Created] (BEAM-3000) No python equivalent of org.apache.beam.sdk.transforms.Sample.any(100)?

2017-09-27 Thread Rodrigo Benenson (JIRA)
Rodrigo Benenson created BEAM-3000:
--

 Summary: No python equivalent of 
org.apache.beam.sdk.transforms.Sample.any(100)?
 Key: BEAM-3000
 URL: https://issues.apache.org/jira/browse/BEAM-3000
 Project: Beam
  Issue Type: Improvement
  Components: sdk-py-core
Reporter: Rodrigo Benenson
Assignee: Ahmet Altay
Priority: Critical


Java's org.apache.beam.sdk.transforms.Sample.any will return a PCollection with 
bounded size (filtering strategy).
The closest Python equivalent is beam.Sample.FixedSizeGlobally(n); however, this 
version uses a combiner strategy, returning a list with n elements, which does 
not scale if n is "bigger than what fits in memory".
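The difference between the two strategies can be illustrated outside Beam with a toy plain-Java helper (hypothetical code, not SDK API): a filtering-style "any n" sample emits at most n individual elements and can stop scanning early, whereas a combiner-style sample must materialize all n elements as one list value.

```java
import java.util.ArrayList;
import java.util.List;

public class SampleStyles {
  /** Filtering style: take at most n elements, stopping as soon as n are found. */
  static List<Integer> anyN(Iterable<Integer> input, int n) {
    List<Integer> out = new ArrayList<>();
    for (int x : input) {
      if (out.size() >= n) break; // stop early once n elements are taken
      out.add(x);
    }
    return out;
  }

  public static void main(String[] args) {
    List<Integer> data = new ArrayList<>();
    for (int i = 0; i < 1000; i++) data.add(i);
    System.out.println(anyN(data, 100).size()); // prints 100
  }
}
```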



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (BEAM-2794) Failed to run Wordcount example following quick start guide

2017-09-27 Thread Huafeng Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Huafeng Wang resolved BEAM-2794.

   Resolution: Not A Problem
Fix Version/s: Not applicable

> Failed to run Wordcount example following quick start guide
> ---
>
> Key: BEAM-2794
> URL: https://issues.apache.org/jira/browse/BEAM-2794
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Huafeng Wang
>Assignee: Reuven Lax
>Priority: Minor
> Fix For: Not applicable
>
>
> I tried to run the wordcount example according to the quick start guide, but I 
> got a resolution failure:
> {code}
> [INFO] 
> 
> [INFO] Building word-count-beam 0.1
> [INFO] 
> 
> [WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
> no dependency information available
> [WARNING] Failed to retrieve plugin descriptor for 
> org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
> org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not 
> be resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> [WARNING] The POM for org.apache.beam:beam-sdks-java-core:jar:0.1 is missing, 
> no dependency information available
> [WARNING] The POM for 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1 
> is missing, no dependency information available
> [WARNING] The POM for 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:0.1 is missing, no 
> dependency information available
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 0.960 s
> [INFO] Finished at: 2017-08-16T14:14:42+08:00
> [INFO] Final Memory: 18M/309M
> [INFO] 
> 
> [ERROR] Failed to execute goal on project word-count-beam: Could not resolve 
> dependencies for project org.example:word-count-beam:jar:0.1: The following 
> artifacts could not be resolved: 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1, 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:0.1: Failure to find 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1 
> in https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced -> [Help 1]
> {code}
> My environment's network is all good.





[jira] [Commented] (BEAM-2794) Failed to run Wordcount example following quick start guide

2017-09-27 Thread Huafeng Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183540#comment-16183540
 ] 

Huafeng Wang commented on BEAM-2794:


Thanks for your explanation [~jingc]! I can close this one now.

> Failed to run Wordcount example following quick start guide
> ---
>
> Key: BEAM-2794
> URL: https://issues.apache.org/jira/browse/BEAM-2794
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Huafeng Wang
>Assignee: Reuven Lax
>Priority: Minor
> Fix For: Not applicable
>
>
> I tried to run the wordcount example according to the quick start guide, but I 
> got a resolution failure:
> {code}
> [INFO] 
> 
> [INFO] Building word-count-beam 0.1
> [INFO] 
> 
> [WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
> no dependency information available
> [WARNING] Failed to retrieve plugin descriptor for 
> org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
> org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not 
> be resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> [WARNING] The POM for org.apache.beam:beam-sdks-java-core:jar:0.1 is missing, 
> no dependency information available
> [WARNING] The POM for 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1 
> is missing, no dependency information available
> [WARNING] The POM for 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:0.1 is missing, no 
> dependency information available
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 0.960 s
> [INFO] Finished at: 2017-08-16T14:14:42+08:00
> [INFO] Final Memory: 18M/309M
> [INFO] 
> 
> [ERROR] Failed to execute goal on project word-count-beam: Could not resolve 
> dependencies for project org.example:word-count-beam:jar:0.1: The following 
> artifacts could not be resolved: 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1, 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:0.1: Failure to find 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1 
> in https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced -> [Help 1]
> {code}
> My environment's network is all good.





[jira] [Created] (BEAM-2999) Split validatesrunner tests from Python postcommit

2017-09-27 Thread Mark Liu (JIRA)
Mark Liu created BEAM-2999:
--

 Summary: Split validatesrunner tests from Python postcommit
 Key: BEAM-2999
 URL: https://issues.apache.org/jira/browse/BEAM-2999
 Project: Beam
  Issue Type: Bug
  Components: testing
Reporter: Mark Liu
Assignee: Mark Liu


The only Python Postcommit Jenkins build includes too many tests, which pushes 
the build (and test) time over 1 hour. It has also become hard to find the error 
in the long console logs when a build fails.

We can separate the validatesrunner tests, which currently take ~20 minutes, out 
of the Postcommit build into a separate Jenkins branch. This will shorten the 
total build time of the Postcommit.





Jenkins build is still unstable: beam_PostCommit_Java_MavenInstall #4897

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-2998) add IT test for SQL

2017-09-27 Thread Xu Mingmin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xu Mingmin updated BEAM-2998:
-
Description: 
Add IT test for SQL module

https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java
 is the base example.

  was:Add IT test for SQL module


> add IT test for SQL
> ---
>
> Key: BEAM-2998
> URL: https://issues.apache.org/jira/browse/BEAM-2998
> Project: Beam
>  Issue Type: Test
>  Components: dsl-sql
>Reporter: Xu Mingmin
> Fix For: 2.2.0
>
>
> Add IT test for SQL module
> https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/example/BeamSqlExample.java
>  is the base example.





[jira] [Created] (BEAM-2998) add IT test for SQL

2017-09-27 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2998:


 Summary: add IT test for SQL
 Key: BEAM-2998
 URL: https://issues.apache.org/jira/browse/BEAM-2998
 Project: Beam
  Issue Type: Test
  Components: dsl-sql
Reporter: Xu Mingmin
 Fix For: 2.2.0


Add IT test for SQL module





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3943

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2958) Expose a top level user agent PipelineOption which can be communicated to external services

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183491#comment-16183491
 ] 

ASF GitHub Bot commented on BEAM-2958:
--

GitHub user youngoli opened a pull request:

https://github.com/apache/beam/pull/3915

[BEAM-2958] Adding a top-level user agent string to PipelineOptions

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

Adding a string to PipelineOptions that can be used for setting a user 
agent string with information about the distribution of Beam being used. That 
string can be sent to external services too.

This PR is probably not completely done yet. While the actual user agent 
string is in and working, more might have to be done to replace BigTableIO and 
DataflowRunner usages of user agents to use this new one. I'm still working on 
that right now.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/youngoli/beam bugfix-2958

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3915.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3915


commit a70ff0bd973807cc99b1aee65c372476a295259b
Author: Daniel Oliveira 
Date:   2017-09-27T19:24:17Z

[BEAM-2958] Adding user agent string to PipelineOptions.

commit e56ebf6b034f0b0d0f98e35ea2a18b852593d373
Author: Daniel Oliveira 
Date:   2017-09-28T00:14:02Z

[BEAM-2958] Adding a unit test for UserAgentFactory.




> Expose a top level user agent PipelineOption which can be communicated to 
> external services
> ---
>
> Key: BEAM-2958
> URL: https://issues.apache.org/jira/browse/BEAM-2958
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-apex, runner-dataflow, runner-direct, 
> runner-flink, runner-gearpump, runner-jstorm, runner-mapreduce, runner-spark, 
> runner-tez, sdk-java-core, sdk-java-gcp
>Reporter: Luke Cwik
>Assignee: Daniel Oliveira
>Priority: Minor
>
> This concept is used by Bigtable and Dataflow service to specify what version 
> of the SDK is being used and is currently available through ReleaseInfo as a 
> static property.
> The Dataflow distribution attempts to override this but is unable to 
> propagate this user agent to dependent modules cleanly. Having dependent 
> modules get the user agent from a PipelineOption would make it possible for a 
> runner to modify the user agent during execution to be able to identify its 
> flavor.
> It seems likely that Flink/Spark/... would like to modify the user agent as 
> well for the same purpose.
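Schematically, the proposal amounts to a pipeline option along these lines (a sketch only; the option and factory names are inferred from the linked PR's commit messages, not a confirmed final API):

```java
public interface UserAgentOptions extends PipelineOptions {
  @Description("User agent string identifying this Beam distribution to external services.")
  @Default.InstanceFactory(UserAgentFactory.class)
  String getUserAgent();
  void setUserAgent(String userAgent);

  /** Hypothetical factory deriving a default from the SDK's static ReleaseInfo. */
  class UserAgentFactory implements DefaultValueFactory<String> {
    @Override
    public String create(PipelineOptions options) {
      ReleaseInfo info = ReleaseInfo.getReleaseInfo();
      return (info.getName() + "/" + info.getVersion()).replace(" ", "_");
    }
  }
}
```

A runner could then override the option at execution time to identify its own flavor, as the issue suggests.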





[GitHub] beam pull request #3915: [BEAM-2958] Adding a top-level user agent string to...

2017-09-27 Thread youngoli
GitHub user youngoli opened a pull request:

https://github.com/apache/beam/pull/3915

[BEAM-2958] Adding a top-level user agent string to PipelineOptions

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [x] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [x] Each commit in the pull request should have a meaningful subject 
line and body.
 - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [x] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [x] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---

Adding a string to PipelineOptions that can be used for setting a user 
agent string with information about the distribution of Beam being used. That 
string can be sent to external services too.

This PR is probably not completely done yet. While the actual user agent 
string is in and working, more might have to be done to replace BigTableIO and 
DataflowRunner usages of user agents to use this new one. I'm still working on 
that right now.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/youngoli/beam bugfix-2958

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3915.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3915


commit a70ff0bd973807cc99b1aee65c372476a295259b
Author: Daniel Oliveira 
Date:   2017-09-27T19:24:17Z

[BEAM-2958] Adding user agent string to PipelineOptions.

commit e56ebf6b034f0b0d0f98e35ea2a18b852593d373
Author: Daniel Oliveira 
Date:   2017-09-28T00:14:02Z

[BEAM-2958] Adding a unit test for UserAgentFactory.




---


[jira] [Commented] (BEAM-2794) Failed to run Wordcount example following quick start guide

2017-09-27 Thread Jing Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183484#comment-16183484
 ] 

Jing Chen commented on BEAM-2794:
-

Hi Huafeng,
I believe it is not a bug.

I would suggest generating the word count code outside the Beam project's root 
directory; that is, make sure your current working directory is not the Beam 
project root when you run the Maven command from the Java quickstart.

If you run the command at the root directory of the Beam project, it picks up 
*transitive properties* from the parent POM. 

As you can see from your printed stack, the versions for beam-sdks-java-core, 
beam-sdks-java-extensions-google-cloud-platform-core and 
beam-sdks-java-extensions-protobuf are 0.1.

FYI, it makes a difference if you run the command outside the root directory of 
the Beam project. 

{noformat}
$ diff ~/beam/word-count-beam/pom.xml ~/word-count-beam/pom.xml 
17c17,20
< --><project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
---
> -->
> <project xmlns="http://maven.apache.org/POM/4.0.0"
>  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
19,23d21
<   <parent>
<     <artifactId>beam-parent</artifactId>
<     <groupId>org.apache.beam</groupId>
<     <version>2.2.0-SNAPSHOT</version>
<   </parent>
59a58
> true
130c129
< 
---
> <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
{noformat}
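Concretely, the suggestion is to generate and build the quickstart project from a directory outside the Beam checkout, e.g. (sketch following the Beam Java quickstart; add the -DarchetypeVersion flag for whichever release the guide currently pins):

```shell
# Run from e.g. $HOME, not from inside the beam/ source tree.
cd ~
mvn archetype:generate \
    -DarchetypeGroupId=org.apache.beam \
    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
    -DgroupId=org.example \
    -DartifactId=word-count-beam \
    -Dversion="0.1" \
    -Dpackage=org.apache.beam.examples \
    -DinteractiveMode=false
cd word-count-beam
```

Generated this way, the project's POM has no <parent> pointing at beam-parent, so the 0.1 project version is not mistaken for Beam artifact versions.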

> Failed to run Wordcount example following quick start guide
> ---
>
> Key: BEAM-2794
> URL: https://issues.apache.org/jira/browse/BEAM-2794
> Project: Beam
>  Issue Type: Bug
>  Components: examples-java
>Reporter: Huafeng Wang
>Assignee: Reuven Lax
>Priority: Minor
>
> I tried to run the wordcount example according to the quick start guide, but I 
> got a resolution failure:
> {code}
> [INFO] 
> 
> [INFO] Building word-count-beam 0.1
> [INFO] 
> 
> [WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
> no dependency information available
> [WARNING] Failed to retrieve plugin descriptor for 
> org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
> org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not 
> be resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
> https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced
> [WARNING] The POM for org.apache.beam:beam-sdks-java-core:jar:0.1 is missing, 
> no dependency information available
> [WARNING] The POM for 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1 
> is missing, no dependency information available
> [WARNING] The POM for 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:0.1 is missing, no 
> dependency information available
> [INFO] 
> 
> [INFO] BUILD FAILURE
> [INFO] 
> 
> [INFO] Total time: 0.960 s
> [INFO] Finished at: 2017-08-16T14:14:42+08:00
> [INFO] Final Memory: 18M/309M
> [INFO] 
> 
> [ERROR] Failed to execute goal on project word-count-beam: Could not resolve 
> dependencies for project org.example:word-count-beam:jar:0.1: The following 
> artifacts could not be resolved: 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1, 
> org.apache.beam:beam-sdks-java-extensions-protobuf:jar:0.1: Failure to find 
> org.apache.beam:beam-sdks-java-extensions-google-cloud-platform-core:jar:0.1 
> in https://repo.maven.apache.org/maven2 was cached in the local repository, 
> resolution will not be reattempted until the update interval of central has 
> elapsed or updates are forced -> [Help 1]
> {code}
> My environment's network is all good.





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3942

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2606) WindowFnTestUtils should allow using the value in addition to the timestamp of the elements

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183462#comment-16183462
 ] 

ASF GitHub Bot commented on BEAM-2606:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3592


> WindowFnTestUtils should allow using the value in addition to the timestamp 
> of the elements
> ---
>
> Key: BEAM-2606
> URL: https://issues.apache.org/jira/browse/BEAM-2606
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> {{WindowFnTestUtils}} relies only on timeStamps for everything related to 
> windows assignment in the test helpers. But when creating a custom 
> {{WindowFn}} (and most likely CustomWindow as well), that {{WindowFn}} might 
> rely on element value in addition to timestamp to decide the windows that 
> will be assigned to the element. To be able to test this kind of custom 
> WindowFn, we need versions of the helper methods in WindowFnTestUtils that 
> allow passing {{TimeStampedValues}}.





[GitHub] beam pull request #3592: [BEAM-2606] make WindowFnTestUtils use the value in...

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3592


---


[1/2] beam git commit: [BEAM-2606] make WindowFnTestUtils use the value in addition to the timestamp of the elements

2017-09-27 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 393e56310 -> da531b7bc


[BEAM-2606] make WindowFnTestUtils use the value in addition to the timestamp 
of the elements


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/2144c8dd
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/2144c8dd
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/2144c8dd

Branch: refs/heads/master
Commit: 2144c8ddbba2f8245e0a15d4bfa476825ea7d51f
Parents: 393e563
Author: Etienne Chauchot 
Authored: Mon Jul 10 15:57:44 2017 +0200
Committer: Luke Cwik 
Committed: Wed Sep 27 16:39:26 2017 -0700

--
 .../beam/sdk/testing/WindowFnTestUtils.java | 141 +++
 1 file changed, 115 insertions(+), 26 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/2144c8dd/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/WindowFnTestUtils.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/WindowFnTestUtils.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/WindowFnTestUtils.java
index e8c2f8d..7fa1056 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/WindowFnTestUtils.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/WindowFnTestUtils.java
@@ -40,6 +40,7 @@ import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
 import org.apache.beam.sdk.transforms.windowing.IntervalWindow;
 import org.apache.beam.sdk.transforms.windowing.TimestampCombiner;
 import org.apache.beam.sdk.transforms.windowing.WindowFn;
+import org.apache.beam.sdk.values.TimestampedValue;
 import org.joda.time.Instant;
 import org.joda.time.ReadableInstant;
 
@@ -67,14 +68,28 @@ public class WindowFnTestUtils {
   public static  Map runWindowFn(
   WindowFn windowFn,
   List timestamps) throws Exception {
+List timestampedValues = new ArrayList<>();
+for (Long timestamp : timestamps){
+  timestampedValues.add(TimestampedValue.of((T) null, new 
Instant(timestamp)));
+}
+return runWindowFnWithValue(windowFn, timestampedValues);
+  }
+
+  /**
+   * Runs the {@link WindowFn} over the provided input, returning a map
+   * of windows to the timestamps in those windows. This version allows to 
pass a list of
+   * {@link TimestampedValue} in case the values are used to assign windows.
+   */
+  public static  Map 
runWindowFnWithValue(
+  WindowFn windowFn,
+  List timestampedValues) throws Exception {
 
-final TestWindowSet windowSet = new TestWindowSet();
-for (final Long timestamp : timestamps) {
-  for (W window : windowFn.assignWindows(
-  new TestAssignContext(new Instant(timestamp), windowFn))) {
-windowSet.put(window, timestampValue(timestamp));
+final TestWindowSet windowSet = new TestWindowSet<>();
+for (final TimestampedValue element : timestampedValues) {
+  for (W window : assignedWindowsWithValue(windowFn, element)) {
+windowSet.put(window, 
timestampValue(element.getTimestamp().getMillis()));
   }
-  windowFn.mergeWindows(new TestMergeContext(windowSet, windowFn));
+  windowFn.mergeWindows(new TestMergeContext<>(windowSet, windowFn));
 }
 Map actual = new HashMap<>();
 for (W window : windowSet.windows()) {
@@ -83,9 +98,23 @@ public class WindowFnTestUtils {
 return actual;
   }
 
+  /**
+  * runs {@link WindowFn#assignWindows(WindowFn.AssignContext)}.
+   */
   public static  Collection assignedWindows(
   WindowFn windowFn, long timestamp) throws Exception {
-return windowFn.assignWindows(new TestAssignContext(new 
Instant(timestamp), windowFn));
+return assignedWindowsWithValue(windowFn,
+TimestampedValue.of((T) null, new Instant(timestamp)));
+  }
+
+  /**
+   * runs {@link WindowFn#assignWindows(WindowFn.AssignContext)}. This version 
allows passing
+   * a {@link TimestampedValue} in case the value is needed to assign windows.
+   */
+  public static  Collection 
assignedWindowsWithValue(
+  WindowFn windowFn, TimestampedValue timestampedValue) throws 
Exception {
+return windowFn.assignWindows(
+new TestAssignContext<>(timestampedValue, windowFn));
   }
 
   private static String timestampValue(long timestamp) {
@@ -97,21 +126,21 @@ public class WindowFnTestUtils {
*/
   private static class TestAssignContext
   extends WindowFn.AssignContext {
-private Instant timestamp;
+private 

[2/2] beam git commit: [BEAM-2606] make WindowFnTestUtils use the value in addition to the timestamp of the elements

2017-09-27 Thread lcwik
[BEAM-2606] make WindowFnTestUtils use the value in addition to the timestamp 
of the elements

This closes #3592


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/da531b7b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/da531b7b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/da531b7b

Branch: refs/heads/master
Commit: da531b7bcdb8654c8b379e00d0c8a2895d4e9b2e
Parents: 393e563 2144c8d
Author: Luke Cwik 
Authored: Wed Sep 27 16:39:56 2017 -0700
Committer: Luke Cwik 
Committed: Wed Sep 27 16:39:56 2017 -0700

--
 .../beam/sdk/testing/WindowFnTestUtils.java | 141 +++
 1 file changed, 115 insertions(+), 26 deletions(-)
--




[jira] [Commented] (BEAM-2975) Results of ReadableState.read() should be snapshots of the underlying state

2017-09-27 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183417#comment-16183417
 ] 

Luke Cwik commented on BEAM-2975:
-

After thinking about this some more, I don't have a strong opinion on the matter: 
when read() is invoked, it can either return an immutable view of state at that 
point in time or a live view of state. Both have pros and cons.

Whether it's a live view or a snapshot during the life of a process-element call 
is an implementation detail, as the portability framework will need to deal with 
appends/reads/caching anyway, since when the iterable is output from the DoFn 
it will need to be snapshotted.
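The snapshot-vs-live-view distinction can be illustrated outside Beam with plain collections (a toy sketch, not Beam state API):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SnapshotVsLiveView {
  public static void main(String[] args) {
    List<Integer> state = new ArrayList<>();

    // Live view: a wrapper over the underlying state, so it reflects
    // mutations made after the "read".
    List<Integer> liveView = Collections.unmodifiableList(state);

    // Snapshot: an immutable-by-convention copy taken at read() time.
    List<Integer> snapshot = new ArrayList<>(state);

    state.add(17);

    System.out.println(liveView.size()); // 1 -- the live view sees the append
    System.out.println(snapshot.size()); // 0 -- the snapshot stays empty
  }
}
```

The issue's example asks for the snapshot behavior: the Iterable returned by read() should behave like the copy, not like the wrapper.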

> Results of ReadableState.read() should be snapshots of the underlying state
> ---
>
> Key: BEAM-2975
> URL: https://issues.apache.org/jira/browse/BEAM-2975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Mills
>Assignee: Daniel Mills
>Priority: Minor
>
> Future modification of state should not be reflected in previous calls to 
> read().  For example:
> @StateId("tag") BagState state;
> Iterable ints = state.read();
> state.add(17);
> // ints should still be empty here.





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3941

2017-09-27 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3890: Introduces Reshuffle.viaRandomKey()

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3890


---


[2/4] beam git commit: Introduces Reshuffle.viaRandomKey()

2017-09-27 Thread jkff
Introduces Reshuffle.viaRandomKey()

It's a commonly used pattern for breaking fusion
https://cloud.google.com/dataflow/service/dataflow-service-desc#fusion-optimization

viaRandomKey() only abstracts away the current commonly used pattern.
It has the same caveats as using Reshuffle.of() directly - the semantics
are technically not guaranteed by the Beam model, but it works in
practice, and this is the pattern we keep recommending to users.

The naming is deliberately operational rather than semantic, to
emphasize that we don't have the semantics figured out, and the
transform promises only that it expands into exactly the sequence
"pair with random key, reshuffle, drop key".
The goal of this change is just to reduce copy-paste.

See prior discussion at
https://lists.apache.org/thread.html/ac34c9ac665a8d9f67b0254015e44c59ea65ecc1360d4014b95d3b2e@%3Cdev.beam.apache.org%3E

This change also converts several existing usages to use it, and adds another
one in Match.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b74644a2
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b74644a2
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b74644a2

Branch: refs/heads/master
Commit: b74644a2bccd7fca90da4912781196c804e41325
Parents: 30e7d15
Author: Eugene Kirpichov 
Authored: Fri Sep 22 15:24:36 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Sep 27 15:08:38 2017 -0700

--
 .../java/org/apache/beam/sdk/io/FileIO.java |  7 ++-
 .../beam/sdk/io/ReadAllViaFileBasedSource.java  | 29 +---
 .../apache/beam/sdk/transforms/Reshuffle.java   | 47 
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 12 +
 .../io/gcp/bigquery/StreamingWriteTables.java   | 10 ++---
 .../beam/sdk/io/gcp/datastore/DatastoreV1.java  | 27 ---
 .../beam/sdk/io/gcp/spanner/SpannerIO.java  | 27 +++
 .../sdk/io/gcp/datastore/DatastoreV1Test.java   | 15 +++
 .../sdk/io/gcp/datastore/SplitQueryFnIT.java|  5 +--
 .../org/apache/beam/sdk/io/jdbc/JdbcIO.java | 24 +-
 10 files changed, 82 insertions(+), 121 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b74644a2/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
index c909c3c..6b75370 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
@@ -37,6 +37,7 @@ import org.apache.beam.sdk.transforms.Create;
 import org.apache.beam.sdk.transforms.DoFn;
 import org.apache.beam.sdk.transforms.PTransform;
 import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.Reshuffle;
 import org.apache.beam.sdk.transforms.Values;
 import org.apache.beam.sdk.transforms.Watch;
 import org.apache.beam.sdk.transforms.Watch.Growth.TerminationCondition;
@@ -305,12 +306,13 @@ public class FileIO {
 
 @Override
 public PCollection expand(PCollection input) 
{
+  PCollection res;
   if (getConfiguration().getWatchInterval() == null) {
-return input.apply(
+res = input.apply(
 "Match filepatterns",
 ParDo.of(new 
MatchFn(getConfiguration().getEmptyMatchTreatment(;
   } else {
-return input
+res = input
 .apply(
 "Continuously match filepatterns",
 Watch.growthOf(new MatchPollFn())
@@ -318,6 +320,7 @@ public class FileIO {
 
.withTerminationPerInput(getConfiguration().getWatchTerminationCondition()))
 .apply(Values.create());
   }
+  return res.apply(Reshuffle.viaRandomKey());
 }
 
 private static class MatchFn extends DoFn {

http://git-wip-us.apache.org/repos/asf/beam/blob/b74644a2/sdks/java/core/src/main/java/org/apache/beam/sdk/io/ReadAllViaFileBasedSource.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/ReadAllViaFileBasedSource.java
 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/ReadAllViaFileBasedSource.java
index 03cdbb1..c53f405 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/ReadAllViaFileBasedSource.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/ReadAllViaFileBasedSource.java
@@ -18,7 +18,6 @@
 package org.apache.beam.sdk.io;
 
 import java.io.IOException;
-import java.util.concurrent.ThreadLocalRandom;
 import org.apache.beam.sdk.annotations.Experimental;
 import org.apache.beam.sdk.coders.Coder;
 import 

[3/4] beam git commit: PAssert improvements.

2017-09-27 Thread jkff
PAssert improvements.

- Captures stack trace by introducing a SerializableThrowable.
  Fixes an incorrect test of this.
- PAssert.thatSingletonIterable, thatMap/Multimap/Singleton no longer
  require that the collection is produced by a trigger that promises
  a single firing. thatSingletonIterable checks that the iterable is a
  singleton by other means. thatMap/Multimap/Singleton don't need this
  requirement at all.
  PaneExtractors.onlyPane() is now used only when the user explicitly
  specifies inOnlyPane().


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/30e7d15b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/30e7d15b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/30e7d15b

Branch: refs/heads/master
Commit: 30e7d15b3ab68d0c129dbd2be76e77346f2e1f38
Parents: 7e3f591
Author: Eugene Kirpichov 
Authored: Mon Sep 25 11:59:04 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Sep 27 15:08:38 2017 -0700

--
 .../org/apache/beam/sdk/testing/PAssert.java| 63 +++-
 .../apache/beam/sdk/testing/PaneExtractors.java | 25 +---
 .../beam/sdk/testing/SuccessOrFailure.java  | 41 -
 .../apache/beam/sdk/testing/PAssertTest.java| 41 +++--
 .../beam/sdk/testing/PaneExtractorsTest.java|  7 +--
 5 files changed, 91 insertions(+), 86 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/30e7d15b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
index ed80f2f..d2ad67d 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/PAssert.java
@@ -31,7 +31,6 @@ import java.util.Arrays;
 import java.util.Collection;
 import java.util.Collections;
 import java.util.Map;
-import java.util.NoSuchElementException;
 import org.apache.beam.sdk.Pipeline;
 import org.apache.beam.sdk.Pipeline.PipelineVisitor;
 import org.apache.beam.sdk.PipelineRunner;
@@ -74,8 +73,6 @@ import org.apache.beam.sdk.values.PDone;
 import org.apache.beam.sdk.values.ValueInSingleWindow;
 import org.apache.beam.sdk.values.WindowingStrategy;
 import org.joda.time.Duration;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
 
 /**
  * An assertion on the contents of a {@link PCollection} incorporated into the 
pipeline. Such an
@@ -105,8 +102,6 @@ import org.slf4j.LoggerFactory;
  * JUnit and Hamcrest must be linked in by any code that uses PAssert.
  */
 public class PAssert {
-
-  private static final Logger LOG = LoggerFactory.getLogger(PAssert.class);
   public static final String SUCCESS_COUNTER = "PAssertSuccess";
   public static final String FAILURE_COUNTER = "PAssertFailure";
   private static final Counter successCounter = Metrics.counter(
@@ -170,10 +165,6 @@ public class PAssert {
   return new PAssertionSite(message, new Throwable().getStackTrace());
 }
 
-PAssertionSite() {
-  this(null, new StackTraceElement[0]);
-}
-
 PAssertionSite(String message, StackTraceElement[] creationStackTrace) {
   this.message = message;
   this.creationStackTrace = creationStackTrace;
@@ -381,15 +372,6 @@ public class PAssert {
*/
   public static  IterableAssert thatSingletonIterable(
   String reason, PCollection> actual) {
-
-try {
-} catch (NoSuchElementException | IllegalArgumentException exc) {
-  throw new IllegalArgumentException(
-  "PAssert.thatSingletonIterable requires a 
PCollection"
-  + " with a Coder where getCoderArguments() yields a"
-  + " single Coder to apply to the elements.");
-}
-
 @SuppressWarnings("unchecked") // Safe covariant cast
 PCollection actualIterables = (PCollection) 
actual;
 
@@ -581,7 +563,7 @@ public class PAssert {
 @SafeVarargs
 final PCollectionContentsAssert containsInAnyOrder(
 SerializableMatcher... elementMatchers) {
-  return 
satisfies(SerializableMatchers.containsInAnyOrder(elementMatchers));
+  return 
satisfies(SerializableMatchers.containsInAnyOrder(elementMatchers));
 }
 
 /**
@@ -592,7 +574,7 @@ public class PAssert {
 private PCollectionContentsAssert satisfies(
 AssertRelation relation, Iterable 
expectedElements) {
   return satisfies(
-  new CheckRelationAgainstExpected(
+  new CheckRelationAgainstExpected<>(
   relation, expectedElements, 
IterableCoder.of(actual.getCoder(;
 }
 
@@ -668,7 

[4/4] beam git commit: This closes #3890: Introduces Reshuffle.viaRandomKey()

2017-09-27 Thread jkff
This closes #3890: Introduces Reshuffle.viaRandomKey()


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/393e5631
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/393e5631
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/393e5631

Branch: refs/heads/master
Commit: 393e5631054a81ae1fdcd304f81cc68cf53d3422
Parents: 7e3f591 a6e2001
Author: Eugene Kirpichov 
Authored: Wed Sep 27 15:08:59 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Sep 27 15:08:59 2017 -0700

--
 .../java/org/apache/beam/sdk/io/FileIO.java | 12 +++-
 .../beam/sdk/io/ReadAllViaFileBasedSource.java  | 29 +
 .../org/apache/beam/sdk/testing/PAssert.java| 63 +++-
 .../apache/beam/sdk/testing/PaneExtractors.java | 25 +---
 .../beam/sdk/testing/SuccessOrFailure.java  | 41 -
 .../apache/beam/sdk/transforms/Reshuffle.java   | 47 +++
 .../apache/beam/sdk/testing/PAssertTest.java| 41 +++--
 .../beam/sdk/testing/PaneExtractorsTest.java|  7 +--
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java| 12 +---
 .../io/gcp/bigquery/StreamingWriteTables.java   | 10 ++--
 .../beam/sdk/io/gcp/datastore/DatastoreV1.java  | 27 +++--
 .../beam/sdk/io/gcp/spanner/SpannerIO.java  | 27 ++---
 .../sdk/io/gcp/datastore/DatastoreV1Test.java   | 15 ++---
 .../sdk/io/gcp/datastore/SplitQueryFnIT.java|  5 +-
 .../org/apache/beam/sdk/io/jdbc/JdbcIO.java | 24 +---
 15 files changed, 178 insertions(+), 207 deletions(-)
--




[1/4] beam git commit: Adds ReadableFile.toString() for debugging

2017-09-27 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 7e3f591b2 -> 393e56310


Adds ReadableFile.toString() for debugging


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/a6e20017
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/a6e20017
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/a6e20017

Branch: refs/heads/master
Commit: a6e20017e0aadf6d0015696c8d8a22cec6d48077
Parents: b74644a
Author: Eugene Kirpichov 
Authored: Mon Sep 25 11:58:53 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Sep 27 15:08:38 2017 -0700

--
 sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java | 5 +
 1 file changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/a6e20017/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
--
diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
index 6b75370..7df4bde 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileIO.java
@@ -152,6 +152,11 @@ public class FileIO {
 public String readFullyAsUTF8String() throws IOException {
   return new String(readFullyAsBytes(), StandardCharsets.UTF_8);
 }
+
+@Override
+public String toString() {
+  return "ReadableFile{metadata=" + metadata + ", compression=" + 
compression + '}';
+}
   }
 
   /**



[beam-site] branch asf-site updated (07615ea -> 82cd4ff)

2017-09-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 07615ea  Prepare repository for deployment.
 add 939306f  Manually add missing generated files
 add 82cd4ff  This closes #325

No new revisions were added by this update.

Summary of changes:
 .../docker-images}/index.html  | 227 -
 {src => content}/images/logo_gearpump.png  | Bin
 2 files changed, 129 insertions(+), 98 deletions(-)
 copy content/{get-started/quickstart-py => 
contribute/docker-images}/index.html (62%)
 copy {src => content}/images/logo_gearpump.png (100%)

-- 
To stop receiving notification emails like this one, please contact
['"commits@beam.apache.org" '].


[beam-site] 02/02: This closes #325

2017-09-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 82cd4ffbac0dd806aa75b17ff59c5d464e2778e8
Merge: 07615ea 939306f
Author: Mergebot 
AuthorDate: Wed Sep 27 21:55:55 2017 +

This closes #325

 content/contribute/docker-images/index.html | 383 
 content/images/logo_gearpump.png| Bin 0 -> 4691 bytes
 2 files changed, 383 insertions(+)



[beam-site] branch mergebot updated (8c112de -> 82cd4ff)

2017-09-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a change to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git.


from 8c112de  This closes #324
 add 07615ea  Prepare repository for deployment.
 new 939306f  Manually add missing generated files
 new 82cd4ff  This closes #325

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 .../docker-images}/index.html  | 227 -
 .../runners/capability-matrix/index.html   |  30 +--
 {src => content}/images/logo_gearpump.png  | Bin
 3 files changed, 144 insertions(+), 113 deletions(-)
 copy content/{get-started/quickstart-py => 
contribute/docker-images}/index.html (62%)
 copy {src => content}/images/logo_gearpump.png (100%)



[beam-site] 01/02: Manually add missing generated files

2017-09-27 Thread mergebot-role
This is an automated email from the ASF dual-hosted git repository.

mergebot-role pushed a commit to branch mergebot
in repository https://gitbox.apache.org/repos/asf/beam-site.git

commit 939306f7540a77298fe08718b3a5afa98b46033b
Author: melissa 
AuthorDate: Mon Sep 25 10:51:39 2017 -0700

Manually add missing generated files
---
 content/contribute/docker-images/index.html | 383 
 content/images/logo_gearpump.png| Bin 0 -> 4691 bytes
 2 files changed, 383 insertions(+)

diff --git a/content/contribute/docker-images/index.html 
b/content/contribute/docker-images/index.html
new file mode 100644
index 000..cd160aa
--- /dev/null
+++ b/content/contribute/docker-images/index.html
@@ -0,0 +1,383 @@
+
+
+  
+  
+  
+  
+  Beam Docker Images
+  
+  https://fonts.googleapis.com/css?family=Roboto:100,300,400; 
rel="stylesheet">
+  
+  https://ajax.googleapis.com/ajax/libs/jquery/2.2.0/jquery.min.js";>
+  
+  
+  https://beam.apache.org/contribute/docker-images/; data-proofer-ignore>
+  
+  https://beam.apache.org/feed.xml;>
+  
+
+
+
+
+
+  Docker Images
+
+Docker images allow to create a reproducible environment to build and test
+Beam. You can use the docker images by using the provided https://github.com/apache/beam/tree/master/sdks/java/build-tools/src/main/resources/docker;>Docker
 scripts.
+
+In this directory you will find scripts to build and run docker images for
+different purposes:
+
+
+  
+file: Create a Docker container from a Beam source 
code .zip file
+in a given environment. It is useful to test a specific version of Beam,
+for example to validate a release vote.
+  
+  
+git: Same as file but the Beam source code comes 
from the git repository,
+you can choose a given branch/tag/pull-request. Useful to test in a specific
+environment.
+  
+  
+release: It builds an end-user distribution of 
the latest version of Beam
+and its dependencies. 

[jira] [Commented] (BEAM-2996) Metric names should not be null or empty

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183305#comment-16183305
 ] 

ASF GitHub Bot commented on BEAM-2996:
--

GitHub user bjchambers opened a pull request:

https://github.com/apache/beam/pull/3914

[BEAM-2996] Ensure metric names are not null or empty

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [*] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [*] Each commit in the pull request should have a meaningful subject 
line and body.
 - [*] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [*] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [*] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [*] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bjchambers/beam metric-names

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3914.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3914


commit 8635e46ca2be718e9a5a32108be177b2a096ca51
Author: bchambers 
Date:   2017-09-27T17:44:48Z

Ensure metric names are not null or empty




> Metric names should not be null or empty
> 
>
> Key: BEAM-2996
> URL: https://issues.apache.org/jira/browse/BEAM-2996
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core, sdk-py-core
>Reporter: Ben Chambers
>Assignee: Ben Chambers
>Priority: Minor
>






[GitHub] beam pull request #3914: [BEAM-2996] Ensure metric names are not null or emp...

2017-09-27 Thread bjchambers
GitHub user bjchambers opened a pull request:

https://github.com/apache/beam/pull/3914

[BEAM-2996] Ensure metric names are not null or empty

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [*] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [*] Each commit in the pull request should have a meaningful subject 
line and body.
 - [*] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [*] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [*] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [*] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/bjchambers/beam metric-names

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3914.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3914


commit 8635e46ca2be718e9a5a32108be177b2a096ca51
Author: bchambers 
Date:   2017-09-27T17:44:48Z

Ensure metric names are not null or empty




---


[jira] [Commented] (BEAM-2058) BigQuery load job id should be generated at run time, not submission time

2017-09-27 Thread Ben Chambers (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183303#comment-16183303
 ] 

Ben Chambers commented on BEAM-2058:


Reuven -- it looks like the PR went in to fix this. Should it be marked as 
closed / added to appropriate release notes / etc?

> BigQuery load job id should be generated at run time, not submission time
> -
>
> Key: BEAM-2058
> URL: https://issues.apache.org/jira/browse/BEAM-2058
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Reuven Lax
>Assignee: Reuven Lax
>
> Currently the job id is generated at submission time, which means that 
> rerunning template jobs will produce the same job id. Generate at run time 
> instead, so a different job id is generated on each execution.
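The submission-time versus run-time distinction can be sketched as deferred evaluation (names here are illustrative, not Beam's): an id computed when the template is constructed is baked in once, while a Supplier defers generation until the pipeline actually runs, so each execution gets a fresh id.

```java
import java.util.UUID;
import java.util.function.Supplier;

// Sketch of the bug and its fix: a submission-time id is frozen into the
// template, while a run-time Supplier yields a new id per execution.
class JobIdDemo {
  // Submission time: evaluated once, when the template is constructed.
  static final String SUBMISSION_TIME_ID = "beam_load_" + UUID.randomUUID();

  // Run time: evaluated on every execution.
  static final Supplier<String> RUN_TIME_ID =
      () -> "beam_load_" + UUID.randomUUID();
}
```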





Jenkins build is back to normal : beam_PostCommit_Python_Verify #3227

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Resolved] (BEAM-2992) Remove codepaths for reading unsplit BigQuery sources

2017-09-27 Thread Reuven Lax (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Reuven Lax resolved BEAM-2992.
--
Resolution: Fixed

> Remove codepaths for reading unsplit BigQuery sources
> -
>
> Key: BEAM-2992
> URL: https://issues.apache.org/jira/browse/BEAM-2992
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>Priority: Minor
> Fix For: 2.2.0
>
>






Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3940

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2959) Fix proto enums to not use 0 for a valid value

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183219#comment-16183219
 ] 

ASF GitHub Bot commented on BEAM-2959:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/3913

[BEAM-2959] Encapsulate enums within a message so that C++/Python have 
meaningful namespaces when importing.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam fn_api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3913.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3913


commit 40c0a74e4a9426b79ecddc2f5d29cae41e0445ce
Author: Luke Cwik 
Date:   2017-09-27T20:33:16Z

[BEAM-2959] Encapsulate enums within a message so that C++/Python have 
meaningful namespaces when importing.




> Fix proto enums to not use 0 for a valid value
> --
>
> Key: BEAM-2959
> URL: https://issues.apache.org/jira/browse/BEAM-2959
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: 2.2.0
>
>
> Proto3 uses 0 as the default value for enums and does not encode it on the 
> wire, which means you cannot detect the difference between a value that 
> is unset and an enum that is set but has a value of zero.
> Defining a "YYY_UNSPECIFIED" value is considered a best practice. Unfortunately 
> this is not done automatically, because of proto2 compatibility within proto3.
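The practice described above can be sketched as follows (a hypothetical `.proto` fragment, modeled on what BEAM-2959 and BEAM-2997 propose; the message and value names are illustrative):

```protobuf
// Reserve 0 for an explicit UNSPECIFIED value so that an unset field is
// distinguishable from a deliberately chosen one, and nest the enum inside
// a message so that C++/Python importers get a namespaced name.
message IsBounded {
  enum Enum {
    UNSPECIFIED = 0;  // proto3 default; readable as "not set"
    BOUNDED = 1;
    UNBOUNDED = 2;
  }
}
```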





[GitHub] beam pull request #3913: [BEAM-2959] Encapsulate enums within a message so t...

2017-09-27 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/3913

[BEAM-2959] Encapsulate enums within a message so that C++/Python have 
meaningful namespaces when importing.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam fn_api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3913.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3913


commit 40c0a74e4a9426b79ecddc2f5d29cae41e0445ce
Author: Luke Cwik 
Date:   2017-09-27T20:33:16Z

[BEAM-2959] Encapsulate enums within a message so that C++/Python have 
meaningful namespaces when importing.




---


[jira] [Created] (BEAM-2997) Encapsulate all proto enum types within a message so that they are namespaced in C++/Python

2017-09-27 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2997:
---

 Summary: Encapsulate all proto enum types within a message so that 
they are namespaced in C++/Python
 Key: BEAM-2997
 URL: https://issues.apache.org/jira/browse/BEAM-2997
 Project: Beam
  Issue Type: Improvement
  Components: beam-model, sdk-java-core, sdk-py-core
Reporter: Luke Cwik
Assignee: Luke Cwik


Python and C++ dump enum values into the parent namespace. By placing each enum 
within a containing message, we can give enums a meaningful, namespaced name and 
avoid name collisions.

See:
https://developers.google.com/protocol-buffers/docs/cpptutorial#enums-and-nested-classes
https://developers.google.com/protocol-buffers/docs/pythontutorial#enums





Build failed in Jenkins: beam_PostCommit_Python_Verify #3226

2017-09-27 Thread Apache Jenkins Server
See 


--
[...truncated 991.48 KB...]
}
  ], 
  "is_pair_like": true
}
  ], 
  "is_stream_like": true
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": "assert_that/Group/GroupByKey.out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s11"
}, 
"serialized_fn": 
"%0AZ%22X%0A%1Dref_Coder_GlobalWindowCoder_1%127%0A5%0A3%0A1urn%3Abeam%3Acoders%3Aurn%3Abeam%3Acoders%3Aglobal_window%3A0.1jR%0A%25%0A%23%0A%21beam%3Awindowfn%3Aglobal_windows%3Av0.1%10%01%1A%1Dref_Coder_GlobalWindowCoder_1%22%02%3A%00%28%010%018%01",
 
"user_name": "assert_that/Group/GroupByKey"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s13", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": "_merge_tagged_vals_under_key"
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}
  ], 
  "is_pair_like": true
}, 
{
  "@type": "kind:global_window"
}
  ], 
  "is_wrapper": true
}, 
"output_name": "out", 
"user_name": 
"assert_that/Group/Map(_merge_tagged_vals_under_key).out"
  }
], 
"parallel_input": {
  "@type": "OutputReference", 
  "output_name": "out", 
  "step_name": "s12"
}, 
"serialized_fn": "", 
"user_name": "assert_that/Group/Map(_merge_tagged_vals_under_key)"
  }
}, 
{
  "kind": "ParallelDo", 
  "name": "s14", 
  "properties": {
"display_data": [
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.CallableWrapperDoFn", 
"type": "STRING", 
"value": ""
  }, 
  {
"key": "fn", 
"label": "Transform Function", 
"namespace": "apache_beam.transforms.core.ParDo", 
"shortValue": "CallableWrapperDoFn", 
"type": "STRING", 
"value": "apache_beam.transforms.core.CallableWrapperDoFn"
  }
], 
"non_parallel_inputs": {}, 
"output_info": [
  {
"encoding": {
  "@type": "kind:windowed_value", 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": [
{
  "@type": 
"FastPrimitivesCoder$eNprYEpOLEhMzkiNT0pNzNVLzk9JLSqGUlxuicUlAUWZuZklmWWpxc4gQa5CBs3GQsbaQqZQ/vi0xJycpMTk7Hiw+kJmPEYFZCZn56RCjWABGsFaW8iWVJykBwDlGS3/",
 
  "component_encodings": []
}, 
{
  "@type": 

[jira] [Commented] (BEAM-2992) Remove codepaths for reading unsplit BigQuery sources

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183206#comment-16183206
 ] 

ASF GitHub Bot commented on BEAM-2992:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3891


> Remove codepaths for reading unsplit BigQuery sources
> -
>
> Key: BEAM-2992
> URL: https://issues.apache.org/jira/browse/BEAM-2992
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Eugene Kirpichov
>Assignee: Eugene Kirpichov
>Priority: Minor
> Fix For: 2.2.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] beam pull request #3891: [BEAM-2992] Removes codepaths for reading unsplit B...

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3891


---


[2/2] beam git commit: This closes #3891: [BEAM-2992] Removes codepaths for reading unsplit BigQuery sources

2017-09-27 Thread jkff
This closes #3891: [BEAM-2992] Removes codepaths for reading unsplit BigQuery 
sources


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7e3f591b
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7e3f591b
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7e3f591b

Branch: refs/heads/master
Commit: 7e3f591b271bfbe45c20120743eebd410c37dd6b
Parents: 87ea068 18d7c29
Author: Eugene Kirpichov 
Authored: Wed Sep 27 12:37:10 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Sep 27 12:37:10 2017 -0700

--
 .../beam/sdk/testing/SourceTestUtils.java   |  11 +
 .../io/gcp/bigquery/BigQueryQuerySource.java|   8 -
 .../sdk/io/gcp/bigquery/BigQueryServices.java   |  44 --
 .../io/gcp/bigquery/BigQueryServicesImpl.java   |  64 ---
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java |  42 +-
 .../gcp/bigquery/BigQueryTableRowIterator.java  | 501 ---
 .../io/gcp/bigquery/BigQueryTableSource.java|   9 -
 .../sdk/io/gcp/bigquery/CalculateSchemas.java   |  78 ---
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java |  63 +--
 .../bigquery/BigQueryTableRowIteratorTest.java  | 358 -
 .../sdk/io/gcp/bigquery/BigQueryUtilTest.java   | 187 ---
 .../io/gcp/bigquery/FakeBigQueryServices.java   |  78 ---
 12 files changed, 26 insertions(+), 1417 deletions(-)
--




[1/2] beam git commit: Removes codepaths for reading unsplit BigQuery sources

2017-09-27 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 87ea0681e -> 7e3f591b2


Removes codepaths for reading unsplit BigQuery sources


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/18d7c296
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/18d7c296
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/18d7c296

Branch: refs/heads/master
Commit: 18d7c2969eac36c0cb02cf8869c299eb334fbe80
Parents: 87ea068
Author: Eugene Kirpichov 
Authored: Fri Sep 22 16:28:38 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Sep 27 12:32:41 2017 -0700

--
 .../beam/sdk/testing/SourceTestUtils.java   |  11 +
 .../io/gcp/bigquery/BigQueryQuerySource.java|   8 -
 .../sdk/io/gcp/bigquery/BigQueryServices.java   |  44 --
 .../io/gcp/bigquery/BigQueryServicesImpl.java   |  64 ---
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java |  42 +-
 .../gcp/bigquery/BigQueryTableRowIterator.java  | 501 ---
 .../io/gcp/bigquery/BigQueryTableSource.java|   9 -
 .../sdk/io/gcp/bigquery/CalculateSchemas.java   |  78 ---
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java |  63 +--
 .../bigquery/BigQueryTableRowIteratorTest.java  | 358 -
 .../sdk/io/gcp/bigquery/BigQueryUtilTest.java   | 187 ---
 .../io/gcp/bigquery/FakeBigQueryServices.java   |  78 ---
 12 files changed, 26 insertions(+), 1417 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/18d7c296/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java
--
diff --git 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java
index e147221..a324bdd 100644
--- 
a/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java
+++ 
b/sdks/java/core/src/main/java/org/apache/beam/sdk/testing/SourceTestUtils.java
@@ -27,6 +27,7 @@ import static org.junit.Assert.assertThat;
 import static org.junit.Assert.assertTrue;
 
 import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Lists;
 import java.io.IOException;
 import java.util.ArrayList;
 import java.util.List;
@@ -139,6 +140,16 @@ public class SourceTestUtils {
 }
   }
 
+  public static  List readFromSplitsOfSource(
+  BoundedSource source, long desiredBundleSizeBytes, PipelineOptions 
options)
+  throws Exception {
+List res = Lists.newArrayList();
+for (BoundedSource split : source.split(desiredBundleSizeBytes, 
options)) {
+  res.addAll(readFromSource(split, options));
+}
+return res;
+  }
+
   /**
* Reads all elements from the given unstarted {@link Source.Reader}.
*/

http://git-wip-us.apache.org/repos/asf/beam/blob/18d7c296/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
index 2572e19..b92f8cc 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryQuerySource.java
@@ -27,7 +27,6 @@ import 
com.google.api.services.bigquery.model.JobConfigurationQuery;
 import com.google.api.services.bigquery.model.JobReference;
 import com.google.api.services.bigquery.model.JobStatistics;
 import com.google.api.services.bigquery.model.TableReference;
-import com.google.api.services.bigquery.model.TableRow;
 import com.google.common.annotations.VisibleForTesting;
 import java.io.IOException;
 import java.io.ObjectInputStream;
@@ -89,13 +88,6 @@ class BigQueryQuerySource extends BigQuerySourceBase {
   }
 
   @Override
-  public BoundedReader createReader(PipelineOptions options) throws 
IOException {
-BigQueryOptions bqOptions = options.as(BigQueryOptions.class);
-return new BigQueryReader(this, bqServices.getReaderFromQuery(
-bqOptions, bqOptions.getProject(), createBasicQueryConfig()));
-  }
-
-  @Override
   protected TableReference getTableToExtract(BigQueryOptions bqOptions)
   throws IOException, InterruptedException {
 // 1. Find the location of the query.

http://git-wip-us.apache.org/repos/asf/beam/blob/18d7c296/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryServices.java
--
diff --git 

[jira] [Commented] (BEAM-2724) MSEC counters should support Structured Names in Dataflow

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183137#comment-16183137
 ] 

ASF GitHub Bot commented on BEAM-2724:
--

GitHub user pabloem opened a pull request:

https://github.com/apache/beam/pull/3912

[BEAM-2724] Updating BEAM_CONTAINER_VERSION for new worker

r: @charlesccychen 

Updating to a new worker harness that supports structured names for msec 
counters.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam newworka

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3912


commit 66a988a9e1f97e13a162d004bc8fa6c2d871663d
Author: Pablo 
Date:   2017-09-27T19:44:35Z

Updating BEAM_CONTAINER_VERSION for new worker




> MSEC counters should support Structured Names in Dataflow
> -
>
> Key: BEAM-2724
> URL: https://issues.apache.org/jira/browse/BEAM-2724
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>






[GitHub] beam pull request #3912: [BEAM-2724] Updating BEAM_CONTAINER_VERSION for new...

2017-09-27 Thread pabloem
GitHub user pabloem opened a pull request:

https://github.com/apache/beam/pull/3912

[BEAM-2724] Updating BEAM_CONTAINER_VERSION for new worker

r: @charlesccychen 

Updating to a new worker harness that supports structured names for msec 
counters.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam newworka

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3912.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3912


commit 66a988a9e1f97e13a162d004bc8fa6c2d871663d
Author: Pablo 
Date:   2017-09-27T19:44:35Z

Updating BEAM_CONTAINER_VERSION for new worker




---


Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Dataflow #4056

2017-09-27 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3911: Avoid using beta grpc implementation.

2017-09-27 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/3911

Avoid using beta grpc implementation.

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [ ] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [ ] Each commit in the pull request should have a meaningful subject 
line and body.
 - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [ ] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [ ] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam no-grpc-beta

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3911.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3911


commit 40ccd43cab9e1d37146a66fda50ef9a9cd9378b3
Author: Robert Bradshaw 
Date:   2017-09-27T19:29:05Z

Avoid using beta grpc implementation.

commit f966e43526c926b64bdfafa5fbf820c5cb92038d
Author: Robert Bradshaw 
Date:   2017-09-27T19:35:41Z

Fix access of grpc via pb2 files.




---


[jira] [Resolved] (BEAM-2959) Fix proto enums to not use 0 for a valid value

2017-09-27 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2959.
-
   Resolution: Fixed
Fix Version/s: 2.2.0

> Fix proto enums to not use 0 for a valid value
> --
>
> Key: BEAM-2959
> URL: https://issues.apache.org/jira/browse/BEAM-2959
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: 2.2.0
>
>
> Proto3 uses 0 as the default value for enums and does not encode it on the 
> wire, which means you cannot detect the difference between a value that 
> is unset and an enum that is set but has a value of zero.
> Defining a "YYY_UNSPECIFIED" value is considered a best practice. Unfortunately 
> this is not done automatically because of proto2 compatibility within proto3.
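
As an illustrative fragment (not the exact Beam definition), the recommended pattern reserves 0 for an explicit unspecified value, so "unset" is distinguishable from the first real value even though proto3 omits zero-valued enums on the wire:

```protobuf
syntax = "proto3";

// Illustrative only; enum and value names are hypothetical.
enum Severity {
  SEVERITY_UNSPECIFIED = 0;  // proto3 default; unambiguously means "not set"
  TRACE = 1;                 // real values start at 1
  DEBUG = 2;
}
```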





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3939

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2959) Fix proto enums to not use 0 for a valid value

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183085#comment-16183085
 ] 

ASF GitHub Bot commented on BEAM-2959:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3904


> Fix proto enums to not use 0 for a valid value
> --
>
> Key: BEAM-2959
> URL: https://issues.apache.org/jira/browse/BEAM-2959
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> Proto3 uses 0 as the default value for enums and does not encode it on the 
> wire, which means you cannot detect the difference between a value that 
> is unset and an enum that is set but has a value of zero.
> Defining a "YYY_UNSPECIFIED" value is considered a best practice. Unfortunately 
> this is not done automatically because of proto2 compatibility within proto3.





[GitHub] beam pull request #3904: [BEAM-2959] Fix proto enums to use "YYY_UNSPECIFIED...

2017-09-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/3904


---


[1/2] beam git commit: [BEAM-2959] Fix proto enums to use "YYY_UNSPECIFIED" as the first declared enum.

2017-09-27 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 41239d808 -> 87ea0681e


[BEAM-2959] Fix proto enums to use "YYY_UNSPECIFIED" as the first declared enum.


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/b6c68a6c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/b6c68a6c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/b6c68a6c

Branch: refs/heads/master
Commit: b6c68a6cb3c84d6445a5a494812b59df17627f22
Parents: 41239d8
Author: Luke Cwik 
Authored: Fri Sep 15 11:23:19 2017 -0700
Committer: Luke Cwik 
Committed: Wed Sep 27 11:52:20 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  2 +-
 .../fn-api/src/main/proto/beam_fn_api.proto | 15 ++---
 .../src/main/proto/beam_job_api.proto   | 34 +-
 .../src/main/proto/beam_runner_api.proto| 66 
 4 files changed, 66 insertions(+), 51 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/b6c68a6c/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index 4d2c5ee..36ccb5a 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -33,7 +33,7 @@
   jar
 
   
-
beam-master-20170922-01
+
beam-master-20170926
 
1
 
6
   

http://git-wip-us.apache.org/repos/asf/beam/blob/b6c68a6c/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
--
diff --git a/sdks/common/fn-api/src/main/proto/beam_fn_api.proto 
b/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
index 9bf1b5f..f2bbd3c 100644
--- a/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
+++ b/sdks/common/fn-api/src/main/proto/beam_fn_api.proto
@@ -631,23 +631,24 @@ message LogEntry {
   // can provide filtering and searching across log types. Users of the API are
   // free not to use all severity levels in their log messages.
   enum Severity {
+SEVERITY_UNSPECIFIED = 0;
 // Trace level information, also the default log level unless
 // another severity is specified.
-TRACE = 0;
+TRACE = 1;
 // Debugging information.
-DEBUG = 10;
+DEBUG = 2;
 // Normal events.
-INFO = 20;
+INFO = 3;
 // Normal but significant events, such as start up, shut down, or
 // configuration.
-NOTICE = 30;
+NOTICE = 4;
 // Warning events might cause problems.
-WARN = 40;
+WARN = 5;
 // Error events are likely to cause problems.
-ERROR = 50;
+ERROR = 6;
 // Critical events cause severe problems or brief outages and may
 // indicate that a person must take action.
-CRITICAL = 60;
+CRITICAL = 7;
   }
 
   // (Required) The severity of the log statement.

http://git-wip-us.apache.org/repos/asf/beam/blob/b6c68a6c/sdks/common/runner-api/src/main/proto/beam_job_api.proto
--
diff --git a/sdks/common/runner-api/src/main/proto/beam_job_api.proto 
b/sdks/common/runner-api/src/main/proto/beam_job_api.proto
index 5fa02ba..9d826ff 100644
--- a/sdks/common/runner-api/src/main/proto/beam_job_api.proto
+++ b/sdks/common/runner-api/src/main/proto/beam_job_api.proto
@@ -134,11 +134,12 @@ message JobMessage {
   string message_text = 4;
 
   enum MessageImportance {
-JOB_MESSAGE_DEBUG = 0;
-JOB_MESSAGE_DETAILED = 1;
-JOB_MESSAGE_BASIC = 2;
-JOB_MESSAGE_WARNING = 3;
-JOB_MESSAGE_ERROR = 4;
+MESSAGE_IMPORTANCE_UNSPECIFIED = 0;
+JOB_MESSAGE_DEBUG = 1;
+JOB_MESSAGE_DETAILED = 2;
+JOB_MESSAGE_BASIC = 3;
+JOB_MESSAGE_WARNING = 4;
+JOB_MESSAGE_ERROR = 5;
   }
 }
 
@@ -152,16 +153,17 @@ message JobMessagesResponse {
 message JobState {
   // Enumeration of all JobStates
   enum JobStateType {
-UNKNOWN = 0;
-STOPPED = 1;
-RUNNING = 2;
-DONE = 3;
-FAILED = 4;
-CANCELLED = 5;
-UPDATED = 6;
-DRAINING = 7;
-DRAINED = 8;
-STARTING = 9;
-CANCELLING = 10;
+JOB_STATE_TYPE_UNSPECIFIED = 0;
+UNKNOWN = 1;
+STOPPED = 2;
+RUNNING = 3;
+DONE = 4;
+FAILED = 5;
+CANCELLED = 6;
+UPDATED = 7;
+DRAINING = 8;
+DRAINED = 9;
+STARTING = 10;
+CANCELLING = 11;
   }
 }

http://git-wip-us.apache.org/repos/asf/beam/blob/b6c68a6c/sdks/common/runner-api/src/main/proto/beam_runner_api.proto
--
diff --git a/sdks/common/runner-api/src/main/proto/beam_runner_api.proto 
b/sdks/common/runner-api/src/main/proto/beam_runner_api.proto
index fb5d47e..3b68993 100644
--- a/sdks/common/runner-api/src/main/proto/beam_runner_api.proto

[2/2] beam git commit: [BEAM-2959] Fix proto enums to use "YYY_UNSPECIFIED" as the first declared enum.

2017-09-27 Thread lcwik
[BEAM-2959] Fix proto enums to use "YYY_UNSPECIFIED" as the first declared enum.

This closes #3904


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/87ea0681
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/87ea0681
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/87ea0681

Branch: refs/heads/master
Commit: 87ea0681e36c0640c02c8d137f83ad73b95e1bc3
Parents: 41239d8 b6c68a6
Author: Luke Cwik 
Authored: Wed Sep 27 11:53:23 2017 -0700
Committer: Luke Cwik 
Committed: Wed Sep 27 11:53:23 2017 -0700

--
 runners/google-cloud-dataflow-java/pom.xml  |  2 +-
 .../fn-api/src/main/proto/beam_fn_api.proto | 15 ++---
 .../src/main/proto/beam_job_api.proto   | 34 +-
 .../src/main/proto/beam_runner_api.proto| 66 
 4 files changed, 66 insertions(+), 51 deletions(-)
--




[jira] [Commented] (BEAM-2802) TextIO should allow specifying a custom delimiter

2017-09-27 Thread Davor Bonaci (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16183062#comment-16183062
 ] 

Davor Bonaci commented on BEAM-2802:


[~ryanskraba] and [~echauchot] -- thank you both for making this happen and 
pushing it through!

> TextIO should allow specifying a custom delimiter
> -
>
> Key: BEAM-2802
> URL: https://issues.apache.org/jira/browse/BEAM-2802
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
> Fix For: 2.2.0
>
>
> Currently TextIO uses {{\r}}, {{\n}}, or {{\r\n}} (or a mix of them) to split a 
> text file into PCollection elements. It might happen that a record is spread 
> across more than one line. In that case we should be able to specify a custom 
> record delimiter to be used in place of the default ones.
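
Conceptually (names here are illustrative, not Beam's implementation), splitting on a caller-chosen delimiter instead of line endings looks like:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative record splitter: records are separated by a custom delimiter,
// so a single record may legitimately span several "lines" (contain \n or \r\n).
public class CustomDelimiterDemo {
  static List<String> splitRecords(String data, String delimiter) {
    List<String> records = new ArrayList<>();
    int start = 0;
    int idx;
    while ((idx = data.indexOf(delimiter, start)) != -1) {
      records.add(data.substring(start, idx));
      start = idx + delimiter.length();
    }
    if (start < data.length()) {
      // Trailing record with no closing delimiter.
      records.add(data.substring(start));
    }
    return records;
  }

  public static void main(String[] args) {
    // A record spread across two lines stays intact when '|' is the delimiter.
    System.out.println(splitRecords("a\nb|c\nd", "|"));
  }
}
```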





Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Spark #3167

2017-09-27 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : beam_PerformanceTests_Python #380

2017-09-27 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3938

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-2996) Metric names should not be null or empty

2017-09-27 Thread Ben Chambers (JIRA)
Ben Chambers created BEAM-2996:
--

 Summary: Metric names should not be null or empty
 Key: BEAM-2996
 URL: https://issues.apache.org/jira/browse/BEAM-2996
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core, sdk-py-core
Reporter: Ben Chambers
Assignee: Ben Chambers
Priority: Minor








[jira] [Commented] (BEAM-2975) Results of ReadableState.read() should be snapshots of the underlying state

2017-09-27 Thread Daniel Mills (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182927#comment-16182927
 ] 

Daniel Mills commented on BEAM-2975:


In order for the results of a read() call to be used as output() from a DoFn, 
they need to be an immutable snapshot of the state, given 
[https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L148].
Not being able to output the results of the read puts a large, unnecessary 
restriction on the State API, so I think the semantics here are the correct 
ones. Prior to this change, the behavior was unspecified and differed across 
runners, which seems even worse.

readLater() should have no semantic effect; it's just a hint to runners to do 
anything they'd like to in order to improve the performance of future reads.

With regards to the Fn API: given that reads and modifications are performed in 
order, the runner has all the information it needs to only return the correct 
elements for a read call given the snapshot semantics.

If the fix in Flink is straightforward, I'd prefer to fix it, or alternatively 
to temporarily disable those three tests for Flink until a fix is ready.

> Results of ReadableState.read() should be snapshots of the underlying state
> ---
>
> Key: BEAM-2975
> URL: https://issues.apache.org/jira/browse/BEAM-2975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Mills
>Assignee: Daniel Mills
>Priority: Minor
>
> Future modification of state should not be reflected in previous calls to 
> read().  For example:
> @StateId("tag") BagState<Integer> state;
> Iterable<Integer> ints = state.read();
> state.add(17);
> // ints should still be empty here.
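
A minimal, self-contained sketch (plain Java, not Beam's actual BagState implementation) of the snapshot-on-read semantics described above:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical in-memory bag state: read() returns an immutable copy, so
// later add() calls cannot be observed through previously returned iterables.
public class SnapshotBagStateDemo {
  static class SnapshotBagState<T> {
    private final List<T> contents = new ArrayList<>();

    Iterable<T> read() {
      // Defensive copy: the snapshot is decoupled from future writes.
      return Collections.unmodifiableList(new ArrayList<>(contents));
    }

    void add(T value) {
      contents.add(value);
    }
  }

  public static void main(String[] args) {
    SnapshotBagState<Integer> state = new SnapshotBagState<>();
    Iterable<Integer> ints = state.read();
    state.add(17);
    int size = 0;
    for (Integer ignored : ints) {
      size++;
    }
    System.out.println(size); // the earlier snapshot is still empty: prints 0
  }
}
```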





[jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182818#comment-16182818
 ] 

Eugene Kirpichov commented on BEAM-2993:


Do you have a more concrete use case? I don't think it's possible to *create* a 
PCollection<GenericRecord> without knowing the record schema, because the Coder 
for GenericRecord requires a schema - so if somebody has such a collection, I'd 
assume they have the schema as well.

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection}}  but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.





[jira] [Commented] (BEAM-2724) MSEC counters should support Structured Names in Dataflow

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182782#comment-16182782
 ] 

ASF GitHub Bot commented on BEAM-2724:
--

Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/3786


> MSEC counters should support Structured Names in Dataflow
> -
>
> Key: BEAM-2724
> URL: https://issues.apache.org/jira/browse/BEAM-2724
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>






[jira] [Commented] (BEAM-2724) MSEC counters should support Structured Names in Dataflow

2017-09-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182783#comment-16182783
 ] 

ASF GitHub Bot commented on BEAM-2724:
--

GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/3786

[BEAM-2724] Preparing statesampler to work with structured names




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam ssampler-structured

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3786.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3786






> MSEC counters should support Structured Names in Dataflow
> -
>
> Key: BEAM-2724
> URL: https://issues.apache.org/jira/browse/BEAM-2724
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>






[GitHub] beam pull request #3786: [BEAM-2724] Preparing statesampler to work with str...

2017-09-27 Thread pabloem
GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/3786

[BEAM-2724] Preparing statesampler to work with structured names




You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam ssampler-structured

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3786.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3786






---


[GitHub] beam pull request #3786: [BEAM-2724] Preparing statesampler to work with str...

2017-09-27 Thread pabloem
Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/3786


---


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #3166

2017-09-27 Thread Apache Jenkins Server
See 




Build failed in Jenkins: beam_PerformanceTests_Python #379

2017-09-27 Thread Apache Jenkins Server
See 


Changes:

[ekirpichov] Sets a TTL on BigQueryIO.read().fromQuery() temp dataset

--
[...truncated 714.23 KB...]
   for (int i = 0; i < parts.size(); ++i) {
  ^
third_party/protobuf/src/google/protobuf/util/field_mask_util.cc: In member 
function void 
google::protobuf::util::{anonymous}::FieldMaskTree::IntersectPath(const 
string&, google::protobuf::util::{anonymous}::FieldMaskTree*):
third_party/protobuf/src/google/protobuf/util/field_mask_util.cc:342:34: 
warning: comparison between signed and unsigned integer expressions 
[-Wsign-compare]
   for (int i = 0; i < parts.size(); ++i) {
  ^
third_party/protobuf/src/google/protobuf/util/field_mask_util.cc: In static 
member function static bool 
google::protobuf::util::FieldMaskUtil::IsPathInFieldMask(google::protobuf::StringPiece,
 const FieldMask&):
third_party/protobuf/src/google/protobuf/util/field_mask_util.cc:568:49: 
warning: comparison between signed and unsigned integer expressions 
[-Wsign-compare]
 } else if (mask_path.length() < path.length()) {
 ^
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DHAVE_PTHREAD=1 -I. -Igrpc_root 
-Igrpc_root/include -Ithird_party/protobuf/src -I/usr/include/python2.7 -c 
third_party/protobuf/src/google/protobuf/util/field_comparator.cc -o 
build/temp.linux-x86_64-2.7/third_party/protobuf/src/google/protobuf/util/field_comparator.o
 -std=c++11 -fno-wrapv -frtti
cc1plus: warning: command line option -Wstrict-prototypes is valid for 
C/ObjC but not for C++ [enabled by default]
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DHAVE_PTHREAD=1 -I. -Igrpc_root 
-Igrpc_root/include -Ithird_party/protobuf/src -I/usr/include/python2.7 -c 
third_party/protobuf/src/google/protobuf/util/delimited_message_util.cc -o 
build/temp.linux-x86_64-2.7/third_party/protobuf/src/google/protobuf/util/delimited_message_util.o
 -std=c++11 -fno-wrapv -frtti
cc1plus: warning: command line option -Wstrict-prototypes is valid for 
C/ObjC but not for C++ [enabled by default]
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DHAVE_PTHREAD=1 -I. -Igrpc_root 
-Igrpc_root/include -Ithird_party/protobuf/src -I/usr/include/python2.7 -c 
third_party/protobuf/src/google/protobuf/unknown_field_set.cc -o 
build/temp.linux-x86_64-2.7/third_party/protobuf/src/google/protobuf/unknown_field_set.o
 -std=c++11 -fno-wrapv -frtti
cc1plus: warning: command line option -Wstrict-prototypes is valid for 
C/ObjC but not for C++ [enabled by default]
third_party/protobuf/src/google/protobuf/unknown_field_set.cc: In member 
function size_t google::protobuf::UnknownFieldSet::SpaceUsedExcludingSelfLong() 
const:
third_party/protobuf/src/google/protobuf/unknown_field_set.cc:131:37: 
warning: comparison between signed and unsigned integer expressions 
[-Wsign-compare]
   for (int i = 0; i < fields_->size(); i++) {
 ^
third_party/protobuf/src/google/protobuf/unknown_field_set.cc: In member 
function void google::protobuf::UnknownFieldSet::DeleteSubrange(int, int):
third_party/protobuf/src/google/protobuf/unknown_field_set.cc:213:47: 
warning: comparison between signed and unsigned integer expressions 
[-Wsign-compare]
   for (int i = start + num; i < fields_->size(); ++i) {
   ^
third_party/protobuf/src/google/protobuf/unknown_field_set.cc: In member 
function void google::protobuf::UnknownFieldSet::DeleteByNumber(int):
third_party/protobuf/src/google/protobuf/unknown_field_set.cc:230:37: 
warning: comparison between signed and unsigned integer expressions 
[-Wsign-compare]
   for (int i = 0; i < fields_->size(); ++i) {
 ^
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DHAVE_PTHREAD=1 -I. -Igrpc_root 
-Igrpc_root/include -Ithird_party/protobuf/src -I/usr/include/python2.7 -c 
third_party/protobuf/src/google/protobuf/type.pb.cc -o 
build/temp.linux-x86_64-2.7/third_party/protobuf/src/google/protobuf/type.pb.o 
-std=c++11 -fno-wrapv -frtti
cc1plus: warning: command line option -Wstrict-prototypes is valid for 
C/ObjC but not for C++ [enabled by default]
x86_64-linux-gnu-gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 
-Wall -Wstrict-prototypes -fPIC -DHAVE_PTHREAD=1 -I. -Igrpc_root 
-Igrpc_root/include -Ithird_party/protobuf/src -I/usr/include/python2.7 -c 
third_party/protobuf/src/google/protobuf/timestamp.pb.cc -o 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3937

2017-09-27 Thread Apache Jenkins Server
See 




[jira] [Created] (BEAM-2995) can't read/write hdfs in Flink CLUSTER(Standalone)

2017-09-27 Thread huangjianhuang (JIRA)
huangjianhuang created BEAM-2995:


 Summary: can't read/write hdfs in Flink CLUSTER(Standalone)
 Key: BEAM-2995
 URL: https://issues.apache.org/jira/browse/BEAM-2995
 Project: Beam
  Issue Type: Bug
  Components: runner-flink
Affects Versions: 2.2.0
Reporter: huangjianhuang
Assignee: Aljoscha Krettek


I just wrote a simple demo like:

{code:java}
Configuration conf = new Configuration();
conf.set("fs.default.name", "hdfs://localhost:9000");
//other codes
p.apply("ReadLines", 
TextIO.read().from("hdfs://localhost:9000/tmp/words"))
.apply(TextIO.write().to("hdfs://localhost:9000/tmp/hdfsout"));
{code}

It works in Flink local mode with the command:

{code:java}
mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner 
-Dexec.args="--runner=FlinkRunner 
--filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar"
{code}

but it does not work in CLUSTER mode:

{code:java}
mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner 
-Dexec.args="--runner=FlinkRunner 
--filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar 
--flinkMaster=localhost:6123 "
{code}

It seems the Flink cluster treats HDFS as the local file system.
The input log from flink-jobmanager.log is:

{code:java}
2017-09-27 20:17:37,962 INFO  org.apache.flink.runtime.jobmanager.JobManager
- Successfully ran initialization on master in 136 ms.
2017-09-27 20:17:37,968 INFO  org.apache.beam.sdk.io.FileBasedSource
- {color:red}Filepattern hdfs://localhost:9000/tmp/words2 matched 0 
files with total size 0{color}
2017-09-27 20:17:37,968 INFO  org.apache.beam.sdk.io.FileBasedSource
- Splitting filepattern hdfs://localhost:9000/tmp/words2 into 
bundles of size 0 took 0 ms and produced 0 files and 0 bundles

{code}

The output error message is:

{code:java}
Caused by: java.lang.ClassCastException: 
{color:red}org.apache.beam.sdk.io.hdfs.HadoopResourceId cannot be cast to 
org.apache.beam.sdk.io.LocalResourceId{color}
at 
org.apache.beam.sdk.io.LocalFileSystem.create(LocalFileSystem.java:77)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:256)
at org.apache.beam.sdk.io.FileSystems.create(FileSystems.java:243)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.open(FileBasedSink.java:922)
at 
org.apache.beam.sdk.io.FileBasedSink$Writer.openUnwindowed(FileBasedSink.java:884)
at 
org.apache.beam.sdk.io.WriteFiles.finalizeForDestinationFillEmptyShards(WriteFiles.java:909)
at org.apache.beam.sdk.io.WriteFiles.access$900(WriteFiles.java:110)
at 
org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:858)

{code}

Can somebody help me? I've tried everything and just can't work it out. [cry]
https://issues.apache.org/jira/browse/BEAM-2457
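For this class of problem (workers resolving `hdfs://` paths with `LocalFileSystem`), one commonly suggested workaround is to pass the HDFS configuration as a pipeline option via `HadoopFileSystemOptions` (`--hdfsConfiguration`, a JSON list of maps), so the Hadoop filesystem gets registered for the `hdfs://` scheme on the cluster as well. This is a sketch only, not verified against this exact setup:

```shell
# Sketch: pass the Hadoop configuration through pipeline options
# (HadoopFileSystemOptions.hdfsConfiguration) instead of only setting
# it in a local Configuration object in the driver code.
mvn exec:java -Dexec.mainClass=com.joe.FlinkWithHDFS -Pflink-runner \
  -Dexec.args="--runner=FlinkRunner \
    --flinkMaster=localhost:6123 \
    --filesToStage=target/flinkBeam-2.2.0-SNAPSHOT-shaded.jar \
    --hdfsConfiguration=[{\"fs.default.name\":\"hdfs://localhost:9000\"}]"
```

This requires `beam-sdks-java-io-hadoop-file-system` (already in the pom above) to be present in the shaded jar so its filesystem registrar is picked up on the workers.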






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (BEAM-2994) Refactor TikaIO

2017-09-27 Thread Sergey Beryozkin (JIRA)
Sergey Beryozkin created BEAM-2994:
--

 Summary: Refactor TikaIO
 Key: BEAM-2994
 URL: https://issues.apache.org/jira/browse/BEAM-2994
 Project: Beam
  Issue Type: Task
  Components: sdk-java-extensions
Affects Versions: 2.2.0
Reporter: Sergey Beryozkin
Assignee: Reuven Lax
 Fix For: 2.2.0


TikaIO is currently implemented as a BoundedSource and asynchronous 
BoundedReader returning individual document's text chunks as Strings, 
eventually passed unordered (and not linked to the original documents) to the 
pipeline functions.

It was decided in the recent beam-dev thread that initially TikaIO should 
support the cases where only a single composite bean per file, capturing the 
file content, location (or name) and metadata, should flow to the pipeline, 
thus avoiding the need to implement TikaIO as a BoundedSource/Reader.

Enhancing TikaIO to support streaming of the content into the pipelines 
may be considered in the next phase, based on the specific use-cases... 
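The composite bean described above might look roughly like the following. This is a sketch only; `ParsedDocument` and its field names are illustrative, not the actual TikaIO API:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical "one composite bean per file": content, location (or name),
// and metadata travel together through the pipeline, so downstream
// functions never see unordered text chunks detached from their document.
class ParsedDocument {
    private final String location;              // file path or name
    private final String content;               // full extracted text
    private final Map<String, String> metadata; // Tika metadata key/values

    ParsedDocument(String location, String content, Map<String, String> metadata) {
        this.location = location;
        this.content = content;
        // Defensive, read-only copy so the bean stays immutable in flight.
        this.metadata = Collections.unmodifiableMap(new HashMap<>(metadata));
    }

    String getLocation() { return location; }
    String getContent() { return content; }
    Map<String, String> getMetadata() { return metadata; }
}
```

An immutable value type like this also sidesteps the ordering and linkage issues the chunk-based reader had.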





[jira] [Commented] (BEAM-995) Apache Pig DSL

2017-09-27 Thread James Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182502#comment-16182502
 ] 

James Xu commented on BEAM-995:
---

[~nielsbasjes] Pig is actually doing a similar thing to Beam: 

# They both define a unified data processing API
# They both support several backends.

So doing pig-on-beam on either side does not make much difference. I prefer to 
do it on the Beam side because:

# Beam is already doing the `support several backends` thing; let's just let 
Beam do it and let Pig focus on its primary advantage: the friendly API.
# It aligns with other extensions like SQL.


Regarding the pros you mentioned for doing it on the Pig side:

1. Builtin facilities for loading UDFs and UDAFs

> Yes, I agree, the existing UDFs and UDAFs are very important. If we do 
> pig-on-beam on the Beam side, we will have something like `UDFAdapter` which 
> will adapt all existing UDFs, so we can use them in the new pig-on-beam.

2. Execution flow optimizer(s)

> There is a pipeline optimizer in Beam, and also an optimizer in the underlying 
> engine (Spark, MapReduce); will the Pig optimizer matter so much in this 
> context? (I am not familiar with Pig; correct me if I am wrong.)

3. A selection of execution backends.

> Beam itself supports all the different backends.

> Apache Pig DSL
> --
>
> Key: BEAM-995
> URL: https://issues.apache.org/jira/browse/BEAM-995
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Apache Pig is still popular and the language is not so large.
> Providing a DSL using the Pig language would potentially allow more people to 
> use Beam (at least during a transition period).





[jira] [Comment Edited] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182143#comment-16182143
 ] 

Etienne Chauchot edited comment on BEAM-2993 at 9/27/17 11:43 AM:
--

I was thinking of maybe introducing a schema sideInput in 
{{ConstantAvroDestination}}
[~jkff] you coded https://issues.apache.org/jira/browse/BEAM-2677, do you have 
any comments?


was (Author: echauchot):
[~jkff] you coded https://issues.apache.org/jira/browse/BEAM-2677, do you have 
any comments?

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}} but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.





[jira] [Commented] (BEAM-995) Apache Pig DSL

2017-09-27 Thread Niels Basjes (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182308#comment-16182308
 ] 

Niels Basjes commented on BEAM-995:
---

[~xumingming] I was talking about this issue last week with a colleague and we 
have doubts about building this on the Beam side. 
If I look at Apache Pig at a high level, I see a lot more than _just a parser_. 
A quick list of things I see in Pig:
# Language spec and parser
# Builtin facilities for loading UDFs and UDAFs
# A lot of builtin functions (some in pig, some in pig contrib)
# Execution flow optimizer(s)
# A selection of execution backends (i.e. MapReduce, Tez, Spark). 

Looking at the large number of features in the first few items on this list 
makes me believe that adding Beam as an additional execution backend (without 
ANY optimizations) is easier to build and maintain in the long run.


> Apache Pig DSL
> --
>
> Key: BEAM-995
> URL: https://issues.apache.org/jira/browse/BEAM-995
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-ideas
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>
> Apache Pig is still popular and the language is not so large.
> Providing a DSL using the Pig language would potentially allow more people to 
> use Beam (at least during a transition period).





[jira] [Assigned] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot reassigned BEAM-2993:
--

Assignee: Etienne Chauchot

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}} but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.





[GitHub] beam pull request #3910: Fix "Writing data to multiple destinations" part of...

2017-09-27 Thread echauchot
GitHub user echauchot opened a pull request:

https://github.com/apache/beam/pull/3910

Fix "Writing data to multiple destinations" part of AvroIO javadoc

Follow this checklist to help us incorporate your contribution quickly and 
easily:

 - [] Make sure there is a [JIRA 
issue](https://issues.apache.org/jira/projects/BEAM/issues/) filed for the 
change (usually before you start working on it).  Trivial changes like typos do 
not require a JIRA issue.  Your pull request should address just this issue, 
without pulling in other changes.
 - [X] Each commit in the pull request should have a meaningful subject 
line and body.
 - [] Format the pull request title like `[BEAM-XXX] Fixes bug in 
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA 
issue.
 - [] Write a pull request description that is detailed enough to 
understand what the pull request does, how, and why.
 - [] Run `mvn clean verify` to make sure basic checks pass. A more 
thorough check will be performed on your pull request automatically.
 - [X] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).

---
R: @jkff 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/echauchot/beam AvroIO_javadoc_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3910.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3910


commit 902f556573eec3f069c3f4d8983562ff6477155a
Author: Etienne Chauchot 
Date:   2017-09-27T09:27:07Z

Fix "Writing data to multiple destinations" part of AvroIO javadoc




---


Jenkins build is back to normal : beam_PostCommit_Python_Verify #3224

2017-09-27 Thread Apache Jenkins Server
See 




[GitHub] beam pull request #3909: Adapt Flink StateInternals to new state semantics

2017-09-27 Thread aljoscha
GitHub user aljoscha opened a pull request:

https://github.com/apache/beam/pull/3909

Adapt Flink StateInternals to new state semantics

This was changed in #3876.

R: @tgroh (not sure I caught all the places where this needs to change)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aljoscha/beam fix-flink-post-commit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/3909.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3909


commit 93d1d9e89c82c2b79faebb1408cb54035197a4d5
Author: Aljoscha Krettek 
Date:   2017-09-27T08:00:36Z

Adapt Flink StateInternals to new state semantics




---


[jira] [Updated] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-2993:
---
Description: Similarly to https://issues.apache.org/jira/browse/BEAM-2677, 
we should be able to write to avro files using {{AvroIO}} without specifying a 
schema at build time. Consider the following use case: a user has a 
{{PCollection<GenericRecord>}} but the schema is only known while running the 
pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the schema is 
already available in {{GenericRecord}}. We should be able to call 
{{AvroIO.writeGenericRecords()}} with no schema.  (was: Similarly to 
https://issues.apache.org/jira/browse/BEAM-2677, we should be able to write to 
avro files using {{AvroIO}} without specifying a schema at build time. Consider 
the following use case: a user has a {{PCollection<GenericRecord>}} but the 
schema is only known while running the pipeline.  
{{AvroIO.writeGenericRecords}} needs the schema, but the schema is already 
available in {{GenericRecord}}. We should be able to call 
{{AvroIO.writeGenericRecords()}} with no schema.)

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}} but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.





[jira] [Commented] (BEAM-2975) Results of ReadableState.read() should be snapshots of the underlying state

2017-09-27 Thread Aljoscha Krettek (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182151#comment-16182151
 ] 

Aljoscha Krettek commented on BEAM-2975:


These changes also broke the PostCommit tests for the Flink Runner: 
https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/org.apache.beam$beam-runners-flink_2.10/3936/#showFailuresLink

It seems there is still a need for some discussion, so what should we do? 
Revert the changes for now, or fix the Flink Runner to work with the new 
semantics to get the signal back?

> Results of ReadableState.read() should be snapshots of the underlying state
> ---
>
> Key: BEAM-2975
> URL: https://issues.apache.org/jira/browse/BEAM-2975
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Daniel Mills
>Assignee: Daniel Mills
>Priority: Minor
>
> Future modification of state should not be reflected in previous calls to 
> read().  For example:
> @StateId("tag") BagState<Integer> state;
> Iterable<Integer> ints = state.read();
> state.add(17);
> // ints should still be empty here.
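The quoted semantics can be sketched in plain Java. This is an illustration of the snapshot behavior only, not the Beam state API; `SnapshotBag` is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a BagState-like holder whose read() returns a defensive copy,
// so a later add() cannot leak into an earlier read() result.
class SnapshotBagDemo {

    static class SnapshotBag<T> {
        private final List<T> values = new ArrayList<>();

        // Snapshot: copy the current contents at call time.
        List<T> read() {
            return new ArrayList<>(values);
        }

        void add(T value) {
            values.add(value);
        }
    }

    public static void main(String[] args) {
        SnapshotBag<Integer> state = new SnapshotBag<>();
        List<Integer> ints = state.read();
        state.add(17);
        // ints is still empty: the snapshot was taken before add(17).
        System.out.println(ints.isEmpty());   // prints "true"
        System.out.println(state.read());     // prints "[17]"
    }
}
```

A runner backed by live views (e.g. returning the underlying collection directly) would violate this and is exactly what the Flink StateInternals fix above addresses.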





[jira] [Commented] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16182143#comment-16182143
 ] 

Etienne Chauchot commented on BEAM-2993:


[~jkff] you coded https://issues.apache.org/jira/browse/BEAM-2677, do you have 
any comments?

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}} but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.





[jira] [Assigned] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot reassigned BEAM-2993:
--

Assignee: (was: Reuven Lax)

> AvroIO.write without specifying a schema
> 
>
> Key: BEAM-2993
> URL: https://issues.apache.org/jira/browse/BEAM-2993
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Etienne Chauchot
>
> Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be 
> able to write to avro files using {{AvroIO}} without specifying a schema at 
> build time. Consider the following use case: a user has a 
> {{PCollection<GenericRecord>}} but the schema is only known while running 
> the pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the 
> schema is already available in {{GenericRecord}}. We should be able to call 
> {{AvroIO.writeGenericRecords()}} with no schema.





[jira] [Created] (BEAM-2993) AvroIO.write without specifying a schema

2017-09-27 Thread Etienne Chauchot (JIRA)
Etienne Chauchot created BEAM-2993:
--

 Summary: AvroIO.write without specifying a schema
 Key: BEAM-2993
 URL: https://issues.apache.org/jira/browse/BEAM-2993
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions
Reporter: Etienne Chauchot
Assignee: Reuven Lax


Similarly to https://issues.apache.org/jira/browse/BEAM-2677, we should be able 
to write to avro files using {{AvroIO}} without specifying a schema at build 
time. Consider the following use case: a user has a 
{{PCollection<GenericRecord>}} but the schema is only known while running the 
pipeline.  {{AvroIO.writeGenericRecords}} needs the schema, but the schema is 
already available in {{GenericRecord}}. We should be able to call 
{{AvroIO.writeGenericRecords()}} with no schema.
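The deferred-schema idea can be sketched in plain Java. This is not the Beam/Avro API; `DeferredSchemaWriter` is a hypothetical name, and `String::length` stands in for something like `GenericRecord::getSchema`:

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Sketch: defer the schema to run time and take it from the first record
// seen, instead of demanding it when the transform is built.
class DeferredSchemaDemo {

    static class DeferredSchemaWriter<R, S> {
        private final Function<R, S> schemaOf; // e.g. GenericRecord::getSchema
        private S schema;                      // resolved lazily at run time

        DeferredSchemaWriter(Function<R, S> schemaOf) {
            this.schemaOf = schemaOf;
        }

        S write(List<R> records) {
            for (R record : records) {
                if (schema == null) {
                    // The schema comes from the record itself.
                    schema = schemaOf.apply(record);
                }
                // ... encode record using schema ...
            }
            return schema;
        }
    }

    public static void main(String[] args) {
        // Toy stand-in: use string length as the "schema" carried by records.
        DeferredSchemaWriter<String, Integer> writer =
            new DeferredSchemaWriter<>(String::length);
        System.out.println(writer.write(Arrays.asList("abc", "zz"))); // prints "3"
    }
}
```

The open design question is the one raised in the comments: whether the resolved schema should then be shared with other workers, e.g. via a side input in {{ConstantAvroDestination}}.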





Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #3936

2017-09-27 Thread Apache Jenkins Server
See