[jira] [Created] (BEAM-2014) Upgrade to Google Auth 0.6.1

2017-04-19 Thread JIRA
Jean-Baptiste Onofré created BEAM-2014:
--

 Summary: Upgrade to Google Auth 0.6.1
 Key: BEAM-2014
 URL: https://issues.apache.org/jira/browse/BEAM-2014
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Jean-Baptiste Onofré
Assignee: Jean-Baptiste Onofré






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


Build failed in Jenkins: beam_PostCommit_Java_ValidatesRunner_Flink #2400

2017-04-19 Thread Apache Jenkins Server
See 


--
[...truncated 800.03 KB...]
2017-04-19T15:42:52.796 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptor-kerberos/2.0.0-M15/apacheds-interceptor-kerberos-2.0.0-M15.jar
2017-04-19T15:42:52.827 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptor-kerberos/2.0.0-M15/apacheds-interceptor-kerberos-2.0.0-M15.jar
 (16 KB at 7.3 KB/sec)
2017-04-19T15:42:52.827 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-core/2.0.0-M15/apacheds-core-2.0.0-M15.jar
2017-04-19T15:42:52.861 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-core/2.0.0-M15/apacheds-core-2.0.0-M15.jar
 (45 KB at 21.3 KB/sec)
2017-04-19T15:42:52.861 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-admin/2.0.0-M15/apacheds-interceptors-admin-2.0.0-M15.jar
2017-04-19T15:42:52.878 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/api/api-ldap-model/1.0.0-M20/api-ldap-model-1.0.0-M20.jar
 (868 KB at 410.3 KB/sec)
2017-04-19T15:42:52.879 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-authn/2.0.0-M15/apacheds-interceptors-authn-2.0.0-M15.jar
2017-04-19T15:42:52.881 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/mina/mina-core/2.0.7/mina-core-2.0.7.jar
 (630 KB at 297.3 KB/sec)
2017-04-19T15:42:52.881 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-authz/2.0.0-M15/apacheds-interceptors-authz-2.0.0-M15.jar
2017-04-19T15:42:52.891 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-admin/2.0.0-M15/apacheds-interceptors-admin-2.0.0-M15.jar
 (20 KB at 9.2 KB/sec)
2017-04-19T15:42:52.891 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-changelog/2.0.0-M15/apacheds-interceptors-changelog-2.0.0-M15.jar
2017-04-19T15:42:52.909 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-authn/2.0.0-M15/apacheds-interceptors-authn-2.0.0-M15.jar
 (41 KB at 18.9 KB/sec)
2017-04-19T15:42:52.909 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-collective/2.0.0-M15/apacheds-interceptors-collective-2.0.0-M15.jar
2017-04-19T15:42:52.915 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-authz/2.0.0-M15/apacheds-interceptors-authz-2.0.0-M15.jar
 (63 KB at 29.2 KB/sec)
2017-04-19T15:42:52.915 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-event/2.0.0-M15/apacheds-interceptors-event-2.0.0-M15.jar
2017-04-19T15:42:52.915 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/net/sf/ehcache/ehcache-core/2.4.4/ehcache-core-2.4.4.jar
 (986 KB at 458.1 KB/sec)
2017-04-19T15:42:52.915 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-exception/2.0.0-M15/apacheds-interceptors-exception-2.0.0-M15.jar
2017-04-19T15:42:52.918 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-changelog/2.0.0-M15/apacheds-interceptors-changelog-2.0.0-M15.jar
 (21 KB at 9.7 KB/sec)
2017-04-19T15:42:52.919 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-journal/2.0.0-M15/apacheds-interceptors-journal-2.0.0-M15.jar
2017-04-19T15:42:52.937 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-collective/2.0.0-M15/apacheds-interceptors-collective-2.0.0-M15.jar
 (14 KB at 6.2 KB/sec)
2017-04-19T15:42:52.937 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-normalization/2.0.0-M15/apacheds-interceptors-normalization-2.0.0-M15.jar
2017-04-19T15:42:52.942 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-event/2.0.0-M15/apacheds-interceptors-event-2.0.0-M15.jar
 (19 KB at 8.4 KB/sec)
2017-04-19T15:42:52.942 [INFO] Downloading: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-operational/2.0.0-M15/apacheds-interceptors-operational-2.0.0-M15.jar
2017-04-19T15:42:52.942 [INFO] Downloaded: 
https://repo.maven.apache.org/maven2/org/apache/directory/server/apacheds-interceptors-exception/2.0.0-M15/apacheds-interceptors-exception-2.0.0-M15.jar
 (11 KB at 5.0 KB/sec)
2017-04-19T15:42:52.942 [INFO] Downloading: 

Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #2880

2017-04-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974783#comment-15974783
 ] 

Jean-Baptiste Onofré commented on BEAM-2005:


I'm thinking about something like:

{code}
TextIO.from("hdfs://credentialTag@namenode/path/to/file")
{code}

[~sisk] [~dhalp...@google.com] Thoughts?
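The credential-tag syntax sketched above maps onto the standard URI user-info component, so it could plausibly be extracted with plain {{java.net.URI}} parsing. A minimal sketch — the {{credentialTag}} semantics and the class/method names here are hypothetical, not an existing Beam API:

```java
import java.net.URI;

public class CredentialTagSketch {
    // Parse the hypothetical "credentialTag" out of an hdfs:// URI;
    // java.net.URI treats the part before '@' in the authority as user info.
    public static String credentialTag(String spec) {
        URI uri = URI.create(spec);
        return uri.getUserInfo(); // null if no tag was given
    }

    public static void main(String[] args) {
        URI uri = URI.create("hdfs://credentialTag@namenode/path/to/file");
        System.out.println(uri.getUserInfo()); // credentialTag
        System.out.println(uri.getHost());     // namenode
        System.out.println(uri.getPath());     // /path/to/file
    }
}
```

The tag could then key into a registry of credentials (e.g. Kerberos principals) without putting secrets in the path itself.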

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Affects Versions: First stable release
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.).
> I'm investigating this now.





[jira] [Updated] (BEAM-1948) Null pointer exception in DirectRunner.DirectPipelineResult.getAggregatorValues()

2017-04-19 Thread Etienne Chauchot (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Etienne Chauchot updated BEAM-1948:
---
Description: 
The null pointer exception is due to an {{Aggregator}} not being present in 
{{aggregatorSteps}} (maybe because it is not present in the DAG).
We can reproduce it with a simple pipeline containing an 
{{Aggregator}} and a {{State}}, like this one:

{code}
IdentityDoFn identityDoFn = new IdentityDoFn();
p.apply(Create.of(KV.of("key", "element1"), KV.of("key", "element2"),
    KV.of("key", "element3")))
    .apply(ParDo.of(identityDoFn));
PipelineResult pipelineResult = p.run();
pipelineResult.getAggregatorValues(identityDoFn.getCounter()).getValues();


  private static class IdentityDoFn extends DoFn<KV<String, String>, String> {
    private final Aggregator<Long, Long> counter =
        createAggregator("counter", Sum.ofLongs());
    private static final String STATE_ID = "state";
    @StateId(STATE_ID)
    private static final StateSpec<Object, ValueState<String>> stateSpec =
        StateSpecs.value(StringUtf8Coder.of());

    @ProcessElement
    public void processElement(ProcessContext context,
        @StateId(STATE_ID) ValueState<String> state) {
      state.write("state content");
      counter.addValue(1L);
      context.output(context.element().getValue());
    }

    public Aggregator<Long, Long> getCounter() {
      return counter;
    }
  }
{code}

  was:
Running query3 of nexmark 
(https://github.com/iemejia/beam/blob/BEAM-160-nexmark/integration/java/nexmark/src/main/java/org/apache/beam/integration/nexmark/queries/Query3Model.java)
 in streaming mode (UnboundedSource) on the Direct runner generates a null 
pointer exception in {code} DirectRunner.DirectPipelineResult.getAggregatorValues() 
{code}

In
{code} if (steps.contains(transform.getTransform())) {code}
{code} steps == null {code}

To reproduce it:
run {code} NexmarkDirectDriver.main() {code}
with options
{code}
--query=3 --streaming=true --numEventGenerators=4 --manageResources=false 
--monitorJobs=true --enforceEncodability=false --enforceImmutability=false
{code}
see the repo in the link above
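The failure mode described here — a lookup against a {{steps}} collection that is null because the aggregator was never registered — can be shown in isolation with plain collections. A hypothetical sketch of the unguarded vs. guarded pattern (the names are illustrative, not the actual DirectRunner code):

```java
import java.util.Collection;
import java.util.Map;

public class NullStepsSketch {
    // Unguarded: mirrors the failing line; throws NullPointerException
    // when the aggregator was never registered (steps == null).
    static boolean containsUnguarded(Map<String, Collection<String>> aggregatorSteps,
                                     String aggregator, String transform) {
        Collection<String> steps = aggregatorSteps.get(aggregator);
        return steps.contains(transform); // NPE if steps == null
    }

    // Guarded: treat a missing aggregator as "appears in no steps".
    static boolean containsGuarded(Map<String, Collection<String>> aggregatorSteps,
                                   String aggregator, String transform) {
        Collection<String> steps = aggregatorSteps.get(aggregator);
        return steps != null && steps.contains(transform);
    }
}
```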


> Null pointer exception in 
> DirectRunner.DirectPipelineResult.getAggregatorValues()
> -
>
> Key: BEAM-1948
> URL: https://issues.apache.org/jira/browse/BEAM-1948
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> The null pointer exception is due to an {{Aggregator}} not being present in 
> {{aggregatorSteps}} (maybe because it is not present in the DAG).
> We can reproduce it with a simple pipeline containing an 
> {{Aggregator}} and a {{State}}, like this one:
> {code}
> IdentityDoFn identityDoFn = new IdentityDoFn();
> p.apply(Create.of(KV.of("key", "element1"), KV.of("key", "element2"), 
> KV.of("key", "element3")))
> .apply(ParDo.of(identityDoFn));
> PipelineResult pipelineResult = p.run();
> pipelineResult.getAggregatorValues(identityDoFn.getCounter()).getValues();
>   private static class IdentityDoFn extends DoFn<KV<String, String>, String> {
> private final Aggregator<Long, Long> counter = 
> createAggregator("counter", Sum.ofLongs());
> private static final String STATE_ID = "state";
> @StateId(STATE_ID)
> private static final StateSpec<Object, ValueState<String>> stateSpec =
> StateSpecs.value(StringUtf8Coder.of());
> @ProcessElement
> public void processElement(ProcessContext context, @StateId(STATE_ID) 
> ValueState<String> state){
>   state.write("state content");
>   counter.addValue(1L);
>   context.output(context.element().getValue());
> }
> public Aggregator<Long, Long> getCounter() {
>   return counter;
> }
>   }
> {code}





[jira] [Created] (BEAM-2013) Upgrade to Jackson 2.8.8

2017-04-19 Thread JIRA
Jean-Baptiste Onofré created BEAM-2013:
--

 Summary: Upgrade to Jackson 2.8.8
 Key: BEAM-2013
 URL: https://issues.apache.org/jira/browse/BEAM-2013
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Jean-Baptiste Onofré
Assignee: Jean-Baptiste Onofré








[jira] [Assigned] (BEAM-1783) Add Integration Tests for HBaseIO

2017-04-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-1783:
--

Assignee: (was: Davor Bonaci)

> Add Integration Tests for HBaseIO
> -
>
> Key: BEAM-1783
> URL: https://issues.apache.org/jira/browse/BEAM-1783
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-extensions, testing
>Reporter: Ismaël Mejía
>Priority: Minor
>






[jira] [Comment Edited] (BEAM-1948) Null pointer exception in DirectRunner.DirectPipelineResult.getAggregatorValues()

2017-04-19 Thread Etienne Chauchot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965893#comment-15965893
 ] 

Etienne Chauchot edited comment on BEAM-1948 at 4/19/17 2:51 PM:
-

Set priority to Minor because it uses aggregators (to implement metrics), which 
will disappear.


was (Author: echauchot):
Set priority to Minor because it uses aggregators (to implement metrics), which 
will disappear. [~tgroh], if you need any info on query3 of nexmark, please 
contact me.

> Null pointer exception in 
> DirectRunner.DirectPipelineResult.getAggregatorValues()
> -
>
> Key: BEAM-1948
> URL: https://issues.apache.org/jira/browse/BEAM-1948
> Project: Beam
>  Issue Type: Bug
>  Components: runner-direct
>Reporter: Etienne Chauchot
>Assignee: Etienne Chauchot
>Priority: Minor
>
> Running query3 of nexmark 
> (https://github.com/iemejia/beam/blob/BEAM-160-nexmark/integration/java/nexmark/src/main/java/org/apache/beam/integration/nexmark/queries/Query3Model.java)
>  in streaming mode (UnboundedSource) on the Direct runner generates a null 
> pointer exception in {code} 
> DirectRunner.DirectPipelineResult.getAggregatorValues() {code}
> In
> {code} if (steps.contains(transform.getTransform())) {code}
> {code} steps == null {code}
> To reproduce it:
> run {code} NexmarkDirectDriver.main() {code}
> with options
> {code}
> --query=3 --streaming=true --numEventGenerators=4 --manageResources=false 
> --monitorJobs=true --enforceEncodability=false --enforceImmutability=false
> {code}
> see the repo in the link above





[GitHub] beam pull request #2590: [BEAM-2014] Upgrade to Google Auth 0.6.1

2017-04-19 Thread jbonofre
GitHub user jbonofre opened a pull request:

https://github.com/apache/beam/pull/2590

[BEAM-2014] Upgrade to Google Auth 0.6.1

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/beam BEAM-2014-GOOGLE-AUTH

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2590.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2590






---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2014) Upgrade to Google Auth 0.6.1

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974832#comment-15974832
 ] 

ASF GitHub Bot commented on BEAM-2014:
--

GitHub user jbonofre opened a pull request:

https://github.com/apache/beam/pull/2590

[BEAM-2014] Upgrade to Google Auth 0.6.1

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/beam BEAM-2014-GOOGLE-AUTH

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2590.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2590






> Upgrade to Google Auth 0.6.1
> 
>
> Key: BEAM-2014
> URL: https://issues.apache.org/jira/browse/BEAM-2014
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>






[jira] [Commented] (BEAM-2013) Upgrade to Jackson 2.8.8

2017-04-19 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974831#comment-15974831
 ] 

Daniel Halperin commented on BEAM-2013:
---

What motivates this change?

> Upgrade to Jackson 2.8.8
> 
>
> Key: BEAM-2013
> URL: https://issues.apache.org/jira/browse/BEAM-2013
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>






[jira] [Assigned] (BEAM-1532) Improve splitting (support sub-splits) for HBaseIO

2017-04-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-1532:
--

Assignee: (was: Ismaël Mejía)

> Improve splitting (support sub-splits) for HBaseIO
> --
>
> Key: BEAM-1532
> URL: https://issues.apache.org/jira/browse/BEAM-1532
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Ismaël Mejía
>Priority: Minor
>






[jira] [Assigned] (BEAM-1783) Add Integration Tests for HBaseIO

2017-04-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-1783:
--

Assignee: Davor Bonaci  (was: Ismaël Mejía)

> Add Integration Tests for HBaseIO
> -
>
> Key: BEAM-1783
> URL: https://issues.apache.org/jira/browse/BEAM-1783
> Project: Beam
>  Issue Type: Test
>  Components: sdk-java-extensions, testing
>Reporter: Ismaël Mejía
>Assignee: Davor Bonaci
>Priority: Minor
>






[jira] [Assigned] (BEAM-1533) Switch split implementation to use Beam’s classes for HBaseIO

2017-04-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-1533:
--

Assignee: (was: Ismaël Mejía)

> Switch split implementation to use Beam’s classes for HBaseIO
> -
>
> Key: BEAM-1533
> URL: https://issues.apache.org/jira/browse/BEAM-1533
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Ismaël Mejía
>Priority: Minor
>
> The current implementation uses classic Java byte-array manipulation, which is 
> error-prone; it could benefit from using the existing ByteKey, ByteKeyRange, and 
> ByteKeyRangeTracker classes from the SDK.
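To illustrate why hand-rolled byte-array key arithmetic is error-prone, here is roughly what splitting a key range at its midpoint involves when done manually. This is a simplified sketch assuming fixed, equal-length keys; Beam's ByteKeyRange handles variable lengths and the edge cases this glosses over:

```java
public class ByteKeySplitSketch {
    // Naive midpoint of two equal-length keys: average each position while
    // propagating the remainder downward, treating bytes as unsigned.
    // Getting the unsigned handling and carries right is exactly the kind
    // of detail an SDK abstraction should encapsulate.
    static byte[] midpoint(byte[] start, byte[] end) {
        byte[] mid = new byte[start.length];
        int carry = 0; // remainder from the more significant position
        for (int i = 0; i < start.length; i++) {
            int sum = (start[i] & 0xFF) + (end[i] & 0xFF) + carry * 256;
            mid[i] = (byte) (sum / 2);
            carry = sum % 2;
        }
        return mid;
    }
}
```

Forgetting the `& 0xFF` unsigned masks, or dropping the carry, silently produces keys outside the range — the class of bug the ticket is suggesting to avoid.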





[jira] [Resolved] (BEAM-1997) Scaling Problem of Beam (size of the serialized JSON representation of the pipeline exceeds the allowable limit)

2017-04-19 Thread Tobias Feldhaus (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tobias Feldhaus resolved BEAM-1997.
---
   Resolution: Invalid
Fix Version/s: 0.6.0

> Scaling Problem of Beam (size of the serialized JSON representation of the 
> pipeline exceeds the allowable limit)
> 
>
> Key: BEAM-1997
> URL: https://issues.apache.org/jira/browse/BEAM-1997
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 0.6.0
>Reporter: Tobias Feldhaus
>Assignee: Daniel Halperin
> Fix For: 0.6.0
>
>
> After switching from Dataflow SDK 1.9 to Apache Beam SDK 0.6, my pipeline no 
> longer runs with 180 output days (BigQuery partitions as sinks), but only 
> 60 output days. If using a larger number with Beam, the response from the 
> Cloud Dataflow service reads as follows:
> {code}
> Failed to create a workflow job: The size of the serialized JSON 
> representation of the pipeline exceeds the allowable limit. For more 
> information, please check the FAQ link below:
> {code}
> This is the pipeline in dataflow: 
> https://gist.github.com/james-woods/f84b6784ee6d1b87b617f80f8c7dd59f
> The resulting graph in Dataflow looks like this: 
> https://puu.sh/vhWAW/a12f3246a1.png
> This is the same pipeline in beam: 
> https://gist.github.com/james-woods/c4565db769b0494e0bef5e9c334c
> The constructed graph looks somewhat different:
> https://puu.sh/vhWvm/78a40d422d.png
> Methods used are taken from this example 
> https://gist.github.com/dhalperi/4bbd13021dd5f9998250cff99b155db6





[jira] [Updated] (BEAM-501) Update website skin

2017-04-19 Thread Adelaide Taylor (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adelaide Taylor updated BEAM-501:
-
Description: 
Update the main landing page and website skin as discussed here

https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit
 

--

*[Adelaide added - 4/18]*
- Attached is the most up-to-date mock up for the Beam website redesign

  was:
Update the main landing page and website skin as discussed here

https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit
 

--

*[Adelaide added - 4/18]*
- Attached is the most up-to-date mock up for the Beam website redesign
- Here is the Beam dev trix to log any feedback, notes, timeline, and staging 
links: 
https://docs.google.com/spreadsheets/d/1mo4qIm3JmLd1eal7-mpLTsVDai53Xt-dDsIzhgyFDDA/edit#gid=388817810
- Apache Beam (current website) Teardown - 
https://docs.google.com/a/google.com/document/d/1tQWTPeaGGSYWJlOtTGMoYgWr_xaSIeBNaNk5gHo_0oU/edit?usp=drive_web


> Update website skin
> ---
>
> Key: BEAM-501
> URL: https://issues.apache.org/jira/browse/BEAM-501
> Project: Beam
>  Issue Type: Improvement
>  Components: website
>Reporter: Frances Perry
>Assignee: Jeremy Weinstein
> Attachments: Design - Desktop Landscape.png
>
>
> Update the main landing page and website skin as discussed here
> https://docs.google.com/document/d/1-0jMv7NnYp0Ttt4voulUMwVe_qjBYeNMLm2LusYF3gQ/edit
>  
> --
> *[Adelaide added - 4/18]*
> - Attached is the most up-to-date mock up for the Beam website redesign





[jira] [Commented] (BEAM-2013) Upgrade to Jackson 2.8.8

2017-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974842#comment-15974842
 ] 

Jean-Baptiste Onofré commented on BEAM-2013:


The main reason is the fixes in jackson-databind and jackson-dataformat-yaml in 
2.8.8, especially:

* https://github.com/FasterXML/jackson-databind/issues/1345
* https://github.com/FasterXML/jackson-databind/issues/1533
* https://github.com/FasterXML/jackson-databind/issues/1570
* https://github.com/FasterXML/jackson-dataformat-yaml/issues/72

(that I faced ;))

> Upgrade to Jackson 2.8.8
> 
>
> Key: BEAM-2013
> URL: https://issues.apache.org/jira/browse/BEAM-2013
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>






[GitHub] beam pull request #2589: [BEAM-2013] Upgrade to Jackson 2.8.8

2017-04-19 Thread jbonofre
GitHub user jbonofre opened a pull request:

https://github.com/apache/beam/pull/2589

[BEAM-2013] Upgrade to Jackson 2.8.8

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/beam BEAM-2013-JACKSON

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2589.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2589








[jira] [Assigned] (BEAM-1531) Support dynamic work rebalancing for HBaseIO

2017-04-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/BEAM-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ismaël Mejía reassigned BEAM-1531:
--

Assignee: (was: Ismaël Mejía)

> Support dynamic work rebalancing for HBaseIO
> 
>
> Key: BEAM-1531
> URL: https://issues.apache.org/jira/browse/BEAM-1531
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Ismaël Mejía
>Priority: Minor
>






[jira] [Commented] (BEAM-2013) Upgrade to Jackson 2.8.8

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974775#comment-15974775
 ] 

ASF GitHub Bot commented on BEAM-2013:
--

GitHub user jbonofre opened a pull request:

https://github.com/apache/beam/pull/2589

[BEAM-2013] Upgrade to Jackson 2.8.8

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [X] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [X] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [X] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [X] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jbonofre/beam BEAM-2013-JACKSON

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2589.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2589






> Upgrade to Jackson 2.8.8
> 
>
> Key: BEAM-2013
> URL: https://issues.apache.org/jira/browse/BEAM-2013
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>






[jira] [Commented] (BEAM-1997) Scaling Problem of Beam (size of the serialized JSON representation of the pipeline exceeds the allowable limit)

2017-04-19 Thread Tobias Feldhaus (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974777#comment-15974777
 ] 

Tobias Feldhaus commented on BEAM-1997:
---

Mea culpa: it seems I had more than one file per day, leading to a 3-4 
times larger pipeline, which explains the problem.

> Scaling Problem of Beam (size of the serialized JSON representation of the 
> pipeline exceeds the allowable limit)
> 
>
> Key: BEAM-1997
> URL: https://issues.apache.org/jira/browse/BEAM-1997
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: 0.6.0
>Reporter: Tobias Feldhaus
>Assignee: Daniel Halperin
>
> After switching from Dataflow SDK 1.9 to Apache Beam SDK 0.6, my pipeline no 
> longer runs with 180 output days (BigQuery partitions as sinks), but only 
> 60 output days. If using a larger number with Beam, the response from the 
> Cloud Dataflow service reads as follows:
> {code}
> Failed to create a workflow job: The size of the serialized JSON 
> representation of the pipeline exceeds the allowable limit. For more 
> information, please check the FAQ link below:
> {code}
> This is the pipeline in dataflow: 
> https://gist.github.com/james-woods/f84b6784ee6d1b87b617f80f8c7dd59f
> The resulting graph in Dataflow looks like this: 
> https://puu.sh/vhWAW/a12f3246a1.png
> This is the same pipeline in beam: 
> https://gist.github.com/james-woods/c4565db769b0494e0bef5e9c334c
> The constructed graph looks somewhat different:
> https://puu.sh/vhWvm/78a40d422d.png
> Methods used are taken from this example 
> https://gist.github.com/dhalperi/4bbd13021dd5f9998250cff99b155db6





[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974779#comment-15974779
 ] 

Jean-Baptiste Onofré commented on BEAM-2005:


By the way, an important part of HDFS support is Kerberos. I think it 
would require a couple of additional methods (as we have for Google Storage) 
for Kerberos authentication.

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Affects Versions: First stable release
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.).
> I'm investigating this now.





[jira] [Commented] (BEAM-2015) Dataflow PostCommit has been failing since build #2814

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974983#comment-15974983
 ] 

ASF GitHub Bot commented on BEAM-2015:
--

GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2592

[BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow 
ValidatesRunner PostCommit

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
Diff'd effective poms to compare changes and find all inherited properties 
in sub-runner modules.

A failing seed job kept me from making some other minor clean-up changes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam dataflow_post_commit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2592.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2592


commit ece72d42bdda512eadc78cfcf8e9dfe7b117a41c
Author: Luke Cwik 
Date:   2017-04-19T16:10:39Z

[BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow 
ValidatesRunner PostCommit




> Dataflow PostCommit has been failing since build #2814
> --
>
> Key: BEAM-2015
> URL: https://issues.apache.org/jira/browse/BEAM-2015
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2814/
> The issue is the runners/core-java is inheriting the maven profile which is 
> being activated by the specification of the validatesRunnerPipelineOptions 
> system property.
> This issue had not been a problem in the past because the dependency order 
> had made the runners/google-cloud-dataflow-java part of the classpath.
> There was an assumption that runners/ would only have runner modules and that 
> all submodules would want to have the shared definition. This is no longer 
> the case.
> {code}
> Tests run: 7, Failures: 0, Errors: 7, Skipped: 0, Time elapsed: 0.046 sec <<< 
> FAILURE! - in org.apache.beam.runners.core.SplittableParDoTest
> testCheckpointsAfterNumOutputs(org.apache.beam.runners.core.SplittableParDoTest)
>   Time elapsed: 0.003 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Unknown 'runner' specified 
> 'org.apache.beam.runners.dataflow.testing.TestDataflowRunner', supported 
> pipeline runners [RegisteredTestRunner]
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1609)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:289)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.testingPipelineOptions(TestPipeline.java:392)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.create(TestPipeline.java:257)
>   at 
> org.apache.beam.runners.core.SplittableParDoTest.(SplittableParDoTest.java:149)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at 

[jira] [Updated] (BEAM-1441) Add FileSystem support to Python SDK

2017-04-19 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-1441:
--
Summary: Add FileSystem support to Python SDK  (was: Add an 
IOChannelFactory interface to Python SDK)

> Add FileSystem support to Python SDK
> 
>
> Key: BEAM-1441
> URL: https://issues.apache.org/jira/browse/BEAM-1441
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Sourabh Bajaj
> Fix For: First stable release
>
>
> Based on proposal [1], an IOChannelFactory interface was added to Java SDK  
> [2].
> We should add a similar interface to Python SDK and provide proper 
> implementations for native files, GCS, and other useful formats.
> Python SDK currently has a basic ChannelFactory interface [3] which is used 
> by FileBasedSource [4].
> [1] 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#heading=h.kpqagzh8i11w
> [2] 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/IOChannelFactory.java
> [3] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/fileio.py#L107
> [4] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filebasedsource.py
>  





Build failed in Jenkins: beam_PerformanceTests_Dataflow #328

2017-04-19 Thread Apache Jenkins Server
See 


Changes:

[iemejia] [BEAM-1994] Remove Flink examples package

[chamikara] [BEAM-1441] Remove deprecated ChannelFactory

[tgroh] Ensure all Read outputs are consumed in Dataflow

--
[...truncated 239.71 KB...]
 x [deleted] (none) -> origin/pr/93/head
 x [deleted] (none) -> origin/pr/930/head
 x [deleted] (none) -> origin/pr/930/merge
 x [deleted] (none) -> origin/pr/931/head
 x [deleted] (none) -> origin/pr/931/merge
 x [deleted] (none) -> origin/pr/932/head
 x [deleted] (none) -> origin/pr/932/merge
 x [deleted] (none) -> origin/pr/933/head
 x [deleted] (none) -> origin/pr/933/merge
 x [deleted] (none) -> origin/pr/934/head
 x [deleted] (none) -> origin/pr/934/merge
 x [deleted] (none) -> origin/pr/935/head
 x [deleted] (none) -> origin/pr/936/head
 x [deleted] (none) -> origin/pr/936/merge
 x [deleted] (none) -> origin/pr/937/head
 x [deleted] (none) -> origin/pr/937/merge
 x [deleted] (none) -> origin/pr/938/head
 x [deleted] (none) -> origin/pr/939/head
 x [deleted] (none) -> origin/pr/94/head
 x [deleted] (none) -> origin/pr/940/head
 x [deleted] (none) -> origin/pr/940/merge
 x [deleted] (none) -> origin/pr/941/head
 x [deleted] (none) -> origin/pr/941/merge
 x [deleted] (none) -> origin/pr/942/head
 x [deleted] (none) -> origin/pr/942/merge
 x [deleted] (none) -> origin/pr/943/head
 x [deleted] (none) -> origin/pr/943/merge
 x [deleted] (none) -> origin/pr/944/head
 x [deleted] (none) -> origin/pr/945/head
 x [deleted] (none) -> origin/pr/945/merge
 x [deleted] (none) -> origin/pr/946/head
 x [deleted] (none) -> origin/pr/946/merge
 x [deleted] (none) -> origin/pr/947/head
 x [deleted] (none) -> origin/pr/947/merge
 x [deleted] (none) -> origin/pr/948/head
 x [deleted] (none) -> origin/pr/948/merge
 x [deleted] (none) -> origin/pr/949/head
 x [deleted] (none) -> origin/pr/949/merge
 x [deleted] (none) -> origin/pr/95/head
 x [deleted] (none) -> origin/pr/95/merge
 x [deleted] (none) -> origin/pr/950/head
 x [deleted] (none) -> origin/pr/951/head
 x [deleted] (none) -> origin/pr/951/merge
 x [deleted] (none) -> origin/pr/952/head
 x [deleted] (none) -> origin/pr/952/merge
 x [deleted] (none) -> origin/pr/953/head
 x [deleted] (none) -> origin/pr/954/head
 x [deleted] (none) -> origin/pr/954/merge
 x [deleted] (none) -> origin/pr/955/head
 x [deleted] (none) -> origin/pr/955/merge
 x [deleted] (none) -> origin/pr/956/head
 x [deleted] (none) -> origin/pr/957/head
 x [deleted] (none) -> origin/pr/958/head
 x [deleted] (none) -> origin/pr/959/head
 x [deleted] (none) -> origin/pr/959/merge
 x [deleted] (none) -> origin/pr/96/head
 x [deleted] (none) -> origin/pr/96/merge
 x [deleted] (none) -> origin/pr/960/head
 x [deleted] (none) -> origin/pr/960/merge
 x [deleted] (none) -> origin/pr/961/head
 x [deleted] (none) -> origin/pr/962/head
 x [deleted] (none) -> origin/pr/962/merge
 x [deleted] (none) -> origin/pr/963/head
 x [deleted] (none) -> origin/pr/963/merge
 x [deleted] (none) -> origin/pr/964/head
 x [deleted] (none) -> origin/pr/965/head
 x [deleted] (none) -> origin/pr/965/merge
 x [deleted] (none) -> origin/pr/966/head
 x [deleted] (none) -> origin/pr/967/head
 x [deleted] (none) -> origin/pr/967/merge
 x [deleted] (none) -> origin/pr/968/head
 x [deleted] (none) -> origin/pr/968/merge
 x [deleted] (none) -> origin/pr/969/head
 x [deleted] (none) -> origin/pr/969/merge
 x [deleted] (none) -> origin/pr/97/head
 x [deleted] (none) -> origin/pr/97/merge
 x [deleted] (none) -> origin/pr/970/head
 x [deleted] (none) -> origin/pr/970/merge
 x [deleted] (none) -> origin/pr/971/head
 x [deleted] (none) -> origin/pr/971/merge
 x [deleted] (none) -> origin/pr/972/head
 x [deleted] (none) -> origin/pr/973/head
 x [deleted] (none) -> origin/pr/974/head
 x [deleted] (none) -> origin/pr/974/merge
 x [deleted] (none) -> origin/pr/975/head
 x [deleted] (none)   

Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2403

2017-04-19 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Flink #2404

2017-04-19 Thread Apache Jenkins Server
See 




[GitHub] beam-site pull request #218: Adds Elasticsearch to I/O table

2017-04-19 Thread ericmand
GitHub user ericmand opened a pull request:

https://github.com/apache/beam-site/pull/218

Adds Elasticsearch to I/O table



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ericmand/beam-site patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam-site/pull/218.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #218


commit dce0b3911e94601115f59ded43405f1963adca58
Author: Eric Anderson 
Date:   2017-04-19T18:35:48Z

Adds Elasticsearch to I/O table




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Flink #2405

2017-04-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-1984) Enable dependency analysis of non-compile dependencies

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974960#comment-15974960
 ] 

ASF GitHub Bot commented on BEAM-1984:
--

GitHub user iemejia opened a pull request:

https://github.com/apache/beam/pull/2591

[BEAM-1984] Enable dependency analysis of non-compile dependencies

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iemejia/beam 
BEAM-1984-enable-dep-analysis-for-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2591.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2591


commit 852d8aea354714a6d4b2ca6121fba454a9939de3
Author: Ismaël Mejía 
Date:   2017-04-13T16:52:58Z

[BEAM-1984] Update maven surefire and failsafe plugins to version 2.20

commit 14af68d717e87162bbda81c7f64d456f82182674
Author: Ismaël Mejía 
Date:   2017-04-16T12:41:44Z

[BEAM-1984] Fix scope for dependencies needed only for test purposes

commit 28bbca63b474cd622a99a67bc30b2e6370368138
Author: Ismaël Mejía 
Date:   2017-04-17T21:34:12Z

[BEAM-1984] Enable runtime dependency analysis




> Enable dependency analysis of non-compile dependencies
> --
>
> Key: BEAM-1984
> URL: https://issues.apache.org/jira/browse/BEAM-1984
> Project: Beam
>  Issue Type: Improvement
>  Components: build-system
>Affects Versions: Not applicable
>Reporter: Ismaël Mejía
>Assignee: Ismaël Mejía
>Priority: Minor
>
> Currently the dependency analysis ignores test dependencies; however, if 
> we run:
> mvn install -Dmaven.test.skip=true
> it complains in multiple modules about dependencies that should be properly 
> scoped as test but currently aren't.
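
One common way to enforce this kind of analysis is the maven-dependency-plugin's
analyze-only goal. The snippet below is an illustrative sketch, not necessarily
the exact configuration used in Beam's pom.xml; the plugin and both parameters
shown are real, but the execution id and placement are assumptions:

<!-- Illustrative sketch only. analyze-only reports dependencies that are
     declared but unused, or used but undeclared; leaving ignoreNonCompile
     at false extends the analysis beyond compile scope. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-dependency-plugin</artifactId>
  <executions>
    <execution>
      <id>analyze</id>
      <goals>
        <goal>analyze-only</goal>
      </goals>
      <configuration>
        <failOnWarning>true</failOnWarning>
        <ignoreNonCompile>false</ignoreNonCompile>
      </configuration>
    </execution>
  </executions>
</plugin>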





[GitHub] beam pull request #2592: [BEAM-2015] Remove shared profile in runners/pom.xm...

2017-04-19 Thread lukecwik
GitHub user lukecwik opened a pull request:

https://github.com/apache/beam/pull/2592

[BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow 
ValidatesRunner PostCommit

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---
I diffed the effective POMs to compare changes and find all the inherited 
properties in the sub runner modules.

The seed job failing prevented me from making some other minor clean-up 
changes.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lukecwik/incubator-beam dataflow_post_commit

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2592.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2592


commit ece72d42bdda512eadc78cfcf8e9dfe7b117a41c
Author: Luke Cwik 
Date:   2017-04-19T16:10:39Z

[BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow 
ValidatesRunner PostCommit






[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975001#comment-15975001
 ] 

Daniel Halperin commented on BEAM-2005:
---

`core` vs `extensions` -- this won't be in `sdk-java-core` itself, it will 
probably be in `sdk-java-extensions-hadoop` or whatever (just like 
`GcsFileSystem` either is moving or has moved to the new 
`sdks-java-extensions-gcp-core`).

I could see also `sdk-java-io-hadoop`, but I think this is a reasonable-ish use 
of `core`. Our JIRA tags are not perfect.

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, azure, s3, etc..)
> I'm investigating this now.





[jira] [Commented] (BEAM-2009) support JdbcIO as source/sink

2017-04-19 Thread Xu Mingmin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975048#comment-15975048
 ] 

Xu Mingmin commented on BEAM-2009:
--

[~jbonofre] Yes, this task is created for dsl_sql only.

> support JdbcIO as source/sink
> -
>
> Key: BEAM-2009
> URL: https://issues.apache.org/jira/browse/BEAM-2009
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Jean-Baptiste Onofré
>
> Support JdbcIO on both the source and sink sides:
> 1. as a source, JdbcIO reads data from databases that support JDBC, such as 
> Oracle/MySQL/Cassandra/...; this produces a bounded pipeline;
> 2. as a sink, JdbcIO can persist data from both unbounded and bounded 
> pipelines.





[jira] [Closed] (BEAM-1355) HDFS IO should comply with PTransform style guide

2017-04-19 Thread Eugene Kirpichov (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Kirpichov closed BEAM-1355.
--
   Resolution: Won't Fix
Fix Version/s: First stable release

I'm closing this as Won't Fix because, as I understand it, HDFSFileSource/Sink 
will go away in favor of https://issues.apache.org/jira/browse/BEAM-2005

> HDFS IO should comply with PTransform style guide
> -
>
> Key: BEAM-1355
> URL: https://issues.apache.org/jira/browse/BEAM-1355
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-extensions
>Reporter: Eugene Kirpichov
>Assignee: Stephen Sisk
>  Labels: backward-incompatible
> Fix For: First stable release
>
>
> https://github.com/apache/beam/tree/master/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs
>  does not comply with 
> https://beam.apache.org/contribute/ptransform-style-guide/ in a number of 
> ways:
> - It is not packaged as a PTransform (should be: HDFSIO.Read,Write or 
> something like that)
> - Should probably use AutoValue for specifying parameters
> Stephen knows about the current state of HDFS IO.





[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Aviem Zur (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975193#comment-15975193
 ] 

Aviem Zur commented on BEAM-2005:
-

I'll refine: it should be in a separate module, but I think core should depend 
on it (i.e., out-of-the-box functionality).

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, azure, s3, etc..)
> I'm investigating this now.





[jira] [Created] (BEAM-2015) Dataflow PostCommit has been failing since build #2814

2017-04-19 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-2015:
---

 Summary: Dataflow PostCommit has been failing since build #2814
 Key: BEAM-2015
 URL: https://issues.apache.org/jira/browse/BEAM-2015
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Affects Versions: First stable release
Reporter: Luke Cwik
Assignee: Luke Cwik


https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2814/

The issue is the runners/core-java is inheriting the maven profile which is 
being activated by the specification of the validatesRunnerPipelineOptions 
system property.

This issue had not been a problem in the past because the dependency order had 
made the runners/google-cloud-dataflow-java part of the classpath.

There was an assumption that runners/ would only have runner modules and that 
all submodules would want to have the shared definition. This is no longer the 
case.
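
The inheritance problem described above can be sketched as a property-activated
profile in the parent runners/pom.xml. The snippet below is a hypothetical
illustration, not the literal contents of that file; only the property name
comes from this report:

<!-- Hypothetical illustration. A profile activated by a system property;
     because a parent POM's active profiles apply to all child modules,
     runners/core-java picked this configuration up unintentionally. -->
<profiles>
  <profile>
    <id>validates-runner</id>
    <activation>
      <property>
        <name>validatesRunnerPipelineOptions</name>
      </property>
    </activation>
    <!-- runner-specific ValidatesRunner test configuration goes here -->
  </profile>
</profiles>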

Tests run: 7, Failures: 0, Errors: 7, Skipped: 0, Time elapsed: 0.046 sec <<< 
FAILURE! - in org.apache.beam.runners.core.SplittableParDoTest
testCheckpointsAfterNumOutputs(org.apache.beam.runners.core.SplittableParDoTest)
  Time elapsed: 0.003 sec  <<< ERROR!
java.lang.IllegalArgumentException: Unknown 'runner' specified 
'org.apache.beam.runners.dataflow.testing.TestDataflowRunner', supported 
pipeline runners [RegisteredTestRunner]
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1609)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:289)
at 
org.apache.beam.sdk.testing.TestPipeline.testingPipelineOptions(TestPipeline.java:392)
at 
org.apache.beam.sdk.testing.TestPipeline.create(TestPipeline.java:257)
at 
org.apache.beam.runners.core.SplittableParDoTest.(SplittableParDoTest.java:149)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
at 
org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:161)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 

[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15974991#comment-15974991
 ] 

Jean-Baptiste Onofré commented on BEAM-2005:


I don't mind whether it's core or an extension (it could make sense to have a 
filesystem extension). I fully agree that it's a must-have for the first stable 
release. That's why I started to focus on this last weekend.

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, azure, s3, etc..)
> I'm investigating this now.





Jenkins build is back to normal : beam_PostCommit_Java_ValidatesRunner_Flink #2401

2017-04-19 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #2882

2017-04-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2007) DataflowRunner drops Reads with no consumers

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975166#comment-15975166
 ] 

ASF GitHub Bot commented on BEAM-2007:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2587


> DataflowRunner drops Reads with no consumers
> 
>
> Key: BEAM-2007
> URL: https://issues.apache.org/jira/browse/BEAM-2007
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Thomas Groh
> Fix For: First stable release
>
>
> Basically, if a pipeline has "just" a Read with no consumers, the optimizer 
> in Dataflow will drop it. To preserve Beam semantics, we do want to run the 
> Read and drop its output, e.g., because the Read may have side effects that 
> we're testing for.
> Is it possible with pipeline surgery to find such Reads and add an Identity 
> ParDo to them?





[jira] [Assigned] (BEAM-2007) DataflowRunner drops Reads with no consumers

2017-04-19 Thread Thomas Groh (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Groh reassigned BEAM-2007:
-

Assignee: (was: Thomas Groh)

This is fixed in the Java Dataflow runner. The Python Dataflow runner will have 
to perform a similar change.

> DataflowRunner drops Reads with no consumers
> 
>
> Key: BEAM-2007
> URL: https://issues.apache.org/jira/browse/BEAM-2007
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
> Fix For: First stable release
>
>
> Basically, if a pipeline has "just" a Read with no consumers, the optimizer 
> in Dataflow will drop it. To preserve Beam semantics, we do want to run the 
> Read and drop its output, e.g., because the Read may have side effects that 
> we're testing for.
> Is it possible with pipeline surgery to find such Reads and add an Identity 
> ParDo to them?





[1/2] beam git commit: Cache result of BigQuerySourceBase.split

2017-04-19 Thread jkff
Repository: beam
Updated Branches:
  refs/heads/master 29e054a8d -> 391fb77c3


Cache result of BigQuerySourceBase.split


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/1533e2b9
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/1533e2b9
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/1533e2b9

Branch: refs/heads/master
Commit: 1533e2b9bc49971929277b804587d93d8d2cae4c
Parents: 29e054a
Author: Eugene Kirpichov 
Authored: Wed Apr 19 10:09:42 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Apr 19 11:39:21 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java | 31 +---
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 18 +---
 .../sdk/io/gcp/bigquery/FakeJobService.java |  9 ++
 3 files changed, 37 insertions(+), 21 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/1533e2b9/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
--
diff --git 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
index 1b90dc3..4142da9 100644
--- 
a/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
+++ 
b/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQuerySourceBase.java
@@ -69,6 +69,8 @@ abstract class BigQuerySourceBase extends 
BoundedSource {
   protected final BigQueryServices bqServices;
   protected final ValueProvider executingProject;
 
+  private List cachedSplitResult;
+
   BigQuerySourceBase(
   ValueProvider jobIdToken,
   String extractDestinationDir,
@@ -83,17 +85,24 @@ abstract class BigQuerySourceBase extends 
BoundedSource {
   @Override
   public List split(
   long desiredBundleSizeBytes, PipelineOptions options) throws Exception {
-BigQueryOptions bqOptions = options.as(BigQueryOptions.class);
-TableReference tableToExtract = getTableToExtract(bqOptions);
-JobService jobService = bqServices.getJobService(bqOptions);
-String extractJobId = BigQueryIO.getExtractJobId(jobIdToken);
-List tempFiles = executeExtract(extractJobId, tableToExtract, 
jobService);
-
-TableSchema tableSchema = bqServices.getDatasetService(bqOptions)
-.getTable(tableToExtract).getSchema();
-
-cleanupTempResource(bqOptions);
-return createSources(tempFiles, tableSchema);
+// split() can be called multiple times, e.g. Dataflow runner may call it 
multiple times
+// with different desiredBundleSizeBytes in case the split() call produces 
too many sources.
+// We ignore desiredBundleSizeBytes anyway, however in any case, we should 
not initiate
+// another BigQuery extract job for the repeated split() calls.
+if (cachedSplitResult == null) {
+  BigQueryOptions bqOptions = options.as(BigQueryOptions.class);
+  TableReference tableToExtract = getTableToExtract(bqOptions);
+  JobService jobService = bqServices.getJobService(bqOptions);
+  String extractJobId = BigQueryIO.getExtractJobId(jobIdToken);
+  List tempFiles = executeExtract(extractJobId, tableToExtract, 
jobService);
+
+  TableSchema tableSchema = bqServices.getDatasetService(bqOptions)
+  .getTable(tableToExtract).getSchema();
+
+  cleanupTempResource(bqOptions);
+  cachedSplitResult = checkNotNull(createSources(tempFiles, tableSchema));
+}
+return cachedSplitResult;
   }
 
   protected abstract TableReference getTableToExtract(BigQueryOptions 
bqOptions) throws Exception;

http://git-wip-us.apache.org/repos/asf/beam/blob/1533e2b9/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
--
diff --git a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
index d0004e4..62c5b5f 100644
--- a/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
+++ b/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIOTest.java
@@ -28,7 +28,6 @@ import static org.junit.Assert.assertEquals;
 import static org.junit.Assert.assertNull;
 import static org.junit.Assert.assertThat;
 import static org.junit.Assert.assertTrue;
-
 import com.google.api.client.util.Data;
 import 

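The memoization added in the BigQuerySourceBase diff above (an expensive extract job computed once and reused across repeated split() calls) can be sketched in Python; the class and function names below are illustrative only, not Beam's API.

```python
# Sketch of the split()-result caching pattern from the diff above.
# `CachingSource` and `fake_extract` are hypothetical illustrations,
# not part of the Beam SDK.

class CachingSource:
    def __init__(self, extract):
        self._extract = extract           # expensive step, e.g. a BigQuery extract job
        self._cached_split_result = None  # memoized list of sub-sources

    def split(self, desired_bundle_size_bytes):
        # desired_bundle_size_bytes is ignored, mirroring the Java comment:
        # repeated calls must not trigger another extract job.
        if self._cached_split_result is None:
            result = self._extract()
            if result is None:
                raise ValueError("extract produced no sources")
            self._cached_split_result = result
        return self._cached_split_result

calls = []

def fake_extract():
    calls.append(1)
    return ["source-a", "source-b"]

src = CachingSource(fake_extract)
first = src.split(1024)
second = src.split(2048)  # different desired size, but no second extract
```

A second call with a different desired bundle size returns the same cached list, and the extract callable runs exactly once.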
[2/2] beam git commit: This closes #2594

2017-04-19 Thread jkff
This closes #2594


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/391fb77c
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/391fb77c
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/391fb77c

Branch: refs/heads/master
Commit: 391fb77c379d271494527c6f78ef8ada6f40dc23
Parents: 29e054a 1533e2b
Author: Eugene Kirpichov 
Authored: Wed Apr 19 11:39:36 2017 -0700
Committer: Eugene Kirpichov 
Committed: Wed Apr 19 11:39:36 2017 -0700

--
 .../sdk/io/gcp/bigquery/BigQuerySourceBase.java | 31 +---
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 18 +---
 .../sdk/io/gcp/bigquery/FakeJobService.java |  9 ++
 3 files changed, 37 insertions(+), 21 deletions(-)
--




[GitHub] beam pull request #2594: Cache result of BigQuerySourceBase.split

2017-04-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2594


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Aviem Zur (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975196#comment-15975196
 ] 

Aviem Zur commented on BEAM-2005:
-

Wouldn't this ticket mean actually implementing 
https://github.com/apache/beam/blob/master/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystem.java
 ?

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.)
> I'm investigating this now.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Assigned] (BEAM-1002) Enable caching of side-input dependent computations

2017-04-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles reassigned BEAM-1002:
-

Assignee: (was: Kenneth Knowles)

> Enable caching of side-input dependent computations
> ---
>
> Key: BEAM-1002
> URL: https://issues.apache.org/jira/browse/BEAM-1002
> Project: Beam
>  Issue Type: New Feature
>  Components: beam-model
>Reporter: Robert Bradshaw
>
> Sometimes the kind of computation one wants to perform in startBundle depends 
> on side inputs (and, implicitly, the window). For example, one might want to 
> initialize a (non-serializable) stateful object. In particular, this leads to 
> users incorrectly (in the case of triggered or non-globally-windowed side 
> inputs) memoizing this computation in the first processElement call. 
> One option would be to fold this into a customizable ViewFn. 
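The caching this issue asks for can be approximated by keying memoized state on the side-input value rather than computing it once in the first processElement call; a minimal sketch follows (hypothetical helper class, not a Beam API):

```python
# Sketch: memoize a side-input-dependent helper per distinct side-input
# value, instead of building it once on the first processElement call
# (which is wrong when side inputs vary per window/trigger firing).
# `DoFnWithSideInputCache` is a hypothetical illustration.

class DoFnWithSideInputCache:
    def __init__(self, build_helper):
        self._build = build_helper
        self._cache = {}  # side-input value -> initialized helper

    def process(self, element, side_input):
        helper = self._cache.get(side_input)
        if helper is None:
            helper = self._build(side_input)  # expensive initialization
            self._cache[side_input] = helper
        return helper(element)

builds = []

def make_adder(n):
    builds.append(n)
    return lambda x: x + n

fn = DoFnWithSideInputCache(make_adder)
out = [fn.process(1, 10), fn.process(2, 10), fn.process(1, 20)]
```

The helper is rebuilt only when the side-input value changes, so a triggered or windowed side input no longer reuses a stale computation.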





[GitHub] beam pull request #2591: [BEAM-1984] Enable dependency analysis of non-compi...

2017-04-19 Thread iemejia
GitHub user iemejia opened a pull request:

https://github.com/apache/beam/pull/2591

[BEAM-1984] Enable dependency analysis of non-compile dependencies

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/iemejia/beam 
BEAM-1984-enable-dep-analysis-for-tests

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2591.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2591


commit 852d8aea354714a6d4b2ca6121fba454a9939de3
Author: Ismaël Mejía 
Date:   2017-04-13T16:52:58Z

[BEAM-1984] Update maven surefire and failsafe plugins to version 2.20

commit 14af68d717e87162bbda81c7f64d456f82182674
Author: Ismaël Mejía 
Date:   2017-04-16T12:41:44Z

[BEAM-1984] Fix scope for dependencies needed only for test purposes

commit 28bbca63b474cd622a99a67bc30b2e6370368138
Author: Ismaël Mejía 
Date:   2017-04-17T21:34:12Z

[BEAM-1984] Enable runtime dependency analysis






[jira] [Updated] (BEAM-2015) Dataflow PostCommit has been failing since build #2814

2017-04-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-2015:

Description: 
https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2814/

The issue is that runners/core-java inherits the Maven profile that is activated by 
specifying the validatesRunnerPipelineOptions system property.

This issue had not been a problem in the past because the dependency order had 
made the runners/google-cloud-dataflow-java part of the classpath.

There was an assumption that runners/ would only have runner modules and that 
all submodules would want to have the shared definition. This is no longer the 
case.

{code}
Tests run: 7, Failures: 0, Errors: 7, Skipped: 0, Time elapsed: 0.046 sec <<< 
FAILURE! - in org.apache.beam.runners.core.SplittableParDoTest
testCheckpointsAfterNumOutputs(org.apache.beam.runners.core.SplittableParDoTest)
  Time elapsed: 0.003 sec  <<< ERROR!
java.lang.IllegalArgumentException: Unknown 'runner' specified 
'org.apache.beam.runners.dataflow.testing.TestDataflowRunner', supported 
pipeline runners [RegisteredTestRunner]
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1609)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:289)
at 
org.apache.beam.sdk.testing.TestPipeline.testingPipelineOptions(TestPipeline.java:392)
at 
org.apache.beam.sdk.testing.TestPipeline.create(TestPipeline.java:257)
at 
org.apache.beam.runners.core.SplittableParDoTest.(SplittableParDoTest.java:149)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at 
org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
at 
org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.junit.runners.Suite.runChild(Suite.java:128)
at org.junit.runners.Suite.runChild(Suite.java:27)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
at 
org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
at 
org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:161)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray2(ReflectionUtils.java:202)
at 
org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:155)
at 

[jira] [Commented] (BEAM-1575) Add ValidatesRunner test to PipelineTest.test_metrics_in_source

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975070#comment-15975070
 ] 

ASF GitHub Bot commented on BEAM-1575:
--

GitHub user pabloem opened a pull request:

https://github.com/apache/beam/pull/2593

[BEAM-1575] Adding validatesrunner test for sources



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam source-metrics-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2593.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2593






> Add ValidatesRunner test to PipelineTest.test_metrics_in_source
> ---
>
> Key: BEAM-1575
> URL: https://issues.apache.org/jira/browse/BEAM-1575
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-py
>Reporter: Pablo Estrada
>Assignee: Pablo Estrada
>
> Currently, the source works only in unit tests. We need a source 
> that can be used in all runners.





[2/2] beam git commit: This closes #2544

2017-04-19 Thread chamikara
This closes #2544


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/714fdd29
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/714fdd29
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/714fdd29

Branch: refs/heads/master
Commit: 714fdd2923ed379eba1de9aaae5d76cb02d69b20
Parents: 8319369 97c6678
Author: chamik...@google.com 
Authored: Wed Apr 19 09:56:39 2017 -0700
Committer: chamik...@google.com 
Committed: Wed Apr 19 09:56:39 2017 -0700

--
 sdks/python/apache_beam/io/fileio.py | 90 ---
 1 file changed, 90 deletions(-)
--




[1/2] beam git commit: [BEAM-1441] Remove deprecated ChannelFactory

2017-04-19 Thread chamikara
Repository: beam
Updated Branches:
  refs/heads/master 83193698d -> 714fdd292


[BEAM-1441] Remove deprecated ChannelFactory


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/97c66784
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/97c66784
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/97c66784

Branch: refs/heads/master
Commit: 97c667846b566c312ceaadc66fb14fde1dfa7ebe
Parents: 8319369
Author: Sourabh Bajaj 
Authored: Fri Apr 14 14:45:16 2017 -0700
Committer: chamik...@google.com 
Committed: Wed Apr 19 09:56:28 2017 -0700

--
 sdks/python/apache_beam/io/fileio.py | 90 ---
 1 file changed, 90 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/97c66784/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py
index 8ee5198..f61289e 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -27,7 +27,6 @@ import time
 from apache_beam.internal import util
 from apache_beam.io import iobase
 from apache_beam.io.filesystem import BeamIOError
-from apache_beam.io.filesystem import CompressedFile as _CompressedFile
 from apache_beam.io.filesystem import CompressionTypes
 from apache_beam.io.filesystems_util import get_filesystem
 from apache_beam.transforms.display import DisplayDataItem
@@ -38,95 +37,6 @@ from apache_beam.utils.value_provider import check_accessible
 DEFAULT_SHARD_NAME_TEMPLATE = '-S-of-N'
 
 
-# TODO(sourabhbajaj): Remove this after BFS API is used everywhere
-class ChannelFactory(object):
-  @staticmethod
-  def mkdir(path):
-    bfs = get_filesystem(path)
-    return bfs.mkdirs(path)
-
-  @staticmethod
-  def open(path,
-           mode,
-           mime_type='application/octet-stream',
-           compression_type=CompressionTypes.AUTO):
-    bfs = get_filesystem(path)
-    if mode == 'rb':
-      return bfs.open(path, mime_type, compression_type)
-    elif mode == 'wb':
-      return bfs.create(path, mime_type, compression_type)
-
-  @staticmethod
-  def is_compressed(fileobj):
-    return isinstance(fileobj, _CompressedFile)
-
-  @staticmethod
-  def rename(src, dest):
-    bfs = get_filesystem(src)
-    return bfs.rename([src], [dest])
-
-  @staticmethod
-  def rename_batch(src_dest_pairs):
-    sources = [s for s, _ in src_dest_pairs]
-    destinations = [d for _, d in src_dest_pairs]
-    if not sources:
-      return []
-    bfs = get_filesystem(sources[0])
-    try:
-      bfs.rename(sources, destinations)
-      return []
-    except BeamIOError as exp:
-      return [(s, d, e) for (s, d), e in exp.exception_details.iteritems()]
-
-  @staticmethod
-  def copytree(src, dest):
-    bfs = get_filesystem(src)
-    return bfs.copy([src], [dest])
-
-  @staticmethod
-  def exists(path):
-    bfs = get_filesystem(path)
-    return bfs.exists(path)
-
-  @staticmethod
-  def rmdir(path):
-    bfs = get_filesystem(path)
-    return bfs.delete([path])
-
-  @staticmethod
-  def rm(path):
-    bfs = get_filesystem(path)
-    return bfs.delete([path])
-
-  @staticmethod
-  def glob(path, limit=None):
-    bfs = get_filesystem(path)
-    match_result = bfs.match([path], [limit])[0]
-    return [f.path for f in match_result.metadata_list]
-
-  @staticmethod
-  def size_in_bytes(path):
-    bfs = get_filesystem(path)
-    match_result = bfs.match([path])[0]
-    return [f.size_in_bytes for f in match_result.metadata_list][0]
-
-  @staticmethod
-  def size_of_files_in_glob(path, file_names=None):
-    bfs = get_filesystem(path)
-    match_result = bfs.match([path])[0]
-    part_files = {f.path:f.size_in_bytes for f in match_result.metadata_list}
-
-    if file_names is not None:
-      specific_files = {}
-      match_results = bfs.match(file_names)
-      for match_result in match_results:
-        for metadata in match_result.metadata_list:
-          specific_files[metadata.path] = metadata.size_in_bytes
-
-      part_files.update(specific_files)
-    return part_files
-
-
 class FileSink(iobase.Sink):
   """A sink to a GCS or local files.
 



[jira] [Commented] (BEAM-1441) Add an IOChannelFactory interface to Python SDK

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975082#comment-15975082
 ] 

ASF GitHub Bot commented on BEAM-1441:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2544


> Add an IOChannelFactory interface to Python SDK
> ---
>
> Key: BEAM-1441
> URL: https://issues.apache.org/jira/browse/BEAM-1441
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-py
>Reporter: Chamikara Jayalath
>Assignee: Sourabh Bajaj
> Fix For: First stable release
>
>
> Based on proposal [1], an IOChannelFactory interface was added to Java SDK  
> [2].
> We should add a similar interface to Python SDK and provide proper 
> implementations for native files, GCS, and other useful formats.
> Python SDK currently has a basic ChannelFactory interface [3] which is used 
> by FileBasedSource [4].
> [1] 
> https://docs.google.com/document/d/11TdPyZ9_zmjokhNWM3Id-XJsVG3qel2lhdKTknmZ_7M/edit#heading=h.kpqagzh8i11w
> [2] 
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/IOChannelFactory.java
> [3] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/fileio.py#L107
> [4] 
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/filebasedsource.py
>  
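The ChannelFactory shim discussed here delegates every call to a filesystem chosen from the path; the scheme-based dispatch that get_filesystem() performs can be sketched with a toy registry (hypothetical names; Beam's real lookup lives in filesystems_util):

```python
# Toy sketch of scheme-based filesystem dispatch, as ChannelFactory's
# get_filesystem(path) calls imply. The registry and FakeGcsFs are
# hypothetical illustrations, not Beam's implementation.

_REGISTRY = {}

def register_filesystem(scheme, fs):
    _REGISTRY[scheme] = fs

def get_filesystem(path):
    # Paths like 'gs://bucket/obj' carry a scheme; bare paths are local.
    scheme = path.split('://', 1)[0] if '://' in path else 'file'
    try:
        return _REGISTRY[scheme]
    except KeyError:
        raise ValueError('no filesystem registered for scheme %r' % scheme)

class FakeGcsFs:
    def exists(self, path):
        return path.endswith('.txt')

register_filesystem('gs', FakeGcsFs())
fs = get_filesystem('gs://bucket/file.txt')
```

Each static ChannelFactory method is then a one-line delegation: look up the filesystem for the path's scheme, call the corresponding FileSystem method.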





[GitHub] beam pull request #2544: [BEAM-1441] Remove deprecated ChannelFactory

2017-04-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2544




[jira] [Created] (BEAM-2016) Delete HDFSFileSource/Sink

2017-04-19 Thread Eugene Kirpichov (JIRA)
Eugene Kirpichov created BEAM-2016:
--

 Summary: Delete HDFSFileSource/Sink
 Key: BEAM-2016
 URL: https://issues.apache.org/jira/browse/BEAM-2016
 Project: Beam
  Issue Type: Task
  Components: sdk-java-extensions
Reporter: Eugene Kirpichov
Assignee: Jean-Baptiste Onofré
 Fix For: First stable release


After https://issues.apache.org/jira/browse/BEAM-2005, delete 
https://github.com/apache/beam/tree/master/sdks/java/io/hdfs since it'll be 
redundant with the ability to read HDFS via other file-based IOs.





[jira] [Commented] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Eugene Kirpichov (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975164#comment-15975164
 ] 

Eugene Kirpichov commented on BEAM-2005:


After this is done, and before the first stable release, we should delete 
https://github.com/apache/beam/tree/master/sdks/java/io/hdfs/src/main/java/org/apache/beam/sdk/io/hdfs.

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.)
> I'm investigating this now.





[jira] [Assigned] (BEAM-2007) DataflowRunner drops Reads with no consumers

2017-04-19 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin reassigned BEAM-2007:
-

Assignee: Ahmet Altay

> DataflowRunner drops Reads with no consumers
> 
>
> Key: BEAM-2007
> URL: https://issues.apache.org/jira/browse/BEAM-2007
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py
>Reporter: Daniel Halperin
>Assignee: Ahmet Altay
> Fix For: First stable release
>
>
> Basically, if a pipeline has "just" a Read with no consumers, the optimizer 
> in Dataflow will drop it. To preserve Beam semantics, we do want to run the 
> Read and drop its output, e.g., because the Read may have side effects that 
> we're testing for.
> Is it possible with pipeline surgery to find such Reads and add an Identity 
> ParDo to them?





[jira] [Commented] (BEAM-2007) DataflowRunner drops Reads with no consumers

2017-04-19 Thread Daniel Halperin (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975169#comment-15975169
 ] 

Daniel Halperin commented on BEAM-2007:
---

[~altay] -- assigning to you solely as FYI

> DataflowRunner drops Reads with no consumers
> 
>
> Key: BEAM-2007
> URL: https://issues.apache.org/jira/browse/BEAM-2007
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py
>Reporter: Daniel Halperin
>Assignee: Ahmet Altay
> Fix For: First stable release
>
>
> Basically, if a pipeline has "just" a Read with no consumers, the optimizer 
> in Dataflow will drop it. To preserve Beam semantics, we do want to run the 
> Read and drop its output, e.g., because the Read may have side effects that 
> we're testing for.
> Is it possible with pipeline surgery to find such Reads and add an Identity 
> ParDo to them?
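The pipeline surgery suggested above can be sketched over a toy transform graph: find Reads whose output nothing consumes and attach a pass-through ParDo so the optimizer cannot prune them (toy tuples, not Beam's graph API):

```python
# Sketch of BEAM-2007's proposed fix: append an identity transform after
# any Read with no consumers. The (name, kind, inputs) tuples are a toy
# stand-in for a real pipeline graph.

def attach_identity_to_orphan_reads(transforms):
    consumed = {inp for _, _, inputs in transforms for inp in inputs}
    patched = list(transforms)
    for name, kind, _ in transforms:
        if kind == 'Read' and name not in consumed:
            # Identity ParDo: consumes the read's output and re-emits it,
            # preserving the Read's side effects under optimization.
            patched.append((name + '/Identity', 'ParDo', [name]))
    return patched

pipeline = [
    ('ReadA', 'Read', []),
    ('ReadB', 'Read', []),          # no consumer: would be dropped
    ('Count', 'ParDo', ['ReadA']),
]
patched = attach_identity_to_orphan_reads(pipeline)
```

Only the orphaned ReadB gains an identity consumer; reads that already feed a transform are left untouched.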





[jira] [Issue Comment Deleted] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Aviem Zur (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aviem Zur updated BEAM-2005:

Comment: was deleted

(was: I'll refine - it should be in a separate module, but I think core should 
depend on it (i.e. out of the box functionality).)

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.)
> I'm investigating this now.





[GitHub] beam pull request #2595: [Beam-115] More complete translation of the graph t...

2017-04-19 Thread robertwb
GitHub user robertwb opened a pull request:

https://github.com/apache/beam/pull/2595

[Beam-115] More complete translation of the graph through the Runner API

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [ ] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [ ] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [ ] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.pdf).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/robertwb/incubator-beam py-runner-api

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2595.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2595


commit d5b5c5ca23fce24e3c7028836641737609274596
Author: Robert Bradshaw 
Date:   2017-04-18T22:29:04Z

Per-transform runner api dispatch.

commit dcf2bda4cbad8d2f1bbaa9f650fe59cd40522b5e
Author: Robert Bradshaw 
Date:   2017-04-18T22:51:50Z

Translate flatten to Runner API.

commit d8d96772a60d53c4ffdaa480fd0bee575c8ae4df
Author: Robert Bradshaw 
Date:   2017-04-18T23:07:32Z

Translate WindowInto through the Runner API.

commit 41a7bb060ecc46182252469fe2c055562a72a6a0
Author: Robert Bradshaw 
Date:   2017-04-19T00:01:53Z

Translate Reads through the Runner API.

commit 05da3fd47701434f004a9d7ad8c5b80261d3c11d
Author: Robert Bradshaw 
Date:   2017-04-19T05:46:25Z

Factor out common URN registration.






Jenkins build is still unstable: beam_PostCommit_Java_ValidatesRunner_Dataflow #2881

2017-04-19 Thread Apache Jenkins Server
See 




[jira] [Updated] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-2005:
--
Fix Version/s: First stable release

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.)
> I'm investigating this now.





[jira] [Updated] (BEAM-2005) Add a Hadoop FileSystem implementation of Beam's FileSystem

2017-04-19 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-2005:
--
Affects Version/s: (was: First stable release)

> Add a Hadoop FileSystem implementation of Beam's FileSystem
> ---
>
> Key: BEAM-2005
> URL: https://issues.apache.org/jira/browse/BEAM-2005
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Stephen Sisk
>Assignee: Stephen Sisk
> Fix For: First stable release
>
>
> Beam's FileSystem creates an abstraction for reading from files in many 
> different places. 
> We should add a Hadoop FileSystem implementation 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html)
>  - that would enable us to read from any file system that implements 
> FileSystem (including HDFS, Azure, S3, etc.)
> I'm investigating this now.





[jira] [Updated] (BEAM-2007) DataflowRunner drops Reads with no consumers

2017-04-19 Thread Daniel Halperin (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Halperin updated BEAM-2007:
--
Component/s: sdk-py

> DataflowRunner drops Reads with no consumers
> 
>
> Key: BEAM-2007
> URL: https://issues.apache.org/jira/browse/BEAM-2007
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow, sdk-py
>Reporter: Daniel Halperin
> Fix For: First stable release
>
>
> Basically, if a pipeline has "just" a Read with no consumers, the optimizer 
> in Dataflow will drop it. To preserve Beam semantics, we do want to run the 
> Read and drop its output, e.g., because the Read may have side effects that 
> we're testing for.
> Is it possible with pipeline surgery to find such Reads and add an Identity 
> ParDo to them?





[jira] [Resolved] (BEAM-1328) Serialize/deserialize WindowingStrategy in a language-agnostic manner

2017-04-19 Thread Kenneth Knowles (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kenneth Knowles resolved BEAM-1328.
---
   Resolution: Fixed
Fix Version/s: First stable release

> Serialize/deserialize WindowingStrategy in a language-agnostic manner
> -
>
> Key: BEAM-1328
> URL: https://issues.apache.org/jira/browse/BEAM-1328
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-core, sdk-py
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: First stable release
>
>
> This is an upcoming blocker for Python, as the Python SDK needs to be able to 
> ship the pieces of the windowing strategy in a way a runner can grok. Only 
> the WindowFn should remain language-specific.





Jenkins build is back to normal : beam_PostCommit_Java_MavenInstall #3373

2017-04-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2016) Delete HDFSFileSource/Sink

2017-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975349#comment-15975349
 ] 

Jean-Baptiste Onofré commented on BEAM-2016:


It makes sense (as mentioned on the mailing list). I will tackle that!

> Delete HDFSFileSource/Sink
> --
>
> Key: BEAM-2016
> URL: https://issues.apache.org/jira/browse/BEAM-2016
> Project: Beam
>  Issue Type: Task
>  Components: sdk-java-extensions
>Reporter: Eugene Kirpichov
>Assignee: Jean-Baptiste Onofré
> Fix For: First stable release
>
>
> After https://issues.apache.org/jira/browse/BEAM-2005, delete 
> https://github.com/apache/beam/tree/master/sdks/java/io/hdfs since it'll be 
> redundant with the ability to read HDFS via other file-based IOs.





[jira] [Resolved] (BEAM-2015) Dataflow PostCommit has been failing since build #2814

2017-04-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2015.
-
   Resolution: Fixed
Fix Version/s: Not applicable

> Dataflow PostCommit has been failing since build #2814
> --
>
> Key: BEAM-2015
> URL: https://issues.apache.org/jira/browse/BEAM-2015
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
> Fix For: Not applicable
>
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2814/
> The issue is that runners/core-java is inheriting the Maven profile that is 
> activated by specifying the validatesRunnerPipelineOptions system property.
> This had not been a problem in the past because the dependency order had 
> made runners/google-cloud-dataflow-java part of the classpath.
> There was an assumption that runners/ would contain only runner modules and 
> that all submodules would want the shared definition. This is no longer the 
> case.
> {code}
> Tests run: 7, Failures: 0, Errors: 7, Skipped: 0, Time elapsed: 0.046 sec <<< 
> FAILURE! - in org.apache.beam.runners.core.SplittableParDoTest
> testCheckpointsAfterNumOutputs(org.apache.beam.runners.core.SplittableParDoTest)
>   Time elapsed: 0.003 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Unknown 'runner' specified 
> 'org.apache.beam.runners.dataflow.testing.TestDataflowRunner', supported 
> pipeline runners [RegisteredTestRunner]
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1609)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:289)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.testingPipelineOptions(TestPipeline.java:392)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.create(TestPipeline.java:257)
>   at 
> org.apache.beam.runners.core.SplittableParDoTest.(SplittableParDoTest.java:149)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:161)
>   at 

[2/2] beam git commit: [BEAM-2014] Upgrade to Google Auth 0.6.1

2017-04-19 Thread lcwik
[BEAM-2014] Upgrade to Google Auth 0.6.1

This closes #2590


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/7515c260
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/7515c260
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/7515c260

Branch: refs/heads/master
Commit: 7515c2602b2df77fd5886be10ab1ae89faa3a280
Parents: 19ae877 be4f8b7
Author: Luke Cwik 
Authored: Wed Apr 19 12:58:33 2017 -0700
Committer: Luke Cwik 
Committed: Wed Apr 19 12:58:33 2017 -0700

--
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--




[jira] [Commented] (BEAM-775) Remove Aggregators from the Java SDK

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975419#comment-15975419
 ] 

ASF GitHub Bot commented on BEAM-775:
-

Github user pabloem closed the pull request at:

https://github.com/apache/beam/pull/2184


> Remove Aggregators from the Java SDK
> 
>
> Key: BEAM-775
> URL: https://issues.apache.org/jira/browse/BEAM-775
> Project: Beam
>  Issue Type: Sub-task
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Assignee: Pablo Estrada
>  Labels: backward-incompatible
> Fix For: First stable release
>
>






[jira] [Created] (BEAM-2018) refine expression of Calcite method/function

2017-04-19 Thread Xu Mingmin (JIRA)
Xu Mingmin created BEAM-2018:


 Summary: refine expression of Calcite method/function
 Key: BEAM-2018
 URL: https://issues.apache.org/jira/browse/BEAM-2018
 Project: Beam
  Issue Type: Improvement
  Components: dsl-sql
Reporter: Xu Mingmin
Assignee: Xu Mingmin


https://calcite.apache.org/docs/reference.html lists the methods/functions that 
are supported in Calcite SQL statements. 

This task defines an interface for mapping SQL expressions (those with 
methods/functions, and those without, such as direct field references) into a 
Java expression that can be evaluated against row records. 

It's intended to replace the reference implementation {{BeamSQLSpELExecutor}} 
with better capability. 
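The kind of interface this task describes can be sketched as a compiled SQL expression that is evaluable against a row record. The names below (BeamSqlExpression, fieldRef, plus) are hypothetical and invented for illustration; they are not the actual DSL API.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: SQL expressions compiled into evaluable Java objects.
public class SqlExpressionDemo {

  /** A SQL expression compiled into something evaluable against a row. */
  interface BeamSqlExpression {
    Object evaluate(Map<String, Object> row);
  }

  /** Direct field reference, e.g. `SELECT price ...`. */
  static BeamSqlExpression fieldRef(String name) {
    return row -> row.get(name);
  }

  /** A function/operator, e.g. `price + tax`. */
  static BeamSqlExpression plus(BeamSqlExpression a, BeamSqlExpression b) {
    return row -> ((Number) a.evaluate(row)).longValue()
        + ((Number) b.evaluate(row)).longValue();
  }

  public static void main(String[] args) {
    Map<String, Object> row = new HashMap<>();
    row.put("price", 10L);
    row.put("tax", 2L);
    BeamSqlExpression expr = plus(fieldRef("price"), fieldRef("tax"));
    System.out.println(expr.evaluate(row));  // 12
  }
}
```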





[jira] [Commented] (BEAM-2015) Dataflow PostCommit has been failing since build #2814

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975320#comment-15975320
 ] 

ASF GitHub Bot commented on BEAM-2015:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2592


> Dataflow PostCommit has been failing since build #2814
> --
>
> Key: BEAM-2015
> URL: https://issues.apache.org/jira/browse/BEAM-2015
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2814/
> The issue is that runners/core-java is inheriting the Maven profile that is 
> activated by specifying the validatesRunnerPipelineOptions system property.
> This had not been a problem in the past because the dependency order had 
> made runners/google-cloud-dataflow-java part of the classpath.
> There was an assumption that runners/ would contain only runner modules and 
> that all submodules would want the shared definition. This is no longer the 
> case.
> {code}
> Tests run: 7, Failures: 0, Errors: 7, Skipped: 0, Time elapsed: 0.046 sec <<< 
> FAILURE! - in org.apache.beam.runners.core.SplittableParDoTest
> testCheckpointsAfterNumOutputs(org.apache.beam.runners.core.SplittableParDoTest)
>   Time elapsed: 0.003 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Unknown 'runner' specified 
> 'org.apache.beam.runners.dataflow.testing.TestDataflowRunner', supported 
> pipeline runners [RegisteredTestRunner]
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1609)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:289)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.testingPipelineOptions(TestPipeline.java:392)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.create(TestPipeline.java:257)
>   at 
> org.apache.beam.runners.core.SplittableParDoTest.(SplittableParDoTest.java:149)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> 

[jira] [Commented] (BEAM-919) Remove remaining old use/learn links from website src

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-919?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975324#comment-15975324
 ] 

ASF GitHub Bot commented on BEAM-919:
-

Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/217


> Remove remaining old use/learn links from website src
> -
>
> Key: BEAM-919
> URL: https://issues.apache.org/jira/browse/BEAM-919
> Project: Beam
>  Issue Type: Bug
>  Components: website
>Reporter: Frances Perry
>Assignee: Melissa Pashniak
>Priority: Minor
>
> We still have old links lingering after the website refactoring.
> For example, the release guide 
> (https://github.com/apache/incubator-beam-site/blob/asf-site/src/contribute/release-guide.md)
>  still links to "/use/..." in a bunch of places. 
> Impact: links still work because of redirects, but it's tech debt we should 
> fix.





[jira] [Created] (BEAM-2017) DataflowRunner: fix NullPointerException that can occur when no metrics are present

2017-04-19 Thread Daniel Halperin (JIRA)
Daniel Halperin created BEAM-2017:
-

 Summary: DataflowRunner: fix NullPointerException that can occur 
when no metrics are present
 Key: BEAM-2017
 URL: https://issues.apache.org/jira/browse/BEAM-2017
 Project: Beam
  Issue Type: Bug
  Components: runner-dataflow
Reporter: Daniel Halperin
Assignee: Daniel Halperin
 Fix For: First stable release


{code}
Exception in thread "main" java.lang.NullPointerException
at 
org.apache.beam.runners.dataflow.DataflowMetrics.populateMetricQueryResults(DataflowMetrics.java:118)
at 
org.apache.beam.runners.dataflow.DataflowMetrics.queryServiceForMetrics(DataflowMetrics.java:173)
at 
org.apache.beam.runners.dataflow.DataflowMetrics.queryMetrics(DataflowMetrics.java:186)
at 
com.google.cloud.dataflow.integration.autotuning.separation.SeparationHarness.verifySeparationInPipelineWithHangingFn(SeparationHarness.java:146)
at 
com.google.cloud.dataflow.integration.autotuning.separation.SeparationHarness.verifySeparationInPipeline(SeparationHarness.java:129)
at 
com.google.cloud.dataflow.integration.autotuning.separation.SeparationHarness.verifySeparation(SeparationHarness.java:120)
at 
com.google.cloud.dataflow.integration.autotuning.separation.InMemorySeparation.main(InMemorySeparation.java:19)
{code}
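A fix for this class of NPE typically follows one defensive pattern: treat a null metric list returned by the service as empty instead of iterating it. The sketch below shows only that pattern under assumed, simplified types; it is not the actual DataflowMetrics code.

```java
import java.util.Collections;
import java.util.List;

// Hedged sketch of the defensive null-guard pattern such a fix usually applies.
public class MetricsGuard {

  /** Returns the given list, or an empty list when the service returned null. */
  static <T> List<T> orEmpty(List<T> metrics) {
    return metrics == null ? Collections.emptyList() : metrics;
  }

  public static void main(String[] args) {
    // Iterating orEmpty(null) no longer throws a NullPointerException.
    int count = 0;
    for (String m : MetricsGuard.<String>orEmpty(null)) {
      count++;
    }
    System.out.println(count);  // 0
  }
}
```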





[27/50] [abbrv] beam git commit: [BEAM-1994] Remove Flink examples package

2017-04-19 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/beam/blob/cdd2544b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/BoundedSourceWrapper.java
--
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/BoundedSourceWrapper.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/BoundedSourceWrapper.java
new file mode 100644
index 000..2ed5024
--- /dev/null
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/BoundedSourceWrapper.java
@@ -0,0 +1,218 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.io;
+
+import com.google.common.annotations.VisibleForTesting;
+import java.util.ArrayList;
+import java.util.List;
+import org.apache.beam.runners.flink.translation.utils.SerializedPipelineOptions;
+import org.apache.beam.sdk.io.BoundedSource;
+import org.apache.beam.sdk.options.PipelineOptions;
+import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
+import org.apache.beam.sdk.transforms.windowing.PaneInfo;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.flink.api.common.functions.StoppableFunction;
+import org.apache.flink.streaming.api.functions.source.RichParallelSourceFunction;
+import org.apache.flink.streaming.api.watermark.Watermark;
+import org.joda.time.Instant;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Wrapper for executing {@link BoundedSource BoundedSources} as a Flink Source.
+ */
+public class BoundedSourceWrapper<OutputT>
+    extends RichParallelSourceFunction<WindowedValue<OutputT>>
+    implements StoppableFunction {
+
+  private static final Logger LOG = LoggerFactory.getLogger(BoundedSourceWrapper.class);
+
+  /**
+   * Keep the options so that we can initialize the readers.
+   */
+  private final SerializedPipelineOptions serializedOptions;
+
+  /**
+   * The split sources. We split them in the constructor to ensure that all parallel
+   * sources are consistent about the split sources.
+   */
+  private List<? extends BoundedSource<OutputT>> splitSources;
+
+  /**
+   * Make it a field so that we can access it in {@link #close()}.
+   */
+  private transient List<BoundedSource.BoundedReader<OutputT>> readers;
+
+  /**
+   * Initialize here and not in run() to prevent races where we cancel a job before run() is
+   * ever called or run() is called after cancel().
+   */
+  private volatile boolean isRunning = true;
+
+  @SuppressWarnings("unchecked")
+  public BoundedSourceWrapper(
+      PipelineOptions pipelineOptions,
+      BoundedSource<OutputT> source,
+      int parallelism) throws Exception {
+    this.serializedOptions = new SerializedPipelineOptions(pipelineOptions);
+
+    long desiredBundleSize = source.getEstimatedSizeBytes(pipelineOptions) / parallelism;
+
+    // get the splits early. we assume that the generated splits are stable,
+    // this is necessary so that the mapping of state to source is correct
+    // when restoring
+    splitSources = source.split(desiredBundleSize, pipelineOptions);
+  }
+
+  @Override
+  public void run(SourceContext<WindowedValue<OutputT>> ctx) throws Exception {
+
+    // figure out which split sources we're responsible for
+    int subtaskIndex = getRuntimeContext().getIndexOfThisSubtask();
+    int numSubtasks = getRuntimeContext().getNumberOfParallelSubtasks();
+
+    List<BoundedSource<OutputT>> localSources = new ArrayList<>();
+
+    for (int i = 0; i < splitSources.size(); i++) {
+      if (i % numSubtasks == subtaskIndex) {
+        localSources.add(splitSources.get(i));
+      }
+    }
+
+    LOG.info("Bounded Flink Source {}/{} is reading from sources: {}",
+        subtaskIndex,
+        numSubtasks,
+        localSources);
+
+    readers = new ArrayList<>();
+    // initialize readers from scratch
+    for (BoundedSource<OutputT> source : localSources) {
+      readers.add(source.createReader(serializedOptions.getPipelineOptions()));
+    }
+
+    if (readers.size() == 1) {
+      // the easy case, we just read from one reader
+      BoundedSource.BoundedReader<OutputT> reader = readers.get(0);
+
+

[jira] [Commented] (BEAM-2009) support JdbcIO as source/sink

2017-04-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/BEAM-2009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975351#comment-15975351
 ] 

Jean-Baptiste Onofré commented on BEAM-2009:


It makes sense, thanks!

> support JdbcIO as source/sink
> -
>
> Key: BEAM-2009
> URL: https://issues.apache.org/jira/browse/BEAM-2009
> Project: Beam
>  Issue Type: New Feature
>  Components: dsl-sql
>Reporter: Xu Mingmin
>Assignee: Jean-Baptiste Onofré
>
> Support JdbcIO in both the source and sink roles:
> 1. As a source, JdbcIO reads data from databases that support JDBC, such as 
> Oracle/MySQL/Cassandra/...; this leads to a bounded pipeline.
> 2. As a sink, JdbcIO can persist data from both unbounded and bounded 
> pipelines.
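The source-side contract such a connector needs can be illustrated with a row-mapper shape: each JDBC result row is turned into a pipeline element, yielding a bounded collection. To keep the sketch runnable without a database, rows are modeled here as maps; the real connector would map from java.sql.ResultSet, and the names below are invented for the sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Conceptual sketch of a JdbcIO-style row mapper over an in-memory "result set".
public class RowMapperDemo {

  /** Maps one result row to an output element (hypothetical shape). */
  interface RowMapper<T> {
    T map(Map<String, Object> row);
  }

  /** "Reads" a bounded, in-memory result set through the mapper. */
  static <T> List<T> read(List<Map<String, Object>> resultSet, RowMapper<T> mapper) {
    return resultSet.stream().map(mapper::map).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Map<String, Object>> rs =
        Arrays.asList(Collections.<String, Object>singletonMap("name", "beam"));
    List<String> out = read(rs, row -> (String) row.get("name"));
    System.out.println(out);  // [beam]
  }
}
```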





Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #1707

2017-04-19 Thread Apache Jenkins Server
See 




[jira] [Commented] (BEAM-2017) DataflowRunner: fix NullPointerException that can occur when no metrics are present

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975355#comment-15975355
 ] 

ASF GitHub Bot commented on BEAM-2017:
--

GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2596

[BEAM-2017] Fix NPE in DataflowRunner when there are no metrics

R: @pabloem or @bjchambers 

CC: @malo-denielou 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam dataflow-metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2596.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2596


commit b0675c79bfcf500456a275f0033bc5844a4d4600
Author: Dan Halperin 
Date:   2017-04-19T19:13:59Z

[BEAM-2017] Fix NPE in DataflowRunner when there are no metrics




> DataflowRunner: fix NullPointerException that can occur when no metrics are 
> present
> ---
>
> Key: BEAM-2017
> URL: https://issues.apache.org/jira/browse/BEAM-2017
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
> Fix For: First stable release
>
>
> {code}
> Exception in thread "main" java.lang.NullPointerException
> at 
> org.apache.beam.runners.dataflow.DataflowMetrics.populateMetricQueryResults(DataflowMetrics.java:118)
> at 
> org.apache.beam.runners.dataflow.DataflowMetrics.queryServiceForMetrics(DataflowMetrics.java:173)
> at 
> org.apache.beam.runners.dataflow.DataflowMetrics.queryMetrics(DataflowMetrics.java:186)
> at 
> com.google.cloud.dataflow.integration.autotuning.separation.SeparationHarness.verifySeparationInPipelineWithHangingFn(SeparationHarness.java:146)
> at 
> com.google.cloud.dataflow.integration.autotuning.separation.SeparationHarness.verifySeparationInPipeline(SeparationHarness.java:129)
> at 
> com.google.cloud.dataflow.integration.autotuning.separation.SeparationHarness.verifySeparation(SeparationHarness.java:120)
> at 
> com.google.cloud.dataflow.integration.autotuning.separation.InMemorySeparation.main(InMemorySeparation.java:19)
> {code}





[GitHub] beam pull request #2596: [BEAM-2017] Fix NPE in DataflowRunner when there ar...

2017-04-19 Thread dhalperi
GitHub user dhalperi opened a pull request:

https://github.com/apache/beam/pull/2596

[BEAM-2017] Fix NPE in DataflowRunner when there are no metrics

R: @pabloem or @bjchambers 

CC: @malo-denielou 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dhalperi/beam dataflow-metrics

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2596.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2596


commit b0675c79bfcf500456a275f0033bc5844a4d4600
Author: Dan Halperin 
Date:   2017-04-19T19:13:59Z

[BEAM-2017] Fix NPE in DataflowRunner when there are no metrics




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (BEAM-2015) Dataflow PostCommit has been failing since build #2814

2017-04-19 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975390#comment-15975390
 ] 

Luke Cwik commented on BEAM-2015:
-

Tests are now passing again:
https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2883/

> Dataflow PostCommit has been failing since build #2814
> --
>
> Key: BEAM-2015
> URL: https://issues.apache.org/jira/browse/BEAM-2015
> Project: Beam
>  Issue Type: Bug
>  Components: runner-dataflow
>Affects Versions: First stable release
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/2814/
> The issue is that runners/core-java is inheriting the Maven profile that is 
> activated by specifying the validatesRunnerPipelineOptions system property.
> This had not been a problem in the past because the dependency order had 
> made runners/google-cloud-dataflow-java part of the classpath.
> There was an assumption that runners/ would contain only runner modules and 
> that all submodules would want the shared definition. This is no longer the 
> case.
> {code}
> Tests run: 7, Failures: 0, Errors: 7, Skipped: 0, Time elapsed: 0.046 sec <<< 
> FAILURE! - in org.apache.beam.runners.core.SplittableParDoTest
> testCheckpointsAfterNumOutputs(org.apache.beam.runners.core.SplittableParDoTest)
>   Time elapsed: 0.003 sec  <<< ERROR!
> java.lang.IllegalArgumentException: Unknown 'runner' specified 
> 'org.apache.beam.runners.dataflow.testing.TestDataflowRunner', supported 
> pipeline runners [RegisteredTestRunner]
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1609)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:104)
>   at 
> org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:289)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.testingPipelineOptions(TestPipeline.java:392)
>   at 
> org.apache.beam.sdk.testing.TestPipeline.create(TestPipeline.java:257)
>   at 
> org.apache.beam.runners.core.SplittableParDoTest.(SplittableParDoTest.java:149)
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.createTest(BlockJUnit4ClassRunner.java:217)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.runReflectiveCall(BlockJUnit4ClassRunner.java:266)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.methodBlock(BlockJUnit4ClassRunner.java:263)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.junit.runners.Suite.runChild(Suite.java:128)
>   at org.junit.runners.Suite.runChild(Suite.java:27)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>   at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>   at org.apache.maven.surefire.junitcore.JUnitCore.run(JUnitCore.java:55)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:137)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:107)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:83)
>   at 
> org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:75)
>   at 
> 

[jira] [Commented] (BEAM-2014) Upgrade to Google Auth 0.6.1

2017-04-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15975408#comment-15975408
 ] 

ASF GitHub Bot commented on BEAM-2014:
--

Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2590


> Upgrade to Google Auth 0.6.1
> 
>
> Key: BEAM-2014
> URL: https://issues.apache.org/jira/browse/BEAM-2014
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: First stable release
>
>






[GitHub] beam pull request #2590: [BEAM-2014] Upgrade to Google Auth 0.6.1

2017-04-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam/pull/2590




[jira] [Resolved] (BEAM-2014) Upgrade to Google Auth 0.6.1

2017-04-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-2014.
-
   Resolution: Fixed
Fix Version/s: First stable release

> Upgrade to Google Auth 0.6.1
> 
>
> Key: BEAM-2014
> URL: https://issues.apache.org/jira/browse/BEAM-2014
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
> Fix For: First stable release
>
>






[1/2] beam git commit: [BEAM-2014] Upgrade to Google Auth 0.6.1

2017-04-19 Thread lcwik
Repository: beam
Updated Branches:
  refs/heads/master 19ae87762 -> 7515c2602


[BEAM-2014] Upgrade to Google Auth 0.6.1


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/be4f8b7f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/be4f8b7f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/be4f8b7f

Branch: refs/heads/master
Commit: be4f8b7fbae66a1ed5c48b0995065522fd13cae9
Parents: 19ae877
Author: Jean-Baptiste Onofré 
Authored: Wed Apr 19 16:54:41 2017 +0200
Committer: Luke Cwik 
Committed: Wed Apr 19 12:58:11 2017 -0700

--
 pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/be4f8b7f/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 09659db..802f305 100644
--- a/pom.xml
+++ b/pom.xml
@@ -115,7 +115,7 @@
 1.3.0
 1.0-rc2
 1.3
-0.6.0
+0.6.1
 1.22.0
 1.4.5
 
0.5.160304



[GitHub] beam pull request #2184: [BEAM-775] Remove Aggregators from PipelineResults ...

2017-04-19 Thread pabloem
GitHub user pabloem reopened a pull request:

https://github.com/apache/beam/pull/2184

[BEAM-775] Remove Aggregators from PipelineResults and Examples in Java SDK

Be sure to do all of the following to help us incorporate your contribution
quickly and easily:

 - [x] Make sure the PR title is formatted like:
   `[BEAM-] Description of pull request`
 - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
   Travis-CI on your fork and ensure the whole test matrix passes).
 - [x] Replace `` in the title with the actual Jira issue
   number, if there is one.
 - [x] If this contribution is large, please file an Apache
   [Individual Contributor License 
Agreement](https://www.apache.org/licenses/icla.txt).

---


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/pabloem/incubator-beam remove-agg-java

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/beam/pull/2184.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2184


commit 161dd0f5e991b5532a956c8e197dccb612d0e050
Author: Pablo 
Date:   2017-03-07T20:50:38Z

Removing Aggregators from Examples

commit 906f372cdafce2ef50ea1a09005c34f1e40266f2
Author: Pablo 
Date:   2017-03-07T21:03:27Z

Removing Aggregators from runner-specific examples and tests.

commit 7c4f66f02972c18a936a99cd190aebd130670609
Author: Pablo 
Date:   2017-03-09T21:37:33Z

Addressing comment.

commit 36a1bf748d316088e503870b131880f8116f6fec
Author: Pablo 
Date:   2017-04-17T18:50:29Z

Unused imports

commit 179511acc5d45251bb1ffc3e653a50bb66681995
Author: Pablo 
Date:   2017-04-17T23:23:26Z

Adding matchers

commit 5752b24c650879fb08d6709a03d0523ff3b6da6b
Author: Pablo 
Date:   2017-03-07T21:06:15Z

Removing Aggregators from PipelineResults and subclasses.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build became unstable: beam_PostCommit_Java_ValidatesRunner_Spark #1709

2017-04-19 Thread Apache Jenkins Server
See 




Jenkins build is back to stable : beam_PostCommit_Java_ValidatesRunner_Dataflow #2883

2017-04-19 Thread Apache Jenkins Server
See 




[3/3] beam-site git commit: This closes #217

2017-04-19 Thread davor
This closes #217


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/dd3a16da
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/dd3a16da
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/dd3a16da

Branch: refs/heads/asf-site
Commit: dd3a16da7ae2441f5068b08de6808b200d800839
Parents: 747746b d557ce3
Author: Davor Bonaci 
Authored: Wed Apr 19 12:08:06 2017 -0700
Committer: Davor Bonaci 
Committed: Wed Apr 19 12:08:06 2017 -0700

--
 content/beam/capability/2016/03/17/capability-matrix.html |  4 ++--
 .../capability/2016/04/03/presentation-materials.html |  4 ++--
 content/beam/release/2016/06/15/first-release.html|  4 ++--
 .../update/2016/10/11/strata-hadoop-world-and-beam.html   |  2 +-
 content/blog/2016/08/03/six-months.html   |  2 +-
 content/blog/2016/10/20/test-stream.html  |  2 +-
 content/blog/index.html   |  2 +-
 content/contribute/contribution-guide/index.html  | 10 +-
 content/contribute/index.html |  4 ++--
 content/contribute/release-guide/index.html   |  8 
 content/documentation/sdks/java/index.html|  2 +-
 content/documentation/sdks/python/index.html  |  2 +-
 content/feed.xml  | 10 +-
 content/get-started/mobile-gaming-example/index.html  |  2 +-
 content/v2/index.html |  2 +-
 src/_posts/2016-03-17-capability-matrix.md|  4 ++--
 src/_posts/2016-04-03-presentation-materials.md   |  4 ++--
 src/_posts/2016-06-15-first-release.md|  4 ++--
 src/_posts/2016-08-03-six-months.md   |  2 +-
 src/_posts/2016-10-12-strata-hadoop-world-and-beam.md |  2 +-
 src/_posts/2016-10-20-test-stream.md  |  2 +-
 src/contribute/contribution-guide.md  | 10 +-
 src/contribute/index.md   |  4 ++--
 src/contribute/release-guide.md   |  8 
 src/documentation/sdks/java.md|  2 +-
 src/documentation/sdks/python.md  |  2 +-
 src/get-started/mobile-gaming-example.md  |  2 +-
 src/v2/index.md   |  2 +-
 28 files changed, 54 insertions(+), 54 deletions(-)
--




[1/3] beam-site git commit: [BEAM-919] Remove remaining old use/learn links from website src

2017-04-19 Thread davor
Repository: beam-site
Updated Branches:
  refs/heads/asf-site 747746bc4 -> dd3a16da7


[BEAM-919] Remove remaining old use/learn links from website src


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/84eef87d
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/84eef87d
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/84eef87d

Branch: refs/heads/asf-site
Commit: 84eef87d0f84045d4a19e00b6d210899d9323b71
Parents: 747746b
Author: melissa 
Authored: Tue Apr 18 18:20:29 2017 -0700
Committer: Davor Bonaci 
Committed: Wed Apr 19 12:07:45 2017 -0700

--
 src/_posts/2016-03-17-capability-matrix.md|  4 ++--
 src/_posts/2016-04-03-presentation-materials.md   |  4 ++--
 src/_posts/2016-06-15-first-release.md|  4 ++--
 src/_posts/2016-08-03-six-months.md   |  2 +-
 src/_posts/2016-10-12-strata-hadoop-world-and-beam.md |  2 +-
 src/_posts/2016-10-20-test-stream.md  |  2 +-
 src/contribute/contribution-guide.md  | 10 +-
 src/contribute/index.md   |  4 ++--
 src/contribute/release-guide.md   |  8 
 src/documentation/sdks/java.md|  2 +-
 src/documentation/sdks/python.md  |  2 +-
 src/get-started/mobile-gaming-example.md  |  2 +-
 src/v2/index.md   |  2 +-
 13 files changed, 24 insertions(+), 24 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/84eef87d/src/_posts/2016-03-17-capability-matrix.md
--
diff --git a/src/_posts/2016-03-17-capability-matrix.md 
b/src/_posts/2016-03-17-capability-matrix.md
index a125103..775f014 100644
--- a/src/_posts/2016-03-17-capability-matrix.md
+++ b/src/_posts/2016-03-17-capability-matrix.md
@@ -579,9 +579,9 @@ With initial code drops complete ([Dataflow SDK and 
Runner](https://github.com/a
 
 While we’d love to have a world where all runners support the full suite of 
semantics included in the Beam Model (formerly referred to as the [Dataflow 
Model](http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf)), practically speaking, 
there will always be certain features that some runners can’t provide. For 
example, a Hadoop-based runner would be inherently batch-based and may be 
unable to (easily) implement support for unbounded collections. However, that 
doesn’t prevent it from being extremely useful for a large set of uses. In 
other cases, the implementations provided by one runner may have slightly 
different semantics than those provided by another (e.g. even though the 
current suite of runners all support exactly-once delivery guarantees, an 
[Apache Samza](http://samza.apache.org/) runner, which would be a welcome 
addition, would currently only support at-least-once).
 
-To help clarify things, we’ve been working on enumerating the key features 
of the Beam model in a [capability matrix]({{ site.baseurl 
}}/learn/runners/capability-matrix/) for all existing runners, categorized 
around the four key questions addressed by the model: What / Where 
/ When / How (if you’re not familiar with those 
questions, you might want to read through [Streaming 
102](http://oreilly.com/ideas/the-world-beyond-batch-streaming-102) for an 
overview). This table will be maintained over time as the model evolves, our 
understanding grows, and runners are created or features added.
+To help clarify things, we’ve been working on enumerating the key features 
of the Beam model in a [capability matrix]({{ site.baseurl 
}}/documentation/runners/capability-matrix/) for all existing runners, 
categorized around the four key questions addressed by the model: What / Where 
/ When / How (if you’re not familiar with those 
questions, you might want to read through [Streaming 
102](http://oreilly.com/ideas/the-world-beyond-batch-streaming-102) for an 
overview). This table will be maintained over time as the model evolves, our 
understanding grows, and runners are created or features added.
 
-Included below is a summary snapshot of our current understanding of the 
capabilities of the existing runners (see the [live version]({{ site.baseurl 
}}/learn/runners/capability-matrix/) for full details, descriptions, and Jira 
links); since integration is still under way, the system as a whole isn’t yet 
in a completely stable, usable state. But that should be changing in the near 
future, and we’ll be updating loud and clear on this blog when the first 
supported Beam 1.0 release happens.
+Included below is a summary snapshot of our current understanding of the 
capabilities of the existing runners (see the [live 

[2/3] beam-site git commit: Regenerate website

2017-04-19 Thread davor
Regenerate website


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/d557ce3f
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/d557ce3f
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/d557ce3f

Branch: refs/heads/asf-site
Commit: d557ce3fc592e9e17a61c47416dbc3ca765d85bc
Parents: 84eef87
Author: Davor Bonaci 
Authored: Wed Apr 19 12:08:06 2017 -0700
Committer: Davor Bonaci 
Committed: Wed Apr 19 12:08:06 2017 -0700

--
 content/beam/capability/2016/03/17/capability-matrix.html |  4 ++--
 .../capability/2016/04/03/presentation-materials.html |  4 ++--
 content/beam/release/2016/06/15/first-release.html|  4 ++--
 .../update/2016/10/11/strata-hadoop-world-and-beam.html   |  2 +-
 content/blog/2016/08/03/six-months.html   |  2 +-
 content/blog/2016/10/20/test-stream.html  |  2 +-
 content/blog/index.html   |  2 +-
 content/contribute/contribution-guide/index.html  | 10 +-
 content/contribute/index.html |  4 ++--
 content/contribute/release-guide/index.html   |  8 
 content/documentation/sdks/java/index.html|  2 +-
 content/documentation/sdks/python/index.html  |  2 +-
 content/feed.xml  | 10 +-
 content/get-started/mobile-gaming-example/index.html  |  2 +-
 content/v2/index.html |  2 +-
 15 files changed, 30 insertions(+), 30 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam-site/blob/d557ce3f/content/beam/capability/2016/03/17/capability-matrix.html
--
diff --git a/content/beam/capability/2016/03/17/capability-matrix.html 
b/content/beam/capability/2016/03/17/capability-matrix.html
index 82da83a..9f08727 100644
--- a/content/beam/capability/2016/03/17/capability-matrix.html
+++ b/content/beam/capability/2016/03/17/capability-matrix.html
@@ -167,9 +167,9 @@
 
 While we’d love to have a world where all runners support the full suite 
of semantics included in the Beam Model (formerly referred to as the <a href="http://www.vldb.org/pvldb/vol8/p1792-Akidau.pdf">Dataflow Model</a>), 
practically speaking, there will always be certain features that some runners 
can’t provide. For example, a Hadoop-based runner would be inherently 
batch-based and may be unable to (easily) implement support for unbounded 
collections. However, that doesn’t prevent it from being extremely useful for 
a large set of uses. In other cases, the implementations provided by one runner 
may have slightly different semantics than those provided by another (e.g. even 
though the current suite of runners all support exactly-once delivery 
guarantees, an <a href="http://samza.apache.org/">Apache Samza</a> runner, 
which would be a welcome addition, would currently only support 
at-least-once).
 
-To help clarify things, we’ve been working on enumerating the key 
features of the Beam model in a capability matrix for all existing 
runners, categorized around the four key questions addressed by the model: 
What / Where / When 
/ How (if you’re not familiar with those 
questions, you might want to read through <a href="http://oreilly.com/ideas/the-world-beyond-batch-streaming-102">Streaming 102</a> for an overview). This table will be maintained over time as the model 
evolves, our understanding grows, and runners are created or features added.
+To help clarify things, we’ve been working on enumerating the key 
features of the Beam model in a capability matrix for all 
existing runners, categorized around the four key questions addressed by the 
model: What / Where / When 
/ How (if you’re not familiar with those 
questions, you might want to read through <a href="http://oreilly.com/ideas/the-world-beyond-batch-streaming-102">Streaming 102</a> for an overview). This table will be maintained over time as the model 
evolves, our understanding grows, and runners are created or features added.
 
-Included below is a summary snapshot of our current understanding of the 
capabilities of the existing runners (see the live version for full details, 
descriptions, and Jira links); since integration is still under way, the system 
as a whole isn’t yet in a completely stable, usable state. But that should be 
changing in the near future, and we’ll be updating loud and clear on this 
blog when the first supported Beam 1.0 release happens.
+Included below is a summary snapshot of our current understanding of the 
capabilities of the existing runners (see the live version for full 
details, descriptions, and Jira links); since integration is still under way, 
the system as a whole isn’t yet in a 

[1/2] beam git commit: [BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow ValidatesRunner PostCommit

2017-04-19 Thread dhalperi
Repository: beam
Updated Branches:
  refs/heads/master 391fb77c3 -> 19ae87762


[BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow 
ValidatesRunner PostCommit


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/546aa61f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/546aa61f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/546aa61f

Branch: refs/heads/master
Commit: 546aa61f217dc59f95727970a8dbc7c4b2f76e54
Parents: 391fb77
Author: Luke Cwik 
Authored: Wed Apr 19 09:20:38 2017 -0700
Committer: Dan Halperin 
Committed: Wed Apr 19 12:07:33 2017 -0700

--
 runners/apex/pom.xml   |  1 +
 runners/direct-java/pom.xml|  1 +
 runners/flink/pom.xml  |  2 ++
 runners/google-cloud-dataflow-java/pom.xml | 43 +
 runners/pom.xml| 40 ---
 runners/spark/pom.xml  |  1 +
 6 files changed, 48 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/apex/pom.xml
--
diff --git a/runners/apex/pom.xml b/runners/apex/pom.xml
index 40fc93c..f441e3d 100644
--- a/runners/apex/pom.xml
+++ b/runners/apex/pom.xml
@@ -229,6 +229,7 @@
 
   
   ${skipIntegrationTests}
+  4
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/direct-java/pom.xml
--
diff --git a/runners/direct-java/pom.xml b/runners/direct-java/pom.xml
index 03ed791..fc28fd6 100644
--- a/runners/direct-java/pom.xml
+++ b/runners/direct-java/pom.xml
@@ -81,6 +81,7 @@
   ]
 
   
+  4
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/flink/pom.xml
--
diff --git a/runners/flink/pom.xml b/runners/flink/pom.xml
index 351035e..808219b 100644
--- a/runners/flink/pom.xml
+++ b/runners/flink/pom.xml
@@ -75,6 +75,7 @@
   ]
 
   
+  4
 
   
 
@@ -108,6 +109,7 @@
   ]
 
   
+  4
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index e8aadb8..4cde923 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -38,6 +38,49 @@
 
6
   
 
+  
+
+
+  validates-runner-tests
+  
+validatesRunnerPipelineOptions
+  
+  
+
+  
+
+  org.apache.maven.plugins
+  maven-surefire-plugin
+  
+
+  validates-runner-tests
+  integration-test
+  
+test
+  
+  
+false
+
org.apache.beam.sdk.testing.ValidatesRunner
+all
+4
+
+  
org.apache.beam:beam-sdks-java-core
+
+
+  
${validatesRunnerPipelineOptions}
+
+  
+
+  
+
+  
+
+  
+
+  
+
   
 
   

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/pom.xml
--
diff --git a/runners/pom.xml b/runners/pom.xml
index 150e987..8f3cabd 100644
--- a/runners/pom.xml
+++ b/runners/pom.xml
@@ -54,46 +54,6 @@
 
   
 
-
-
-
-  validates-runner-tests
-  
-validatesRunnerPipelineOptions
-  
-  
-
-  
-
-  org.apache.maven.plugins
-  maven-surefire-plugin
-  
-
-  validates-runner-tests
-  integration-test
-  
-test
-  
-  
-
org.apache.beam.sdk.testing.ValidatesRunner
-all
-4
-
-  
org.apache.beam:beam-sdks-java-core
- 

[2/2] beam git commit: This closes #2592

2017-04-19 Thread dhalperi
This closes #2592


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/19ae8776
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/19ae8776
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/19ae8776

Branch: refs/heads/master
Commit: 19ae8776261a5a78044091d9172223244a2b8042
Parents: 391fb77 546aa61
Author: Dan Halperin 
Authored: Wed Apr 19 12:07:37 2017 -0700
Committer: Dan Halperin 
Committed: Wed Apr 19 12:07:37 2017 -0700

--
 runners/apex/pom.xml   |  1 +
 runners/direct-java/pom.xml|  1 +
 runners/flink/pom.xml  |  2 ++
 runners/google-cloud-dataflow-java/pom.xml | 43 +
 runners/pom.xml| 40 ---
 runners/spark/pom.xml  |  1 +
 6 files changed, 48 insertions(+), 40 deletions(-)
--




[GitHub] beam-site pull request #217: [BEAM-919] Remove remaining old use/learn links...

2017-04-19 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/beam-site/pull/217




[21/50] [abbrv] beam git commit: [BEAM-1914] This closes #2558

2017-04-19 Thread dhalperi
[BEAM-1914] This closes #2558


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/470808c0
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/470808c0
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/470808c0

Branch: refs/heads/DSL_SQL
Commit: 470808c06fc10ad545712d6b1831530e3d5313ad
Parents: 57929fb d0c0a60
Author: Jean-Baptiste Onofré 
Authored: Wed Apr 19 10:58:42 2017 +0200
Committer: Jean-Baptiste Onofré 
Committed: Wed Apr 19 10:58:42 2017 +0200

--
 .../apache/beam/sdk/io/CompressedSource.java|   4 +-
 .../main/java/org/apache/beam/sdk/io/XmlIO.java | 477 +++
 .../java/org/apache/beam/sdk/io/XmlSink.java| 226 ++---
 .../java/org/apache/beam/sdk/io/XmlSource.java  | 191 +---
 .../sdk/transforms/display/DisplayData.java |   6 +
 .../org/apache/beam/sdk/io/XmlSinkTest.java |  89 ++--
 .../org/apache/beam/sdk/io/XmlSourceTest.java   | 248 ++
 .../sdk/transforms/display/DisplayDataTest.java |  17 +
 8 files changed, 740 insertions(+), 518 deletions(-)
--




[30/50] [abbrv] beam git commit: [BEAM-1994] Remove Flink examples package

2017-04-19 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/beam/blob/cdd2544b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
--
diff --git 
a/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
new file mode 100644
index 000..123d5e7
--- /dev/null
+++ 
b/runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkStreamingTransformTranslators.java
@@ -0,0 +1,1044 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.beam.runners.flink;
+
+import static com.google.common.base.Preconditions.checkArgument;
+
+import com.google.common.collect.Lists;
+import com.google.common.collect.Maps;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Map.Entry;
+import org.apache.beam.runners.core.ElementAndRestriction;
+import org.apache.beam.runners.core.KeyedWorkItem;
+import org.apache.beam.runners.core.SplittableParDo;
+import org.apache.beam.runners.core.SystemReduceFn;
+import org.apache.beam.runners.flink.translation.functions.FlinkAssignWindows;
+import org.apache.beam.runners.flink.translation.types.CoderTypeInformation;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.DoFnOperator;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.KvToByteBufferKeySelector;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.SingletonKeyedWorkItem;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.SingletonKeyedWorkItemCoder;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.SplittableDoFnOperator;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WindowDoFnOperator;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.WorkItemKeySelector;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.BoundedSourceWrapper;
+import 
org.apache.beam.runners.flink.translation.wrappers.streaming.io.UnboundedSourceWrapper;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.coders.KvCoder;
+import org.apache.beam.sdk.coders.StringUtf8Coder;
+import org.apache.beam.sdk.coders.VoidCoder;
+import org.apache.beam.sdk.io.Read;
+import org.apache.beam.sdk.io.TextIO;
+import org.apache.beam.sdk.transforms.Combine;
+import org.apache.beam.sdk.transforms.DoFn;
+import org.apache.beam.sdk.transforms.Flatten;
+import org.apache.beam.sdk.transforms.GroupByKey;
+import org.apache.beam.sdk.transforms.PTransform;
+import org.apache.beam.sdk.transforms.ParDo;
+import org.apache.beam.sdk.transforms.join.RawUnionValue;
+import org.apache.beam.sdk.transforms.join.UnionCoder;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignature;
+import org.apache.beam.sdk.transforms.reflect.DoFnSignatures;
+import org.apache.beam.sdk.transforms.splittabledofn.RestrictionTracker;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.GlobalWindow;
+import org.apache.beam.sdk.transforms.windowing.Window;
+import org.apache.beam.sdk.transforms.windowing.WindowFn;
+import org.apache.beam.sdk.util.AppliedCombineFn;
+import org.apache.beam.sdk.util.Reshuffle;
+import org.apache.beam.sdk.util.WindowedValue;
+import org.apache.beam.sdk.util.WindowingStrategy;
+import org.apache.beam.sdk.values.KV;
+import org.apache.beam.sdk.values.PCollection;
+import org.apache.beam.sdk.values.PCollectionView;
+import org.apache.beam.sdk.values.PValue;
+import org.apache.beam.sdk.values.TupleTag;
+import org.apache.flink.api.common.functions.FlatMapFunction;
+import org.apache.flink.api.common.functions.MapFunction;
+import org.apache.flink.api.common.functions.RichFlatMapFunction;
+import org.apache.flink.api.common.typeinfo.TypeInformation;
+import org.apache.flink.api.java.tuple.Tuple2;
+import org.apache.flink.core.fs.FileSystem;
+import 

[32/50] [abbrv] beam git commit: [BEAM-1994] Remove Flink examples package

2017-04-19 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/beam/blob/cdd2544b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/TestCountingSource.java
--
diff --git 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/TestCountingSource.java
 
b/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/TestCountingSource.java
deleted file mode 100644
index 3a08088..000
--- 
a/runners/flink/runner/src/test/java/org/apache/beam/runners/flink/streaming/TestCountingSource.java
+++ /dev/null
@@ -1,254 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- *
- * http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-package org.apache.beam.runners.flink.streaming;
-
-import static org.apache.beam.sdk.util.CoderUtils.encodeToByteArray;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.List;
-import java.util.concurrent.ThreadLocalRandom;
-import javax.annotation.Nullable;
-import org.apache.beam.sdk.coders.Coder;
-import org.apache.beam.sdk.coders.DelegateCoder;
-import org.apache.beam.sdk.coders.KvCoder;
-import org.apache.beam.sdk.coders.VarIntCoder;
-import org.apache.beam.sdk.io.UnboundedSource;
-import org.apache.beam.sdk.options.PipelineOptions;
-import org.apache.beam.sdk.values.KV;
-import org.joda.time.Instant;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-/**
- * An unbounded source for testing the unbounded sources framework code.
- *
 - * Each split of this source produces records of the form KV(split_id, i),
- * where i counts up from 0.  Each record has a timestamp of i, and the 
watermark
- * accurately tracks these timestamps.  The reader will occasionally return 
false
- * from {@code advance}, in order to simulate a source where not all the data 
is
- * available immediately.
- */
-public class TestCountingSource
 -    extends UnboundedSource<KV<Integer, Integer>, TestCountingSource.CounterMark> {
-  private static final Logger LOG = 
LoggerFactory.getLogger(TestCountingSource.class);
-
-  private static List finalizeTracker;
-  private final int numMessagesPerShard;
-  private final int shardNumber;
-  private final boolean dedup;
-  private final boolean throwOnFirstSnapshot;
-  private final boolean allowSplitting;
-
-  /**
-   * We only allow an exception to be thrown from getCheckpointMark
-   * at most once. This must be static since the entire TestCountingSource
-   * instance may re-serialized when the pipeline recovers and retries.
-   */
-  private static boolean thrown = false;
-
-  public static void setFinalizeTracker(List finalizeTracker) {
-TestCountingSource.finalizeTracker = finalizeTracker;
-  }
-
-  public TestCountingSource(int numMessagesPerShard) {
-this(numMessagesPerShard, 0, false, false, true);
-  }
-
-  public TestCountingSource withDedup() {
-return new TestCountingSource(
-numMessagesPerShard, shardNumber, true, throwOnFirstSnapshot, true);
-  }
-
-  private TestCountingSource withShardNumber(int shardNumber) {
-return new TestCountingSource(
-numMessagesPerShard, shardNumber, dedup, throwOnFirstSnapshot, true);
-  }
-
-  public TestCountingSource withThrowOnFirstSnapshot(boolean 
throwOnFirstSnapshot) {
-return new TestCountingSource(
-numMessagesPerShard, shardNumber, dedup, throwOnFirstSnapshot, true);
-  }
-
-  public TestCountingSource withoutSplitting() {
-return new TestCountingSource(
-numMessagesPerShard, shardNumber, dedup, throwOnFirstSnapshot, false);
-  }
-
-  private TestCountingSource(int numMessagesPerShard, int shardNumber, boolean 
dedup,
- boolean throwOnFirstSnapshot, boolean 
allowSplitting) {
-this.numMessagesPerShard = numMessagesPerShard;
-this.shardNumber = shardNumber;
-this.dedup = dedup;
-this.throwOnFirstSnapshot = throwOnFirstSnapshot;
-this.allowSplitting = allowSplitting;
-  }
-
-  public int getShardNumber() {
-return shardNumber;
-  }
-
-  @Override
-  public List split(
-  int desiredNumSplits, PipelineOptions options) {
-List splits = new ArrayList<>();
-int numSplits = allowSplitting ? desiredNumSplits : 1;
-for (int i = 0; i < numSplits; i++) {

[48/50] [abbrv] beam git commit: [BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow ValidatesRunner PostCommit

2017-04-19 Thread dhalperi
[BEAM-2015] Remove shared profile in runners/pom.xml and fix Dataflow 
ValidatesRunner PostCommit


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/546aa61f
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/546aa61f
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/546aa61f

Branch: refs/heads/DSL_SQL
Commit: 546aa61f217dc59f95727970a8dbc7c4b2f76e54
Parents: 391fb77
Author: Luke Cwik 
Authored: Wed Apr 19 09:20:38 2017 -0700
Committer: Dan Halperin 
Committed: Wed Apr 19 12:07:33 2017 -0700

--
 runners/apex/pom.xml   |  1 +
 runners/direct-java/pom.xml|  1 +
 runners/flink/pom.xml  |  2 ++
 runners/google-cloud-dataflow-java/pom.xml | 43 +
 runners/pom.xml| 40 ---
 runners/spark/pom.xml  |  1 +
 6 files changed, 48 insertions(+), 40 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/apex/pom.xml
--
diff --git a/runners/apex/pom.xml b/runners/apex/pom.xml
index 40fc93c..f441e3d 100644
--- a/runners/apex/pom.xml
+++ b/runners/apex/pom.xml
@@ -229,6 +229,7 @@
 
   
   ${skipIntegrationTests}
+  4
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/direct-java/pom.xml
--
diff --git a/runners/direct-java/pom.xml b/runners/direct-java/pom.xml
index 03ed791..fc28fd6 100644
--- a/runners/direct-java/pom.xml
+++ b/runners/direct-java/pom.xml
@@ -81,6 +81,7 @@
   ]
 
   
+  4
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/flink/pom.xml
--
diff --git a/runners/flink/pom.xml b/runners/flink/pom.xml
index 351035e..808219b 100644
--- a/runners/flink/pom.xml
+++ b/runners/flink/pom.xml
@@ -75,6 +75,7 @@
   ]
 
   
+  4
 
   
 
@@ -108,6 +109,7 @@
   ]
 
   
+  4
 
   
 

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/google-cloud-dataflow-java/pom.xml
--
diff --git a/runners/google-cloud-dataflow-java/pom.xml 
b/runners/google-cloud-dataflow-java/pom.xml
index e8aadb8..4cde923 100644
--- a/runners/google-cloud-dataflow-java/pom.xml
+++ b/runners/google-cloud-dataflow-java/pom.xml
@@ -38,6 +38,49 @@
 
6
   
 
+  
+
+
+  validates-runner-tests
+  
+validatesRunnerPipelineOptions
+  
+  
+
+  
+
+  org.apache.maven.plugins
+  maven-surefire-plugin
+  
+
+  validates-runner-tests
+  integration-test
+  
+test
+  
+  
+false
+
org.apache.beam.sdk.testing.ValidatesRunner
+all
+4
+
+  
org.apache.beam:beam-sdks-java-core
+
+
+  
${validatesRunnerPipelineOptions}
+
+  
+
+  
+
+  
+
+  
+
+  
+
   
 
   

http://git-wip-us.apache.org/repos/asf/beam/blob/546aa61f/runners/pom.xml
--
diff --git a/runners/pom.xml b/runners/pom.xml
index 150e987..8f3cabd 100644
--- a/runners/pom.xml
+++ b/runners/pom.xml
@@ -54,46 +54,6 @@
 
   
 
-
-
-
-  validates-runner-tests
-  
-validatesRunnerPipelineOptions
-  
-  
-
-  
-
-  org.apache.maven.plugins
-  maven-surefire-plugin
-  
-
-  validates-runner-tests
-  integration-test
-  
-test
-  
-  
-
org.apache.beam.sdk.testing.ValidatesRunner
-all
-4
-
-  
org.apache.beam:beam-sdks-java-core
-
-
-  

[04/50] [abbrv] beam git commit: ProcessFn remembers more info about its application context

2017-04-19 Thread dhalperi
ProcessFn remembers more info about its application context


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/3fd88901
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/3fd88901
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/3fd88901

Branch: refs/heads/DSL_SQL
Commit: 3fd889015afa8528801d2c35c8c9f72b944ea472
Parents: a51bdd2
Author: Eugene Kirpichov 
Authored: Sat Apr 15 16:39:51 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Apr 18 18:02:06 2017 -0700

--
 .../beam/runners/core/SplittableParDo.java  | 35 +++-
 .../beam/runners/core/SplittableParDoTest.java  |  8 -
 2 files changed, 34 insertions(+), 9 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/3fd88901/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
--
diff --git a/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java b/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
index 9cc965a..44db1f7 100644
--- a/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
+++ b/runners/core-java/src/main/java/org/apache/beam/runners/core/SplittableParDo.java
@@ -115,7 +115,7 @@ public class SplittableParDo
 fn,
 input.getCoder(),
 restrictionCoder,
-input.getWindowingStrategy(),
+(WindowingStrategy) input.getWindowingStrategy(),
 parDo.getSideInputs(),
 parDo.getMainOutputTag(),
 parDo.getAdditionalOutputTags()));
@@ -185,7 +185,7 @@ public class SplittableParDo
 private final DoFn fn;
 private final Coder elementCoder;
 private final Coder restrictionCoder;
-private final WindowingStrategy windowingStrategy;
+private final WindowingStrategy windowingStrategy;
 private final List sideInputs;
 private final TupleTag mainOutputTag;
 private final TupleTagList additionalOutputTags;
@@ -202,7 +202,7 @@ public class SplittableParDo
 DoFn fn,
 Coder elementCoder,
 Coder restrictionCoder,
-WindowingStrategy windowingStrategy,
+WindowingStrategy windowingStrategy,
 List sideInputs,
 TupleTag mainOutputTag,
 TupleTagList additionalOutputTags) {
@@ -234,7 +234,7 @@ public class SplittableParDo
 public ProcessFn newProcessFn(
 DoFn fn) {
   return new SplittableParDo.ProcessFn<>(
-  fn, elementCoder, restrictionCoder, windowingStrategy.getWindowFn().windowCoder());
+  fn, elementCoder, restrictionCoder, windowingStrategy);
 }
 
 @Override
@@ -351,7 +351,9 @@ public class SplittableParDo
 private StateTag restrictionTag;
 
 private final DoFn fn;
-private final Coder windowCoder;
+private final Coder elementCoder;
+private final Coder restrictionCoder;
+private final WindowingStrategy inputWindowingStrategy;
 
 private transient StateInternalsFactory stateInternalsFactory;
 private transient TimerInternalsFactory timerInternalsFactory;
@@ -364,11 +366,16 @@ public class SplittableParDo
 DoFn fn,
 Coder elementCoder,
 Coder restrictionCoder,
-Coder windowCoder) {
+WindowingStrategy inputWindowingStrategy) {
   this.fn = fn;
-  this.windowCoder = windowCoder;
+  this.elementCoder = elementCoder;
+  this.restrictionCoder = restrictionCoder;
+  this.inputWindowingStrategy = inputWindowingStrategy;
   this.elementTag =
-  StateTags.value("element", WindowedValue.getFullCoder(elementCoder, this.windowCoder));
+  StateTags.value(
+  "element",
+  WindowedValue.getFullCoder(
+  elementCoder, inputWindowingStrategy.getWindowFn().windowCoder()));
   this.restrictionTag = StateTags.value("restriction", restrictionCoder);
 }
 
@@ -389,6 +396,18 @@ public class SplittableParDo
   return fn;
 }
 
+public Coder getElementCoder() {
+  return elementCoder;
+}
+
+public Coder getRestrictionCoder() {
+  return restrictionCoder;
+}
+
+public WindowingStrategy getInputWindowingStrategy() {
+  return inputWindowingStrategy;
+}

[42/50] [abbrv] beam git commit: [BEAM-1441] Remove deprecated ChannelFactory

2017-04-19 Thread dhalperi
[BEAM-1441] Remove deprecated ChannelFactory


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/97c66784
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/97c66784
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/97c66784

Branch: refs/heads/DSL_SQL
Commit: 97c667846b566c312ceaadc66fb14fde1dfa7ebe
Parents: 8319369
Author: Sourabh Bajaj 
Authored: Fri Apr 14 14:45:16 2017 -0700
Committer: chamik...@google.com 
Committed: Wed Apr 19 09:56:28 2017 -0700

--
 sdks/python/apache_beam/io/fileio.py | 90 ---
 1 file changed, 90 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/beam/blob/97c66784/sdks/python/apache_beam/io/fileio.py
--
diff --git a/sdks/python/apache_beam/io/fileio.py b/sdks/python/apache_beam/io/fileio.py
index 8ee5198..f61289e 100644
--- a/sdks/python/apache_beam/io/fileio.py
+++ b/sdks/python/apache_beam/io/fileio.py
@@ -27,7 +27,6 @@ import time
 from apache_beam.internal import util
 from apache_beam.io import iobase
 from apache_beam.io.filesystem import BeamIOError
-from apache_beam.io.filesystem import CompressedFile as _CompressedFile
 from apache_beam.io.filesystem import CompressionTypes
 from apache_beam.io.filesystems_util import get_filesystem
 from apache_beam.transforms.display import DisplayDataItem
@@ -38,95 +37,6 @@ from apache_beam.utils.value_provider import check_accessible
DEFAULT_SHARD_NAME_TEMPLATE = '-SSSSS-of-NNNNN'
 
 
-# TODO(sourabhbajaj): Remove this after BFS API is used everywhere
-class ChannelFactory(object):
-  @staticmethod
-  def mkdir(path):
-bfs = get_filesystem(path)
-return bfs.mkdirs(path)
-
-  @staticmethod
-  def open(path,
-   mode,
-   mime_type='application/octet-stream',
-   compression_type=CompressionTypes.AUTO):
-bfs = get_filesystem(path)
-if mode == 'rb':
-  return bfs.open(path, mime_type, compression_type)
-elif mode == 'wb':
-  return bfs.create(path, mime_type, compression_type)
-
-  @staticmethod
-  def is_compressed(fileobj):
-return isinstance(fileobj, _CompressedFile)
-
-  @staticmethod
-  def rename(src, dest):
-bfs = get_filesystem(src)
-return bfs.rename([src], [dest])
-
-  @staticmethod
-  def rename_batch(src_dest_pairs):
-sources = [s for s, _ in src_dest_pairs]
-destinations = [d for _, d in src_dest_pairs]
-if not sources:
-  return []
-bfs = get_filesystem(sources[0])
-try:
-  bfs.rename(sources, destinations)
-  return []
-except BeamIOError as exp:
-  return [(s, d, e) for (s, d), e in exp.exception_details.iteritems()]
-
-  @staticmethod
-  def copytree(src, dest):
-bfs = get_filesystem(src)
-return bfs.copy([src], [dest])
-
-  @staticmethod
-  def exists(path):
-bfs = get_filesystem(path)
-return bfs.exists(path)
-
-  @staticmethod
-  def rmdir(path):
-bfs = get_filesystem(path)
-return bfs.delete([path])
-
-  @staticmethod
-  def rm(path):
-bfs = get_filesystem(path)
-return bfs.delete([path])
-
-  @staticmethod
-  def glob(path, limit=None):
-bfs = get_filesystem(path)
-match_result = bfs.match([path], [limit])[0]
-return [f.path for f in match_result.metadata_list]
-
-  @staticmethod
-  def size_in_bytes(path):
-bfs = get_filesystem(path)
-match_result = bfs.match([path])[0]
-return [f.size_in_bytes for f in match_result.metadata_list][0]
-
-  @staticmethod
-  def size_of_files_in_glob(path, file_names=None):
-bfs = get_filesystem(path)
-match_result = bfs.match([path])[0]
-part_files = {f.path:f.size_in_bytes for f in match_result.metadata_list}
-
-if file_names is not None:
-  specific_files = {}
-  match_results = bfs.match(file_names)
-  for match_result in match_results:
-for metadata in match_result.metadata_list:
-  specific_files[metadata.path] = metadata.size_in_bytes
-
-  part_files.update(specific_files)
-return part_files
-
-
 class FileSink(iobase.Sink):
   """A sink to a GCS or local files.
 



[26/50] [abbrv] beam git commit: [BEAM-1994] Remove Flink examples package

2017-04-19 Thread dhalperi
http://git-wip-us.apache.org/repos/asf/beam/blob/cdd2544b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkSplitStateInternals.java
--
diff --git a/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkSplitStateInternals.java b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkSplitStateInternals.java
new file mode 100644
index 000..2bf0bf1
--- /dev/null
+++ b/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/state/FlinkSplitStateInternals.java
@@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink.translation.wrappers.streaming.state;
+
+import com.google.common.collect.Iterators;
+import java.util.Collections;
+import org.apache.beam.runners.core.StateInternals;
+import org.apache.beam.runners.core.StateNamespace;
+import org.apache.beam.runners.core.StateTag;
+import org.apache.beam.runners.flink.translation.types.CoderTypeInformation;
+import org.apache.beam.sdk.coders.Coder;
+import org.apache.beam.sdk.transforms.Combine;
+import org.apache.beam.sdk.transforms.CombineWithContext;
+import org.apache.beam.sdk.transforms.windowing.BoundedWindow;
+import org.apache.beam.sdk.transforms.windowing.OutputTimeFn;
+import org.apache.beam.sdk.util.state.BagState;
+import org.apache.beam.sdk.util.state.CombiningState;
+import org.apache.beam.sdk.util.state.MapState;
+import org.apache.beam.sdk.util.state.ReadableState;
+import org.apache.beam.sdk.util.state.SetState;
+import org.apache.beam.sdk.util.state.State;
+import org.apache.beam.sdk.util.state.StateContext;
+import org.apache.beam.sdk.util.state.StateContexts;
+import org.apache.beam.sdk.util.state.ValueState;
+import org.apache.beam.sdk.util.state.WatermarkHoldState;
+import org.apache.flink.api.common.ExecutionConfig;
+import org.apache.flink.api.common.state.ListStateDescriptor;
+import org.apache.flink.runtime.state.OperatorStateBackend;
+
+/**
+ * {@link StateInternals} that uses a Flink {@link OperatorStateBackend}
+ * to manage the split-distribute state.
+ *
+ * Elements in ListState will be redistributed in round robin fashion
+ * to operators when restarting with a different parallelism.
+ *
+ *  Note:
+ *  Ignore index of key and namespace.
+ *  Just implement BagState.
+ */
+public class FlinkSplitStateInternals implements StateInternals {
+
+  private final OperatorStateBackend stateBackend;
+
+  public FlinkSplitStateInternals(OperatorStateBackend stateBackend) {
+this.stateBackend = stateBackend;
+  }
+
+  @Override
+  public K getKey() {
+return null;
+  }
+
+  @Override
+  public  T state(
+  final StateNamespace namespace,
+  StateTag address) {
+
+return state(namespace, address, StateContexts.nullContext());
+  }
+
+  @Override
+  public  T state(
+  final StateNamespace namespace,
+  StateTag address,
+  final StateContext context) {
+
+return address.bind(new StateTag.StateBinder() {
+
+  @Override
+  public  ValueState bindValue(
+  StateTag> address,
+  Coder coder) {
+throw new UnsupportedOperationException(
+String.format("%s is not supported", ValueState.class.getSimpleName()));
+  }
+
+  @Override
+  public  BagState bindBag(
+  StateTag> address,
+  Coder elemCoder) {
+
+return new FlinkSplitBagState<>(stateBackend, address, namespace, elemCoder);
+  }
+
+  @Override
+  public  SetState bindSet(
+  StateTag> address,
+  Coder elemCoder) {
+throw new UnsupportedOperationException(
+String.format("%s is not supported", SetState.class.getSimpleName()));
+  }
+
+  @Override
+  public  MapState bindMap(
+  StateTag> spec,
+  Coder mapKeyCoder, Coder mapValueCoder) {
+throw new UnsupportedOperationException(
+String.format("%s is not supported", MapState.class.getSimpleName()));
+  }
+
+  @Override
+  public 

[19/50] [abbrv] beam git commit: This closes #2415

2017-04-19 Thread dhalperi
This closes #2415


Project: http://git-wip-us.apache.org/repos/asf/beam/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam/commit/57929fb8
Tree: http://git-wip-us.apache.org/repos/asf/beam/tree/57929fb8
Diff: http://git-wip-us.apache.org/repos/asf/beam/diff/57929fb8

Branch: refs/heads/DSL_SQL
Commit: 57929fb802d0cb6a6b6c3f14819d473dc2ace113
Parents: e0df7d8 7d13061
Author: Eugene Kirpichov 
Authored: Tue Apr 18 21:13:05 2017 -0700
Committer: Eugene Kirpichov 
Committed: Tue Apr 18 21:13:05 2017 -0700

--
 .../apache/beam/sdk/util/IOChannelUtils.java|9 +
 .../sdk/io/gcp/bigquery/BatchLoadBigQuery.java  |  180 ---
 .../beam/sdk/io/gcp/bigquery/BatchLoads.java|  225 +++
 .../sdk/io/gcp/bigquery/BigQueryHelpers.java|   13 +
 .../beam/sdk/io/gcp/bigquery/BigQueryIO.java|  113 +-
 .../io/gcp/bigquery/BigQueryTableSource.java|4 +-
 .../beam/sdk/io/gcp/bigquery/CreateTables.java  |  127 ++
 .../io/gcp/bigquery/GenerateShardedTable.java   |   47 +
 .../beam/sdk/io/gcp/bigquery/PrepareWrite.java  |   81 +
 .../beam/sdk/io/gcp/bigquery/ShardedKey.java|   25 +-
 .../sdk/io/gcp/bigquery/StreamWithDeDup.java|   90 --
 .../sdk/io/gcp/bigquery/StreamingInserts.java   |   79 +
 .../sdk/io/gcp/bigquery/StreamingWriteFn.java   |   81 +-
 .../io/gcp/bigquery/StreamingWriteTables.java   |   86 ++
 .../sdk/io/gcp/bigquery/TableDestination.java   |   76 +
 .../io/gcp/bigquery/TableDestinationCoder.java  |   60 +
 .../sdk/io/gcp/bigquery/TableRowWriter.java |   19 +-
 .../sdk/io/gcp/bigquery/TagWithUniqueIds.java   |   62 +
 .../gcp/bigquery/TagWithUniqueIdsAndTable.java  |  135 --
 .../beam/sdk/io/gcp/bigquery/WriteBundles.java  |   82 --
 .../io/gcp/bigquery/WriteBundlesToFiles.java|  157 ++
 .../sdk/io/gcp/bigquery/WritePartition.java |  163 +-
 .../beam/sdk/io/gcp/bigquery/WriteRename.java   |   71 +-
 .../beam/sdk/io/gcp/bigquery/WriteTables.java   |   58 +-
 .../sdk/io/gcp/bigquery/BigQueryIOTest.java | 1393 +++---
 .../io/gcp/bigquery/FakeBigQueryServices.java   |  166 +++
 .../sdk/io/gcp/bigquery/FakeDatasetService.java |  208 +++
 .../sdk/io/gcp/bigquery/FakeJobService.java |  395 +
 .../sdk/io/gcp/bigquery/TableContainer.java |   61 +
 29 files changed, 2642 insertions(+), 1624 deletions(-)
--



