[jira] [Created] (BEAM-1200) PubsubIO should allow for a user to supply the function which computes the watermark that is reported

2016-12-21 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-1200:
---

             Summary: PubsubIO should allow for a user to supply the function which computes the watermark that is reported
                 Key: BEAM-1200
                 URL: https://issues.apache.org/jira/browse/BEAM-1200
             Project: Beam
          Issue Type: Improvement
          Components: sdk-java-gcp
            Reporter: Luke Cwik
            Assignee: Daniel Halperin
            Priority: Minor


A user wanted to build a watermark function that tracks the data's watermark 
but never falls more than Y minutes behind the current time. PubsubIO does not 
currently support specifying the function that computes and reports the 
watermark.
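
For illustration, the kind of function the user described could look like the sketch below. This is not a PubsubIO API: the class name BoundedLagWatermark and its method are hypothetical, and java.time is used instead of the SDK's own time types only to keep the example self-contained.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper (not part of PubsubIO): clamps how far a watermark may lag. */
final class BoundedLagWatermark {
  private final Duration maxLag; // the "Y minutes" from the description

  BoundedLagWatermark(Duration maxLag) {
    this.maxLag = maxLag;
  }

  /** Returns the data watermark, but never a value earlier than (now - maxLag). */
  Instant compute(Instant dataWatermark, Instant now) {
    Instant floor = now.minus(maxLag);
    return dataWatermark.isBefore(floor) ? floor : dataWatermark;
  }
}
```

With a maximum lag of 5 minutes, a data watermark that is one hour old would be reported as "now minus 5 minutes", while a data watermark that is one minute old would be reported unchanged.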



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-430) Introducing gcpTempLocation that default to tempLocation

2016-12-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-430:
---
Labels: backward-incompatible  (was: )

> Introducing gcpTempLocation that default to tempLocation
> 
>
> Key: BEAM-430
> URL: https://issues.apache.org/jira/browse/BEAM-430
> Project: Beam
>  Issue Type: Improvement
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>  Labels: backward-incompatible
> Fix For: 0.2.0-incubating
>
>
> Currently, DataflowPipelineOptions.stagingLocation defaults to tempLocation, 
> and it requires tempLocation to be a GCS path.
> Similarly, BigQueryIO uses tempLocation and also requires it to be on GCS.
> So, users cannot set tempLocation to a non-GCS path with DataflowRunner or 
> BigQueryIO.
> However, tempLocation could be on any file system. For example, WordCount 
> writes its output to tempLocation by default.
> The proposal is to add gcpTempLocation, which defaults to tempLocation if 
> tempLocation is a GCS path.
> stagingLocation and BigQueryIO will use gcpTempLocation by default.
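
The defaulting rule proposed in the description above could be sketched roughly as follows. This is an illustration, not the Beam implementation: the class GcpTempLocationDefault is hypothetical, and both the "gs://" prefix check and the null return for "no usable default" are assumptions.

```java
/** Hypothetical sketch of the proposed default; not the actual Beam options code. */
final class GcpTempLocationDefault {
  /**
   * Returns the explicit gcpTempLocation if set; otherwise tempLocation when it
   * looks like a GCS path; otherwise null (assumed to mean "no default available").
   */
  static String resolve(String gcpTempLocation, String tempLocation) {
    if (gcpTempLocation != null) {
      return gcpTempLocation;
    }
    // Assumption: a GCS path is recognized by its "gs://" scheme.
    if (tempLocation != null && tempLocation.startsWith("gs://")) {
      return tempLocation;
    }
    return null;
  }
}
```

Under this sketch a pipeline could set tempLocation to a local path, and only the GCP-specific consumers would then need an explicit gcpTempLocation.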





[jira] [Closed] (BEAM-430) Introducing gcpTempLocation that default to tempLocation

2016-12-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik closed BEAM-430.
--



[jira] [Resolved] (BEAM-430) Introducing gcpTempLocation that default to tempLocation

2016-12-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-430.

Resolution: Fixed



[jira] [Reopened] (BEAM-430) Introducing gcpTempLocation that default to tempLocation

2016-12-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reopened BEAM-430:




[jira] [Created] (BEAM-1187) GCP Transport not performing timed backoff after connection failure

2016-12-20 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-1187:
---

             Summary: GCP Transport not performing timed backoff after connection failure
                 Key: BEAM-1187
                 URL: https://issues.apache.org/jira/browse/BEAM-1187
             Project: Beam
          Issue Type: Bug
          Components: runner-dataflow, sdk-java-core, sdk-java-gcp
            Reporter: Luke Cwik
            Assignee: Davor Bonaci
            Priority: Minor


When a connection exception occurs, the HTTP request retries fail and appear to 
be retried immediately. Note that all the timestamps below are the same, and 
also that we are logging too much. This seems to be related to how the chaining 
HTTP request initializer combines the Credential initializer with the 
RetryHttpRequestInitializer that follows it. Also, note that we never log 
"Request failed with IOException, will NOT retry", which implies that the retry 
logic never reached the RetryHttpRequestInitializer.

Action items are:
1) Ensure that the RetryHttpRequestInitializer is used.
2) Ensure that calls do back off.
3) Reduce the logging to one terminal statement saying that we retried X times 
and that the final failure was YYY.
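
As a rough illustration of action items 2 and 3, the sketch below retries a call with a doubling delay and reports a single terminal failure. It is not the RetryHttpRequestInitializer code: the class BackoffRetry, the Sleeper hook and the doubling policy are all assumptions made for the example.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical retry helper; not the SDK's RetryHttpRequestInitializer. */
final class BackoffRetry {
  /** Hook so callers (and tests) can observe or replace the actual sleep. */
  interface Sleeper {
    void sleep(long millis) throws InterruptedException;
  }

  /** Calls the request up to maxAttempts times, doubling the delay after each IOException. */
  static <T> T call(Callable<T> request, int maxAttempts, long initialBackoffMillis, Sleeper sleeper)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    long backoffMillis = initialBackoffMillis;
    IOException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return request.call();
      } catch (IOException e) {
        last = e;
        if (attempt < maxAttempts) {
          sleeper.sleep(backoffMillis); // wait before the next attempt
          backoffMillis *= 2;
        }
      }
    }
    // One terminal statement instead of one log entry per failed attempt.
    throw new IOException(
        "Retried " + (maxAttempts - 1) + " times; final failure was: " + last, last);
  }
}
```

A caller would pass Thread::sleep as the Sleeper in production; a real implementation would likely also add jitter and a cap on the delay.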

Dump of console output:
Dec 20, 2016 9:12:20 AM com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner fromOptions
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 1 files. Enable logging at DEBUG level to see which files will be staged.
Dec 20, 2016 9:12:21 AM com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner run
INFO: Executing pipeline on the Dataflow Service, which will have billing implications related to Google Compute Engine usage and other Google Cloud Services.
Dec 20, 2016 9:12:21 AM com.google.cloud.dataflow.sdk.util.PackageUtil stageClasspathElements
INFO: Uploading 1 files from PipelineOptions.filesToStage to staging location to prepare for execution.
Dec 20, 2016 9:12:21 AM com.google.cloud.dataflow.sdk.util.PackageUtil stageClasspathElements
INFO: Uploading PipelineOptions.filesToStage complete: 1 files newly uploaded, 0 files cached
Dec 20, 2016 9:12:22 AM com.google.api.client.http.HttpRequest execute
WARNING: exception thrown while executing request
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:432)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:527)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:211)
    at sun.net.www.http.HttpClient.New(HttpClient.java:308)
    at sun.net.www.http.HttpClient.New(HttpClient.java:326)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1283)
    at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1258)
    at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77)
    at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
    at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:632)
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:201)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181)
    at com.google.cloud.dataflow.integration.NumbersStreaming.numbersStreamingFromPubsub(NumbersStreaming.java:378)
    at com.google.cloud.dataflow.integration.NumbersStreaming.main(NumbersStreaming.java:831)

Dec 20, 2016 9:12:22 AM com.google.api.client.http.HttpRequest execute
WARNING: exception thrown while executing request

[jira] [Commented] (BEAM-1176) Make our test suites use @Rule TestPipeline

2016-12-20 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764778#comment-15764778
 ] 

Luke Cwik commented on BEAM-1176:
-

I took a look at AvroIOGeneratedClassTest, ApproximateUniqueTest, SampleTest 
and BigtableIOTest. It seemed as though all of the multi-pipeline cases were 
just different variants of the same pipeline. If these tests were broken up 
into multiple tests or, better yet, a set of parameterized tests 
(https://github.com/Pragmatists/junitparams), we could use the test rule.

> Make our test suites use @Rule TestPipeline
> ---
>
> Key: BEAM-1176
> URL: https://issues.apache.org/jira/browse/BEAM-1176
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Stas Levin
>Priority: Minor
>
> Now that [~staslev] has made {{TestPipeline}} a JUnit rule that performs 
> useful sanity checks, we should port all of our tests to it so that they set 
> a good example for users. Maybe we'll even catch some straggling tests with 
> errors :-)





[jira] [Closed] (BEAM-1005) Autogenerate example archetypes as part of build process

2016-12-13 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik closed BEAM-1005.
---
   Resolution: Duplicate
Fix Version/s: Not applicable

Duplicate of BEAM-1004

> Autogenerate example archetypes as part of build process
> 
>
> Key: BEAM-1005
> URL: https://issues.apache.org/jira/browse/BEAM-1005
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-extensions
>Reporter: Kenneth Knowles
> Fix For: Not applicable
>
>
> Previously, the Maven archetypes were manually curated. Recently, the 
> generation of the content for the example archetype was automated, and 
> another Java 8 example archetype was created. The generated content is 
> currently checked into source control, but should instead be generated as 
> part of the build process.





[jira] [Commented] (BEAM-758) Per-step, per-execution nonce

2016-12-09 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736667#comment-15736667
 ] 

Luke Cwik commented on BEAM-758:


I would suggest building this into the new DoFn: add an annotation called 
@Nonce that can be added to methods, and the value will be populated 
automatically.

> Per-step, per-execution nonce
> -
>
> Key: BEAM-758
> URL: https://issues.apache.org/jira/browse/BEAM-758
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Sam McVeety
>
> In the forthcoming runner API, a user will be able to save a pipeline to JSON 
> and then run it repeatedly.
> Many pieces of code (e.g., BigQueryIO.Read or Write) rely on a single random 
> value (nonce). These values are typically generated at apply time, so that 
> they are deterministic (don't change across retries of DoFns) and global (are 
> the same across all workers).
> However, once the runner API lands the existing code would result in the same 
> nonce being reused across jobs. Other possible solutions:
> * Generate nonce in {{Create(1) | ParDo}} then use this as a side input. 
> Should work, as long as side inputs are actually checkpointed. But does not 
> work for {{BoundedSource}}.
> * If a nonce is only needed for the lifetime of one bundle, can be generated 
> in {{startBundle}} and used in {{finishBundle}} [or {{tearDown}}].
> * Add some context somewhere that lets user code access unique step name, and 
> somehow generate a nonce consistently e.g. by hashing. Will usually work, but 
> this is similarly not available to sources.
> Another Q: I'm not sure we have a good way to generate nonces in unbounded 
> pipelines -- we probably need one. This would enable us to, e.g., use 
> {{BigQueryIO.Write}} in an unbounded pipeline [if we had, e.g., exactly-once 
> triggering per window]. Or generalizing to multiple firings...
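
The hashing option mentioned in the description above could be sketched as below. This is only an illustration of the idea: the class StepNonce is hypothetical, and the assumption that user code can obtain an execution identifier and a unique step name is exactly the missing context the description asks for.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch: derives a per-step, per-execution nonce by hashing. */
final class StepNonce {
  /**
   * Returns the same hex string for the same (executionId, stepName) pair, so it is
   * stable across retries and workers but differs between executions and steps.
   */
  static String of(String executionId, String stepName) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      digest.update(executionId.getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator so ("ab", "c") and ("a", "bc") hash differently
      digest.update(stepName.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (byte b : digest.digest()) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is one of the algorithms every Java platform must provide.
      throw new IllegalStateException(e);
    }
  }
}
```

Because the value is derived rather than random, it does not need to be stored in the saved pipeline, which is what causes the reuse across jobs described above.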





[jira] [Commented] (BEAM-682) Invoker Class should be created in Thread Context Classloader

2016-12-02 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717358#comment-15717358
 ] 

Luke Cwik commented on BEAM-682:


ReflectHelpers exposes a method which figures out the correct class loader to 
use:
https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/ReflectHelpers.java#L224

You can't assume the current thread's context class loader is always available, 
since it can be null.
http://stackoverflow.com/questions/3459216/can-the-thread-context-class-loader-be-null
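
A minimal sketch of the general null-safe pattern is shown below. It is not a copy of the ReflectHelpers method linked above; the class name ClassLoaders and the exact fallback order are assumptions for illustration.

```java
/** Hypothetical helper showing a null-safe class loader lookup. */
final class ClassLoaders {
  /**
   * Prefers the thread context class loader, then the loader of the given class,
   * then the system class loader.
   */
  static ClassLoader findClassLoader(Class<?> fallbackOwner) {
    ClassLoader loader = Thread.currentThread().getContextClassLoader();
    if (loader == null) {
      // May itself be null when the class was loaded by the bootstrap loader.
      loader = fallbackOwner.getClassLoader();
    }
    if (loader == null) {
      loader = ClassLoader.getSystemClassLoader();
    }
    return loader;
  }
}
```

Code that injects a generated class could then call something like ClassLoaders.findClassLoader(DoFnInvokers.class) instead of reading the context class loader directly.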

> Invoker Class should be created in Thread Context Classloader
> -
>
> Key: BEAM-682
> URL: https://issues.apache.org/jira/browse/BEAM-682
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Affects Versions: 0.3.0-incubating
>Reporter: Sumit Chawla
>Assignee: Sumit Chawla
>
> As of now the InvokerClass is being loaded in the wrong classloader. It 
> should be loaded into Thread.currentThread().getContextClassLoader()
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvokers.java#L167
> {code}
>  Class<? extends DoFnInvoker<?, ?>> res =
>      (Class<? extends DoFnInvoker<?, ?>>)
>          unloaded
>              .load(DoFnInvokers.class.getClassLoader(), 
>                  ClassLoadingStrategy.Default.INJECTION)
>              .getLoaded();
> {code}
> Fix:
> {code}
>  Class<? extends DoFnInvoker<?, ?>> res =
>      (Class<? extends DoFnInvoker<?, ?>>)
>          unloaded
>              .load(Thread.currentThread().getContextClassLoader(),
>                  ClassLoadingStrategy.Default.INJECTION)
>              .getLoaded();
> {code}





[jira] [Commented] (BEAM-1061) PreCommit test with side inputs

2016-11-29 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707206#comment-15707206
 ] 

Luke Cwik commented on BEAM-1061:
-

Is this not covered by the RunnableOnService tests found in ViewTest.java?

> PreCommit test with side inputs
> ---
>
> Key: BEAM-1061
> URL: https://issues.apache.org/jira/browse/BEAM-1061
> Project: Beam
>  Issue Type: Test
>  Components: testing
>Reporter: Daniel Halperin
>Assignee: Daniel Halperin
>
> We should have at least one precommit integration test that exercises side 
> inputs on all runners. Existing tests exercise sources, files, per-key, 
> combiners, windowing, ...; side inputs are one part of the model that is 
> missing and that it would be nice to cover.





[jira] [Commented] (BEAM-1024) upgrade to protobuf-3.1.0

2016-11-21 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15684774#comment-15684774
 ] 

Luke Cwik commented on BEAM-1024:
-

There was an upgrade to protobuf 3.0.0 for Apache Beam in commit 
https://github.com/apache/incubator-beam/commit/f93ca9ce803a8847a7178ff0d7c5e1631bed8f2d

Upgrading to 3.1.0 would require either shading protobuf everywhere or making 
sure that all our dependencies use protobuf 3.1.0.

> upgrade to protobuf-3.1.0
> -
>
> Key: BEAM-1024
> URL: https://issues.apache.org/jira/browse/BEAM-1024
> Project: Beam
>  Issue Type: Wish
>Reporter: Rafael Fernandez
>
> The SDK currently uses protobuf 3.0.0-beta-1. There are critical improvements 
> to the library since (such as JsonFormat.parser().ignoringUnknownFields()).





[jira] [Commented] (BEAM-950) DoFn Setup and Teardown methods should have access to PipelineOptions

2016-11-10 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654527#comment-15654527
 ] 

Luke Cwik commented on BEAM-950:


The primary use case is getting access to things like credentials/executor 
service.

> DoFn Setup and Teardown methods should have access to PipelineOptions
> -
>
> Key: BEAM-950
> URL: https://issues.apache.org/jira/browse/BEAM-950
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Thomas Groh
>Assignee: Davor Bonaci
>
> This enables any options-relevant decisions to be made once per DoFn, without 
> having to lazily initialize in {{startBundle}}





[jira] [Resolved] (BEAM-626) AvroCoder not deserializing correctly in Kryo

2016-11-08 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-626.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> AvroCoder not deserializing correctly in Kryo
> -
>
> Key: BEAM-626
> URL: https://issues.apache.org/jira/browse/BEAM-626
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> Unlike with Java serialization, when deserializing AvroCoder using Kryo, the 
> resulting AvroCoder is missing all of its transient fields.
> The reason it works with Java serialization is the use of writeReplace and 
> readResolve, which Kryo does not honor.
> ProtoCoder, for example, also has unserializable members. It solves this by 
> lazily initializing these members via their getters, so they are initialized 
> in the deserialized object on the first call to the getter.
> It seems AvroCoder is the only class in Beam to use the writeReplace 
> convention.
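
The lazy-initialization approach described above for ProtoCoder can be illustrated with the sketch below. LazyFieldsCoder is a hypothetical stand-in, not AvroCoder or ProtoCoder, and plain Java serialization is used only to produce an object whose transient field is unset; Kryo itself is not exercised here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;

/** Hypothetical coder-like class whose derived state is rebuilt on demand. */
final class LazyFieldsCoder implements Serializable {
  private static final long serialVersionUID = 1L;

  private final String schema;           // serialized state, e.g. "id,name"
  private transient List<String> fields; // derived state, never serialized

  LazyFieldsCoder(String schema) {
    this.schema = schema;
  }

  /** Rebuilds the transient member on first use, including after deserialization. */
  List<String> getFields() {
    if (fields == null) {
      fields = Arrays.asList(schema.split(","));
    }
    return fields;
  }

  /** Serializes and deserializes the coder; the copy starts with fields == null. */
  static LazyFieldsCoder roundTrip(LazyFieldsCoder coder)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(coder);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      return (LazyFieldsCoder) in.readObject();
    }
  }
}
```

Because every access goes through getFields(), the class does not depend on writeReplace or readResolve being called by the serializer. The getter is not synchronized, so a real coder shared across threads would need to account for that.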





[jira] [Commented] (BEAM-939) New credentials code broke Dataflow runner

2016-11-07 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646506#comment-15646506
 ] 

Luke Cwik commented on BEAM-939:


Yes, did that in https://github.com/apache/incubator-beam/pull/1308

Unfortunately, the fact that we had to pass around the BigtableService instance 
for testing reasons made this more difficult than just deferring to 
PipelineOptions, when it's available, in the few places we need the service.

Added you and Thomas Groh for review.

> New credentials code broke Dataflow runner
> --
>
> Key: BEAM-939
> URL: https://issues.apache.org/jira/browse/BEAM-939
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/
> {code}
> java.lang.NoSuchMethodError: com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials;
>   at com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207)
>   at com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112)
>   at com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94)
>   at com.google.cloud.bigtable.grpc.BigtableSession.<init>(BigtableSession.java:272)
>   at org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81)
>   at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296)
>   at org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185)
>   at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399)
>   at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307)
>   at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
>   at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158)
>   at org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Updated] (BEAM-939) New credentials code broke Dataflow runner

2016-11-07 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-939:
---
Priority: Minor  (was: Major)



[jira] [Commented] (BEAM-939) New credentials code broke Dataflow runner

2016-11-07 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646147#comment-15646147
 ] 

Luke Cwik commented on BEAM-939:


Turns out that BigtableIO was never using the user-specified credentials and 
was instead relying on the application default via 
com.google.auth.GoogleCredentials. Unfortunately, between 0.4.0 and 0.6.0, 
com.google.auth.oauth2.GoogleCredentials had a backwards-incompatible change 
(it removed the method 
com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials).

It seems like the proper way to fix this is to pass the 
com.google.auth.Credentials object through from pipeline options.

> New credentials code broke Dataflow runner
> --
>
> Key: BEAM-939
> URL: https://issues.apache.org/jira/browse/BEAM-939
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: 0.4.0-incubating
>
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/
> {code}
> java.lang.NoSuchMethodError: 
> com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials;
>   at 
> com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207)
>   at 
> com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112)
>   at 
> com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94)
>   at 
> com.google.cloud.bigtable.grpc.BigtableSession.(BigtableSession.java:272)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185)
>   at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399)
>   at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307)
>   at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
>   at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Commented] (BEAM-939) New credentials code broke Dataflow runner

2016-11-07 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646116#comment-15646116
 ] 

Luke Cwik commented on BEAM-939:


Taking a look

> New credentials code broke Dataflow runner
> --
>
> Key: BEAM-939
> URL: https://issues.apache.org/jira/browse/BEAM-939
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-gcp
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Luke Cwik
> Fix For: 0.4.0-incubating
>
>
> https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/
> {code}
> java.lang.NoSuchMethodError: 
> com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials;
>   at 
> com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207)
>   at 
> com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112)
>   at 
> com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94)
>   at 
> com.google.cloud.bigtable.grpc.BigtableSession.<init>(BigtableSession.java:272)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185)
>   at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399)
>   at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307)
>   at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47)
>   at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158)
>   at 
> org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>   at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>   at 
> org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}





[jira] [Resolved] (BEAM-725) Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow

2016-11-07 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-725.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Remove legacy credentials flags related to GCP and adopt application default 
> credentials as only supported default flow
> ---
>
> Key: BEAM-725
> URL: https://issues.apache.org/jira/browse/BEAM-725
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
>  Labels: backward-incompatible
> Fix For: 0.4.0-incubating
>
>
> Drop the following GcpOptions and use ADC 
> (https://developers.google.com/identity/protocols/application-default-credentials)
>  to clean up the credentials story for GCP:
> AuthorizationServerEncodedUrl
> TokenServerUrl
> CredentialDir
> CredentialId
> SecretsFile
> ServiceAccountName
> ServiceAccountKeyfile
> Also migrate from Apiary Credentials class to Google OAuth Credentials class 
> when available from google-cloud-java.





[jira] [Resolved] (BEAM-582) Allow usage of the new GCP service account JSON key

2016-11-07 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-582.

   Resolution: Won't Fix
Fix Version/s: Not applicable

> Allow usage of the new GCP service account JSON key
> ---
>
> Key: BEAM-582
> URL: https://issues.apache.org/jira/browse/BEAM-582
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Alex Van Boxel
>Assignee: Davor Bonaci
> Fix For: Not applicable
>
>
> The new JSON service account files are a lot easier to use: you don't need to 
> provide the accountId (it's embedded in the JSON files, along with the 
> private key).
> I noticed this while integrating Cloud Dataflow in Apache Airflow, where I 
> upgraded the usage of the service keys. Airflow will drop support for the old 
> service files.





[jira] [Commented] (BEAM-582) Allow usage of the new GCP service account JSON key

2016-11-07 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644918#comment-15644918
 ] 

Luke Cwik commented on BEAM-582:


This is being superseded by https://issues.apache.org/jira/browse/BEAM-725

> Allow usage of the new GCP service account JSON key
> ---
>
> Key: BEAM-582
> URL: https://issues.apache.org/jira/browse/BEAM-582
> Project: Beam
>  Issue Type: Improvement
>  Components: runner-dataflow
>Reporter: Alex Van Boxel
>Assignee: Davor Bonaci
>
> The new JSON service account files are a lot easier to use: you don't need to 
> provide the accountId (it's embedded in the JSON files, along with the 
> private key).
> I noticed this while integrating Cloud Dataflow in Apache Airflow, where I 
> upgraded the usage of the service keys. Airflow will drop support for the old 
> service files.





[jira] [Commented] (BEAM-898) BigQueryTornadoes IT has invalid PipelineOptions

2016-11-04 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637972#comment-15637972
 ] 

Luke Cwik commented on BEAM-898:


Still failing:
https://builds.apache.org/job/beam_PostCommit_MavenVerify/org.apache.beam$beam-examples-java/1734/testReport/

Error Message

Expected getter for property [output] to be marked with @Default on all 
[org.apache.beam.examples.WordCount$WordCountOptions, 
org.apache.beam.examples.cookbook.BigQueryTornadoes$Options], found only on 
[org.apache.beam.examples.WordCount$WordCountOptions]
Stacktrace

java.lang.IllegalArgumentException: Expected getter for property [output] to be 
marked with @Default on all 
[org.apache.beam.examples.WordCount$WordCountOptions, 
org.apache.beam.examples.cookbook.BigQueryTornadoes$Options], found only on 
[org.apache.beam.examples.WordCount$WordCountOptions]
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.throwForGettersWithInconsistentAnnotation(PipelineOptionsFactory.java:1309)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.validateGettersHaveConsistentAnnotation(PipelineOptionsFactory.java:1150)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.validateMethodAnnotations(PipelineOptionsFactory.java:1065)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.validateClass(PipelineOptionsFactory.java:995)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:627)
at 
org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:561)
at 
org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2EBigQueryTornadoes(BigQueryTornadoesIT.java:50)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
at 
org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

> BigQueryTornadoes IT has invalid PipelineOptions
> 
>
> Key: BEAM-898
> URL: https://issues.apache.org/jira/browse/BEAM-898
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp, testing
>Reporter: Daniel Halperin
>Assignee: Mark Liu
>
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1718/
> This PR: https://github.com/apache/incubator-beam/pull/1159
> checks that pipeline options cannot have multiple incompatible defaults.
> BigQueryTornadoes ITs have a problem with how they register pipeline options. 
> Luke can give more details on fix.
> cc [~pei...@gmail.com] [~lcwik] [~jasonkuster]





[jira] [Updated] (BEAM-790) Validate PipelineOptions Default annotation

2016-11-03 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-790:
---
Priority: Minor  (was: Major)

> Validate PipelineOptions Default annotation
> ---
>
> Key: BEAM-790
> URL: https://issues.apache.org/jira/browse/BEAM-790
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> It shouldn't allow @Override with a @Default annotation. For example, the 
> following is broken:
> interface A {
>   @Default.Integer(1)
>   Integer getFoo();
>   void setFoo(Integer value);
> }
> interface B extends A {
>   @Default.Integer(-1)
>   @Override
>   Integer getFoo();
> }
> It is broken because PipelineOptions default values are lazily evaluated, 
> so the result depends on which of the following two operations happens 
> first:
> options.as(A.class) and options.as(B.class)
> If users want to change the default value, they should call setFoo(...) 
> explicitly.
> It shouldn't allow adding a @Default annotation in an override either.
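
The ordering problem described above can be reproduced without Beam. The following is only a minimal sketch under stated assumptions: DefaultInt, Options and the proxy handler are hypothetical stand-ins written for this illustration, not Beam's @Default.Integer or its PipelineOptions implementation. It shows that when a default is resolved lazily on first read, the view that reads first decides the value every other view sees, and that an explicit setFoo(...) removes the ambiguity.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

class LazyDefaultDemo {
  // Hypothetical stand-in for a default-value annotation; not Beam's @Default.Integer.
  @Retention(RetentionPolicy.RUNTIME)
  @interface DefaultInt {
    int value();
  }

  interface A {
    @DefaultInt(1)
    Integer getFoo();

    void setFoo(Integer value);
  }

  interface B extends A {
    @DefaultInt(-1)
    @Override
    Integer getFoo();
  }

  // Simplified options object: all views share one value map.
  static final class Options {
    private final Map<String, Object> values = new HashMap<>();

    <T> T as(Class<T> view) {
      InvocationHandler handler = (proxy, method, args) -> {
        String property = method.getName().substring(3);
        if (method.getName().startsWith("set")) {
          values.put(property, args[0]);
          return null;
        }
        if (!values.containsKey(property)) {
          // Lazy evaluation: the default comes from the view that reads first.
          DefaultInt dflt = view.getMethod(method.getName()).getAnnotation(DefaultInt.class);
          values.put(property, dflt.value());
        }
        return values.get(property);
      };
      return view.cast(
          Proxy.newProxyInstance(view.getClassLoader(), new Class<?>[] {view}, handler));
    }
  }

  // Reads foo through one view first, then reports what view A observes.
  static int fooSeenAfter(boolean readThroughAFirst) {
    Options options = new Options();
    if (readThroughAFirst) {
      options.as(A.class).getFoo();
    } else {
      options.as(B.class).getFoo();
    }
    return options.as(A.class).getFoo();
  }

  // An explicit set is independent of read order.
  static int fooAfterExplicitSet(int value) {
    Options options = new Options();
    options.as(B.class).setFoo(value);
    return options.as(A.class).getFoo();
  }

  public static void main(String[] args) {
    System.out.println("A read first: " + fooSeenAfter(true));   // 1
    System.out.println("B read first: " + fooSeenAfter(false));  // -1
    System.out.println("explicit set: " + fooAfterExplicitSet(42)); // 42
  }
}
```

The same view (A) reports 1 or -1 depending only on which getter ran first, which is the inconsistency the validation is meant to reject.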





[jira] [Resolved] (BEAM-790) Validate PipelineOptions Default annotation

2016-11-03 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-790.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Validate PipelineOptions Default annotation
> ---
>
> Key: BEAM-790
> URL: https://issues.apache.org/jira/browse/BEAM-790
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Pei He
>Assignee: Pei He
> Fix For: 0.4.0-incubating
>
>
> It shouldn't allow @Override with a @Default annotation. For example, the 
> following is broken:
> interface A {
>   @Default.Integer(1)
>   Integer getFoo();
>   void setFoo(Integer value);
> }
> interface B extends A {
>   @Default.Integer(-1)
>   @Override
>   Integer getFoo();
> }
> It is broken because PipelineOptions default values are lazily evaluated, 
> so the result depends on which of the following two operations happens 
> first:
> options.as(A.class) and options.as(B.class)
> If users want to change the default value, they should call setFoo(...) 
> explicitly.
> It shouldn't allow adding a @Default annotation in an override either.





[jira] [Updated] (BEAM-874) sdks/java/microbenchmarks instructions incorrect since benchmarks no longer run

2016-11-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-874:
---
Description: 
microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which 
leads to this failure upon executing
"java -jar target/microbenchmarks.jar":

No matching benchmarks. Miss-spelled regexp?
Use EXTRA verbose mode to debug the pattern matching.

Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid 
usage of mvn install when possible. An alternate suggestion could be:
mvn package -pl sdks/java/microbenchmarks -am

  was:
microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which 
leads to this failure upon executing
```java -jar target/microbenchmarks.jar```:

No matching benchmarks. Miss-spelled regexp?
Use EXTRA verbose mode to debug the pattern matching.

Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid 
usage of mvn install when possible. An alternate suggestion could be:
mvn package -pl sdks/java/microbenchmarks -am


> sdks/java/microbenchmarks instructions incorrect since benchmarks no longer 
> run
> ---
>
> Key: BEAM-874
> URL: https://issues.apache.org/jira/browse/BEAM-874
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Davor Bonaci
>
> microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which 
> leads to this failure upon executing
> "java -jar target/microbenchmarks.jar":
> No matching benchmarks. Miss-spelled regexp?
> Use EXTRA verbose mode to debug the pattern matching.
> Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid 
> usage of mvn install when possible. An alternate suggestion could be:
> mvn package -pl sdks/java/microbenchmarks -am





[jira] [Created] (BEAM-874) sdks/java/microbenchmarks instructions and execution no longer function

2016-11-01 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-874:
--

 Summary: sdks/java/microbenchmarks instructions and execution no 
longer function
 Key: BEAM-874
 URL: https://issues.apache.org/jira/browse/BEAM-874
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci


microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which 
leads to this failure upon executing
```java -jar target/microbenchmarks.jar```:

No matching benchmarks. Miss-spelled regexp?
Use EXTRA verbose mode to debug the pattern matching.

Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid 
usage of mvn install when possible. An alternate suggestion could be:
mvn package -pl sdks/java/microbenchmarks -am





[jira] [Updated] (BEAM-874) sdks/java/microbenchmarks instructions incorrect since benchmarks no longer run

2016-11-01 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-874:
---
Summary: sdks/java/microbenchmarks instructions incorrect since benchmarks 
no longer run  (was: sdks/java/microbenchmarks instructions and execution no 
longer function)

> sdks/java/microbenchmarks instructions incorrect since benchmarks no longer 
> run
> ---
>
> Key: BEAM-874
> URL: https://issues.apache.org/jira/browse/BEAM-874
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Davor Bonaci
>
> microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which 
> leads to this failure upon executing
> ```java -jar target/microbenchmarks.jar```:
> No matching benchmarks. Miss-spelled regexp?
> Use EXTRA verbose mode to debug the pattern matching.
> Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid 
> usage of mvn install when possible. An alternate suggestion could be:
> mvn package -pl sdks/java/microbenchmarks -am





[jira] [Updated] (BEAM-822) SDK build writes timestamp to source tree, causing spurious builds

2016-10-31 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-822:
---
Priority: Minor  (was: Major)

> SDK build writes timestamp to source tree, causing spurious builds
> --
>
> Key: BEAM-822
> URL: https://issues.apache.org/jira/browse/BEAM-822
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> The SDK build puts the build timestamp into {{sdk.properties}}. To have a 
> timestamp that does not break incremental build, the right place for it is in 
> the manifest of the built artifact.
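
As a rough illustration of reading a timestamp from an artifact manifest rather than from a properties file in the source tree, here is a minimal sketch using only java.util.jar. The attribute name Build-Timestamp and the helper names are assumptions made for this example; the issue does not say which attribute the build should write.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

class ManifestTimestampDemo {
  // Hypothetical attribute name, chosen for illustration only.
  static final String KEY = "Build-Timestamp";

  // Builds an in-memory MANIFEST.MF, standing in for what the build would emit.
  static byte[] sampleManifest(String timestamp) {
    Manifest manifest = new Manifest();
    // Manifest.write emits nothing unless Manifest-Version is present.
    manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
    manifest.getMainAttributes().putValue(KEY, timestamp);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      manifest.write(out);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }

  // Reads the timestamp back from manifest bytes; returns null if absent.
  static String readTimestamp(byte[] manifestBytes) {
    try {
      Manifest manifest = new Manifest(new ByteArrayInputStream(manifestBytes));
      return manifest.getMainAttributes().getValue(KEY);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] bytes = sampleManifest("2016-10-31T12:00:00Z");
    System.out.println(readTimestamp(bytes));
  }
}
```

Because the manifest is generated into the built artifact, a changing timestamp does not touch any tracked source file.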





[jira] [Resolved] (BEAM-822) SDK build writes timestamp to source tree, causing spurious builds

2016-10-31 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-822.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> SDK build writes timestamp to source tree, causing spurious builds
> --
>
> Key: BEAM-822
> URL: https://issues.apache.org/jira/browse/BEAM-822
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> The SDK build puts the build timestamp into {{sdk.properties}}. To have a 
> timestamp that does not break incremental build, the right place for it is in 
> the manifest of the built artifact.





[jira] [Updated] (BEAM-822) SDK build writes timestamp to source tree, causing spurious builds

2016-10-31 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-822:
---
Issue Type: Improvement  (was: Bug)

> SDK build writes timestamp to source tree, causing spurious builds
> --
>
> Key: BEAM-822
> URL: https://issues.apache.org/jira/browse/BEAM-822
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Kenneth Knowles
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> The SDK build puts the build timestamp into {{sdk.properties}}. To have a 
> timestamp that does not break incremental build, the right place for it is in 
> the manifest of the built artifact.





[jira] [Commented] (BEAM-626) AvroCoder not deserializing correctly in Kryo

2016-10-31 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622892#comment-15622892
 ] 

Luke Cwik commented on BEAM-626:


I'm not for/against fixing AvroCoder to work with Kryo, just pointing out that 
the problem is that we lack a spec that says how things need to be serializable 
for portability reasons. Until we get the Beam Runner API 
[https://issues.apache.org/jira/browse/BEAM-115] up and going we will continue 
to run into these issues.

Dataflow relied on Java serialization for DoFns, and Jackson for Coders. Spark 
relies on Kryo. Another runner may bring in yet another way of serializing 
DoFns, Coders, and so on.

Still looking at the PR.

> AvroCoder not deserializing correctly in Kryo
> -
>
> Key: BEAM-626
> URL: https://issues.apache.org/jira/browse/BEAM-626
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Minor
>
> Unlike with Java serialization, when deserializing AvroCoder using Kryo, the 
> resulting AvroCoder is missing all of its transient fields.
> Java serialization works because AvroCoder uses writeReplace and readResolve, 
> which Kryo does not honor.
> ProtoCoder, for example, also has unserializable members; it solves this by 
> lazily initializing them in their getters, so they are initialized in the 
> deserialized object on the first call to the getter.
> It seems AvroCoder is the only class in Beam to use the writeReplace convention.
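
The lazy-initialization approach attributed to ProtoCoder above can be sketched in plain Java. This is a hypothetical illustration, not ProtoCoder's or AvroCoder's actual code: Holder and Parsed are made-up names, and the round trip uses Java serialization because Kryo is not part of the JDK. The point is that a transient member rebuilt in its getter needs no readResolve hook, so it should not matter whether the serializer honors such hooks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;

class LazyTransientDemo {
  // Stand-in for a helper that cannot be serialized (hypothetical).
  static final class Parsed {
    final String text;

    Parsed(String source) {
      this.text = source.toUpperCase(Locale.ROOT);
    }
  }

  // Coder-like holder: only 'source' is serialized, 'parsed' is derived state.
  static final class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String source;
    private transient Parsed parsed;

    Holder(String source) {
      this.source = source;
    }

    // Rebuilds the transient member on first use after deserialization.
    synchronized Parsed getParsed() {
      if (parsed == null) {
        parsed = new Parsed(source);
      }
      return parsed;
    }
  }

  // Serializes and deserializes the holder in memory.
  static Holder roundTrip(Holder in) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(in);
      }
      try (ObjectInputStream objIn =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Holder) objIn.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Holder copy = roundTrip(new Holder("schema"));
    System.out.println(copy.getParsed().text);
  }
}
```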





[jira] [Commented] (BEAM-626) AvroCoder not deserializing correctly in Kryo

2016-10-31 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622626#comment-15622626
 ] 

Luke Cwik commented on BEAM-626:


[~amitsela] I was referring to DoFns/Coders/... that are being serialized via 
Kryo, not to the user's data, which is encoded/decoded using Coders.

> AvroCoder not deserializing correctly in Kryo
> -
>
> Key: BEAM-626
> URL: https://issues.apache.org/jira/browse/BEAM-626
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Minor
>
> Unlike with Java serialization, when deserializing AvroCoder using Kryo, the 
> resulting AvroCoder is missing all of its transient fields.
> Java serialization works because AvroCoder uses writeReplace and readResolve, 
> which Kryo does not honor.
> ProtoCoder, for example, also has unserializable members; it solves this by 
> lazily initializing them in their getters, so they are initialized in the 
> deserialized object on the first call to the getter.
> It seems AvroCoder is the only class in Beam to use the writeReplace convention.





[jira] [Commented] (BEAM-398) JAXBCoder uses incorrect Double-Checked Locking

2016-10-31 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622594#comment-15622594
 ] 

Luke Cwik commented on BEAM-398:


Was merged here: 
https://github.com/apache/incubator-beam/commit/c29afb119be034b6b93083d9e8ec5542f13b4373

> JAXBCoder uses incorrect Double-Checked Locking
> ---
>
> Key: BEAM-398
> URL: https://issues.apache.org/jira/browse/BEAM-398
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Thomas Groh
>Priority: Minor
>  Labels: findbugs, newbie, starter
> Fix For: 0.3.0-incubating
>
>
> [FindBugs 
> DC_DOUBLECHECK|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L72]:
>  Possible double check of field
> Applies to: 
> [JAXBCoder.getContext|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java#L113].
>  For details on why this is incorrect, see: 
> http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.
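
For reference, one commonly used correct form of double-checked locking on Java 5 and later relies on a volatile field. The sketch below is an assumption-labeled illustration, not the change that was merged for this issue: Context is a placeholder for the expensive object (JAXBCoder builds a JAXB context) and the creation counter exists only to make the behavior observable.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyContextDemo {
  // Placeholder for an expensive, lazily built object (hypothetical).
  static final class Context {}

  private final AtomicInteger created = new AtomicInteger();

  // volatile is what makes the unsynchronized first check safe.
  private volatile Context context;

  Context getContext() {
    Context local = context; // single volatile read on the fast path
    if (local == null) {
      synchronized (this) {
        local = context; // re-check under the lock
        if (local == null) {
          created.incrementAndGet();
          local = new Context();
          context = local; // publish the fully constructed object
        }
      }
    }
    return local;
  }

  // Hammers getContext from several threads and reports how many objects were built.
  static int creationsUnderContention(int threadCount) {
    LazyContextDemo holder = new LazyContextDemo();
    Thread[] threads = new Thread[threadCount];
    for (int i = 0; i < threadCount; i++) {
      threads[i] = new Thread(holder::getContext);
      threads[i].start();
    }
    for (Thread thread : threads) {
      try {
        thread.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return holder.created.get();
  }

  public static void main(String[] args) {
    System.out.println("contexts created: " + creationsUnderContention(8));
  }
}
```

Without volatile, a thread may observe a non-null reference to a partially constructed object, which is the hazard the linked memory-model article describes.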





[jira] [Resolved] (BEAM-398) JAXBCoder uses incorrect Double-Checked Locking

2016-10-31 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-398.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> JAXBCoder uses incorrect Double-Checked Locking
> ---
>
> Key: BEAM-398
> URL: https://issues.apache.org/jira/browse/BEAM-398
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Scott Wegner
>Assignee: Thomas Groh
>Priority: Minor
>  Labels: findbugs, newbie, starter
> Fix For: 0.3.0-incubating
>
>
> [FindBugs 
> DC_DOUBLECHECK|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L72]:
>  Possible double check of field
> Applies to: 
> [JAXBCoder.getContext|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java#L113].
>  For details on why this is incorrect, see: 
> http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
> This is a good starter bug. When fixing, please remove the corresponding 
> entries from 
> [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml]
>  and verify the build passes.





[jira] [Commented] (BEAM-626) AvroCoder not deserializing correctly in Kryo

2016-10-28 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15617003#comment-15617003
 ] 

Luke Cwik commented on BEAM-626:


This will only solve the short-term problem that AvroCoder is not serializable 
via Kryo.

Users who write their own objects that rely on readResolve will still have the 
same problem that you're facing with AvroCoder. They will need to do additional 
work to get their objects to work with the Spark runner.

We'll need an official schema / serialization story for many of the objects 
used such as Coder/DoFn/... to be part of the Beam model for portability 
reasons but until then it seems worthwhile to fix this.
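
To make the readResolve dependency concrete, here is a minimal sketch of the writeReplace/readResolve pattern in plain Java. All class names are hypothetical and this is not AvroCoder's code. Java serialization calls both hooks, so the transient member is rebuilt through the constructor; a serializer that copies fields without calling the hooks would presumably leave that member null, which matches the symptom reported in this issue.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Locale;

class SerializationProxyDemo {
  // Stand-in for a non-serializable helper (hypothetical).
  static final class Parsed {
    final String text;

    Parsed(String source) {
      this.text = source.toUpperCase(Locale.ROOT);
    }
  }

  static final class Holder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String source;
    private final transient Parsed parsed; // only ever set by the constructor

    Holder(String source) {
      this.source = source;
      this.parsed = new Parsed(source);
    }

    Parsed parsed() {
      return parsed;
    }

    // Java serialization writes this replacement object instead of the Holder.
    private Object writeReplace() {
      return new SerializedForm(source);
    }
  }

  static final class SerializedForm implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String source;

    SerializedForm(String source) {
      this.source = source;
    }

    // Java serialization swaps the proxy back for a freshly constructed Holder.
    private Object readResolve() {
      return new Holder(source);
    }
  }

  // Serializes and deserializes the holder in memory with Java serialization.
  static Holder roundTrip(Holder in) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(in);
      }
      try (ObjectInputStream objIn =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (Holder) objIn.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    Holder copy = roundTrip(new Holder("schema"));
    System.out.println(copy.parsed().text);
  }
}
```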

> AvroCoder not deserializing correctly in Kryo
> -
>
> Key: BEAM-626
> URL: https://issues.apache.org/jira/browse/BEAM-626
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Aviem Zur
>Assignee: Aviem Zur
>Priority: Minor
>
> Unlike with Java serialization, when deserializing AvroCoder using Kryo, the 
> resulting AvroCoder is missing all of its transient fields.
> Java serialization works because AvroCoder uses writeReplace and readResolve, 
> which Kryo does not honor.
> ProtoCoder, for example, also has unserializable members; it solves this by 
> lazily initializing them in their getters, so they are initialized in the 
> deserialized object on the first call to the getter.
> It seems AvroCoder is the only class in Beam to use the writeReplace convention.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-813) Support metadata in Avro sink

2016-10-26 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-813.

   Resolution: Fixed
Fix Version/s: 0.4.0-incubating

> Support metadata in Avro sink
> -
>
> Key: BEAM-813
> URL: https://issues.apache.org/jira/browse/BEAM-813
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Neville Li
>Assignee: Neville Li
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> It'd be nice to support custom metadata in Avro files. This change is similar 
> to [BEAM-701].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-813) Support metadata in Avro sink

2016-10-26 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610102#comment-15610102
 ] 

Luke Cwik commented on BEAM-813:


Merged to master here: 
https://github.com/apache/incubator-beam/commit/eba099f564dba3dfbba30ae3533496b9e14f57a7

> Support metadata in Avro sink
> -
>
> Key: BEAM-813
> URL: https://issues.apache.org/jira/browse/BEAM-813
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Neville Li
>Assignee: Neville Li
>Priority: Minor
> Fix For: 0.4.0-incubating
>
>
> It'd be nice to support custom metadata in Avro files. This change is similar 
> to [BEAM-701].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-803) Maven configuration that easily launches examples IT tests on one specific runner

2016-10-24 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602292#comment-15602292
 ] 

Luke Cwik commented on BEAM-803:


Does it fail because the ServiceLoader files are being ignored during bundling 
rather than merged?

> Maven configuration that easily launches examples IT tests on one specific 
> runner
> -
>
> Key: BEAM-803
> URL: https://issues.apache.org/jira/browse/BEAM-803
> Project: Beam
>  Issue Type: Wish
>Reporter: Kenneth Knowles
>Assignee: Jason Kuster
>Priority: Minor
>
> Today, there is {{-Pjenkins-precommit}} that activates separate executions 
> for each of the runners, but no easy way to invoke just one of those 
> executions that I can discern.
> The most promising command that I can come up with to run, for example, the 
> Flink wordcount integration test, is {{mvn 
> failsafe:integration-test@flink-runner-integration-tests -Pjenkins-precommit 
> -pl examples/java/}} but this fails due to runner registrar issues. Ideally, 
> this would be a fail-proof one-liner.
> Any tips?





[jira] [Commented] (BEAM-769) Spark streaming tests fail on "nothing processed" if runtime env. is slow because timeout is hit before processing is done.

2016-10-21 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595461#comment-15595461
 ] 

Luke Cwik commented on BEAM-769:


I would prefer a bigger timeout over having flaky tests if we couldn't make it 
deterministic in some way. If a test never flakes, people won't have to look at 
it.

> Spark streaming tests fail on "nothing processed" if runtime env. is slow 
> because timeout is hit before processing is done.
> ---
>
> Key: BEAM-769
> URL: https://issues.apache.org/jira/browse/BEAM-769
> Project: Beam
>  Issue Type: Bug
>  Components: runner-spark
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Amit Sela
>
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1586/
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1587/
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1588/
> {code}
> org.apache.beam.runners.spark.translation.streaming.FlattenStreamingTest.testFlattenUnbounded
> org.apache.beam.runners.spark.translation.streaming.KafkaStreamingTest.testRun
> org.apache.beam.runners.spark.translation.streaming.SimpleStreamingWordCountTest.testFixedWindows
> {code}
> The above tests use a hard timeout (ungraceful stop), so if the runtime env. 
> is slow enough that the batch is not done, the test stops anyway, asserts, 
> and rightfully fails.
> It's difficult to reproduce locally because I never had trouble on my laptop.
> Since Jenkins will be slow from time to time, it is reasonable to have 
> a more robust solution here:
> # don't use checkpoint (Spark) if not necessary - only really necessary for 
> one test in {{KafkaStreamingTest}} and {{ResumeFromCheckpointStreamingTest}} 
> I think.
> # allow for graceful stop - will take longer for each test, but should allow 
> the test to finish even if runtime env. is slow.





[jira] [Created] (BEAM-779) filesToStage should allow for common ways in which people package their resources

2016-10-19 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-779:
--

 Summary: filesToStage should allow for common ways in which people 
package their resources
 Key: BEAM-779
 URL: https://issues.apache.org/jira/browse/BEAM-779
 Project: Beam
  Issue Type: Improvement
  Components: runner-dataflow
Reporter: Luke Cwik
Assignee: Davor Bonaci


Different application environments launch and maintain their classpath 
resources in various ways. See these SO questions for examples of how people 
launch their pipelines in ways that are currently unsupported:
http://stackoverflow.com/questions/31978566/detectclasspathresourcestostage-unable-to-convert-url
http://stackoverflow.com/questions/40099952/launching-dataflow-jobs-from-a-java-application

Add support for classpaths which:
* use URLs that are embedded within other jars
* rely on manifest files that specify their classpaths
* use rsrc:// URIs, as produced by Eclipse jar packaging
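
For the manifest case, a minimal sketch of the expansion step might look like the following. It only assumes the standard java.util.jar API; ManifestClasspath and resolve are hypothetical names, and how the result would be wired into filesToStage is not specified here.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical helper; not an existing Beam class. */
final class ManifestClasspath {
  private ManifestClasspath() {}

  /**
   * Expands the Class-Path attribute of a jar manifest into URIs.
   * Entries are whitespace separated and relative to the jar's own location.
   */
  static List<URI> resolve(URI jarLocation, Manifest manifest) {
    List<URI> resolved = new ArrayList<>();
    String classPath = manifest.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
    if (classPath == null || classPath.trim().isEmpty()) {
      return resolved;
    }
    for (String entry : classPath.trim().split("\\s+")) {
      resolved.add(jarLocation.resolve(entry));
    }
    return resolved;
  }
}
```

Nested manifests (a referenced jar that itself declares a Class-Path) would need the same step applied recursively, which this sketch leaves out.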





[jira] [Updated] (BEAM-739) Log full exception stack trace in WordCountIT and BigQueryTornadoesIT

2016-10-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-739:
---
Fix Version/s: (was: Not applicable)
   0.3.0-incubating

> Log full exception stack trace in WordCountIT and BigQueryTornadoesIT
> -
>
> Key: BEAM-739
> URL: https://issues.apache.org/jira/browse/BEAM-739
> Project: Beam
>  Issue Type: Improvement
>  Components: testing
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> When IT tests are broken, they don't provide the full stack trace, such as in:
> https://issues.apache.org/jira/browse/BEAM-736
> It makes investigating root causes slower.





[jira] [Updated] (BEAM-763) BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid RunnableOnService test

2016-10-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-763:
---
Fix Version/s: (was: Not applicable)
   0.3.0-incubating

> BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid 
> RunnableOnService test
> --
>
> Key: BEAM-763
> URL: https://issues.apache.org/jira/browse/BEAM-763
> Project: Beam
>  Issue Type: Test
>Reporter: Luke Cwik
>Assignee: Pei He
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> TestPipeline.create(options) is not compatible with how TestPipeline 
> functions. It overrides the properties provided by the maven 
> surefire/failsafe profiles set up by the various runners for integration 
> testing.





[jira] [Resolved] (BEAM-755) beam-runners-core-java NeedsRunner tests not executing

2016-10-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-755.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> beam-runners-core-java NeedsRunner tests not executing
> --
>
> Key: BEAM-755
> URL: https://issues.apache.org/jira/browse/BEAM-755
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Luke Cwik
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> org.apache.beam:beam-runners-core-java is not specified as an integration 
> test dependency to scan within runners/pom.xml
> There is also a typo in runners/direct-java/pom.xml, where it is 
> org.apache.beam:beam-runners-java-core and should be 
> org.apache.beam:beam-runners-core-java
> Finally, even if these dependencies are added and the typo is fixed, when 
> running the runnable on service integration tests, SplittableParDoTest, which 
> contains @RunnableOnService tests (part of runners/core-java), doesn't execute.





[jira] [Updated] (BEAM-755) beam-runners-core-java NeedsRunner tests not executing

2016-10-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-755:
---
Priority: Minor  (was: Major)

> beam-runners-core-java NeedsRunner tests not executing
> --
>
> Key: BEAM-755
> URL: https://issues.apache.org/jira/browse/BEAM-755
> Project: Beam
>  Issue Type: Bug
>  Components: runner-core
>Reporter: Luke Cwik
>Assignee: Kenneth Knowles
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> org.apache.beam:beam-runners-core-java is not specified as an integration 
> test dependency to scan within runners/pom.xml
> There is also a typo in runners/direct-java/pom.xml, where it is 
> org.apache.beam:beam-runners-java-core and should be 
> org.apache.beam:beam-runners-core-java
> Finally, even if these dependencies are added and the typo is fixed, when 
> running the runnable on service integration tests, SplittableParDoTest, which 
> contains @RunnableOnService tests (part of runners/core-java), doesn't execute.





[jira] [Resolved] (BEAM-756) Checkstyle suppression for JavadocPackage not working on Windows

2016-10-19 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-756.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> Checkstyle suppression for JavadocPackage not working on Windows
> 
>
> Key: BEAM-756
> URL: https://issues.apache.org/jira/browse/BEAM-756
> Project: Beam
>  Issue Type: Bug
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> Exclusions for test and other files don't consider '\' as separator. Hence 
> checkstyle complains about missing package-info files.
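
The separator problem can be sketched with plain java.util.regex. The patterns below are illustrative assumptions, not the expressions actually used in Beam's checkstyle suppressions file; SeparatorAgnosticPattern is a hypothetical name.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration; the real suppression patterns may differ. */
final class SeparatorAgnosticPattern {
  private SeparatorAgnosticPattern() {}

  /** Only matches paths that use '/' as the separator. */
  static final Pattern UNIX_ONLY = Pattern.compile(".*/src/test/.*");

  /** Accepts either '/' or '\' wherever a separator is expected. */
  static final Pattern ANY_SEPARATOR = Pattern.compile(".*[\\\\/]src[\\\\/]test[\\\\/].*");

  static boolean isTestFile(String path) {
    return ANY_SEPARATOR.matcher(path).matches();
  }
}
```

With the first pattern a Windows path such as C:\beam\sdks\java\core\src\test\java\FooTest.java is not excluded, so checkstyle still reports the missing package-info file; the second pattern covers both separators.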





[jira] [Commented] (BEAM-764) Remove cloneAs from PipelineOptions

2016-10-18 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586902#comment-15586902
 ] 

Luke Cwik commented on BEAM-764:


This was merged into master here:
https://github.com/apache/incubator-beam/commit/71c69b31b6894064bf8111007f947150ff725528

> Remove cloneAs from PipelineOptions
> ---
>
> Key: BEAM-764
> URL: https://issues.apache.org/jira/browse/BEAM-764
> Project: Beam
>  Issue Type: Task
>Reporter: Pei He
>Assignee: Pei He
>  Labels: codehealth
> Fix For: 0.3.0-incubating
>
>
> PipelineOptions.cloneAs was a workaround to support running multiple 
> pipelines in Dataflow examples for a streaming pipeline and its injector.
> After the Beam examples refactoring, cloneAs is no longer needed.
> cloneAs also has known issues, such as: JsonIgnore fields are not cloned, and 
> users are required to set them manually. So, I am deleting it. 
> However, we should figure out a better API and implementation to support 
> running multiple pipelines with the same configurations (whether through 
> PipelineOptions or not).





[jira] [Updated] (BEAM-764) Remove cloneAs from PipelineOptions

2016-10-18 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-764:
---
Priority: Minor  (was: Major)

> Remove cloneAs from PipelineOptions
> ---
>
> Key: BEAM-764
> URL: https://issues.apache.org/jira/browse/BEAM-764
> Project: Beam
>  Issue Type: Task
>Reporter: Pei He
>Assignee: Pei He
>Priority: Minor
>  Labels: codehealth
> Fix For: 0.3.0-incubating
>
>
> PipelineOptions.cloneAs was a workaround to support running multiple 
> pipelines in Dataflow examples for a streaming pipeline and its injector.
> After the Beam examples refactoring, cloneAs is no longer needed.
> cloneAs also has known issues, such as: JsonIgnore fields are not cloned, and 
> users are required to set them manually. So, I am deleting it. 
> However, we should figure out a better API and implementation to support 
> running multiple pipelines with the same configurations (whether through 
> PipelineOptions or not).





[jira] [Resolved] (BEAM-764) Remove cloneAs from PipelineOptions

2016-10-18 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-764.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> Remove cloneAs from PipelineOptions
> ---
>
> Key: BEAM-764
> URL: https://issues.apache.org/jira/browse/BEAM-764
> Project: Beam
>  Issue Type: Task
>Reporter: Pei He
>Assignee: Pei He
>  Labels: codehealth
> Fix For: 0.3.0-incubating
>
>
> PipelineOptions.cloneAs was a workaround to support running multiple 
> pipelines in Dataflow examples for a streaming pipeline and its injector.
> After the Beam examples refactoring, cloneAs is no longer needed.
> cloneAs also has known issues, such as: JsonIgnore fields are not cloned, and 
> users are required to set them manually. So, I am deleting it. 
> However, we should figure out a better API and implementation to support 
> running multiple pipelines with the same configurations (whether through 
> PipelineOptions or not).





[jira] [Created] (BEAM-763) BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid RunnableOnService test

2016-10-17 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-763:
--

 Summary: BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is 
not a valid RunnableOnService test
 Key: BEAM-763
 URL: https://issues.apache.org/jira/browse/BEAM-763
 Project: Beam
  Issue Type: Test
Reporter: Luke Cwik
Assignee: Pei He
Priority: Minor


TestPipeline.create(options) is not compatible with how TestPipeline functions. 
It overrides the properties provided by the maven surefire/failsafe profiles 
set up by the various runners for integration testing.





[jira] [Resolved] (BEAM-761) SplittableParDoTest fails

2016-10-17 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-761.

   Resolution: Fixed
 Assignee: Ben Chambers
Fix Version/s: 0.3.0-incubating

> SplittableParDoTest fails
> -
>
> Key: BEAM-761
> URL: https://issues.apache.org/jira/browse/BEAM-761
> Project: Beam
>  Issue Type: Test
>Reporter: Luke Cwik
>Assignee: Ben Chambers
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> Coder propagation was missing in GBKIntoKeyedWorkItems
> Fixed with https://github.com/apache/incubator-beam/pull/1117/files
> Filed for completeness.





[jira] [Created] (BEAM-761) SplittableParDoTest fails

2016-10-17 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-761:
--

 Summary: SplittableParDoTest fails
 Key: BEAM-761
 URL: https://issues.apache.org/jira/browse/BEAM-761
 Project: Beam
  Issue Type: Test
Reporter: Luke Cwik
Priority: Minor


Coder propagation was missing in GBKIntoKeyedWorkItems

Fixed with https://github.com/apache/incubator-beam/pull/1117/files

Filed for completeness.






[jira] [Created] (BEAM-760) Validation needs to exist that @NeedsRunner / @RunnableOnService tests execute

2016-10-17 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-760:
--

 Summary: Validation needs to exist that @NeedsRunner / 
@RunnableOnService tests execute
 Key: BEAM-760
 URL: https://issues.apache.org/jira/browse/BEAM-760
 Project: Beam
  Issue Type: Improvement
  Components: runner-core, runner-dataflow, runner-direct, 
runner-flink, runner-gearpump, runner-spark, sdk-java-core
Reporter: Luke Cwik
Assignee: Jason Kuster


We lack validation that the tests that were supposed to execute actually 
executed as part of pre/post commit.

This is worrisome in an automated test environment since it's difficult to know 
if all the tests that were supposed to run did run.

Repro steps:
checkout apache/master @ b8e6eea691b48e14c4e2c3e84609d750769e09ee
mvn clean integration-test -T 1C -pl runners/direct-java -am

Note that SplittableParDoTest, which is part of beam-runners-core-java, doesn't 
execute.
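
One possible shape for such a validation is a set difference between the tests expected to run and the tests a build reports as executed. This is a sketch under assumptions: ExecutedTestValidator is a hypothetical name, and how the two sets would be collected (for example from annotation scanning and surefire/failsafe reports) is left open.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical validator; not an existing Beam or Maven component. */
final class ExecutedTestValidator {
  private ExecutedTestValidator() {}

  /** Returns the expected test names that never executed, in sorted order. */
  static Set<String> missing(Set<String> expected, Set<String> executed) {
    Set<String> result = new TreeSet<>(expected);
    result.removeAll(executed);
    return result;
  }
}
```

A pre/post commit step could fail the build whenever the returned set is non-empty and print its contents.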





[jira] [Updated] (BEAM-756) Checkstyle suppression for JavadocPackage not working on Windows

2016-10-17 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-756:
---
Priority: Minor  (was: Major)

> Checkstyle suppression for JavadocPackage not working on Windows
> 
>
> Key: BEAM-756
> URL: https://issues.apache.org/jira/browse/BEAM-756
> Project: Beam
>  Issue Type: Bug
>Reporter: Thomas Weise
>Assignee: Thomas Weise
>Priority: Minor
>
> Exclusions for test and other files don't consider '\' as separator. Hence 
> checkstyle complains about missing package-info files.





[jira] [Commented] (BEAM-747) Text checksum verifier is not resilient to eventually consistent filesystems

2016-10-17 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582963#comment-15582963
 ] 

Luke Cwik commented on BEAM-747:


The number of shards is not deterministic without explicitly limiting it on the 
sink. Also, requiring support for limited parallelism increases the barrier to 
entry for this test for runners. Typically if you get one filename for the 
YYY-of-ZZZ case, you can figure out all the remaining ones by parsing out the 
bounds, which tells you exactly how many files exist and what they are named.
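
The parsing idea can be sketched as follows. It assumes shard names end in a "-<index>-of-<total>" suffix as in the log output quoted below (for example results-0-of-3); ShardNames and expectedShards are hypothetical names and not part of FileChecksumMatcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of Beam's FileChecksumMatcher. */
final class ShardNames {
  private ShardNames() {}

  // prefix, shard index, shard count
  private static final Pattern SHARD = Pattern.compile("^(.*)-(\\d+)-of-(\\d+)$");

  /** Derives every expected shard name from a single observed shard name. */
  static List<String> expectedShards(String oneShardName) {
    Matcher m = SHARD.matcher(oneShardName);
    if (!m.matches()) {
      throw new IllegalArgumentException("Not a sharded file name: " + oneShardName);
    }
    String prefix = m.group(1);
    int width = m.group(2).length(); // keep the zero padding of the index
    String totalText = m.group(3);   // keep the count exactly as written
    int total = Integer.parseInt(totalText);
    List<String> names = new ArrayList<>(total);
    for (int i = 0; i < total; i++) {
      names.add(String.format("%s-%0" + width + "d-of-%s", prefix, i, totalText));
    }
    return names;
  }
}
```

A verifier could then wait for, or retry reading, each derived name instead of trusting a single listing from an eventually consistent filesystem.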

> Text checksum verifier is not resilient to eventually consistent filesystems
> 
>
> Key: BEAM-747
> URL: https://issues.apache.org/jira/browse/BEAM-747
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>Assignee: Mark Liu
>
> Example 1: 
> https://builds.apache.org/job/beam_PreCommit_MavenVerify/3934/org.apache.beam$beam-examples-java/console
> Here it looks like we need to retry listing files, at least a little bit, if 
> none are found. They did show up:
> {code}
> gsutil ls 
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results\*
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-0-of-3
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-1-of-3
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-2-of-3
> {code}
> Example 2: 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/org.apache.beam$beam-examples-java/1525/testReport/junit/org.apache.beam.examples/WordCountIT/testE2EWordCount/
> Here it looks like we need to fill in the shard template if the filesystem 
> does not give us a consistent result:
> {code}
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher 
> readLines
> INFO: [0 of 1] Read 162 lines from file: 
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-0-of-3
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher 
> readLines
> INFO: [1 of 1] Read 144 lines from file: 
> gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-2-of-3
> Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher 
> matchesSafely
> INFO: Generated checksum for output data: 
> aec68948b2515e6ea35fd1ed7649c267a10a01e5
> {code}
> We missed shard 1-of-3 and hence got the wrong checksum.





[jira] [Commented] (BEAM-758) Per-step, per-execution nonce

2016-10-17 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582798#comment-15582798
 ] 

Luke Cwik commented on BEAM-758:


Several runners have the concept of a job or pipeline id; a stable hash of the 
job or pipeline id could be used to generate the nonce.

Currently we expose the job name within PipelineOptions; we could also expose 
the concept of a job id, which is populated by a runner and is expected to 
uniquely identify the job with respect to the runner if the runner supports 
running multiple jobs at the same time.
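
A minimal sketch of deriving such a nonce, assuming a runner-populated job id and a unique step name are both available as strings. StepNonce and its nonce method are hypothetical, and combining the job id with the step name is one possible reading of the per-step requirement rather than an agreed design.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; Beam exposes no such API. */
final class StepNonce {
  private StepNonce() {}

  /** Returns a hex string that is stable for a given (jobId, stepName) pair. */
  static String nonce(String jobId, String stepName) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      digest.update(jobId.getBytes(StandardCharsets.UTF_8));
      digest.update((byte) 0); // separator so ("ab", "c") and ("a", "bc") differ
      digest.update(stepName.getBytes(StandardCharsets.UTF_8));
      byte[] hash = digest.digest();
      StringBuilder hex = new StringBuilder(hash.length * 2);
      for (byte b : hash) {
        hex.append(String.format("%02x", b));
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 is required to be available on every Java platform.
      throw new IllegalStateException(e);
    }
  }
}
```

Because the value depends only on its inputs, it is the same on every worker and across retries, while a new job id yields a new nonce for the same saved pipeline.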

> Per-step, per-execution nonce
> -
>
> Key: BEAM-758
> URL: https://issues.apache.org/jira/browse/BEAM-758
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Affects Versions: Not applicable
>Reporter: Daniel Halperin
>
> In the forthcoming runner API, a user will be able to save a pipeline to JSON 
> and then run it repeatedly.
> Many pieces of code (e.g., BigQueryIO.Read or Write) rely on a single random 
> value (nonce). These values are typically generated at apply time, so that 
> they are deterministic (don't change across retries of DoFns) and global (are 
> the same across all workers).
> However, once the runner API lands the existing code would result in the same 
> nonce being reused across jobs. Other possible solutions:
> * Generate nonce in {{Create(1) | ParDo}} then use this as a side input. 
> Should work, as long as side inputs are actually checkpointed. But does not 
> work for {{BoundedSource}}.
> * If a nonce is only needed for the lifetime of one bundle, can be generated 
> in {{startBundle}} and used in {{finishBundle}} [or {{tearDown}}].
> * Add some context somewhere that lets user code access unique step name, and 
> somehow generate a nonce consistently e.g. by hashing. Will usually work, but 
> this is similarly not available to sources.
> Another Q: I'm not sure we have a good way to generate nonces in unbounded 
> pipelines -- we probably need one. This would enable us to, e.g., use 
> {{BigQueryIO.Write}} in an unbounded pipeline [if we had, e.g., exactly-once 
> triggering per window]. Or generalizing to multiple firings...





[jira] [Created] (BEAM-755) beam-runners-core-java RunnableOnService tests not executing

2016-10-14 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-755:
--

 Summary: beam-runners-core-java RunnableOnService tests not 
executing
 Key: BEAM-755
 URL: https://issues.apache.org/jira/browse/BEAM-755
 Project: Beam
  Issue Type: Bug
  Components: runner-core
Reporter: Luke Cwik
Assignee: Frances Perry


org.apache.beam:beam-runners-core-java is not specified as an integration test 
dependency to scan within runners/pom.xml

There is also a typo in runners/direct-java/pom.xml, where it is 
org.apache.beam:beam-runners-java-core and should be 
org.apache.beam:beam-runners-core-java

Finally, even if these dependencies are added and the typo is fixed, when 
running the runnable on service integration tests, SplittableParDoTest, which 
contains @RunnableOnService tests (part of runners/core-java), doesn't execute.





[jira] [Resolved] (BEAM-736) BigQueryTornadoesIT broken, blocking nightly release.

2016-10-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-736.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> BigQueryTornadoesIT broken, blocking nightly release.
> -
>
> Key: BEAM-736
> URL: https://issues.apache.org/jira/browse/BEAM-736
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jason Kuster
>Assignee: Pei He
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> Build break begins here: 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1471/
> listing 3 potential culprit commits.





[jira] [Updated] (BEAM-736) BigQueryTornadoesIT broken, blocking nightly release.

2016-10-11 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-736:
---
Priority: Minor  (was: Major)

> BigQueryTornadoesIT broken, blocking nightly release.
> -
>
> Key: BEAM-736
> URL: https://issues.apache.org/jira/browse/BEAM-736
> Project: Beam
>  Issue Type: Bug
>  Components: testing
>Reporter: Jason Kuster
>Assignee: Pei He
>Priority: Minor
>
> Build break begins here: 
> https://builds.apache.org/job/beam_PostCommit_MavenVerify/1471/
> listing 3 potential culprit commits.





[jira] [Commented] (BEAM-726) Standardize naming of PipelineResult objects

2016-10-06 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553364#comment-15553364
 ] 

Luke Cwik commented on BEAM-726:


Is PipelineResult the appropriate suffix?
If I support a non-blocking mode then it's not really a result yet, which 
explains the choice of the Job suffix for Dataflow.

> Standardize naming of PipelineResult objects
> 
>
> Key: BEAM-726
> URL: https://issues.apache.org/jira/browse/BEAM-726
> Project: Beam
>  Issue Type: Bug
>  Components: beam-model
>Reporter: Ben Chambers
>Assignee: Frances Perry
>Priority: Minor
>
> Today:
> PipelineResult is an interface returned by running a pipeline.
> DataflowPipelineJob is the Dataflow implementation of that interface
> FlinkRunnerResult is the Flink implementation
> EvaluationContext is the Spark implementation
> DirectPipelineResult is the DirectRunner implementation
> Ideally, all the names would indicate that they are a PipelineResult, like 
> the DirectRunner does.





[jira] [Updated] (BEAM-725) Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow

2016-10-06 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-725:
---
Description: 
Drop the following GcpOptions and use ADC 
(https://developers.google.com/identity/protocols/application-default-credentials)
 to clean up the credentials story for GCP:
AuthorizationServerEncodedUrl
TokenServerUrl
CredentialDir
CredentialId
SecretsFile
ServiceAccountName
ServiceAccountKeyfile

Also migrate from Apiary Credentials class to Google OAuth Credentials class 
when available from google-cloud-java.

  was:
Drop the following GcpOptions and use ADC to clean up the credentials story for GCP:
AuthorizationServerEncodedUrl
TokenServerUrl
CredentialDir
CredentialId
SecretsFile
ServiceAccountName
ServiceAccountKeyfile

Also migrate from Apiary Credentials class to Google OAuth Credentials class 
when available from google-cloud-java.


> Remove legacy credentials flags related to GCP and adopt application default 
> credentials as only supported default flow
> ---
>
> Key: BEAM-725
> URL: https://issues.apache.org/jira/browse/BEAM-725
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-gcp
>Reporter: Luke Cwik
>Assignee: Luke Cwik
>Priority: Minor
>
> Drop the following GcpOptions and use ADC 
> (https://developers.google.com/identity/protocols/application-default-credentials)
>  to clean up the credentials story for GCP:
> AuthorizationServerEncodedUrl
> TokenServerUrl
> CredentialDir
> CredentialId
> SecretsFile
> ServiceAccountName
> ServiceAccountKeyfile
> Also migrate from Apiary Credentials class to Google OAuth Credentials class 
> when available from google-cloud-java.





[jira] [Created] (BEAM-725) Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow

2016-10-06 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-725:
--

 Summary: Remove legacy credentials flags related to GCP and adopt 
application default credentials as only supported default flow
 Key: BEAM-725
 URL: https://issues.apache.org/jira/browse/BEAM-725
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-gcp
Reporter: Luke Cwik
Assignee: Luke Cwik
Priority: Minor


Drop the following GcpOptions and use ADC to clean up the credentials story for GCP:
AuthorizationServerEncodedUrl
TokenServerUrl
CredentialDir
CredentialId
SecretsFile
ServiceAccountName
ServiceAccountKeyfile

Also migrate from Apiary Credentials class to Google OAuth Credentials class 
when available from google-cloud-java.





[jira] [Created] (BEAM-716) Migrate JmsIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-716:
--

 Summary: Migrate JmsIO to use AutoValue to reduce boilerplate
 Key: BEAM-716
 URL: https://issues.apache.org/jira/browse/BEAM-716
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions
Reporter: Luke Cwik
Assignee: James Malone
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054






[jira] [Created] (BEAM-718) Migrate KinesisIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-718:
--

 Summary: Migrate KinesisIO to use AutoValue to reduce boilerplate
 Key: BEAM-718
 URL: https://issues.apache.org/jira/browse/BEAM-718
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions
Reporter: Luke Cwik
Assignee: James Malone
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054





[jira] [Created] (BEAM-717) Migrate KafkaIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-717:
--

 Summary: Migrate KafkaIO to use AutoValue to reduce boilerplate
 Key: BEAM-717
 URL: https://issues.apache.org/jira/browse/BEAM-717
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions
Reporter: Luke Cwik
Assignee: James Malone
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054






[jira] [Created] (BEAM-715) Migrate AvroHDFSFileSource/HDFSFileSource/HDFSFileSink to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-715:
--

 Summary: Migrate AvroHDFSFileSource/HDFSFileSource/HDFSFileSink to 
use AutoValue to reduce boilerplate
 Key: BEAM-715
 URL: https://issues.apache.org/jira/browse/BEAM-715
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-extensions
Reporter: Luke Cwik
Assignee: James Malone
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054





[jira] [Created] (BEAM-714) Migrate DatastoreV1 to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-714:
--

 Summary: Migrate DatastoreV1 to use AutoValue to reduce boilerplate
 Key: BEAM-714
 URL: https://issues.apache.org/jira/browse/BEAM-714
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-gcp
Reporter: Luke Cwik
Assignee: Daniel Halperin
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054





[jira] [Created] (BEAM-712) Migrate BigQueryIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-712:
--

 Summary: Migrate BigQueryIO to use AutoValue to reduce boilerplate
 Key: BEAM-712
 URL: https://issues.apache.org/jira/browse/BEAM-712
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-gcp
Reporter: Luke Cwik
Assignee: Daniel Halperin
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054





[jira] [Created] (BEAM-713) Migrate BigTableIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-713:
--

 Summary: Migrate BigTableIO to use AutoValue to reduce boilerplate
 Key: BEAM-713
 URL: https://issues.apache.org/jira/browse/BEAM-713
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-gcp
Reporter: Luke Cwik
Assignee: Daniel Halperin
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054





[jira] [Created] (BEAM-711) Migrate XmlSource/XmlSink to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-711:
--

 Summary: Migrate XmlSource/XmlSink to use AutoValue to reduce 
boilerplate
 Key: BEAM-711
 URL: https://issues.apache.org/jira/browse/BEAM-711
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054





[jira] [Updated] (BEAM-710) Migrate Read/Write to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-710:
---
Summary: Migrate Read/Write to use AutoValue to reduce boilerplate  (was: 
Migrate Read to use AutoValue to reduce boilerplate)

> Migrate Read/Write to use AutoValue to reduce boilerplate
> -
>
> Key: BEAM-710
> URL: https://issues.apache.org/jira/browse/BEAM-710
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Davor Bonaci
>Priority: Minor
>  Labels: io, simple, starter
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-709) Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-709:
---
Summary: Migrate CountingSource/CountingInput to use AutoValue to reduce 
boilerplate  (was: Migrate CountingSource to use AutoValue to reduce 
boilerplate)

> Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate
> ---
>
> Key: BEAM-709
> URL: https://issues.apache.org/jira/browse/BEAM-709
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Davor Bonaci
>Priority: Minor
>  Labels: io, simple, starter
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-707) Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-707:
---
Summary: Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use 
AutoValue to reduce boilerplate  (was: Migrate PubsubIO to use AutoValue to 
reduce boilerplate)

> Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue 
> to reduce boilerplate
> -
>
> Key: BEAM-707
> URL: https://issues.apache.org/jira/browse/BEAM-707
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Davor Bonaci
>Priority: Minor
>  Labels: io, simple, starter
>
> Use the AutoValue functionality to reduce boilerplate.
> See this PR for an example:
> https://github.com/apache/incubator-beam/pull/1054



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-710) Migrate Read to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-710:
--

 Summary: Migrate Read to use AutoValue to reduce boilerplate
 Key: BEAM-710
 URL: https://issues.apache.org/jira/browse/BEAM-710
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-709) Migrate CountingSource to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-709:
--

 Summary: Migrate CountingSource to use AutoValue to reduce 
boilerplate
 Key: BEAM-709
 URL: https://issues.apache.org/jira/browse/BEAM-709
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-708) Migrate BoundedReadFromUnboundedSource to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-708:
--

 Summary: Migrate BoundedReadFromUnboundedSource to use AutoValue 
to reduce boilerplate
 Key: BEAM-708
 URL: https://issues.apache.org/jira/browse/BEAM-708
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-707) Migrate PubsubIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-707:
--

 Summary: Migrate PubsubIO to use AutoValue to reduce boilerplate
 Key: BEAM-707
 URL: https://issues.apache.org/jira/browse/BEAM-707
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-706) Migrate TextIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-706:
--

 Summary: Migrate TextIO to use AutoValue to reduce boilerplate
 Key: BEAM-706
 URL: https://issues.apache.org/jira/browse/BEAM-706
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.
See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-705) Migrate AvroIO to use AutoValue to reduce boilerplate

2016-10-05 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-705:
--

 Summary: Migrate AvroIO to use AutoValue to reduce boilerplate
 Key: BEAM-705
 URL: https://issues.apache.org/jira/browse/BEAM-705
 Project: Beam
  Issue Type: Improvement
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Davor Bonaci
Priority: Minor


Use the AutoValue functionality to reduce boilerplate.

See this PR for an example:
https://github.com/apache/incubator-beam/pull/1054



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-695) DisplayData for PipelineOptions fails to correctly toString array types

2016-10-03 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-695.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> DisplayData for PipelineOptions fails to correctly toString array types
> ---
>
> Key: BEAM-695
> URL: https://issues.apache.org/jira/browse/BEAM-695
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Scott Wegner
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> For array types in Java, toString produces an uninformative message like  
> [Ljava.lang.String;@fc258b1
> You need to check whether the value is an array type and, if so, call 
> Arrays.toString(array).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-695) DisplayData for PipelineOptions fails to correctly toString array types

2016-09-30 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-695:
--

 Summary: DisplayData for PipelineOptions fails to correctly 
toString array types
 Key: BEAM-695
 URL: https://issues.apache.org/jira/browse/BEAM-695
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-core
Reporter: Luke Cwik
Assignee: Scott Wegner
Priority: Minor


For array types in Java, toString produces an uninformative message like
[Ljava.lang.String;@fc258b1

You need to check whether the value is an array type and, if so, call 
Arrays.toString(array).
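
A minimal sketch of the check described above. The class and method names
(ArrayAwareToString, describe) are made up for illustration and are not the
DisplayData code. Primitive arrays are handled through reflection because
Arrays.toString has a separate overload per primitive type.

```java
import java.lang.reflect.Array;
import java.util.Arrays;

// Hypothetical helper; not part of the Beam SDK.
final class ArrayAwareToString {
  static String describe(Object value) {
    if (value == null) {
      return "null";
    }
    if (!value.getClass().isArray()) {
      return value.toString();
    }
    if (value instanceof Object[]) {
      // Covers String[], Integer[], and nested object arrays.
      return Arrays.deepToString((Object[]) value);
    }
    // Primitive arrays (int[], long[], ...) are not Object[]; read them reflectively.
    StringBuilder sb = new StringBuilder("[");
    int length = Array.getLength(value);
    for (int i = 0; i < length; i++) {
      if (i > 0) {
        sb.append(", ");
      }
      sb.append(Array.get(value, i));
    }
    return sb.append("]").toString();
  }
}
```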



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner

2016-09-27 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-604.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> Use Watermark Check Streaming Job Finish in TestDataflowRunner
> --
>
> Key: BEAM-604
> URL: https://issues.apache.org/jira/browse/BEAM-604
> Project: Beam
>  Issue Type: Improvement
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> Currently, a streaming job with bounded input can't be terminated 
> automatically, and TestDataflowRunner can't handle this case. 
> TestDataflowRunner needs to be updated so that streaming integration tests 
> such as WindowedWordCountIT can run with it.
> Implementation:
> Query the watermark of each step, wait until all watermarks are set to MAX, 
> then cancel the job.
> Update:
> As suggested by [~pei...@gmail.com], implement checkMaxWatermark in 
> DataflowPipelineJob#waitUntilFinish. All Dataflow streaming jobs with 
> bounded input would then benefit from this change and be canceled 
> automatically when their watermarks reach the max value. The Dataflow 
> runners could also stay simple, without handling batch and streaming as two 
> separate cases.
> Update:
> The pipeline author should control whether and when a streaming job is 
> canceled. The test framework is a better place than waitUntilFinish() to 
> auto-cancel a streaming test job when certain conditions are met.
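
The "query each step's watermark, then cancel" idea from the description can
be sketched roughly as follows. StreamingJob, MAX_WATERMARK and
cancelWhenFinished are hypothetical stand-ins, not the Dataflow or
TestDataflowRunner API, and Long.MAX_VALUE is only a placeholder for whatever
value the service reports as the maximum watermark.

```java
import java.util.List;

// Hypothetical sketch; none of these names come from the Beam or Dataflow APIs.
final class WatermarkCheckSketch {
  // Placeholder for the service's maximum watermark value (an assumption).
  static final long MAX_WATERMARK = Long.MAX_VALUE;

  interface StreamingJob {
    List<Long> stepWatermarks();

    void cancel();
  }

  // True only when there is at least one step and every step is at MAX.
  static boolean allAtMax(List<Long> watermarks) {
    if (watermarks.isEmpty()) {
      return false;
    }
    for (long watermark : watermarks) {
      if (watermark < MAX_WATERMARK) {
        return false;
      }
    }
    return true;
  }

  // Polls up to maxPolls times; a real test harness would pause between polls.
  static boolean cancelWhenFinished(StreamingJob job, int maxPolls) {
    for (int i = 0; i < maxPolls; i++) {
      if (allAtMax(job.stepWatermarks())) {
        job.cancel();
        return true;
      }
    }
    return false;
  }
}
```

Keeping the loop in a test helper rather than in waitUntilFinish() matches the
last update in the description: the caller decides whether auto-cancel applies.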



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-679) Bigtable IO integration tests are failing

2016-09-27 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-679.

   Resolution: Fixed
 Assignee: Luke Cwik  (was: Jean-Baptiste Onofré)
Fix Version/s: 0.3.0-incubating

> Bigtable IO integration tests are failing
> -
>
> Key: BEAM-679
> URL: https://issues.apache.org/jira/browse/BEAM-679
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Luke Cwik
>Priority: Critical
> Fix For: 0.3.0-incubating
>
>
> Bigtable ITests are failing with the following issue:
> {code}
> java.lang.NoClassDefFoundError: Could not initialize class 
> com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools 
> {code}
> I'm investigating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-679) Bigtable IO integration tests are failing

2016-09-27 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527323#comment-15527323
 ] 

Luke Cwik commented on BEAM-679:


It turned out that the worker images for Dataflow weren't publicly available, 
which caused them to get stuck.

> Bigtable IO integration tests are failing
> -
>
> Key: BEAM-679
> URL: https://issues.apache.org/jira/browse/BEAM-679
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Critical
>
> Bigtable ITests are failing with the following issue:
> {code}
> java.lang.NoClassDefFoundError: Could not initialize class 
> com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools 
> {code}
> I'm investigating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-679) Bigtable IO integration tests are failing

2016-09-27 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526790#comment-15526790
 ] 

Luke Cwik commented on BEAM-679:


Got permissions; BigtableWriteIT passed. The next postcommit run should 
validate these findings.

> Bigtable IO integration tests are failing
> -
>
> Key: BEAM-679
> URL: https://issues.apache.org/jira/browse/BEAM-679
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Critical
>
> Bigtable ITests are failing with the following issue:
> {code}
> java.lang.NoClassDefFoundError: Could not initialize class 
> com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools 
> {code}
> I'm investigating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-679) Bigtable IO integration tests are failing

2016-09-27 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526647#comment-15526647
 ] 

Luke Cwik commented on BEAM-679:


Trying to rerun the integration test locally to verify.

> Bigtable IO integration tests are failing
> -
>
> Key: BEAM-679
> URL: https://issues.apache.org/jira/browse/BEAM-679
> Project: Beam
>  Issue Type: Bug
>  Components: build-system, sdk-java-extensions
>Reporter: Jean-Baptiste Onofré
>Assignee: Jean-Baptiste Onofré
>Priority: Critical
>
> Bigtable ITests are failing with the following issue:
> {code}
> java.lang.NoClassDefFoundError: Could not initialize class 
> com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools 
> {code}
> I'm investigating.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-672) Figure out TestPipeline.create(PipelineOptions) / TestPipeline.fromOptions(PipelineOptions) story

2016-09-23 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-672:
---
Component/s: sdk-java-core

> Figure out TestPipeline.create(PipelineOptions) / 
> TestPipeline.fromOptions(PipelineOptions) story
> -
>
> Key: BEAM-672
> URL: https://issues.apache.org/jira/browse/BEAM-672
> Project: Beam
>  Issue Type: Improvement
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Priority: Minor
>  Labels: test
>
> TestPipeline integrates with the integration testing environment and relies 
> heavily on being able to be configured by the environment and executed on 
> many runners.
> Tests which rely on mutating PipelineOptions before creating the TestPipeline 
> can easily get the integration wrong by creating PipelineOptions from 
> PipelineOptionsFactory and then calling either TestPipeline.create(options) 
> or TestPipeline.fromOptions(options), thus ignoring any pipeline options 
> specified by the integration environment.
> We should fix the exposed methods on TestPipeline to prevent users from 
> making this simple mistake.
> One suggestion is to create a TestPipeline builder which gives access to 
> a mutable PipelineOptions that the user can edit before calling build() 
> to create a TestPipeline.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-672) Figure out TestPipeline.create(PipelineOptions) / TestPipeline.fromOptions(PipelineOptions) story

2016-09-23 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-672:
--

 Summary: Figure out TestPipeline.create(PipelineOptions) / 
TestPipeline.fromOptions(PipelineOptions) story
 Key: BEAM-672
 URL: https://issues.apache.org/jira/browse/BEAM-672
 Project: Beam
  Issue Type: Improvement
Reporter: Luke Cwik
Priority: Minor


TestPipeline integrates with the integration testing environment and relies 
heavily on being able to be configured by the environment and executed on many 
runners.

Tests which rely on mutating PipelineOptions before creating the TestPipeline 
can easily get the integration wrong by creating PipelineOptions from 
PipelineOptionsFactory and then calling either TestPipeline.create(options) or 
TestPipeline.fromOptions(options), thus ignoring any pipeline options 
specified by the integration environment.

We should fix the exposed methods on TestPipeline to prevent users from making 
this simple mistake.

One suggestion is to create a TestPipeline builder which gives access to a 
mutable PipelineOptions that the user can edit before calling build() to 
create a TestPipeline.
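
One possible shape for such a builder, as a sketch only. TestPipelineSketch
and its Builder are hypothetical, and a plain String map stands in for
PipelineOptions. The point is that the builder starts from the options
supplied by the environment, so user edits are layered on top instead of
replacing them.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; a String map stands in for PipelineOptions.
final class TestPipelineSketch {
  private final Map<String, String> options;

  private TestPipelineSketch(Map<String, String> options) {
    this.options = Collections.unmodifiableMap(new HashMap<String, String>(options));
  }

  Map<String, String> options() {
    return options;
  }

  // The builder is always seeded with the integration environment's options.
  static Builder builder(Map<String, String> environmentOptions) {
    return new Builder(environmentOptions);
  }

  static final class Builder {
    private final Map<String, String> options;

    private Builder(Map<String, String> environmentOptions) {
      this.options = new HashMap<String, String>(environmentOptions);
    }

    // Tests edit this map; there is no way to swap in an unrelated options object.
    Map<String, String> mutableOptions() {
      return options;
    }

    TestPipelineSketch build() {
      return new TestPipelineSketch(options);
    }
  }
}
```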



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-670) BigQuery TableRow inserter incorrectly handles nextBackOff millis

2016-09-22 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-670:
---
Summary: BigQuery TableRow inserter incorrectly handles nextBackOff millis  
(was: FluentBackoff incorrectly handles nextBackOff millis)

> BigQuery TableRow inserter incorrectly handles nextBackOff millis
> -
>
> Key: BEAM-670
> URL: https://issues.apache.org/jira/browse/BEAM-670
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-gcp
>Reporter: Luke Cwik
>Assignee: Daniel Halperin
>Priority: Minor
>
> From:
> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/commit/94d57207924ed8650cf3c97fccb2a45f27bcc6a3#commitcomment-19135952
> Also present in:
> https://github.com/apache/incubator-beam/pull/888/files#diff-f6d45f28c12083c9556bb410bde8b109R614
> The check is inverted; it should be nextBackoffMillis != BackOff.STOP.
> Otherwise Thread.sleep() is called with the value -1, which causes an 
> IllegalArgumentException to be thrown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-670) FluentBackoff incorrectly handles nextBackOff millis

2016-09-22 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-670:
--

 Summary: FluentBackoff incorrectly handles nextBackOff millis
 Key: BEAM-670
 URL: https://issues.apache.org/jira/browse/BEAM-670
 Project: Beam
  Issue Type: Bug
  Components: sdk-java-gcp
Reporter: Luke Cwik
Assignee: Daniel Halperin
Priority: Minor


From:
https://github.com/GoogleCloudPlatform/DataflowJavaSDK/commit/94d57207924ed8650cf3c97fccb2a45f27bcc6a3#commitcomment-19135952

Also present in:
https://github.com/apache/incubator-beam/pull/888/files#diff-f6d45f28c12083c9556bb410bde8b109R614

The check is inverted; it should be nextBackoffMillis != BackOff.STOP.
Otherwise Thread.sleep() is called with the value -1, which causes an 
IllegalArgumentException to be thrown.
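
A small sketch of the corrected guard. MillisSupplier and Sleeper are
hypothetical stand-ins for the backoff and sleeper objects used by the
BigQuery inserter, and STOP mirrors the -1 sentinel mentioned above; this is
not the code from the linked commit.

```java
// Hypothetical stand-ins; the real code uses BackOff from google-http-client.
final class BackoffSleepSketch {
  // Same sentinel value as the -1 described in the issue.
  static final long STOP = -1L;

  interface MillisSupplier {
    long nextBackOffMillis();
  }

  interface Sleeper {
    void sleep(long millis);
  }

  // Sleeps for the next interval; returns false once the backoff is exhausted.
  static boolean sleepBeforeRetry(MillisSupplier backoff, Sleeper sleeper) {
    long nextBackoffMillis = backoff.nextBackOffMillis();
    if (nextBackoffMillis != STOP) {
      sleeper.sleep(nextBackoffMillis);
      return true;
    }
    // STOP must never reach the sleeper: a negative sleep is rejected.
    return false;
  }
}
```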



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (BEAM-661) CalendarWindows#isCompatibleWith should use equals instead of ==

2016-09-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reopened BEAM-661:

  Assignee: (was: Davor Bonaci)

> CalendarWindows#isCompatibleWith should use equals instead of ==
> 
>
> Key: BEAM-661
> URL: https://issues.apache.org/jira/browse/BEAM-661
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Ben Chambers
>Priority: Minor
> Fix For: Not applicable
>
>
> http://stackoverflow.com/questions/39617897/inputs-to-flatten-had-incompatible-window-windowfns-when-cogroupbykey-with-calen
> We're using `==` instead of `.equals` to compare objects, which causes 
> equivalent CalendarWindows to be incompatible.
> https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/CalendarWindows.java#L143
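
To illustrate the difference, here is a reduced sketch. CalendarWindowSketch
and StartDate are hypothetical and only mimic the shape of the problem; the
real class compares Joda DateTime values.

```java
import java.util.Objects;

// Hypothetical miniature of a calendar-style window function.
final class CalendarWindowSketch {
  static final class StartDate {
    final int year;
    final int month;
    final int day;

    StartDate(int year, int month, int day) {
      this.year = year;
      this.month = month;
      this.day = day;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof StartDate)) {
        return false;
      }
      StartDate that = (StartDate) o;
      return year == that.year && month == that.month && day == that.day;
    }

    @Override
    public int hashCode() {
      return Objects.hash(year, month, day);
    }
  }

  private final int number;
  private final StartDate startDate;

  CalendarWindowSketch(int number, StartDate startDate) {
    this.number = number;
    this.startDate = startDate;
  }

  // Buggy variant: == compares references, so equal dates built separately differ.
  boolean isCompatibleByIdentity(CalendarWindowSketch other) {
    return number == other.number && startDate == other.startDate;
  }

  // Fixed variant: compares the date values.
  boolean isCompatibleByEquals(CalendarWindowSketch other) {
    return number == other.number && Objects.equals(startDate, other.startDate);
  }
}
```

Two windowing strategies constructed independently with the same start date
are compatible under the second method but not under the first.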



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-660) CalendarWindows compares DateTimes with ==

2016-09-21 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-660.

   Resolution: Duplicate
Fix Version/s: Not applicable

Duplicate of https://issues.apache.org/jira/browse/BEAM-661

> CalendarWindows compares DateTimes with ==
> --
>
> Key: BEAM-660
> URL: https://issues.apache.org/jira/browse/BEAM-660
> Project: Beam
>  Issue Type: Bug
>Reporter: Daniel Mills
>Priority: Minor
> Fix For: Not applicable
>
>
> CalendarWindows compares DateTime objects with ==, which causes compatible 
> WindowFns to not be considered compatible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-414) IntraBundleParallelization needs to be removed

2016-09-14 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-414.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> IntraBundleParallelization needs to be removed
> --
>
> Key: BEAM-414
> URL: https://issues.apache.org/jira/browse/BEAM-414
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Pei He
>Priority: Minor
>  Labels: newbie, starter
> Fix For: 0.3.0-incubating
>
>
> IntraBundleParallelization needs to be removed because it does not work: it 
> breaks bundle processing semantics by expecting that context information is 
> not mutated by the runner between the processing of elements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-414) IntraBundleParallelization needs to be removed

2016-09-14 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490774#comment-15490774
 ] 

Luke Cwik commented on BEAM-414:


https://github.com/apache/incubator-beam/pull/957

> IntraBundleParallelization needs to be removed
> --
>
> Key: BEAM-414
> URL: https://issues.apache.org/jira/browse/BEAM-414
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Pei He
>Priority: Minor
>  Labels: newbie, starter
>
> IntraBundleParallelization needs to be removed because it does not work: it 
> breaks bundle processing semantics by expecting that context information is 
> not mutated by the runner between the processing of elements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Assigned] (BEAM-414) IntraBundleParallelization needs to be removed

2016-09-14 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik reassigned BEAM-414:
--

Assignee: Pei He  (was: Luke Cwik)

> IntraBundleParallelization needs to be removed
> --
>
> Key: BEAM-414
> URL: https://issues.apache.org/jira/browse/BEAM-414
> Project: Beam
>  Issue Type: Bug
>  Components: sdk-java-core
>Reporter: Luke Cwik
>Assignee: Pei He
>Priority: Minor
>  Labels: newbie, starter
>
> IntraBundleParallelization needs to be removed because it does not work: it 
> breaks bundle processing semantics by expecting that context information is 
> not mutated by the runner between the processing of elements.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (BEAM-544) Add header/footer support to TextIO.Write

2016-09-07 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik resolved BEAM-544.

   Resolution: Fixed
Fix Version/s: 0.3.0-incubating

> Add header/footer support to TextIO.Write
> -
>
> Key: BEAM-544
> URL: https://issues.apache.org/jira/browse/BEAM-544
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Luke Cwik
>Assignee: Stas Levin
>Priority: Minor
> Fix For: 0.3.0-incubating
>
>
> Being able to add a header/footer to each file that is written via TextIO 
> would cover several simple text file format issues.
> Original ask:
> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/360
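
As a rough picture of the requested behavior, the sketch below renders the
contents of one output file with an optional header and footer.
HeaderFooterSketch.render is a hypothetical helper and says nothing about how
TextIO.Write exposes the feature.

```java
import java.util.List;

// Hypothetical helper; it only illustrates the per-file layout.
final class HeaderFooterSketch {
  // A null header or footer means "omit that line".
  static String render(String header, List<String> lines, String footer) {
    StringBuilder sb = new StringBuilder();
    if (header != null) {
      sb.append(header).append('\n');
    }
    for (String line : lines) {
      sb.append(line).append('\n');
    }
    if (footer != null) {
      sb.append(footer).append('\n');
    }
    return sb.toString();
  }
}
```

A CSV column header is the typical case: every shard gets the same first
line, followed by that shard's records.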



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-544) Add header/footer support to TextIO.Write

2016-09-06 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-544:
---
Assignee: Stas Levin

> Add header/footer support to TextIO.Write
> -
>
> Key: BEAM-544
> URL: https://issues.apache.org/jira/browse/BEAM-544
> Project: Beam
>  Issue Type: New Feature
>  Components: sdk-java-extensions
>Reporter: Luke Cwik
>Assignee: Stas Levin
>Priority: Minor
>
> Being able to add a header/footer to each file that is written via TextIO 
> would cover several simple text file format issues.
> Original ask:
> https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/360



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (BEAM-611) Add support for MapValues

2016-08-31 Thread Luke Cwik (JIRA)
Luke Cwik created BEAM-611:
--

 Summary: Add support for MapValues
 Key: BEAM-611
 URL: https://issues.apache.org/jira/browse/BEAM-611
 Project: Beam
  Issue Type: Improvement
  Components: sdk-ideas
Reporter: Luke Cwik
Priority: Minor


Filed from: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/412

Often I find myself needing to simply map a function over just the values of a 
key-valued PCollection. MapElements works for this, but suffers a small hit in 
readability (imho) and introduces some possibility for error.

I wanted to see if there is any bandwidth / interest in adding this as a 
standard transform to the SDK. If so, I have attached a gist with a basic spike 
I have been using in my flows:

https://gist.github.com/trentonstrong/8b60933dca545eb2138b72899195019e
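
The idea, shown on plain Java collections rather than on a PCollection.
MapValuesSketch.mapValues is a hypothetical helper and is not the code from
the gist above; it only shows the intent of transforming values while leaving
keys untouched.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical plain-collections analogue of a MapValues transform.
final class MapValuesSketch {
  static <K, V, W> List<Map.Entry<K, W>> mapValues(
      List<Map.Entry<K, V>> input, Function<? super V, ? extends W> fn) {
    List<Map.Entry<K, W>> output = new ArrayList<Map.Entry<K, W>>(input.size());
    for (Map.Entry<K, V> entry : input) {
      // The key is passed through unchanged; only the value is transformed.
      output.add(new SimpleImmutableEntry<K, W>(entry.getKey(), fn.apply(entry.getValue())));
    }
    return output;
  }
}
```

Because the caller's function never sees the key, it cannot accidentally
change or drop it, which is the source of error the request alludes to.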



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner

2016-08-30 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-604:
---
Issue Type: Improvement  (was: Bug)

> Use Watermark Check Streaming Job Finish in TestDataflowRunner 
> ---
>
> Key: BEAM-604
> URL: https://issues.apache.org/jira/browse/BEAM-604
> Project: Beam
>  Issue Type: Improvement
>Reporter: Mark Liu
>Assignee: Mark Liu
>
> Currently, a streaming job with bounded input can't be terminated 
> automatically, and TestDataflowRunner can't handle this case. 
> TestDataflowRunner needs to be updated so that streaming integration tests 
> such as WindowedWordCountIT can run with it.
> Implementation:
> Query the watermark of each step, wait until all watermarks are set to MAX, 
> then cancel the job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner

2016-08-30 Thread Luke Cwik (JIRA)

 [ 
https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Cwik updated BEAM-604:
---
Priority: Minor  (was: Major)

> Use Watermark Check Streaming Job Finish in TestDataflowRunner 
> ---
>
> Key: BEAM-604
> URL: https://issues.apache.org/jira/browse/BEAM-604
> Project: Beam
>  Issue Type: Improvement
>Reporter: Mark Liu
>Assignee: Mark Liu
>Priority: Minor
>
> Currently, a streaming job with bounded input can't be terminated 
> automatically, and TestDataflowRunner can't handle this case. 
> TestDataflowRunner needs to be updated so that streaming integration tests 
> such as WindowedWordCountIT can run with it.
> Implementation:
> Query the watermark of each step, wait until all watermarks are set to MAX, 
> then cancel the job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner

2016-08-30 Thread Luke Cwik (JIRA)

[ 
https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449366#comment-15449366
 ] 

Luke Cwik commented on BEAM-604:


It would be better if the Dataflow service just shut down jobs that hit the 
max watermark automatically.

> Use Watermark Check Streaming Job Finish in TestDataflowRunner 
> ---
>
> Key: BEAM-604
> URL: https://issues.apache.org/jira/browse/BEAM-604
> Project: Beam
>  Issue Type: Bug
>Reporter: Mark Liu
>Assignee: Mark Liu
>
> Currently, a streaming job with bounded input can't be terminated 
> automatically, and TestDataflowRunner can't handle this case. 
> TestDataflowRunner needs to be updated so that streaming integration tests 
> such as WindowedWordCountIT can run with it.
> Implementation:
> Query the watermark of each step, wait until all watermarks are set to MAX, 
> then cancel the job.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

