[jira] [Created] (BEAM-1200) PubsubIO should allow for a user to supply the function which computes the watermark that is reported
Luke Cwik created BEAM-1200: --- Summary: PubsubIO should allow for a user to supply the function which computes the watermark that is reported Key: BEAM-1200 URL: https://issues.apache.org/jira/browse/BEAM-1200 Project: Beam Issue Type: Improvement Components: sdk-java-gcp Reporter: Luke Cwik Assignee: Daniel Halperin Priority: Minor A user wanted to build a watermark function which tracked the data's watermark but never fell behind the current time by more than Y minutes. PubsubIO does not support specifying the function which computes and reports the watermark. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
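The requested behavior can be sketched as a clamp: report the data watermark unless it lags wall-clock time by more than a configured bound. This is a minimal hypothetical illustration (the class and method names are invented, and it uses `java.time` rather than Beam's Joda-based `Instant`), not PubsubIO's actual API:

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the requested policy: report the data watermark, but never let it
// lag the current (wall-clock) time by more than a configured bound.
public class BoundedLagWatermark {
    // Returns the later of the observed data watermark and (now - maxLag),
    // so the reported watermark never falls more than maxLag behind now.
    public static Instant boundedWatermark(Instant dataWatermark, Instant now, Duration maxLag) {
        Instant floor = now.minus(maxLag);
        return dataWatermark.isBefore(floor) ? floor : dataWatermark;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-12-20T12:00:00Z");
        Duration maxLag = Duration.ofMinutes(5);
        // Data watermark is an hour behind: it gets clamped to now - 5 minutes.
        System.out.println(boundedWatermark(Instant.parse("2016-12-20T11:00:00Z"), now, maxLag));
        // Data watermark is recent: it is reported unchanged.
        System.out.println(boundedWatermark(Instant.parse("2016-12-20T11:58:00Z"), now, maxLag));
    }
}
```

A user-supplied watermark function in PubsubIO would presumably receive the observed data watermark and return the value to report, with this clamp as one possible implementation.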
[jira] [Updated] (BEAM-430) Introducing gcpTempLocation that default to tempLocation
[ https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-430: --- Labels: backward-incompatible (was: ) > Introducing gcpTempLocation that default to tempLocation > > > Key: BEAM-430 > URL: https://issues.apache.org/jira/browse/BEAM-430 > Project: Beam > Issue Type: Improvement >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Labels: backward-incompatible > Fix For: 0.2.0-incubating > > > Currently, DataflowPipelineOptions.stagingLocation default to tempLocation. > And, it requires tempLocation to be a gcs path. > Another case is BigQueryIO uses tempLocation and also requires it to be on > gcs. > So, users cannot set tempLocation to a non-gcs path with DataflowRunner or > BigQueryIO. > However, tempLocation could be on any file system. For example, WordCount > defaults to output to tempLocation. > The proposal is to add gcpTempLocation. And, it defaults to tempLocation if > tempLocation is a gcs path. > StagingLocation and BigQueryIO will use gcpTempLocation by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
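The proposed defaulting rule can be sketched as plain logic. The helper name here is hypothetical (the real change would live in a PipelineOptions default-value factory): use gcpTempLocation when explicitly set, otherwise fall back to tempLocation only if it is a GCS path.

```java
// Hypothetical sketch of the proposed defaulting rule for gcpTempLocation:
// an explicit value wins; otherwise tempLocation is reused only when it is
// a gs:// path, since GCP services require a GCS location.
public class GcpTempLocationDefault {
    public static String resolveGcpTempLocation(String gcpTempLocation, String tempLocation) {
        if (gcpTempLocation != null) {
            return gcpTempLocation;
        }
        if (tempLocation != null && tempLocation.startsWith("gs://")) {
            return tempLocation;
        }
        throw new IllegalArgumentException(
            "gcpTempLocation is not set and tempLocation is not a GCS path: " + tempLocation);
    }

    public static void main(String[] args) {
        // tempLocation is a GCS path, so it can serve as the GCP temp location.
        System.out.println(resolveGcpTempLocation(null, "gs://my-bucket/tmp"));
        // An explicitly set gcpTempLocation wins even when tempLocation is local.
        System.out.println(resolveGcpTempLocation("gs://gcp/tmp", "/local/tmp"));
    }
}
```

With this rule, tempLocation is free to be any file system path, while StagingLocation and BigQueryIO consult the resolved GCP-specific value.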
[jira] [Closed] (BEAM-430) Introducing gcpTempLocation that default to tempLocation
[ https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik closed BEAM-430. -- > Introducing gcpTempLocation that default to tempLocation > > > Key: BEAM-430 > URL: https://issues.apache.org/jira/browse/BEAM-430 > Project: Beam > Issue Type: Improvement >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Labels: backward-incompatible > Fix For: 0.2.0-incubating > > > Currently, DataflowPipelineOptions.stagingLocation default to tempLocation. > And, it requires tempLocation to be a gcs path. > Another case is BigQueryIO uses tempLocation and also requires it to be on > gcs. > So, users cannot set tempLocation to a non-gcs path with DataflowRunner or > BigQueryIO. > However, tempLocation could be on any file system. For example, WordCount > defaults to output to tempLocation. > The proposal is to add gcpTempLocation. And, it defaults to tempLocation if > tempLocation is a gcs path. > StagingLocation and BigQueryIO will use gcpTempLocation by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-430) Introducing gcpTempLocation that default to tempLocation
[ https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-430. Resolution: Fixed > Introducing gcpTempLocation that default to tempLocation > > > Key: BEAM-430 > URL: https://issues.apache.org/jira/browse/BEAM-430 > Project: Beam > Issue Type: Improvement >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Labels: backward-incompatible > Fix For: 0.2.0-incubating > > > Currently, DataflowPipelineOptions.stagingLocation default to tempLocation. > And, it requires tempLocation to be a gcs path. > Another case is BigQueryIO uses tempLocation and also requires it to be on > gcs. > So, users cannot set tempLocation to a non-gcs path with DataflowRunner or > BigQueryIO. > However, tempLocation could be on any file system. For example, WordCount > defaults to output to tempLocation. > The proposal is to add gcpTempLocation. And, it defaults to tempLocation if > tempLocation is a gcs path. > StagingLocation and BigQueryIO will use gcpTempLocation by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (BEAM-430) Introducing gcpTempLocation that default to tempLocation
[ https://issues.apache.org/jira/browse/BEAM-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reopened BEAM-430: > Introducing gcpTempLocation that default to tempLocation > > > Key: BEAM-430 > URL: https://issues.apache.org/jira/browse/BEAM-430 > Project: Beam > Issue Type: Improvement >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Labels: backward-incompatible > Fix For: 0.2.0-incubating > > > Currently, DataflowPipelineOptions.stagingLocation default to tempLocation. > And, it requires tempLocation to be a gcs path. > Another case is BigQueryIO uses tempLocation and also requires it to be on > gcs. > So, users cannot set tempLocation to a non-gcs path with DataflowRunner or > BigQueryIO. > However, tempLocation could be on any file system. For example, WordCount > defaults to output to tempLocation. > The proposal is to add gcpTempLocation. And, it defaults to tempLocation if > tempLocation is a gcs path. > StagingLocation and BigQueryIO will use gcpTempLocation by default. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-1187) GCP Transport not performing timed backoff after connection failure
Luke Cwik created BEAM-1187: --- Summary: GCP Transport not performing timed backoff after connection failure Key: BEAM-1187 URL: https://issues.apache.org/jira/browse/BEAM-1187 Project: Beam Issue Type: Bug Components: runner-dataflow, sdk-java-core, sdk-java-gcp Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor HTTP requests that fail with a connection exception are seemingly being retried immediately, with no timed backoff. Note that all the timestamps below are the same, and also that we are logging too much. This seems to be related to the interaction of the chained HTTP request initializers, where the Credential initializer is combined with the RetryHttpRequestInitializer. Also, note that we never log "Request failed with IOException, will NOT retry", which implies that the retry logic never made it to the RetryHttpRequestInitializer. Action items are: 1) Ensure that the RetryHttpRequestInitializer is used 2) Ensure that calls do back off 3) Reduce the logging to one terminal statement saying that we retried X times and the final failure was YYY. Dump of console output: Dec 20, 2016 9:12:20 AM com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner fromOptions INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 1 files. Enable logging at DEBUG level to see which files will be staged. Dec 20, 2016 9:12:21 AM com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner run INFO: Executing pipeline on the Dataflow Service, which will have billing implications related to Google Compute Engine usage and other Google Cloud Services. Dec 20, 2016 9:12:21 AM com.google.cloud.dataflow.sdk.util.PackageUtil stageClasspathElements INFO: Uploading 1 files from PipelineOptions.filesToStage to staging location to prepare for execution. 
Dec 20, 2016 9:12:21 AM com.google.cloud.dataflow.sdk.util.PackageUtil stageClasspathElements INFO: Uploading PipelineOptions.filesToStage complete: 1 files newly uploaded, 0 files cached Dec 20, 2016 9:12:22 AM com.google.api.client.http.HttpRequest execute WARNING: exception thrown while executing request java.net.ConnectException: Connection refused at java.net.PlainSocketImpl.socketConnect(Native Method) at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) at java.net.Socket.connect(Socket.java:589) at sun.net.NetworkClient.doConnect(NetworkClient.java:175) at sun.net.www.http.HttpClient.openServer(HttpClient.java:432) at sun.net.www.http.HttpClient.openServer(HttpClient.java:527) at sun.net.www.http.HttpClient.(HttpClient.java:211) at sun.net.www.http.HttpClient.New(HttpClient.java:308) at sun.net.www.http.HttpClient.New(HttpClient.java:326) at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1169) at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1105) at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:999) at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:933) at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(HttpURLConnection.java:1283) at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(HttpURLConnection.java:1258) at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:77) at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419) at 
com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352) at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469) at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:632) at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:201) at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:181) at com.google.cloud.dataflow.integration.NumbersStreaming.numbersStreamingFromPubsub(NumbersStreaming.java:378) at com.google.cloud.dataflow.integration.NumbersStreaming.main(NumbersStreaming.java:831) Dec 20, 2016 9:12:22 AM com.google.api.client.http.HttpRequest execute WARNING: exception thrown while executing request
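The timed backoff requested in action item 2 amounts to the standard exponential-backoff pattern. Below is a minimal stand-alone sketch, not the actual RetryHttpRequestInitializer; it also emits a single terminal failure in the spirit of action item 3. All names are invented for illustration:

```java
import java.util.concurrent.Callable;

// Illustrative exponential backoff: each failed attempt sleeps for a doubling
// interval before retrying, and a single terminal exception summarizes the
// number of attempts instead of logging every failure.
public class BackoffRetry {
    public static <T> T retryWithBackoff(Callable<T> call, int maxAttempts, long initialSleepMillis)
            throws Exception {
        long sleep = initialSleepMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleep);  // timed backoff between attempts
                    sleep *= 2;           // double the wait each time
                }
            }
        }
        // One terminal statement: how many times we retried and the final failure.
        throw new Exception("Request failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated request that fails twice with a connection error, then succeeds.
        String result = retryWithBackoff(() -> {
            if (++calls[0] < 3) throw new java.net.ConnectException("Connection refused");
            return "ok";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The key symptom in the dump above (identical timestamps across retries) would disappear with any sleep between attempts, which is what the chained-initializer bug is preventing.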
[jira] [Commented] (BEAM-1176) Make our test suites use @Rule TestPipeline
[ https://issues.apache.org/jira/browse/BEAM-1176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15764778#comment-15764778 ] Luke Cwik commented on BEAM-1176: - I took a look at AvroIOGeneratedClassTest, ApproximateUniqueTest, SampleTest and BigtableIOTest. All of the multi-pipeline cases appeared to be just different variants of the same pipeline. If these tests were broken up into multiple tests, or better yet a set of parameterized tests (https://github.com/Pragmatists/junitparams), we could use the test rule. > Make our test suites use @Rule TestPipeline > --- > > Key: BEAM-1176 > URL: https://issues.apache.org/jira/browse/BEAM-1176 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Stas Levin >Priority: Minor > > Now that [~staslev] has made {{TestPipeline}} a JUnit rule that performs > useful sanity checks, we should port all of our tests to it so that they set > a good example for users. Maybe we'll even catch some straggling tests with > errors :-) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (BEAM-1005) Autogenerate example archetypes as part of build process
[ https://issues.apache.org/jira/browse/BEAM-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik closed BEAM-1005. --- Resolution: Duplicate Fix Version/s: Not applicable Duplicate of BEAM-1004 > Autogenerate example archetypes as part of build process > > > Key: BEAM-1005 > URL: https://issues.apache.org/jira/browse/BEAM-1005 > Project: Beam > Issue Type: Improvement > Components: sdk-java-extensions >Reporter: Kenneth Knowles > Fix For: Not applicable > > > Previously, the maven archetypes were manually curated. Recently, the > generation of the content for the example archetype was automated, and > another Java 8 example archetype created. The generated content is currently > checked into source control, but should be instead generated as part of the > build process. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-758) Per-step, per-execution nonce
[ https://issues.apache.org/jira/browse/BEAM-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736667#comment-15736667 ] Luke Cwik commented on BEAM-758: I would suggest building this into the new DoFn: add an annotation called @Nonce that can be added to methods, and it will automatically be populated. > Per-step, per-execution nonce > - > > Key: BEAM-758 > URL: https://issues.apache.org/jira/browse/BEAM-758 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Sam McVeety > > In the forthcoming runner API, a user will be able to save a pipeline to JSON > and then run it repeatedly. > Many pieces of code (e.g., BigQueryIO.Read or Write) rely on a single random > value (nonce). These values are typically generated at apply time, so that > they are deterministic (don't change across retries of DoFns) and global (are > the same across all workers). > However, once the runner API lands the existing code would result in the same > nonce being reused across jobs. Other possible solutions: > * Generate nonce in {{Create(1) | ParDo}} then use this as a side input. > Should work, as long as side inputs are actually checkpointed. But does not > work for {{BoundedSource}}. > * If a nonce is only needed for the lifetime of one bundle, can be generated > in {{startBundle}} and used in {{finishBundle}} [or {{tearDown}}]. > * Add some context somewhere that lets user code access unique step name, and > somehow generate a nonce consistently e.g. by hashing. Will usually work, but > this is similarly not available to sources. > Another Q: I'm not sure we have a good way to generate nonces in unbounded > pipelines -- we probably need one. This would enable us to, e.g., use > {{BigQueryIO.Write}} in an unbounded pipeline [if we had, e.g., exactly-once > triggering per window]. Or generalizing to multiple firings... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
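The third bullet in the issue (generate a nonce consistently by hashing the unique step name) can be sketched with a name-based UUID. The class, method, and parameter names here are hypothetical, chosen only to show the determinism property being discussed:

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Sketch of "hash the unique step name": derive a nonce deterministically from
// a per-execution id plus the step name, so every worker computes the same
// value without coordination, while re-running the same saved pipeline (a new
// execution id) yields a fresh nonce.
public class StepNonce {
    public static UUID nonceFor(String jobExecutionId, String stepName) {
        String key = jobExecutionId + "/" + stepName;
        return UUID.nameUUIDFromBytes(key.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Same execution and step on any worker: identical nonce.
        System.out.println(nonceFor("job-123", "BigQueryIO.Write")
            .equals(nonceFor("job-123", "BigQueryIO.Write")));
        // A new execution of the saved pipeline: a different nonce.
        System.out.println(nonceFor("job-456", "BigQueryIO.Write")
            .equals(nonceFor("job-123", "BigQueryIO.Write")));
    }
}
```

As the issue notes, this scheme depends on the runner exposing a per-execution id and unique step names, which is exactly what sources currently lack.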
[jira] [Commented] (BEAM-682) Invoker Class should be created in Thread Context Classloader
[ https://issues.apache.org/jira/browse/BEAM-682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15717358#comment-15717358 ] Luke Cwik commented on BEAM-682: ReflectHelpers exposes a method which figures out the correct class loader to use: https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/util/common/ReflectHelpers.java#L224 You can't assume the current threads context class loader is always available since it can be null. http://stackoverflow.com/questions/3459216/can-the-thread-context-class-loader-be-null > Invoker Class should be created in Thread Context Classloader > - > > Key: BEAM-682 > URL: https://issues.apache.org/jira/browse/BEAM-682 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 0.3.0-incubating >Reporter: Sumit Chawla >Assignee: Sumit Chawla > > As of now the InvokerClass is being loaded in wrong classloader. It should be > loaded into Thread.currentThread.getContextClassLoader() > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnInvokers.java#L167 > {code} > Class> res = > (Class>) > unloaded > .load(DoFnInvokers.class.getClassLoader(), > ClassLoadingStrategy.Default.INJECTION) > .getLoaded(); > {code} > Fix > {code} > Class> res = > (Class>) > unloaded > .load(Thread.currentThread().getContextClassLoader(), > ClassLoadingStrategy.Default.INJECTION) > .getLoaded(); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
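The fallback behavior Luke describes in ReflectHelpers can be sketched in a few lines: prefer the thread context class loader, but guard against it being null by falling back to the loader of a known class. This is a simplified illustration, not the actual ReflectHelpers code:

```java
// Minimal sketch of class loader selection: the thread context class loader is
// preferred when present, but it is allowed to be null, in which case we fall
// back to the loader that loaded a known class.
public class ClassLoaderChoice {
    public static ClassLoader findClassLoader(Class<?> fallbackOwner) {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        return contextLoader != null ? contextLoader : fallbackOwner.getClassLoader();
    }

    public static void main(String[] args) {
        ClassLoader original = Thread.currentThread().getContextClassLoader();
        try {
            // Simulate the null-context-loader case from the linked discussion.
            Thread.currentThread().setContextClassLoader(null);
            System.out.println(findClassLoader(ClassLoaderChoice.class)
                == ClassLoaderChoice.class.getClassLoader());
        } finally {
            Thread.currentThread().setContextClassLoader(original);
        }
    }
}
```

Using only `Thread.currentThread().getContextClassLoader()`, as the proposed fix in the issue does, would throw a NullPointerException (or fail to load) in exactly that null case.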
[jira] [Commented] (BEAM-1061) PreCommit test with side inputs
[ https://issues.apache.org/jira/browse/BEAM-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15707206#comment-15707206 ] Luke Cwik commented on BEAM-1061: - Is this not covered by the RunnableOnService tests found in ViewTest.java? > PreCommit test with side inputs > --- > > Key: BEAM-1061 > URL: https://issues.apache.org/jira/browse/BEAM-1061 > Project: Beam > Issue Type: Test > Components: testing >Reporter: Daniel Halperin >Assignee: Daniel Halperin > > We should have at least one precommit integration test that exercises side > inputs on all runners. Existing tests exercise sources, files, per-key, > combiners, windowing, ...; side inputs is one lacking part of the model it > would be nice to touch. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-1024) upgrade to protobuf-3.1.0
[ https://issues.apache.org/jira/browse/BEAM-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15684774#comment-15684774 ] Luke Cwik commented on BEAM-1024: - There was an upgrade to protobuf 3.0.0 in commit https://github.com/apache/incubator-beam/commit/f93ca9ce803a8847a7178ff0d7c5e1631bed8f2d for Apache Beam. Upgrading to 3.1.0 would require either shading protobuf everywhere or making sure that all our dependencies use protobuf 3.1.0 > upgrade to protobuf-3.1.0 > - > > Key: BEAM-1024 > URL: https://issues.apache.org/jira/browse/BEAM-1024 > Project: Beam > Issue Type: Wish >Reporter: Rafael Fernandez > > The SDK currently uses protobuf 3.0.0-beta-1. There are critical improvements > to the library since (such as JsonFormat.parser().ignoringUnknownFields()). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-950) DoFn Setup and Teardown methods should have access to PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654527#comment-15654527 ] Luke Cwik commented on BEAM-950: The primary use case is getting access to things like credentials/executor service. > DoFn Setup and Teardown methods should have access to PipelineOptions > - > > Key: BEAM-950 > URL: https://issues.apache.org/jira/browse/BEAM-950 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Thomas Groh >Assignee: Davor Bonaci > > This enables any options-relevant decisions to be made once per DoFn, without > having to lazily initialize in {{startBundle}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-626) AvroCoder not deserializing correctly in Kryo
[ https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-626. Resolution: Fixed Fix Version/s: 0.4.0-incubating > AvroCoder not deserializing correctly in Kryo > - > > Key: BEAM-626 > URL: https://issues.apache.org/jira/browse/BEAM-626 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Aviem Zur >Assignee: Aviem Zur >Priority: Minor > Fix For: 0.4.0-incubating > > > Unlike with Java serialization, when deserializing AvroCoder using Kryo, the > resulting AvroCoder is missing all of its transient fields. > The reason it works with Java serialization is because of the usage of > writeReplace and readResolve, which Kryo does not adhere to. > In ProtoCoder for example there are also unserializable members, the way it > is solved there is lazy initializing these members via their getters, so they > are initialized in the deserialized object on first call to the member. > It seems AvroCoder is the only class in Beam to use writeReplace convention. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
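The lazy-initialization workaround described for ProtoCoder can be illustrated generically: keep only serializable state, and rebuild transient members on first access through a getter, so a Kryo-style deserialization that leaves transients null still works. The class below is a hypothetical stand-in, not AvroCoder itself:

```java
import java.io.Serializable;

// Sketch of the lazy-init pattern: since Kryo ignores writeReplace/readResolve,
// transient fields arrive null after deserialization. Routing every access
// through a getter that rebuilds the field on demand makes the object
// self-healing regardless of which serializer produced it.
public class LazyTransient implements Serializable {
    private final String schemaString;                 // the serializable state
    private transient volatile StringBuilder decoded;  // rebuilt lazily; null after deserialization

    public LazyTransient(String schemaString) {
        this.schemaString = schemaString;
    }

    // Internal and external code must use this getter rather than touching the
    // field directly, so the rebuild happens exactly when it is needed.
    public StringBuilder getDecoded() {
        if (decoded == null) {
            decoded = new StringBuilder(schemaString);  // stands in for an expensive rebuild
        }
        return decoded;
    }
}
```

The same shape applies to AvroCoder's transient fields: dropping the writeReplace/readResolve convention in favor of lazy getters removes the dependence on Java serialization hooks.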
[jira] [Commented] (BEAM-939) New credentials code broke Dataflow runner
[ https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646506#comment-15646506 ] Luke Cwik commented on BEAM-939: Yes, did that in https://github.com/apache/incubator-beam/pull/1308 Unfortunately, the fact that we had to pass around the BigtableService instance for testing reasons made this more difficult than just deferring for PipelineOptions when its available in the few places we need the service. Added you and Thomas Groh for review. > New credentials code broke Dataflow runner > -- > > Key: BEAM-939 > URL: https://issues.apache.org/jira/browse/BEAM-939 > Project: Beam > Issue Type: New Feature > Components: sdk-java-gcp >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Luke Cwik >Priority: Minor > Fix For: 0.4.0-incubating > > > https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/ > {code} > java.lang.NoSuchMethodError: > com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials; > at > com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207) > at > com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112) > at > com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94) > at > com.google.cloud.bigtable.grpc.BigtableSession.(BigtableSession.java:272) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307) > at 
org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47) > at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-939) New credentials code broke Dataflow runner
[ https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-939: --- Priority: Minor (was: Major) > New credentials code broke Dataflow runner > -- > > Key: BEAM-939 > URL: https://issues.apache.org/jira/browse/BEAM-939 > Project: Beam > Issue Type: New Feature > Components: sdk-java-gcp >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Luke Cwik >Priority: Minor > Fix For: 0.4.0-incubating > > > https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/ > {code} > java.lang.NoSuchMethodError: > com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials; > at > com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207) > at > com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112) > at > com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94) > at > com.google.cloud.bigtable.grpc.BigtableSession.(BigtableSession.java:272) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307) > at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47) > at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-939) New credentials code broke Dataflow runner
[ https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646147#comment-15646147 ] Luke Cwik commented on BEAM-939: Turns out that BigtableIO was never using the user-specified credentials and was relying on the application default via com.google.auth.GoogleCredentials. Unfortunately, between 0.4.0 and 0.6.0, com.google.auth.oauth2.GoogleCredentials had a backwards-incompatible change (it removed the method com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials). It seems like the proper way to fix this is to pass the com.google.auth.Credentials object from pipeline options through. > New credentials code broke Dataflow runner > -- > > Key: BEAM-939 > URL: https://issues.apache.org/jira/browse/BEAM-939 > Project: Beam > Issue Type: New Feature > Components: sdk-java-gcp >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Luke Cwik > Fix For: 0.4.0-incubating > > > https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/ > {code} > java.lang.NoSuchMethodError: > com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials; > at > com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207) > at > com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112) > at > com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94) > at > com.google.cloud.bigtable.grpc.BigtableSession.(BigtableSession.java:272) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296) > at > 
org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307) > at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47) > at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-939) New credentials code broke Dataflow runner
[ https://issues.apache.org/jira/browse/BEAM-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15646116#comment-15646116 ] Luke Cwik commented on BEAM-939: Taking a look > New credentials code broke Dataflow runner > -- > > Key: BEAM-939 > URL: https://issues.apache.org/jira/browse/BEAM-939 > Project: Beam > Issue Type: New Feature > Components: sdk-java-gcp >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Luke Cwik > Fix For: 0.4.0-incubating > > > https://builds.apache.org/view/Beam/job/beam_PostCommit_MavenVerify/1753/ > {code} > java.lang.NoSuchMethodError: > com.google.auth.oauth2.GoogleCredentials.getApplicationDefault(Lcom/google/api/client/http/HttpTransport;)Lcom/google/auth/oauth2/GoogleCredentials; > at > com.google.cloud.bigtable.config.CredentialFactory.getApplicationDefaultCredential(CredentialFactory.java:207) > at > com.google.cloud.bigtable.config.CredentialFactory.getCredentials(CredentialFactory.java:112) > at > com.google.cloud.bigtable.grpc.io.CredentialInterceptorCache.getCredentialsInterceptor(CredentialInterceptorCache.java:94) > at > com.google.cloud.bigtable.grpc.BigtableSession.(BigtableSession.java:272) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableServiceImpl.tableExists(BigtableServiceImpl.java:81) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:296) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableIO$Read.validate(BigtableIO.java:185) > at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:399) > at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:307) > at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:47) > at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:158) > at > org.apache.beam.sdk.io.gcp.bigtable.BigtableReadIT.testE2EBigtableRead(BigtableReadIT.java:53) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at > org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
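A NoSuchMethodError like the one above typically indicates that two artifacts on the classpath were compiled against different versions of google-auth-library. A hedged sketch of one common mitigation, pinning a single version via Maven dependencyManagement (the version shown is a placeholder, not the actual fix for this issue):

```xml
<!-- Illustrative only: force one version of the auth library so that
     bigtable-client and the Beam GCP SDK resolve the same
     GoogleCredentials class. The version below is a placeholder. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>com.google.auth</groupId>
      <artifactId>google-auth-library-oauth2-http</artifactId>
      <version>0.6.0</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

Running `mvn dependency:tree` on the failing module is the usual way to confirm which versions are actually being resolved.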
[jira] [Resolved] (BEAM-725) Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow
[ https://issues.apache.org/jira/browse/BEAM-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-725. Resolution: Fixed Fix Version/s: 0.4.0-incubating > Remove legacy credentials flags related to GCP and adopt application default > credentials as only supported default flow > --- > > Key: BEAM-725 > URL: https://issues.apache.org/jira/browse/BEAM-725 > Project: Beam > Issue Type: Improvement > Components: sdk-java-gcp >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Minor > Labels: backward-incompatible > Fix For: 0.4.0-incubating > > > Drop the following GcpOptions and use ADC > (https://developers.google.com/identity/protocols/application-default-credentials) > to clean-up credentials story for GCP: > AuthorizationServerEncodedUrl > TokenServerUrl > CredentialDir > CredentialId > SecretsFile > ServiceAccountName > ServiceAccountKeyfile > Also migrate from Apiary Credentials class to Google OAuth Credentials class > when available from google-cloud-java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-582) Allow usage of the new GCP service account JSON key
[ https://issues.apache.org/jira/browse/BEAM-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-582. Resolution: Won't Fix Fix Version/s: Not applicable > Allow usage of the new GCP service account JSON key > --- > > Key: BEAM-582 > URL: https://issues.apache.org/jira/browse/BEAM-582 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Alex Van Boxel >Assignee: Davor Bonaci > Fix For: Not applicable > > > The new JSON service account files are a lot easier to use: you don't need to > provide the accountId (as it's embedded in the JSON files, including the > private key as well). > I noticed this while integrating Cloud DataFlow in Apache Airflow, where I > upgraded the usage of the service keys. Airflow will drop support for the old > service files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-582) Allow usage of the new GCP service account JSON key
[ https://issues.apache.org/jira/browse/BEAM-582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644918#comment-15644918 ] Luke Cwik commented on BEAM-582: This is being superseded by https://issues.apache.org/jira/browse/BEAM-725 > Allow usage of the new GCP service account JSON key > --- > > Key: BEAM-582 > URL: https://issues.apache.org/jira/browse/BEAM-582 > Project: Beam > Issue Type: Improvement > Components: runner-dataflow >Reporter: Alex Van Boxel >Assignee: Davor Bonaci > > The new JSON service account files are a lot easier to use: you don't need to > provide the accountId (as it's embedded in the JSON files, including the > private key as well). > I noticed this while integrating Cloud DataFlow in Apache Airflow, where I > upgraded the usage of the service keys. Airflow will drop support for the old > service files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-898) BigQueryTornadoes IT has invalid PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15637972#comment-15637972 ] Luke Cwik commented on BEAM-898: Still failing: https://builds.apache.org/job/beam_PostCommit_MavenVerify/org.apache.beam$beam-examples-java/1734/testReport/ Error Message Expected getter for property [output] to be marked with @Default on all [org.apache.beam.examples.WordCount$WordCountOptions, org.apache.beam.examples.cookbook.BigQueryTornadoes$Options], found only on [org.apache.beam.examples.WordCount$WordCountOptions] Stacktrace java.lang.IllegalArgumentException: Expected getter for property [output] to be marked with @Default on all [org.apache.beam.examples.WordCount$WordCountOptions, org.apache.beam.examples.cookbook.BigQueryTornadoes$Options], found only on [org.apache.beam.examples.WordCount$WordCountOptions] at org.apache.beam.sdk.options.PipelineOptionsFactory.throwForGettersWithInconsistentAnnotation(PipelineOptionsFactory.java:1309) at org.apache.beam.sdk.options.PipelineOptionsFactory.validateGettersHaveConsistentAnnotation(PipelineOptionsFactory.java:1150) at org.apache.beam.sdk.options.PipelineOptionsFactory.validateMethodAnnotations(PipelineOptionsFactory.java:1065) at org.apache.beam.sdk.options.PipelineOptionsFactory.validateClass(PipelineOptionsFactory.java:995) at org.apache.beam.sdk.options.PipelineOptionsFactory.validateWellFormed(PipelineOptionsFactory.java:627) at org.apache.beam.sdk.options.PipelineOptionsFactory.register(PipelineOptionsFactory.java:561) at org.apache.beam.examples.cookbook.BigQueryTornadoesIT.testE2EBigQueryTornadoes(BigQueryTornadoesIT.java:50) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.apache.maven.surefire.junitcore.pc.Scheduler$1.run(Scheduler.java:393) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) > BigQueryTornadoes IT has invalid PipelineOptions > > > Key: BEAM-898 > URL: https://issues.apache.org/jira/browse/BEAM-898 > Project: Beam > Issue Type: Bug > Components: sdk-java-gcp, testing >Reporter: Daniel Halperin >Assignee: Mark Liu > > https://builds.apache.org/job/beam_PostCommit_MavenVerify/1718/ > This PR: https://github.com/apache/incubator-beam/pull/1159 > checks that pipeline options cannot have multiple incompatible defaults. > BigQueryTornadoes ITs have a problem with how they register pipeline options. > Luke can give more details on fix. > cc [~pei...@gmail.com] [~lcwik] [~jasonkuster] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-790) Validate PipelineOptions Default annotation
[ https://issues.apache.org/jira/browse/BEAM-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-790: --- Priority: Minor (was: Major) > Validate PipelineOptions Default annotation > --- > > Key: BEAM-790 > URL: https://issues.apache.org/jira/browse/BEAM-790 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Fix For: 0.4.0-incubating > > > It shouldn't allow @Override combined with a @Default annotation; for example, the > following is broken: > interface A { > @Default.Integer(1) > Integer getFoo(); > void setFoo(); > } > interface B extends A { > @Default.Integer(-1) > @Override > Integer getFoo(); > } > It is broken because PipelineOptions default values are lazily evaluated, > and the result depends on which of the following two operations happens > first: > options.as(A.class) and options.as(B.class) > If users want to change the default value, they should call setFoo(...) > explicitly. > Adding a new Default annotation in an override shouldn't be allowed either. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-790) Validate PipelineOptions Default annotation
[ https://issues.apache.org/jira/browse/BEAM-790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-790. Resolution: Fixed Fix Version/s: 0.4.0-incubating > Validate PipelineOptions Default annotation > --- > > Key: BEAM-790 > URL: https://issues.apache.org/jira/browse/BEAM-790 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Pei He >Assignee: Pei He > Fix For: 0.4.0-incubating > > > It shouldn't allow @Override combined with a @Default annotation; for example, the > following is broken: > interface A { > @Default.Integer(1) > Integer getFoo(); > void setFoo(); > } > interface B extends A { > @Default.Integer(-1) > @Override > Integer getFoo(); > } > It is broken because PipelineOptions default values are lazily evaluated, > and the result depends on which of the following two operations happens > first: > options.as(A.class) and options.as(B.class) > If users want to change the default value, they should call setFoo(...) > explicitly. > Adding a new Default annotation in an override shouldn't be allowed either. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
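The order-dependence described in this issue can be reproduced outside Beam with a first-read-wins cache. The following is an illustrative sketch only, not Beam's actual PipelineOptions machinery: the annotation, interfaces, and cache are hypothetical stand-ins.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for @Default.Integer plus lazy, cached evaluation
// of option defaults: whichever view reads the property first wins, so a
// conflicting default on an overriding method is silently ignored.
public class LazyDefaultDemo {
    @Retention(RetentionPolicy.RUNTIME)
    @interface DefaultInt { int value(); }

    interface A { @DefaultInt(1) Integer getFoo(); }
    interface B extends A { @DefaultInt(-1) @Override Integer getFoo(); }

    // First access computes and caches the default; later views reuse it.
    static final Map<String, Integer> cache = new HashMap<>();

    public static int as(Class<?> view) {
        return cache.computeIfAbsent("foo", k -> {
            try {
                return view.getMethod("getFoo")
                           .getAnnotation(DefaultInt.class).value();
            } catch (NoSuchMethodException e) {
                throw new RuntimeException(e);
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(as(A.class)); // 1 -- A's default is cached first
        System.out.println(as(B.class)); // 1 -- B's -1 is never consulted
    }
}
```

Reversing the two calls would cache -1 instead, which is exactly the nondeterminism the validation added here rules out.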
[jira] [Updated] (BEAM-874) sdks/java/microbenchmarks instructions incorrect since benchmarks no longer run
[ https://issues.apache.org/jira/browse/BEAM-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-874: --- Description: microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which leads to this failure upon executing "java -jar target/microbenchmarks.jar": No matching benchmarks. Miss-spelled regexp? Use EXTRA verbose mode to debug the pattern matching. Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid usage of mvn install when possible. An alternate suggestion could be: mvn package -pl sdks/java/microbenchmarks -am was: microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which leads to this failure upon executing ```java -jar target/microbenchmarks.jar```: No matching benchmarks. Miss-spelled regexp? Use EXTRA verbose mode to debug the pattern matching. Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid usage of mvn install when possible. An alternate suggestion could be: mvn package -pl sdks/java/microbenchmarks -am > sdks/java/microbenchmarks instructions incorrect since benchmarks no longer > run > --- > > Key: BEAM-874 > URL: https://issues.apache.org/jira/browse/BEAM-874 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Davor Bonaci > > microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which > leads to this failure upon executing > "java -jar target/microbenchmarks.jar": > No matching benchmarks. Miss-spelled regexp? > Use EXTRA verbose mode to debug the pattern matching. > Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid > usage of mvn install when possible. An alternate suggestion could be: > mvn package -pl sdks/java/microbenchmarks -am -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-874) sdks/java/microbenchmarks instructions and execution no longer function
Luke Cwik created BEAM-874: -- Summary: sdks/java/microbenchmarks instructions and execution no longer function Key: BEAM-874 URL: https://issues.apache.org/jira/browse/BEAM-874 Project: Beam Issue Type: Bug Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which leads to this failure upon executing ```java -jar target/microbenchmarks.jar```: No matching benchmarks. Miss-spelled regexp? Use EXTRA verbose mode to debug the pattern matching. Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid usage of mvn install when possible. An alternate suggestion could be: mvn package -pl sdks/java/microbenchmarks -am -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-874) sdks/java/microbenchmarks instructions incorrect since benchmarks no longer run
[ https://issues.apache.org/jira/browse/BEAM-874?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-874: --- Summary: sdks/java/microbenchmarks instructions incorrect since benchmarks no longer run (was: sdks/java/microbenchmarks instructions and execution no longer function) > sdks/java/microbenchmarks instructions incorrect since benchmarks no longer > run > --- > > Key: BEAM-874 > URL: https://issues.apache.org/jira/browse/BEAM-874 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Davor Bonaci > > microbenchmarks.jar is built with an empty META-INF/BenchmarkList file which > leads to this failure upon executing > ```java -jar target/microbenchmarks.jar```: > No matching benchmarks. Miss-spelled regexp? > Use EXTRA verbose mode to debug the pattern matching. > Also, note that sdks/java/microbenchmarks/README.md should attempt to avoid > usage of mvn install when possible. An alternate suggestion could be: > mvn package -pl sdks/java/microbenchmarks -am -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-822) SDK build writes timestamp to source tree, causing spurious builds
[ https://issues.apache.org/jira/browse/BEAM-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-822: --- Priority: Minor (was: Major) > SDK build writes timestamp to source tree, causing spurious builds > -- > > Key: BEAM-822 > URL: https://issues.apache.org/jira/browse/BEAM-822 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Minor > Fix For: 0.4.0-incubating > > > The SDK build puts the build timestamp into {{sdk.properties}}. To have a > timestamp that does not break incremental build, the right place for it is in > the manifest of the built artifact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-822) SDK build writes timestamp to source tree, causing spurious builds
[ https://issues.apache.org/jira/browse/BEAM-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-822. Resolution: Fixed Fix Version/s: 0.4.0-incubating > SDK build writes timestamp to source tree, causing spurious builds > -- > > Key: BEAM-822 > URL: https://issues.apache.org/jira/browse/BEAM-822 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Minor > Fix For: 0.4.0-incubating > > > The SDK build puts the build timestamp into {{sdk.properties}}. To have a > timestamp that does not break incremental build, the right place for it is in > the manifest of the built artifact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-822) SDK build writes timestamp to source tree, causing spurious builds
[ https://issues.apache.org/jira/browse/BEAM-822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-822: --- Issue Type: Improvement (was: Bug) > SDK build writes timestamp to source tree, causing spurious builds > -- > > Key: BEAM-822 > URL: https://issues.apache.org/jira/browse/BEAM-822 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kenneth Knowles >Priority: Minor > Fix For: 0.4.0-incubating > > > The SDK build puts the build timestamp into {{sdk.properties}}. To have a > timestamp that does not break incremental build, the right place for it is in > the manifest of the built artifact. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-626) AvroCoder not deserializing correctly in Kryo
[ https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622892#comment-15622892 ] Luke Cwik commented on BEAM-626: I'm not for/against fixing AvroCoder to work with Kryo, just pointing out that the problem is that we lack a spec that says how things need to be serializable for portability reasons. Until we get the Beam Runner API [https://issues.apache.org/jira/browse/BEAM-115] up and running we will continue to run into these issues. Dataflow relied on Java serialization for DoFns, and Jackson for Coders. Spark relies on Kryo. Another runner may introduce yet another way of serializing DoFns/Coders/etc... Still looking at the PR. > AvroCoder not deserializing correctly in Kryo > - > > Key: BEAM-626 > URL: https://issues.apache.org/jira/browse/BEAM-626 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Aviem Zur >Assignee: Aviem Zur >Priority: Minor > > Unlike with Java serialization, when deserializing AvroCoder using Kryo, the > resulting AvroCoder is missing all of its transient fields. > The reason it works with Java serialization is because of the usage of > writeReplace and readResolve, which Kryo does not adhere to. > In ProtoCoder for example there are also unserializable members, the way it > is solved there is lazy initializing these members via their getters, so they > are initialized in the deserialized object on first call to the member. > It seems AvroCoder is the only class in Beam to use writeReplace convention. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
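The ProtoCoder-style workaround mentioned in the description, keeping unserializable members transient and rebuilding them lazily in a getter, can be sketched with plain Java serialization. The class and field names below are illustrative, not Beam's:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative sketch of the lazy-init pattern: the transient member is
// rebuilt on first use, so the object keeps working after deserialization
// even when the serializer (Kryo, Java, ...) ignores writeReplace/readResolve.
public class LazyCoder implements Serializable {
    private transient StringBuilder scratch; // dropped by every serializer

    private StringBuilder scratch() {
        if (scratch == null) {      // null after deserialization: rebuild
            scratch = new StringBuilder();
        }
        return scratch;
    }

    public String encode(String value) {
        StringBuilder sb = scratch();
        sb.setLength(0);
        return sb.append("v:").append(value).toString();
    }

    public static void main(String[] args) throws Exception {
        LazyCoder coder = new LazyCoder();
        coder.encode("x"); // initializes the transient field

        // Round-trip through Java serialization (a Kryo round-trip would
        // drop the transient field the same way).
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(coder);
        oos.flush();
        LazyCoder copy = (LazyCoder) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        // scratch is null in the copy, but the getter transparently recovers:
        System.out.println(copy.encode("y")); // v:y
    }
}
```

The pattern trades a null check per access for independence from any particular serialization framework's hooks.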
[jira] [Commented] (BEAM-626) AvroCoder not deserializing correctly in Kryo
[ https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622626#comment-15622626 ] Luke Cwik commented on BEAM-626: [~amitsela] I was referring to DoFn's/Coders/... that are being serialized via Kryo and not referring to the users data which is being encoded/decoded using Coders. > AvroCoder not deserializing correctly in Kryo > - > > Key: BEAM-626 > URL: https://issues.apache.org/jira/browse/BEAM-626 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Aviem Zur >Assignee: Aviem Zur >Priority: Minor > > Unlike with Java serialization, when deserializing AvroCoder using Kryo, the > resulting AvroCoder is missing all of its transient fields. > The reason it works with Java serialization is because of the usage of > writeReplace and readResolve, which Kryo does not adhere to. > In ProtoCoder for example there are also unserializable members, the way it > is solved there is lazy initializing these members via their getters, so they > are initialized in the deserialized object on first call to the member. > It seems AvroCoder is the only class in Beam to use writeReplace convention. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-398) JAXBCoder uses incorrect Double-Checked Locking
[ https://issues.apache.org/jira/browse/BEAM-398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15622594#comment-15622594 ] Luke Cwik commented on BEAM-398: Was merged here: https://github.com/apache/incubator-beam/commit/c29afb119be034b6b93083d9e8ec5542f13b4373 > JAXBCoder uses incorrect Double-Checked Locking > --- > > Key: BEAM-398 > URL: https://issues.apache.org/jira/browse/BEAM-398 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Thomas Groh >Priority: Minor > Labels: findbugs, newbie, starter > Fix For: 0.3.0-incubating > > > [FindBugs > DC_DOUBLECHECK|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L72]: > Possible double check of field > Applies to: > [JAXBCoder.getContext|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java#L113]. > For details on why this is incorrect, see: > http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html > This is a good starter bug. When fixing, please remove the corresponding > entries from > [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml] > and verify the build passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-398) JAXBCoder uses incorrect Double-Checked Locking
[ https://issues.apache.org/jira/browse/BEAM-398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-398. Resolution: Fixed Fix Version/s: 0.3.0-incubating > JAXBCoder uses incorrect Double-Checked Locking > --- > > Key: BEAM-398 > URL: https://issues.apache.org/jira/browse/BEAM-398 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Scott Wegner >Assignee: Thomas Groh >Priority: Minor > Labels: findbugs, newbie, starter > Fix For: 0.3.0-incubating > > > [FindBugs > DC_DOUBLECHECK|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml#L72]: > Possible double check of field > Applies to: > [JAXBCoder.getContext|https://github.com/apache/incubator-beam/blob/58a029a06aea1030279e5da8f9fa3114f456c1db/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/JAXBCoder.java#L113]. > For details on why this is incorrect, see: > http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html > This is a good starter bug. When fixing, please remove the corresponding > entries from > [findbugs-filter.xml|https://github.com/apache/incubator-beam/blob/master/sdks/java/build-tools/src/main/resources/beam/findbugs-filter.xml] > and verify the build passes. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
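For reference, the standard repair for the broken pattern flagged by FindBugs above is double-checked locking over a volatile field. The class below is an illustrative stand-in, not the actual JAXBCoder code:

```java
// Illustrative sketch of correct double-checked locking: the cached field
// must be volatile so a fully constructed Context is published safely to
// other threads, and the fast path performs only one volatile read.
public class LazyContextHolder {
    static class Context {
        final String name;
        Context(String name) { this.name = name; }
    }

    private volatile Context context;

    public Context getContext() {
        Context local = context;          // single volatile read
        if (local == null) {
            synchronized (this) {
                local = context;          // re-check under the lock
                if (local == null) {
                    context = local = new Context("jaxb");
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        LazyContextHolder holder = new LazyContextHolder();
        // Every call returns the same lazily created instance.
        System.out.println(holder.getContext() == holder.getContext()); // true
    }
}
```

Without `volatile`, a second thread can observe a non-null but partially constructed object, which is the hazard the FindBugs DC_DOUBLECHECK rule exists to catch.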
[jira] [Commented] (BEAM-626) AvroCoder not deserializing correctly in Kryo
[ https://issues.apache.org/jira/browse/BEAM-626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15617003#comment-15617003 ] Luke Cwik commented on BEAM-626: This will only solve the short-term problem that AvroCoder is not serializable via Kryo. Users who write their own objects that rely on readResolve will still have the same problem that you're facing with AvroCoder. They will need to do additional work to get their objects to work with the Spark runner. We'll need an official schema / serialization story for many of the objects used such as Coder/DoFn/... to be part of the Beam model for portability reasons, but until then it seems worthwhile to fix this. > AvroCoder not deserializing correctly in Kryo > - > > Key: BEAM-626 > URL: https://issues.apache.org/jira/browse/BEAM-626 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Aviem Zur >Assignee: Aviem Zur >Priority: Minor > > Unlike with Java serialization, when deserializing AvroCoder using Kryo, the > resulting AvroCoder is missing all of its transient fields. > The reason it works with Java serialization is because of the usage of > writeReplace and readResolve, which Kryo does not adhere to. > In ProtoCoder for example there are also unserializable members, the way it > is solved there is lazy initializing these members via their getters, so they > are initialized in the deserialized object on first call to the member. > It seems AvroCoder is the only class in Beam to use writeReplace convention. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-813) Support metadata in Avro sink
[ https://issues.apache.org/jira/browse/BEAM-813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-813. Resolution: Fixed Fix Version/s: 0.4.0-incubating > Support metadata in Avro sink > - > > Key: BEAM-813 > URL: https://issues.apache.org/jira/browse/BEAM-813 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Neville Li >Assignee: Neville Li >Priority: Minor > Fix For: 0.4.0-incubating > > > It'd be nice to support custom metadata in Avro files. This change is similar > to [BEAM-701]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-813) Support metadata in Avro sink
[ https://issues.apache.org/jira/browse/BEAM-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15610102#comment-15610102 ] Luke Cwik commented on BEAM-813: Merged to master here: https://github.com/apache/incubator-beam/commit/eba099f564dba3dfbba30ae3533496b9e14f57a7 > Support metadata in Avro sink > - > > Key: BEAM-813 > URL: https://issues.apache.org/jira/browse/BEAM-813 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Neville Li >Assignee: Neville Li >Priority: Minor > Fix For: 0.4.0-incubating > > > It'd be nice to support custom metadata in Avro files. This change is similar > to [BEAM-701]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-803) Maven configuration that easily launches examples IT tests on one specific runner
[ https://issues.apache.org/jira/browse/BEAM-803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15602292#comment-15602292 ] Luke Cwik commented on BEAM-803: Does it fail because the ServiceLoader files during bundling are being ignored and not merged? > Maven configuration that easily launches examples IT tests on one specific > runner > - > > Key: BEAM-803 > URL: https://issues.apache.org/jira/browse/BEAM-803 > Project: Beam > Issue Type: Wish >Reporter: Kenneth Knowles >Assignee: Jason Kuster >Priority: Minor > > Today, there is {{-Pjenkins-precommit}} that activates separate executions > for each of the runners, but no easy way to invoke just one of those > executions that I can discern. > The most promising command that I can come up with to run, for example, the > Flink wordcount integration test, is {{mvn > failsafe:integration-test@flink-runner-integration-tests -Pjenkins-precommit > -pl examples/java/}} but this fails due to runner registrar issues. Ideally, > this would be a fail-proof one-liner. > Any tips? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
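If the cause is indeed unmerged ServiceLoader files, the usual Maven-side remedy is the shade plugin's ServicesResourceTransformer, which concatenates META-INF/services entries so every runner's registrar stays visible in the bundled jar. A hedged sketch (this is the general mechanism, not necessarily the fix chosen for this issue):

```xml
<!-- Illustrative only: merge META-INF/services files during shading so
     that all PipelineRunner registrars survive bundling. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <transformers>
      <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
    </transformers>
  </configuration>
</plugin>
```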
[jira] [Commented] (BEAM-769) Spark streaming tests fail on "nothing processed" if runtime env. is slow because timeout is hit before processing is done.
[ https://issues.apache.org/jira/browse/BEAM-769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595461#comment-15595461 ] Luke Cwik commented on BEAM-769: I would prefer a bigger timeout over having flaky tests if we couldn't make it deterministic in some way. If a test never flakes, people won't have to look at it. > Spark streaming tests fail on "nothing processed" if runtime env. is slow > because timeout is hit before processing is done. > --- > > Key: BEAM-769 > URL: https://issues.apache.org/jira/browse/BEAM-769 > Project: Beam > Issue Type: Bug > Components: runner-spark >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Amit Sela > > https://builds.apache.org/job/beam_PostCommit_MavenVerify/1586/ > https://builds.apache.org/job/beam_PostCommit_MavenVerify/1587/ > https://builds.apache.org/job/beam_PostCommit_MavenVerify/1588/ > {code} > org.apache.beam.runners.spark.translation.streaming.FlattenStreamingTest.testFlattenUnbounded > org.apache.beam.runners.spark.translation.streaming.KafkaStreamingTest.testRun > org.apache.beam.runners.spark.translation.streaming.SimpleStreamingWordCountTest.testFixedWindows > {code} > The above tests use a hard timeout (ungraceful stop), so if the runtime env. > is slow enough that the batch is not done, it'll stop anyway and the > assertion will rightfully fail. > It's difficult to reproduce locally because I never had trouble on my laptop. > Since Jenkins will be slow from time to time, it is reasonable enough to have > a more robust solution here: > # don't use checkpoint (Spark) if not necessary - only really necessary for > one test in {{KafkaStreamingTest}} and {{ResumeFromCheckpointStreamingTest}} > I think. > # allow for graceful stop - will take longer for each test, but should allow > the test to finish even if runtime env. is slow. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-779) filesToStage should allow for common ways in which people package their resources
Luke Cwik created BEAM-779: -- Summary: filesToStage should allow for common ways in which people package their resources Key: BEAM-779 URL: https://issues.apache.org/jira/browse/BEAM-779 Project: Beam Issue Type: Improvement Components: runner-dataflow Reporter: Luke Cwik Assignee: Davor Bonaci Different application environments launch and maintain their classpath resources in various ways. See these SO questions for examples of how people launch their pipelines that currently are unsupported: http://stackoverflow.com/questions/31978566/detectclasspathresourcestostage-unable-to-convert-url http://stackoverflow.com/questions/40099952/launching-dataflow-jobs-from-a-java-application Add support for classpaths which: * use URLs that are embedded within other jars * allow for manifest files that specify their classpaths * add support for rsrc:// URIs to support Eclipse jar packaging -- This message was sent by Atlassian JIRA (v6.3.4#6332)
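One of the cases listed above, manifest files that specify their classpaths, boils down to reading the Class-Path attribute from a jar's MANIFEST.MF. A self-contained sketch (the jar contents and file names are made up for illustration) that builds a throwaway jar and reads the attribute back:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Illustrative sketch: one classpath shape that filesToStage detection
// would need to understand is the Class-Path attribute a jar declares
// in its manifest.
public class ManifestClasspath {
    public static String classPathOf(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            Manifest m = jf.getManifest();
            return m == null ? null
                    : m.getMainAttributes().getValue(Attributes.Name.CLASS_PATH);
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a manifest-only jar declaring two made-up dependencies.
        Manifest m = new Manifest();
        m.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        m.getMainAttributes().put(Attributes.Name.CLASS_PATH, "lib/a.jar lib/b.jar");

        Path jar = Files.createTempFile("demo", ".jar");
        try (JarOutputStream out =
                 new JarOutputStream(Files.newOutputStream(jar), m)) {
            // no entries needed; the manifest is what we read back
        }
        System.out.println(classPathOf(jar)); // lib/a.jar lib/b.jar
    }
}
```

A real implementation would additionally resolve the returned entries relative to the jar's own location, per the jar file specification.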
[jira] [Updated] (BEAM-739) Log full exception stack trace in WordCountIT and BigQueryTornadoesIT
[ https://issues.apache.org/jira/browse/BEAM-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-739: --- Fix Version/s: (was: Not applicable) 0.3.0-incubating > Log full exception stack trace in WordCountIT and BigQueryTornadoesIT > - > > Key: BEAM-739 > URL: https://issues.apache.org/jira/browse/BEAM-739 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Fix For: 0.3.0-incubating > > > When IT tests are broken, they don't provide the full stack trace, such as in: > https://issues.apache.org/jira/browse/BEAM-736 > It makes investigating root causes slower. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-763) BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid RunnableOnService test
[ https://issues.apache.org/jira/browse/BEAM-763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-763: --- Fix Version/s: (was: Not applicable) 0.3.0-incubating > BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid > RunnableOnService test > -- > > Key: BEAM-763 > URL: https://issues.apache.org/jira/browse/BEAM-763 > Project: Beam > Issue Type: Test >Reporter: Luke Cwik >Assignee: Pei He >Priority: Minor > Fix For: 0.3.0-incubating > > > TestPipeline.create(options) is not compatible with how TestPipeline > functions. This overrides the properties provided by the maven > surefire/failsafe profiles setup by the various runners for integration > testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-755) beam-runners-core-java NeedsRunner tests not executing
[ https://issues.apache.org/jira/browse/BEAM-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-755. Resolution: Fixed Fix Version/s: 0.3.0-incubating > beam-runners-core-java NeedsRunner tests not executing > -- > > Key: BEAM-755 > URL: https://issues.apache.org/jira/browse/BEAM-755 > Project: Beam > Issue Type: Bug > Components: runner-core >Reporter: Luke Cwik >Assignee: Kenneth Knowles >Priority: Minor > Fix For: 0.3.0-incubating > > > org.apache.beam:beam-runners-core-java is not specified as an integration > test dependency to scan within runners/pom.xml > There is also a typo in runners/direct-java/pom.xml where it reads > org.apache.beam:beam-runners-java-core but should be > org.apache.beam:beam-runners-core-java > Finally, even if these dependencies are added and the typo is fixed, > SplittableParDoTest, which contains @RunnableOnService tests (part of > runners/core-java), doesn't execute when running the runnable on service > integration tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-755) beam-runners-core-java NeedsRunner tests not executing
[ https://issues.apache.org/jira/browse/BEAM-755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-755: --- Priority: Minor (was: Major) > beam-runners-core-java NeedsRunner tests not executing > -- > > Key: BEAM-755 > URL: https://issues.apache.org/jira/browse/BEAM-755 > Project: Beam > Issue Type: Bug > Components: runner-core >Reporter: Luke Cwik >Assignee: Kenneth Knowles >Priority: Minor > Fix For: 0.3.0-incubating > > > org.apache.beam:beam-runners-core-java is not specified as an integration > test dependency to scan within runners/pom.xml > There is also a typo in runners/direct-java/pom.xml where it reads > org.apache.beam:beam-runners-java-core but should be > org.apache.beam:beam-runners-core-java > Finally, even if these dependencies are added and the typo is fixed, > SplittableParDoTest, which contains @RunnableOnService tests (part of > runners/core-java), doesn't execute when running the runnable on service > integration tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-756) Checkstyle suppression for JavadocPackage not working on Windows
[ https://issues.apache.org/jira/browse/BEAM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-756. Resolution: Fixed Fix Version/s: 0.3.0-incubating > Checkstyle suppression for JavadocPackage not working on Windows > > > Key: BEAM-756 > URL: https://issues.apache.org/jira/browse/BEAM-756 > Project: Beam > Issue Type: Bug >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Minor > Fix For: 0.3.0-incubating > > > Exclusions for test and other files don't consider '\' as separator. Hence > checkstyle complains about missing package-info files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-764) Remove cloneAs from PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586902#comment-15586902 ] Luke Cwik commented on BEAM-764: This was merged into master here: https://github.com/apache/incubator-beam/commit/71c69b31b6894064bf8111007f947150ff725528 > Remove cloneAs from PipelineOptions > --- > > Key: BEAM-764 > URL: https://issues.apache.org/jira/browse/BEAM-764 > Project: Beam > Issue Type: Task >Reporter: Pei He >Assignee: Pei He > Labels: codehealth > Fix For: 0.3.0-incubating > > > PipelineOptions.cloneAs was a workaround to support running multiple > pipelines in Dataflow examples for a streaming pipeline and its injector. > After the Beam examples refactoring, cloneAs is no longer needed. > cloneAs also has known issues, such as: JsonIgnore fields are not cloned and > users must set them manually. So, I am deleting it. > However, we should figure out a better API and implementation to support > running multiple pipelines with the same configurations (whether through > PipelineOptions or not). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-764) Remove cloneAs from PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-764: --- Priority: Minor (was: Major) > Remove cloneAs from PipelineOptions > --- > > Key: BEAM-764 > URL: https://issues.apache.org/jira/browse/BEAM-764 > Project: Beam > Issue Type: Task >Reporter: Pei He >Assignee: Pei He >Priority: Minor > Labels: codehealth > Fix For: 0.3.0-incubating > > > PipelineOptions.cloneAs was a workaround to support running multiple > pipelines in Dataflow examples for a streaming pipeline and its injector. > After the Beam examples refactoring, cloneAs is no longer needed. > cloneAs also has known issues, such as: JsonIgnore fields are not cloned and > users must set them manually. So, I am deleting it. > However, we should figure out a better API and implementation to support > running multiple pipelines with the same configurations (whether through > PipelineOptions or not). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-764) Remove cloneAs from PipelineOptions
[ https://issues.apache.org/jira/browse/BEAM-764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-764. Resolution: Fixed Fix Version/s: 0.3.0-incubating > Remove cloneAs from PipelineOptions > --- > > Key: BEAM-764 > URL: https://issues.apache.org/jira/browse/BEAM-764 > Project: Beam > Issue Type: Task >Reporter: Pei He >Assignee: Pei He > Labels: codehealth > Fix For: 0.3.0-incubating > > > PipelineOptions.cloneAs was a workaround to support running multiple > pipelines in Dataflow examples for a streaming pipeline and its injector. > After the Beam examples refactoring, cloneAs is no longer needed. > cloneAs also has known issues, such as: JsonIgnore fields are not cloned and > users must set them manually. So, I am deleting it. > However, we should figure out a better API and implementation to support > running multiple pipelines with the same configurations (whether through > PipelineOptions or not). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-763) BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid RunnableOnService test
Luke Cwik created BEAM-763: -- Summary: BigQueryIOTest.testBuildSourceWithTableAndSqlDialect is not a valid RunnableOnService test Key: BEAM-763 URL: https://issues.apache.org/jira/browse/BEAM-763 Project: Beam Issue Type: Test Reporter: Luke Cwik Assignee: Pei He Priority: Minor TestPipeline.create(options) is not compatible with how TestPipeline functions. This overrides the properties provided by the maven surefire/failsafe profiles set up by the various runners for integration testing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-761) SplittableParDoTest fails
[ https://issues.apache.org/jira/browse/BEAM-761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-761. Resolution: Fixed Assignee: Ben Chambers Fix Version/s: 0.3.0-incubating > SplittableParDoTest fails > - > > Key: BEAM-761 > URL: https://issues.apache.org/jira/browse/BEAM-761 > Project: Beam > Issue Type: Test >Reporter: Luke Cwik >Assignee: Ben Chambers >Priority: Minor > Fix For: 0.3.0-incubating > > > Coder propagation was missing in GBKIntoKeyedWorkItems > Fixed with https://github.com/apache/incubator-beam/pull/1117/files > Filed for completeness. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-761) SplittableParDoTest fails
Luke Cwik created BEAM-761: -- Summary: SplittableParDoTest fails Key: BEAM-761 URL: https://issues.apache.org/jira/browse/BEAM-761 Project: Beam Issue Type: Test Reporter: Luke Cwik Priority: Minor Coder propagation was missing in GBKIntoKeyedWorkItems Fixed with https://github.com/apache/incubator-beam/pull/1117/files Filed for completeness. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-760) Validation needs to exist that @NeedsRunner / @RunnableOnService tests execute
Luke Cwik created BEAM-760: -- Summary: Validation needs to exist that @NeedsRunner / @RunnableOnService tests execute Key: BEAM-760 URL: https://issues.apache.org/jira/browse/BEAM-760 Project: Beam Issue Type: Improvement Components: runner-core, runner-dataflow, runner-direct, runner-flink, runner-gearpump, runner-spark, sdk-java-core Reporter: Luke Cwik Assignee: Jason Kuster We lack validation that tests that were supposed to execute as part of pre/post commit actually executed. This is worrisome in an automated test environment since it's difficult to know whether all the tests that were supposed to run did run. Repro steps: checkout apache/master @ b8e6eea691b48e14c4e2c3e84609d750769e09ee mvn clean integration-test -T 1C -pl runners/direct-java -am Note that SplittableParDoTest, part of beam-runners-core-java, doesn't execute. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-756) Checkstyle suppression for JavadocPackage not working on Windows
[ https://issues.apache.org/jira/browse/BEAM-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-756: --- Priority: Minor (was: Major) > Checkstyle suppression for JavadocPackage not working on Windows > > > Key: BEAM-756 > URL: https://issues.apache.org/jira/browse/BEAM-756 > Project: Beam > Issue Type: Bug >Reporter: Thomas Weise >Assignee: Thomas Weise >Priority: Minor > > Exclusions for test and other files don't consider '\' as separator. Hence > checkstyle complains about missing package-info files. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-747) Text checksum verifier is not resilient to eventually consistent filesystems
[ https://issues.apache.org/jira/browse/BEAM-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582963#comment-15582963 ] Luke Cwik commented on BEAM-747: The number of shards is not deterministic without explicitly limiting it on the sink. Also, requiring support for limited parallelism increases the barrier to entry for this test for runners. Typically, if you get one filename in the YYY-of-ZZZ form, you can figure out all the remaining files by parsing out the bounds, since you then know exactly how many files exist and what they are named. > Text checksum verifier is not resilient to eventually consistent filesystems > > > Key: BEAM-747 > URL: https://issues.apache.org/jira/browse/BEAM-747 > Project: Beam > Issue Type: Bug > Components: testing >Affects Versions: Not applicable >Reporter: Daniel Halperin >Assignee: Mark Liu > > Example 1: > https://builds.apache.org/job/beam_PreCommit_MavenVerify/3934/org.apache.beam$beam-examples-java/console > Here it looks like we need to retry listing files, at least a little bit, if > none are found. 
They did show up: > {code} > gsutil ls > gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results\* > gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-0-of-3 > gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-1-of-3 > gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-13-12-37-02-467/output/results-2-of-3 > {code} > Example 2: > https://builds.apache.org/job/beam_PostCommit_MavenVerify/org.apache.beam$beam-examples-java/1525/testReport/junit/org.apache.beam.examples/WordCountIT/testE2EWordCount/ > Here it looks like we need to fill in the shard template if the filesystem > does not give us a consistent result: > {code} > Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher > readLines > INFO: [0 of 1] Read 162 lines from file: > gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-0-of-3 > Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher > readLines > INFO: [1 of 1] Read 144 lines from file: > gs://temp-storage-for-end-to-end-tests/WordCountIT-2016-10-14-00-25-55-609/output/results-2-of-3 > Oct 14, 2016 12:31:16 AM org.apache.beam.sdk.testing.FileChecksumMatcher > matchesSafely > INFO: Generated checksum for output data: > aec68948b2515e6ea35fd1ed7649c267a10a01e5 > {code} > We missed shard 1-of-3 and hence got the wrong checksum. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
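The comment's observation — that one YYY-of-ZZZ filename determines all the others — can be sketched as follows (class and method names are hypothetical, not Beam's verifier code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Given a single sharded output filename like "results-1-of-3", enumerates
// every shard name the sink must have produced, so a checksum verifier can
// detect when an eventually consistent listing missed a file.
class ShardNames {
    private static final Pattern SHARD = Pattern.compile("^(.*?)(\\d+)-of-(\\d+)$");

    static List<String> expectedShards(String oneShard) {
        Matcher m = SHARD.matcher(oneShard);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a sharded name: " + oneShard);
        }
        String prefix = m.group(1);                 // e.g. "results-"
        int total = Integer.parseInt(m.group(3));   // e.g. 3
        List<String> all = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            all.add(prefix + i + "-of-" + total);
        }
        return all;
    }
}
```

In the second example above, this check would have flagged that `results-1-of-3` was missing before computing a checksum over only two of the three shards.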
[jira] [Commented] (BEAM-758) Per-step, per-execution nonce
[ https://issues.apache.org/jira/browse/BEAM-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582798#comment-15582798 ] Luke Cwik commented on BEAM-758: Several runners have the concept of a job or pipeline id; a stable hash of that id could be used to generate the nonce. Currently we expose the job name within PipelineOptions; we could also expose the concept of a job id, populated by the runner and expected to uniquely identify the job with respect to that runner if the runner supports running multiple jobs at the same time. > Per-step, per-execution nonce > - > > Key: BEAM-758 > URL: https://issues.apache.org/jira/browse/BEAM-758 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Affects Versions: Not applicable >Reporter: Daniel Halperin > > In the forthcoming runner API, a user will be able to save a pipeline to JSON > and then run it repeatedly. > Many pieces of code (e.g., BigQueryIO.Read or Write) rely on a single random > value (nonce). These values are typically generated at apply time, so that > they are deterministic (don't change across retries of DoFns) and global (are > the same across all workers). > However, once the runner API lands, the existing code would result in the same > nonce being reused across jobs. Other possible solutions: > * Generate the nonce in {{Create(1) | ParDo}} then use this as a side input. > Should work, as long as side inputs are actually checkpointed. But does not > work for {{BoundedSource}}. > * If a nonce is only needed for the lifetime of one bundle, it can be generated > in {{startBundle}} and used in {{finishBundle}} [or {{tearDown}}]. > * Add some context somewhere that lets user code access the unique step name, and > somehow generate a nonce consistently e.g. by hashing. Will usually work, but > this is similarly not available to sources. > Another Q: I'm not sure we have a good way to generate nonces in unbounded > pipelines -- we probably need one. 
This would enable us to, e.g., use > {{BigQueryIO.Write}} in an unbounded pipeline [if we had, e.g., exactly-once > triggering per window]. Or generalizing to multiple firings... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
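The "stable hash of the job id" idea from the comment can be sketched in plain Java — this is an illustration of the approach, not Beam's actual API (the job id and step name here are assumed inputs the runner would supply):

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Derives a deterministic per-step, per-execution nonce from a runner-assigned
// job id and a unique step name. Because the derivation is a pure function of
// its inputs, retried DoFns and all workers compute the same value, while a
// new job (new job id) gets a fresh nonce.
class StepNonce {
    static UUID nonceFor(String jobId, String stepName) {
        byte[] seed = (jobId + "/" + stepName).getBytes(StandardCharsets.UTF_8);
        return UUID.nameUUIDFromBytes(seed); // name-based (deterministic) UUID
    }
}
```

Unlike an apply-time random value, this stays stable across retries of the same execution but changes when the saved pipeline is run again under a new job id.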
[jira] [Created] (BEAM-755) beam-runners-core-java RunnableOnService tests not executing
Luke Cwik created BEAM-755: -- Summary: beam-runners-core-java RunnableOnService tests not executing Key: BEAM-755 URL: https://issues.apache.org/jira/browse/BEAM-755 Project: Beam Issue Type: Bug Components: runner-core Reporter: Luke Cwik Assignee: Frances Perry org.apache.beam:beam-runners-core-java is not specified as an integration test dependency to scan within runners/pom.xml There is also a typo in runners/direct-java/pom.xml where it reads org.apache.beam:beam-runners-java-core but should be org.apache.beam:beam-runners-core-java Finally, even if these dependencies are added and the typo is fixed, SplittableParDoTest, which contains @RunnableOnService tests (part of runners/core-java), doesn't execute when running the runnable on service integration tests. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
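The shape of the pom.xml typo described above, as a sketch (the surrounding dependency management, version, and any test-jar classifier are omitted; the exact coordinates of the real fix should be checked against the merged change):

```xml
<!-- runners/direct-java/pom.xml: the artifactId was transposed. -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <!-- was (typo): beam-runners-java-core -->
  <artifactId>beam-runners-core-java</artifactId>
</dependency>
```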
[jira] [Resolved] (BEAM-736) BigQueryTornadoesIT broken, blocking nightly release.
[ https://issues.apache.org/jira/browse/BEAM-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-736. Resolution: Fixed Fix Version/s: 0.3.0-incubating > BigQueryTornadoesIT broken, blocking nightly release. > - > > Key: BEAM-736 > URL: https://issues.apache.org/jira/browse/BEAM-736 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Jason Kuster >Assignee: Pei He >Priority: Minor > Fix For: 0.3.0-incubating > > > Build break begins here: > https://builds.apache.org/job/beam_PostCommit_MavenVerify/1471/ > listing 3 potential culprit commits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-736) BigQueryTornadoesIT broken, blocking nightly release.
[ https://issues.apache.org/jira/browse/BEAM-736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-736: --- Priority: Minor (was: Major) > BigQueryTornadoesIT broken, blocking nightly release. > - > > Key: BEAM-736 > URL: https://issues.apache.org/jira/browse/BEAM-736 > Project: Beam > Issue Type: Bug > Components: testing >Reporter: Jason Kuster >Assignee: Pei He >Priority: Minor > > Build break begins here: > https://builds.apache.org/job/beam_PostCommit_MavenVerify/1471/ > listing 3 potential culprit commits. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-726) Standardize naming of PipelineResult objects
[ https://issues.apache.org/jira/browse/BEAM-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553364#comment-15553364 ] Luke Cwik commented on BEAM-726: Is PipelineResult the appropriate suffix? If I support a non-blocking mode then it's not really a result yet, which explains the choice of the Job suffix for Dataflow. > Standardize naming of PipelineResult objects > > > Key: BEAM-726 > URL: https://issues.apache.org/jira/browse/BEAM-726 > Project: Beam > Issue Type: Bug > Components: beam-model >Reporter: Ben Chambers >Assignee: Frances Perry >Priority: Minor > > Today: > PipelineResult is an interface returned by running a pipeline. > DataflowPipelineJob is the Dataflow implementation of that interface > FlinkRunnerResult is the Flink implementation > EvaluationContext is the Spark implementation > DirectPipelineResult is the DirectRunner implementation > Ideally, all the names would indicate that they are a PipelineResult, like > the DirectRunner does. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-725) Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow
[ https://issues.apache.org/jira/browse/BEAM-725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-725: --- Description: Drop the following GcpOptions and use ADC (https://developers.google.com/identity/protocols/application-default-credentials) to clean-up credentials story for GCP: AuthorizationServerEncodedUrl TokenServerUrl CredentialDir CredentialId SecretsFile ServiceAccountName ServiceAccountKeyfile Also migrate from Apiary Credentials class to Google OAuth Credentials class when available from google-cloud-java. was: Drop the following GcpOptions and use ADC to clean-up credentials story for GCP: AuthorizationServerEncodedUrl TokenServerUrl CredentialDir CredentialId SecretsFile ServiceAccountName ServiceAccountKeyfile Also migrate from Apiary Credentials class to Google OAuth Credentials class when available from google-cloud-java. > Remove legacy credentials flags related to GCP and adopt application default > credentials as only supported default flow > --- > > Key: BEAM-725 > URL: https://issues.apache.org/jira/browse/BEAM-725 > Project: Beam > Issue Type: Improvement > Components: sdk-java-gcp >Reporter: Luke Cwik >Assignee: Luke Cwik >Priority: Minor > > Drop the following GcpOptions and use ADC > (https://developers.google.com/identity/protocols/application-default-credentials) > to clean-up credentials story for GCP: > AuthorizationServerEncodedUrl > TokenServerUrl > CredentialDir > CredentialId > SecretsFile > ServiceAccountName > ServiceAccountKeyfile > Also migrate from Apiary Credentials class to Google OAuth Credentials class > when available from google-cloud-java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-725) Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow
Luke Cwik created BEAM-725: -- Summary: Remove legacy credentials flags related to GCP and adopt application default credentials as only supported default flow Key: BEAM-725 URL: https://issues.apache.org/jira/browse/BEAM-725 Project: Beam Issue Type: Improvement Components: sdk-java-gcp Reporter: Luke Cwik Assignee: Luke Cwik Priority: Minor Drop the following GcpOptions and use ADC to clean up the credentials story for GCP: AuthorizationServerEncodedUrl TokenServerUrl CredentialDir CredentialId SecretsFile ServiceAccountName ServiceAccountKeyfile Also migrate from the Apiary Credentials class to the Google OAuth Credentials class when available from google-cloud-java. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-716) Migrate JmsIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-716: -- Summary: Migrate JmsIO to use AutoValue to reduce boilerplate Key: BEAM-716 URL: https://issues.apache.org/jira/browse/BEAM-716 Project: Beam Issue Type: Improvement Components: sdk-java-extensions Reporter: Luke Cwik Assignee: James Malone Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
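For context on the boilerplate these AutoValue migrations remove, here is a hand-written sketch of what AutoValue generates for an immutable configuration holder — an immutable class plus builder, equals/hashCode, and toString. The class and field names are hypothetical, not JmsIO's actual ones; with AutoValue, everything below the constructor collapses into an abstract class with abstract accessors and an @AutoValue.Builder:

```java
import java.util.Objects;

// Hand-written equivalent of AutoValue-generated code (names hypothetical).
final class ConnectionConfig {
    private final String brokerUrl;
    private final int timeoutSeconds;

    private ConnectionConfig(String brokerUrl, int timeoutSeconds) {
        this.brokerUrl = brokerUrl;
        this.timeoutSeconds = timeoutSeconds;
    }

    String brokerUrl() { return brokerUrl; }
    int timeoutSeconds() { return timeoutSeconds; }

    // Beam IO transforms use toBuilder() to implement withXxx() methods
    // that return a modified copy without mutating the original.
    Builder toBuilder() {
        return new Builder().setBrokerUrl(brokerUrl).setTimeoutSeconds(timeoutSeconds);
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ConnectionConfig)) return false;
        ConnectionConfig that = (ConnectionConfig) o;
        return timeoutSeconds == that.timeoutSeconds && brokerUrl.equals(that.brokerUrl);
    }
    @Override public int hashCode() { return Objects.hash(brokerUrl, timeoutSeconds); }
    @Override public String toString() {
        return "ConnectionConfig{brokerUrl=" + brokerUrl
                + ", timeoutSeconds=" + timeoutSeconds + "}";
    }

    static final class Builder {
        private String brokerUrl;
        private int timeoutSeconds;
        Builder setBrokerUrl(String url) { this.brokerUrl = url; return this; }
        Builder setTimeoutSeconds(int t) { this.timeoutSeconds = t; return this; }
        ConnectionConfig build() {
            return new ConnectionConfig(Objects.requireNonNull(brokerUrl), timeoutSeconds);
        }
    }
}
```

The PR linked in these issues (https://github.com/apache/incubator-beam/pull/1054) shows the real before/after for a Beam IO.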
[jira] [Created] (BEAM-718) Migrate KinesisIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-718: -- Summary: Migrate KinesisIO to use AutoValue to reduce boilerplate Key: BEAM-718 URL: https://issues.apache.org/jira/browse/BEAM-718 Project: Beam Issue Type: Improvement Components: sdk-java-extensions Reporter: Luke Cwik Assignee: James Malone Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-717) Migrate KafkaIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-717: -- Summary: Migrate KafkaIO to use AutoValue to reduce boilerplate Key: BEAM-717 URL: https://issues.apache.org/jira/browse/BEAM-717 Project: Beam Issue Type: Improvement Components: sdk-java-extensions Reporter: Luke Cwik Assignee: James Malone Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-715) Migrate AvroHDFSFileSource/HDFSFileSource/HDFSFileSink to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-715: -- Summary: Migrate AvroHDFSFileSource/HDFSFileSource/HDFSFileSink to use AutoValue to reduce boilerplate Key: BEAM-715 URL: https://issues.apache.org/jira/browse/BEAM-715 Project: Beam Issue Type: Improvement Components: sdk-java-extensions Reporter: Luke Cwik Assignee: James Malone Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-714) Migrate DatastoreV1 to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-714: -- Summary: Migrate DatastoreV1 to use AutoValue to reduce boilerplate Key: BEAM-714 URL: https://issues.apache.org/jira/browse/BEAM-714 Project: Beam Issue Type: Improvement Components: sdk-java-gcp Reporter: Luke Cwik Assignee: Daniel Halperin Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-712) Migrate BigQueryIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-712: -- Summary: Migrate BigQueryIO to use AutoValue to reduce boilerplate Key: BEAM-712 URL: https://issues.apache.org/jira/browse/BEAM-712 Project: Beam Issue Type: Improvement Components: sdk-java-gcp Reporter: Luke Cwik Assignee: Daniel Halperin Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-713) Migrate BigTableIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-713: -- Summary: Migrate BigTableIO to use AutoValue to reduce boilerplate Key: BEAM-713 URL: https://issues.apache.org/jira/browse/BEAM-713 Project: Beam Issue Type: Improvement Components: sdk-java-gcp Reporter: Luke Cwik Assignee: Daniel Halperin Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-711) Migrate XmlSource/XmlSink to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-711: -- Summary: Migrate XmlSource/XmlSink to use AutoValue to reduce boilerplate Key: BEAM-711 URL: https://issues.apache.org/jira/browse/BEAM-711 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-710) Migrate Read/Write to use AutoValue to reduce boilerplate
[ https://issues.apache.org/jira/browse/BEAM-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-710: --- Summary: Migrate Read/Write to use AutoValue to reduce boilerplate (was: Migrate Read to use AutoValue to reduce boilerplate) > Migrate Read/Write to use AutoValue to reduce boilerplate > - > > Key: BEAM-710 > URL: https://issues.apache.org/jira/browse/BEAM-710 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Davor Bonaci >Priority: Minor > Labels: io, simple, starter > > Use the AutoValue functionality to reduce boilerplate. > See this PR for an example: > https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-709) Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate
[ https://issues.apache.org/jira/browse/BEAM-709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-709: --- Summary: Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate (was: Migrate CountingSource to use AutoValue to reduce boilerplate) > Migrate CountingSource/CountingInput to use AutoValue to reduce boilerplate > --- > > Key: BEAM-709 > URL: https://issues.apache.org/jira/browse/BEAM-709 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Davor Bonaci >Priority: Minor > Labels: io, simple, starter > > Use the AutoValue functionality to reduce boilerplate. > See this PR for an example: > https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-707) Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue to reduce boilerplate
[ https://issues.apache.org/jira/browse/BEAM-707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-707: --- Summary: Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue to reduce boilerplate (was: Migrate PubsubIO to use AutoValue to reduce boilerplate) > Migrate PubsubIO/PubsubUnboundedSource/PubsubUnboundedSink to use AutoValue > to reduce boilerplate > - > > Key: BEAM-707 > URL: https://issues.apache.org/jira/browse/BEAM-707 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Davor Bonaci >Priority: Minor > Labels: io, simple, starter > > Use the AutoValue functionality to reduce boilerplate. > See this PR for an example: > https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-710) Migrate Read to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-710: -- Summary: Migrate Read to use AutoValue to reduce boilerplate Key: BEAM-710 URL: https://issues.apache.org/jira/browse/BEAM-710 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-709) Migrate CountingSource to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-709: -- Summary: Migrate CountingSource to use AutoValue to reduce boilerplate Key: BEAM-709 URL: https://issues.apache.org/jira/browse/BEAM-709 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-708) Migrate BoundedReadFromUnboundedSource to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-708: -- Summary: Migrate BoundedReadFromUnboundedSource to use AutoValue to reduce boilerplate Key: BEAM-708 URL: https://issues.apache.org/jira/browse/BEAM-708 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-707) Migrate PubsubIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-707: -- Summary: Migrate PubsubIO to use AutoValue to reduce boilerplate Key: BEAM-707 URL: https://issues.apache.org/jira/browse/BEAM-707 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-706) Migrate TextIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-706: -- Summary: Migrate TextIO to use AutoValue to reduce boilerplate Key: BEAM-706 URL: https://issues.apache.org/jira/browse/BEAM-706 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-705) Migrate AvroIO to use AutoValue to reduce boilerplate
Luke Cwik created BEAM-705: -- Summary: Migrate AvroIO to use AutoValue to reduce boilerplate Key: BEAM-705 URL: https://issues.apache.org/jira/browse/BEAM-705 Project: Beam Issue Type: Improvement Components: sdk-java-core Reporter: Luke Cwik Assignee: Davor Bonaci Priority: Minor Use the AutoValue functionality to reduce boilerplate. See this PR for an example: https://github.com/apache/incubator-beam/pull/1054 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-695) DisplayData for PipelineOptions fails to correctly toString array types
[ https://issues.apache.org/jira/browse/BEAM-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-695. Resolution: Fixed Fix Version/s: 0.3.0-incubating > DisplayData for PipelineOptions fails to correctly toString array types > --- > > Key: BEAM-695 > URL: https://issues.apache.org/jira/browse/BEAM-695 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Scott Wegner >Priority: Minor > Fix For: 0.3.0-incubating > > > For array types in Java, toString produces an uninformative message like > [Ljava.lang.String;@fc258b1 > You need to check to see if it's an array type and call Arrays.toString(array). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-695) DisplayData for PipelineOptions fails to correctly toString array types
Luke Cwik created BEAM-695: -- Summary: DisplayData for PipelineOptions fails to correctly toString array types Key: BEAM-695 URL: https://issues.apache.org/jira/browse/BEAM-695 Project: Beam Issue Type: Bug Components: sdk-java-core Reporter: Luke Cwik Assignee: Scott Wegner Priority: Minor For array types in Java, toString produces an uninformative message like [Ljava.lang.String;@fc258b1 You need to check to see if it's an array type and call Arrays.toString(array). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
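The check-and-convert pattern the report asks for can be sketched as follows. This is an illustrative stand-in, not the actual DisplayData code; the `display` helper and class name are hypothetical.

```java
import java.util.Arrays;

class ArrayToString {
    // Render any value for display; arrays get element-wise formatting
    // rather than the default "[Ljava.lang.String;@fc258b1" identity string.
    static String display(Object value) {
        if (value == null) {
            return "null";
        }
        if (value.getClass().isArray()) {
            // Object[] covers all reference arrays; each primitive array
            // type needs its own Arrays.toString overload.
            if (value instanceof Object[]) {
                return Arrays.toString((Object[]) value);
            }
            if (value instanceof int[]) {
                return Arrays.toString((int[]) value);
            }
            if (value instanceof long[]) {
                return Arrays.toString((long[]) value);
            }
            if (value instanceof double[]) {
                return Arrays.toString((double[]) value);
            }
            // ... remaining primitive array types handled the same way
        }
        return value.toString();
    }

    public static void main(String[] args) {
        System.out.println(display(new String[] {"a", "b"})); // [a, b]
        System.out.println(display(new int[] {1, 2}));        // [1, 2]
        System.out.println(display(42));                      // 42
    }
}
```

`Arrays.deepToString` would additionally handle nested arrays, at the cost of recursing into every element.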
[jira] [Resolved] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-604. Resolution: Fixed Fix Version/s: 0.3.0-incubating > Use Watermark Check Streaming Job Finish in TestDataflowRunner > -- > > Key: BEAM-604 > URL: https://issues.apache.org/jira/browse/BEAM-604 > Project: Beam > Issue Type: Improvement >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Minor > Fix For: 0.3.0-incubating > > > Currently, a streaming job with bounded input can't be terminated automatically, > and TestDataflowRunner can't handle this case. We need to update > TestDataflowRunner so that streaming integration tests such as > WindowedWordCountIT can run with it. > Implementation: > Query the watermark of each step, wait until all watermarks are set to MAX, then > cancel the job. > Update: > As suggested by [~pei...@gmail.com], implement checkMaxWatermark in > DataflowPipelineJob#waitUntilFinish. Thus, all Dataflow streaming jobs with > bounded input will take advantage of this change and be canceled > automatically when their watermarks reach the max value. Dataflow runners also > stay simple, free from handling the batch and streaming cases separately. > Update: > The pipeline author should have control over whether and when a streaming job > is canceled. The test framework is a better place to auto-cancel a streaming test job > when certain conditions are met, rather than waitUntilFinish(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-679) Bigtable IO integration tests are failing
[ https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-679. Resolution: Fixed Assignee: Luke Cwik (was: Jean-Baptiste Onofré) Fix Version/s: 0.3.0-incubating > Bigtable IO integration tests are failing > - > > Key: BEAM-679 > URL: https://issues.apache.org/jira/browse/BEAM-679 > Project: Beam > Issue Type: Bug > Components: build-system, sdk-java-extensions >Reporter: Jean-Baptiste Onofré >Assignee: Luke Cwik >Priority: Critical > Fix For: 0.3.0-incubating > > > Bigtable ITests are failing with the following issue: > {code} > java.lang.NoClassDefFoundError: Could not initialize class > com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools > {code} > I'm investigating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-679) Bigtable IO integration tests are failing
[ https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15527323#comment-15527323 ] Luke Cwik commented on BEAM-679: It turned out that the worker images for Dataflow weren't publicly available, which caused them to get stuck. > Bigtable IO integration tests are failing > - > > Key: BEAM-679 > URL: https://issues.apache.org/jira/browse/BEAM-679 > Project: Beam > Issue Type: Bug > Components: build-system, sdk-java-extensions >Reporter: Jean-Baptiste Onofré >Assignee: Jean-Baptiste Onofré >Priority: Critical > > Bigtable ITests are failing with the following issue: > {code} > java.lang.NoClassDefFoundError: Could not initialize class > com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools > {code} > I'm investigating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-679) Bigtable IO integration tests are failing
[ https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526790#comment-15526790 ] Luke Cwik commented on BEAM-679: Got permissions; BigtableWriteIT passed. The next postcommit should validate these findings. > Bigtable IO integration tests are failing > - > > Key: BEAM-679 > URL: https://issues.apache.org/jira/browse/BEAM-679 > Project: Beam > Issue Type: Bug > Components: build-system, sdk-java-extensions >Reporter: Jean-Baptiste Onofré >Assignee: Jean-Baptiste Onofré >Priority: Critical > > Bigtable ITests are failing with the following issue: > {code} > java.lang.NoClassDefFoundError: Could not initialize class > com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools > {code} > I'm investigating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-679) Bigtable IO integration tests are failing
[ https://issues.apache.org/jira/browse/BEAM-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15526647#comment-15526647 ] Luke Cwik commented on BEAM-679: Trying to rerun the integration test locally to verify. > Bigtable IO integration tests are failing > - > > Key: BEAM-679 > URL: https://issues.apache.org/jira/browse/BEAM-679 > Project: Beam > Issue Type: Bug > Components: build-system, sdk-java-extensions >Reporter: Jean-Baptiste Onofré >Assignee: Jean-Baptiste Onofré >Priority: Critical > > Bigtable ITests are failing with the following issue: > {code} > java.lang.NoClassDefFoundError: Could not initialize class > com.google.cloud.bigtable.grpc.BigtableSessionSharedThreadPools > {code} > I'm investigating. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-672) Figure out TestPipeline.create(PipelineOptions) / TestPipeline.fromOptions(PipelineOptions) story
[ https://issues.apache.org/jira/browse/BEAM-672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-672: --- Component/s: sdk-java-core > Figure out TestPipeline.create(PipelineOptions) / > TestPipeline.fromOptions(PipelineOptions) story > - > > Key: BEAM-672 > URL: https://issues.apache.org/jira/browse/BEAM-672 > Project: Beam > Issue Type: Improvement > Components: sdk-java-core >Reporter: Luke Cwik >Priority: Minor > Labels: test > > TestPipeline integrates with the integration testing environment and relies > heavily on being able to be configured by the environment and executed on > many runners. > Tests which rely on mutating PipelineOptions before creating the TestPipeline > can easily get the integration wrong by creating PipelineOptions from > PipelineOptionsFactory and then calling either TestPipeline.create(options) > or TestPipeline.fromOptions(options), thus ignoring any integration > environment pipeline options specified. > We should fix the exposed methods on TestPipeline to prevent users from > making this simple mistake. > One suggestion is to create a TestPipeline builder which will give access to > a mutable PipelineOptions which the user can edit before calling build() > to create a TestPipeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-672) Figure out TestPipeline.create(PipelineOptions) / TestPipeline.fromOptions(PipelineOptions) story
Luke Cwik created BEAM-672: -- Summary: Figure out TestPipeline.create(PipelineOptions) / TestPipeline.fromOptions(PipelineOptions) story Key: BEAM-672 URL: https://issues.apache.org/jira/browse/BEAM-672 Project: Beam Issue Type: Improvement Reporter: Luke Cwik Priority: Minor TestPipeline integrates with the integration testing environment and relies heavily on being able to be configured by the environment and executed on many runners. Tests which rely on mutating PipelineOptions before creating the TestPipeline can easily get the integration wrong by creating PipelineOptions from PipelineOptionsFactory and then calling either TestPipeline.create(options) or TestPipeline.fromOptions(options), thus ignoring any integration environment pipeline options specified. We should fix the exposed methods on TestPipeline to prevent users from making this simple mistake. One suggestion is to create a TestPipeline builder which will give access to a mutable PipelineOptions which the user can edit before calling build() to create a TestPipeline. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
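The builder suggested at the end of the report could look roughly like the sketch below. Everything here is hypothetical: plain-Java stand-ins are used for PipelineOptions and TestPipeline, since the point is only the shape of the API, namely that the builder starts from the environment-provided options rather than a fresh factory instance.

```java
import java.util.HashMap;
import java.util.Map;

class TestPipelineBuilderSketch {
    // Stand-in for PipelineOptions: a mutable bag of settings.
    static class Options {
        final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    // Hypothetical builder: seeded with the integration environment's
    // options, so tests can tweak them without discarding environment
    // settings the way fromOptions(freshly built options) does.
    static class Builder {
        private final Options options;
        Builder(Options environmentDefaults) { this.options = environmentDefaults; }
        Options options() { return options; } // mutable; edit before build()
        Options build() { return options; }   // would return a TestPipeline
    }

    public static void main(String[] args) {
        Options env = new Options();
        env.set("runner", "TestRunner"); // supplied by the test environment
        Builder b = new Builder(env);
        b.options().set("tempLocation", "/tmp/test"); // test-specific tweak
        Options built = b.build();
        // The environment setting survives alongside the tweak.
        System.out.println(built.get("runner"));       // TestRunner
        System.out.println(built.get("tempLocation")); // /tmp/test
    }
}
```

The key design choice is that tests never construct options from scratch; they only mutate what the environment handed them.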
[jira] [Updated] (BEAM-670) BigQuery TableRow inserter incorrectly handles nextBackOff millis
[ https://issues.apache.org/jira/browse/BEAM-670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-670: --- Summary: BigQuery TableRow inserter incorrectly handles nextBackOff millis (was: FluentBackoff incorrectly handles nextBackOff millis) > BigQuery TableRow inserter incorrectly handles nextBackOff millis > - > > Key: BEAM-670 > URL: https://issues.apache.org/jira/browse/BEAM-670 > Project: Beam > Issue Type: Bug > Components: sdk-java-gcp >Reporter: Luke Cwik >Assignee: Daniel Halperin >Priority: Minor > > From: > https://github.com/GoogleCloudPlatform/DataflowJavaSDK/commit/94d57207924ed8650cf3c97fccb2a45f27bcc6a3#commitcomment-19135952 > Also present in: > https://github.com/apache/incubator-beam/pull/888/files#diff-f6d45f28c12083c9556bb410bde8b109R614 > The check is inverted; it should be nextBackoffMillis != BackOff.STOP. > Otherwise Thread.sleep() is called with value -1, which causes an > IllegalArgumentException to be thrown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-670) FluentBackoff incorrectly handles nextBackOff millis
Luke Cwik created BEAM-670: -- Summary: FluentBackoff incorrectly handles nextBackOff millis Key: BEAM-670 URL: https://issues.apache.org/jira/browse/BEAM-670 Project: Beam Issue Type: Bug Components: sdk-java-gcp Reporter: Luke Cwik Assignee: Daniel Halperin Priority: Minor From: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/commit/94d57207924ed8650cf3c97fccb2a45f27bcc6a3#commitcomment-19135952 Also present in: https://github.com/apache/incubator-beam/pull/888/files#diff-f6d45f28c12083c9556bb410bde8b109R614 The check is inverted; it should be nextBackoffMillis != BackOff.STOP. Otherwise Thread.sleep() is called with value -1, which causes an IllegalArgumentException to be thrown. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
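The inverted check can be illustrated with a minimal retry loop. The BackOff interface below is a stripped-down stand-in for the SDK's, not the real one; the point is that the loop must continue only while the backoff has NOT returned STOP.

```java
class BackoffLoop {
    // Stand-in for the SDK's BackOff: returns millis to sleep, or STOP.
    interface BackOff {
        long STOP = -1L;
        long nextBackOffMillis();
    }

    // Correct form of the check. With it inverted (== STOP), the loop body
    // would run with nextBackoffMillis == -1, and Thread.sleep(-1) throws
    // IllegalArgumentException.
    static int runWithRetries(BackOff backoff, int maxAttempts) throws InterruptedException {
        int tries = 0;
        long nextBackoffMillis;
        while (tries < maxAttempts
                && (nextBackoffMillis = backoff.nextBackOffMillis()) != BackOff.STOP) {
            Thread.sleep(nextBackoffMillis);
            tries++;
        }
        return tries;
    }

    public static void main(String[] args) throws InterruptedException {
        // A backoff that allows two quick retries, then stops.
        final int[] calls = {0};
        BackOff twoTries = () -> calls[0]++ < 2 ? 1L : BackOff.STOP;
        System.out.println(runWithRetries(twoTries, 10)); // 2
    }
}
```

Treating STOP as a sentinel rather than a sleepable duration is the whole fix; everything else in the inserter's loop stays the same.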
[jira] [Reopened] (BEAM-661) CalendarWindows#isCompatibleWith should use equals instead of ==
[ https://issues.apache.org/jira/browse/BEAM-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reopened BEAM-661: Assignee: (was: Davor Bonaci) > CalendarWindows#isCompatibleWith should use equals instead of == > > > Key: BEAM-661 > URL: https://issues.apache.org/jira/browse/BEAM-661 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Ben Chambers >Priority: Minor > Fix For: Not applicable > > > http://stackoverflow.com/questions/39617897/inputs-to-flatten-had-incompatible-window-windowfns-when-cogroupbykey-with-calen > We're using `==` instead of `.equals` to compare objects, which causes > equivalent CalendarWindows to be incompatible. > https://github.com/apache/incubator-beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/windowing/CalendarWindows.java#L143 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-660) CalendarWindows compares DateTimes with ==
[ https://issues.apache.org/jira/browse/BEAM-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-660. Resolution: Duplicate Fix Version/s: Not applicable Duplicate of https://issues.apache.org/jira/browse/BEAM-661 > CalendarWindows compares DateTimes with == > -- > > Key: BEAM-660 > URL: https://issues.apache.org/jira/browse/BEAM-660 > Project: Beam > Issue Type: Bug >Reporter: Daniel Mills >Priority: Minor > Fix For: Not applicable > > > CalendarWindows compares DateTime objects with ==, which causes compatible > WindowFns to not be considered compatible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
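The == vs. equals pitfall behind BEAM-660/661 is easy to reproduce with any reference type. Strings are used below instead of Joda DateTime purely to keep the example self-contained.

```java
class ReferenceEquality {
    public static void main(String[] args) {
        // Two logically equal objects that are distinct instances,
        // as when two WindowFns are configured with equal DateTimes.
        String a = new String("2016-01-01");
        String b = new String("2016-01-01");

        // == compares references, so equal values look "incompatible".
        System.out.println(a == b);      // false
        // equals compares values, which is what isCompatibleWith needs.
        System.out.println(a.equals(b)); // true
    }
}
```

The CalendarWindows fix is exactly this substitution at the linked comparison site.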
[jira] [Resolved] (BEAM-414) IntraBundleParallelization needs to be removed
[ https://issues.apache.org/jira/browse/BEAM-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-414. Resolution: Fixed Fix Version/s: 0.3.0-incubating > IntraBundleParallelization needs to be removed > -- > > Key: BEAM-414 > URL: https://issues.apache.org/jira/browse/BEAM-414 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Pei He >Priority: Minor > Labels: newbie, starter > Fix For: 0.3.0-incubating > > > IntraBundleParallelization needs to be removed because it does not work: > it breaks bundle processing semantics by expecting that context information > is not mutated by the runner between processing elements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-414) IntraBundleParallelization needs to be removed
[ https://issues.apache.org/jira/browse/BEAM-414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15490774#comment-15490774 ] Luke Cwik commented on BEAM-414: https://github.com/apache/incubator-beam/pull/957 > IntraBundleParallelization needs to be removed > -- > > Key: BEAM-414 > URL: https://issues.apache.org/jira/browse/BEAM-414 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Pei He >Priority: Minor > Labels: newbie, starter > > IntraBundleParallelization needs to be removed because it does not work: > it breaks bundle processing semantics by expecting that context information > is not mutated by the runner between processing elements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (BEAM-414) IntraBundleParallelization needs to be removed
[ https://issues.apache.org/jira/browse/BEAM-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik reassigned BEAM-414: -- Assignee: Pei He (was: Luke Cwik) > IntraBundleParallelization needs to be removed > -- > > Key: BEAM-414 > URL: https://issues.apache.org/jira/browse/BEAM-414 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Luke Cwik >Assignee: Pei He >Priority: Minor > Labels: newbie, starter > > IntraBundleParallelization needs to be removed because it does not work: > it breaks bundle processing semantics by expecting that context information > is not mutated by the runner between processing elements. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (BEAM-544) Add header/footer support to TextIO.Write
[ https://issues.apache.org/jira/browse/BEAM-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik resolved BEAM-544. Resolution: Fixed Fix Version/s: 0.3.0-incubating > Add header/footer support to TextIO.Write > - > > Key: BEAM-544 > URL: https://issues.apache.org/jira/browse/BEAM-544 > Project: Beam > Issue Type: New Feature > Components: sdk-java-extensions >Reporter: Luke Cwik >Assignee: Stas Levin >Priority: Minor > Fix For: 0.3.0-incubating > > > Being able to add a header/footer to each file that is written via TextIO > would cover several simple text file format issues. > Original ask: > https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/360 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-544) Add header/footer support to TextIO.Write
[ https://issues.apache.org/jira/browse/BEAM-544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-544: --- Assignee: Stas Levin > Add header/footer support to TextIO.Write > - > > Key: BEAM-544 > URL: https://issues.apache.org/jira/browse/BEAM-544 > Project: Beam > Issue Type: New Feature > Components: sdk-java-extensions >Reporter: Luke Cwik >Assignee: Stas Levin >Priority: Minor > > Being able to add a header/footer to each file that is written via TextIO > would cover several simple text file format issues. > Original ask: > https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/360 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (BEAM-611) Add support for MapValues
Luke Cwik created BEAM-611: -- Summary: Add support for MapValues Key: BEAM-611 URL: https://issues.apache.org/jira/browse/BEAM-611 Project: Beam Issue Type: Improvement Components: sdk-ideas Reporter: Luke Cwik Priority: Minor Filed from: https://github.com/GoogleCloudPlatform/DataflowJavaSDK/issues/412 Often I find myself needing to simply map a function over just the values of a key-valued PCollection. MapElements works for this, but suffers a small hit in readability (imho) and introduces some possibility for error. I wanted to see if there is any bandwidth / interest in adding this as a standard transform to the SDK. If so, I have attached a gist with a basic spike I have been using in my flows: https://gist.github.com/trentonstrong/8b60933dca545eb2138b72899195019e -- This message was sent by Atlassian JIRA (v6.3.4#6332)
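The requested transform amounts to mapping a function over the value side of key-value pairs while leaving keys untouched. A plain-Java sketch of the idea follows (the linked gist's Beam version would wrap this logic in a PTransform over KV elements; the names here are illustrative):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

class MapValuesSketch {
    // Apply fn to every value, leaving keys untouched -- the readability
    // win over spelling out the key plumbing at each MapElements call site.
    static <K, V, W> Map<K, W> mapValues(Map<K, V> input, Function<V, W> fn) {
        Map<K, W> out = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : input.entrySet()) {
            out.put(e.getKey(), fn.apply(e.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("a", 1);
        counts.put("b", 2);
        System.out.println(mapValues(counts, v -> v * 10)); // {a=10, b=20}
    }
}
```

Because keys never enter the user function, this shape also rules out the class of bugs where a MapElements lambda accidentally rewrites or drops the key.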
[jira] [Updated] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-604: --- Issue Type: Improvement (was: Bug) > Use Watermark Check Streaming Job Finish in TestDataflowRunner > --- > > Key: BEAM-604 > URL: https://issues.apache.org/jira/browse/BEAM-604 > Project: Beam > Issue Type: Improvement >Reporter: Mark Liu >Assignee: Mark Liu > > Currently, a streaming job with bounded input can't be terminated automatically, > and TestDataflowRunner can't handle this case. We need to update > TestDataflowRunner so that streaming integration tests such as > WindowedWordCountIT can run with it. > Implementation: > Query the watermark of each step, wait until all watermarks are set to MAX, then > cancel the job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luke Cwik updated BEAM-604: --- Priority: Minor (was: Major) > Use Watermark Check Streaming Job Finish in TestDataflowRunner > --- > > Key: BEAM-604 > URL: https://issues.apache.org/jira/browse/BEAM-604 > Project: Beam > Issue Type: Improvement >Reporter: Mark Liu >Assignee: Mark Liu >Priority: Minor > > Currently, a streaming job with bounded input can't be terminated automatically, > and TestDataflowRunner can't handle this case. We need to update > TestDataflowRunner so that streaming integration tests such as > WindowedWordCountIT can run with it. > Implementation: > Query the watermark of each step, wait until all watermarks are set to MAX, then > cancel the job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-604) Use Watermark Check Streaming Job Finish in TestDataflowRunner
[ https://issues.apache.org/jira/browse/BEAM-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15449366#comment-15449366 ] Luke Cwik commented on BEAM-604: It would be better if the Dataflow service just shut down jobs that hit the max watermark automatically. > Use Watermark Check Streaming Job Finish in TestDataflowRunner > --- > > Key: BEAM-604 > URL: https://issues.apache.org/jira/browse/BEAM-604 > Project: Beam > Issue Type: Bug >Reporter: Mark Liu >Assignee: Mark Liu > > Currently, a streaming job with bounded input can't be terminated automatically, > and TestDataflowRunner can't handle this case. We need to update > TestDataflowRunner so that streaming integration tests such as > WindowedWordCountIT can run with it. > Implementation: > Query the watermark of each step, wait until all watermarks are set to MAX, then > cancel the job. -- This message was sent by Atlassian JIRA (v6.3.4#6332)