[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16041124#comment-16041124 ]

Colin Bookman commented on BEAM-2418:
-------------------------------------

This particular issue was fixed by using the shadowJar plugin to merge the service files, and by upgrading Gradle from version 1.4 to 3.5.

https://gist.github.com/cobookman/e30268b4cfa8d0cebbd1e4ae8ef848f0

Thanks for the help.

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Luke Cwik
>            Priority: Blocker
>             Fix For: Not applicable
>
>
> We have user reports that DatastoreIO does not work when they try to use it.
> We believe this is a result of our effort to minimize our dependencies in the
> core SDK (protobuf in this case). ProtoCoder is not registered by default, so
> a user would need to explicitly include 'beam-sdks-java-extensions-protobuf' in
> their Maven dependencies to get it.
> We need to confirm it, but if so, we will probably need to fix this in the
> next release to have ProtoCoder when using DatastoreIO.
> cc [~vikasrk]

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
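What "merging service files" amounts to can be sketched in plain Java. This is only an illustration of the idea, not the Shadow plugin's implementation: the class name `ServiceFileMerger`, the `merge` method, and the provider names are all hypothetical. It assumes each same-named `META-INF/services/...` descriptor has already been read into a string, and it combines them into one list of distinct provider class names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: merges the contents of same-named service descriptor files. */
final class ServiceFileMerger {

    /** Returns the distinct provider class names from all descriptors, in first-seen order. */
    static List<String> merge(List<String> descriptorContents) {
        Set<String> providers = new LinkedHashSet<>();
        for (String content : descriptorContents) {
            for (String line : content.split("\\R")) {
                // Service descriptors allow '#' comments; drop them and surrounding whitespace.
                int hash = line.indexOf('#');
                String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
                if (!name.isEmpty()) {
                    providers.add(name);
                }
            }
        }
        return new ArrayList<>(providers);
    }

    public static void main(String[] args) {
        // Hypothetical provider names standing in for the contents of three descriptor files.
        List<String> merged = merge(Arrays.asList(
            "com.example.FirstRegistrar\n",
            "# a comment line\ncom.example.SecondRegistrar\ncom.example.ThirdRegistrar\n",
            "com.example.FirstRegistrar\n"));
        merged.forEach(System.out::println);
    }
}
```

Writing the merged list back as a single descriptor, one class name per line, gives `java.util.ServiceLoader` one file that names every provider.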
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040457#comment-16040457 ]

Luke Cwik commented on BEAM-2418:
---------------------------------

The way in which you are building your jar file is broken since it includes multiple copies of the same CoderProviderRegistrar file. Java does not understand jar files which have multiple copies of the same file within them. In this specific case the three *META-INF/services/org.apache.beam.sdk.coders.CoderProviderRegistrar* files should have been concatenated together.

{code}
lcwik@lcwik0:~/beam2418$ jar tvf dataflow-teleport-1.0-Alpha.jar | grep META-INF/services/org.apache.beam.sdk.coders.CoderProviderRegistrar
    71 Fri May 12 17:03:24 PDT 2017 META-INF/services/org.apache.beam.sdk.coders.CoderProviderRegistrar
   150 Fri May 12 16:56:14 PDT 2017 META-INF/services/org.apache.beam.sdk.coders.CoderProviderRegistrar
   130 Fri May 12 17:03:38 PDT 2017 META-INF/services/org.apache.beam.sdk.coders.CoderProviderRegistrar
{code}

The culprit seems to be that the build file in your project is incorrectly assembling the *uber* jar:

{code}
task uberjar(type: Jar) {
    from files(sourceSets.main.output.classesDir)
    from {configurations.compile.collect {zipTree(it)}} {
        exclude "META-INF/*.SF"
        exclude "META-INF/*.DSA"
        exclude "META-INF/*.RSA"
    }
    manifest {
        attributes 'Main-Class': mainClassName
    }
}
{code}

Please take a look at the shadow gradle plugin and this section of their documentation about merging resources (specifically 2.7.1. Merging Service Descriptor Files):
http://imperceptiblethoughts.com/shadow/#controlling_jar_content_merging

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Vikas Kedigehalli
>            Priority: Blocker
>             Fix For: Not applicable
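The `jar tvf ... | grep` check above can also be done programmatically. The following is a minimal sketch, assuming the jar is read as a stream with `java.util.zip.ZipInputStream`, which walks the archive entry by entry and therefore reports a repeated name each time it occurs. The class name `DuplicateJarEntries` and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Illustrative only: reports entry names that occur more than once in a jar. */
final class DuplicateJarEntries {

    /** Returns each name that occurs at least twice, mapped to its occurrence count. */
    static Map<String, Integer> duplicates(List<String> entryNames) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String name : entryNames) {
            counts.merge(name, 1, Integer::sum);
        }
        // Keep only the names that were seen two or more times.
        counts.values().removeIf(count -> count < 2);
        return counts;
    }

    /** Reads every entry name from the archive, in the order the entries are stored. */
    static List<String> entryNames(InputStream jar) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(jar)) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java DuplicateJarEntries path/to/some.jar
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            duplicates(entryNames(in))
                .forEach((name, count) -> System.out.println(count + "  " + name));
        }
    }
}
```

Any `META-INF/services/...` name in the output indicates service descriptors that were copied side by side instead of being merged.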
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16040093#comment-16040093 ]

Colin Bookman commented on BEAM-2418:
-------------------------------------

[~lcwik], I believe the ProtobufCoderProviderRegistrar is included in the jar file.

Here's the jar file in question:
https://storage.googleapis.com/beam-dataflowio-bucket/dataflow-teleport-1.0-Alpha.jar

Here's the entire code for the Beam pipeline:
https://github.com/cobookman/DatastoreToGCS/tree/beam

Here's the script / build I'm running and getting the error for:
https://github.com/cobookman/DatastoreToGCS/blob/beam/scripts/datastore_to_gcs.sh

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Vikas Kedigehalli
>            Priority: Blocker
>             Fix For: 2.1.0
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039838#comment-16039838 ]

Luke Cwik commented on BEAM-2418:
---------------------------------

[~bookman_google] Does build/libs/*.jar contain a jar representing beam-sdks-java-extensions-protobuf?

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Vikas Kedigehalli
>            Priority: Blocker
>             Fix For: 2.1.0
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039733#comment-16039733 ]

Colin Bookman commented on BEAM-2418:
-------------------------------------

Same issue. Tried with the following arguments:

java -jar build/libs/*.jar \
  --runner=DataflowRunner \
  --project=my-project \
  --stagingLocation=gs://my-project.appspot.com/staging/ \
  --tempLocation=gs://my-project.appspot.com/temp/

```
Jun 06, 2017 2:57:37 PM org.apache.beam.runners.dataflow.DataflowRunner fromOptions
INFO: PipelineOptions.filesToStage was not specified. Defaulting to files from the classpath: will stage 1 files. Enable logging at DEBUG level to see which files will be staged.
Exception in thread "main" java.lang.IllegalStateException: Unable to return a default Coder for IngestEntities/ParDo(GqlQueryTranslate)/ParMultiDo(GqlQueryTranslate).out0 [PCollection]. Correct one of the following root causes:
  No Coder has been manually specified; you may do so using .setCoder().
  Inferring a Coder from the CoderRegistry failed: Unable to provide a Coder for com.google.datastore.v1.Query.
  Building a Coder using a registered CoderProvider failed.
  See suppressed exceptions for detailed failures.
  Using the default output Coder from the producing PTransform failed: Unable to provide a Coder for com.google.datastore.v1.Query.
  Building a Coder using a registered CoderProvider failed.
  See suppressed exceptions for detailed failures.
	at org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkState(Preconditions.java:444)
	at org.apache.beam.sdk.values.PCollection.getCoder(PCollection.java:250)
	at org.apache.beam.sdk.values.PCollection.finishSpecifying(PCollection.java:104)
	at org.apache.beam.sdk.runners.TransformHierarchy.finishSpecifyingInput(TransformHierarchy.java:147)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:481)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:422)
	at org.apache.beam.sdk.values.PCollection.apply(PCollection.java:277)
	at org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read.expand(DatastoreV1.java:581)
	at org.apache.beam.sdk.io.gcp.datastore.DatastoreV1$Read.expand(DatastoreV1.java:226)
	at org.apache.beam.sdk.Pipeline.applyInternal(Pipeline.java:482)
	at org.apache.beam.sdk.Pipeline.applyTransform(Pipeline.java:441)
	at org.apache.beam.sdk.values.PBegin.apply(PBegin.java:56)
	at org.apache.beam.sdk.Pipeline.apply(Pipeline.java:179)
	at com.google.cloud.dataflow.teleport.DatastoreToGcs.main(DatastoreToGcs.java:50)
	at com.google.cloud.dataflow.teleport.Main.main(Main.java:50)
```

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci
>            Priority: Blocker
>             Fix For: 2.1.0
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039467#comment-16039467 ]

Vikas Kedigehalli commented on BEAM-2418:
-----------------------------------------

[~bookman_google] could you try running it without templates (by passing the query and other options via command-line arguments) and see if it works?

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci
>            Priority: Blocker
>             Fix For: 2.1.0
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039392#comment-16039392 ]

Colin Bookman commented on BEAM-2418:
-------------------------------------

If it helps, I'm trying to build this as a template. Here's my CLI:

java -jar build/libs/*.jar \
  --runner=DataflowRunner \
  --project=my-project \
  --stagingLocation=gs://my-project.appspot.com/staging/ \
  --tempLocation=gs://my-project.appspot.com/temp/ \
  --templateLocation=gs://my-project.appspot.com/templates/

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci
>            Priority: Blocker
>             Fix For: 2.1.0
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039324#comment-16039324 ]

Vikas Kedigehalli commented on BEAM-2418:
-----------------------------------------

Looks like we do include 'beam-sdks-java-extensions-protobuf' (https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/pom.xml#L76), and we also have integration tests that pass (https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/test/java/org/apache/beam/sdk/io/gcp/datastore/V1ReadIT.java#L105).

Taking a look.

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci
>            Priority: Blocker
>             Fix For: 2.1.0
[jira] [Commented] (BEAM-2418) Datastore IO does not work out of the box
    [ https://issues.apache.org/jira/browse/BEAM-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16039282#comment-16039282 ]

Colin Bookman commented on BEAM-2418:
-------------------------------------

Tried adding `compile group: 'org.apache.beam', name: 'beam-sdks-java-extensions-protobuf', version: '2.0.0'` to my build. That still did not solve the issue.

Here's a gist that shows the stack trace, Java code, and Gradle build file:
https://gist.github.com/cobookman/e4d2f2b89b4c3cadae9cd83892162758

> Datastore IO does not work out of the box
> -----------------------------------------
>
>                 Key: BEAM-2418
>                 URL: https://issues.apache.org/jira/browse/BEAM-2418
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions, sdk-java-gcp
>    Affects Versions: 2.0.0
>            Reporter: Stephen Sisk
>            Assignee: Davor Bonaci