[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352612#comment-16352612 ] Jean-Baptiste Onofré commented on BEAM-3587: It's not a problem in Beam codebase itself, so not a blocker for the release. More a website issue to add a note on the runners (spark, flink, others as they all work the same way). > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: website >Reporter: Kenneth Knowles >Assignee: Jean-Baptiste Onofré >Priority: Minor > Attachments: screen1.png, screen2.png > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352590#comment-16352590 ] Jean-Baptiste Onofré commented on BEAM-3587: I created a PR on the user example to fix the issue: https://github.com/pelletier/beam-flink-example/pull/1 I also took screenshots that it runs fine on Flink cluster. I'm adding a note in Beam documentation as some other users might have the same issue. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Jean-Baptiste Onofré >Priority: Blocker > Fix For: 2.3.0 > > Attachments: screen1.png, screen2.png > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352587#comment-16352587 ] Jean-Baptiste Onofré commented on BEAM-3587: As we have: {code} org.apache.maven.plugins maven-shade-plugin ${maven-shade-plugin.version} false *:* META-INF/*.SF META-INF/*.DSA META-INF/*.RSA package shade true shaded {code} for a Maven build, the same has to be applied to Gradle. I'm adding a note in the documentation. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Jean-Baptiste Onofré >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352564#comment-16352564 ] Jean-Baptiste Onofré commented on BEAM-3587: I cloned the test repo: https://github.com/pelletier/beam-flink-example Clearly the problem is not in the artifacts, but in the shadowJar. I'm testing the fix and will provide a PR to user. If it's what I think, I will close this Jira as it's not a Beam issue. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352390#comment-16352390 ] Jean-Baptiste Onofré commented on BEAM-3587: Using 2.3.0-SNAPSHOT artifacts provided by Maven with the following pipeline works without problem with the Flink runner: {code} public static final void main(String args[]) throws Exception { PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create(); Pipeline pipeline = Pipeline.create(options); pipeline .apply(TextIO.read().from("/home/jbonofre/artists.csv")) .apply(ParDo.of(new DoFn() { @ProcessElement public void processElement(ProcessContext processContext) { String element = processContext.element(); String[] split = element.split(","); processContext.output(split[1]); } })) .apply(Count.perElement()) .apply(MapElements.via(new SimpleFunction , String>() { public String apply(KV element) { return "{\"" + element.getKey() + "\": \"" + element.getValue() + "\"}"; } })) .apply(TextIO.write().to("/home/jbonofre/label.json")); pipeline.run(); } {code} I'm now testing with artifacts built by Gradle. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16352050#comment-16352050 ] Jean-Baptiste Onofré commented on BEAM-3587: Hi Ben, thanks for the update. Let me take a look today. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351945#comment-16351945 ] Ben Sidhom commented on BEAM-3587: -- I think I've figured it out. I had been using SBT to build my own uberjars and had been using custom resource merging. When I tried using a gradle build modeled after the Java build under examples, I ran into the same failure as listed above. PtransformTranslation [uses a service loader|https://github.com/apache/beam/blob/24804e98d22180c1dc6603c8a437073ec2adde2d/runners/core-construction-java/src/main/java/org/apache/beam/runners/core/construction/PTransformTranslation.java#L98] to figure out urn mappings. It turns out that a simple shadowJar task does not handle this properly and instead only includes the special Flink override urns. Using the service-merging transform fixes this (with gradle): {{shadowJar \{ mergeServiceFiles() }}} > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351157#comment-16351157 ] Ben Sidhom commented on BEAM-3587: -- Just repeated it with literally just a read step. It still seems to run without error, but obviously there's no output to inspect: {{p.apply(TextIO.read().from(options.getInputPath())).run().waitUntilFinish()}} > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16351150#comment-16351150 ] Ben Sidhom commented on BEAM-3587: -- I built at head and ran a simple program that reads in a text file and writes it out to a different file. It seems to work for me. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350806#comment-16350806 ] Ben Sidhom commented on BEAM-3587: -- As Ken mentioned, this is strange since we have ValidatesRunner tests that should in theory trigger this. Can we get a minimal example pipeline that hits this issue? > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3587) User reports TextIO failure in FlinkRunner on master
[ https://issues.apache.org/jira/browse/BEAM-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16350324#comment-16350324 ] Jean-Baptiste Onofré commented on BEAM-3587: Can we get an update on this Jira ? I would like to help. So please don't hesitate to ping me. > User reports TextIO failure in FlinkRunner on master > > > Key: BEAM-3587 > URL: https://issues.apache.org/jira/browse/BEAM-3587 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Kenneth Knowles >Assignee: Ben Sidhom >Priority: Blocker > Fix For: 2.3.0 > > > Reported here: > [https://lists.apache.org/thread.html/47b16c94032392782505415e010970fd2a9480891c55c2f7b5de92bd@%3Cuser.beam.apache.org%3E] > "I'm trying to run a pipeline containing just a TextIO.read() step on a Flink > cluster, using the latest Beam git revision (ff37337). The job fails to start > with the Exception: > {{java.lang.UnsupportedOperationException: The transform is currently not > supported.}} > It does work with Beam 2.2.0 though. All code, logs, and reproduction steps > [https://github.com/pelletier/beam-flink-example]; > My initial thoughts: I have a guess that this has to do with switching to > running from a portable pipeline representation, and it looks like there's a > non-composite transform with an empty URN and it threw a bad error message. > We can try to root cause but may also mitigate short-term by removing the > round-trip through pipeline proto for now. > What is curious is that the ValidatesRunner and WordCountIT are working - > they only run on a local Flink, yet this seems to be a translation issue that > would occur for local or distributed runs. > We need to certainly run this repro on the RC if we don't totally get to the > bottom of it quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)