[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16339689#comment-16339689 ] Kenneth Knowles commented on BEAM-3243: --- I see the pull requests associated with this are merged or closed. Is it resolved? > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau >Priority: Major > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16327820#comment-16327820 ] ASF GitHub Bot commented on BEAM-3243: -- jkff closed pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java index 5358f7d9e27..d6991323854 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java @@ -19,12 +19,19 @@ import static com.google.common.base.Preconditions.checkNotNull; import static com.google.common.base.Preconditions.checkState; +import static com.google.common.collect.Iterables.transform; import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Function; import com.google.common.base.Joiner; +import com.google.common.base.Predicate; +import com.google.common.base.Predicates; +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Collections2; import com.google.common.collect.HashMultimap; +import com.google.common.collect.Multimap; import com.google.common.collect.SetMultimap; -import java.util.ArrayList; +import java.util.Collection; import java.util.HashSet; import java.util.List; import java.util.Map; @@ -493,7 +500,7 @@ OutputT applyTransform(String name, InputT input, /** Lazily initialized; access via {@link #getCoderRegistry()}. */ @Nullable private CoderRegistry coderRegistry; - private final List unstableNames = new ArrayList<>(); + private final MultimapinstancePerName = ArrayListMultimap.create(); private final PipelineOptions defaultOptions; private Pipeline(TransformHierarchy transforms, PipelineOptions options) { @@ -520,11 +527,8 @@ public String toString() { String namePrefix = transforms.getCurrent().getFullName(); String uniqueName = uniquifyInternal(namePrefix, name); -boolean nameIsUnique = uniqueName.equals(buildName(namePrefix, name)); - -if (!nameIsUnique) { - unstableNames.add(uniqueName); -} +final String builtName = buildName(namePrefix, name); +instancePerName.put(builtName, transform); LOG.debug("Adding {} to {}", transform, this); transforms.pushNode(uniqueName, input, transform); @@ -569,21 +573,29 @@ void applyReplacement( @VisibleForTesting void validate(PipelineOptions options) { this.traverseTopologically(new ValidateVisitor(options)); -if (!unstableNames.isEmpty()) { +final Collection >> errors = +Collections2.filter(instancePerName.asMap().entrySet(), +Predicates.not(new IsUnique ())); +if (!errors.isEmpty()) { switch (options.getStableUniqueNames()) { case OFF: break; case WARNING: LOG.warn( "The following transforms do not have stable unique names: {}", - Joiner.on(", ").join(unstableNames)); + Joiner.on(", ").join(transform(errors, new KeysExtractor(; break; -case ERROR: +case ERROR: // be very verbose here since it will just fail the execution throw new IllegalStateException( String.format( "Pipeline update will not be possible" + " because the following transforms do not have stable unique names: %s.", - Joiner.on(", ").join(unstableNames))); + Joiner.on(", ").join(transform(errors, new KeysExtractor( + "\n\n" + + "Conflicting instances:\n" + + Joiner.on("\n").join(transform( + errors, new UnstableNameToMessage(instancePerName))) + + "\n\nYou can fix it adding a name when you call apply(): " + + "pipeline.apply(, )."); default: throw new IllegalArgumentException( "Unrecognized value for stable unique names: " + options.getStableUniqueNames()); @@ -636,4 +648,42 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function { +@Override +public String apply(final PTransform transform) { + return "- " + transform; +} + } + + private static class UnstableNameToMessage implements +
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274107#comment-16274107 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on issue #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#issuecomment-348433101 Updated according the comments. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273526#comment-16273526 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r154210087 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { Review comment: If I understand correctly: the default .toString() on a ParDo returns simply "ParDo.SingleOutput@something" or alike; and instead you want to return a .toString() of the DoFn inside it. Can you just override the toString() method on the ParDo.SingleOutput class to say "return fn.toString()"? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273508#comment-16273508 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r154208876 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { +representant = ParDo.SingleOutput.class.cast( +transform).getFn(); + } else if (ParDo.MultiOutput.class.isInstance(transform)) { +representant = ParDo.MultiOutput.class.cast( +transform).getFn(); + } else { +representant = transform; + } + return "- " + representant; +} + } + + private static class UnstableNameToMessage implements Function { +private final Multimap instances; + +private UnstableNameToMessage(final Multimap instancePerName) { + this.instances = instancePerName; +} + +@Override +public String apply(final String input) { + Collection values = null; + String currentName = input; + // strip the counter appended to the name, + // see uniquifyInternal(). + // common example: ParDo(Anonymous) becoming ParDo(Anonymous)2 + while (values == null && !currentName.isEmpty()) { +currentName = currentName.substring(0, currentName.length() - 1); Review comment: Will change it tomorrow This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273506#comment-16273506 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r154208726 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { Review comment: Hmm, not really cause the goal is to log the min info needed for the user to fix it. XOutput doesnt help since it is ignored in the name computation. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273325#comment-16273325 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r154186471 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { Review comment: Sorry, I don't understand which internal Beam data you have in mind. Line 664 calls toString() - so all you need to get rid of special-casing ParDo as opposed to the dozens of other Beam transforms is to add .toString() methods to ParDo.SingleOutput and MultiOutput, which will be useful for debugging anyway. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273326#comment-16273326 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r154186790 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { +representant = ParDo.SingleOutput.class.cast( +transform).getFn(); + } else if (ParDo.MultiOutput.class.isInstance(transform)) { +representant = ParDo.MultiOutput.class.cast( +transform).getFn(); + } else { +representant = transform; + } + return "- " + representant; +} + } + + private static class UnstableNameToMessage implements Function { +private final Multimap instances; + +private UnstableNameToMessage(final Multimap instancePerName) { + this.instances = instancePerName; +} + +@Override +public String apply(final String input) { + Collection values = null; + String currentName = input; + // strip the counter appended to the name, + // see uniquifyInternal(). + // common example: ParDo(Anonymous) becoming ParDo(Anonymous)2 + while (values == null && !currentName.isEmpty()) { +currentName = currentName.substring(0, currentName.length() - 1); Review comment: I think traversing the collection is much better than parsing digit-by-digit a string that we ourselves have formatted. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272202#comment-16272202 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r153991859 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { +representant = ParDo.SingleOutput.class.cast( +transform).getFn(); + } else if (ParDo.MultiOutput.class.isInstance(transform)) { +representant = ParDo.MultiOutput.class.cast( +transform).getFn(); + } else { +representant = transform; + } + return "- " + representant; +} + } + + private static class UnstableNameToMessage implements Function { +private final Multimap instances; + +private UnstableNameToMessage(final Multimap instancePerName) { + this.instances = instancePerName; +} + +@Override +public String apply(final String input) { + Collection values = null; + String currentName = input; + // strip the counter appended to the name, + // see uniquifyInternal(). + // common example: ParDo(Anonymous) becoming ParDo(Anonymous)2 + while (values == null && !currentName.isEmpty()) { +currentName = currentName.substring(0, currentName.length() - 1); Review comment: The loop will remove both digits normally. Thought about removing it but then you need to browse the whole collection instead of just testing if the list is empty - twice. No strong opinion on it but thought that holding the direct information "conflict" was valuable. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16272201#comment-16272201 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r153991561 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { Review comment: To get a user data and not an internal beam data. ParDo doesnt help here for instance. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271603#comment-16271603 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r153924198 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { +representant = ParDo.SingleOutput.class.cast( +transform).getFn(); + } else if (ParDo.MultiOutput.class.isInstance(transform)) { +representant = ParDo.MultiOutput.class.cast( +transform).getFn(); + } else { +representant = transform; + } + return "- " + representant; +} + } + + private static class UnstableNameToMessage implements Function { +private final Multimap instances; + +private UnstableNameToMessage(final Multimap instancePerName) { + this.instances = instancePerName; +} + +@Override +public String apply(final String input) { + Collection values = null; + String currentName = input; + // strip the counter appended to the name, + // see uniquifyInternal(). + // common example: ParDo(Anonymous) becoming ParDo(Anonymous)2 + while (values == null && !currentName.isEmpty()) { +currentName = currentName.substring(0, currentName.length() - 1); Review comment: What if the counter is greater than 10, and takes more than 1 digit? In general this seems brittle: this is just extracting the original non-unique name of a transform, right? Can't we pass the offending PTransform's here directly somehow, e.g. at line 532 key things by the original (non-unique) name and traverse entries that have more than 1 transform? I think then you can also just remove the variable "unstableNames" because it's redundant w.r.t. the map. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271602#comment-16271602 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on a change in pull request #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#discussion_r153924026 ## File path: sdks/java/core/src/main/java/org/apache/beam/sdk/Pipeline.java ## @@ -636,4 +649,49 @@ public void visitPrimitiveTransform(Node node) { node.getTransform().validate(options); } } + + private static class TransformToMessage implements Function{ +@Override +public String apply(final PTransform transform) { + final Object representant; + if (ParDo.SingleOutput.class.isInstance(transform)) { Review comment: Hmm why are we special-casing ParDo here - why is line 664 ok for other transforms but not OK for ParDo.Single/MultiOutput? If it's a matter of the transform's .toString() - can you just change the toString() methods on these ParDo classes if they're currently not good enough? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270382#comment-16270382 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on issue #4185: BEAM-3243 better error message when there are conflicting anonymous names URL: https://github.com/apache/beam/pull/4185#issuecomment-347786173 Updated the PR, hope it addresses the comments. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269113#comment-16269113 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau closed pull request #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172 This is a PR merged from a forked repository. As GitHub hides the original diff on merge, it is displayed below for the sake of provenance: As this is a foreign pull request (from a fork), the diff is supplied below (as it won't show otherwise due to GitHub magic): diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/NameUtils.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/NameUtils.java index 3f3054ae733..d42e41c059e 100644 --- a/sdks/java/core/src/main/java/org/apache/beam/sdk/util/NameUtils.java +++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/util/NameUtils.java @@ -52,8 +52,6 @@ private static final String ANONYMOUS_CLASS_REGEX = "\\$[0-9]+\\$"; private static String approximateSimpleName(Class clazz, boolean dropOuterClassNames) { -checkArgument(!clazz.isAnonymousClass(), -"Attempted to get simple name of anonymous class"); return approximateSimpleName(clazz.getName(), dropOuterClassNames); } @@ -92,14 +90,6 @@ private static String simplifyNameComponent(String name) { return name; } - /** - * As {@link #approximateSimpleName(Object, String)} but returning {@code "Anonymous"} when - * {@code object} is an instance of anonymous class. - */ - public static String approximateSimpleName(Object object) { -return approximateSimpleName(object, "Anonymous"); - } - /** * Returns a simple name describing a class that is being used as a function (eg., a {@link DoFn} * or {@link CombineFn}, etc.). @@ -123,7 +113,7 @@ public static String approximateSimpleName(Object object) { * {@code another.package.PairingFn} becomes "Pairing" * */ - public static String approximateSimpleName(Object object, String anonymousValue) { + public static String approximateSimpleName(Object object) { if (object instanceof NameOverride) { return ((NameOverride) object).getNameOverride(); } @@ -134,9 +124,11 @@ public static String approximateSimpleName(Object object, String anonymousValue) } else { clazz = object.getClass(); } +/* if anonymous ensure the name is unique to support multiple anonymous DoFn per class if (clazz.isAnonymousClass()) { return anonymousValue; } +*/ return approximateSimpleName(clazz, /* dropOuterClassNames */ true); } diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/NameUtilsTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/NameUtilsTest.java index 411f91317e0..fb17e149d95 100644 --- a/sdks/java/core/src/test/java/org/apache/beam/sdk/util/NameUtilsTest.java +++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/util/NameUtilsTest.java @@ -77,6 +77,9 @@ public void testDropsOuterClassNamesFalse() { * Inner class for simple name test. */ private class EmbeddedDoFn { +private Object anonymous() { // ensure we have a single one for the test + return new DeeperEmbeddedDoFn() {}; +} private class DeeperEmbeddedDoFn extends EmbeddedDoFn {} @@ -114,7 +117,7 @@ public void testSimpleName() { @Test public void testAnonSimpleName() throws Exception { -assertEquals("Anonymous", NameUtils.approximateSimpleName(new EmbeddedDoFn() {})); +assertEquals("Embedded$1", NameUtils.approximateSimpleName(new EmbeddedDoFn().anonymous())); } @Test @@ -180,10 +183,4 @@ public String getNameOverride() { }; assertEquals("CUSTOM_NAME", NameUtils.approximateSimpleName(overriddenName)); } - - @Test - public void testApproximateSimpleNameCustomAnonymous() { -Object overriddenName = new Object() {}; -assertEquals("CUSTOM_NAME", NameUtils.approximateSimpleName(overriddenName, "CUSTOM_NAME")); - } } This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16269112#comment-16269112 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on issue #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172#issuecomment-347602536 superseeded with https://github.com/apache/beam/pull/4185 This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16267652#comment-16267652 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on issue #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172#issuecomment-347341595 That would be optimal, and in general I really like the idea of capturing the stack trace of transform application and using it in all error messages. We already do this in PAssert, but it can be applied much more widely - e.g. errors during coder inference, validation errors, some runtime errors etc. I suspect that implementing that can have some unexpected stumbling blocks, so I would recommend to start with a simpler improvement - just make the error message clearly say that the fix is to specify name in apply(). However, if you're willing to drive an effort to use application stack traces in more places in Beam, starting with a discussion on the mailing list - that would be a wonderful contribution and I'd be happy to review it. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.3.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266173#comment-16266173 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on issue #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172#issuecomment-347035725 Something like Transform Foo$1 conflicts with Foo$2 in pipeline defined in MyTest line 56. You can fix it adding a name in apply invocations line 54 and 52. The long name is important cause otherwise you have no clue of the provenance. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266135#comment-16266135 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on issue #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172#issuecomment-347029760 Yeah, I would prefer an improved error message. I suppose you mean the message that says `Transform Foo does not have a stable unique name. This will prevent reloading of pipelines.` - that message definitely should point to specifying the name in .apply() as a fix. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266077#comment-16266077 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau commented on issue #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172#issuecomment-347019445 I understand where it comes from but it is very boring when writing tests where it is not uncommon to write anonymous classes and the inline naming is not always doable when you write utility methods or reusable piece of pipelines. You would also note that anonymous algorithm with the "number" is somehow aligned on the fact to keep only the suffix in the nested class case which leads to as meaningless names (Important$Stuff leads to Stuff which is in general not very meaningful for the "task" context since it is hold by the enclosing class to avoid long and repeating names). The original issue is when it fails it is quite abstract and not very obvious. An alternative can be to enrich the error message with: 1. where the anonymous *classes* (all conflicting ones) are 2. how to fix it - passing a name to the apply I would be fine with this "not solution" fix as well, does it sound better for you? This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16266057#comment-16266057 ] ASF GitHub Bot commented on BEAM-3243: -- jkff commented on issue #4172: [BEAM-3243] support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172#issuecomment-347018533 Using multiple anonymous DoFn's with the same enclosing class within the same composite transform is already possible if you specify the transform name in .apply() - e.g.: .apply("Something", ParDo.of(new DoFn..)).apply("Something else", ParDo.of(new DoFn..)). This is a good thing rather than a bug, because using generated names like `Enclosing$1` is unstable w.r.t. pipeline update: any reordering of the anonymous classes, or adding a new one, or making an existing one be non-anonymous, will change the numbering and make the pipeline update-incompatible. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265084#comment-16265084 ] ASF GitHub Bot commented on BEAM-3243: -- rmannibucau opened a new pull request #4172: BEAM-3243 support multiple anonymous classes from the same enclosing class in a pipeline URL: https://github.com/apache/beam/pull/4172 Idea is to keep the "number" (suffix) of the anonymous class for anonymous dofn to support multiple anonymous dofn in the same pipeline. This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-3243) multiple anonymous DoFn lead to conflicting names
[ https://issues.apache.org/jira/browse/BEAM-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265085#comment-16265085 ] ASF GitHub Bot commented on BEAM-3243: -- GitHub user rmannibucau opened a pull request: https://github.com/apache/beam/pull/4172 BEAM-3243 support multiple anonymous classes from the same enclosing class in a pipeline Idea is to keep the "number" (suffix) of the anonymous class for anonymous dofn to support multiple anonymous dofn in the same pipeline. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmannibucau/incubator-beam fb/BEAM-3243_dont-default-to-anonymous-name-for-anonymous-dofn Alternatively you can review and apply these changes as the patch at: https://github.com/apache/beam/pull/4172.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4172 commit 76bebb6dc7942b081b386bd5eda7456421b926c7 Author: Romain Manni-BucauDate: 2017-11-24T09:31:04Z BEAM-3243 default name is not needed by beam in NameUtils so remove it and let anonymous class keep their number in their name > multiple anonymous DoFn lead to conflicting names > - > > Key: BEAM-3243 > URL: https://issues.apache.org/jira/browse/BEAM-3243 > Project: Beam > Issue Type: Task > Components: sdk-java-core >Reporter: Romain Manni-Bucau >Assignee: Romain Manni-Bucau > Fix For: 2.2.0 > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)