lostluck commented on a change in pull request #14822:
URL: https://github.com/apache/beam/pull/14822#discussion_r635277280
##########
File path: sdks/go/pkg/beam/core/runtime/pipelinex/replace.go
##########
@@ -207,11 +218,142 @@ func externalIns(counted map[string]bool, xforms
map[string]*pipepb.PTransform,
}
}
-// ensureUniqueNames ensures that each name is unique. Any conflict is
-// resolved by adding '1, '2, etc to the name.
+type idSorted []string
+
+func (s idSorted) Len() int {
+ return len(s)
+}
+func (s idSorted) Swap(i, j int) {
+ s[i], s[j] = s[j], s[i]
+}
+
+var idParseExp = regexp.MustCompile(`(\D*)(\d*)`)
Review comment:
This is a case of "good enough" vs "perfect". This is only intended to
make go's natural ID assignment scheme (e# for Transforms, and s# for
composites) sort properly. I'll add a comment to that effect though.
Basically this fixes having weird lexicographic ordering where the sort is
e1 e10 e11 e2 e20 e21 e3 e30.... rather than having the uniquifiers assigned to
addition to the pipeline order. It would manifest often with a CoGBK near the
start of a pipline getting a uniquifier, rather than one towards the end, which
is unintuitive.
The Go SDK assigns ids sequentially, and the global ids are invariably
"<letter><number>" by default. The complications come in when we add
intermediate nodes, like the CoGBK's inject and Expand steps which add suffixes
(like _injectN and _expand). However, we don't care so much with those. Such
SDK additions are always wrapped in a composite ptransform, so there's nothing
at the same "level" for them to get uniquified against. They're unique by their
ancestors.
The other times PTransform ids won't match this pattern is from xlang, which
will have whatever the foreign SDK uses. But that's similarly handled by the
"use the lexicographic" ordering case, and being wrapped in a composite.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]