okumin commented on code in PR #5037:
URL: https://github.com/apache/hive/pull/5037#discussion_r1476160515
##########
ql/src/java/org/apache/hadoop/hive/ql/plan/mapper/PlanMapper.java:
##########
@@ -200,54 +199,46 @@ public void merge(Object o1, Object o2) {
}
private void link(Object o1, Object o2, boolean mayMerge) {
-
- Set<Object> keySet = Collections.newSetFromMap(new IdentityHashMap<Object,
Boolean>());
- keySet.add(o1);
- keySet.add(o2);
- keySet.add(getKeyFor(o1));
- keySet.add(getKeyFor(o2));
-
- Set<EquivGroup> mGroups = Collections.newSetFromMap(new
IdentityHashMap<EquivGroup, Boolean>());
-
- for (Object object : keySet) {
- EquivGroup group = objectMap.get(object);
- if (group != null) {
- mGroups.add(group);
- }
+ // Caches signatures on the first access. A signature of an Operator could
change as optimizers could mutate it,
+ // keeping its semantics. The current implementation caches the signature
before optimizations so that we can
+ // link Operators with their signatures at consistent timing.
+ registerSignature(o1);
+ registerSignature(o2);
Review Comment:
> Assuming that we do not allow implicit linking via signatures I don't know
what's the point of registering the signature at this stage
That's because some optimization can mutate Operators, changing their true
signatures. [One example I remember is
TableScanPPD](https://github.com/apache/hive/blob/rel/release-4.0.0-beta-1/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java#L414).
I'm not 100% sure which signatures should be used in that case, but it sounds
more reasonable and predictable for me to use ones before optimization than
ones in the middle of optimization. I would say Operators should not be mutable
for the signing purpose, but it is a too fundamental change.
I will make a reply to @kgyrtkirk in another thread as I guess those
comments were written before reading my gist.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]