[
https://issues.apache.org/jira/browse/HIVE-11652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14715528#comment-14715528
]
Hari Sankar Sivarama Subramaniyan commented on HIVE-11652:
----------------------------------------------------------
[~jcamachorodriguez] You may want to look at the latest patch of HIVE-11341,
where I am looking into a similar issue.
{code}
// While all its descendants have not been dispatched,
// we do not move forward
while(!childrenDispatched) {
  for (Node childNode : nd.getChildren()) {
    walk(childNode);
  }
  childrenDispatched = getDispatchedList().containsAll(nd.getChildren());
}
{code}
In the above change, I see that you are using the system stack (recursion)
instead of the toWalk list to traverse the child nodes.
However, a couple of points:
1. Why do you need the 'while(!childrenDispatched)' condition at all? Won't the
check below, applied to each child, work?
{code}
if (!getDispatchedList().contains(childNode)) {
  walk(childNode);
}
{code}
2. We should make sure that walk() is not called again for a node that has
already been dispatched, and the same check should be present in startWalking
as well:
{code}
public void startWalking(Collection<Node> startNodes,
    HashMap<Node, Object> nodeOutput) throws SemanticException {
  for (Node nd : startNodes) {
    // Skip start nodes that have already been dispatched.
    if (getDispatchedList().contains(nd)) {
      continue;
    }
    walk(nd);
    if (nodeOutput != null && getDispatchedList().contains(nd)) {
      nodeOutput.put(nd, retMap.get(nd));
    }
  }
}
{code}
In short, if you are using the system stack via recursion, you can eliminate the
toWalk data structure altogether for DefaultGraphWalker.
Otherwise, you can make toWalk a stack instead of an ArrayList.
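To make the recursive option concrete, here is a minimal, self-contained sketch of a walker that relies only on the call stack and a dispatched set. It is not the actual DefaultGraphWalker code: SketchNode, RecursiveWalkerSketch and the order list are hypothetical stand-ins made up for illustration, and dispatch() only records the node name instead of firing rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a plan node; not Hive's Node interface.
class SketchNode {
    final String name;
    final List<SketchNode> children = new ArrayList<>();

    SketchNode(String name, SketchNode... kids) {
        this.name = name;
        children.addAll(Arrays.asList(kids));
    }
}

// Recursive walker sketch: the call stack takes the place of a toWalk list.
class RecursiveWalkerSketch {
    // A hash set keeps the "already dispatched?" check cheap.
    private final Set<SketchNode> dispatched = new HashSet<>();
    // Records the dispatch order so the behaviour can be inspected.
    final List<String> order = new ArrayList<>();

    void startWalking(Collection<SketchNode> startNodes) {
        for (SketchNode nd : startNodes) {
            // Skip start nodes that an earlier walk already dispatched.
            if (!dispatched.contains(nd)) {
                walk(nd);
            }
        }
    }

    private void walk(SketchNode nd) {
        // Dispatch every not-yet-dispatched child first, then the node itself.
        for (SketchNode child : nd.children) {
            if (!dispatched.contains(child)) {
                walk(child);
            }
        }
        dispatch(nd);
    }

    private void dispatch(SketchNode nd) {
        dispatched.add(nd);
        order.add(nd.name);
    }
}
```

With a diamond-shaped graph (a has children b and c, which both share child d), each node is dispatched exactly once and children always come before their parents. One caveat of this approach is that the recursion depth grows with the depth of the plan, so a very deep plan could exhaust the thread stack; the explicit-stack variant avoids that.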
Thanks
Hari
> Avoid expensive call to removeAll in DefaultGraphWalker
> -------------------------------------------------------
>
> Key: HIVE-11652
> URL: https://issues.apache.org/jira/browse/HIVE-11652
> Project: Hive
> Issue Type: Bug
> Components: Logical Optimizer, Physical Optimizer
> Affects Versions: 1.3.0, 2.0.0
> Reporter: Jesus Camacho Rodriguez
> Assignee: Jesus Camacho Rodriguez
> Attachments: HIVE-11652.patch
>
>
> When the plan is too large, the removeAll call in DefaultGraphWalker (line
> 140) will take very long as it will have to go through the list looking for
> each of the nodes. We try to get rid of this call by rewriting the logic in
> the walker.