[ https://issues.apache.org/jira/browse/CALCITE-794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056623#comment-15056623 ]
Julian Hyde commented on CALCITE-794:
-------------------------------------
The performance impact of creating the RelMetadataQuery object should be
insignificant compared to other inefficiencies that exist.
RelMetadataQuery.instance() creates a RelMetadataQuery, which contains a HashSet.
One inefficiency is that ReflectiveRelMetadataProvider uses
Proxy.newProxyInstance to create implementations of interfaces, and
Method.invoke to call the methods. I think method handles would be much better.
See CALCITE-604.
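To illustrate the difference, here is a minimal sketch that calls the same method once via Method.invoke and once via a MethodHandle that is looked up a single time. RowCountHandler and its getRowCount(String) method are hypothetical stand-ins, not Calcite classes, and the sketch says nothing about how CALCITE-604 would actually be implemented.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

class MethodHandleSketch {
  // Resolved once; access is checked here, at lookup time.
  static final MethodHandle GET_ROW_COUNT;
  static {
    try {
      GET_ROW_COUNT = MethodHandles.lookup().findVirtual(
          RowCountHandler.class, "getRowCount",
          MethodType.methodType(double.class, String.class));
    } catch (ReflectiveOperationException e) {
      throw new ExceptionInInitializerError(e);
    }
  }

  /** Reflective call: arguments and result are boxed and passed as Object. */
  static double viaReflection(RowCountHandler handler, String rel) {
    try {
      Method m = RowCountHandler.class.getMethod("getRowCount", String.class);
      return (Double) m.invoke(handler, rel);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    }
  }

  /** Method-handle call: invokeExact requires the exact static signature. */
  static double viaMethodHandle(RowCountHandler handler, String rel) {
    try {
      return (double) GET_ROW_COUNT.invokeExact(handler, rel);
    } catch (Throwable t) {
      throw new RuntimeException(t);
    }
  }

  public static void main(String[] args) {
    RowCountHandler handler = new RowCountHandler();
    System.out.println(viaReflection(handler, "Scan"));
    System.out.println(viaMethodHandle(handler, "Scan"));
  }
}

/** Hypothetical stand-in for a metadata handler; not a Calcite class. */
class RowCountHandler {
  public double getRowCount(String relName) {
    return relName.length() * 100.0; // dummy estimate for illustration
  }
}
```

Both paths return the same value; whether the method-handle path is actually faster in Calcite would need to be measured.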
A larger inefficiency is that I'm not sure we're doing effective caching. I
think we make many calls to, say, get the estimated row count of a particular
RelNode. We have to be careful not to introduce too much caching: if we add a
RelNode to a RelSubset we need to make sure that that improvement is seen by
any RelNode that uses that RelSubset or anything downstream of it.
But still, if you sample the stack during a test that uses volcano heavily,
Calcite will likely be in a metadata call, both before and after this issue is
fixed. That seems wrong.
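As a rough sketch of the caching concern above: one very coarse way to keep cached metadata from going stale is to stamp each entry with a generation counter and bump the counter whenever the plan changes, for example when a RelNode is added to a RelSubset. RowCountCacheSketch, planChanged() and get() are hypothetical names for illustration only. This invalidates everything rather than just the downstream RelNodes, so it is safe but probably more conservative than what we would want.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleSupplier;

/** Hypothetical generation-stamped cache; not part of Calcite. */
class RowCountCacheSketch {
  private static final class Entry {
    final long generation;
    final double value;
    Entry(long generation, double value) {
      this.generation = generation;
      this.value = value;
    }
  }

  private long generation = 0;
  private final Map<Object, Entry> cache = new HashMap<>();

  /** Call whenever the plan changes, e.g. a node is added to a subset. */
  void planChanged() {
    generation++;
  }

  /** Returns the cached value if it is from the current generation, else recomputes. */
  double get(Object rel, DoubleSupplier compute) {
    Entry e = cache.get(rel);
    if (e != null && e.generation == generation) {
      return e.value;
    }
    double v = compute.getAsDouble();
    cache.put(rel, new Entry(generation, v));
    return v;
  }

  public static void main(String[] args) {
    RowCountCacheSketch cache = new RowCountCacheSketch();
    Object rel = new Object();
    System.out.println(cache.get(rel, () -> 100.0)); // computed
    System.out.println(cache.get(rel, () -> 50.0));  // served from cache
    cache.planChanged();
    System.out.println(cache.get(rel, () -> 50.0));  // recomputed
  }
}
```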
> Detect cycles when computing statistics
> ---------------------------------------
>
> Key: CALCITE-794
> URL: https://issues.apache.org/jira/browse/CALCITE-794
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
> Fix For: next
>
>
> The graph of RelNodes is allowed to be cyclic. This causes problems when
> evaluating certain metadata, for example RelMetadataQuery.areColumnsUnique.
> While computing the value for RelNode r, it might recurse through, say, a
> Project and hit r again. This causes a stack overflow.
> We solve this by adding a map or set of active RelNodes. The map is stored
> within RelMetadataQuery, which can now be instantiated, and its methods are
> no longer static. The first call should instantiate a RelMetadataQuery, but
> all subsequent calls for metadata (perhaps several kinds of metadata) will
> use the same RelMetadataQuery instance, hence the same map.
> Also add a RelMetadataQuery argument to the static "handler" methods in
> RelMdColumnUniqueness and similar classes.
> This is a breaking change for people who have written a metadata handler, and
> might be subtle to detect, because the methods are invoked via reflection.
> For code that is just using RelMetadataQuery methods, the change is still
> breaking, but the break points and remedy will be obvious: the methods are no
> longer static, so they need to change RelMetadataQuery.foo() to
> RelMetadataQuery.instance().foo().
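The approach in the description above can be sketched roughly as follows. Node and MetadataQuerySketch are hypothetical stand-ins for RelNode and RelMetadataQuery, and returning null ("unknown") when a cycle is hit is an assumption of this sketch, not necessarily what the fix does. The point is that one query instance carries the set of active nodes through the whole recursion.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for RelMetadataQuery. */
class MetadataQuerySketch {
  // Nodes whose metadata is currently being computed on this call stack.
  private final Set<Node> active = new HashSet<>();

  static MetadataQuerySketch instance() {
    return new MetadataQuerySketch();
  }

  /** Returns the smallest known row count reachable from node, or null if unknown. */
  Double getRowCount(Node node) {
    if (node.rowCount != null) {
      return node.rowCount;
    }
    if (!active.add(node)) {
      return null; // already on the stack: we hit a cycle, so give up on this path
    }
    try {
      Double best = null;
      for (Node input : node.inputs) {
        Double v = getRowCount(input);
        if (v != null && (best == null || v < best)) {
          best = v;
        }
      }
      return best;
    } finally {
      active.remove(node); // node is no longer active once its call returns
    }
  }

  public static void main(String[] args) {
    Node scan = new Node("scan", 100.0);
    Node r = new Node("r", null);
    Node project = new Node("project", null);
    r.inputs.add(project);
    project.inputs.add(r); // cycle: r -> project -> r
    project.inputs.add(scan);
    System.out.println(MetadataQuerySketch.instance().getRowCount(r));
  }
}

/** Hypothetical stand-in for RelNode; uses identity equality. */
class Node {
  final String name;
  final Double rowCount; // null when the node has no estimate of its own
  final List<Node> inputs = new ArrayList<>();

  Node(String name, Double rowCount) {
    this.name = name;
    this.rowCount = rowCount;
  }
}
```

Without the active set, the call for r would recurse through project back into r until the stack overflows.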