Julian Hyde commented on CALCITE-3784:

The first thing you should do is consider whether you need such a large plan.
If it's that large, what you have is probably best considered data rather than
code, and data belongs in a table. (Consider the case of a very large IN
clause, which could instead be a join to a smallish table.)

Still, if your numbers are correct, it seems wrong that 10,000 expressions
would use up gigabytes of memory. Calcite needs the digests, so we can't get
rid of them. But if your expression is deeply nested, the same string will
occur in each enclosing expression. So, if you can, make your expression less
deeply nested. In particular, flatten AND and OR expressions.
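
To make the nesting cost concrete, here is a rough, standalone sketch. It does
not use Calcite. The class DigestGrowth, its methods, and the "AND(a, b)"
digest format are made up for illustration, and it assumes every node keeps
its own digest string. It compares the total digest characters held by a
left-deep chain of binary ANDs with those held by a single flat n-ary AND over
the same operands.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; this is not Calcite code. */
public class DigestGrowth {

  /** Total digest characters when n operands are combined as a left-deep
   * chain of binary ANDs and every node caches its own digest. */
  static long nestedDigestChars(int n) {
    String digest = "p0";
    long total = digest.length();
    for (int i = 1; i < n; i++) {
      String leaf = "p" + i;
      // The enclosing AND repeats the whole digest of its left child.
      digest = "AND(" + digest + ", " + leaf + ")";
      total += leaf.length() + digest.length();
    }
    return total;
  }

  /** Total digest characters when the same n operands sit under a single
   * flat AND: one digest per leaf, plus one for the AND itself. */
  static long flatDigestChars(int n) {
    List<String> leaves = new ArrayList<>();
    long total = 0;
    for (int i = 0; i < n; i++) {
      String leaf = "p" + i;
      leaves.add(leaf);
      total += leaf.length();
    }
    String digest = "AND(" + String.join(", ", leaves) + ")";
    return total + digest.length();
  }

  public static void main(String[] args) {
    int n = 2_000;
    System.out.println("nested chars: " + nestedDigestChars(n));
    System.out.println("flat chars:   " + flatDigestChars(n));
  }
}
```

In this toy model the nested total grows roughly with the square of the number
of operands, while the flat total grows linearly. That is the effect
flattening is meant to avoid.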

> RexCall toString digest gives OOM while huge expression is evaluated
> --------------------------------------------------------------------
>                 Key: CALCITE-3784
>                 URL: https://issues.apache.org/jira/browse/CALCITE-3784
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Ravi Kapoor
>            Priority: Critical
> I have a complex query with tens of thousands of Rex expressions, which are
> used in the query's filter expression.
> When the filter is created, the code below gets called:
> public RelBuilder filter(Iterable<CorrelationId> variablesSet,
>     Iterable<? extends RexNode> predicates) {
>   final RexNode simplifiedPredicates =
>       simplifier.simplifyFilterPredicates(predicates);
>   if (simplifiedPredicates == null) {
>     return empty();
>   }
> RexSimplify then adds the RexNode to a Set<RexNode>, which calls hashCode()
> internally, which in turn calls toString().
> Is there any way to avoid this computeDigest call, which creates a complex
> string object and blows up RAM usage to about 14GB?
