[ 
https://issues.apache.org/jira/browse/CALCITE-4994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483473#comment-17483473
 ] 

Julian Hyde commented on CALCITE-4994:
--------------------------------------

Yes, it would be a problem for the map in {{RelDataType}}. There is non-zero 
chance that one thread will see the map is null, starts populating it, and 
meanwhile another thread does the same thing. To do it properly you need to use 
a {{volatile}} keyword, or use double-checked locking, or protect it with 
{{Suppliers.memoize}}, and thoroughly test it, but we'd never know for sure. 
Not worth the hassle.

I haven't benchmarked it, but I think that the last solution - that uses 
{{MapBackedList}} - is better than all others. Memory usage is barely higher 
than today, the map is used throughout the lifecycle, there's no mutable state, 
and lookup performance is the best possible.

> SqlToRelConverter creates FieldMap for every Identifier Instead of Memoizing 
> it
> -------------------------------------------------------------------------------
>
>                 Key: CALCITE-4994
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4994
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Jay Narale
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> When converting from Sql To Rel, In SqlToRelConverter for every single 
> instance of an identifier we create a new map in 
> *_org.apache.calcite.sql2rel.SqlToRelConverter.Blackboard#lookupExp_*
>  
> {code:java}
> final Map<String, Integer> fieldOffsets = new HashMap<>();
> for (RelDataTypeField f : resolve.rowType().getFieldList()) {
> if (!fieldOffsets.containsKey(f.getName())) {
> fieldOffsets.put(f.getName(), f.getIndex());
> }
> }
> final Map<String, Integer> map = ImmutableMap.copyOf(fieldOffsets);{code}
>  
> So for a Sql Query
> {code:java}
> SELECT name, nation FROM customer{code}
> We would do the above operation twice.
> Memoization of this information will improve performance.
> In my database, I had observed that for a large table involving 1200 columns 
> and a huge select having multiple expressions and operators, this part was a 
> bottleneck.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

Reply via email to