[jira] [Commented] (DRILL-6101) Optimize Implicit Columns Processing

ASF GitHub Bot (JIRA) Thu, 02 Aug 2018 18:15:45 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16567660#comment-16567660
 ]


ASF GitHub Bot commented on DRILL-6101:
---------------------------------------

ilooner commented on issue #1414: DRILL-6101: Optimized implicit columns 
handling within scanner
URL: https://github.com/apache/drill/pull/1414#issuecomment-410114076
 
 
   @paul-rogers I suggest we go with this change for now, since the code change 
is minimal, is needed by some of our users, and provides a big speedup. I see 
your result set loader PRs are blocked on testing. I will be your personal 
tester and try to help you get those PRs in. When those PRs are ready, we can 
switch to using your comprehensive solution.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


> Optimize Implicit Columns Processing
> ------------------------------------
>
>                 Key: DRILL-6101
>                 URL: https://issues.apache.org/jira/browse/DRILL-6101
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>    Affects Versions: 1.12.0
>            Reporter: salim achouche
>            Assignee: salim achouche
>            Priority: Critical
>              Labels: ready-to-commit
>             Fix For: 1.15.0
>
>
> Problem Description -
>  * Apache Drill allows users to specify columns even for SELECT STAR queries
>  * From my discussion with [~paul-rogers], Apache Calcite has a limitation 
> where the extra columns are not provided
>  * The workaround has been to always include all implicit columns for SELECT 
> STAR queries
>  * Unfortunately, the current implementation is very inefficient as implicit 
> column values get duplicated; this leads to substantial performance 
> degradation when the number of rows is large
> Suggested Optimization -
>  * The NullableVarChar vector should be enhanced to efficiently store 
> duplicate values
>  * This will not only address the current Calcite limitations (for SELECT 
> STAR queries) but also optimize all queries with implicit columns
>  
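
To make the suggested optimization concrete, here is a minimal sketch of the idea of storing a repeated implicit-column value once per batch instead of once per row. The class and method names (ConstantVarCharColumn, payloadBytes, duplicatedPayloadBytes) are hypothetical and exist only for this illustration. This is not Drill's NullableVarChar vector API, and it is not necessarily how the PR implements the change.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only: NOT Drill's NullableVarChar vector API.
// Models a column in which every row of a batch carries the same value,
// which is the typical case for an implicit column within one file.
final class ConstantVarCharColumn {
    private final byte[] value;  // UTF-8 bytes stored once; null means SQL NULL
    private final int rowCount;  // number of logical rows sharing the value

    ConstantVarCharColumn(String value, int rowCount) {
        if (rowCount < 0) {
            throw new IllegalArgumentException("rowCount must be >= 0");
        }
        this.value = value == null ? null : value.getBytes(StandardCharsets.UTF_8);
        this.rowCount = rowCount;
    }

    // Every valid row index resolves to the single stored value.
    String get(int row) {
        if (row < 0 || row >= rowCount) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        return value == null ? null : new String(value, StandardCharsets.UTF_8);
    }

    int getRowCount() {
        return rowCount;
    }

    // Bytes actually held for the value: independent of rowCount.
    long payloadBytes() {
        return value == null ? 0L : value.length;
    }

    // Bytes a naive layout would need if it copied the value into every row.
    static long duplicatedPayloadBytes(String value, int rowCount) {
        if (value == null) {
            return 0L;
        }
        return (long) value.getBytes(StandardCharsets.UTF_8).length * rowCount;
    }
}
```

With this layout the value payload stays constant as the row count grows, whereas the duplicated layout grows linearly with the number of rows, which matches the degradation described above for large row counts.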



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
