[ 
https://issues.apache.org/jira/browse/DRILL-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223215#comment-16223215
 ] 

ASF GitHub Bot commented on DRILL-5889:
---------------------------------------

GitHub user ppadma opened a pull request:

    https://github.com/apache/drill/pull/1015

    DRILL-5889: Simple pattern matchers can work with DrillBuf directly

    For the four simple patterns we have (startsWith, endsWith, contains, and 
constant), we do not need the overhead of charSequenceWrapper. We can work with 
DrillBuf directly, which saves an isAscii check and UTF-8 decoding 
for each row.
    UTF-8 guarantees that the encoding of one valid character is never a 
prefix of the encoding of another. So, instead of decoding the varChar from 
each row we process, we encode the patternString once during setup and do a 
raw byte comparison. 
    This improved overall performance of the filter operator by around 20%.
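
    A minimal sketch of the idea above, not the code in this pull request: 
the pattern is encoded to UTF-8 once, and each row is then matched by comparing 
raw bytes. The class name SimplePatternSketch and its method signatures are 
hypothetical, and java.nio.ByteBuffer stands in for DrillBuf so that the 
example is self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; ByteBuffer is a stand-in for DrillBuf.
class SimplePatternSketch {
  // Pattern bytes, encoded once at setup instead of decoding every row.
  private final byte[] pattern;

  SimplePatternSketch(String patternString) {
    this.pattern = patternString.getBytes(StandardCharsets.UTF_8);
  }

  // Each method looks at the bytes in [start, end) of buf.
  boolean startsWith(ByteBuffer buf, int start, int end) {
    return end - start >= pattern.length && regionEquals(buf, start);
  }

  boolean endsWith(ByteBuffer buf, int start, int end) {
    return end - start >= pattern.length
        && regionEquals(buf, end - pattern.length);
  }

  boolean contains(ByteBuffer buf, int start, int end) {
    // A byte-level match in valid UTF-8 data can only begin on a character
    // boundary, because lead bytes and continuation bytes never overlap.
    for (int i = start; i <= end - pattern.length; i++) {
      if (regionEquals(buf, i)) {
        return true;
      }
    }
    return false;
  }

  boolean constant(ByteBuffer buf, int start, int end) {
    return end - start == pattern.length && regionEquals(buf, start);
  }

  // Compares pattern.length bytes of buf, starting at offset, to the pattern.
  private boolean regionEquals(ByteBuffer buf, int offset) {
    for (int i = 0; i < pattern.length; i++) {
      if (buf.get(offset + i) != pattern[i]) {
        return false;
      }
    }
    return true;
  }
}
```

    In this sketch a matcher would be built once per query and reused for 
every row, so the only per-row cost is the byte loop.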

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ppadma/drill DRILL-5899

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1015.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1015
    
----
commit ebe2c6f9110f0501a85cb5ae6e119e05254b9f3e
Author: Padma Penumarthy <ppenuma...@yahoo.com>
Date:   2017-10-25T20:37:25Z

    DRILL-5889: Simple pattern matchers can work with DrillBuf directly

----


> sqlline loses RPC connection
> ----------------------------
>
>                 Key: DRILL-5889
>                 URL: https://issues.apache.org/jira/browse/DRILL-5889
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Relational Operators
>    Affects Versions: 1.11.0
>            Reporter: Robert Hou
>            Assignee: Pritesh Maker
>         Attachments: 26183ef9-44b2-ef32-adf8-cc2b5ba9f9c0.sys.drill, 
> drillbit.log
>
>
> Query is:
> {noformat}
> alter session set `planner.memory.max_query_memory_per_node` = 10737418240;
> select count(*), max(`filename`) from dfs.`/drill/testdata/hash-agg/data1` 
> group by no_nulls_col, nulls_col;
> {noformat}
> Error is:
> {noformat}
> 0: jdbc:drill:drillbit=10.10.100.190> select count(*), max(`filename`) from 
> dfs.`/drill/testdata/hash-agg/data1` group by no_nulls_col, nulls_col;
> Error: CONNECTION ERROR: Connection /10.10.100.190:45776 <--> 
> /10.10.100.190:31010 (user client) closed unexpectedly. Drillbit down?
> [Error Id: db4aea70-11e6-4e63-b0cc-13cdba0ee87a ] (state=,code=0)
> {noformat}
> From drillbit.log:
> {noformat}
> 2017-10-18 14:04:23,044 [UserServer-1] INFO  
> o.a.drill.exec.rpc.user.UserServer - RPC connection /10.10.100.190:31010 <--> 
> /10.10.100.190:45776 (user server) timed out.  Timeout was set to 30 seconds. 
> Closing connection.
> {noformat}
> Plan is:
> {noformat}
> 00-00    Screen
> 00-01      Project(EXPR$0=[$0], EXPR$1=[$1])
> 00-02        UnionExchange
> 01-01          Project(EXPR$0=[$2], EXPR$1=[$3])
> 01-02            HashAgg(group=[{0, 1}], EXPR$0=[$SUM0($2)], EXPR$1=[MAX($3)])
> 01-03              Project(no_nulls_col=[$0], nulls_col=[$1], EXPR$0=[$2], 
> EXPR$1=[$3])
> 01-04                HashToRandomExchange(dist0=[[$0]], dist1=[[$1]])
> 02-01                  UnorderedMuxExchange
> 03-01                    Project(no_nulls_col=[$0], nulls_col=[$1], 
> EXPR$0=[$2], EXPR$1=[$3], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($1, 
> hash32AsDouble($0, 1301011))])
> 03-02                      HashAgg(group=[{0, 1}], EXPR$0=[COUNT()], 
> EXPR$1=[MAX($2)])
> 03-03                        Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/hash-agg/data1]], 
> selectionRoot=maprfs:/drill/testdata/hash-agg/data1, numFiles=1, 
> usedMetadataFile=false, columns=[`no_nulls_col`, `nulls_col`, `filename`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
