[jira] [Created] (DRILL-4852) COUNT(*) query over 26M rows slower by 2x on mapr drill 1.8.0

Khurram Faraaz (JIRA) Thu, 18 Aug 2016 06:22:48 -0700

Khurram Faraaz created DRILL-4852:
-------------------------------------

             Summary: COUNT(*) query over 26M rows slower by 2x on mapr drill 1.8.0
                 Key: DRILL-4852
                 URL: https://issues.apache.org/jira/browse/DRILL-4852
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.8.0
         Environment: 4 node cluster CentOS
            Reporter: Khurram Faraaz
            Priority: Critical



We have a manual test that runs a COUNT(*) over 26 million JSON keys. The 
results suggest a regression: the query is about 2x slower on current master 
(1.8.0-SNAPSHOT, git commit ID 57dc9f43) than on Drill 1.2.0. It takes around 
30 seconds to execute, consistently over several runs. Note that because this 
is a single large JSON file, only one fragment does all the work.

{noformat}
0: jdbc:drill:schema=dfs.tmp> select count(*) from `twoKeyJsn.json`;
+-----------+
|  EXPR$0   |
+-----------+
| 26212355  |
+-----------+
1 row selected (29.001 seconds)
{noformat}

On Drill 1.2.0, the same query took 13.949 seconds.
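
For reference, the "2x" figure can be checked from the two timings quoted in 
this report. The snippet below is only a sketch of that arithmetic: the 
function and constant names are illustrative and not part of Drill, and it 
assumes both runs used the same file and cluster, which the report implies but 
does not state. With these numbers the slowdown works out to about 2.08x.

```python
# Figures copied from this report; nothing here is measured by the script.
ROWS = 26_212_355        # COUNT(*) result for twoKeyJsn.json
SECONDS_1_8_0 = 29.001   # 1.8.0-SNAPSHOT, git commit 57dc9f43
SECONDS_1_2_0 = 13.949   # Drill 1.2.0


def slowdown(new_seconds: float, old_seconds: float) -> float:
    """Return how many times slower the new run is than the old run."""
    return new_seconds / old_seconds


def rows_per_second(rows: int, seconds: float) -> float:
    """Return the average scan rate for a single run."""
    return rows / seconds


ratio = slowdown(SECONDS_1_8_0, SECONDS_1_2_0)
print(f"slowdown: {ratio:.2f}x")
print(f"1.2.0 rate: {rows_per_second(ROWS, SECONDS_1_2_0):,.0f} rows/s")
print(f"1.8.0 rate: {rows_per_second(ROWS, SECONDS_1_8_0):,.0f} rows/s")
```

Because a single fragment does all the work, these per-run rates are 
effectively single-thread scan rates rather than cluster throughput.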





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
