[jira] [Commented] (DRILL-4321) Difference in results count distinct with min max query on JDK8

Khurram Faraaz (JIRA) Mon, 08 Feb 2016 01:52:23 -0800

    [ 
https://issues.apache.org/jira/browse/DRILL-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136759#comment-15136759
 ]


Khurram Faraaz commented on DRILL-4321:
---------------------------------------

Here is the full JSON profile of the query that returns different results.

Test:
Functional/aggregates/aggregation/count_distinct/with_min_max_c_float_group_by_1_cols.sql

Query: select count(distinct c_float), max(c_float), min(c_float) from
alltypes_with_nulls group by c_date order by c_date

Full JSON profile:

{noformat}
{
    "id": {
        "part1": 2974517851546440000,
        "part2": 8109210720415190000
    },
    "type": 1,
    "start": 1454924741833,
    "end": 1454924742416,
    "query": "select count(distinct c_float), max(c_float), min(c_float) from 
alltypes_with_nulls group by  c_date order by  c_date",
    "plan": "00-00    Screen : rowType = RecordType(BIGINT EXPR$0, ANY EXPR$1, 
ANY EXPR$2): rowcount = 10.0, cumulative cost = {444.0 rows, 5588.877123795494 
cpu, 0.0 io, 0.0 network, 4832.0 memory}, id = 4499875\n00-01      
Project(EXPR$0=[$0], EXPR$1=[$1], EXPR$2=[$2]) : rowType = RecordType(BIGINT 
EXPR$0, ANY EXPR$1, ANY EXPR$2): rowcount = 10.0, cumulative cost = {443.0 
rows, 5587.877123795494 cpu, 0.0 io, 0.0 network, 4832.0 memory}, id = 
4499874\n00-02        Project(EXPR$0=[$4], EXPR$1=[$1], EXPR$2=[$2], 
c_date=[$0]) : rowType = RecordType(BIGINT EXPR$0, ANY EXPR$1, ANY EXPR$2, ANY 
c_date): rowcount = 10.0, cumulative cost = {443.0 rows, 5587.877123795494 cpu, 
0.0 io, 0.0 network, 4832.0 memory}, id = 4499873\n00-03          
MergeJoin(condition=[IS NOT DISTINCT FROM($0, $3)], joinType=[inner]) : rowType 
= RecordType(ANY c_date, ANY EXPR$1, ANY EXPR$2, ANY c_date0, BIGINT EXPR$0): 
rowcount = 10.0, cumulative cost = {443.0 rows, 5587.877123795494 cpu, 0.0 io, 
0.0 network, 4832.0 memory}, id = 4499872\n00-05            
SelectionVectorRemover : rowType = RecordType(ANY c_date, ANY EXPR$1, ANY 
EXPR$2): rowcount = 10.0, cumulative cost = {220.0 rows, 3542.8771237954943 
cpu, 0.0 io, 0.0 network, 2000.0000000000002 memory}, id = 4499865\n00-07       
       Sort(sort0=[$0], dir0=[ASC]) : rowType = RecordType(ANY c_date, ANY 
EXPR$1, ANY EXPR$2): rowcount = 10.0, cumulative cost = {210.0 rows, 
3532.8771237954943 cpu, 0.0 io, 0.0 network, 2000.0000000000002 memory}, id = 
4499864\n00-09                HashAgg(group=[{0}], EXPR$1=[MAX($1)], 
EXPR$2=[MIN($1)]) : rowType = RecordType(ANY c_date, ANY EXPR$1, ANY EXPR$2): 
rowcount = 10.0, cumulative cost = {200.0 rows, 3400.0 cpu, 0.0 io, 0.0 
network, 1760.0000000000002 memory}, id = 4499863\n00-11                  
Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=maprfs:///drill/testdata/aggregation/alltypes_with_nulls]], 
selectionRoot=maprfs:/drill/testdata/aggregation/alltypes_with_nulls, 
numFiles=1, usedMetadataFile=false, columns=[`c_date`, `c_float`]]]) : rowType 
= RecordType(ANY c_date, ANY c_float): rowcount = 100.0, cumulative cost = 
{100.0 rows, 200.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4499862\n00-04   
         Project(c_date0=[$0], EXPR$0=[$1]) : rowType = RecordType(ANY c_date0, 
BIGINT EXPR$0): rowcount = 1.0, cumulative cost = {212.0 rows, 2001.0 cpu, 0.0 
io, 0.0 network, 2832.0 memory}, id = 4499871\n00-06              
SelectionVectorRemover : rowType = RecordType(ANY c_date, BIGINT EXPR$0): 
rowcount = 1.0, cumulative cost = {212.0 rows, 2001.0 cpu, 0.0 io, 0.0 network, 
2832.0 memory}, id = 4499870\n00-08                Sort(sort0=[$0], dir0=[ASC]) 
: rowType = RecordType(ANY c_date, BIGINT EXPR$0): rowcount = 1.0, cumulative 
cost = {211.0 rows, 2000.0 cpu, 0.0 io, 0.0 network, 2832.0 memory}, id = 
4499869\n00-10                  HashAgg(group=[{0}], EXPR$0=[COUNT($1)]) : 
rowType = RecordType(ANY c_date, BIGINT EXPR$0): rowcount = 1.0, cumulative 
cost = {210.0 rows, 2000.0 cpu, 0.0 io, 0.0 network, 2816.0 memory}, id = 
4499868\n00-12                    HashAgg(group=[{0, 1}]) : rowType = 
RecordType(ANY c_date, ANY c_float): rowcount = 10.0, cumulative cost = {200.0 
rows, 1800.0 cpu, 0.0 io, 0.0 network, 2640.0 memory}, id = 4499867\n00-13      
                Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
[path=maprfs:///drill/testdata/aggregation/alltypes_with_nulls]], 
selectionRoot=maprfs:/drill/testdata/aggregation/alltypes_with_nulls, 
numFiles=1, usedMetadataFile=false, columns=[`c_date`, `c_float`]]]) : rowType 
= RecordType(ANY c_date, ANY c_float): rowcount = 100.0, cumulative cost = 
{100.0 rows, 200.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 4499866\n",
    "foreman": {
        "address": "centos-04.qa.lab",
        "userPort": 31010,
        "controlPort": 31011,
        "dataPort": 31012
    },
    "state": 2,
    "totalFragments": 1,
    "finishedFragments": 0,
    "fragmentProfile": [
        {
            "majorFragmentId": 0,
            "minorFragmentProfile": [
                {
                    "state": 3,
                    "minorFragmentId": 0,
                    "operatorProfile": [
                        {
                            "inputProfile": [
                                {
                                    "records": 100,
                                    "batches": 1,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 11,
                            "operatorType": 21,
                            "setupNanos": 0,
                            "processNanos": 1199300,
                            "peakLocalMemoryAllocated": 3328,
                            "waitNanos": 404338
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 100,
                                    "batches": 1,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 9,
                            "operatorType": 3,
                            "setupNanos": 86644364,
                            "processNanos": 79392986,
                            "peakLocalMemoryAllocated": 3015936,
                            "metric": [
                                {
                                    "metricId": 0,
                                    "longValue": 65536
                                },
                                {
                                    "metricId": 2,
                                    "longValue": 0
                                },
                                {
                                    "metricId": 1,
                                    "longValue": 70
                                },
                                {
                                    "metricId": 3,
                                    "longValue": 0
                                }
                            ],
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 7,
                            "operatorType": 17,
                            "setupNanos": 0,
                            "processNanos": 2027130,
                            "peakLocalMemoryAllocated": 10657664,
                            "metric": [
                                {
                                    "metricId": 2,
                                    "longValue": 1
                                }
                            ],
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 2
                                }
                            ],
                            "operatorId": 5,
                            "operatorType": 14,
                            "setupNanos": 13539290,
                            "processNanos": 452781,
                            "peakLocalMemoryAllocated": 77824,
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 100,
                                    "batches": 1,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 13,
                            "operatorType": 21,
                            "setupNanos": 0,
                            "processNanos": 848316,
                            "peakLocalMemoryAllocated": 3328,
                            "waitNanos": 540899
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 100,
                                    "batches": 1,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 12,
                            "operatorType": 3,
                            "setupNanos": 54747060,
                            "processNanos": 9265289,
                            "peakLocalMemoryAllocated": 1835008,
                            "metric": [
                                {
                                    "metricId": 0,
                                    "longValue": 65536
                                },
                                {
                                    "metricId": 2,
                                    "longValue": 0
                                },
                                {
                                    "metricId": 1,
                                    "longValue": 100
                                },
                                {
                                    "metricId": 3,
                                    "longValue": 0
                                }
                            ],
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 100,
                                    "batches": 2,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 10,
                            "operatorType": 3,
                            "setupNanos": 31256696,
                            "processNanos": 20199873,
                            "peakLocalMemoryAllocated": 1967104,
                            "metric": [
                                {
                                    "metricId": 0,
                                    "longValue": 65536
                                },
                                {
                                    "metricId": 2,
                                    "longValue": 0
                                },
                                {
                                    "metricId": 1,
                                    "longValue": 70
                                },
                                {
                                    "metricId": 3,
                                    "longValue": 0
                                }
                            ],
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 8,
                            "operatorType": 17,
                            "setupNanos": 0,
                            "processNanos": 2049647,
                            "peakLocalMemoryAllocated": 10657408,
                            "metric": [
                                {
                                    "metricId": 2,
                                    "longValue": 1
                                }
                            ],
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 2
                                }
                            ],
                            "operatorId": 6,
                            "operatorType": 14,
                            "setupNanos": 11824877,
                            "processNanos": 419656,
                            "peakLocalMemoryAllocated": 69632,
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 2
                                }
                            ],
                            "operatorId": 4,
                            "operatorType": 10,
                            "setupNanos": 205129,
                            "processNanos": 22102,
                            "peakLocalMemoryAllocated": 69632,
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 2
                                },
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 3,
                            "operatorType": 5,
                            "setupNanos": 21716244,
                            "processNanos": 1223759,
                            "peakLocalMemoryAllocated": 2363904,
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 2
                                }
                            ],
                            "operatorId": 2,
                            "operatorType": 10,
                            "setupNanos": 234101,
                            "processNanos": 43473,
                            "peakLocalMemoryAllocated": 1769472,
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 1,
                            "operatorType": 10,
                            "setupNanos": 71897,
                            "processNanos": 21247,
                            "peakLocalMemoryAllocated": 1179648,
                            "waitNanos": 0
                        },
                        {
                            "inputProfile": [
                                {
                                    "records": 70,
                                    "batches": 2,
                                    "schemas": 1
                                }
                            ],
                            "operatorId": 0,
                            "operatorType": 13,
                            "setupNanos": 0,
                            "processNanos": 74316,
                            "peakLocalMemoryAllocated": 1179648,
                            "metric": [
                                {
                                    "metricId": 0,
                                    "longValue": 1260
                                }
                            ],
                            "waitNanos": 1127511
                        }
                    ],
                    "startTime": 1454924742036,
                    "endTime": 1454924742410,
                    "memoryUsed": 0,
                    "maxMemoryUsed": 54015936,
                    "endpoint": {
                        "address": "centos-04.qa.lab",
                        "userPort": 31010,
                        "controlPort": 31011,
                        "dataPort": 31012
                    },
                    "lastUpdate": 1454924742410,
                    "lastProgress": 1454924742410
                }
            ]
        }
    ],
    "user": "mapr"
}

{noformat}
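
For anyone reading the profile above by hand, here is a minimal Python sketch that lists the input record counts per operator. It assumes the 00-NN labels in the plan text correspond to majorFragmentId and operatorId in the profile, which matches the data here (operator 3 has two inputs, as a MergeJoin would). The PROFILE_EXCERPT constant and the summarize_operators helper are illustrative names, not part of Drill; the excerpt holds two operator entries copied from the profile and trimmed to the fields used.

```python
import json

# Trimmed excerpt of the profile above: two operator entries copied from it,
# keeping only the fields this sketch reads. The name is illustrative.
PROFILE_EXCERPT = """
{
  "fragmentProfile": [
    {
      "majorFragmentId": 0,
      "minorFragmentProfile": [
        {
          "minorFragmentId": 0,
          "operatorProfile": [
            {"inputProfile": [{"records": 100, "batches": 1, "schemas": 1}],
             "operatorId": 11, "operatorType": 21, "processNanos": 1199300},
            {"inputProfile": [{"records": 70, "batches": 2, "schemas": 2},
                              {"records": 70, "batches": 2, "schemas": 1}],
             "operatorId": 3, "operatorType": 5, "processNanos": 1223759}
          ]
        }
      ]
    }
  ]
}
"""


def summarize_operators(profile):
    """Return (major id, minor id, operator id, input record counts, processNanos) rows."""
    rows = []
    for major in profile["fragmentProfile"]:
        for minor in major["minorFragmentProfile"]:
            for op in minor["operatorProfile"]:
                records = [inp["records"] for inp in op.get("inputProfile", [])]
                rows.append((major["majorFragmentId"], minor["minorFragmentId"],
                             op["operatorId"], records, op["processNanos"]))
    # Sort by operator id so rows can be read next to the 00-NN plan labels
    # (assumed mapping, see the note above).
    return sorted(rows, key=lambda row: row[2])


summary = summarize_operators(json.loads(PROFILE_EXCERPT))
for major, minor, op_id, records, nanos in summary:
    print(f"{major:02d}-{op_id:02d} (minor {minor}): "
          f"input records={records}, processNanos={nanos}")
```

Running the same helper on the complete profile (saved from the Drill web UI, for example) would give one row per operator, which makes it easier to see where the record count drops from 100 to 70.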

> Difference in results count distinct with min max query on JDK8
> ---------------------------------------------------------------
>
>                 Key: DRILL-4321
>                 URL: https://issues.apache.org/jira/browse/DRILL-4321
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.4.0
>         Environment: 4 node cluster
>            Reporter: Khurram Faraaz
>            Assignee: Deneche A. Hakim
>              Labels: JDK8SUPPORT
>         Attachments: expected_results.res
>
>
> A count distinct query with min, max, group by, and order by returns incorrect
> results on MapR Drill 1.4.0, MapR FS 5.0.0 GA, and JDK8.
> The difference is in the way values are rounded after the decimal point when
> using JDK8.
> The expected results file can be found here:
> https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/aggregates/aggregation/count_distinct/with_min_max_c_float_group_by_1_cols.res
> The failing query is
> Functional/aggregates/aggregation/count_distinct/with_min_max_c_float_group_by_1_cols.sql
> {noformat}
> select count(distinct c_float), max(c_float), min(c_float) from 
> alltypes_with_nulls group by  c_date order by  c_date;
> {noformat}
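
If the mismatch really is only in how digits after the decimal point are rendered, one way to confirm it is to compare the result cells by value instead of by text. The following Python sketch does that under stated assumptions: the helper name same_float_result, the tolerance, and the two sample cells are all hypothetical and are not taken from the actual test output or from the test framework.

```python
import math


def same_float_result(expected: str, actual: str, rel_tol: float = 1e-6) -> bool:
    """Compare two rendered result cells by numeric value rather than by text."""
    # Non-numeric cells such as "null" fall back to a plain text comparison.
    try:
        return math.isclose(float(expected), float(actual), rel_tol=rel_tol)
    except ValueError:
        return expected == actual


# Hypothetical cells for illustration only; not from the real results file.
expected_cell = "1.2345678E9"
actual_cell = "1.23456781E9"

text_match = expected_cell == actual_cell
value_match = same_float_result(expected_cell, actual_cell)
print(f"text match: {text_match}, value match: {value_match}")
```

A row set that fails the text comparison but passes the value comparison would point to a rendering difference between JDK versions rather than a wrong aggregate.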



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
