[
https://issues.apache.org/jira/browse/DRILL-4453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Deneche A. Hakim closed DRILL-4453.
-----------------------------------
Resolution: Not A Problem
Assignee: Deneche A. Hakim (was: Khurram Faraaz)
There was an issue with the original data that was causing the wrong results
> Difference in results over char data, window function query
> -----------------------------------------------------------
>
> Key: DRILL-4453
> URL: https://issues.apache.org/jira/browse/DRILL-4453
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Flow
> Affects Versions: 1.6.0
> Environment: 4 node cluster
> Reporter: Khurram Faraaz
> Assignee: Deneche A. Hakim
> Labels: window_function
> Attachments: t_alltype.csv, t_alltype.parquet
>
>
> Window function query with frame clause returns results that are different
> from those returned by same query on Postgres 9.3 of same data.
> Note that the two tables have same number of nulls in both Drill and Postgres.
> The length of the result returned by MIN function is different on Postgres
> 9.3 vs Drill 1.6.0
> Drill 1.6.0 => returns 1 as length.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select length(min(c4)) from dfs.tmp.`t_alltype`;
> +---------+
> | EXPR$0 |
> +---------+
> | 1 |
> +---------+
> 1 row selected (0.282 seconds)
> {noformat}
> Postgress 9.3 returns 0 as length.
> {noformat}
> postgres=# select length(min(c4)) from t_alltype;
> length
> --------
> 0
> (1 row)
> {noformat}
> {noformat}
> postgres=# \d t_alltype
> Table "public.t_alltype"
> Column | Type | Modifiers
> --------+-----------------------------+-----------
> c1 | integer |
> c2 | integer |
> c3 | bigint |
> c4 | character(256) |
> c5 | character varying(256) |
> c6 | timestamp without time zone |
> c7 | date |
> c8 | boolean |
> c9 | double precision |
> postgres=# select c4 from t_alltype where c4 is null;
> c4
> ----
> (3 rows)
> {noformat}
> {noformat}
> postgres=# SELECT MIN(c4) OVER(PARTITION BY c8 ORDER BY c1 ROWS BETWEEN
> UNBOUNDED PRECEDING AND CURRENT ROW) FROM t_alltype;
>
> min
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> gwfrW
> ZAFOcferhjkcl
> ZAFOcferhjkcl
> ZAFOcferhjkcl
> ZAFOcferhjkcl
> ...
> ...
>
> ApKK
> ApKK
> (145 rows)
> {noformat}
> Parquet schema details
> {noformat}
> [root@centos-01 parquet-tools]# ./parquet-schema
> ./Datasources/window_functions/t_alltype.parquet
> message root {
> optional int32 c1;
> optional int32 c2;
> optional int64 c3;
> optional binary c4 (UTF8);
> optional binary c5 (UTF8);
> optional int64 c6 (TIMESTAMP_MILLIS);
> optional int32 c7 (DATE);
> optional boolean c8;
> optional double c9;
> }
> {noformat}
> On Drill 1.6.0
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> SELECT MIN(c4) OVER(PARTITION BY c8 ORDER BY c1
> ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM dfs.tmp.`t_alltype`;
> +--------------------------------+
> | EXPR$0 |
> +--------------------------------+
> | gwfrW |
> | ZAFOcferhjkcl |
> | ZAFOcferhjkcl |
> | ZAFOcferhjkcl |
> | ZAFOcferhjkcl |
> ...
> ...
> | ApKK |
> | ApKK |
> | |
> | |
> | |
> | |
> | |
> | |
> | |
> | |
> | |
> | |
> | null |
> | null |
> | |
> | |
> | |
> +------------------------------+
> 145 rows selected (0.409 seconds)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)