[
https://issues.apache.org/jira/browse/DRILL-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341223#comment-14341223
]
Sean Hsuan-Yi Chu commented on DRILL-2328:
------------------------------------------
According to the SQL standard, the concat operator || should return null if any
input is null (Postgres obeys this rule). However, I have not yet found any rule
for the concat function.
I also ran tests on these two systems:
1. Postgres:
concat function: ignores null (or treats it as "");
concat operator: if any input is null, the output is null
2. MySQL:
behaves in exactly the opposite way to Postgres...
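
To make the two semantics concrete, here is a minimal sketch. It is not Drill code: it assumes SQLite (through Python's sqlite3 module) as a stand-in engine, because its || operator also propagates null, and it emulates the "treat null as empty string" behavior with COALESCE rather than calling a real concat function. The helper name scalar is made up for illustration.

```python
import sqlite3

# Assumption: SQLite is only a convenient stand-in engine here, not Drill,
# Postgres, or MySQL. Its || operator returns NULL when any operand is NULL.
conn = sqlite3.connect(":memory:")


def scalar(sql):
    """Run a single-value query and return that value (None means SQL NULL)."""
    return conn.execute(sql).fetchone()[0]


# Operator semantics: a NULL operand makes the whole result NULL.
operator_result = scalar("SELECT CAST(NULL AS VARCHAR(10)) || '--'")

# Null-ignoring semantics (NULL treated as ""), emulated with COALESCE.
null_ignoring_result = scalar(
    "SELECT COALESCE(CAST(NULL AS VARCHAR(10)), '') || '--'"
)

print(operator_result)
print(null_ignoring_result)
```

The first query is the same expression as in the bug report; under null-propagating semantics it yields null instead of '--'.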
> Concat operator returns wrong result when one of the operands is NULL
> ---------------------------------------------------------------------
>
> Key: DRILL-2328
> URL: https://issues.apache.org/jira/browse/DRILL-2328
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 0.8.0
> Reporter: Victoria Markman
> Assignee: Sean Hsuan-Yi Chu
> Priority: Critical
>
> The queries below should return NULL:
> {code}
> 0: jdbc:drill:schema=dfs> select cast(null as varchar(10)) || '--' from t1;
> +------------+
> | EXPR$0 |
> +------------+
> | -- |
> | -- |
> | -- |
> | -- |
> | -- |
> | -- |
> | -- |
> | -- |
> | -- |
> | -- |
> +------------+
> 10 rows selected (0.09 seconds)
> 0: jdbc:drill:schema=dfs> select a1 || '--' from t1 where a1 is null;
> +------------+
> | EXPR$0 |
> +------------+
> | -- |
> +------------+
> 1 row selected (0.105 seconds)
> {code}
> This looks harmless at first, but it breaks a very common pattern in customer
> queries: grouping by an expression built with '||', as follows:
> {code}
> select
>   cast(extract(day from c_timestamp) as varchar(10)) || '-' ||
>   cast(extract(month from c_timestamp) as varchar(10)) || '-' ||
>   cast(extract(year from c_timestamp) as varchar(10)),
>   sum(c_integer) as sum1
> from
>   alltypes_with_nulls
> group by
>   cast(extract(day from c_timestamp) as varchar(10)) || '-' ||
>   cast(extract(month from c_timestamp) as varchar(10)) || '-' ||
>   cast(extract(year from c_timestamp) as varchar(10))
> order by
>   cast(extract(day from c_timestamp) as varchar(10)) || '-' ||
>   cast(extract(month from c_timestamp) as varchar(10)) || '-' ||
>   cast(extract(year from c_timestamp) as varchar(10))
> ;
> {code}