[ https://issues.apache.org/jira/browse/HIVE-17416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160935#comment-16160935 ]
Zoltan Haindrich commented on HIVE-17416:
-----------------------------------------
I was able to reduce the test case to a simpler query:
{code}
select distinct concat(field_name, 'xxx') as l, concat(field_name, 'xXx') as r
from t;
{code}
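A minimal setup for trying the simplified query might look like the following. This is only a sketch: the comment does not say what t contains, so the table definition (a single string column named field_name) and the sample row are assumptions made for illustration.
{code}
-- Hypothetical setup; the schema and the sample row are assumptions.
create table t (field_name string);
insert into table t values ('a');

-- The simplified query from above: the two expressions differ only in the
-- case of the string literal ('xxx' vs 'xXx').
select distinct concat(field_name, 'xxx') as l, concat(field_name, 'xXx') as r
from t;
{code}
With correct DISTINCT handling, this should return a single row in which l is axxx and r is axXx, because the two columns are computed from different literals.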
> Hive Distinct changes column value
> ----------------------------------
>
> Key: HIVE-17416
> URL: https://issues.apache.org/jira/browse/HIVE-17416
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.1
> Reporter: Manoj Durisheti
>
> Hive 1.2.1000.2.6.1.0-129
> The query below uses DISTINCT, which is expected only to deduplicate the
> result rows, but it also alters the data.
> *Query without Distinct:*
> select
>   REGEXP_EXTRACT(UPPER(field_name), '([A-Z]_[0-9]*[A-Z]?)\\??.*', 1) r_field_name,
>   REGEXP_EXTRACT(UPPER(field_name), '([A-Z]_[0-9]*[a-z]?)\\??.*', 1) w_field_name
> from alpha.table_name
> where datestamp = 20170805
>   and field_name = 'https://www.abcd.com/details/123-main-st-abcde-xx-84004-5434484-e_2300a'
> ;
> Result:
> e_2300a e_2300
> e_2300a e_2300
> e_2300a e_2300
> e_2300a e_2300
> e_2300a e_2300
> *Query with Distinct:*
> select distinct
>   REGEXP_EXTRACT(UPPER(field_name), '([A-Z]_[0-9]*[A-Z]?)\\??.*', 1) r_field_name,
>   REGEXP_EXTRACT(UPPER(field_name), '([A-Z]_[0-9]*[a-z]?)\\??.*', 1) w_field_name
> from alpha.table_name
> where datestamp = 20170805
>   and field_name = 'https://www.abcd.com/details/123-main-st-abcde-xx-84004-5434484-e_2300a'
> ;
> Result:
> e_2300 e_2300
> *Expected result with Distinct:*
> e_2300a e_2300