[ 
https://issues.apache.org/jira/browse/PHOENIX-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16214243#comment-16214243
 ] 

Dumindu Buddhika commented on PHOENIX-4139:
-------------------------------------------

[~jamestaylor] The logic for setting hasSeparator is below:
{code:java}
this.hasSeparator = !isFixedLength && (datum != data.get(data.size()-1));
{code}

isFixedLength is false here. In this scenario the same column (TRIM(NAM)) is 
repeated, and I think datum holds a reference to the same PDatum object for 
each repeated column. Because of that, datum != data.get(data.size()-1) 
evaluates to false, which is why hasSeparator is not set. We may need to use 
distinct PDatum objects here (though I do not know the performance impact of 
that), or we need to change this logic.
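
To make the suspected failure mode concrete, here is a minimal, self-contained sketch. It is not Phoenix code: Datum is a hypothetical stand-in for PDatum, the list contents are invented for illustration, and byPosition is only one possible way to "change this logic", not a proposed patch. It assumes, as guessed above, that both TRIM(NAM) entries share one object.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SeparatorFlagSketch {
    // Hypothetical stand-in for a PDatum; only object identity matters here.
    static final class Datum {
        final String name;
        Datum(String name) { this.name = name; }
    }

    // Mirrors the quoted check: a reference comparison against the last element.
    static List<Boolean> byReference(List<Datum> data, boolean isFixedLength) {
        List<Boolean> flags = new ArrayList<>();
        Datum last = data.get(data.size() - 1);
        for (Datum datum : data) {
            flags.add(!isFixedLength && datum != last);
        }
        return flags;
    }

    // One possible alternative (an assumption, not the committed fix):
    // decide by position in the list instead of by object identity.
    static List<Boolean> byPosition(List<Datum> data, boolean isFixedLength) {
        List<Boolean> flags = new ArrayList<>();
        for (int i = 0; i < data.size(); i++) {
            flags.add(!isFixedLength && i != data.size() - 1);
        }
        return flags;
    }

    public static void main(String[] args) {
        Datum constant = new Datum("'harshit'");
        Datum trim = new Datum("TRIM(NAM)");
        // The same Datum instance appears twice, as suspected for repeated columns.
        List<Datum> data = Arrays.asList(constant, trim, trim);
        System.out.println(byReference(data, false)); // [true, false, false]
        System.out.println(byPosition(data, false));  // [true, true, false]
    }
}
```

With the reference check, the first TRIM(NAM) entry is treated as if it were the last one and gets no separator, which would be consistent with the concatenated values in the report. The positional check avoids that without allocating extra objects.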



> select distinct with identical aggregations return weird values 
> ----------------------------------------------------------------
>
>                 Key: PHOENIX-4139
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4139
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.12.0
>         Environment: minicluster
>            Reporter: Csaba Skrabak
>            Assignee: Csaba Skrabak
>            Priority: Minor
>             Fix For: 4.13.0
>
>         Attachments: PHOENIX-4139.patch
>
>
> From sme-hbase hipchat room:
> Pulkit Bhardwaj·10:31
> i'm seeing a weird issue with phoenix, appreciate some thoughts
> Created a simple table in phoenix
> {noformat}
> 0: jdbc:phoenix:> create table test_select(nam VARCHAR(20), address VARCHAR(20), id BIGINT
> . . . . . . . . > constraint my_pk primary key (id));
> 0: jdbc:phoenix:> upsert into test_select (nam, address,id) values('pulkit','badaun',1);
> 0: jdbc:phoenix:> select * from test_select;
> +---------+----------+-----+
> |   NAM   | ADDRESS  | ID  |
> +---------+----------+-----+
> | pulkit  | badaun   | 1   |
> +---------+----------+-----+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", nam from test_select;
> +--------------+---------+
> | test_column  |   NAM   |
> +--------------+---------+
> | harshit      | pulkit  |
> +--------------+---------+
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam) from test_select;
> +--------------+----------------+----------------+
> | test_column  |   TRIM(NAM)    |   TRIM(NAM)    |
> +--------------+----------------+----------------+
> | harshit      | pulkitpulkit  | pulkitpulkit  |
> +--------------+----------------+----------------+
> {noformat}
> When I apply a trim on the nam column and use it multiple times, the output 
> has the cell data duplicated!
> {noformat}
> 0: jdbc:phoenix:> select distinct 'harshit' as "test_column", trim(nam), trim(nam), trim(nam) from test_select;
> +--------------+-----------------------+-----------------------+-----------------------+
> | test_column  |       TRIM(NAM)       |       TRIM(NAM)       |       TRIM(NAM)       |
> +--------------+-----------------------+-----------------------+-----------------------+
> | harshit      | pulkitpulkitpulkit  | pulkitpulkitpulkit  | pulkitpulkitpulkit  |
> +--------------+-----------------------+-----------------------+-----------------------+
> {noformat}
> Wondering if someone has seen this before??
> One thing to note: if I remove the distinct 'harshit' as "test_column" part, 
> the issue is not seen.
> {noformat}
> 0: jdbc:phoenix:> select trim(nam), trim(nam), trim(nam) from test_select;
> +------------+------------+------------+
> | TRIM(NAM)  | TRIM(NAM)  | TRIM(NAM)  |
> +------------+------------+------------+
> | pulkit     | pulkit     | pulkit     |
> +------------+------------+------------+
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
