Gerald Sangudi created PHOENIX-5353:
---------------------------------------
Summary: Incorrect results when JOINs project array elements
Key: PHOENIX-5353
URL: https://issues.apache.org/jira/browse/PHOENIX-5353
Project: Phoenix
Issue Type: Bug
Reporter: Gerald Sangudi
There are two errors observed below, one with HASH JOIN and the other with SORT
MERGE JOIN.
Here are the steps for reproducing the bug in array element projection when
using sort merge join.
I also noticed there is an issue with hash-joins (see the last example).
* Create tables and upsert test data.
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> create table d (c1 integer
primary key, c2 integer, c3 integer[]);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into d values (1, 1,
array[1,2,3]);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into d values (2, 10,
array[2,3,4]);{quote}
{quote} {quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> create table t2 (c1 integer
primary key, c2 integer);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into t2 values (1,
10);{quote}
{quote}0: jdbc:phoenix:localhost:2181:/hbase> upsert into t2 values (2,
20);{quote}{quote}
* Verify data in the tables
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select * from d;
+-----+-----+------------+
| C1 | C2 | C3 |
+-----+-----+------------+
| 1 | 1 | [1, 2, 3] |
| 2 | 10 | [2, 3, 4] |
+-----+-----+------------+
2 rows selected (0.027 seconds)
0: jdbc:phoenix:localhost:2181:/hbase> select * from t2;
+-----+-----+
| C1 | C2 |
+-----+-----+
| 1 | 10 |
| 2 | 20 |
+-----+-----+
2 rows selected (0.021 seconds){quote}{quote}
* Perform sort merge join without projecting array elements. This works fine.
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select /*+
use_sort_merge_join */ d.c1, d.c2, d.c3, t2.c1, t2.c2 from d join t2 on d.c1 =
t2.c1;{quote}
{quote}+-------+-------+------------+--------+--------+{quote}
{quote}| D.C1 | D.C2 | D.C3 | T2.C1 | T2.C2 |{quote}
{quote}+-------+-------+------------+--------+--------+{quote}
{quote}| 1 | 1 | [1, 2, 3] | 1 | 10 |{quote}
{quote}| 2 | 10 | [2, 3, 4] | 2 | 20 |{quote}
{quote}+-------+-------+------------+--------+--------+{quote}
{quote}2 rows selected (0.054 seconds){quote}{quote}
* Perform sort merge join by projecting array elements. *Returns junk data for
array elements.*
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select /*+
use_sort_merge_join */ d.c1, d.c2, d.c3[1], d.c3[2], d.c3[3], t2.c1, t2.c2 from
d join t2 on d.c1 = t2.c1;{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote}
{quote}| D.C1 | D.C2 | ARRAY_ELEM(D.C3, 1) | ARRAY_ELEM(D.C3, 2) |
ARRAY_ELEM(D.C3, 3) | T2.C1 | T2.C2 |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote}
{quote}| 1 | 1 | {color:#ff0000}-1937768448{color} |
{color:#ff0000}-2122317824{color} | {color:#ff0000}-2105540608{color}
| 1 | 10 |{quote}
{quote}| 2 | 10 | {color:#ff0000}-1937768448{color} |
{color:#ff0000}-2105540608{color} | {color:#ff0000}-2088763392{color}
| 2 | 20 |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------+--------+{quote}
{quote}2 rows selected (0.043 seconds){quote}{quote}
* Array element projection works fine when using hash-join but *columns from
the non-array table are messed up*.
{quote}{quote}0: jdbc:phoenix:localhost:2181:/hbase> select d.c1, d.c2,
d.c3[1], d.c3[2], d.c3[3], t2.c1, t2.c2 from d join t2 on d.c1 = t2.c1;{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote}
{quote}| D.C1 | D.C2 | ARRAY_ELEM(D.C3, 1) | ARRAY_ELEM(D.C3, 2) |
ARRAY_ELEM(D.C3, 3) | T2.C1 | T2.C2 |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote}
{quote}| 1 | 1 | 1 | 2 | 3
| {color:#ff0000}-2146795520{color} |
{color:#ff0000}-2147319808{color} |{quote}
{quote}| 2 | 10 | 2 | 3 | 4
| {color:#ff0000}-2146140160{color} |
{color:#ff0000}-2147319808{color} |{quote}
{quote}+-------+-------+----------------------+----------------------+----------------------+--------------+--------------+{quote}
{quote}2 rows selected (0.067 seconds){quote}{quote}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)