[
https://issues.apache.org/jira/browse/PHOENIX-4791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581930#comment-16581930
]
Gerald Sangudi commented on PHOENIX-4791:
-----------------------------------------
Here's my hypothesis--
(1) the projection of array elements, without the full array, is adding a new
Cell to the Tuple. This cell is identified by arrayElementCellPosition and
arrayElementCell.
(2) ProjectedTupleValue can only contain one Cell, so it loses information
about the array cell.
(3) ProjectedTupleValue is used by HashJoinRegionScanner to merge the join
inputs; see the calls to TupleProjector.mergeProjectedValue().
Therefore, hash join loses the array cell.
[~jamestaylor], is my understanding correct? If so, how can this be fixed? Can
HashJoinRegionScanner combine the merged join tuple with the array cell before
adding the new combined tuple to the resultQueue?
> Array elements are nullified with joins
> ---------------------------------------
>
> Key: PHOENIX-4791
> URL: https://issues.apache.org/jira/browse/PHOENIX-4791
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 4.11.0, 4.12.0, 4.13.0, 4.14.0
> Reporter: Tulasi P
> Priority: Major
> Fix For: 4.15.0
>
>
> Returning elements of an array from a table that is part of a join causes
> array elements to be nullified.
> {noformat}
> create table array_test_1 (id integer not null primary key, arr tinyint[5]);
> upsert into array_test_1 values (1001, array[0, 0, 0, 0, 0]);
> upsert into array_test_1 values (1002, array[0, 0, 0, 0, 1]);
> upsert into array_test_1 values (1003, array[0, 0, 0, 1, 1]);
> upsert into array_test_1 values (1004, array[0, 0, 1, 1, 1]);
> upsert into array_test_1 values (1005, array[1, 1, 1, 1, 1]);
> {noformat}
>
> {noformat}
> create table test_table_1 (id integer not null primary key, val varchar);
> upsert into test_table_1 values (1001, 'abc');
> upsert into test_table_1 values (1002, 'def');
> upsert into test_table_1 values (1003, 'ghi');{noformat}
> {noformat}
> 0: jdbc:phoenix:localhost> select t1.id, t2.val, t1.arr[1], t1.arr[2],
> t1.arr[3] from array_test_1 as t1 join test_table_1 as t2 on t1.id = t2.id;
> +--------+---------+---------+------------------------+---------------+
> | T1.ID | T2.VAL | ARRAY_ELEM(T1.ARR, 1) | ARRAY_ELEM(T1.ARR, 2) |
> ARRAY_ELEM(T1.ARR, 3) |
> +--------+---------+---------+-----------------+------------------------+
> | 1001 | abc | null | null | null |
> | 1002 | def | null | null | null |
> | 1003 | ghi | null | null | null |
> +--------+---------+--------+------------------------+-----------------+
> 3 rows selected (0.056 seconds)
> {noformat}
> However, directly selecting array elements from the array returns data
> correctly.
> {noformat}
> 0: jdbc:phoenix:localhost> select [t1.id, t1.arr[1], t1.arr[2], t1.arr[3]
> from array_test_1 as t1;
> +-------+--------+-------------+-------------+
> | ID | ARRAY_ELEM(ARR, 1) | ARRAY_ELEM(ARR, 2) | ARRAY_ELEM(ARR, 3) |
> +-------+--------+-------------+-------------+
> | 1001 | 0 | 0 | 0 |
> | 1002 | 0 | 0 | 0 |
> | 1003 | 0 | 0 | 0 |
> | 1004 | 0 | 0 | 1 |
> | 1005 | 1 | 1 | 1 |
> +-------+-------+--------------+--------------+
> 5 rows selected (0.044 seconds)
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)