[ 
https://issues.apache.org/jira/browse/DRILL-4684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Egbert van der Veen updated DRILL-4684:
---------------------------------------
    Description: 
When joining a RDBMS table on itself (for instance; with a table that contains 
both subitems and main items), Drill fails to correctly retrieve the result 
columns; the columns beloning to the first instance that the table is 
referenced, are always NULL.

This has been tested using both MySQL and PostgreSQL databases as storage 
providers.

See attached screenshots for expected result (which works when queried directly 
against the database), and the actual result through Drill

Code to reproduce:

{code:sql}
--create table
CREATE TABLE `join_test` (
  `id` int(11) NOT NULL,
  `type` varchar(50) DEFAULT NULL,
  `mainItem` int(11) DEFAULT NULL,
  `description` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`)
) ;

--insert testdata; one main item and two subitems. For subitems, column 
'mainItem' is a reference to the parent main item
INSERT INTO join_test (id, type, mainItem, description) VALUES (1, 'mainItem', 
null, 'First main item');
INSERT INTO join_test (id, type, mainItem, description) VALUES (2, 'subItem', 
1, 'First subitem of first main item');
INSERT INTO join_test (id, type, mainItem, description) VALUES (3, 'subItem', 
1, 'Second subitem of first main item');

--perform query
SELECT main.type firsttype, main.description firstdescription, sub.type 
secondtype, sub.description seconddescription 
FROM join_test sub
JOIN join_test main ON sub.mainItem = main.id
{code}

  was:
When joining a RDBMS table on itself (for instance; with a table that contains 
both subitems and main items), Drill fails to correctly retrieve the result 
columns; the columns beloning to the first instance of the table that is 
referenced, are always NULL.

This has been tested using both MySQL and PostgreSQL databases as storage 
providers.

See attached screenshots for expected result (which works when queried directly 
against the database), and the actual result through Drill

Code to reproduce:

{code:sql}
--create table
CREATE TABLE `join_test` (
  `id` int(11) NOT NULL,
  `type` varchar(50) DEFAULT NULL,
  `mainItem` int(11) DEFAULT NULL,
  `description` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id` (`id`)
) ;

--insert testdata; one main item and two subitems. For subitems, column 
'mainItem' is a reference to the parent main item
INSERT INTO join_test (id, type, mainItem, description) VALUES (1, 'mainItem', 
null, 'First main item');
INSERT INTO join_test (id, type, mainItem, description) VALUES (2, 'subItem', 
1, 'First subitem of first main item');
INSERT INTO join_test (id, type, mainItem, description) VALUES (3, 'subItem', 
1, 'Second subitem of first main item');

--perform query
SELECT main.type firsttype, main.description firstdescription, sub.type 
secondtype, sub.description seconddescription 
FROM join_test sub
JOIN join_test main ON sub.mainItem = main.id
{code}


> Incorrect behavior when joining RDBMS table on itself
> -----------------------------------------------------
>
>                 Key: DRILL-4684
>                 URL: https://issues.apache.org/jira/browse/DRILL-4684
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.6.0
>            Reporter: Egbert van der Veen
>         Attachments: actual result.png, expected result.png
>
>
> When joining a RDBMS table on itself (for instance; with a table that 
> contains both subitems and main items), Drill fails to correctly retrieve the 
> result columns; the columns beloning to the first instance that the table is 
> referenced, are always NULL.
> This has been tested using both MySQL and PostgreSQL databases as storage 
> providers.
> See attached screenshots for expected result (which works when queried 
> directly against the database), and the actual result through Drill
> Code to reproduce:
> {code:sql}
> --create table
> CREATE TABLE `join_test` (
>   `id` int(11) NOT NULL,
>   `type` varchar(50) DEFAULT NULL,
>   `mainItem` int(11) DEFAULT NULL,
>   `description` varchar(255) DEFAULT NULL,
>   PRIMARY KEY (`id`),
>   UNIQUE KEY `id` (`id`)
> ) ;
> --insert testdata; one main item and two subitems. For subitems, column 
> 'mainItem' is a reference to the parent main item
> INSERT INTO join_test (id, type, mainItem, description) VALUES (1, 
> 'mainItem', null, 'First main item');
> INSERT INTO join_test (id, type, mainItem, description) VALUES (2, 'subItem', 
> 1, 'First subitem of first main item');
> INSERT INTO join_test (id, type, mainItem, description) VALUES (3, 'subItem', 
> 1, 'Second subitem of first main item');
> --perform query
> SELECT main.type firsttype, main.description firstdescription, sub.type 
> secondtype, sub.description seconddescription 
> FROM join_test sub
> JOIN join_test main ON sub.mainItem = main.id
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to