[ 
https://issues.apache.org/jira/browse/TAJO-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jaehwa Jung updated TAJO-585:
-----------------------------

    Description: 
I found a bug in which TajoCli prints unexpected results for union queries.

First, TajoCli prints the wrong row count (the header reports 5 rows, but 10 rows are printed), as follows:
{code:xml}
tajo> select id from table1 union all select id from table2;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0004/RESULT, 5 rows (21 B)
id
-------------------------------
6
7
8
9
10
1
2
3
4
5
{code}

Second, if the empty table comes last in the union, TajoCli reports zero rows and prints no rows at all, as follows:
{code:xml}

tajo> select id from table1 union all select id from table3;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0005/RESULT, 0 rows (10 B)
id
-------------------------------
{code}

For reference, I created test tables as follows:
{code:xml}
CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT) 
USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
'hdfs://localhost:9010/tajo/warehouse/table1';

CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT) 
USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
'hdfs://localhost:9010/tajo/warehouse/table2';

CREATE EXTERNAL TABLE table3 (id INT4, name TEXT, score FLOAT4, type TEXT) 
USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
'hdfs://localhost:9010/tajo/warehouse/table3';

tajo> select * from table1;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0001/RESULT, 5 rows (60 B)
id,  name,  score,  type
-------------------------------
1,  ooo,  1.1,  a
2,  ppp,  2.3,  b
3,  qqq,  3.4,  c
4,  rrr,  4.5,  d
5,  xxx,  5.6,  e

tajo> select * from table2;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0002/RESULT, 5 rows (61 B)
id,  name,  score,  type
-------------------------------
6,  ooo,  1.1,  a
7,  ppp,  2.3,  b
8,  qqq,  3.4,  c
9,  rrr,  4.5,  d
10,  xxx,  5.6,  e

tajo> select * from table3;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0003/RESULT, 0 rows (0 B)
id,  name,  score,  type
-------------------------------
{code}
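
For comparison, here is a minimal sketch of the row counts these two queries should produce under standard SQL UNION ALL semantics. It is not Tajo code: it assumes Python's standard sqlite3 module as a stand-in engine, keeps only the id column of the test tables, and the helper name union_all_ids is hypothetical. Under those assumptions, the first query should yield 10 rows and the second should yield 5.

```python
import sqlite3

# In-memory SQLite stand-in for the three Tajo test tables (id column only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);  -- left empty on purpose
""")
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 6)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(6, 11)])


def union_all_ids(left, right):
    """Return the ids produced by `left UNION ALL right` (hypothetical helper)."""
    sql = f"SELECT id FROM {left} UNION ALL SELECT id FROM {right}"
    return [row[0] for row in conn.execute(sql)]


rows_1_2 = union_all_ids("table1", "table2")
rows_1_3 = union_all_ids("table1", "table3")

print(len(rows_1_2))  # expected row count for table1 UNION ALL table2
print(len(rows_1_3))  # expected row count for table1 UNION ALL table3
```

The point of the comparison is only that the count shown in the result header should equal the number of rows printed, and that an empty operand should not discard the rows of the other operand.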


  was:
I found a bug when I used an empty table located on the local file system 
in a union query, as follows:
{code:xml}
tajo> select * from table1;
result: file:/tmp/tajo-blrunner/staging/q_1391568986216_0001/RESULT, 5 rows (72 B)
id,  name,  score,  type
-------------------------------
1,  ooo,  1.1,  a
2,  ppp,  2.3,  b
3,  qqq,  3.4,  c
4,  rrr,  4.5,  d
5,  xxx,  5.6,  e

tajo> select * from table2;
result: file:/tmp/tajo-blrunner/staging/q_1391568986216_0002/RESULT, 0 rows (0 B)
id,  name,  score,  type,  part
-------------------------------

tajo> select id, name from table2 union select id, name from table1 ;
result: file:/tmp/tajo-blrunner/staging/q_1391576101475_0003/RESULT, 5 rows (42 B)
id,  name
-------------------------------
1,  ooo
2,  ppp
3,  qqq
4,  rrr
5,  xxx

tajo> select id, name from table1 union select id, name from table2 ;
result: file:/tmp/tajo-blrunner/staging/q_1391576101475_0002/RESULT, 0 rows (42 B)
id,  name
-------------------------------

{code}

In this case, the position of the empty table in the query affected the 
result. I created the tables as follows:
{code:xml}
CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT) 
USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
'hdfs://localhost:9010/tajo/warehouse/table1';

CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT) 
USING CSV WITH ('csvfile.delimiter'='|') LOCATION 
'file:/tmp/tajo-blrunner/warehouse/table2';
{code}


> TajoCli prints unexpected results in union query.
> -------------------------------------------------
>
>                 Key: TAJO-585
>                 URL: https://issues.apache.org/jira/browse/TAJO-585
>             Project: Tajo
>          Issue Type: Bug
>          Components: physical operator
>            Reporter: Jaehwa Jung
>            Assignee: Jaehwa Jung



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
