[ https://issues.apache.org/jira/browse/TAJO-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jaehwa Jung updated TAJO-585:
-----------------------------
Description:
I found a bug where TajoCli prints unexpected results for union queries.
First, TajoCli reports the wrong row count. The header says 5 rows, but 10 rows are printed:
{code:xml}
tajo> select id from table1 union all select id from table2;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0004/RESULT, 5 rows (21 B)
id
-------------------------------
6
7
8
9
10
1
2
3
4
5
{code}
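As a sanity check on the expected output: standard UNION ALL semantics keep every row from both inputs, so the header should report 10 rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for Tajo (an assumption made only for illustration; it is not Tajo code) and models only the id column. It also checks that the reported 21 B is consistent with ten ids, assuming the result file stores one newline-terminated id per row.

```python
import sqlite3

# SQLite stands in for Tajo here; only the id column is modelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER)")
conn.execute("CREATE TABLE table2 (id INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 6)])
conn.executemany("INSERT INTO table2 VALUES (?)", [(i,) for i in range(6, 11)])

rows = conn.execute(
    "SELECT id FROM table1 UNION ALL SELECT id FROM table2"
).fetchall()

expected_row_count = len(rows)
# Assumption: the result file holds one newline-terminated id per row.
expected_size_bytes = sum(len(f"{row[0]}\n") for row in rows)

print(expected_row_count, expected_size_bytes)  # prints "10 21"
```

Under that assumption the byte size in the header already matches ten rows, so only the row count looks wrong.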
If the empty table is the last operand of the union, TajoCli reports zero rows and prints no rows at all:
{code:xml}
tajo> select id from table1 union all select id from table3;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0005/RESULT, 0 rows (10 B)
id
-------------------------------
{code}
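For comparison, the expected result here is the five rows of table1, regardless of which side the empty table is on. The sketch below again uses Python's sqlite3 module as an illustrative stand-in for Tajo (an assumption, not Tajo code). The reported 10 B would match five one-digit ids, each followed by a newline, which suggests (though the report does not confirm it) that the result data was written and only the row count is wrong.

```python
import sqlite3

# SQLite stands in for Tajo here; only the id column is modelled.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER)")
conn.execute("CREATE TABLE table3 (id INTEGER)")  # left empty on purpose
conn.executemany("INSERT INTO table1 VALUES (?)", [(i,) for i in range(1, 6)])

empty_last = conn.execute(
    "SELECT id FROM table1 UNION ALL SELECT id FROM table3"
).fetchall()
empty_first = conn.execute(
    "SELECT id FROM table3 UNION ALL SELECT id FROM table1"
).fetchall()

ids_empty_last = sorted(row[0] for row in empty_last)
ids_empty_first = sorted(row[0] for row in empty_first)

# Assumption: the result file holds one newline-terminated id per row.
size_bytes = sum(len(f"{i}\n") for i in ids_empty_last)

print(ids_empty_last, ids_empty_first, size_bytes)
```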
For reference, I created test tables as follows:
{code:xml}
CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT)
USING CSV WITH ('csvfile.delimiter'='|') LOCATION
'hdfs://localhost:9010/tajo/warehouse/table1';
CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT)
USING CSV WITH ('csvfile.delimiter'='|') LOCATION
'hdfs://localhost:9010/tajo/warehouse/table2';
CREATE EXTERNAL TABLE table3 (id INT4, name TEXT, score FLOAT4, type TEXT)
USING CSV WITH ('csvfile.delimiter'='|') LOCATION
'hdfs://localhost:9010/tajo/warehouse/table3';
tajo> select * from table1;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0001/RESULT, 5 rows (60 B)
id, name, score, type
-------------------------------
1, ooo, 1.1, a
2, ppp, 2.3, b
3, qqq, 3.4, c
4, rrr, 4.5, d
5, xxx, 5.6, e
tajo> select * from table2;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0002/RESULT, 5 rows (61 B)
id, name, score, type
-------------------------------
6, ooo, 1.1, a
7, ppp, 2.3, b
8, qqq, 3.4, c
9, rrr, 4.5, d
10, xxx, 5.6, e
tajo> select * from table3;
result: hdfs://localhost:9010/tmp/tajo-blrunner/staging/q_1391680780275_0003/RESULT, 0 rows (0 B)
id, name, score, type
-------------------------------
{code}
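The report does not include the data files themselves. To reproduce the setup, a minimal sketch like the following could generate them, assuming each row is stored as pipe-delimited fields terminated by a newline (matching the 'csvfile.delimiter'='|' option). The helper name to_delimited is hypothetical. The computed sizes agree with the 60 B, 61 B and 0 B shown in the select * output, which supports the assumed layout. Uploading the files to the table locations is left out.

```python
# Rows copied from the select * output above; scores kept as strings
# so they are written exactly as displayed.
TABLE1_ROWS = [
    (1, "ooo", "1.1", "a"),
    (2, "ppp", "2.3", "b"),
    (3, "qqq", "3.4", "c"),
    (4, "rrr", "4.5", "d"),
    (5, "xxx", "5.6", "e"),
]
TABLE2_ROWS = [
    (6, "ooo", "1.1", "a"),
    (7, "ppp", "2.3", "b"),
    (8, "qqq", "3.4", "c"),
    (9, "rrr", "4.5", "d"),
    (10, "xxx", "5.6", "e"),
]
TABLE3_ROWS = []  # the empty table


def to_delimited(rows, delimiter="|"):
    """Hypothetical helper: render rows as delimiter-separated lines."""
    return "".join(
        delimiter.join(str(field) for field in row) + "\n" for row in rows
    )


data = {
    "table1": to_delimited(TABLE1_ROWS),
    "table2": to_delimited(TABLE2_ROWS),
    "table3": to_delimited(TABLE3_ROWS),
}
sizes = {name: len(text.encode("utf-8")) for name, text in data.items()}

print(sizes)  # prints {'table1': 60, 'table2': 61, 'table3': 0}
```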
was:
I found a bug when an empty table located on the local file system is used in a
union query, as follows:
{code:xml}
tajo> select * from table1;
result: file:/tmp/tajo-blrunner/staging/q_1391568986216_0001/RESULT, 5 rows (72 B)
id, name, score, type
-------------------------------
1, ooo, 1.1, a
2, ppp, 2.3, b
3, qqq, 3.4, c
4, rrr, 4.5, d
5, xxx, 5.6, e
tajo> select * from table2;
result: file:/tmp/tajo-blrunner/staging/q_1391568986216_0002/RESULT, 0 rows (0 B)
id, name, score, type, part
-------------------------------
tajo> select id, name from table2 union select id, name from table1 ;
result: file:/tmp/tajo-blrunner/staging/q_1391576101475_0003/RESULT, 5 rows (42 B)
id, name
-------------------------------
1, ooo
2, ppp
3, qqq
4, rrr
5, xxx
tajo> select id, name from table1 union select id, name from table2 ;
result: file:/tmp/tajo-blrunner/staging/q_1391576101475_0002/RESULT, 0 rows (42 B)
id, name
-------------------------------
{code}
In this case, the position of the empty table in the union affected the result.
I created the tables as follows:
{code:xml}
CREATE EXTERNAL TABLE table1 (id INT4, name TEXT, score FLOAT4, type TEXT)
USING CSV WITH ('csvfile.delimiter'='|') LOCATION
'hdfs://localhost:9010/tajo/warehouse/table1';
CREATE EXTERNAL TABLE table2 (id INT4, name TEXT, score FLOAT4, type TEXT)
USING CSV WITH ('csvfile.delimiter'='|') LOCATION
'file:/tmp/tajo-blrunner/warehouse/table2';
{code}
> TajoCli prints unexpected results in union query.
> -------------------------------------------------
>
> Key: TAJO-585
> URL: https://issues.apache.org/jira/browse/TAJO-585
> Project: Tajo
> Issue Type: Bug
> Components: physical operator
> Reporter: Jaehwa Jung
> Assignee: Jaehwa Jung
--
This message was sent by Atlassian JIRA (v6.1.5#6160)