Jaehwa Jung created TAJO-1685:
---------------------------------
Summary: Query occasionally fails in fully distributed mode when the
table data is located on the local file system.
Key: TAJO-1685
URL: https://issues.apache.org/jira/browse/TAJO-1685
Project: Tajo
Issue Type: Improvement
Components: Java Client, SQL Shell
Reporter: Jaehwa Jung
Tajo allows the location of a table to be set to a path on the local file
system, for example, “file:///home/tajo/xyz”. When querying such a table in
pseudo-distributed mode, the query finishes successfully. Pseudo-distributed
mode for Tajo means that TajoMaster and TajoWorker run on the same host. But
when querying the data in fully distributed mode, the query fails because the
data isn’t located on all hosts running a TajoWorker. In this case, users see
an ambiguous error message, as follows.
{code:xml}
default> create external table table1 (
> id int,
> name text,
> score float,
> type text)
> using text with ('text.delimiter'='|') location
> 'file:///home/tajo/data.csv'
> ;
OK
default> \d table1;
table name: default.table1
table uri: file:///home/tajo/data.csv
store type: text
number of rows: unknown
volume: 60 B
Options:
'text.delimiter'='|'
schema:
id INT4
name TEXT
score FLOAT4
type TEXT
default> select * from table1;
id, name, score, type
-------------------------------
1, abc, 1.1, a
2, def, 2.3, b
3, ghi, 3.4, c
4, jkl, 4.5, d
5, mno, 5.6, e
(5 rows, 0.081 sec, 60 B selected)
default> select count(*) from table1;
ERROR: No error message
{code}
It isn’t easy for users to work out the cause of this error. We need to
print a well-defined message to avoid confusing users.
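
As a rough illustration of the kind of check that could produce such a message, here is a minimal sketch. It is not Tajo's actual code: the class name, method names, host names, and message wording are all hypothetical, and it assumes the caller already knows the master host and the set of worker hosts. It only uses java.net.URI and standard collections.

```java
import java.net.URI;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper, not part of Tajo: flags tables whose data sits on a
// local file system while workers run on hosts other than the master.
public class LocalTableLocationCheck {

    // True when the table URI uses the local "file" scheme.
    static boolean isLocalFileUri(URI tableUri) {
        return "file".equalsIgnoreCase(tableUri.getScheme());
    }

    // Returns a descriptive message when the table is on a local path and at
    // least one worker runs on a host other than the master; otherwise empty.
    static Optional<String> describeProblem(String tableName, URI tableUri,
                                            String masterHost, Set<String> workerHosts) {
        if (!isLocalFileUri(tableUri)) {
            return Optional.empty();
        }
        // Assumption: pseudo-distributed mode means every worker shares the master's host.
        boolean allOnMasterHost = workerHosts.stream().allMatch(masterHost::equals);
        if (allOnMasterHost) {
            return Optional.empty();
        }
        return Optional.of(String.format(
            "Table '%s' is located at local path '%s', which may not exist on all "
            + "%d TajoWorker hosts. Copy the data to every host or use a shared file system.",
            tableName, tableUri.getPath(), workerHosts.size()));
    }

    public static void main(String[] args) {
        URI location = URI.create("file:///home/tajo/data.csv");
        // Hypothetical host names for a fully distributed cluster.
        Set<String> workers = Set.of("worker1", "worker2");
        describeProblem("default.table1", location, "master1", workers)
            .ifPresent(msg -> System.out.println("ERROR: " + msg));
    }
}
```

Returning an Optional keeps the check separate from how the message is reported, so the same logic could back either a warning at table creation time or an error at query time.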
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)