[ https://issues.apache.org/jira/browse/TAJO-1339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyunsik Choi reassigned TAJO-1339:
----------------------------------
Assignee: Hyunsik Choi (was: Keuntae Park)
> Incorrect handling of tables with custom delimiter when their data contain '|'
> ------------------------------------------------------------------------------
>
> Key: TAJO-1339
> URL: https://issues.apache.org/jira/browse/TAJO-1339
> Project: Tajo
> Issue Type: Bug
> Reporter: Keuntae Park
> Assignee: Hyunsik Choi
>
> With the table data
> {code}
> 1;a;1.1
> 2;a|b;2.4
> 3;b|c|d;3.2
> {code}
> and external table declaration
> {code}
> create external table delimiter (id int, name text, score float) using csv
> with ('csvfile.delimiter'=';') location 'xxx';
> {code}
> the query 'select name, score from delimiter' returns the following
> incorrect result:
> {code}
> name,score
> -------------------------------
> a,1.1
> a,null
> b,null
> {code}
> It looks like the '|' in the name column is being treated as a delimiter.
> From inspecting the code, table meta information such as 'csvfile.delimiter'
> is only honored by the leaf scan operation. All other operations (including
> writing intermediate data and materializing the query result) assume that
> the delimiter is DEFAULT_FIELD_DELIMITER, which is '|'.
> Hence, if the plan includes a step that writes intermediate data, a '|' in
> the data is handled as a delimiter even though it is not one.
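The behavior described above can be reproduced outside Tajo with a small simulation. The sketch below is not Tajo code: the function names (leaf_scan, write_intermediate, read_intermediate, run_query) are hypothetical, and it assumes the intermediate stage simply joins and re-splits fields on the delimiter without any escaping. Only DEFAULT_FIELD_DELIMITER = '|' and the sample data come from the report.

```python
# Illustrative simulation only; none of these functions are Tajo APIs.
DEFAULT_FIELD_DELIMITER = "|"  # default delimiter named in the report

RAW_LINES = ["1;a;1.1", "2;a|b;2.4", "3;b|c|d;3.2"]


def leaf_scan(lines, delimiter):
    """Split raw lines with the table's own delimiter (honored at the leaf scan)."""
    return [line.split(delimiter) for line in lines]


def write_intermediate(rows, columns, delimiter=DEFAULT_FIELD_DELIMITER):
    """Serialize the projected columns, joining fields with the given delimiter."""
    return [delimiter.join(row[c] for c in columns) for row in rows]


def parse_float(text):
    """Return a float, or None when the text is not numeric (shown as null)."""
    try:
        return float(text)
    except ValueError:
        return None


def read_intermediate(lines, delimiter=DEFAULT_FIELD_DELIMITER):
    """Re-parse intermediate lines as (name, score) pairs."""
    result = []
    for line in lines:
        fields = line.split(delimiter)
        result.append((fields[0], parse_float(fields[1])))
    return result


def run_query(lines, table_delimiter, intermediate_delimiter):
    """Simulate 'select name, score from delimiter' with an intermediate stage."""
    rows = leaf_scan(lines, table_delimiter)
    staged = write_intermediate(rows, columns=[1, 2], delimiter=intermediate_delimiter)
    return read_intermediate(staged, delimiter=intermediate_delimiter)


# Intermediate data written with the default '|' (the reported behavior).
buggy = run_query(RAW_LINES, ";", DEFAULT_FIELD_DELIMITER)
# Intermediate data written with the table's own ';' (one possible remedy).
consistent = run_query(RAW_LINES, ";", ";")

print(buggy)       # [('a', 1.1), ('a', None), ('b', None)]
print(consistent)  # [('a', 1.1), ('a|b', 2.4), ('b|c|d', 3.2)]
```

The first result matches the incorrect output in the report. Carrying the table's delimiter through to the intermediate stage is only one way to avoid the collision in this toy model; the actual fix in Tajo may differ.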