[
https://issues.apache.org/jira/browse/DRILL-5239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074021#comment-16074021
]
Arina Ielchiieva commented on DRILL-5239:
-----------------------------------------
Roman, given the data set below, with the skip header option set to false:
{noformat}
# Exported from server abcd at 2017:07:01T01:00:00
# Server log version 2.3
time,recv-ip,bytes,status,...
data1, data2, data3, data4,...
data1, data2, data3, data4,...
#data1, data2, data3, data4,...
{noformat}
will we be able to skip the leading comment lines up to the header, but treat
later lines that start with # as data? That is, will the result be:
{noformat}
time,recv-ip,bytes,status,...
data1, data2, data3, data4,...
data1, data2, data3, data4,...
#data1, data2, data3, data4,...
{noformat}
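The behavior asked about above can be sketched as follows. This is only a minimal Python illustration of the question, not Drill's text reader; `strip_leading_comments` and its `comment` parameter are hypothetical names introduced here, and the sketch assumes that comment lines are recognized by a leading `#` and that the first non-comment line is the header.

```python
from typing import Iterable, List


def strip_leading_comments(lines: Iterable[str], comment: str = "#") -> List[str]:
    """Drop comment lines only until the first non-comment line is seen.

    Hypothetical helper for illustration; not part of Drill.
    """
    kept: List[str] = []
    in_preamble = True
    for line in lines:
        if in_preamble and line.startswith(comment):
            continue  # still inside the leading comment block, skip it
        in_preamble = False  # header reached; every later line is data
        kept.append(line)
    return kept


# Sample taken from the comment above (the trailing "..." is kept as-is).
sample = [
    "# Exported from server abcd at 2017:07:01T01:00:00",
    "# Server log version 2.3",
    "time,recv-ip,bytes,status,...",
    "data1, data2, data3, data4,...",
    "data1, data2, data3, data4,...",
    "#data1, data2, data3, data4,...",
]
result = strip_leading_comments(sample)
```

In this sketch the two export comments are dropped, while the final `#data1, ...` line is kept because it comes after the header.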
> Drill text reader reports wrong results when column value starts with '#'
> -------------------------------------------------------------------------
>
> Key: DRILL-5239
> URL: https://issues.apache.org/jira/browse/DRILL-5239
> Project: Apache Drill
> Issue Type: Bug
> Components: Storage - Text & CSV
> Affects Versions: 1.10.0
> Reporter: Rahul Challapalli
> Assignee: Roman
> Priority: Blocker
>
> git.commit.id.abbrev=2af709f
> Data Set :
> {code}
> D|32
> 8h|234
> ;#|3489
> ^$*(|308
> #|98
> {code}
> Wrong Result : (Last row is missing)
> {code}
> select columns[0] as col1, columns[1] as col2 from dfs.`/drill/testdata/wtf2.tbl`;
> +-------+-------+
> | col1 | col2 |
> +-------+-------+
> | D | 32 |
> | 8h | 234 |
> | ;# | 3489 |
> | ^$*( | 308 |
> +-------+-------+
> 4 rows selected (0.233 seconds)
> {code}
> However, the issue does not occur with a Parquet file.