[
https://issues.apache.org/jira/browse/DRILL-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372425#comment-16372425
]
Paul Rogers commented on DRILL-6176:
------------------------------------
The code works as designed and configured.
You are running afoul of a feature of the CSV (text) reader: lines starting with
the '#' character are treated as comments and ignored. The sixth line of the
file starts with '#':
{noformat}
#@$%@#$%@#$%#%@#$%#^@%^$&%&*^%&*^#%@$%...
{noformat}
Start the line with any other character and the line won't be ignored.
The # character is used in some CSV-like files such as Microsoft IIS access
logs.
There is another JIRA for this issue. I thought we allowed setting the comment
character to 0 to disable the feature, but that fix is not in the code, so it
may never have been done.
The text format plugin defines the following property:
{code:java}
public char comment = '#';
{code}
In your format plugin config for the ".tbl" suffix, change the comment
character to something that does not appear in your file. Not pretty, but you
can try backspace, which should never occur: `\b`.
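To illustrate why the row disappears, and why changing the comment character brings it back, here is a minimal sketch of the skipping rule. It is not Drill's reader code: the class name, method name, and sample rows are hypothetical, and it assumes only what is described above, namely that a line whose first character equals the comment character is dropped silently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: mimics the comment-skipping rule described in this comment.
// This is NOT Drill's reader; the class and method names are hypothetical.
final class CommentSkipDemo {

    // Returns the lines kept when every line whose first character equals
    // the comment character is dropped. Empty lines are kept here, which is
    // an assumption of this sketch, not a statement about Drill.
    static List<String> readRecords(List<String> lines, char comment) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (!line.isEmpty() && line.charAt(0) == comment) {
                continue; // treated as a comment: dropped without any warning
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample rows; the middle one starts with '#'.
        List<String> lines = Arrays.asList("a|b|c", "#@$%@#$%", "d|e|f");

        // Default comment character: the '#' row is skipped.
        System.out.println(readRecords(lines, '#').size());  // prints 2

        // Backspace as the comment character: no row starts with it.
        System.out.println(readRecords(lines, '\b').size()); // prints 3
    }
}
```

With '#' as the comment character the middle row is lost; with `\b` all three rows survive, which is the effect the config change is meant to have.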
> Drill skips a row when querying a text file but does not report it.
> -------------------------------------------------------------------
>
> Key: DRILL-6176
> URL: https://issues.apache.org/jira/browse/DRILL-6176
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Data Types
> Affects Versions: 1.12.0
> Reporter: Robert Hou
> Assignee: Pritesh Maker
> Priority: Critical
> Attachments: 10.tbl
>
>
> I tried to query 10 rows from a tbl file. It skipped the 6th row, which only
> has special symbols in it. So it shows 9 rows. And there was no warning
> that a row is skipped.
> I checked the special symbols. The same symbols appear in other rows.
> This also occurs if the file is a csv file.