[ 
https://issues.apache.org/jira/browse/DRILL-6176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16372425#comment-16372425
 ] 

Paul Rogers commented on DRILL-6176:
------------------------------------

The code works as designed and configured.

You are running afoul of a feature of the CSV reader: lines starting with the 
'#' character are treated as comments and ignored. The sixth line of your file 
starts with '#':
{noformat}
#@$%@#$%@#$%#%@#$%#^@%^$&%&*^%&*^#%@$%...
{noformat}
Start the line with any other character and the line won't be ignored.

The # character is used in some CSV-like files such as Microsoft IIS access 
logs.

There is another JIRA for this issue. I thought we allowed setting the comment 
character to 0 to disable the feature, but I don't see that fix in the code, 
so it may never have been done.

The text format plugin defines the following property:
{code:java}
    public char comment = '#';
{code}
In your format plugin config for the ".tbl" suffix, change the comment 
character to something that does not occur in your file. It's not pretty, but 
you can try backspace (`\b`), which should never occur in a text file.
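
To make the behavior concrete, here is a minimal sketch of comment-line skipping. It is not Drill's reader; the class {{CommentSkipSketch}} and the method {{readRows}} are hypothetical names made up for illustration, and the sample lines are invented. It only shows why a line starting with the comment character silently disappears, and why choosing a character that never appears in the file brings the row back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hypothetical stand-in for a text reader that
// honors a single-character comment marker. Not Drill code.
class CommentSkipSketch {

    // Returns every line that does not start with the comment character.
    static List<String> readRows(List<String> lines, char comment) {
        List<String> rows = new ArrayList<>();
        for (String line : lines) {
            if (!line.isEmpty() && line.charAt(0) == comment) {
                continue; // treated as a comment: dropped without a warning
            }
            rows.add(line);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Invented sample data; the second line starts with '#'.
        List<String> lines = Arrays.asList(
            "1|first row",
            "#@$%@#$% row made of symbols",
            "3|third row");

        // With '#' as the comment character, the second line is skipped.
        System.out.println(readRows(lines, '#').size());

        // With backspace as the comment character, nothing is skipped.
        System.out.println(readRows(lines, '\b').size());
    }
}
```

With the default '#' the sketch returns two rows. With '\b' it returns all three, which is the effect the config change above is meant to achieve.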

> Drill skips a row when querying a text file but does not report it.
> -------------------------------------------------------------------
>
>                 Key: DRILL-6176
>                 URL: https://issues.apache.org/jira/browse/DRILL-6176
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.12.0
>            Reporter: Robert Hou
>            Assignee: Pritesh Maker
>            Priority: Critical
>         Attachments: 10.tbl
>
>
> I tried to query 10 rows from a tbl file.  It skipped the 6th row, which only 
> has special symbols in it.  So it shows 9 rows.  And there was no warning 
> that a row is skipped.
> I checked the special symbols.  The same symbols appear in other rows.
> This also occurs if the file is a csv file.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
