[ https://issues.apache.org/jira/browse/DRILL-7869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Curtis Lambert updated DRILL-7869:
----------------------------------
    Description: 
Querying CSV files that use \x0d as the line delimiter fails with "DATA_READ 
ERROR: Column exceeds maximum length of 1024" under the default configuration.

The \x0d character is not treated as a line break, so the entire file is read 
in as a single record. This is configurable as "delimiter" in the format, but 
that is problematic when CSV files have mixed line endings. I have files with 
both \x0d and \x0d\x0a (\r\n) line endings and need to be able to read both 
without having to change the configuration between queries.
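
The failure mode can be illustrated outside Drill. The following is a minimal 
Python sketch, not Drill's reader: the function names and sample data are 
hypothetical, and it only assumes a reader that splits on one fixed line 
delimiter (\x0a here) versus one that accepts \x0d\x0a, \x0d, or \x0a.

```python
import re


def split_fixed(text, line_delimiter="\n"):
    """Split on a single configured delimiter (hypothetical stand-in for a
    reader with one fixed line delimiter)."""
    return text.split(line_delimiter)


def split_any(text):
    """Split on \\r\\n, \\r, or \\n, trying the two-character form first."""
    return re.split(r"\r\n|\r|\n", text)


# Hypothetical sample data: the same three records with two line endings.
cr_only = "a,b\r1,2\r3,4"      # \x0d line endings
crlf = "a,b\r\n1,2\r\n3,4"     # \x0d\x0a line endings

# With a fixed \n delimiter the \x0d file collapses into one long record.
print(len(split_fixed(cr_only)))   # 1
# Accepting any of the three endings recovers the records in both files.
print(len(split_any(cr_only)))     # 3
print(len(split_any(crlf)))        # 3
```

With a fixed delimiter, the \x0d file yields one record whose length grows 
with the file, which is consistent with a column-length limit being hit.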

  was:
Querying CSV files with linux new line delimiters results in "DATA_READ ERROR: 
Column exceeds maximum length of 1024".

The \x0d new line isn't used to break lines resulting in the entire file being 
read in as a single record. This is configurable as "delimiter" in the format 
but if you have mixed csv files with different line endings it's problematic. 
If I have files with both \x0d and \x0d\x0a new lines (\r\n) and need to be 
able to read both without having to change the configuration between queries.


> CSV files can't mix line breaks \x0d Vs. \x0d\x0a
> -------------------------------------------------
>
>                 Key: DRILL-7869
>                 URL: https://issues.apache.org/jira/browse/DRILL-7869
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.19.0
>            Reporter: Curtis Lambert
>            Priority: Minor
>
> Querying CSV files that use \x0d as the line delimiter fails with "DATA_READ 
> ERROR: Column exceeds maximum length of 1024" under the default 
> configuration.
> The \x0d character is not treated as a line break, so the entire file is 
> read in as a single record. This is configurable as "delimiter" in the 
> format, but that is problematic when CSV files have mixed line endings. I 
> have files with both \x0d and \x0d\x0a (\r\n) line endings and need to be 
> able to read both without having to change the configuration between 
> queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)