[jira] [Commented] (DRILL-7020) big varchar doesn't work with extractHeader=true

Paul Rogers (JIRA) Thu, 18 Apr 2019 17:07:59 -0700


    [ 
https://issues.apache.org/jira/browse/DRILL-7020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821571#comment-16821571
 ]


Paul Rogers commented on DRILL-7020:
------------------------------------

The "write" part of the message means "write to the value vector", so read this 
as "tried to write too large a value to the column's value vector."

The request here seems to be to modify the "compliant" text reader to avoid the 
use of the fixed-size buffer for column values in order to allow column values 
larger than 64K.

The simple way to do this is to use a Java string for the value, or to 
reallocate the column buffer as needed for ever larger values.
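
As a rough illustration of the reallocation option, here is a minimal sketch of a field buffer that doubles its capacity when full. The class name GrowableFieldBuffer and the 64K starting size are assumptions made for this example; this is not the compliant reader's actual buffer code.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical per-column field buffer; a stand-in, not Drill's actual class. */
class GrowableFieldBuffer {
    // Start at 64K (illustrative, matching the limit in the error message).
    private byte[] data = new byte[64 * 1024];
    private int length = 0;

    /** Adds one byte, doubling the backing array when it is full. */
    void append(byte b) {
        if (length == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[length++] = b;
    }

    int length() { return length; }

    int capacity() { return data.length; }

    /** Decodes the bytes collected so far as UTF-8. */
    String asString() {
        return new String(data, 0, length, StandardCharsets.UTF_8);
    }

    /** Clears the buffer for the next field; the grown capacity is kept. */
    void reset() { length = 0; }
}
```

The cost of this approach is that the whole value still sits in one contiguous intermediate buffer before it is copied to the vector.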

A more elegant way, now that the compliant reader is on the row set framework, 
is to implement an "append" operation which will take a buffer and append it to 
the value (if any) already in the column. This will allow reading large values 
without having to allocate large intermediate buffers.
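
To make the "append" idea concrete, here is a minimal sketch under stated assumptions. AppendingColumnWriter, appendBytes, and ChunkedFieldReader are hypothetical names invented for this example, and a ByteArrayOutputStream stands in for the column's value vector; none of this is the row set framework's real API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical column writer; the stream stands in for a value vector. */
class AppendingColumnWriter {
    private final ByteArrayOutputStream value = new ByteArrayOutputStream();

    /** Appends the first len bytes of chunk to the value already in the column. */
    void appendBytes(byte[] chunk, int len) {
        value.write(chunk, 0, len);
    }

    int size() { return value.size(); }

    String asString() {
        return new String(value.toByteArray(), StandardCharsets.UTF_8);
    }
}

/** Hypothetical reader that never holds more than one small chunk of a field. */
class ChunkedFieldReader {
    static final int CHUNK_SIZE = 4096; // illustrative; far below the 64K limit

    static void readField(byte[] input, AppendingColumnWriter writer) {
        byte[] chunk = new byte[CHUNK_SIZE];
        int used = 0;
        for (byte b : input) {
            chunk[used++] = b;
            if (used == CHUNK_SIZE) {
                // Chunk is full: hand it to the column and reuse the buffer.
                writer.appendBytes(chunk, used);
                used = 0;
            }
        }
        if (used > 0) {
            // Flush the final partial chunk.
            writer.appendBytes(chunk, used);
        }
    }
}
```

In this sketch the intermediate buffer stays at a fixed 4K regardless of how long the field is; only the destination grows.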

> big varchar doesn't work with extractHeader=true
> ------------------------------------------------
>
>                 Key: DRILL-7020
>                 URL: https://issues.apache.org/jira/browse/DRILL-7020
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Text & CSV
>    Affects Versions: 1.15.0
>            Reporter: benj
>            Priority: Major
>
> With a CSV file named TEST like
> {code:java}
> col1,col2
> w,x
> ...y...,z
> {code}
> where ...y... is a string of more than 65536 characters (say, 66000),
> a SELECT with +*extractHeader=false*+ is OK
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => false));
>     col1  | col2
> +---------+------
> | w       | x
> | ...y... | z
> {code}
> But SELECT with +*extractHeader=true*+ gives an error
> {code:java}
> SELECT * FROM TABLE(tmp.`TEST`(type => 'text', fieldDelimiter => ',', 
> extractHeader => true));
> Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a column
> columnIndex 1
> Limit 65536
> Fragment 0:0
> {code}
> Note that it is possible to use extractHeader=false with skipFirstLine=true, but 
> in this case it's not possible to automatically get the column names.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
