Hi Benj,

   Testing with the latest code, this error does not show:

     0: jdbc:drill:zk=local> select columns[0], columns[1] from dfs.`/data/bar.csv`;
     +----------------------------------------------------------------------------------+---------+
     | EXPR$0                                      | EXPR$1  |
     +----------------------------------------------------------------------------------+---------+
     | hello | world   |
     | a01234567890a01234567890a01234567890a01234567890a01234567890a01234567890a01234567890a01234567890a ...............890

The error you saw is raised in the append() method of FieldVarCharOutput.java, but for some reason that code path was not taken when I ran the example above.

The printout is indeed truncated, though (at about 10,000 characters), and the columns that follow are cut off as well. The actual varchar retains its full length:

     0: jdbc:drill:zk=local> select char_length(columns[0]) from dfs.`/data/bar.csv`;
     +---------+
     | EXPR$0  |
     +---------+
     | 5       |
     | 72061   |
     +---------+

 -- Boaz

On 1/23/19 12:29 PM, benj.dev wrote:
Hi,

With a CSV file test.csv
col1,col2
w,x
...y...,z

where ...y... is a string of more than 65536 characters (let's say 66000, for example)
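
A minimal sketch for generating such a file, assuming Python 3 is available. The function name build_test_csv, the constant name, and the use of 'y' as the filler character are illustrative choices, not part of the original report; only the file layout and the 65536 limit come from this message.

```python
import csv
import io

# Limit reported in the Drill error message quoted in this thread.
DRILL_FIELD_LIMIT = 65536


def build_test_csv(long_len: int = 66000) -> str:
    """Return CSV text with a header, one short row, and one oversized field."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["col1", "col2"])
    writer.writerow(["w", "x"])
    # The long value stands in for "...y..." in the description above.
    writer.writerow(["y" * long_len, "z"])
    return buf.getvalue()


if __name__ == "__main__":
    text = build_test_csv()
    with open("test.csv", "w", newline="") as f:
        f.write(text)
    # No field contains a comma or quote, so a plain split is safe here.
    longest = max(
        len(field) for line in text.splitlines() for field in line.split(",")
    )
    print(f"longest field: {longest} chars (limit {DRILL_FIELD_LIMIT})")
```

Passing a smaller long_len (for example 65000) gives a file that stays under the limit, which may help confirm that the field size alone triggers the error.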

Error with
extract of storage : "csv": { "type": "text", "extensions": [ "csv" ],
"extractHeader": true, "delimiter": "," },
SELECT * FROM tmp.`test.csv`
Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a
column
column name columns
column index
Fragment 0:0

Same error with
SELECT col1, col2 FROM TABLE(tmp.`test.csv`(type => 'text',
fieldDelimiter => ',', extractHeader => true))
Error: UNSUPPORTED_OPERATION ERROR: Trying to write something big in a
column
columnIndex 0
Limit 65536
Fragment 0:0

But it's OK with
SELECT columns[0], columns[1] FROM TABLE(tmp.`test.csv`(type => 'text',
fieldDelimiter => ',', extractHeader => false))
     col1  | col2
+---------+------
| w       | x
| ...y... | z

Why is it not possible to work with a big varchar with extractHeader=true
even though it works with extractHeader=false?


On a related point, in the last case, when printing, ...y... is
truncated and the following fields (here only col2) are not displayed for this row
(without any marker or warning), although the real value is correct.
Maybe it's a bug that the other columns are not displayed; maybe the length of
the printed value should be controlled by an option; and a
marker should indicate when truncation occurs.

Thanks for any explanation
