...@gmail.com]
Sent: Thursday, November 02, 2017 6:21 PM
To: user@hive.apache.org
Subject: Re: READING STRING, CONTAINS \R\N, FROM ORC FILES VIA JDBC DRIVER
PRODUCES DIRTY DATA
ORC stores the data in UTF-8 with the length of the value stored explicitly.
Therefore, it doesn't do any parsing of newlines.
> Why jdbc read them as control symbols?
Most likely this is already fixed by
https://issues.apache.org/jira/browse/HIVE-1608
That change effectively makes this the default:
set hive.query.result.fileformat=SequenceFile;
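To illustrate why the result file format matters here, below is a rough sketch in Python. It is a simplified model, not Hive's actual code path: the helper names (write_text, read_text, write_framed, read_framed) are made up for this example, and the 4-byte length prefix is only loosely analogous to how a binary container such as SequenceFile frames its records. It shows a newline-delimited result file splitting a value that contains \r\n into two rows, while a length-framed one returns it intact.

```python
import struct

# One value with an embedded \r\n, one ordinary value.
rows = ["first\r\nsecond", "plain"]

def write_text(rows):
    # Text-style result file: one row per line, no escaping (assumed model).
    return b"".join(r.encode("utf-8") + b"\n" for r in rows)

def read_text(blob):
    # A line-oriented reader treats \r, \n and \r\n as row terminators.
    return [line.decode("utf-8") for line in blob.splitlines()]

def write_framed(rows):
    # Binary-style result file: each row is prefixed with its byte length.
    out = bytearray()
    for r in rows:
        data = r.encode("utf-8")
        out += struct.pack(">I", len(data)) + data
    return bytes(out)

def read_framed(blob):
    # Read exactly as many bytes as the prefix says; newlines are just data.
    result, pos = [], 0
    while pos < len(blob):
        (n,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        result.append(blob[pos:pos + n].decode("utf-8"))
        pos += n
    return result

print(read_text(write_text(rows)))      # three rows: the first value was split
print(read_framed(write_framed(rows)))  # two rows, identical to the input
```

If the query results still look broken with the setting above, the text-based path is probably not the cause and the data itself is worth checking.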
Cheers,
Gopal
ORC stores the data in UTF-8 with the length of the value stored
explicitly. Therefore, it doesn't do any parsing of newlines.
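As a rough illustration of that point, here is a small Python sketch. It is not the real ORC encoding (which adds run-length encoding, compression, and optional dictionaries); encode_column and decode_column are hypothetical names. It only models the idea of keeping the UTF-8 bytes and the per-value lengths separately, so the reader never has to look for a delimiter.

```python
def encode_column(values):
    # Concatenate the UTF-8 bytes of every value and record each byte length.
    data = bytearray()
    lengths = []
    for v in values:
        raw = v.encode("utf-8")
        data += raw
        lengths.append(len(raw))
    return bytes(data), lengths

def decode_column(data, lengths):
    # Slice the byte stream using the recorded lengths only.
    values, pos = [], 0
    for n in lengths:
        values.append(data[pos:pos + n].decode("utf-8"))
        pos += n
    return values

values = ["a\r\nb", "héllo", ""]
data, lengths = encode_column(values)
print(lengths)                                 # byte lengths, not character counts
print(decode_column(data, lengths) == values)  # True: \r\n survives unchanged
```

Because nothing in the decoder inspects the bytes, a value containing \r\n comes back exactly as it was written.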
You can see the contents of an ORC file by using:
% hive --orcfiledump -d <path_to_file>
(from https://orc.apache.org/docs/hive-ddl.html). How did you load the data
into Hive?
...