[
https://issues.apache.org/jira/browse/ORC-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919773#comment-16919773
]
Yukihiro Okada commented on ORC-11:
-----------------------------------
I assume this issue is already fixed, at least in orc-tools.
{code:java}
% uname -s
Darwin
% orc-tools version
ORC 1.5.6
% cat t/orc11.json
{"name":"foobar","time":"2019-10-28 07:34:07"}
{"name":"barbaz","time":"2019-10-29 07:20:57"}
# confirm that the file contains control characters
% cat -v t/orc11.json
{"name":"foo^Ebar","time":"2019-10-28 07:34:07"}
{"name":"bar^Abaz","time":"2019-10-29 07:20:57"}
# create an ORC file from the file above
% if [ -f output.orc ]; then rm -f output.orc ; fi && orc-tools convert --schema "struct<name:string,time:timestamp>" t/orc11.json
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Processing t/orc11.json
# dump output.orc
% orc-tools data output.orc
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Processing data file output.orc [length: 395]
{"name":"foo\u0005bar","time":"2019-10-28 07:34:07.0"}
{"name":"bar\u0001baz","time":"2019-10-29 07:20:57.0"}
{code}
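For reference, the quoting visible in the dump above (a raw control character printed as a \uXXXX escape) can be sketched roughly as follows. This is only an illustration in Python, not the ORC implementation: the helper name quote_control_chars is hypothetical, and it assumes "control character" means U+0000 through U+001F. It does not handle quotes or backslashes, which a real JSON printer must also escape.

```python
def quote_control_chars(text: str) -> str:
    """Replace each control character (U+0000..U+001F) with a \\uXXXX escape.

    Hypothetical helper for illustration only; not taken from the ORC code base.
    """
    out = []
    for ch in text:
        if ord(ch) < 0x20:
            # Emit a six-character escape such as \u0005 instead of the raw byte.
            out.append("\\u{:04x}".format(ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


# The two "name" values from t/orc11.json, with their embedded control characters.
print(quote_control_chars("foo\x05bar"))
print(quote_control_chars("bar\x01baz"))
```

With the two names from the test file, this yields the same escaped strings that orc-tools data printed.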
> Quote control characters in ColumnPrinter
> -----------------------------------------
>
> Key: ORC-11
> URL: https://issues.apache.org/jira/browse/ORC-11
> Project: ORC
> Issue Type: Bug
> Components: C++, tools
> Reporter: Owen O'Malley
> Priority: Major
>
> ColumnPrinter should quote all control characters in string literals.
--
This message was sent by Atlassian Jira
(v8.3.2#803003)