[
https://issues.apache.org/jira/browse/ORC-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Varun Raval updated ORC-1031:
-----------------------------
Issue Type: Wish (was: Bug)
> No way to escape delimiter in column values
> -------------------------------------------
>
> Key: ORC-1031
> URL: https://issues.apache.org/jira/browse/ORC-1031
> Project: ORC
> Issue Type: Wish
> Components: C++
> Reporter: Varun Raval
> Priority: Major
>
> I am using the C++ CSV-to-ORC tool to convert a CSV file to an ORC file, and I
> could not find a way to escape delimiters that appear inside column values.
> If a delimiter occurs as part of a column value in the CSV file, the tool
> treats that character as a column separator, which corrupts the data in the
> ORC file.
>
> In my scenario, every character that could serve as the delimiter can also
> occur in one of the columns of the CSV file.
> To give more detail about my use case: I have a Hive table with a
> binary column, and a CSV file in which that column holds binary data. I am
> converting the CSV file to an ORC file with this tool. There are no
> restrictions on what data the binary column can contain, so whichever
> delimiter we choose for the conversion can end up inside that column.
> A sample value of the binary column is shown below:
> {code:java}
> 9Tl���������������~sjc_\[[\^`a`]WPF:."�������������������+Gaw���������������xnf`][Z[\_`a_[TK@4
> {code}
>
> If there is a way to escape the delimiter characters in the column values,
> that would be really useful!
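
As an illustration of the behaviour requested above, here is a minimal sketch of escape-aware field splitting. It assumes a backslash-style escape character, and the function name `splitEscaped` and its parameters are hypothetical; nothing here is taken from the ORC tools, which may well choose a different scheme (for example RFC 4180 quoting).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the ORC tools: splits one CSV line on
// `delim`, treating `esc` followed by any character as that literal character.
std::vector<std::string> splitEscaped(const std::string& line,
                                      char delim = ',',
                                      char esc = '\\') {
  std::vector<std::string> fields;
  std::string current;
  for (std::size_t i = 0; i < line.size(); ++i) {
    const char c = line[i];
    if (c == esc && i + 1 < line.size()) {
      current.push_back(line[++i]);  // keep the escaped character verbatim
    } else if (c == delim) {
      fields.push_back(current);     // an unescaped delimiter ends the field
      current.clear();
    } else {
      current.push_back(c);          // ordinary character (or a trailing esc)
    }
  }
  fields.push_back(current);  // last field, possibly empty
  return fields;
}
```

With this scheme the producer of the CSV file would have to prefix every delimiter and every escape character inside a value with the escape character. Raw newlines in binary data would likely still need separate handling, for instance by hex- or Base64-encoding the column before writing the CSV.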
--
This message was sent by Atlassian Jira
(v8.20.1#820001)