[ https://issues.apache.org/jira/browse/ORC-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Varun Raval updated ORC-1031:
-----------------------------
    Issue Type: Wish  (was: Bug)

> No way to escape delimiter in column values
> -------------------------------------------
>
>                 Key: ORC-1031
>                 URL: https://issues.apache.org/jira/browse/ORC-1031
>             Project: ORC
>          Issue Type: Wish
>          Components: C++
>            Reporter: Varun Raval
>            Priority: Major
>
> I am using the C++ CSV-to-ORC tool to convert a CSV file to an ORC file, and 
> I could not find a way to escape delimiters that appear inside column values. 
> If a delimiter occurs within a column value in the CSV file, the tool treats 
> that character as a column separator, which corrupts the data in the ORC 
> file.
>  
> In my scenario, every character that could serve as the delimiter can also 
> appear in one of the columns of the CSV file.
> To give more detail on my use case: I have a Hive table with a binary column, 
> and a CSV file in which that column holds binary data. I am converting the 
> CSV file to an ORC file using this tool. There are no restrictions on the 
> data the binary column can contain, so whichever delimiter we choose for the 
> conversion can end up inside that column.
> A sample value of the binary column is shown below:
> {code:java}
> 9Tl���������������~sjc_\[[\^`a`]WPF:."�������������������+Gaw���������������xnf`][Z[\_`a_[TK@4
> {code}
>  
> A way to escape delimiter characters inside column values would be really 
> useful!
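
To illustrate the kind of escaping requested above, here is a minimal sketch of a quote-aware field splitter in the style of RFC 4180. It is not taken from the ORC tools: the function name splitCsvLine and the choice of double quotes as the escape mechanism are assumptions made for this example only. A delimiter inside a double-quoted field is kept as data, and a doubled quote inside a quoted field stands for one literal quote.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of the ORC tools: splits one CSV record into
// fields. Delimiters inside double-quoted fields are treated as data, and ""
// inside a quoted field is read as a single literal quote character.
std::vector<std::string> splitCsvLine(const std::string& line, char delim) {
  std::vector<std::string> fields;
  std::string current;
  bool inQuotes = false;

  for (std::size_t i = 0; i < line.size(); ++i) {
    const char c = line[i];
    if (inQuotes) {
      if (c == '"') {
        if (i + 1 < line.size() && line[i + 1] == '"') {
          current.push_back('"');  // escaped quote: "" -> "
          ++i;                     // skip the second quote of the pair
        } else {
          inQuotes = false;        // closing quote
        }
      } else {
        current.push_back(c);      // includes delimiters inside quotes
      }
    } else if (c == '"') {
      inQuotes = true;             // opening quote (lenient: allowed anywhere)
    } else if (c == delim) {
      fields.push_back(current);   // unquoted delimiter ends the field
      current.clear();
    } else {
      current.push_back(c);
    }
  }
  fields.push_back(current);       // last field has no trailing delimiter
  return fields;
}
```

With this sketch, the record `1,"a,b",3` split on a comma yields the three fields `1`, `a,b`, and `3`. Quoting of this kind only covers the delimiter itself. For a binary column like the one described here, which may also contain newlines or quote characters, encoding the value as text (for example hex or Base64) before writing the CSV may be a more robust alternative.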



