[ 
https://issues.apache.org/jira/browse/HIVE-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14963578#comment-14963578
 ] 

Aihua Xu commented on HIVE-1898:
--------------------------------

HIVE-11785 added support for escaping newline and carriage return characters in 
LazySimpleSerDe, which should fix this issue. Intermediate results written with 
LazySimpleSerDe will have newlines and carriage returns escaped, so 
LineRecordReader can later split the data into records line by line correctly. 
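
To illustrate the idea, here is a minimal Python sketch. It is not the actual 
LazySimpleSerDe code: the function names, the backslash escape character and the 
tab delimiter are all assumptions made for the example, and fields are assumed 
not to contain the delimiter itself. It only shows why escaping newline and 
carriage return before writing lets a line-based reader recover each record.

```python
# Illustrative sketch only; names and defaults are hypothetical, not Hive's API.
ESCAPE = "\\"  # assumed escape character


def escape_field(value: str) -> str:
    """Escape the escape character first, then newline and carriage return."""
    return (
        value.replace(ESCAPE, ESCAPE + ESCAPE)
        .replace("\n", ESCAPE + "n")
        .replace("\r", ESCAPE + "r")
    )


def unescape_field(value: str) -> str:
    """Reverse escape_field by scanning for escape sequences left to right."""
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == ESCAPE and i + 1 < len(value):
            nxt = value[i + 1]
            # "\n" and "\r" map back to control characters; any other
            # escaped character (such as the escape itself) is kept as-is.
            out.append({"n": "\n", "r": "\r"}.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def write_rows(rows, delimiter="\t"):
    """Serialize rows as one physical line per record."""
    return "".join(
        delimiter.join(escape_field(f) for f in row) + "\n" for row in rows
    )


def read_rows(text, delimiter="\t"):
    """Split on newlines first (as a line reader would), then unescape fields."""
    lines = text.split("\n")[:-1]  # drop the empty piece after the final newline
    return [[unescape_field(f) for f in line.split(delimiter)] for line in lines]


rows = [["page1", "line one\nline two\r\n"], ["page2", "back\\slash"]]
encoded = write_rows(rows)
decoded = read_rows(encoded)
```

Without the escaping step, the embedded newline in the first row would be seen 
as a record boundary by the line reader before any field processing happens. 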

> The ESCAPED BY clause does not seem to pick up newlines in columns and the 
> line terminator cannot be changed
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-1898
>                 URL: https://issues.apache.org/jira/browse/HIVE-1898
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.5.0
>            Reporter: Josh Patterson
>            Priority: Minor
>
> If I want to preserve data in columns that contain newlines (web crawling, 
> for instance), I cannot use the ESCAPED BY clause to escape them (other 
> characters, such as commas, escape fine, however). This may be because the 
> line terminators, which are locked to newlines, are picked up first, and only 
> then are the fields processed. 
> This seems to be related to:
> "SerDe should escape some special characters"
> https://issues.apache.org/jira/browse/HIVE-136
> and
> "Implement "LINES TERMINATED BY""
> https://issues.apache.org/jira/browse/HIVE-302
> where at comment: 
> https://issues.apache.org/jira/browse/HIVE-302?focusedCommentId=12793435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793435
> "This is not fixable currently because the line terminator is determined by 
> LineRecordReader.LineReader which is in the Hadoop land."



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)