The ESCAPED BY clause does not seem to pick up newlines in columns and the line
terminator cannot be changed
-----------------------------------------------------------------------------------------------------------
Key: HIVE-1898
URL: https://issues.apache.org/jira/browse/HIVE-1898
Project: Hive
Issue Type: Bug
Components: Serializers/Deserializers
Affects Versions: 0.5.0
Reporter: Josh Patterson
Priority: Minor
If I want to preserve column data that contains newlines (from web crawling, for
instance), I cannot use the ESCAPED BY clause to escape them (other characters,
such as commas, escape fine). This may be because the line terminator, which is
locked to newline, is detected first, and only then are the fields of each line
processed.
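The suspected ordering can be illustrated with a small Python sketch. This is a
simplified model and not Hive or Hadoop code: the names split_fields and
read_rows are hypothetical, and the comma delimiter and backslash escape are
assumptions chosen for the example. It only shows why an escaped newline cannot
survive if records are cut on raw newlines before the escape character is
consulted.

```python
def split_fields(line, delim=",", escape="\\"):
    """Split one already-isolated line into fields, honoring the escape character."""
    fields, current, i = [], [], 0
    while i < len(line):
        ch = line[i]
        if ch == escape and i + 1 < len(line):
            # Escaped character: keep the next character literally.
            current.append(line[i + 1])
            i += 2
        elif ch == delim:
            # Unescaped delimiter: close the current field.
            fields.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    fields.append("".join(current))
    return fields


def read_rows(data):
    """Model of the suspected order: cut records on raw newlines first,
    with no knowledge of the escape character, then split each record
    into fields."""
    return [split_fields(line) for line in data.split("\n")]


# Two intended rows; the second has an escaped newline inside its last column.
raw = "1,hello\\, world\n2,first line\\\nsecond line"
rows = read_rows(raw)

for row in rows:
    print(row)
```

In this model the escaped comma is preserved inside its column, but the escaped
newline still splits the second row in two, so three rows come back instead of
the intended two.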
This seems to be related to:
"SerDe should escape some special characters"
https://issues.apache.org/jira/browse/HIVE-136
and
"Implement "LINES TERMINATED BY""
https://issues.apache.org/jira/browse/HIVE-302
where the comment at
https://issues.apache.org/jira/browse/HIVE-302?focusedCommentId=12793435&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12793435
states:
"This is not fixable currently because the line terminator is determined by
LineRecordReader.LineReader which is in the Hadoop land."