Hi,

We've run into this issue as well, and it is indeed annoying. As I recall, the problem arises not when the records are read off disk but when Hive handles the records further down the line (I forget exactly where).

I believe this issue is relevant: https://issues.apache.org/jira/browse/HIVE-1898 . If you can't preprocess the input to clean it up, the suggestion there of using

regexp_replace(<my_column>, "\n", "")

might be useful.

Our (rather clunky) workaround was to do the replacement in our SerDe (we were already using a custom SerDe, so this wasn't a huge burden for us).
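
A minimal sketch of that kind of replacement step in Java, for illustration only. The class and method names (NewlineScrubber, scrub) are hypothetical and not part of Hive, and this is not the SerDe code referred to above. It assumes you apply it to each string value after the record has been deserialized, and it mirrors the regexp_replace call by dropping "\n" entirely.

```java
// Hypothetical helper: strips newline characters from a deserialized string value.
// Assumption: called from a custom SerDe's deserialize path, once per string field.
final class NewlineScrubber {
    private NewlineScrubber() {
        // Utility class; not meant to be instantiated.
    }

    // Returns the value with every "\n" removed; null is passed through unchanged.
    static String scrub(String rawValue) {
        if (rawValue == null) {
            return null;
        }
        // String.replace does a literal (non-regex) replacement of all occurrences.
        return rawValue.replace("\n", "");
    }
}
```

If losing the line break would run two words together, replacing with a single space instead of the empty string may be the better choice.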

What does your CREATE TABLE statement look like?

Andrew

On 5/21/14, 3:48 PM, hbaseuser hbaseuser wrote:
Hi,

I'm trying to process JSON data in Hive (0.12) with "\n" inside some of the keys & values. The data is messed up and I have no control over changing the input.

What is the best way to process this data in hdfs?

Thanks!
