Hi Richard,
What Bejoy said is correct. However, another way around it would be to 
pre-process your data so that the text between <doc> and </doc> contains no 
newlines. Then you should be able to treat each record as a single string and 
parse it out relatively easily.
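
A rough sketch of that pre-processing step in Python, in case it helps. It 
assumes the "\0x01" in the sample is the single Ctrl-A byte (written "\x01" in 
Python) and that <doc> and </doc> each sit on their own line. The name 
flatten_docs is made up for illustration and is not part of Hive.

```python
FIELD_SEP = "\x01"  # assumption: "\0x01" in the sample means the Ctrl-A byte


def flatten_docs(lines):
    """Yield one string per <doc>...</doc> block, fields joined by FIELD_SEP."""
    fields = None  # None means we are currently outside a <doc> block
    for raw in lines:
        # Drop the newline, stray spaces, and the trailing separator.
        line = raw.strip().rstrip(FIELD_SEP).strip()
        if line == "<doc>":
            fields = []
        elif line == "</doc>":
            if fields is not None:
                yield FIELD_SEP.join(fields)
            fields = None
        elif fields is not None and line:
            fields.append(line)
```

Feeding it the lines of the input file (for example, iterating over sys.stdin) 
and writing each result followed by a newline gives one record per line, which 
matches Hive's default newline record delimiter.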

Mark


----- Original Message -----
From: "Bejoy Ks" <[email protected]>
To: [email protected]
Sent: Monday, May 21, 2012 7:22:58 AM
Subject: Re: user define data format



Hi Richard 


In Hive, the default record delimiter is the newline character. In your sample 
data set, a single row/record is spread across multiple lines. AFAIK the only 
possible option here is to write a custom SerDe for your data. 


Regards 
Bejoy KS 





From: Richard <[email protected]> 
To: "[email protected]" <[email protected]> 
Sent: Monday, May 21, 2012 3:14 PM 
Subject: user define data format 



Hi, I want to use Hive on some data in the following format: 
<doc>\0x01 
field1=val1\0x01 
field2=val2\0x01 
... 
</doc>\0x01 

The lines between <doc> and </doc> form one record. How should I define the table? 

thanks. 
Richard 



