Hi Richard,

What Bejoy said is correct. However, another way to get around it would be to pre-process your data so that everything between <doc> and </doc> sits on a single line, with no newlines inside a record. Then you should be able to treat each record as one string and parse it out relatively easily.
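In case it helps, here is a rough sketch of that pre-processing step in Python. It is only an illustration under a few assumptions: that the \0x01 in your sample stands for a Ctrl-A byte at the end of each line, that <doc> and </doc> each sit on their own line, and that the fields themselves contain no newlines. The name flatten_docs is made up for this example.

```python
FIELD_SEP = "\x01"  # assumption: the \0x01 in the sample data is a Ctrl-A byte


def flatten_docs(lines):
    """Yield one single-line string per <doc>...</doc> block.

    The field lines of each block are joined with FIELD_SEP, so the
    output contains no newlines inside a record.
    """
    fields = None  # None means we are currently outside a <doc> block
    for raw in lines:
        # Drop the line ending, the trailing Ctrl-A, and stray whitespace.
        line = raw.rstrip("\r\n").rstrip(FIELD_SEP).strip()
        if line == "<doc>":
            fields = []
        elif line == "</doc>":
            if fields is not None:
                yield FIELD_SEP.join(fields)
            fields = None
        elif fields is not None and line:
            fields.append(line)


# Small in-memory demo; a real run would iterate over an open file instead.
sample = [
    "<doc>\x01\n",
    "field1=val1\x01\n",
    "field2=val2\x01\n",
    "</doc>\x01\n",
]
for record in flatten_docs(sample):
    print(repr(record))
```

Each output line is then one record. One option would be to load it into a table with a single string column and split it in your queries. Another, since Ctrl-A is also Hive's default field delimiter, would be to let Hive split the key=value pairs into columns, provided every record has the same fields in the same order.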
Mark

----- Original Message -----
From: "Bejoy Ks" <[email protected]>
To: [email protected]
Sent: Monday, May 21, 2012 7:22:58 AM
Subject: Re: user define data format

Hi Richard

In hive the default record delimiter is the next line character. In your sample data set, a single row/record is spread across multiple lines. AFAIK The only possible option here is to write a custom serde for your data.

Regards
Bejoy KS

From: Richard <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Monday, May 21, 2012 3:14 PM
Subject: user define data format

Hi,

I want to use Hive on some data in the following format:

<doc>\0x01
field1=val1\0x01
field2=val2\0x01
...
</doc>\0x01

the lines between <doc> and </doc> are a record. How should I define the table?

thanks.
Richard
