Plain old tab-separated text is very easy to import to Hive. So I would 
pre-process the CSV you have now to change it to tab-separated (with no quotes).

Chuck Connell
Nuance R&D Data Team
Burlington, MA

From: Mark Grover [mailto:[email protected]]
Sent: Wednesday, November 28, 2012 11:17 AM
To: Ram Krishnamurthy
Cc: [email protected]
Subject: Re: Hive - Load file with delimiters

Moving to [email protected]<mailto:[email protected]> since this is an 
Apache Hive related question

Hi Ram,
Hive doesn't provide a way to do this out of the box. I have seen other people 
have the same request before (and more, e.g. use of escape characters to embed 
the enclosing characters, etc.) and have just created a Hive JIRA to discuss 
and implement this. https://issues.apache.org/jira/browse/HIVE-3751

However, there are several ways I can think of to work around this for now. 
They are:
1. Pre-process the files to get rid of the enclosing characters ("). This may 
or may not be scalable depending on your implementation, would be fairly simple 
to implement.
2. Create a staging Hive table with enclosing characters as a part of the data. 
Then, you could do one of 3 things with this staging table.
 2a. Create another Hive table that gets populated by a Hive query on the 
staging table. The query basically strips out the enclosing characters.
 2b. Create a view on the existing Hive staging table that presents the data 
after removing the enclosing characters.
 2c. Use the existing RegexSerDe when creating the Hive staging table to get 
rid of the enclosing characters when serializing/deserializing the data from 
the staging table.
3. Create a custom informat or SerDe that supports enclosing characters in 
delimited fields.

#3 above will be an answer to the JIRA I just created. So, the community would 
be grateful to you if you go that route and contribute it back. Of course, feel 
free to choose another option if that works better for you.

For your reference, there is a similar thread that I had replied to at: 
http://mail-archives.apache.org/mod_mbox/hive-user/201204.mbox/%3CCAENxBwwrZrqBSJXtJHpqc_FfcZvwRMoaT9W7dR=jgtyjoxp...@mail.gmail.com%3E

Good luck!
Mark

On Wed, Nov 28, 2012 at 7:25 AM, Ram Krishnamurthy 
<[email protected]<mailto:[email protected]>>
 wrote:
I am having issues loading a delimited csv. here is a sample data
"1", "123456", " ", "Charlotte, NC"

The delimiter is ,
and fields are enclosed by " "

Does anyone have syntax for loading this in Hive?

Thanks, Ram



Reply via email to