Hi,
Thank you all for your help. I'll try both ways and I'll get back to you.

On Fri, Sep 7, 2012 at 11:02 AM, Mohammad Tariq <donta...@gmail.com> wrote:

> I said this assuming that a Hadoop cluster is available since Sandeep is
> planning to use Hive. If that is the case then MapReduce would be faster
> for such large files.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Fri, Sep 7, 2012 at 8:27 PM, Connell, Chuck
> <chuck.conn...@nuance.com> wrote:
>
>>  I cannot promise which is faster. A lot depends on how clever your
>> scripts are.
>>
>> *From:* Sandeep Reddy P [mailto:sandeepreddy.3...@gmail.com]
>> *Sent:* Friday, September 07, 2012 10:42 AM
>> *To:* user@hive.apache.org
>> *Subject:* Re: How to load csv data into HIVE
>>
>> Hi,
>> I wrote a shell script to clean the csv data, but when I run it on a
>> 12GB csv it takes a long time. If I run a Python script, will that be
>> faster?
>>
>> On Fri, Sep 7, 2012 at 10:39 AM, Connell, Chuck <chuck.conn...@nuance.com>
>> wrote:
>>
>> How about a Python script that changes it into plain tab-separated text?
>> So it would look like this…
>>
>> 174969274<tab>14-mar-2006<tab>3522876<tab>
>> <tab>14-mar-2006<tab>500000308<tab>65<tab>1<newline>
>> etc…
>>
>> Tab-separated with newlines is easy to read and works perfectly on import.
>>
>> Chuck Connell
>> Nuance R&D Data Team
>> Burlington, MA
>> 781-565-4611
>>
>> *From:* Sandeep Reddy P [mailto:sandeepreddy.3...@gmail.com]
>> *Subject:* How to load csv data into HIVE
>>
>> Hi,
>> Here is the sample data
>> "174969274","14-mar-2006","3522876","","14-mar-2006","500000308","65","1"|
>> "174969275","19-jul-2006","3523154","","19-jul-2006","500000308","65","1"|
>> "174969276","31-dec-2005","3530333","","31-dec-2005","500000308","65","1"|
>> "174969277","14-apr-2005","3531470","","14-apr-2005","500000308","65","1"|
>>
>> How to load this kind of data into HIVE?
>> I'm using a shell script to get rid of the double quotes and the
>> trailing '|', but it takes a very long time on each csv, and the files
>> are 12GB each. What is the best way to do this?
>>
>> --
>> Thanks,
>> sandeep
>>
>
>
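For what it's worth, the conversion Chuck describes could be sketched in
Python along these lines. This is a rough sketch, not tested against the real
files; the file-path arguments and the assumption that every record is a
single quoted, pipe-terminated line come from the sample data above:

```python
# Rough sketch: convert quoted, pipe-terminated records (as in the sample
# above) into plain tab-separated text. Paths and record layout are
# assumptions based on the posted sample.
import csv

def convert_line(line):
    """Turn one quoted, '|'-terminated record into a tab-separated row."""
    line = line.rstrip("\r\n")
    if line.endswith("|"):
        line = line[:-1]               # drop the trailing record terminator
    fields = next(csv.reader([line]))  # csv module handles the double quotes
    return "\t".join(fields)

def convert_file(src_path, dst_path):
    # Stream line by line so a 12GB file never has to fit in memory.
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(convert_line(line) + "\n")
```

Since the csv module's parser is implemented in C, this should be noticeably
faster than a shell pipeline that post-processes each file, though MapReduce,
as Mohammad suggests, may still win at this scale if a cluster is available.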


-- 
Thanks,
sandeep
