Hi Keshav,

There does seem to be a problem with bulk load when importing data from a CSV 
file. I ran into the same problem yesterday and posted about it on the mailing 
list. I got pulled into another task at work, so I have not been able to devote 
much time to it. I have identified the problem but still need to work out the 
fix. I will post the solution once I have it.

Best Regards,
Anil

On Mar 6, 2012, at 6:37 AM, <[email protected]> wrote:

> Did you try adding a comma at the end of each line, just to see what happens?
> 
> 
> On Mar 6, 2012, at 5:02 AM, ext Savant, Keshav wrote:
> 
>> Hi,
>> 
>> I tried bulk uploading and it ran well with TSV files. We first ran 
>> importtsv and then completebulkload; after these two steps I can scan my 
>> HBase table and see the data. I can also see the data when I browse the 
>> HDFS of my Hadoop cluster in a web browser.
>> 
>> But when I try to upload the CSVs in a folder, every line of my CSV files 
>> is reported as a bad line. I use the following command to upload the CSVs 
>> from my local file system to HDFS:
>> 
>> HADOOP_CLASSPATH=`hbase classpath` $HADOOP_HOME/bin/hadoop jar 
>> /hbase_home/hbase-0.92.0/hbase-0.92.0.jar importtsv  
>> -Dimporttsv.bulk.output=/my_output_dir 
>> -Dimporttsv.columns=HBASE_ROW_KEY,SerialNumber,Column1,Column2 my_table 
>> file:/my_csv/data.txt '-Dimporttsv.separator=,'
>> 
>> My CSV file has the following format:
>> 
>> 1,data11,data12
>> 2,data21,data22
>> 3,data31,data32
>> .....
>> .....
>> 
>> My HBase table has 3 columns.
>> 
>> 
>> Please let me know what the exact problem is and how it can be resolved.
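
One local check that may help narrow this down (a sketch, not a confirmed diagnosis) is to compare the number of fields on each line with the number of names given in -Dimporttsv.columns. In the command above the spec names four columns (HBASE_ROW_KEY plus three), while the sample lines have three comma-separated fields. Whether importtsv counts that mismatch as a bad line is not established in this thread. The function name check_fields and its arguments are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the lines whose field count differs from the
# number of names in an importtsv-style column spec.
# Usage: check_fields FILE SEPARATOR COLUMN_SPEC
check_fields() {
    file=$1
    sep=$2
    spec=$3
    # Count the comma-separated names in the column spec.
    expected=$(printf '%s\n' "$spec" | awk -F',' '{ print NF }')
    # Print line number, actual field count, and expected count on mismatch.
    awk -F"$sep" -v expected="$expected" '
        NF != expected {
            printf "line %d: %d fields, spec names %d\n", NR, NF, expected
        }
    ' "$file"
}
```

Run as check_fields data.txt , 'HBASE_ROW_KEY,SerialNumber,Column1,Column2' against the sample data, this reports each line as having 3 fields against 4 names. With a three-name spec it prints nothing.
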
>> 
>> Kind regards,
>> Keshav
>> 
>> 
>> 
>> -----Original Message-----
>> From: Savant, Keshav 
>> Sent: Friday, March 02, 2012 7:02 PM
>> To: [email protected]
>> Cc: '[email protected]'
>> Subject: RE: Inserting Data from CSV into HBase
>> 
>> Hi Harsh,
>> 
>> Thanks for your response. I don't get any error using the code mentioned in 
>> that URL. I will get back to you after analyzing the tools you suggested.
>> Thanks again.
>> 
>> 
>> Kind regards,
>> Keshav C Savant 
>> 
>> -----Original Message-----
>> From: Harsh J [mailto:[email protected]]
>> Sent: Friday, March 02, 2012 6:51 PM
>> To: [email protected]
>> Subject: Re: Inserting Data from CSV into HBase
>> 
>> Hi,
>> 
>> You may use the importtsv tool and the bulk-load utilities in HBase to 
>> do this quickly and easily.
>> 
>> This is detailed at http://hbase.apache.org/bulk-loads.html (see the section 
>> on importtsv near the bottom) and also in the section "Using the 
>> importtsv tool" on page 460 of Lars George's "HBase: The Definitive Guide" 
>> (O'Reilly).
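
The two-step flow mentioned above (importtsv to generate HFiles, then completebulkload to load them) can be sketched as a dry run that only prints the commands. This is an illustration under assumptions, not a verified recipe: the jar path, table name, directories, and the cf:col1/cf:col2 column names are placeholders, and the function name print_bulk_load_steps is made up. The -D options are placed before the table name and input directory, as in the documented importtsv usage.

```shell
#!/bin/sh
# Dry-run sketch: prints the two bulk-load commands instead of running them.
# Every value below is a placeholder chosen for illustration.
HBASE_JAR=${HBASE_JAR:-/path/to/hbase.jar}
TABLE=${TABLE:-my_table}
INPUT_DIR=${INPUT_DIR:-/my_input_dir}
HFILE_DIR=${HFILE_DIR:-/my_output_dir}
COLSPEC=${COLSPEC:-HBASE_ROW_KEY,cf:col1,cf:col2}
SEP=${SEP:-,}

print_bulk_load_steps() {
    # Step 1: importtsv writes HFiles under $HFILE_DIR.
    # The -D options come before the table name and input directory.
    echo "hadoop jar $HBASE_JAR importtsv" \
         "'-Dimporttsv.separator=$SEP'" \
         "-Dimporttsv.bulk.output=$HFILE_DIR" \
         "-Dimporttsv.columns=$COLSPEC" \
         "$TABLE $INPUT_DIR"
    # Step 2: completebulkload loads the generated HFiles into the table.
    echo "hadoop jar $HBASE_JAR completebulkload $HFILE_DIR $TABLE"
}
```

Calling print_bulk_load_steps prints one line per step, so the commands can be reviewed before anything is run on a cluster.
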
>> 
>> Also, when you say something didn't work, please supply any errors you 
>> encountered and the configuration you used. It's hard to help without those.
>> 
>> On Fri, Mar 2, 2012 at 6:24 PM, Savant, Keshav 
>> <[email protected]> wrote:
>>> Hi All,
>>> 
>>> I am looking for a way to map my existing CSV file to an HBase table. 
>>> Basically, for each column family I want only one value (just like in an 
>>> RDBMS).
>>> 
>>> To illustrate, suppose I define an HBase table as
>>> 
>>> create 'inventory', 'item', 'supplier', 'quantity'
>>> (here the table name is inventory and it has three column families named 
>>> item, supplier and quantity)
>>> 
>>> Now I want to load N CSVs in the following format into this HBase 
>>> table:
>>> 
>>> Burger,abc confectionary,100
>>> Pizza,xyz bakers,50
>>> ...
>>> ...
>>> ...
>>> 
>>> I want to put the CSV data into my inventory table in HBase. The number 
>>> of lines in a CSV, and even the number of CSVs, varies, and this will be 
>>> a continuous process.
>>> 
>>> What I want to know is whether there is any way to achieve the above 
>>> goal. I tried SampleUploader as specified at 
>>> http://svn.apache.org/repos/asf/hbase/trunk/src/examples/mapreduce/org/apache/hadoop/hbase/mapreduce/SampleUploader.java,
>>> but it did not work: the data does not get populated in the HBase table 
>>> even though the program ran successfully.
>>> 
>>> Please advise; any help is appreciated.
>>> 
>>> Kind regards,
>>> Keshav C Savant
>>> 
>>> _____________
>>> The information contained in this message is proprietary and/or 
>>> confidential. If you are not the intended recipient, please: (i) delete the 
>>> message and all copies; (ii) do not disclose, distribute or use the message 
>>> in any manner; and (iii) notify the sender immediately. In addition, please 
>>> be aware that any message addressed to our domain is subject to archiving 
>>> and review by persons other than the intended recipient. Thank you.
>> 
>> 
>> 
>> --
>> Harsh J
>> 
> 
