Hi Keshav,

There seems to be a problem with bulk load when importing data from a CSV file. I ran into the same problem yesterday and posted about it on the mailing list. I got pulled into another task at work, so I have not been able to devote much time to it. I have identified the problem but still need to work out the fix. I will post the solution once I have it.

Best Regards,
Anil

On Mar 6, 2012, at 6:37 AM, <[email protected]> wrote:

> Did you try to add a comma at the end of the line? Just to see how it will do?
>
> On Mar 6, 2012, at 5:02 AM, ext Savant, Keshav wrote:
>
>> Hi,
>>
>> I tried bulk uploading and it ran well with TSV files. We first ran
>> importtsv and then completebulkload; after these two steps I can scan
>> my HBase table and see the data. I can also see the data when I traverse
>> HDFS of my Hadoop cluster using a web browser.
>>
>> But when I try to upload my CSVs in a folder, I get bad lines for all the
>> lines of my CSV files. I use the following command to upload my CSVs on my
>> local file system to HDFS:
>>
>> HADOOP_CLASSPATH=`hbase classpath` $HADOOP_HOME/bin/hadoop jar
>> /hbase_home/hbase-0.92.0/hbase-0.92.0.jar importtsv
>> -Dimporttsv.bulk.output=/my_output_dir
>> -Dimporttsv.columns=HBASE_ROW_KEY,SerialNumber,Column1,Column2 my_table
>> file:/my_csv/data.txt '-Dimporttsv.separator=,'
>>
>> My CSV file has the following format:
>>
>> 1,data11,data12
>> 2,data21,data22
>> 3,data31,data32
>> .....
>> .....
>>
>> And my HBase table has 3 columns.
>>
>> Please let me know what the exact problem is and how it can be resolved.
>>
>> Kind regards,
>> Keshav
>>
>> -----Original Message-----
>> From: Savant, Keshav
>> Sent: Friday, March 02, 2012 7:02 PM
>> To: [email protected]
>> Cc: '[email protected]'
>> Subject: RE: Inserting Data from CSV into HBase
>>
>> Hi Harsh,
>>
>> Thanks for your response. I don't get any error using the code mentioned in
>> that URL. I will get back to you after analyzing the tools you suggested.
>> Thanks again.
>>
>> Kind regards,
>> Keshav C Savant
>>
>> -----Original Message-----
>> From: Harsh J [mailto:[email protected]]
>> Sent: Friday, March 02, 2012 6:51 PM
>> To: [email protected]
>> Subject: Re: Inserting Data from CSV into HBase
>>
>> Hi,
>>
>> You may use the importtsv tool and the bulk-load utilities in HBase to
>> achieve this fast-and-easy.
>>
>> This is detailed at http://hbase.apache.org/bulk-loads.html (see the section
>> about importtsv near the bottom) and also under the section "Using the
>> importtsv tool" on page 460 of Lars George's "HBase: The Definitive Guide"
>> (O'Reilly).
>>
>> Also, when you say something didn't work, please supply any errors you
>> encountered and the configuration you used. It's hard to help without those.
>>
>> On Fri, Mar 2, 2012 at 6:24 PM, Savant, Keshav
>> <[email protected]> wrote:
>>> Hi All,
>>>
>>> I am looking for a way to map my existing CSV file to an HBase
>>> table; basically, for each column family I want only one value (just like
>>> an RDBMS).
>>>
>>> To illustrate, suppose I define an HBase table as
>>>
>>> create 'inventory', 'item', 'supplier', 'quantity'
>>> (here the table name is inventory and it has three column families named
>>> item, supplier and quantity)
>>>
>>> Now I want to load my N CSVs in the following format into this
>>> HBase table:
>>>
>>> Burger,abc confectionary,100
>>> Pizza,xyz bakers,50
>>> ...
>>> ...
>>> ...
>>>
>>> Here I want to put the CSV data into my inventory table on HBase. The
>>> number of lines in a CSV and even the number of CSVs are dynamic, and this
>>> will be a continuous process.
>>>
>>> What I want to know is whether there is any way to achieve the above
>>> goal. I tried SampleUploader as specified on
>>> http://svn.apache.org/repos/asf/hbase/trunk/src/examples/mapreduce/org/apache/hadoop/hbase/mapreduce/SampleUploader.java,
>>> but it did not work: the data did not get populated in the HBase table
>>> even though the program ran successfully.
>>>
>>> Please advise; any help is appreciated.
>>>
>>> Kind regards,
>>> Keshav C Savant
>>>
>>> _____________
>>> The information contained in this message is proprietary and/or
>>> confidential.
>>> If you are not the intended recipient, please: (i) delete the
>>> message and all copies; (ii) do not disclose, distribute or use the message
>>> in any manner; and (iii) notify the sender immediately. In addition, please
>>> be aware that any message addressed to our domain is subject to archiving
>>> and review by persons other than the intended recipient. Thank you.
>>
>> --
>> Harsh J
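
For anyone following the thread, here is a small Python sketch of how a delimiter-based line check could end up rejecting every line of the CSV quoted above. It is only an illustrative model, not ImportTsv's code. The function name `classify_line` and its two rejection rules (no separator found in the line, or more fields than declared columns) are assumptions made for this example. The idea that the lines were being split on the tool's default tab separator rather than on the requested comma is also only a guess; the thread does not confirm the cause.

```python
def classify_line(line, columns, separator="\t"):
    """Return "ok" or the reason a line would be counted as bad.

    Hypothetical model for illustration only; the rules here are
    assumptions, not the behavior of any specific HBase release.
    """
    fields = line.rstrip("\n").split(separator)
    if len(fields) == 1:
        # The separator never occurs, so the line cannot be split at all.
        return "bad: no delimiter"
    if len(fields) > len(columns):
        # More values on the line than columns declared for the import.
        return "bad: excessive columns"
    return "ok"


# Column spec and sample rows copied from the command and data in the thread.
columns = "HBASE_ROW_KEY,SerialNumber,Column1,Column2".split(",")
sample = ["1,data11,data12", "2,data21,data22", "3,data31,data32"]

# Split on tab (the assumed fallback) versus on comma (the requested separator).
with_tab = [classify_line(row, columns) for row in sample]
with_comma = [classify_line(row, columns, separator=",") for row in sample]

print(with_tab)
print(with_comma)
```

Under this model, every sample row is rejected when split on a tab and accepted when split on a comma, which would match the "bad lines for all the lines" symptom if the separator setting were not taking effect.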
