Command line tools don't seem to be included in the 0.89.20100830 branch. In addition, it doesn't look like ImportTsv.java gets compiled into the HBase jar file.
Are there any tutorials for working with HBase source other than
http://wiki.apache.org/hadoop/Hbase/MavenPrimer?

Also, a somewhat naive question: do the bulk load tools assume that the
source data already resides in HDFS? If so, what efficient ways are there
of loading bulk data into HDFS?

On Mon, Oct 11, 2010 at 2:33 PM, Sean Bigdatafun <[email protected]> wrote:

> Another potential "problem" of the incremental bulk loader is that the
> number of reducers (for the bulk loading process) needs to be equal to
> the number of existing regions -- this seems to be unfeasible for a very
> large table, say one with 2000 regions.
>
> Any comment on this? Thanks.
>
> Sean
>
> On Fri, Oct 8, 2010 at 9:03 PM, Todd Lipcon <[email protected]> wrote:
>
>> What version are you building from? These tools are new as of this past
>> June.
>>
>> -Todd
>>
>> On Fri, Oct 8, 2010 at 4:52 PM, Leo Alekseyev <[email protected]> wrote:
>>
>> > We want to investigate HBase bulk imports, as described on
>> > http://hbase.apache.org/docs/r0.89.20100726/bulk-loads.html and/or
>> > JIRA HBASE-48. I can't seem to run either the importtsv tool or the
>> > completebulkload tool using the hadoop jar /path/to/hbase-VERSION.jar
>> > command. In fact, the ImportTsv class is not part of that jar file.
>> > Am I looking in the wrong place for this class, or do I need to
>> > somehow customize the build process to include it? Our HBase was
>> > built from source using the default procedure.
>> >
>> > Thanks for any insight,
>> > --Leo
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
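For reference, the workflow I had expected to work, going by the bulk-loads page, is roughly the sketch below. All paths, the table name, and the column spec are placeholders, and it assumes a build (0.89.20100726 or later) where the importtsv and completebulkload entry points actually ship in the jar:

```shell
# 1. Stage the raw TSV data in HDFS (assuming it starts on local disk;
#    "hadoop fs -put" is the simplest way, though not the only one).
hadoop fs -mkdir /user/me/bulkload-input
hadoop fs -put /local/path/data.tsv /user/me/bulkload-input/

# 2. Run the importtsv MapReduce job to write HFiles instead of Puts.
#    The row key and column-family mapping here are made up for illustration.
hadoop jar /path/to/hbase-VERSION.jar importtsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  -Dimporttsv.bulk.output=/user/me/bulkload-hfiles \
  mytable /user/me/bulkload-input

# 3. Hand the generated HFiles off to the region servers.
hadoop jar /path/to/hbase-VERSION.jar completebulkload \
  /user/me/bulkload-hfiles mytable
```

(Step 2 is where it fails for me, since the jar doesn't contain the ImportTsv class.)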
