What do you need?  Bulk upload, in the scheme of things, is a
well-documented feature.  It's also one that has had some exercise and
is known to work well.  For the 0.89 release and trunk, documentation
is here: http://hbase.apache.org/docs/r0.89.20100924/bulk-loads.html.
The unit test you refer to below is good for figuring out how to run a
job.  (Bulk upload was redone for 0.89/trunk and is much improved over
what was available in 0.20.x.)

On TotalOrderPartitioner, this is a partitioner class from Hadoop.  The
MR partitioner -- the class that dictates which reducers get what map
outputs -- is pluggable.  The default partitioner does a hash of the
output key to figure which reducer.  This won't work if you are
looking to have your HFile output totally sorted, because hashing
scatters adjacent keys across reducers.
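
To make the difference concrete, here is a minimal, self-contained
sketch in plain Java.  It does not use the Hadoop or HBase classes; the
class name PartitionerSketch, its methods, and the sample row keys and
split points are all made up for illustration.  It only mimics the
idea: a hash partitioner picks a reducer from the key's hash, while a
total-order partitioner picks a reducer by comparing the key against
sorted split points, so that reducer 0 gets the lowest key range,
reducer 1 the next, and so on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustration only: hypothetical names, not the Hadoop/HBase API.
final class PartitionerSketch {

    // Hash partitioning: reducer chosen from the key's hash code.
    static int hashPartition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Total-order partitioning: splitPoints must be sorted. Keys below
    // splitPoints[0] go to reducer 0; a key equal to or above
    // splitPoints[i] (and below the next split) goes to reducer i + 1.
    static int totalOrderPartition(String key, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    // Simulates the reduce side: each reducer sorts its own keys, then
    // the reducer outputs are read back in reducer order. Pass
    // splitPoints == null to use hash partitioning instead.
    static List<String> reducerOutputsInOrder(List<String> keys,
                                              int numReducers,
                                              String[] splitPoints) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            int reducer = (splitPoints == null)
                    ? hashPartition(key, numReducers)
                    : totalOrderPartition(key, splitPoints);
            buckets.get(reducer).add(key);
        }
        List<String> out = new ArrayList<>();
        for (List<String> bucket : buckets) {
            Collections.sort(bucket);
            out.addAll(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys =
                Arrays.asList("row-m", "row-a", "row-z", "row-g", "row-t");
        String[] splits = {"row-h", "row-p"}; // made-up split points
        System.out.println("hash:        "
                + reducerOutputsInOrder(keys, 3, null));
        System.out.println("total order: "
                + reducerOutputsInOrder(keys, 3, splits));
    }
}
```

With split points, the concatenated reducer outputs come out in global
key order; with hashing there is no such guarantee.  In the real job the
split points would not be hard-coded as they are here; as far as I
understand it, they are derived from the table's region boundaries.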

If you can't figure out what it's about, I'd suggest you check out the
Hadoop book, where it gets a good explication.

On incremental upload, the doc suggests you look at the output of the
LoadIncrementalHFiles command.  Have you done that?  You run the
command and it'll add in whatever is ready for loading.

St.Ack


On Wed, Nov 10, 2010 at 6:47 AM, Shuja Rehman <[email protected]> wrote:
> Hey Community,
>
> Well...it seems that nobody has experienced with the bulk load option. I
> have found one class which might help to write the code for it.
>
> https://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat.java
>
> From this, you can get the idea how to write map reduce job to output in
> HFiles format. But There is a little confusion about these two things
>
> 1-TotalOrderPartitioner
> 2-configureIncrementalLoad
>
> Does anybody have idea about how these things and how to configure it for
> the job?
>
> Thanks
>
>
>
> On Wed, Nov 10, 2010 at 1:02 AM, Shuja Rehman <[email protected]> wrote:
>
>> Hi
>>
>> I am trying to investigate the bulk load option as described in the
>> following link.
>>
>> http://hbase.apache.org/docs/r0.89.20100621/bulk-loads.html
>>
>> Does anybody have sample code or have used it before?
>> Can it be helpful to insert data into existing table. In my scenario, I
>> have one table with 1 column family in which data will be inserted every 15
>> minutes.
>>
>> Kindly share your experiences
>>
>> Thanks
>> --
>> Regards
>> Shuja-ur-Rehman Baig
>> <http://pk.linkedin.com/in/shujamughal>
>>
>>
>
>
> --
> Regards
> Shuja-ur-Rehman Baig
> <http://pk.linkedin.com/in/shujamughal>
>
