Hi stack,
Thanks for your reply.
I just used the default hash partitioner. I am an HBase newbie, but I will
do my best to work on this issue, following HBASE-1901.
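
To make the problem concrete, here is a minimal sketch in plain Java. It is not the HBASE-48 or HBASE-1901 code and uses no Hadoop or HBase classes; the class name PartitionSketch and the split points are made up for illustration. The hash formula is what I understand the default hash partitioner to do: the bucket depends only on the key's hash, so neighbouring keys can land in different reducers and each output file can span the whole key space. A range partitioner over sorted split points keeps every output file inside its own key range.

```java
import java.util.Arrays;

/** Plain-Java sketch; no Hadoop or HBase classes are used. */
final class PartitionSketch {

    /** Hash partitioning: the bucket depends only on the hash, not on key order. */
    static int hashPartition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    /**
     * Range partitioning over sorted split points (hypothetical values).
     * Partition 0 holds keys < splits[0]; partition i holds keys in
     * [splits[i-1], splits[i]); the last partition holds keys >= the last split.
     */
    static int rangePartition(String key, String[] sortedSplits) {
        int pos = Arrays.binarySearch(sortedSplits, key);
        // Found at pos: the key equals a split point and starts partition pos + 1.
        // Not found: binarySearch returns -(insertionPoint) - 1.
        return pos >= 0 ? pos + 1 : -(pos + 1);
    }

    public static void main(String[] args) {
        String[] splits = {"0000010000", "0000020000"}; // assumed split points
        String[] keys = {"0000009447", "0000009448", "0000020476", "0000020477"};
        for (String k : keys) {
            System.out.println(k + " hash=" + hashPartition(k, 3)
                    + " range=" + rangePartition(k, splits));
        }
    }
}
```

With the range version, a larger key never goes to a lower-numbered partition, so the firstKey/lastKey ranges of the output files cannot overlap. The hash version gives no such guarantee.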
On Sun, Oct 11, 2009 at 2:54 PM, stack <[email protected]> wrote:
> On Sat, Oct 10, 2009 at 10:54 PM, Anty <[email protected]> wrote:
>
> > Hi:
> > stack,
> > I did some tests on the bulk load tools of HBASE-48.
> >
>
> Thanks for trying it out.
>
>
> > I took the files made by the TestHFileOutputFormat test and passed them to
> > the script you wrote. It did work, but something seems unusual. For each
> > region, the STARTKEY and ENDKEY are nearly the same; the ENDKEY is larger
> > than the STARTKEY by nearly 1, e.g.
> > STARTKEY=>'0000009447',ENDKEY=>'0000009448';
> > STARTKEY=>'0000020476',ENDKEY=>'0000020477';
> > ...
> >
> >
> Did you do your own partitioner or just use default hash partitioner?
>
>
>
> > I also have some doubts about TestHFileOutputFormat. The default
> > partitioner is the hash partitioner, which cannot meet the
> > requirements of TestHFileOutputFormat. As you said, we need to ensure a
> > total ordering of all keys, so we need to supply a partitioner that does
> > total ordering (but you didn't add a new partitioner in
> > TestHFileOutputFormat).
> >
>
> This is broken then, as you point out. We should make something like what
> is described in https://issues.apache.org/jira/browse/HBASE-1901 for
> TestHFileOutputFormat?
>
>
>
>
> > So I think TestHFileOutputFormat uses the hash partitioner, which does not
> > do total ordering. Different regions would then have overlapping row
> > ranges, which is not correct for HBase. I also found that the firstKey and
> > lastKey of the files made by TestHFileOutputFormat do indeed overlap.
> > Are the bulk load tools just a beginning that needs further improvement? I
> > think the bulk load tools are very useful.
> >
> >
> Can you help us improve it? What do you think we need to do next
> (hbase-901?)
>
> Thanks for writing, Anty Rao.
> St.Ack
>
--
Anty Rao