Re: Best practices in ID generation?

Guilherme Germoglio Wed, 08 Jul 2009 12:07:47 -0700

You may also want to read this blog post:
http://devblog.streamy.com/2009/04/23/hbase-row-key-design-for-paging-limit-offset-queries/


It contains a great lesson on designing row keys to enable paging through
results (though it may need to be updated for the 0.20 API).

On Wed, Jul 8, 2009 at 3:57 PM, Vaibhav Puranik <[email protected]> wrote:

> Thanks for your help Jon and Brian.
>
> incrementColumnValue seems promising to us because we would like to have
> rows ordered by insertion time.
>
> Regards,
> Vaibhav
>
> On Wed, Jul 8, 2009 at 11:36 AM, Jonathan Gray <[email protected]> wrote:
>
> > Right, BD.
> >
> > We use incrementing IDs for most things because it gives us ordering.
> >
> > If you only have random-key access, you would be better suited using
> > UUID-style IDs as Bryan says.
> >
> >
> > Bryan Duxbury wrote:
> >
> >> Not necessarily in context of hbase, but Rapleaf uses UUIDs/GUIDs, since
> >> they are crazy fast to generate and have no dependencies on external
> >> resources.
> >>
> >> In the context of hbase, a benefit of UUIDs is that they will be
> >> randomly distributed over your whole table, instead of consistently
> >> showing up in the last region in the table.
> >>
> >> -Bryan
> >>
> >> On Jul 8, 2009, at 11:10 AM, Jonathan Gray wrote:
> >>
> >>> There are a number of different ways you could generate IDs.
> >>>
> >>> Some people use GUIDs, probably the simplest way, though not my
> >>> recommendation.
> >>>
> >>> ZooKeeper has a facility for ID generation.
> >>>
> >>> Here, we use HBase for ID generation.  Currently in production, which
> >>> is running on 0.19, I run a custom patch that works very much like
> >>> incrementColumnValue does in 0.20.  When moving to 0.20 I plan on
> >>> migrating our ID assignment system to the built-in ICV.
> >>>
> >>> You can expect about a 1ms end-to-end time on an increment operation,
> >>> so if you need to generate more than 1000 ids/second, you need to
> >>> think about how to distribute it across multiple rows, or you could
> >>> grab them in batches (increment by 100 to generate 100 ids at a time,
> >>> still takes 1ms).
> >>>
> >>> Hope that helps.
> >>>
> >>> JG
> >>>
> >>> Vaibhav Puranik wrote:
> >>>
> >>>> Hi,
> >>>> Does anybody have any suggestions/best practices on id/row/key
> >>>> generation for HBase rows?
> >>>> Do people use sequential ids (like rdbms - 1,2,3,4...) or do people
> >>>> use string ids?
> >>>> What id server do you use? Do we have to write our own?
> >>>> Any help/experiences, please?
> >>>> Regards,
> >>>> Vaibhav
> >>>>
> >>>
> >>
>
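The batching idea from Jonathan's message (increment by 100 to get 100 ids
for the cost of one round trip) could be sketched roughly as below. This is
only an illustration, not the HBase client API: CounterRow is a hypothetical
in-memory stand-in for the counter cell that incrementColumnValue would
update, and BlockIdAllocator and block_size are names made up for the example.

```python
import threading


class CounterRow:
    """Hypothetical in-memory stand-in for a counter cell (not HBase itself)."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()
        self.calls = 0  # number of simulated round trips to the counter

    def increment(self, amount):
        """Atomically add `amount` and return the new counter value."""
        with self._lock:
            self.calls += 1
            self._value += amount
            return self._value


class BlockIdAllocator:
    """Hands out ids locally, reserving a whole block per counter increment."""

    def __init__(self, counter, block_size=100):
        self._counter = counter
        self._block_size = block_size
        self._next = 0    # next id to hand out
        self._limit = 0   # first id NOT covered by the current block
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            if self._next >= self._limit:
                # One increment reserves block_size ids at once.
                top = self._counter.increment(self._block_size)
                self._next = top - self._block_size + 1
                self._limit = top + 1
            value = self._next
            self._next += 1
            return value


counter = CounterRow()
allocator = BlockIdAllocator(counter, block_size=100)
ids = [allocator.next_id() for _ in range(250)]
print(ids[0], ids[-1], counter.calls)
```

Here 250 ids cost three increments instead of 250. The trade-off is that a
client that dies leaves the unused part of its block as a gap, and with
several clients the ids are no longer strictly ordered by insertion time
across clients, only within each block.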


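Bryan's point about where new rows land can also be illustrated with a toy
model. The sketch below is an assumption-laden simulation, not HBase code:
the split points are invented, and region_for is a hypothetical helper that
just finds which sorted key range a row key falls into.

```python
import bisect
import uuid


def region_for(key, split_points):
    """Index of the region holding `key`, given sorted region split points."""
    return bisect.bisect_right(split_points, key)


# Sequential ids, zero-padded so that string order matches numeric order.
# The table is assumed to be split into 4 regions at these made-up points.
seq_splits = ["%010d" % n for n in (1000, 2000, 3000)]
new_seq_keys = ["%010d" % n for n in range(3000, 3100)]
seq_regions = {region_for(k, seq_splits) for k in new_seq_keys}

# Random UUIDs as hex strings, with made-up split points on the first digit.
uuid_splits = ["4", "8", "c"]
new_uuid_keys = [uuid.uuid4().hex for _ in range(1000)]
uuid_regions = {region_for(k, uuid_splits) for k in new_uuid_keys}

print("sequential keys hit regions:", sorted(seq_regions))
print("uuid keys hit regions:", sorted(uuid_regions))
```

Every new sequential key sorts after all existing keys, so all of them go to
the last region, while the random keys spread over the whole key space. The
price of the random keys is that rows are no longer ordered by insertion time.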

-- 
Guilherme

msn: [email protected]
homepage: http://germoglio.googlepages.com
