Here I was figuring you'd just keep an internal counter, but that you would define an interface so that people can easily create their own.

e.g.
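import java.util.HashMap;
import java.util.Map;

class Converter {
  private long id;
  private final Map<String, Long> ids = new HashMap<String, Long>();
  private final Map<Long, String> strings = new HashMap<Long, String>();

  public synchronized long convert(String s) {
    // check to see if the string exists; if so, return its long
    Long existing = ids.get(s);
    if (existing != null) {
      return existing;
    }
    // else increment id (the method is synchronized, so this is thread safe)
    long assigned = ++id;
    ids.put(s, assigned);
    strings.put(assigned, s);
    return assigned;
  }

  public synchronized Object lookup(long key) {
    // look it up and return it (null if the id was never assigned)
    return strings.get(key);
  }
}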

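To make the "define an interface" part concrete, here is a rough sketch of what a pluggable converter might look like, with a hash-based implementation as one example of something a user could supply instead of a counter. The names StringIdConverter and HashingConverter are made up for illustration and are not an existing API, and taking the first 8 bytes of an MD5 digest is just one assumed way to get a 64-bit value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface; the name and methods are assumptions.
interface StringIdConverter {
  long convert(String s);

  String lookup(long id);
}

// One possible user-supplied implementation: hash instead of count.
class HashingConverter implements StringIdConverter {
  // Reverse mapping so ids can be turned back into strings.
  private final Map<Long, String> reverse = new ConcurrentHashMap<>();

  @Override
  public long convert(String s) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(s.getBytes(StandardCharsets.UTF_8));
      // Use the first 8 of the 16 digest bytes as the long id.
      long id = ByteBuffer.wrap(digest).getLong();
      // On a collision the first string seen keeps the id.
      reverse.putIfAbsent(id, s);
      return id;
    } catch (NoSuchAlgorithmException e) {
      // MD5 is required on every Java platform, so this is unexpected.
      throw new IllegalStateException(e);
    }
  }

  @Override
  public String lookup(long id) {
    return reverse.get(id);
  }
}
```

The hash version needs no shared counter, so the same string maps to the same long on every machine, but it trades that for the collision risk discussed in the quoted messages below.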

On Aug 4, 2009, at 10:38 AM, Sean Owen wrote:

Agree, there is of course a limit to this and it needs to be documented. At
some scale you would have to switch to 'really' use longs. For smaller
scales it remains useful to provide this sort of thing.

I would also note the consequences of a collision are modest in a CF
problem. The worst thing is that one of your users is actually getting
recs for someone else. Which... is not good, but it is not like a rocket
crashes or a bank account gets wiped out.

On Aug 4, 2009 3:28 PM, "Yonik Seeley" <[email protected]> wrote:

On Tue, Aug 4, 2009 at 10:16 AM, Otis Gospodnetic<[email protected]> wrote:

Excellent, was thinking the same thing last night, too - I'm probably not
the only person using St...
The problem with using a hash is that they can generate collisions...
using an excellent 64-bit hash will only allow millions (at most) of
ids before the probability of collision becomes too great.  At ~190M
ids, the probability of collision becomes .1% - too high to rely on
the output of such a system with confidence.

-Yonik
http://www.lucidimagination.com
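
The ~190M figure above can be sanity-checked with the standard birthday approximation, p ≈ 1 - exp(-n(n-1) / (2 * 2^bits)). The sketch below assumes hash values are uniformly distributed over the 64-bit space; the class and method names are hypothetical and only for illustration.

```java
// Hypothetical helper, for illustration only.
class CollisionEstimate {
  // Birthday approximation for the chance of at least one collision
  // among n ids hashed uniformly into a space of 2^bits values.
  static double collisionProbability(double n, int bits) {
    double space = Math.pow(2.0, bits);
    // expm1 keeps precision when the exponent is very close to zero.
    return -Math.expm1(-n * (n - 1.0) / (2.0 * space));
  }

  public static void main(String[] args) {
    // Probability for 190 million ids and a 64-bit hash.
    System.out.println(collisionProbability(190e6, 64));
  }
}
```

Under that assumption, 190 million ids give a probability just under 0.001, which agrees with the roughly .1% quoted above.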

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
