Re: [ANN] Flake 0.4.0: Decentralized, k-ordered unique ID generator

Max Countryman Tue, 21 Jun 2016 16:58:11 -0700

I also released Flake 0.4.2 today, which includes an important bugfix for a 
case where two competing threads could produce duplicate IDs in certain 
circumstances, as well as a new method for deriving timestamps.



> On Jun 21, 2016, at 16:29, Max Countryman <m...@me.com> wrote:
> 
> Brian,
> 
> I think you make good points here, especially with regard to the size of IDs.
> 
> I’d also like to point out that while adding the process and thread IDs 
> helps, it doesn’t eliminate the possibility of duplicate IDs: this is why 
> it’s necessary to write out the last used timestamp in a separate thread.
> 
> Just a clarification with regard to disk persistence: we aren’t writing out 
> the epoch, we’re writing out the last used timestamp periodically, in its own 
> thread. Yes, the `init!` API is cumbersome, but it’s an important safety 
> valve which helps protect against duplicate IDs.
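
To make the safety valve concrete, here is a minimal Java sketch of the idea as described above: persist the last used timestamp periodically from its own thread, and at start-up refuse to generate IDs until the clock has passed the persisted value. This is not Flake's code (Flake is written in Clojure). The class `LastTimestampStore`, its method names and the one-number file format are all assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical helper, not part of Flake's API.
final class LastTimestampStore {
    private final Path file;

    LastTimestampStore(Path file) {
        this.file = file;
    }

    // Write the most recently used timestamp (milliseconds) to disk.
    void persist(long lastUsedMillis) throws IOException {
        Files.write(file, Long.toString(lastUsedMillis).getBytes(StandardCharsets.UTF_8));
    }

    // Read the persisted timestamp, or Long.MIN_VALUE if nothing has been written yet.
    long read() throws IOException {
        if (!Files.exists(file)) {
            return Long.MIN_VALUE;
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return Long.parseLong(text);
    }

    // Start-up check: only safe once the clock has moved past the persisted value.
    boolean safeToStart(long nowMillis) throws IOException {
        return nowMillis > read();
    }

    // Persist periodically from a dedicated daemon thread.
    ScheduledExecutorService startWriter(LongSupplier lastUsed, long periodMillis) {
        ScheduledExecutorService exec = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "last-timestamp-writer");
            t.setDaemon(true);
            return t;
        });
        exec.scheduleAtFixedRate(() -> {
            try {
                persist(lastUsed.getAsLong());
            } catch (IOException e) {
                // A real implementation should surface this; the sketch only logs it.
                System.err.println("could not persist timestamp: " + e);
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
        return exec;
    }
}
```

Because the value is only written periodically, the file can lag behind the newest ID by up to one period, so a real start-up check would presumably need to allow for that margin.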
> 
> My understanding from reading the documentation and various StackOverflow 
> answers is that System/nanoTime is monotonic, but I don’t know what 
> guarantees it makes across threads.
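
One way to avoid depending on whatever nanoTime guarantees across threads is to clamp every reading to the largest timestamp handed out so far. The Java sketch below shows that approach. It is an illustration only, not necessarily the method Flake 0.4.2 uses, and `MonotonicClock` is a hypothetical name. It relies on the Javadoc statement that all `System.nanoTime()` invocations within one JVM instance share the same origin.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: a millisecond clock that never goes backwards within one JVM.
final class MonotonicClock {
    // Wall-clock anchor taken once at start-up.
    private final long startWallMillis = System.currentTimeMillis();
    // nanoTime reading taken at (nearly) the same moment.
    private final long startNanos = System.nanoTime();
    // Highest timestamp handed out so far, shared by all threads.
    private final AtomicLong last = new AtomicLong(Long.MIN_VALUE);

    long nowMillis() {
        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000L;
        long candidate = startWallMillis + elapsedMillis;
        // Clamp to the maximum seen so far, so a thread that observes a
        // smaller reading still receives a non-decreasing timestamp.
        return last.accumulateAndGet(candidate, Math::max);
    }
}
```

The shared `AtomicLong` reintroduces a single point of coordination, which is the trade-off discussed elsewhere in this thread. The clamp also only holds within one process, so it does not replace the persisted timestamp across restarts.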
> 
> 
> Max
> 
> 
>> On Jun 21, 2016, at 10:00, Brian Platz <brian.platz@place.works> wrote:
>> 
>> Bruno,
>> 
>> I think the more you can reduce the chance of collision the better and the 
>> thread-local capability is a good idea, but in the process you've almost 
>> doubled the bits.
>> 
>> For me anyhow, an ID needs to be producible at a reasonable rate (1 million 
>> a second per machine is good for me), have near-zero probability of 
>> collision and take up the least amount of space possible.
>> 
>> Under those criteria, I think 128 bits is a reasonable target, and I would 
>> expect the thread-safe atom to handle such volume (although I haven't 
>> tested).
>> 
>> If you need a billion per second and don't want 100 machines producing them, 
>> then I think you are at the point of needing to have thread independence and 
>> probably have to increase the bit-count, and your ideas provide a good path 
>> towards such a solution.
>> 
>> Your comment on the file persistence is a good one; I wonder if the 
>> potential problems are real enough to warrant the risks.
>> 
>> My other question is whether System/nanoTime is guaranteed to increase 
>> across threads. I know that, at least a while ago, this guarantee did not 
>> exist.
>> 
>> -Brian
>> 
>> 
>> On Tuesday, June 21, 2016 at 8:38:58 AM UTC-4, Bruno Bonacci wrote:
>> 
>> Hi, this change is actually easier than it sounds. Looking at the code, I 
>> came across a couple of things which I think could be improved.
>> 
>> 1) use of filesystem persistence.
>> 
>> I'm not too sure that file-based persistence is a good idea. Maybe this is 
>> good idiomatic design for Erlang, but it definitely doesn't look nice in 
>> Clojure.
>>  
>> In particular I'm not too sure that by storing the init time epoch we 
>> actually accomplish anything at all.
>> I would argue that there are a number of problems there: race conditions on 
>> the data, the tmp file being purged, and it still doesn't protect against 
>> the clock drifting during use.
>> 
>> 2) use of CAS (atom) for storing the VM state.
>> 
>> If it is truly decentralized then you shouldn't need an atom at all. The 
>> disadvantage of CAS is that, when many threads race to make the same change, 
>> only one will succeed and all the others will fail and retry. This means 
>> that if you have 100 threads (for example), only 1 will succeed and the other 
>> 99 will fail and retry. In the second round, again only 1 will succeed and 98 
>> will retry, and so on.
>> Therefore the total number of attempts will be 
>> 
>>  
>> <https://lh3.googleusercontent.com/-ZVELcKNoB9M/V2kxgYmlFMI/AAAAAAAAB8Q/nR6jLFjKSI0611-WiQpQHXAcY3SueVIdwCLcB/s1600/Screen%2BShot%2B2016-06-21%2Bat%2B13.21.24.png>
>> 
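
The linked image is not reproduced here, but the rounds described above (100 attempts, then 99, then 98, and so on) add up to n + (n - 1) + ... + 1 = n(n + 1)/2, which is 5050 for n = 100. The Java sketch below computes that worst case and also runs a small experiment with an explicit CAS retry loop on an `AtomicLong`, standing in for a Clojure atom. The class and method names are made up for illustration.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class, not taken from Flake.
final class CasContention {
    // Worst case from the text: n threads, one winner per round,
    // so n + (n - 1) + ... + 1 = n * (n + 1) / 2 attempts in total.
    static long worstCaseAttempts(long n) {
        return n * (n + 1) / 2;
    }

    // Each of n threads bumps a shared counter once through a CAS retry loop.
    // Returns {final counter value, total CAS attempts actually made}.
    static long[] run(int n) throws InterruptedException {
        AtomicLong counter = new AtomicLong();
        AtomicLong attempts = new AtomicLong();
        Thread[] threads = new Thread[n];
        for (int i = 0; i < n; i++) {
            threads[i] = new Thread(() -> {
                while (true) {
                    long seen = counter.get();
                    attempts.incrementAndGet();
                    if (counter.compareAndSet(seen, seen + 1)) {
                        return; // this thread's update won
                    }
                    // another thread changed the value first: retry
                }
            });
        }
        for (Thread t : threads) {
            t.start();
        }
        for (Thread t : threads) {
            t.join();
        }
        return new long[] { counter.get(), attempts.get() };
    }
}
```

`CasContention.worstCaseAttempts(100)` returns 5050. The measured count from `run(100)` depends on scheduling and is usually much closer to 100, because the threads rarely all collide on the same value at once.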
>> If you want to develop a real "decentralized" id generator, I think, you 
>> need to drop the atom in favour of a thread local store.
>> Now to do so and make collision impossible we need to add more bits:
>> 
>>     64 bits - ts (i.e. a timestamp )
>>     48 bits - worker-id/node (i.e. MAC address)
>>     32 bits - worker-id/process (pid) 
>>     64 bits - worker-id/thread (thread num)
>>     32 bits - seq-no (i.e. a counter)
>> 
>> By adding the process id (pid) and the thread id there is no possibility of 
>> two systems running at the same time and creating the same id.
>> Finally, by using thread-local storage there is no need for process-level 
>> coordination (atom) and no risk of retries caused by threads stepping on 
>> each other's toes.
>> 
>> With such a setup, 100 threads will be able to increment their own 
>> thread-local counters independently (given that you have 100 execution 
>> cores).
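
The proposed fields add up to 64 + 48 + 32 + 64 + 32 = 240 bits, or 30 bytes. As a rough Java sketch of the layout with a thread-local sequence counter: the class `WideFlakeId` is hypothetical, and the timestamp, node id and pid are passed in by the caller so that the example makes no assumption about how they are obtained.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the 240-bit layout proposed above.
final class WideFlakeId {
    static final int SIZE_BYTES = 8 + 6 + 4 + 8 + 4; // 30 bytes = 240 bits

    // Per-thread sequence counter: no atom, no CAS, no retries.
    private static final ThreadLocal<int[]> SEQ = ThreadLocal.withInitial(() -> new int[1]);

    // nodeId must be 6 bytes (e.g. a MAC address); pid is supplied by the caller.
    static byte[] next(long timestamp, byte[] nodeId, int pid) {
        if (nodeId.length != 6) {
            throw new IllegalArgumentException("nodeId must be 6 bytes (48 bits)");
        }
        int[] seq = SEQ.get();
        return ByteBuffer.allocate(SIZE_BYTES)              // big-endian by default
                .putLong(timestamp)                          // 64 bits - ts
                .put(nodeId)                                 // 48 bits - worker-id/node
                .putInt(pid)                                 // 32 bits - worker-id/process
                .putLong(Thread.currentThread().getId())     // 64 bits - worker-id/thread
                .putInt(seq[0]++)                            // 32 bits - seq-no
                .array();
    }
}
```

Two calls on the same thread with the same timestamp differ only in the sequence field, and calls on different threads differ in the thread field. The sketch does nothing about a clock that moves backwards between runs, which is the case the persisted timestamp is meant to cover.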
>> 
>> What do you think?
>> Bruno
>> 
>>  
>> 
>> 
>> -- 
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clojure@googlegroups.com 
>> <mailto:clojure@googlegroups.com>
>> Note that posts from new members are moderated - please be patient with your 
>> first post.
>> To unsubscribe from this group, send email to
>> clojure+unsubscr...@googlegroups.com 
>> <mailto:clojure+unsubscr...@googlegroups.com>
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en 
>> <http://groups.google.com/group/clojure?hl=en>
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to clojure+unsubscr...@googlegroups.com 
>> <mailto:clojure+unsubscr...@googlegroups.com>.
>> For more options, visit https://groups.google.com/d/optout 
>> <https://groups.google.com/d/optout>.

