Here is a new version, in preparation for possible integration in the main image:

===
Name: Neo-UUID-SvenVanCaekenberghe.2
Author: SvenVanCaekenberghe
Time: 8 February 2016, 8:33:04.141334 pm
UUID: a909453e-35dd-4c25-8273-62a9b2bd982e
Ancestors: Neo-UUID-SvenVanCaekenberghe.1

Streamline UUID generation

Add a current, shared instance

Added better class and method comments

Add more tests

As suggested by Henrik Johansen, change to a version 0 UUID to indicate that we 
follow a custom approach
===

The class comment now reads as follows:

===
I am NeoUUIDGenerator, I generate UUIDs.

An RFC4122 Universally Unique Identifier (UUID) is an opaque 128-bit number 
that can be used for identification purposes. Concretely, a UUID is a 16 
element byte array.

The intent of UUIDs is to enable distributed systems to uniquely identify 
information without significant central coordination. In this context the word 
unique should be taken to mean "practically unique" rather than "guaranteed 
unique".
 
I generate UUIDs similar, in spirit, to those defined in RFC4122, though I use 
version 0 to indicate that I follow none of the defined versions. This does not 
matter much, if at all, in practice.

I try to conform to the following aspects:
- each 'node' (machine, image, instance) should generate unique UUIDs
- even when generating UUIDs at a very fast rate, they should remain unique
- be fast and efficient

To achieve this goal, I
- take several aspects into account to generate a unique node ID
- combine a clock, a counter and some random bits
- hold a state, protected for multi user access

I can generate about 500K UUIDs per second.

Implementation:

Although a UUID should be seen as totally opaque, here is the concrete way I 
generate one:
- the first 8 bytes are the millisecond clock value with the smallest quantity 
first; this means that the later ones of these 8 bytes will be identical when 
generated within small(er) timespans; within the same millisecond, the full first 
8 bytes will be identical
- the next 2 bytes represent a counter with safe overflow, held as protected 
state inside me; after 2^16 (65536) increments this value will repeat; the counter initializes with 
a random value
- the next 2 bytes are simply random, based on the system PRNG, Random
- the final 4 bytes represent the node ID; the node ID is unique per instance 
of me, across OS environments where the image might run; the node ID is the MD5 
hash of a string that is the concatenation of several elements (see 
#computeNodeIdentifier)
 
Some bits are set to some predefined value, to indicate the variant and version 
(see #setVariantAndVersion:)

Usage:

  NeoUUIDGenerator next.
  NeoUUIDGenerator current next.
  NeoUUIDGenerator new next.

Sharing an instance is more efficient and correct.
Instances should be reset whenever the image comes up.

See also:

  http://en.wikipedia.org/wiki/UUID
  https://tools.ietf.org/html/rfc4122
===
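
To make the layout described in the class comment concrete, here is a minimal 
sketch in Python. It is not the Pharo implementation: the class name 
SketchUUIDGenerator, the optional clock_ms parameter, the inputs to the node ID 
hash, the choice of the first 4 bytes of the MD5 digest, the byte order of the 
counter, and the exact bit positions for variant and version (taken from 
RFC 4122) are all assumptions made for illustration.

```python
import hashlib
import os
import random
import threading
import time


class SketchUUIDGenerator:
    """Illustrative only: mirrors the byte layout from the class comment."""

    def __init__(self):
        # The counter is shared state, so guard it with a lock.
        self._lock = threading.Lock()
        # The counter starts at a random 16-bit value.
        self._counter = random.getrandbits(16)
        self._node_id = self._compute_node_id()

    @staticmethod
    def _compute_node_id():
        # Assumption: these hash inputs are hypothetical stand-ins; the real
        # elements are whatever #computeNodeIdentifier concatenates.
        seed = "{}-{}-{}".format(
            os.getpid(), time.time_ns(), random.getrandbits(64))
        # Assumption: keep the first 4 bytes of the 16-byte MD5 digest.
        return hashlib.md5(seed.encode("utf-8")).digest()[:4]

    def next(self, clock_ms=None):
        # clock_ms is a hypothetical hook to make the sketch deterministic.
        if clock_ms is None:
            clock_ms = time.time_ns() // 1_000_000
        with self._lock:
            # Safe overflow: wraps around after 2^16 increments.
            self._counter = (self._counter + 1) & 0xFFFF
            counter = self._counter
        raw = bytearray(16)
        # Bytes 0-7: millisecond clock, smallest quantity first.
        raw[0:8] = clock_ms.to_bytes(8, "little")
        # Bytes 8-9: counter (byte order is an assumption).
        raw[8:10] = counter.to_bytes(2, "big")
        # Bytes 10-11: random bits.
        raw[10:12] = random.getrandbits(16).to_bytes(2, "big")
        # Bytes 12-15: node ID.
        raw[12:16] = self._node_id
        # Assumption: RFC 4122 bit positions for version and variant.
        raw[6] = raw[6] & 0x0F           # version 0 in the high nibble
        raw[8] = (raw[8] & 0x3F) | 0x80  # variant bits 10
        return bytes(raw)
```

In this sketch, two UUIDs generated by the same instance within the same 
millisecond share the clock and node ID bytes and differ at least in the low 
byte of the counter.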

If we integrate this, I think we should replace the old generator and the use 
of the primitive/plugin. But that requires support from at least a few people 
besides me.

And although I think that we should integrate this generator and get rid of the 
plugin, I think there is probably an underlying problem here (why did the 
generator fail?) that could be important to find.

Sven

> On 08 Feb 2016, at 10:38, Henrik Johansen <henrik.s.johan...@veloxit.no> 
> wrote:
> 
>> 
>> On 08 Feb 2016, at 10:29 , Sven Van Caekenberghe <s...@stfx.eu> wrote:
>> 
>>> 2) Misrepresenting the way the UUID was generated (a combination of node 
>>> identifier + timestamp + random value, similar to type 3, but with 
>>> differently sized/ordered fields) by marking it as being of type 4, which 
>>> is defined to be a UUID consisting of random bytes.
>>> IOW, I think it should be marked as type 0 instead of 4, so that the 1 
>>> person in each country who might assume something about the 
>>> implementation based on the type field won't later feel he's been duped 
>>> when checking the generator.
>> 
>> OK, I certainly want to change the type. Thing is, I cannot find a reference 
>> to type 0 anywhere that I am looking (I mostly used 
>> https://en.wikipedia.org/wiki/Universally_unique_identifier). Where did you 
>> find a definition of type 0 ? Or would that be a way to say 'no specific 
>> type' then ?
> 
> My rationale was that it is currently unassigned, and the least likely number 
> to be chosen as identifier by new versions of the standard.
> IOW, for those who care, it might raise a "hmm, this is strange, better check 
> the source", upon which they will discover it is generated in a non-standard 
> fashion (but can verify for themselves it is generated in a way still pretty 
> much guaranteed to be unique), and the rest... well, they can (most probably) 
> keep on living happily without ever seeing a collision.
> 
> Cheers,
> Henry

