AAAANNNDDD! It looks like I found the real cause of this!

- I tested an image pre-new session manager, and an image post-new session manager. The issue only appeared in the latter.

- Checking, it seems that UUIDGenerator is not subscribed to the new startup list, so it is not being reinitialized on every startup. This means, moreover, that everyone loading the latest Pharo image is using the same UUIDGenerator instance, with the same random seed => same generated UUIDs.
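
To illustrate the effect, here is a minimal sketch in Python (not Pharo code; make_generator and the seed value 12345 are made up for the example). Two generators that start from the same saved seed hand out the same UUIDs, while reseeding at startup avoids the clash:

```python
import random
import uuid


def make_generator(seed):
    """Return a function producing version-4-style UUIDs from a seeded PRNG."""
    rng = random.Random(seed)

    def next_uuid():
        # uuid.UUID sets the version and variant bits on the 128 random bits.
        return uuid.UUID(int=rng.getrandbits(128), version=4)

    return next_uuid


# Two "images" saved with an already-seeded generator that is never
# reinitialized at startup (hypothetical seed value).
image_a = make_generator(seed=12345)
image_b = make_generator(seed=12345)
same = [image_a() for _ in range(3)] == [image_b() for _ in range(3)]

# Reseeding at startup from the OS entropy source gives independent streams.
fresh_a = make_generator(seed=random.SystemRandom().getrandbits(64))
fresh_b = make_generator(seed=random.SystemRandom().getrandbits(64))
different = fresh_a() != fresh_b()

print(same, different)
```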

So, adding UUIDGenerator to the list would be the simplest solution and the integration of NeoUUIDGenerator can be moved to Pharo6 maybe.

However, now that we found that something was missing in the startup list, we should check for others...

Guille

On 02/09/2016 10:36 AM, Guille Polito wrote:
Yes, go on. I think it is the easiest.

Right now, to check if a slice is good or not I have to commit, browse Smalltalkhub's UI, and if it is not good I have to delete the mcz manually...

On 02/09/2016 10:10 AM, Sven Van Caekenberghe wrote:
You want me to do it then ?

I don't want us to do double work ...

On 09 Feb 2016, at 10:09, Guillermo Polito wrote:

Sad and true... :'(

On Tue, Feb 9, 2016 at 10:01 AM, Sven Van Caekenberghe wrote:
Besides, you can't make slices ;-)

On 09 Feb 2016, at 09:58, Sven Van Caekenberghe wrote:

I can do the integration too, but I need some people to say go ahead.
I vote for replacing everything, there is no need for a plugin.

On 09 Feb 2016, at 09:25, Guille Polito wrote:

Sven, just to answer your last question. The UUID generation right now generates the UUID fields like this:

UUIDGenerator>>generateFieldsVersion4

   timeLow := self generateRandomBitsOfLength: 32.
   timeMid := self generateRandomBitsOfLength: 16.
   timeHiAndVersion := 16r4000 bitOr: (self generateRandomBitsOfLength: 12).
   clockSeqHiAndReserved := 16r80 bitOr: (self generateRandomBitsOfLength: 6).
   clockSeqLow := self generateRandomBitsOfLength: 8.
   node := self generateRandomBitsOfLength: 48.

So... It's basically completely random. There is no part of the UUID that is actually based on the node, the clock or the time. It is actually a random string of bits that is generated using a number from /dev/urandom as seed (on Linux).
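
For readers who do not read Smalltalk, the same field layout can be sketched in Python (an illustration only, not the Pharo implementation; the function name and the use of random.Random are assumptions):

```python
import random
import uuid


def generate_fields_version4(rng):
    """Build a UUID whose fields are all random, apart from the version/variant bits."""
    time_low = rng.getrandbits(32)
    time_mid = rng.getrandbits(16)
    # The high 4 bits carry the version number (4); the other 12 are random.
    time_hi_and_version = 0x4000 | rng.getrandbits(12)
    # The high 2 bits are the RFC 4122 variant (binary 10); the other 6 are random.
    clock_seq_hi_and_reserved = 0x80 | rng.getrandbits(6)
    clock_seq_low = rng.getrandbits(8)
    node = rng.getrandbits(48)
    return uuid.UUID(fields=(time_low, time_mid, time_hi_and_version,
                             clock_seq_hi_and_reserved, clock_seq_low, node))


example = generate_fields_version4(random.Random())
print(example, example.version)
```

Nothing in these fields depends on the clock or the node, so uniqueness rests entirely on the quality and seeding of the random source.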

Does the mac VM include the plugin? (I do not have a mac any more to test fast ^^)

I'll work on the integration of NeoUUID now, I hope this is the kind of issue that gets integrated during code freeze :)

Guille

On 02/08/2016 08:39 PM, Sven Van Caekenberghe wrote:
Here is a new version, in preparation of possible integration in the main image:

===
Name: Neo-UUID-SvenVanCaekenberghe.2
Author: SvenVanCaekenberghe
Time: 8 February 2016, 8:33:04.141334 pm
UUID: a909453e-35dd-4c25-8273-62a9b2bd982e
Ancestors: Neo-UUID-SvenVanCaekenberghe.1

Streamline UUID generation

Add a current, shared instance

Added better class and method comments

Add more tests

As suggested by Henrik Johansen, change to a version 0 UUID to indicate that we follow a custom approach
===

The class comment now reads as follows:

===
I am NeoUUIDGenerator, I generate UUIDs.

An RFC4122 Universally Unique Identifier (UUID) is an opaque 128-bit number that can be used for identification purposes. Concretely, a UUID is a 16 element byte array.

The intent of UUIDs is to enable distributed systems to uniquely identify information without significant central coordination. In this context the word unique should be taken to mean "practically unique" rather than "guaranteed unique". I generate UUIDs similar, in spirit, to those defined in RFC4122, though I use version 0 to indicate that I follow none of the defined versions. This does not matter much, if at all, in practice.

I try to conform to the following aspects:
- each 'node' (machine, image, instance) should generate unique UUIDs
- even when generating UUIDs at a very fast rate, they should remain unique
- be fast and efficient

To achieve this goal, I
- take several aspects into account to generate a unique node ID
- combine a clock, a counter and some random bits
- hold a state, protected for multi user access

I can generate about 500K UUIDs per second.

Implementation:

Although a UUID should be seen as totally opaque, here is the concrete way I generate one:
- the first 8 bytes are the millisecond clock value with the smallest quantity first; this means that the later of these 8 bytes will be identical when generated with small(er) timespans; within the same millisecond, the full first 8 bytes will be identical
- the next 2 bytes represent a counter with safe overflow, held as protected state inside me; after 2^16 this value will repeat; the counter initializes with a random value
- the next 2 bytes are simply random, based on the system PRNG, Random
- the final 4 bytes represent the node ID; the node ID is unique per instance of me, across OS environments where the image might run; the node ID is the MD5 hash of a string that is the concatenation of several elements (see #computeNodeIdentifier)

Some bits are set to some predefined value, to indicate the variant and version (see #setVariantAndVersion:)

Usage:

  NeoUUIDGenerator next.
  NeoUUIDGenerator current next.
  NeoUUIDGenerator new next.

Sharing an instance is more efficient and correct.
Instances should be reset whenever the image comes up.

See also:

  http://en.wikipedia.org/wiki/UUID
  https://tools.ietf.org/html/rfc4122
===
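
For those who want to see the byte layout from the class comment without loading the package, here is a rough sketch in Python. It is not the Pharo code. The class name, the node string, the byte order of the counter, and the exact bits used for variant and version (taken here at the usual RFC 4122 positions) are all assumptions made for illustration:

```python
import hashlib
import random
import struct
import threading
import time


class SketchUUIDGenerator:
    """Clock + counter + random + node ID, following the layout in the comment."""

    def __init__(self, node_string):
        # Assumption: node ID = first 4 bytes of the MD5 of an identifying string.
        self._node = hashlib.md5(node_string.encode("utf-8")).digest()[:4]
        self._rng = random.Random()  # seeded from the OS when no seed is given
        self._counter = self._rng.getrandbits(16)  # counter starts at a random value
        self._lock = threading.Lock()  # state is protected for concurrent use

    def next(self):
        with self._lock:
            self._counter = (self._counter + 1) & 0xFFFF  # wraps after 2^16
            counter = self._counter
            rand = self._rng.getrandbits(16)
        clock_ms = time.time_ns() // 1_000_000
        # 8 bytes clock (least significant byte first), 2 bytes counter,
        # 2 bytes random, then the 4-byte node ID: 16 bytes in total.
        raw = bytearray(struct.pack("<QHH", clock_ms, counter, rand) + self._node)
        raw[6] &= 0x0F                    # assumed position: version nibble = 0
        raw[8] = (raw[8] & 0x3F) | 0x80   # assumed position: variant bits = 10
        return bytes(raw)


generator = SketchUUIDGenerator("hostname|pid|image-path")  # hypothetical node string
print(generator.next().hex())
```

As in the class comment, a shared instance keeps the counter meaningful, and a fresh instance should be created whenever the process (or image) starts.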

If we integrate this, I think we should replace the old generator and the use of the primitive/plugin. But that requires at least some support apart from me.

And although I think that we should integrate this generator and get rid of the plugin, I think there is probably an underlying problem here (why did the generator fail ?) that could be important to find.

Sven

On 08 Feb 2016, at 10:38, Henrik Johansen wrote:

On 08 Feb 2016, at 10:29 , Sven Van Caekenberghe wrote:

2) Misrepresenting the way the UUID was generated (a combination of node identifier + timestamp + random value, similar to type 3, but with differently sized/ordered fields) by marking it as being of type 4, which is defined to be a UUID consisting of random bytes. IOW, I think it should be marked as type 0 instead of 4, so that the 1 person in each country who might assume something about the implementation based on the type field won't later feel he's been duped when checking the generator.
OK, I certainly want to change the type. Thing is, I cannot find a reference to type 0 anywhere that I am looking (I mostly used https://en.wikipedia.org/wiki/Universally_unique_identifier). Where did you find a definition of type 0 ? Or would that be a way to say 'no specific type' then ?
My rationale was that it is currently unassigned, and the least likely number to be chosen as identifier by new versions of the standard. IOW, for those who care, it might raise a "hmm, this is strange, better check the source", upon which they will discover it is generated in a non-standard fashion (but can verify for themselves it is generated in a way still pretty much guaranteed to be unique), and the rest... well, they can (most probably) keep on living happily without ever seeing a collision.

Cheers,
Henry



