Agreed :)
But I was easy to convince :)

On 9/2/16 19:09, Sven Van Caekenberghe wrote:
On 09 Feb 2016, at 17:35, Ben Coman wrote:

On Tue, Feb 9, 2016 at 11:53 PM, stepharo wrote:

On 9/2/16 09:58, Sven Van Caekenberghe wrote:
I can do the integration too, but I need some people to say go ahead.
I vote for replacing everything, there is no need for a plugin.

Me too.
Fewer plugins!
This is a good general philosophy, but we should benchmark in-image
versus plugin so we can make an informed analytical decision on how
much performance we are willing to trade for the convenience of it
being all in-image, and whether we want to maintain a separate CI job
for keeping the plugin tested.  Overriding that is the need for a quick
fix so other development is not blocked, for which Guille's
solution seems sufficient for the code freeze of Pharo 5.
 From a comment in

===
Speedwise, in Pharo 5 Spur on my machine, the new implementation is just as 
fast as the primitive, or so it seems:

[ UUIDGenerator next ] bench. "'1,235,356 per second'"

[ UUID nilUUID primMakeUUID ] bench. "'1,237,213 per second'"
===

But even if it were slower (like half or quarter speed), there is a *HUGE* 
benefit to in-image code.

C code will always win, performance wise, but if that is all that counts, why 
do we program in Pharo at all?

Please read << Design Principles Behind Smalltalk >>

   http://www.cs.virginia.edu/~evans/cs655/readings/smalltalk.html

Principle 1:

Personal Mastery: If a system is to serve the creative spirit, it must be 
entirely comprehensible to a single individual.

Further on:

Operating System: An operating system is a collection of things that don't fit 
into a language. There shouldn't be one.

Not that this document is an axiom or anything, but it articulates well some 
very relevant principles.

For all these years, neither you nor I ever knew what took place in that silly 
plugin (1 method), while a confusing UUIDGenerator made some of us believe that 
it was used by, or was somehow identical to, the plugin (which was probably not 
true). Now you, me and 99% of all other Pharo developers can read clean code.

For me, Pharo is much more than the next scripting language that links to OS 
libraries. Pharo makes software (development) tangible, understandable in one 
single language/environment.

When was the last time you looked into the open source C code of any library of 
your OS, let alone the kernel itself or one of its drivers? Probably never. 
But in Pharo we can all look under the hood, everywhere, seamlessly.

Sven

cheers -ben


On 09 Feb 2016, at 09:25, Guille Polito wrote:

Sven, just to answer your last question. The UUID generation right now
generates the UUID fields like this:

UUIDGenerator>>generateFieldsVersion4

    timeLow := self generateRandomBitsOfLength: 32.
    timeMid := self generateRandomBitsOfLength: 16.
    timeHiAndVersion := 16r4000 bitOr: (self generateRandomBitsOfLength: 12).
    clockSeqHiAndReserved := 16r80 bitOr: (self generateRandomBitsOfLength: 6).
    clockSeqLow := self generateRandomBitsOfLength: 8.
    node := self generateRandomBitsOfLength: 48.

So... It's basically completely random. No part of the UUID is actually
based on the node, the clock or the time. Apart from the fixed version and
variant bits, it is a random string of bits generated using a number from
/dev/urandom as seed (on Linux).
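
To make the point above concrete, here is a minimal Playground sketch that mimics the field generation. It is not the actual UUIDGenerator code: the randomBits block is a hypothetical stand-in for #generateRandomBitsOfLength:, and it draws from Random rather than from /dev/urandom. It shows that only the version nibble and the two variant bits are fixed.

```smalltalk
"Sketch only: randomBits is a hypothetical stand-in for
 #generateRandomBitsOfLength:, backed by Random instead of /dev/urandom."
| random randomBits timeLow timeMid timeHiAndVersion
  clockSeqHiAndReserved clockSeqLow node |
random := Random new.
randomBits := [ :length | | value |
	value := 0.
	"Collect whole random bytes, then mask down to the requested length."
	(length + 7) // 8 timesRepeat: [
		value := (value bitShift: 8) bitOr: (random nextInt: 256) - 1 ].
	value bitAnd: (1 bitShift: length) - 1 ].

timeLow := randomBits value: 32.
timeMid := randomBits value: 16.
timeHiAndVersion := 16r4000 bitOr: (randomBits value: 12).
clockSeqHiAndReserved := 16r80 bitOr: (randomBits value: 6).
clockSeqLow := randomBits value: 8.
node := randomBits value: 48.

"The version nibble is always 4 and the variant bits are always binary 10."
(timeHiAndVersion bitShift: -12) = 4.        "true"
(clockSeqHiAndReserved bitShift: -6) = 2.    "true"

"Everything else is random: 122 of the 128 bits."
32 + 16 + 12 + 6 + 8 + 48.                   "122"
```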

Does the Mac VM include the plugin? (I do not have a Mac any more to test
fast ^^)

I'll work on the integration of NeoUUID now, I hope this is the kind of
issue that gets integrated during code freeze :)

Guille

On 02/08/2016 08:39 PM, Sven Van Caekenberghe wrote:
Here is a new version, in preparation of possible integration in the
main image:

===
Name: Neo-UUID-SvenVanCaekenberghe.2
Author: SvenVanCaekenberghe
Time: 8 February 2016, 8:33:04.141334 pm
UUID: a909453e-35dd-4c25-8273-62a9b2bd982e
Ancestors: Neo-UUID-SvenVanCaekenberghe.1

Streamline UUID generation

Add a current, shared instance

Added better class and method comments

Add more tests

As suggested by Henrik Johansen, change to a version 0 UUID to indicate
that we follow a custom approach
===

The class comment now reads as follows:

===
I am NeoUUIDGenerator, I generate UUIDs.

An RFC4122 Universally Unique Identifier (UUID) is an opaque 128-bit
number that can be used for identification purposes. Concretely, a UUID is a
16 element byte array.

The intent of UUIDs is to enable distributed systems to uniquely
identify information without significant central coordination. In this
context the word unique should be taken to mean "practically unique" rather
than "guaranteed unique".

I generate UUIDs similar, in spirit, to those defined in RFC4122,
though I use version 0 to indicate that I follow none of the defined
versions. This does not matter much, if at all, in practice.

I try to conform to the following aspects:
- each 'node' (machine, image, instance) should generate unique UUIDs
- even when generating UUIDs at a very fast rate, they should remain
unique
- be fast and efficient

To achieve this goal, I
- take several aspects into account to generate a unique node ID
- combine a clock, a counter and some random bits
- hold a state, protected for multi user access

I can generate about 500K UUIDs per second.

Implementation:

Although a UUID should be seen as totally opaque, here is the concrete
way I generate one:
- the first 8 bytes are the millisecond clock value with the smallest
quantity first; this means that the later of these 8 bytes will be identical
when generated within small(er) timespans; within the same millisecond, the
full first 8 bytes will be identical
- the next 2 bytes represent a counter with safe overflow, held as
protected state inside me; after 2^16 values this counter will repeat; the
counter initializes with a random value
- the next 2 bytes are simply random, based on the system PRNG, Random
- the final 4 bytes represent the node ID; the node ID is unique per
instance of me, across OS environments where the image might run; the node
ID is the MD5 hash of a string that is the concatenation of several elements
(see #computeNodeIdentifier)

Some bits are set to some predefined value, to indicate the variant
and version (see #setVariantAndVersion:)

Usage:

   NeoUUIDGenerator next.
   NeoUUIDGenerator current next.
   NeoUUIDGenerator new next.

Sharing an instance is more efficient and correct.
Instances should be reset whenever the image comes up.

See also:

   http://en.wikipedia.org/wiki/UUID
   https://tools.ietf.org/html/rfc4122
===
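
The byte layout described in the class comment above can be sketched in a Playground as follows. This is only an illustration of that description, not the NeoUUIDGenerator source. Several details are assumptions: the node string is made up, taking the first 4 bytes of the MD5 hash is a guess, and the version and variant bits are placed at the RFC 4122 positions (byte 7 and byte 9, counting from 1).

```smalltalk
"Sketch only, following the layout in the class comment. The node string,
 the choice of hash bytes and the bit positions are assumptions."
| bytes clock counter random nodeHash |
bytes := ByteArray new: 16.
random := Random new.

"Bytes 1-8: millisecond clock, least significant byte first."
clock := Time millisecondClockValue.
1 to: 8 do: [ :i |
	bytes at: i put: ((clock bitShift: -8 * (i - 1)) bitAnd: 16rFF) ].

"Bytes 9-10: a counter that starts at a random value and wraps after 2^16."
counter := (random nextInt: 65536) - 1.
counter := (counter + 1) \\ 65536.
bytes at: 9 put: (counter bitAnd: 16rFF).
bytes at: 10 put: (counter bitShift: -8).

"Bytes 11-12: plain random bits."
bytes at: 11 put: (random nextInt: 256) - 1.
bytes at: 12 put: (random nextInt: 256) - 1.

"Bytes 13-16: node ID, here the first 4 bytes of the MD5 hash of a made-up string."
nodeHash := MD5 hashMessage: 'hypothetical-host|hypothetical-image'.
bytes replaceFrom: 13 to: 16 with: nodeHash startingAt: 1.

"Overwrite a few bits: version 0 in the high nibble of byte 7,
 variant binary 10 in the top two bits of byte 9."
bytes at: 7 put: ((bytes at: 7) bitAnd: 16r0F).
bytes at: 9 put: (((bytes at: 9) bitAnd: 16r3F) bitOr: 16r80).

bytes hex.
```

Two UUIDs built this way within the same millisecond share their first 8 bytes and differ only in the counter and random bytes, which is why the counter has to be held as protected, shared state.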

If we integrate this, I think we should replace the old generator and
the use of the primitive/plugin. But that requires at least some support
apart from me.

And although I think that we should integrate this generator and get rid
of the plugin, I think there is probably an underlying problem here (why did
the generator fail?) that could be important to find.

Sven

On 08 Feb 2016, at 10:38, Henrik Johansen wrote:

On 08 Feb 2016, at 10:29, Sven Van Caekenberghe wrote:

2) Misrepresenting the way the UUID was generated (a combination of
node identifier + timestamp + random value, similar to type 3, but with
differently sized/ordered fields) by marking it as being of type 4, which is
defined to be a UUID consisting of random bytes.
IOW, I think it should be marked as type 0 instead of 4, so that the 1
person in each country who might assume something about the
implementation based on the type field won't later feel he's been duped
when checking the generator.

OK, I certainly want to change the type. Thing is, I cannot find a
reference to type 0 anywhere that I am looking (I mostly used
https://en.wikipedia.org/wiki/Universally_unique_identifier). Where did you
find a definition of type 0? Or would that be a way to say 'no specific
type' then?

My rationale was that it is currently unassigned, and the least likely
number to be chosen as identifier by new versions of the standard.
IOW, for those who care, it might raise a "hmm, this is strange, better
check the source", upon which they will discover it is generated in a
non-standard fashion (but can verify for themselves it is generated in a way
still pretty much guaranteed to be unique), and the rest... well, they can
(most probably) keep on living happily without ever seeing a collision.

Cheers,
Henry



