On Sun, Dec 11, 2016 at 7:49 PM, Henrik Nergaard <[email protected]> wrote:
> >The timing problem might indeed be resolution dependent: 100us is > probably to short sometimes. > Time microsecondClockValue gives the time in microseconds, but it does not > guarantee microsecond resolution, on windows the resolution is still in > milliseconds. > Useful to know. But regardless of the resolution, the units are still microseconds. So the majority of the time you'll assert: 3622 - 3622 < 100 which passes, but other times you'll cross the boundary and you'll assert: ...4622 - ...3622 < 100, which fails. The important question is, what resolution of time is important for UUIDs? If 10ms is sufficient, then ...4622 - ...3622 < 10000 easily passes on even the slowest computer.. > ---------- > | col | > > col := Array new: 30000 streamContents: [ :stream | > 1 to: 30000 do: [ :x | stream nextPut: Time microsecondClockValue ] > ]. > > col asSet "a Set(3658909536504622 3658909536503622 3658909536502622)"' > btw I forgot to mention I'm on 32-bit Linux. So thankfully I get... col asSet size "==>16135" btw2, For a challenge I'm lining up to improve the VM clock "sometime", maybe end of Feb (if it proves within my capability no one else gets to it first). https://github.com/OpenSmalltalk/opensmalltalk-vm/issues/36 > > -----Original Message----- > From: Pharo-dev [mailto:[email protected]] On Behalf Of > Sven Van Caekenberghe > Sent: Sunday, December 11, 2016 11:56 AM > To: Pharo Development List <[email protected]> > Subject: Re: [Pharo-dev] NeoUUIDGenerator>>testTwoSameGenerator - > intermittent CI failure > > Hi Ben, > > Thanks for looking into this, it was bothering me for quite some time ;-) > > https://pharo.fogbugz.com/f/cases/19453/NeoUUIDGenerator- > testTwoSameGenerator-intermittent-CI-failure > > Yes, I think the counter roll-over is the problem, stupid of me to to have > seen this. > > The timing problem might indeed be resolution dependent: 100us is probably > to short sometimes. > > I will provide a slice with fixes. > Cool. Ping me from the slice. cheers -ben > > Thanks again ! > > Regards, > > Sven > > > On 11 Dec 2016, at 09:07, Ben Coman <[email protected]> wrote: > > > > For Case [1] I got an unrelated CI failure [2] for > > NeoUUIDGeneratorTests>>#testTwoSameGenerator > > > > Checking this locally by running this a dozen times... > > 1 to: 1000 do: [:n| last:=n. NeoUUIDGeneratorTests new setUp > > testTwoSameGenerator ] > > > > Two distinct failures are apparent... > > > > 1. Takes longer than expected. > > self assert: (time2 - time1) abs < 100 > > last=1 ==> a time difference of 109 > > last=1 ==> a time difference of 102 > > last=1 ==> a time difference of 155 > > last=1 ==> a time difference of 105 > > Interesting that whenever this happens its on the first iteration. > > > > 2. Invalid consecutive ids. > > self assert: (uuid1 at: 10) + 1 equals: (uuid2 at: 10). > > last=127 ==> TestFailure: Got 256 instead of 0. > > last=828 ==> TestFailure: Got 256 instead of 0. > > last=128 ==> TestFailure: Got 256 instead of 0. > > last=316 ==> TestFailure: Got 256 instead of 0. > > Maybe a problem with rollover? > > > > So instrumenting a bit: > > testTwoSameGenerator: n > > | uuid1 uuid2 consecutiveOk time1 time2 timeOk | > > .... > > "same generator, consequetive counters" > > consecutiveOk := ((uuid1 at: 10) + 1) = (uuid2 at: 10). > > "same generators, similar clock, within 100 milliseconds" > > time1 := self timeFromUUID: uuid1. > > time2 := self timeFromUUID: uuid2. > > timeOk := (time2 - time1) abs < 100. > > ( timeOk and: [ consecutiveOk ]) ifFalse: [ > > Transcript cr; > > show: 'fail n=', (n printPaddedWith: $0 to: 3) ; tab; > > show: ' time ', ((time2 - time1) printString) , '<100'; tab; > > show: ' uuid ', (((uuid1 at: 10) + 1) printString, '=', > (uuid2 at: 10) printString). > > ]. > > self assert: consecutiveOk. > > self assert: timeOk. > > > > From playground running... > > 1 to: 1000 do: [:n| > > NeoUUIDGeneratorTests new setUp testTwoSameGenerator: n]. > > Transcript crShow: 'x'. > > > > Note the 'x' was not output due to one of the consecutiveOk or timeOk > asserts triggered each run. > > fail n=001 time 104<100 uuid 160=160 > > fail n=269 time 2<100 uuid 256=0 > > fail n=001 time 152<100 uuid 166=166 > > fail n=117 time 3<100 uuid 256=0 > > fail n=001 time 123<100 uuid 128=128 > > fail n=278 time 3<100 uuid 256=0 > > fail n=001 time 145<100 uuid 230=230 > > fail n=673 time 1340<100 uuid 141=141 > > fail n=707 time 3<100 uuid 256=0 > > fail n=001 time 132<100 uuid 73=73 > > fail n=001 time 117<100 uuid 206=206 > > fail n=063 time 2<100 uuid 256=0 > > fail n=345 time 3<100 uuid 256=0 > > > > Now a strange thing is that after commenting out those two asserts for > consecutiveOk and timeOk, the fails at n=001 stop happening. > > > > fail n=001 time 141<100 uuid 232=232 > > x > > x > > fail n=201 time 3<100 uuid 256=0 > > fail n=268 time 3<100 uuid 256=0 > > fail n=359 time 3<100 uuid 256=0 > > fail n=408 time 2<100 uuid 256=0 > > fail n=559 time 3<100 uuid 256=0 > > x > > fail n=139 time 3<100 uuid 256=0 > > fail n=854 time 7<100 uuid 256=0 > > x > > fail n=389 time 3<100 uuid 256=0 > > fail n=669 time 3<100 uuid 256=0 > > fail n=876 time 3<100 uuid 256=0 > > fail n=916 time 3<100 uuid 256=0 > > fail n=953 time 2<100 uuid 256=0 > > fail n=954 time 3<100 uuid 256=0 > > x > > fail n=148 time 3<100 uuid 256=0 > > fail n=741 time 3<100 uuid 256=0 > > x > > fail n=116 time 2<100 uuid 256=0 > > fail n=480 time 3<100 uuid 256=0 > > x > > > > After treating the rollover like this... > > consecutiveOk := ((uuid1 at: 10) + 1) \\ 256 = (uuid2 at: 10). > > running the last playground script gives... > > > > (Note the n=001 fail reoccurs after waiting a while between running the > playground script again.) > > fail n=001 time 123<100 uuid 110=110 > > x > > fail n=345 time 1287<100 uuid 24=24 > > fail n=404 time 1313<100 uuid 125=125 > > x > > x > > x > > x > > x > > fail n=449 time 1290<100 uuid 24=24 > > x > > x > > x > > x > > x > > > > > > Thats looking better, so upping to 10000 iterations... > > fail n=0001 time 151<100 uuid 160=160 > > fail n=2977 time 1437<100 uuid 221=221 > > fail n=5498 time 1454<100 uuid 224=224 > > fail n=9007 time 1351<100 uuid 239=239 > > x > > fail n=3389 time 1381<100 uuid 223=223 > > fail n=7768 time 1365<100 uuid 82=82 > > fail n=9773 time 1395<100 uuid 164=164 > > x > > fail n=3325 time 1448<100 uuid 67=67 > > fail n=9930 time 1456<100 uuid 103=103 > > x > > fail n=0060 time 2914<100 uuid 159=159 > > fail n=1829 time 1438<100 uuid 32=32 > > fail n=7918 time 1436<100 uuid 182=182 > > x > > > > So the rollover looks cured. > > > > Now adding some more instrumentation to view GC Time and Time To Run > > and (see attached ST file) > > > > 0:00:00:00 TTR 0:00:00:00.001 TTR 0:00:00:00 TTR 0:00:00:00 > TTR 0:00:00:00 TTR 0:00:00:00 TTR 0:00:00:00 TTR 0:00:00:00 TTR > 0:00:00:00 TTR 0:00:00:00 > > x > > fail n=0001 time 115<100 uuid 157=157 GCTime 0 TTR 0:00:00:00 > > fail n=0001 time 2<100 uuid 255=0 GCTime 0 TTR 0:00:00:00 > > x > > fail n=0001 time 111<100 uuid 104=104 GCTime 0 TTR 0 > > x > > fail n=0001 time 120<100 uuid 204=204 GCTime 0 TTR 0 > > fail n=0001 time 1182<100 uuid 103=103 GCTime 1 TTR 1 > > x > > x > > x > > x > > x > > fail n=0001 time 3<100 uuid 255=0 GCTime 0 TTR 0 > > x > > x > > fail n=0001 time 1251<100 uuid 253=253 GCTime 2 TTR 1 > > x > > > > > > Now I wonder at the class comment saying... > > "the first 8 bytes are the *millisecond* clock value" > > but #clockOn: is using *microsecond* clock. > > Perhaps "self assert: (time2 - time1) abs < 100" > > is meant to be 100ms not 100us ?? > > > > cheers -ben > > > > [1] https://pharo.fogbugz.com/f/cases/19438 > > [2] > > https://ci.inria.fr/pharo/job/Pharo-6.0-Issue-Validator/19471//artifac > > t/validationReport.html > > <NeoUUIDGeneratorTests-testTwoSameGenerator.st> > > > >
