Re: [Pharo-dev] [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Sven Van Caekenberghe Wed, 06 Apr 2016 06:15:07 -0700

Henrik,

> On 06 Apr 2016, at 14:55, Henrik Nergaard <[email protected]> wrote:
> 
> Hi Sven,
> 
> Instead of: 
> 
> stonProcessSubObjects: block
>       | didContainStonReferenceAsKey |
>       didContainStonReferenceAsKey := self stonProcessSubObjects: block
> 
>       super stonProcessSubObjects: block.
>       
>       self isHealthy ifFalse: [ self rehash ]..
>       super stonProcessSubObjects: block.
>       didContainStonReferenceAsKey
>               ifTrue: [ self rehash ]
> 
> 
> One can do:
> 
> stonProcessSubObjects: block
> 
>    super stonProcessSubObjects: block.
> 
>     self isHealthy ifFalse: [ self rehash ].
> 
> 
> And this could also be moved to HashedCollection as a generic solution for 
> any hash based collection. Moving it there would require implementing 
> #isHealthy as a subclass responsibility. (all current subclasses (Set, 
> Dictionary) implements it). 
> 
> What do you think?


First, thanks a lot for having a look.

Now, I did spend a lot of time thinking about the best approach, and your 
suggestion was one of the solutions that I considered (and it certainly is the 
most elegant one).

My goal is to make STON both correct and as fast as possible. When 
materialising, there are 2 phases, the second being the resolution of 
references. That is a separate phase because doing it directly 'in place' would 
result in cycles.

Consider the first optimisation one level up:

STONReader>>next
  | object |
  self consumeWhitespace.
  object := self parseValue.
  unresolvedReferences > 0
    ifTrue: [ self processSubObjectsOf: object ].
  ^ object 

Unless there really are unresolved references, it skips the second phase 
completely.

Next, all Dictionaries and Sets are always freshly created and in principle 
(99% of the time) correct and #rehash is expensive (it rebuilds them again). We 
have to try to avoid this.

IMO, #isHealthy is more expensive than the manual tests that I added 
(#containsStonReferenceAsKey and #containsStonReference). Why ? Because 
#isHealthy always checks all elements, and checks all their hashes, while my 
tests back out immediately and don't do any hashing. Would you agree ?

Still, this is all a balancing act where the pro's and con's have to be 
carefully considered. Thanks for thinking along.

Sven

> Best regards,
> Henrik
> 
> 
> -----Original Message-----
> From: Pharo-dev [mailto:[email protected]] On Behalf Of Sven 
> Van Caekenberghe
> Sent: Wednesday, April 6, 2016 2:27 PM
> To: Pharo Development List <[email protected]>
> Subject: Re: [Pharo-dev] [Pharo-users] STON materialization corrupts a 
> dictionary if the keys are references
> 
> Fix for review:
> 
> ===
> Name: STON-Core-SvenVanCaekenberghe.71
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:24.782251 pm
> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
> Ancestors: STON-Core-SvenVanCaekenberghe.70
> 
> Fix a bug where STONReferences occurring as keys in Dictionaries or elements 
> in Sets caused those to be unhealthy after materialization. Thx to Peter 
> Uhnák for reporting this issue.
> 
> Add 3 new unit tests to STONReaderTests
> 
> #testDictionaryWithReferenceKeys
> #testSetWithReferenceElements
> #testDeepStructure
> 
> Fix Details
> 
> change the implementation of STONReader>>#processSubObjectsOf: from iterative 
> to recursive (see version 39 of 29 November 2012, this might be a functional 
> regression, see #testDeepStructure; cleanup of stack instance variable for 
> later) so that #stonProcessSubObjects: can be overwritten with code being 
> executed before or after full reference resolution
> 
> imho, recursion stack depth will be equal during both writing and reading, 
> and should be acceptable.
> 
> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the 
> end, but only when needed (minimal optimalization, see 
> Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference) ===
> Name: STON-Tests-SvenVanCaekenberghe.63
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:45.01986 pm
> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
> Ancestors: STON-Tests-SvenVanCaekenberghe.62
> 
> (idem)
> ===
> 
>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[email protected]> wrote:
>> 
>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Di
>> ctionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>> 
>> fix coming
>> 
>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[email protected]> wrote:
>>> 
>>>> 
>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[email protected]> wrote:
>>>> 
>>>> 
>>>> 
>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque 
>>>> <[email protected]>:
>>>> 
>>>> 
>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>> 
>>>>> Like I said, it is a hashing issue, sometimes it will be correct by 
>>>>> accident.
>>>>> 
>>>>> I hope you did not have to much trouble with this bug, I guess it must 
>>>>> have been hard to chase.
>>>>> 
>>>>> Is it urgent ?
>>>>> 
>>>>> I probably can give you a quick fix, but I would like to think a bit more 
>>>>> about this, since rehashing each materialised dictionary seems expensive.
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>>> Hi Sven,
>>>> 
>>>> I got the same kind of problem in a personal application.
>>>> 
>>>> I use Sets that I serialize and I had a lot of trouble because 
>>>> sometimes some action had strange behaviours.
>>>> 
>>>> For example in a set with element `aSet remove: aSet anyOne` raised 
>>>> 'XXX not found in aSet'.
>>>> 
>>>> I am glad to hear that it is a Ston issue and not me that used sets 
>>>> in a bad way :)
>>>> 
>>>> For me too it is not urgent since I have a not of university work 
>>>> for the moment.
>>>> 
>>>> 
>>>> 
>>>> How are hashed collections created/filled during ston-parsing ?
>>>> If the position in a hashed collection is created by a 
>>>> ston-reference, that is later replaced by the "real" object, the index in 
>>>> the dictionary  (or other hashed collections) may be wrong.
>>> 
>>> Yes, that is indeed it, Nicolai. 
>>> 
>>> But I would like to try to minimise the rehashing as it seems expensive. 
>>> But first I need a more reliable failure.
>>> 
>>>> --
>>>> Cyril Ferlicot
>>>> 
>>>> http://www.synectique.eu
>>>> 
>>>> 165 Avenue Bretagne
>>>> Lille 59000 France
>> 
> 
>

Re: [Pharo-dev] [Pharo-users] STON materialization corrupts a dictionary if the keys are references

Reply via email to