Henrik,
> On 06 Apr 2016, at 14:55, Henrik Nergaard <[email protected]> wrote:
>
> Hi Sven,
>
> Instead of:
>
> stonProcessSubObjects: block
> | didContainStonReferenceAsKey |
> didContainStonReferenceAsKey := self stonProcessSubObjects: block
>
> super stonProcessSubObjects: block.
>
> self isHealthy ifFalse: [ self rehash ]..
> super stonProcessSubObjects: block.
> didContainStonReferenceAsKey
> ifTrue: [ self rehash ]
>
>
> One can do:
>
> stonProcessSubObjects: block
>
> super stonProcessSubObjects: block.
>
> self isHealthy ifFalse: [ self rehash ].
>
>
> And this could also be moved to HashedCollection as a generic solution for
> any hash based collection. Moving it there would require implementing
> #isHealthy as a subclass responsibility. (all current subclasses (Set,
> Dictionary) implements it).
>
> What do you think?
First, thanks a lot for having a look.
Now, I did spend a lot of time thinking about the best approach, and your
suggestion was one of the solutions that I considered (and it certainly is the
most elegant one).
My goal is to make STON both correct and as fast as possible. When
materialising, there are 2 phases, the second being the resolution of
references. That is a separate phase because doing it directly 'in place' would
result in cycles.
Consider the first optimisation one level up:
STONReader>>next
| object |
self consumeWhitespace.
object := self parseValue.
unresolvedReferences > 0
ifTrue: [ self processSubObjectsOf: object ].
^ object
Unless there really are unresolved references, it skips the second phase
completely.
Next, all Dictionaries and Sets are always freshly created and in principle
(99% of the time) correct and #rehash is expensive (it rebuilds them again). We
have to try to avoid this.
IMO, #isHealthy is more expensive than the manual tests that I added
(#containsStonReferenceAsKey and #containsStonReference). Why ? Because
#isHealthy always checks all elements, and checks all their hashes, while my
tests back out immediately and don't do any hashing. Would you agree ?
Still, this is all a balancing act where the pro's and con's have to be
carefully considered. Thanks for thinking along.
Sven
> Best regards,
> Henrik
>
>
> -----Original Message-----
> From: Pharo-dev [mailto:[email protected]] On Behalf Of Sven
> Van Caekenberghe
> Sent: Wednesday, April 6, 2016 2:27 PM
> To: Pharo Development List <[email protected]>
> Subject: Re: [Pharo-dev] [Pharo-users] STON materialization corrupts a
> dictionary if the keys are references
>
> Fix for review:
>
> ===
> Name: STON-Core-SvenVanCaekenberghe.71
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:24.782251 pm
> UUID: 64b8b741-365e-41fe-aa98-565e33ca5d24
> Ancestors: STON-Core-SvenVanCaekenberghe.70
>
> Fix a bug where STONReferences occurring as keys in Dictionaries or elements
> in Sets caused those to be unhealthy after materialization. Thx to Peter
> Uhnák for reporting this issue.
>
> Add 3 new unit tests to STONReaderTests
>
> #testDictionaryWithReferenceKeys
> #testSetWithReferenceElements
> #testDeepStructure
>
> Fix Details
>
> change the implementation of STONReader>>#processSubObjectsOf: from iterative
> to recursive (see version 39 of 29 November 2012, this might be a functional
> regression, see #testDeepStructure; cleanup of stack instance variable for
> later) so that #stonProcessSubObjects: can be overwritten with code being
> executed before or after full reference resolution
>
> imho, recursion stack depth will be equal during both writing and reading,
> and should be acceptable.
>
> overwrite #stonProcessSubObjects: in Dictionary and Set to #rehash at the
> end, but only when needed (minimal optimalization, see
> Dictionary>>#containsStonReferenceAsKey and Set>>#containsStonReference) ===
> Name: STON-Tests-SvenVanCaekenberghe.63
> Author: SvenVanCaekenberghe
> Time: 6 April 2016, 2:22:45.01986 pm
> UUID: 0beb2322-b81a-46ee-a0e2-6648a808774a
> Ancestors: STON-Tests-SvenVanCaekenberghe.62
>
> (idem)
> ===
>
>> On 06 Apr 2016, at 14:04, Sven Van Caekenberghe <[email protected]> wrote:
>>
>> https://pharo.fogbugz.com/f/cases/17946/STON-materializes-unhealthy-Di
>> ctionaries-and-Sets-when-references-occur-in-its-keys-or-elements
>>
>> fix coming
>>
>>> On 05 Apr 2016, at 13:11, Sven Van Caekenberghe <[email protected]> wrote:
>>>
>>>>
>>>> On 05 Apr 2016, at 13:02, Nicolai Hess <[email protected]> wrote:
>>>>
>>>>
>>>>
>>>> 2016-04-05 12:32 GMT+02:00 Cyril Ferlicot Delbecque
>>>> <[email protected]>:
>>>>
>>>>
>>>> On 05/04/2016 12:09, Sven Van Caekenberghe wrote:
>>>>
>>>>> Like I said, it is a hashing issue, sometimes it will be correct by
>>>>> accident.
>>>>>
>>>>> I hope you did not have to much trouble with this bug, I guess it must
>>>>> have been hard to chase.
>>>>>
>>>>> Is it urgent ?
>>>>>
>>>>> I probably can give you a quick fix, but I would like to think a bit more
>>>>> about this, since rehashing each materialised dictionary seems expensive.
>>>>>
>>>>>
>>>>>
>>>>
>>>> Hi Sven,
>>>>
>>>> I got the same kind of problem in a personal application.
>>>>
>>>> I use Sets that I serialize and I had a lot of trouble because
>>>> sometimes some action had strange behaviours.
>>>>
>>>> For example in a set with element `aSet remove: aSet anyOne` raised
>>>> 'XXX not found in aSet'.
>>>>
>>>> I am glad to hear that it is a Ston issue and not me that used sets
>>>> in a bad way :)
>>>>
>>>> For me too it is not urgent since I have a not of university work
>>>> for the moment.
>>>>
>>>>
>>>>
>>>> How are hashed collections created/filled during ston-parsing ?
>>>> If the position in a hashed collection is created by a
>>>> ston-reference, that is later replaced by the "real" object, the index in
>>>> the dictionary (or other hashed collections) may be wrong.
>>>
>>> Yes, that is indeed it, Nicolai.
>>>
>>> But I would like to try to minimise the rehashing as it seems expensive.
>>> But first I need a more reliable failure.
>>>
>>>> --
>>>> Cyril Ferlicot
>>>>
>>>> http://www.synectique.eu
>>>>
>>>> 165 Avenue Bretagne
>>>> Lille 59000 France
>>
>
>