[Pharo-dev] Re: Equality/Hash Modification and Image Rehashing

2022-08-30 Thread Sven Van Caekenberghe
A use case in the system is #stonPostReferenceResolution for Dictionary and 
Set. You could have a look at that (it is an edge case, but still).
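A minimal sketch of the idea behind that hook (not STON's actual code): once forward references have been resolved during materialization, the hashes of a hashed collection's elements may no longer match the buckets they were inserted under, so the collection rebuilds itself.

```smalltalk
"Hedged sketch, not the real STON implementation: a hashed collection
reacts to post-materialization reference resolution by rebuilding its
internal buckets, since element hashes may have changed."
Set >> stonPostReferenceResolution: resolvedReferences
	self rehash
```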

> On 30 Aug 2022, at 19:00, s...@clipperadams.com wrote:
> 
> From Pharo Discord:
> 
> IIUC when I change the comparison behavior of a class (i.e. reimplement #= 
> and #hash), sets containing those objects start acting oddly e.g. happily 
> storing duplicates of equal objects and failing to find objects they contain. 
> A solution seems to be Set allSubInstances do: #rehash.
> 
> Is my understanding correct? If so, I find it odd that I've been using and 
> researching Smalltalk for well over a decade and I never remember 
> reading/hearing about this. I wonder if it can be baked into the IDE somehow 
> e.g. via a critic...
> 
> And stephan responded:
> 
> The mailing list posts about that I remember were at the time where the 
> number of object bits used for the hash were increased.
> 
> Following up on his clue, I found this old post, which seems to confirm my 
> suspicion. This seems like a serious trap for users to fall into (I’m having 
> flashbacks to my C++ days lol). So, can/should we somehow handle this in the 
> IDE? Something like a refactoring (that has a confirmation window) that gives 
> you the choice to rehash image collections when redefining these methods?
> 
> In a 1GB image, `Set allSubInstances do: #rehash` is almost instantaneous and 
> `Dictionary allSubInstances do: #rehash` takes several seconds.
> 


[Pharo-dev] Re: Equality/Hash Modification and Image Rehashing

2022-08-30 Thread Daniel Slomovits
Everything you say is true, but I don't consider it nearly as serious a
trap as you say, because I wouldn't generally expect a "user" (that is, a
developer not explicitly working on Pharo itself) to modify the #=/#hash
implementation of a system class (that is, one included in the base image
and instances of which may be expected to exist at all times). A user
working on their own application and modifying a #=/#hash implementation
there has a very straightforward way to resolve the problem—"reboot" their
own application, tearing down and rebuilding any structures it uses. For
the applications I work on this is very easy, because it needs to be in
order for tests to set up a clean environment each run.
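The "reboot" fix described above amounts to rebuilding any hashed collection from its elements, which re-inserts them under the redefined #hash (the application and accessor names here are illustrative):

```smalltalk
"Rebuilding is equivalent to rehashing: each element is re-inserted
into a bucket computed with the current #hash."
| stale fresh |
stale := myApplication cachedItems.	"a Set built before #=/#hash changed"
fresh := stale species withAll: stale asArray.
myApplication cachedItems: fresh
```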

For a developer who *is* explicitly working on Pharo, I think appropriate
comments in the #=/#hash implementations of classes critical to the
functioning of the image (most obviously the default implementation in
[Proto]Object) would be enough (and may already exist). I know in Dolphin
Smalltalk there are comments in the #identityHash methods explaining the
related point that a binary-filed collection may need to be rehashed on
load as its contents will not have the same identities and thus
#identityHash-es. Dolphin also changed the number of hash bits similar to
what stephan mentioned, and this definitely involved jumping through some
hoops to perform safely.

More generally, I think this is something anyone with a solid computer
science background and/or experience working on language implementations
would be expected to pick up by osmosis and/or reasoning from first
principles. I don't mean this as an insult to people who just want to write
cool apps, but those are exactly the people who are unlikely to encounter
the problem, or if they do, will shrug, restart everything, and the problem
will go away. That said, there's certainly nothing wrong with some good
comments.

The thing is that the `Set/Dictionary allSubInstances do: #rehash` approach
is extremely heavy-handed. I suppose it should be largely safe, though to
be certain it would need to be done at very high priority to avoid the
possibility of concurrent modification (and even then there is a
theoretical window of vulnerability). But the cost/benefit analysis feels
pretty marginal IMO for making it a part of the IDE. One refinement, if it
were to be added—if there are no instances of the class which is being
modified, no rehashing can possibly be needed, and in fact in most cases I
would expect this to be true. One could additionally scan Sets for
instances of that class before rehashing them, which might not actually be
vastly more performant but would reduce the heavy-handedness of the
operation.
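That refinement could be sketched like this (the class name is hypothetical; #instanceCount and #anySatisfy: do exist in Pharo, but treat the whole thing as a sketch, not a vetted IDE feature):

```smalltalk
"Rehash only where it can matter: skip everything if the changed class
has no instances, and skip sets containing none of its instances."
| changedClass |
changedClass := MyDomainObject.	"the class whose #=/#hash was redefined"
changedClass instanceCount > 0 ifTrue: [
	Set allSubInstances do: [ :set |
		(set anySatisfy: [ :each | each isKindOf: changedClass ])
			ifTrue: [ set rehash ] ] ]
```

Dictionaries would need the same treatment over their keys, since it is the key hashes that locate entries.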

On Tue, Aug 30, 2022 at 1:01 PM  wrote:

> From Pharo Discord:
>
> IIUC when I change the comparison behavior of a class (i.e. reimplement #=
> and #hash), sets containing those objects start acting oddly e.g. happily
> storing duplicates of equal objects and failing to find objects they
> contain. A solution seems to be Set allSubInstances do: #rehash.
>
> Is my understanding correct? If so, I find it odd that I've been using and
> researching Smalltalk for well over a decade and I never remember
> reading/hearing about this. I wonder if it can be baked into the IDE
> somehow e.g. via a critic...
>
> And stephan responded:
>
> The mailing list posts about that I remember were at the time where the
> number of object bits used for the hash were increased.
>
> --
>
> Following up on his clue, I found this old post
> <https://lists.pharo.org/empathy/thread/LKCE5UXYPXPLVOGNZYW64BWHYJY5NMHQ?hash=LKCE5UXYPXPLVOGNZYW64BWHYJY5NMHQ#LKCE5UXYPXPLVOGNZYW64BWHYJY5NMHQ>,
> which seems to confirm my suspicion. This seems like a serious trap for
> users to fall into (I’m having flashbacks to my C++ days lol). So,
> can/should we somehow handle this in the IDE? Something like a refactoring
> (that has a confirmation window) that gives you the choice to rehash image
> collections when redefining these methods?
>
> In a 1GB image, `Set allSubInstances do: #rehash` is almost instantaneous
> and `Dictionary allSubInstances do: #rehash` takes several seconds.
>


[Pharo-dev] Equality/Hash Modification and Image Rehashing

2022-08-30 Thread sean
From Pharo Discord:

IIUC when I change the comparison behavior of a class (i.e. reimplement #= and 
#hash), sets containing those objects start acting oddly e.g. happily storing 
duplicates of equal objects and failing to find objects they contain. A 
solution seems to be Set allSubInstances do: #rehash.

Is my understanding correct? If so, I find it odd that I've been using and 
researching Smalltalk for well over a decade and I never remember 
reading/hearing about this. I wonder if it can be baked into the IDE somehow 
e.g. via a critic...

And stephan responded:

> The mailing list posts about that I remember were at the time where the 
> number of object bits used for the hash were increased.

---

Following up on his clue, I found [this old post](https://lists.pharo.org/empathy/thread/LKCE5UXYPXPLVOGNZYW64BWHYJY5NMHQ?hash=LKCE5UXYPXPLVOGNZYW64BWHYJY5NMHQ#LKCE5UXYPXPLVOGNZYW64BWHYJY5NMHQ "this old post"), which seems to confirm my suspicion. This seems like a 
serious trap for users to fall into (I’m having flashbacks to my C++ days lol). 
So, can/should we somehow handle this in the IDE? Something like a refactoring 
(that has a confirmation window) that gives you the choice to rehash image 
collections when redefining these methods? 

In a 1GB image, `Set allSubInstances do: #rehash` is almost instantaneous and 
`Dictionary allSubInstances do: #rehash` takes several seconds.
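The misbehavior described above can be demonstrated with a hypothetical class (sketch only; `Badge` and `named:` are made up):

```smalltalk
"A Set keeps elements in buckets chosen by the hash at insertion time.
After #=/#hash are redefined, lookups use the new hash and miss them."
| badges |
badges := Set with: (Badge named: 'pharo').
"... redefine Badge>>#= and Badge>>#hash to compare by name ..."
badges includes: (Badge named: 'pharo').	"may now answer false"
badges add: (Badge named: 'pharo').	"an equal duplicate can slip in"
badges rehash	"restores the invariant"
```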


[Pharo-dev] Re: Platform Packages vs. Git Branches

2022-08-30 Thread Max Leske
On 29 Aug 2022, at 18:25, Dale Henrichs wrote:

> Max,
>
> Yeah, your approach looks manageable ... with a well defined common code
> base, both branch per platform (fuel) and platform packages (Grease) can be
> used ...
>
> I looked at your github action code and see that you've got a solution for
> triggering multi-branch runs ... workflow per branch/platform ... so I will
> HAVE TO CONSIDER the branch per platform approach again :)

Yeah, not perfect, but it works (reusability isn't a focus of GitHub 
workflows...).

>
> Dale
>
> On Mon, Aug 29, 2022 at 8:53 AM Max Leske  wrote:
>
>> Hi Sean,
>>
>> At some point you will probably come to the same point that Seaside and
>> Fuel have reached in the past: build your own abstraction layer (Grease in
>> Seaside, Fuel-Platform in Fuel). There are valid arguments against this
>> approach but, personally, I didn't have another choice for Fuel.
>>
>> Fuel setup:
>>
>>- A Fuel-Platform-Core package
>>- 1 Fuel-Platform package per dialect and version
>>- development branch, old "release" branch, current "release" branch
>>for Fuel-Platform
>>- 1 Git branch for Fuel per version
>>- some magic in the baselines to get Metacello to load what I want
>>
>> I used to have the same multiple package approach that you mentioned but
>> it fell apart when I had too many packages with too many differences.
>>
>> Cheers,
>> Max
>>
>> On 25 Aug 2022, at 16:16, s...@clipperadams.com wrote:
>>
>> I’ve always supported multiple platforms (e.g. different Pharo versions)
>> via packages like MyProject-Platform-Pharo9. Thinking back, the primary
>> reason is that that’s how I saw it done by other projects. Also, I adopted the
>> practice well before git was in wide use in the Pharo world.
>> However, Jan Bliznicenko recently suggested an alternate workflow which
>> sounds like how Pharo itself is managed: use git branches, with the primary
>> branch supporting only latest Pharo, and other branches only getting
>> critical bug fixes backported.
>> Not sure how that would work for projects that support other dialects e.g.
>> Gemstone and Squeak, since there would then be multiple “latest versions”.
>> I’m interested in opinions about these options as I feel that Magritte is
>> an important community asset and want to keep it compatible on as many
>> platforms as possible (with as little work as possible)! I also get the
>> feeling that many people keep ancient systems in production, and I wonder
>> which they would prefer - a project that is stable on a Pharo version (more
>> or less) when that version is released or having the latest commits,
>> especially bug fixes.
>> Thoughts?
>>
>>
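The "magic in the baselines" mentioned above presumably selects the right platform package per dialect; a hedged Metacello sketch (the platform attributes are standard Metacello, but the package split shown is illustrative, not Fuel's actual baseline):

```smalltalk
"Conditional package selection in a Metacello baseline: each dialect
loads the common core plus only its own Fuel-Platform package."
baseline: spec
	<baseline>
	spec for: #common do: [
		spec package: 'Fuel-Platform-Core' ].
	spec for: #pharo do: [
		spec package: 'Fuel-Platform-Pharo' ].
	spec for: #gemstone do: [
		spec package: 'Fuel-Platform-GemStone' ]
```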


