Hi David, Thanks very much for the explanation.
My motivation for hash-consing is as follows. I want to add an id (an integer) that is (at least) unique for every distinct value of certain types in the HOL4 kernel. (The motivation for these ids is on-the-fly proof-recording by emitting a trace of kernel operations.) Hash-consing would allow me to re-use ids for new constructions of values that are equal to existing values. >From your explanation, I understand that Poly/ML's garbage collector is already effectively doing hash-consing via its "sharing" phase. However, if I add ids to my values, they will no longer be equal from the garbage collector's perspective and I will lose the benefits of sharing. (An alternative perspective: if I add ids, I will not know when to re-use the same id because the garbage collector doesn't expose information about whether a structurally equal (modulo the id) value is reachable already.) So this leaves me with two avenues to pursue: 1. Would it be easy to add an alternative interface to the garbage collector wherein weak pointers can point to immutable data? (So I could use that with my own hash-consing, given that I don't care about being precise about discovering whether some object is reachable or not.) 2. Or, could Poly/ML expose some primitives by which I could add unique ids to certain objects (or simply read existing ids if you already have them) while benefiting from the garbage collector's own hash-consing/sharing implementation? Thanks, Ramana On 7 October 2015 at 02:31, David Matthews <[email protected]> wrote: > Hi Ramana, > The reason for only allowing a weak reference to point to a reference has > to do with the notion of equality in ML and the way the Poly/ML system and > the garbage collector in particular deal with that. A reference has an > identity and so it is possible to say whether a particular reference is > pointed at from anywhere else. Equality for references is defined as > pointer equality and Poly/ML respects that. > > Equality for immutable data is by value and in the absence of > PolyML.pointerEq it is not possible to distinguish two pointers to the same > immutable data structure from two pointers to different structures with the > same contents. The garbage collector exploits this both by allowing some > duplication and also by compacting equal data structures when memory is > short. All this means that if a weak pointer could point to an immutable > data structure it would not be possible to give any assurance as to when > the weak pointer would be modified by the garbage collector. That may not > matter if the object is to implement some sort of speed-up but it wouldn't > be acceptable if the idea was to discover whether some other entity was > reachable or not. > > I have to ask what your motivation behind implementing hash-consing is. > The garbage collector already implements a "sharing" phase if memory is > really short and this will effectively hash-cons all the shareable data. > > Regards, > David > > > On 04/10/2015 09:27, Ramana Kumar wrote: > >> Hello, >> >> I am interested in using Poly/ML's Weak structure ( >> http://polyml.org/documentation/Reference/Weak.html) for hash-consing. >> The >> documentation would perhaps suggest that this is not an intended use of >> Weak, so I'd like to ask the list whether it is suitable and, if not, >> whether there are alternatives. >> >> An abstract of the problem is as follows. For a certain datatype, t, we >> want there to be at most one copy of any value of type t created by >> applying a constructor to unique arguments. To achieve this, we implement >> "hash-consing" constructors for t, which check whether the desired value >> exists already and return it if so, and only otherwise create a new value. >> To be able to check which values exist already, we maintain a hash table >> of >> all existing values. To allow values to be garbage collected when they are >> no longer needed, we make the hash table store only weak references to the >> values. >> >> The Weak structure allows weak references to be made to other references, >> but not to other non-reference values. Thus, the values in the hash table >> can be of type (t ref option ref), as produced by Weak.weak, but not of >> type (t option ref). Thus, on the face of it, it seems like using >> Poly/ML's >> Weak implementation will require unnecessary references and indirections. >> Is that indeed the case, or is Poly/ML smart about avoiding unnecessary >> indirections somehow? >> >> For hash-consing, we are not using Weak in the way described in its >> documentation, wherein the contents of the reference don't matter and the >> reference itself is only used as a token to track an external resource. Is >> there, perhaps, an alternative interface to Poly/ML's garbage collector >> more suitable for hash-consing? >> >> Cheers, >> Ramana >> >> >> >> _______________________________________________ >> polyml mailing list >> [email protected] >> http://lists.inf.ed.ac.uk/mailman/listinfo/polyml >> >> _______________________________________________ > polyml mailing list > [email protected] > http://lists.inf.ed.ac.uk/mailman/listinfo/polyml >
_______________________________________________ polyml mailing list [email protected] http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
