Hi David,

Thanks very much for the explanation.

My motivation for hash-consing is as follows. I want to add an id (an
integer) that is (at least) unique for every distinct value of certain
types in the HOL4 kernel. (The motivation for these ids is on-the-fly
proof-recording by emitting a trace of kernel operations.) Hash-consing
would allow me to re-use ids for new constructions of values that are equal
to existing values.

>From your explanation, I understand that Poly/ML's garbage collector is
already effectively doing hash-consing via its "sharing" phase. However, if
I add ids to my values, they will no longer be equal from the garbage
collector's perspective and I will lose the benefits of sharing. (An
alternative perspective: if I add ids, I will not know when to re-use the
same id because the garbage collector doesn't expose information about
whether a structurally equal (modulo the id) value is reachable already.)

So this leaves me with two avenues to pursue:

   1. Would it be easy to add an alternative interface to the garbage
   collector wherein weak pointers can point to immutable data? (So I could
   use that with my own hash-consing, given that I don't care about being
   precise about discovering whether some object is reachable or not.)
   2. Or, could Poly/ML expose some primitives by which I could add unique
   ids to certain objects (or simply read existing ids if you already have
   them) while benefiting from the garbage collector's own
   hash-consing/sharing implementation?


Thanks,
Ramana

On 7 October 2015 at 02:31, David Matthews <[email protected]>
wrote:

> Hi Ramana,
> The reason for only allowing a weak reference to point to a reference has
> to do with the notion of equality in ML and the way the Poly/ML system and
> the garbage collector in particular deal with that.  A reference has an
> identity and so it is possible to say whether a particular reference is
> pointed at from anywhere else.  Equality for references is defined as
> pointer equality and Poly/ML respects that.
>
> Equality for immutable data is by value and in the absence of
> PolyML.pointerEq it is not possible to distinguish two pointers to the same
> immutable data structure from two pointers to different structures with the
> same contents.  The garbage collector exploits this both by allowing some
> duplication and also by compacting equal data structures when memory is
> short.  All this means that if a weak pointer could point to an immutable
> data structure it would not be possible to give any assurance as to when
> the weak pointer would be modified by the garbage collector.  That may not
> matter if the object is to implement some sort of speed-up but it wouldn't
> be acceptable if the idea was to discover whether some other entity was
> reachable or not.
>
> I have to ask what your motivation behind implementing hash-consing is.
> The garbage collector already implements a "sharing" phase if memory is
> really short and this will effectively hash-cons all the shareable data.
>
> Regards,
> David
>
>
> On 04/10/2015 09:27, Ramana Kumar wrote:
>
>> Hello,
>>
>> I am interested in using Poly/ML's Weak structure (
>> http://polyml.org/documentation/Reference/Weak.html) for hash-consing.
>> The
>> documentation would perhaps suggest that this is not an intended use of
>> Weak, so I'd like to ask the list whether it is suitable and, if not,
>> whether there are alternatives.
>>
>> An abstract of the problem is as follows. For a certain datatype, t, we
>> want there to be at most one copy of any value of type t created by
>> applying a constructor to unique arguments. To achieve this, we implement
>> "hash-consing" constructors for t, which check whether the desired value
>> exists already and return it if so, and only otherwise create a new value.
>> To be able to check which values exist already, we maintain a hash table
>> of
>> all existing values. To allow values to be garbage collected when they are
>> no longer needed, we make the hash table store only weak references to the
>> values.
>>
>> The Weak structure allows weak references to be made to other references,
>> but not to other non-reference values. Thus, the values in the hash table
>> can be of type (t ref option ref), as produced by Weak.weak, but not of
>> type (t option ref). Thus, on the face of it, it seems like using
>> Poly/ML's
>> Weak implementation will require unnecessary references and indirections.
>> Is that indeed the case, or is Poly/ML smart about avoiding unnecessary
>> indirections somehow?
>>
>> For hash-consing, we are not using Weak in the way described in its
>> documentation, wherein the contents of the reference don't matter and the
>> reference itself is only used as a token to track an external resource. Is
>> there, perhaps, an alternative interface to Poly/ML's garbage collector
>> more suitable for hash-consing?
>>
>> Cheers,
>> Ramana
>>
>>
>>
>> _______________________________________________
>> polyml mailing list
>> [email protected]
>> http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
>>
>> _______________________________________________
> polyml mailing list
> [email protected]
> http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
>
_______________________________________________
polyml mailing list
[email protected]
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

Reply via email to