> Is there any way of getting the following code to immediately return
> True without performing the element-by-element comparison? Essentially
> this boils down to checking whether pointers are equal before
> comparing the contents.
> 
>> main = print $ f == f
>>      where f = [1..10^9]

Nikhil,

As others pointed out, what you're asking for is not possible in Haskell, and 
for good reasons. However, this is an important problem, and it comes up quite 
often in practice when implementing DSLs: Detecting sharing. So, it's no 
surprise that people developed different ways of dealing with it in various 
forms.

In my mind, Andy Gill came up with the nicest solution in his "Type-Safe 
Observable Sharing in Haskell" paper, published in the 2009 Haskell Symposium. 
Andy's paper traces the technique back to a 1999 paper by Simon Peyton Jones, 
Simon Marlow, and Conal Elliott. Probably few others came up with the same idea 
as well over the years. Andy's paper is a joy to read and I'd highly recommend 
it, you can get a copy at his web site: 
http://www.ittc.ku.edu/csdl/fpg/sites/default/files/Gill-09-TypeSafeReification.pdf.
 

Andy's fundamental observation is that while you cannot check for 
"pointer-equality" in the pure world for obvious reasons, it's perfectly fine 
to do so when you're in the IO monad. He further observes that for almost all 
most practical use cases, this is really not an issue: You're probably in some 
sort of a monad wrapped over IO anyhow: This has certainly been my experience, 
for instance. While the paper has all the details, the "trick" is to use GHC's 
StableName abstraction. If you define:

import System.Mem.StableName

areEqual :: Eq a => a -> a -> IO Bool
areEqual x y = do
   sx <- hashStableName `fmap` (x `seq` makeStableName x)
   sy <- hashStableName `fmap` (y `seq` makeStableName y)
   return $ (sx == sy) || x == y

then areEqual will run quite fast if it indeed receives the same object twice, 
no matter how large it is. In fact, it might even be cyclic! (See Andy's paper 
for details.) However, if the stable-name equality fails, then you are *not* 
guaranteed that the objects are different, hence you further need to run the 
usual "==" on them, which can be quite costly. (Of course "==" can go in loops 
if the objects happen to have cycles in them.)

You can also change the last line of areEqual to read:

    return $ sx == sy

In this case, if it returns True then you're guaranteed that the objects are 
equal. If the result is False, then you just don't know. However it's 
guaranteed that the function will run fast in either case. Client code can 
decide on how to proceed based on that information.

I hope this helps. Reading Andy's paper, and the papers he's cited can further 
elucidate the technique. In fact, Andy uses this idea to turn cyclic structures 
to graphs with explicit back-edges that can be processed much more easily in 
the pure world, something that comes up quite often in practice as well.

-Levent.
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to