#3480: Easily make Typeable keys pure, so that Typeable can be handled
efficiently across communications
-------------------------------------+--------------------------------------
Reporter: guest | Owner:
Type: task | Status: new
Priority: normal | Milestone: 6.14.1
Component: libraries/base | Version:
Severity: trivial | Resolution:
Keywords: Typeable, efficiency | Difficulty: Unknown
Testcase: | Os: Unknown/Multiple
Architecture: Unknown/Multiple |
-------------------------------------+--------------------------------------
Old description:
> Data.Typeable: Easily make Typeable keys pure(used in Eq), so that
> Typeable keys don´t vary from run to run. This permits an efficient
> storage of the keys in files and to be transmitted trough communications
> as well as processed without loss of efficiency. Actually gaining
> efficiency probably, since the keys caches are not necessary.
>
> Currently, whenever the user needs to communicate types, he must transmit
> the full string name for each type. Moreover, in the reception, the
> programmer is forced to handle these full string keys for mapping types
> to handlers, in equality checks etc. if the type keys are pure, the
> efficiency of key handling can be keept across communications.
>
> short description of task:
> Istead of using a Hash( stringType, generatedKey) use hashString
> (string-of-type)
>
> Long description:
>
> 1) drop the cache
> drop newKey
>
> 2) use instead the expression:
>
> Key $ hashString str
>
> whenever a new key is needed
>
> from the package Data.HashTable:
> hashString :: String -> Int
>
> the key obtained is pure so:
>
> 3) drop the "IO" in typeRepKey signature
New description:
Data.Typeable: Easily make Typeable keys pure(used in Eq), so that
Typeable keys don´t vary from run to run. This permits an efficient
storage of the keys in files and to be transmitted trough communications
as well as processed without loss of efficiency. Actually gaining
efficiency probably, since the keys caches are not necessary.
Currently, whenever the user needs to communicate types, he must transmit
the full string name for each type. Moreover, in the reception, the
programmer is forced to handle these full string keys for mapping types
to handlers, in equality checks etc. if the type keys are pure, the
efficiency of key handling can be keept across communications.
short description of task:
Istead of using a Hash( stringType, generatedKey) use hashString (string-
of-type)
Long description:
1) drop the cache; drop newKey
2) use instead the expression:
{{{
Key $ hashString str
}}}
whenever a new key is needed from the package `Data.HashTable`:
{{{
hashString :: String -> Int
}}}
the key obtained is pure so:
3) drop the "IO" in `typeRepKey` signature
Comment (by simonpj):
Something that does a better job of `TypeRep` would certainly be welcome:
1. Those `unsafePerformIO`s are embarrassing. And indeed they lead to
trouble if (as can happen in GHCi or Template Haskell) you get two
instances of that hash table.
2. If you were (naughtily) to serialise one of those keys into a file,
they'd be wrong when you read them back in. But this never happens (see
the `Show` instance for `TypeRep`: what you serialise is the `TypeRep`
data structure ''without'' the keys. When it's read back in a new set of
keys is constructed. (Actually there is no `Read` instance but that is
what would happen if there were.)
The keys are ''solely'' used to speed up comparison. Comparing the data
structures and strings is the underlying ground truth. So it's perfectly
fine for keys to vary from run to run.
However, it's vital that if two `TypeRep`s with the same key really are
equal. Hashing the string to an `Int` does not guarantee that. To be
sure, you'd need a lot more bits, something like the fingerprints we store
in interface files. See `compiler/utils/Fingerprint.hsc`.
I'm not sure about the best approach. The baseline is:
* Compare `TypeRep` structures as one would any data structure.
* At a `TyCon` compare the `String` representation of the type
constructor.
An alternative, that would do the fingerprint stuff once and for all at
compile time, is:
* Compare `TypeRep` structures as one would any data structure.
* At a `TyCon` compare the fingerprint of the string representation of
the `TyCon`
Neither of these give constant-time comparison of `TypeRep`s, but I doubt
that such comparisons are going to be in the inner loop of any program.
And both fix (1) above
Other things to think about:
* Surely `TypeRep` should be in `Ord`!
* We should check that the `String` used to make a `Typeable.TyCon`
includes the full, versioned, package Id.
* Even then I'm slightly worried. The expection is that `foo-3.2:Foo.T`
uniquely defines the entire structure of `T`. But what if you forget to
bump the package version?
* More subtly what if `T` is defined thus:
{{{
data T = MkT S -- S comes from package bar
}}}
Then if S's definition in package `bar` changes, but package `foo` is
unchanged, the `TyCon` for `T` should really change too. Otherwise you
might serialise a value of type `T` into a file along with its `TypeRep`,
and try to read it back in with the wrong deserialiser. I'm not sure
whether the package versioning rules force `foo`'s version to be bumped.
Maybe we want `foo`'s new `PackageId`?
* But in fact the `PackageId` is perhaps ''too'' fine-grained, becuase
it changes whenever anything in package `foo` changes. We really only
want the `Typeable.TyCon` for `T` to change when the structure of `T`
(transitively) changes. Aha! We already have fingerprints for that, used
for versioning in interface files. So perhpas, once we've calculated
`T`'s fingerprint we can use that in the `Typeable.TyCon` for `T`. That
seems to me to be best story.
We could do with someone who'd like to follow these things up. Dynamic
typing is important.
--
Ticket URL: <http://hackage.haskell.org/trac/ghc/ticket/3480#comment:2>
GHC <http://www.haskell.org/ghc/>
The Glasgow Haskell Compiler_______________________________________________
Glasgow-haskell-bugs mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs