Re: Resources on how to implement (Haskell 98) kind checking?

2021-10-27 Thread Richard Eisenberg


> On Oct 27, 2021, at 9:36 AM, Benjamin Redelings 
>  wrote:
>> 
>> This won't work.
>> 
>> class C a where
>>   meth :: a b -> b Int
>> 
>> You have to know the kind of local b to learn the kind of class-variable a. 
>> So you have to do it all at once.
> I was doing it all at once -- but I wasn't sure how to export the information 
> about 'b' from the procedure.  (After you record the kinds of the typecons 
> like C, I believe the different typecons in the recursive group become 
> separable.)
> 
> I presume (in retrospect) that GHC modifies the declaration to record the 
> kind of b, and then re-walks the declaration to substitute kind variables 
> later?
> 
That would be smart, but it's not what GHC does. Instead, GHC first says (a :: 
kappa), for a fresh unification variable kappa. Then, GHC determines (b :: Type 
-> Type) and thus unifies kappa := (Type -> Type) -> Type. GHC then *throws 
away* the information about b. (See the `_ <- ...` at 
https://gitlab.haskell.org/ghc/ghc/-/blob/638f65482ca5265c268aa97abfcc14cdc27e46ba/compiler/GHC/Tc/Gen/HsType.hs#L388)
 Having now determined conclusively what the kind of `a` is, GHC will later 
re-check the kind of the type of meth, recording it for good.
> But if that is a difficulty on the right road, I will just go for it.  
> 
If zonking is a difficulty, there's probably something wrong somewhere. 
Lengthy, yes, but not difficult. If you're using Data.Data, I imagine you could 
implement zonking with just the right use of Data.Generics.Schemes.everywhere.

> Oh, I forgot to add, would it make sense to put some of the information I 
> discovered about implementing kind checking into the wiki somewhere?
> 
This is an excellent idea, but I'll counterpropose that you write this in a 
Note. Indeed, Note [Kind checking for type and class decls] in GHC.Tc.TyCl is 
probably meant to be the Note you're looking for, but it's too short. Feel free 
to flesh it out with all of these details. Then, make sure that the functions 
that implement the details refer to the Note (and vice versa). This will help 
it to stay up-to-date -- much more so than putting it in the wiki.
> It might also help to reference the relevant papers (mostly the PolyKinds 
> paper), and maybe also to mention papers like the THIH paper that don't 
> actually implement kind checking.
> 
Yes, refer to papers. But please do use their proper names (years and 
conference names are also helpful). It actually took some inference for me to 
figure out what the "PolyKinds paper" was in your emails. (The paper that 
proposes the extension that became PolyKinds is "Giving Haskell a Promotion", 
TLDI'12, and the paper that describes its extension to work more generally is 
"System FC with Explicit Kind Equality", ICFP'13. I thought you were referring 
to one of these -- not to "Kind Inference for Datatypes".)

Thanks!
Richard


> -BenRI
> 
> On 10/15/21 1:37 PM, Richard Eisenberg wrote:
>> 
>> 
>>> On Oct 14, 2021, at 11:59 AM, Benjamin Redelings 
>>> mailto:benjamin.redeli...@gmail.com>> wrote:
>>> 
>>> I asked about this on Haskell-Cafe, and was recommended to ask here 
>>> instead.  Any help is much appreciated!
>>> 
>> 
>> I saw your post over there, but haven't had time to respond but this 
>> retelling of the story makes it easier to respond, so I'll do so here.
>> 
>>> * The PolyKinds paper was the most helpful thing I've found, but it doesn't 
>>> cover type classes.  I'm also not sure that all implementers can follow 
>>> algorithm descriptions that are laid out as inference rules, but maybe that 
>>> could be fixed with a few hints about how to run the rules in reverse.  
>>> Also, in practice I think an implementer would want to follow GHC in 
>>> specifying the initial kind of a data type as k1 -> k2 -> ... kn -> *. 
>>> 
>> 
>> What is unique about type classes? It seems like you're worried about 
>> locally quantified type variables in method types, but (as far as kind 
>> inference is concerned) those are very much like existential variables in 
>> data constructors. So perhaps take the bit about existential variables from 
>> the PolyKinds part of that paper and combine it with the Haskell98 part.
>> 
>> It's true that many implementors may find the notation in that paper to be a 
>> barrier, but you just have to read the rules clockwise, starting from the 
>> bottom left and ending on the bottom right. :)
>>> 
>>> 2. The following question (which I have maybe kind of answered now, but 
>>> could use more advice on) is an example of what I am hoping such 
>>> documentation would explain: 
>>> 
>>> 
 Q: How do you handle type variables that are present in class methods, but 
 are not type class parameters? If there are multiple types/classes in a 
 single recursive group, the kind of such type variables might not be fully 
 resolved until a later type-or-class is processed.  Is there a recommended 
 approach? 
 
 I can see two ways to proceed: 
 
 i) F

Re: Resources on how to implement (Haskell 98) kind checking?

2021-10-27 Thread Benjamin Redelings
Oh, I forgot to add, would it make sense to put some of the information 
I discovered about implementing kind checking into the wiki somewhere?  
I am mostly thinking of a sequence of steps like:


1. Divide class, data/newtype, type synonym, and instance declarations 
into recursive groups.


1a) Record for each group which LOCAL typecons are mentioned in the 
declaration


1b) ... etc

2. Infer kinds within a recursive group

2a) Treat type classes as having kind k1 -> k2 -> ... -> kn -> Constraint

2b) Begin by recording a kind k1 -> k2 -> ... -> kn -> Constraint/* for 
each typecon


2c) etc...

2d) Substitute kind variables (zonking)

2e) Substitute * for remaining kind variables (zapping)

3. ...

I am not actually sure what to write yet, the above is just an illustration.

It might also help to reference the relevant papers (mostly the 
PolyKinds paper), and maybe also to mention papers like the THIH paper 
that don't actually implement kind checking.


-BenRI

On 10/15/21 1:37 PM, Richard Eisenberg wrote:



On Oct 14, 2021, at 11:59 AM, Benjamin Redelings 
 wrote:



I asked about this on Haskell-Cafe, and was recommended to ask here 
instead.  Any help is much appreciated!




I saw your post over there, but haven't had time to respond but 
this retelling of the story makes it easier to respond, so I'll do so 
here.


* The PolyKinds paper was the most helpful thing I've found, but it 
doesn't cover type classes. I'm also not sure that all implementers 
can follow algorithm descriptions that are laid out as inference 
rules, but maybe that could be fixed with a few hints about how to 
run the rules in reverse.  Also, in practice I think an implementer 
would want to follow GHC in specifying the initial kind of a data 
type as k1 -> k2 -> ... kn -> *.




What is unique about type classes? It seems like you're worried about 
locally quantified type variables in method types, but (as far as kind 
inference is concerned) those are very much like existential variables 
in data constructors. So perhaps take the bit about existential 
variables from the PolyKinds part of that paper and combine it with 
the Haskell98 part.


It's true that many implementors may find the notation in that paper 
to be a barrier, but you just have to read the rules clockwise, 
starting from the bottom left and ending on the bottom right. :)



2. The following question (which I have maybe kind of answered now, 
but could use more advice on) is an example of what I am hoping such 
documentation would explain:


Q: How do you handle type variables that are present in class 
methods, but are not type class parameters? If there are multiple 
types/classes in a single recursive group, the kind of such type 
variables might not be fully resolved until a later type-or-class is 
processed.  Is there a recommended approach?


I can see two ways to proceed:

i) First determine the kinds of all the data types, classes, and 
type synonyms.  Then perform a second pass over each type or class 
to determine the kinds of type variables (in class methods) that are 
not type class parameters.


This won't work.

class C a where
  meth :: a b -> b Int

You have to know the kind of local b to learn the kind of 
class-variable a. So you have to do it all at once.




ii) Alternatively, record the kind of each type variable as it is 
encountered -- even though such kinds may contain unification kind 
variables.  After visiting all types-or-classes in the recursive 
group, replace any kind variables with their definition, or with a * 
if there is no definition.


I've currently implement approach i), which requires doing kind 
inference on class methods twice.


Further investigation revealed that GHC takes yet another approach (I 
think):


iii) Represent kinds with modifiable variables. Substitution can be 
implemented by modifying kind variables in-place.  This is (I think) 
called "zonking" in the GHC sources.


I don't really see the difference between (ii) and (iii). Maybe (ii) 
records the kinds in a table somewhere, while (iii) records them "in" 
the kind variables themselves, but that's not so different, I think.




This solves a small mystery for me, since I previously thought that 
zonking was just replacing remaining kind variables with '*'.  And 
indeed this seems to be an example of zonking, but not what zonking 
is (I think).


We can imagine that, instead of mutation, we build a substitution 
mapping unification variables to types (or kinds). This would be 
stored just as a simple mapping or dictionary structure. No mutation. 
As we learn about a unification variable, we just add to the mapping. 
If we did this, zonking would be the act of applying the substitution, 
replacing known unification variables with their values. It just so 
happens that GHC builds a mapping by using mutable cells in memory, 
but that's just an implementation detail: zonking is still just 
applying the substitution.


Zonking does /not/ replace any

Re: Resources on how to implement (Haskell 98) kind checking?

2021-10-27 Thread Benjamin Redelings

Hi Richard,

Many thanks for the hints!

On 10/15/21 1:37 PM, Richard Eisenberg wrote:



I can see two ways to proceed:

i) First determine the kinds of all the data types, classes, and 
type synonyms.  Then perform a second pass over each type or class 
to determine the kinds of type variables (in class methods) that are 
not type class parameters.


This won't work.

class C a where
  meth :: a b -> b Int

You have to know the kind of local b to learn the kind of 
class-variable a. So you have to do it all at once.


I was doing it all at once -- but I wasn't sure how to export the 
information about 'b' from the procedure.  (After you record the kinds 
of the typecons like C, I believe the different typecons in the 
recursive group become separable.)


I presume (in retrospect) that GHC modifies the declaration to record 
the kind of b, and then re-walks the declaration to substitute kind 
variables later?




iii) Represent kinds with modifiable variables. Substitution can be 
implemented by modifying kind variables in-place.  This is (I think) 
called "zonking" in the GHC sources.


I don't really see the difference between (ii) and (iii). Maybe (ii) 
records the kinds in a table somewhere, while (iii) records them "in" 
the kind variables themselves, but that's not so different, I think.


Yeah, that is a good point.  That clarified for me what GHC is doing.



This solves a small mystery for me, since I previously thought that 
zonking was just replacing remaining kind variables with '*'.  And 
indeed this seems to be an example of zonking, but not what zonking 
is (I think).


We can imagine that, instead of mutation, we build a substitution 
mapping unification variables to types (or kinds). This would be 
stored just as a simple mapping or dictionary structure. No mutation. 
As we learn about a unification variable, we just add to the mapping. 
If we did this, zonking would be the act of applying the substitution, 
replacing known unification variables with their values. It just so 
happens that GHC builds a mapping by using mutable cells in memory, 
but that's just an implementation detail: zonking is still just 
applying the substitution.


OK, that makes sense.  I'll start with the mapping approach, and then 
consider optimizing things later.



Zonking does /not/ replace anything with *. Well, functions that have 
"zonk" in their name may do this. But it is not really logically part 
of the zonking operation. If you like, you can pretend that, after 
zonking a program, we take a separate pass replacing any yet-unfilled 
kind-level unification variables with *. Sometimes, this is called 
"zapping" in GHC, I believe.


Yes, I was definitely confusing zonking and zapping.  (Wow, lots of fun 
names here!)



Zonking is a bit laborious to implement, but not painful. Laborious, 
because it requires a full pass over the AST. Not painful, because all 
it's trying to do is replace type/kind variables with substitutions: 
each individual piece of the puzzle is quite simple.
This was quite helpful -- I think I was trying to somehow avoid a 
separate pass over the AST.


But if that is a difficulty on the right road, I will just go for it.

-BenRI
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Resources on how to implement (Haskell 98) kind checking?

2021-10-15 Thread Richard Eisenberg


> On Oct 14, 2021, at 11:59 AM, Benjamin Redelings 
>  wrote:
> 
> I asked about this on Haskell-Cafe, and was recommended to ask here instead.  
> Any help is much appreciated!
> 

I saw your post over there, but haven't had time to respond but this 
retelling of the story makes it easier to respond, so I'll do so here.

> * The PolyKinds paper was the most helpful thing I've found, but it doesn't 
> cover type classes.  I'm also not sure that all implementers can follow 
> algorithm descriptions that are laid out as inference rules, but maybe that 
> could be fixed with a few hints about how to run the rules in reverse.  Also, 
> in practice I think an implementer would want to follow GHC in specifying the 
> initial kind of a data type as k1 -> k2 -> ... kn -> *. 
> 

What is unique about type classes? It seems like you're worried about locally 
quantified type variables in method types, but (as far as kind inference is 
concerned) those are very much like existential variables in data constructors. 
So perhaps take the bit about existential variables from the PolyKinds part of 
that paper and combine it with the Haskell98 part.

It's true that many implementors may find the notation in that paper to be a 
barrier, but you just have to read the rules clockwise, starting from the 
bottom left and ending on the bottom right. :)
> 
> 2. The following question (which I have maybe kind of answered now, but could 
> use more advice on) is an example of what I am hoping such documentation 
> would explain: 
> 
> 
>> Q: How do you handle type variables that are present in class methods, but 
>> are not type class parameters? If there are multiple types/classes in a 
>> single recursive group, the kind of such type variables might not be fully 
>> resolved until a later type-or-class is processed.  Is there a recommended 
>> approach? 
>> 
>> I can see two ways to proceed: 
>> 
>> i) First determine the kinds of all the data types, classes, and type 
>> synonyms.  Then perform a second pass over each type or class to determine 
>> the kinds of type variables (in class methods) that are not type class 
>> parameters. 

This won't work.

class C a where
  meth :: a b -> b Int

You have to know the kind of local b to learn the kind of class-variable a. So 
you have to do it all at once.

>> 
>> ii) Alternatively, record the kind of each type variable as it is 
>> encountered -- even though such kinds may contain unification kind 
>> variables.  After visiting all types-or-classes in the recursive group, 
>> replace any kind variables with their definition, or with a * if there is no 
>> definition. 
>> 
>> I've currently implement approach i), which requires doing kind inference on 
>> class methods twice. 
>> 
> Further investigation revealed that GHC takes yet another approach (I think): 
> 
> iii) Represent kinds with modifiable variables.  Substitution can be 
> implemented by modifying kind variables in-place.  This is (I think) called 
> "zonking" in the GHC sources. 

I don't really see the difference between (ii) and (iii). Maybe (ii) records 
the kinds in a table somewhere, while (iii) records them "in" the kind 
variables themselves, but that's not so different, I think.

> 
> This solves a small mystery for me, since I previously thought that zonking 
> was just replacing remaining kind variables with '*'.  And indeed this seems 
> to be an example of zonking, but not what zonking is (I think). 

We can imagine that, instead of mutation, we build a substitution mapping 
unification variables to types (or kinds). This would be stored just as a 
simple mapping or dictionary structure. No mutation. As we learn about a 
unification variable, we just add to the mapping. If we did this, zonking would 
be the act of applying the substitution, replacing known unification variables 
with their values. It just so happens that GHC builds a mapping by using 
mutable cells in memory, but that's just an implementation detail: zonking is 
still just applying the substitution.

Zonking does not replace anything with *. Well, functions that have "zonk" in 
their name may do this. But it is not really logically part of the zonking 
operation. If you like, you can pretend that, after zonking a program, we take 
a separate pass replacing any yet-unfilled kind-level unification variables 
with *. Sometimes, this is called "zapping" in GHC, I believe.

> 
> Zonking looks painful to implement, but approach (i) might require multiple 
> passes over types to update the kind of type variables, which might be 
> worse...

Zonking is a bit laborious to implement, but not painful. Laborious, because it 
requires a full pass over the AST. Not painful, because all it's trying to do 
is replace type/kind variables with substitutions: each individual piece of the 
puzzle is quite simple.

I hope this is helpful!
Richard___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.or

Re: Resources on how to implement (Haskell 98) kind checking?

2021-10-15 Thread Gergő Érdi
Using mutable references for type metavariables is an implementation
optimization: unifying a metavariable with a kind can then be cheaply
implemented by replacing the metavar's content with the kind. After
you are done with unification, you don't need metavariables anymore.
If this were HM, you'd generalize remaining metavariables into type
variables. But for kinds, remaining metavariables are defaulted to *
in Haskell 98. (Of course, if you used PolyKinds, kind metavars would
also be generalized).

Zonking is the process of traversing a type / kind with these mutable
references and replacing each mutable reference with its value,
thereby "baking in" its current solution. If you encounter a mutable
reference that doesn't contain anything, it means the metavar it
represents hasn't been constrained in any way -- so you can just
choose * for them.

People seem to have upvoted my SO answer about GHC's zonking so you
might find it useful as well:
https://stackoverflow.com/a/31890743/477476

On Fri, Oct 15, 2021 at 12:01 AM Benjamin Redelings
 wrote:
>
> Hi,
>
> I asked about this on Haskell-Cafe, and was recommended to ask here instead.  
> Any help is much appreciated!
>
> 1. I'm looking for resources that describe how to implement kind Haskell 98 
> checking.  Does anyone have any good suggestions?  So far, the papers that 
> I've looked at all fall short in different ways:
>
> * Mark Jones's paper "A system of constructor classes":  This paper contains 
> a kind-aware type-inference algorithm, but no kind inference algorithm.  The 
> closest it comes is the rule:
>
> C :: k' -> k   and   C' :: k'   =>   C C' :: k
>
> * The THIH paper doesn't have an algorithm for kind checking.  It assumes 
> that every type variable already has a kind.
>
> * The 2010 Report helpfully mentions substituting any remaining kind 
> variables with *.  But it refers to "A system of constructor classes" for an 
> algorithm.
>
> * The PolyKinds paper was the most helpful thing I've found, but it doesn't 
> cover type classes.  I'm also not sure that all implementers can follow 
> algorithm descriptions that are laid out as inference rules, but maybe that 
> could be fixed with a few hints about how to run the rules in reverse.  Also, 
> in practice I think an implementer would want to follow GHC in specifying the 
> initial kind of a data type as k1 -> k2 -> ... kn -> *.
>
> * I've looked at the source code to GHC, and some of the longer notes were 
> quite helpful.  However, it is hard to follow for a variety of reasons.  It 
> isn't laid out like an algorithm description, and the complexity to handle 
> options like PolyKinds and DataKinds makes the code harder to follow.
>
>
>
> 2. The following question (which I have maybe kind of answered now, but could 
> use more advice on) is an example of what I am hoping such documentation 
> would explain:
>
> Q: How do you handle type variables that are present in class methods, but 
> are not type class parameters? If there are multiple types/classes in a 
> single recursive group, the kind of such type variables might not be fully 
> resolved until a later type-or-class is processed.  Is there a recommended 
> approach?
>
> I can see two ways to proceed:
>
> i) First determine the kinds of all the data types, classes, and type 
> synonyms.  Then perform a second pass over each type or class to determine 
> the kinds of type variables (in class methods) that are not type class 
> parameters.
>
> ii) Alternatively, record the kind of each type variable as it is encountered 
> -- even though such kinds may contain unification kind variables.  After 
> visiting all types-or-classes in the recursive group, replace any kind 
> variables with their definition, or with a * if there is no definition.
>
> I've currently implement approach i), which requires doing kind inference on 
> class methods twice.
>
> Further investigation revealed that GHC takes yet another approach (I think):
>
> iii) Represent kinds with modifiable variables.  Substitution can be 
> implemented by modifying kind variables in-place.  This is (I think) called 
> "zonking" in the GHC sources.
>
> This solves a small mystery for me, since I previously thought that zonking 
> was just replacing remaining kind variables with '*'.  And indeed this seems 
> to be an example of zonking, but not what zonking is (I think).
>
> Zonking looks painful to implement, but approach (i) might require multiple 
> passes over types to update the kind of type variables, which might be 
> worse...
>
>
>
> 3. I'm curious now how many other pieces of software besides GHC have 
> implemented kind inference...
>
>
> -BenRI
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Resources on how to implement (Haskell 98) kind checking?

2021-10-14 Thread Benjamin Redelings

Hi,

I asked about this on Haskell-Cafe, and was recommended to ask here 
instead.  Any help is much appreciated!


1. I'm looking for resources that describe how to implement kind Haskell 
98 checking.  Does anyone have any good suggestions?  So far, the papers 
that I've looked at all fall short in different ways:


* Mark Jones's paper "A system of constructor classes":  This paper 
contains a kind-aware type-inference algorithm, but no kind inference 
algorithm.  The closest it comes is the rule:


    C :: k' -> k   and   C' :: k'   =>   C C' :: k

* The THIH paper doesn't have an algorithm for kind checking.  It 
assumes that every type variable already has a kind.


* The 2010 Report helpfully mentions substituting any remaining kind 
variables with *.  But it refers to "A system of constructor classes" 
for an algorithm.


* The PolyKinds paper was the most helpful thing I've found, but it 
doesn't cover type classes.  I'm also not sure that all implementers can 
follow algorithm descriptions that are laid out as inference rules, but 
maybe that could be fixed with a few hints about how to run the rules in 
reverse.  Also, in practice I think an implementer would want to follow 
GHC in specifying the initial kind of a data type as k1 -> k2 -> ... kn 
-> *.


* I've looked at the source code to GHC, and some of the longer notes 
were quite helpful.  However, it is hard to follow for a variety of 
reasons.  It isn't laid out like an algorithm description, and the 
complexity to handle options like PolyKinds and DataKinds makes the code 
harder to follow.




2. The following question (which I have maybe kind of answered now, but 
could use more advice on) is an example of what I am hoping such 
documentation would explain:


Q: How do you handle type variables that are present in class methods, 
but are not type class parameters? If there are multiple types/classes 
in a single recursive group, the kind of such type variables might not 
be fully resolved until a later type-or-class is processed.  Is there 
a recommended approach?


I can see two ways to proceed:

i) First determine the kinds of all the data types, classes, and type 
synonyms.  Then perform a second pass over each type or class to 
determine the kinds of type variables (in class methods) that are not 
type class parameters.


ii) Alternatively, record the kind of each type variable as it is 
encountered -- even though such kinds may contain unification kind 
variables.  After visiting all types-or-classes in the recursive 
group, replace any kind variables with their definition, or with a * 
if there is no definition.


I've currently implement approach i), which requires doing kind 
inference on class methods twice.


Further investigation revealed that GHC takes yet another approach (I 
think):


iii) Represent kinds with modifiable variables.  Substitution can be 
implemented by modifying kind variables in-place.  This is (I think) 
called "zonking" in the GHC sources.


This solves a small mystery for me, since I previously thought that 
zonking was just replacing remaining kind variables with '*'.  And 
indeed this seems to be an example of zonking, but not what zonking is 
(I think).


Zonking looks painful to implement, but approach (i) might require 
multiple passes over types to update the kind of type variables, which 
might be worse...




3. I'm curious now how many other pieces of software besides GHC have 
implemented kind inference...



-BenRI___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs