RE: Operating on HsSyn

2017-07-31 Thread Simon Peyton Jones via ghc-devs
Am I far off?
I don’t think you are far off.  All I’m after is the debug-printer use-case: I 
want to be able to see the contents of the tree, including the PostRn 
annotations.  I don’t care about round-tripping it.

Simon

From: Shayan Najd [mailto:sh.n...@gmail.com]
Sent: 28 July 2017 15:32
To: Simon Peyton Jones <simo...@microsoft.com>
Cc: ghc-devs@haskell.org; Alan & Kim Zimmerman <alan.z...@gmail.com>; Jacques 
Carette <care...@mcmaster.ca>
Subject: Re: Operating on HsSyn

On the contrary, inside GHC I /do/ want to print them. Otherwise how can I see 
what the renamer has done?

Right. So if I understand correctly, with this semantics, `Outputable` is 
somewhere between pretty printing as often used in program manipulation 
libraries (like Haskell-Src-Exts (HSE)) which is closer to syntax, and `Show` 
which is closer to Haskell representation.
(There are also "exact printers" (as in HSE) that are even closer to syntax in 
some sense.)
Often, pretty printers generate only grammatically valid terms, not the ones 
polluted with extra annotations (hence grammatically invalid), e.g., what is 
the grammatically valid form of `OpApp` printed via `Outputable` that includes 
the fixity annotation.

If I recall correctly, we have briefly studied these in the past summer, we 
came up with some roundtrip correctness criteria, like the following (bar error 
handling; assume valid input):

 (parser . prettyPrint . parser) = id

[paging in Jacques]

The reason I am trying to flesh out the semantics is the /big/ gains on code 
reuse later on in the process: one does not need to define a separate pretty 
printing library for Haskell syntax, and can reuse the well-tested and 
well-maintained one in GHC.

Reformulating part of your concern, based on my understanding (if I may), the 
questions is: what is the proper design of an "outputer" (debug-printer?) where 
/annotated/ terms can be pretty-printed including any printable 
(pretty?showable?) annotations.
In particular, we may want to take advantage of extensibility of data types for 
this.
Am I far off?

Note: with proper design, an extensible debug-printer can still subsume 
corresponding pretty-printers.


On Fri, Jul 28, 2017 at 2:36 PM, Simon Peyton Jones 
<simo...@microsoft.com<mailto:simo...@microsoft.com>> wrote:
I have been under the impression that we don't even want to print those.
On the contrary, inside GHC I /do/ want to print them. Otherwise how can I see 
what the renamer has done?

Simon

From: Shayan Najd [mailto:sh.n...@gmail.com<mailto:sh.n...@gmail.com>]
Sent: 28 July 2017 12:20
To: Simon Peyton Jones <simo...@microsoft.com<mailto:simo...@microsoft.com>>
Cc: ghc-devs@haskell.org<mailto:ghc-devs@haskell.org>; Alan & Kim Zimmerman 
<alan.z...@gmail.com<mailto:alan.z...@gmail.com>>
Subject: Re: Operating on HsSyn

Before all this, we may need to discuss a bit about the intended semantics of
`Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
suffixed constructors?  If not, then we only need to write a set of instances
for the base growable AST, once and for all.  Such instances will be polymorphic
on the extension descriptor `p`, and do not need to mention the constraints like
`(PostRn p (IdP p)`, since these are just extensions and not part of the base
growable AST.  Or, am I missing something about the intended semantics of
`Outputable`?

You write

So today we never print these annotations, to avoid bloating the instance 
contexts, which can be painful.

I have been under the impression that we don't even want to print those.

Of course, there are scenarios (like `Show` instances) where we do want to write
compositional / generic functions that take into account the extensions.
Here is my abstract overview of the scenario, that may help the discussion.
Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler 
phase) that
 are defined as extensions to a base datatype `T`:

> A = T XA
> B = T XB
> C = T XC

where `X*`s are extension descriptors.
Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `B`, 
and `C`.
We have two main alternatives:
(a) either we write these  (manually or using the deriving mechanism) separately
(b) or we write a generic / parametric function `g` over `T`, and reuse that to 
define `f_*`s

Of course, (b) is preferable in theory , but not always possible or preferable 
in practice.
In which case, we can always resort to (a).
The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more 
parametric should
`g` get, as this is the case for any generic function.

With a correct design, I believe, these are all independent of Trees that Grow 
story itself:
we are now not only trying to reuse data types, and functions agnostic towards 
extensions
(pretty printers in my view of their semantics), but also reuse functions with 
parametric /
varying behavi

Re: Operating on HsSyn

2017-07-30 Thread Shayan Najd
Simon,

Back to your original questions:

1. I think we should probably use a superclass instead of a type synonym
>
> class (Data p, ForallX Data p, ) => DataId p where {}
>
> Why?  Only one argument to pass, and to pass on to successive calls.  I
> see no downside.


Sounds good to me. I have not experimented with it yet, but I see no
downside as well.

Shall we treat Outputable like Data?  (I.e. make an abbreviation for a long
> list of Outputable instances and use it everywhere)


Besides, the above point about semantics and reusing it as classical
pretty-printer, I suggest a bit more fine-grained abbreviations, if we are
going along this path:
 we define the abbreviations per datatype (as opposed to all), we define
`ForallXNAME :: constraint -> *` (as in Section 3.7 of the paper) where
`NAME` is name of the datatype.

However, we do NOT need Trees that Grow in its full generality: as far as I
understand, we do not have GADTs or existentials in GHC ASTs.
This means we can use the simpler design of Trees that Grow that all
extension families had the same arity and kind, then we could factor them
all into one type family dispatching on some unique identifier.
(we used type-level strings, or promoted datatypes for annotations.)

For example, the following in absence of existentials

> type family XVar x a
> type family XApp x a
> type family XAbs x a
> data Exp x a
>  = Var (XVar x a) a
>  | Abs (XAbs x a) a (Exp x a)
>  | App (XApp x a) (Exp x a) (Exp x a)

is as good as (one of our very first extensible encodings)

> type family XDispatch name x a
> data Lbl = VarL | AbsL | AppL
> data Exp x a
>  = Var (XDispatch VarL x a) a
>  | Abs (XDispatch AbsL x a) a (Exp x a)
>  | App (XDispatch AppL x a) (Exp x a) (Exp x a)

We could do the same with promoted datatypes as annotations (instead of
type-level strings).

The advantage of the simpler encoding is that now, in theory, we can define

> ForallXExp (c :: Constraint) x a = forall l. c (XDispatch l x a)

Which gives us for example `ForallXExp Outputable x a `.

Is this encoding faster, in comparison? Does it help?

/Shayan









On Fri, Jul 28, 2017 at 10:11 PM, Shayan Najd  wrote:

> MarLinn,
>Thanks for correcting me, and spelling this out.
>I did mean what Alan mentioned: "re-parsing a pretty printed parse tree
> gives you back a parse tree identical to the original (ignoring SrcSpans)".
>As I recall, we had to go a bit further to give 'Something' some more
> structure to take into account things like "(ignoring SrcSpans)" (e.g., to
> define exact-printers, etc).
>Provided I have failed twice to properly recall the invariant, I
> refrain from trying to recall the rest tonight :)
>
> Not diverging from my point above, as far as I understand, an ideal
> `Outputable` machinery is going to be a bit different from the traditional
> pretty printers.
> I believe with a proper design we can even reuse `Outputable` machinery
> and provide it as a pretty printer for Haskell terms.
> It resembles the scenario in Section 3.7 compared to Section 3.6 of Trees
> that Grow [1].
>
> Having said all these, we ARE diverging from the original thread, and
> Simon's questions.
>
> How about taking printer-design related discussion to the following wiki
> page (and/or a new ghc-dev thread if needed):
>   https://ghc.haskell.org/trac/ghc/wiki/HaskellSyntaxPrinters
>
> Cheers,
>   Shayan
>
> [1] http://www.jucs.org/jucs_23_1/trees_that_grow/jucs_23_
> 01_0042_0062_najd.pdf
>
> On Fri, Jul 28, 2017 at 8:43 PM, Alan & Kim Zimmerman  > wrote:
>
>> I agree. 4 is the current GHC invariant.
>>
>> i.e., re-parsing a pretty printed parse tree gives you back a parse tree
>> identical to the original (ignoring SrcSpans)
>>
>> Alan
>>
>> On 28 July 2017 at 20:34, MarLinn  wrote:
>>
>>> by
>>>
>>>  (parser . prettyPrint . parser) = id
>>>
>>> I meant
>>>
>>> (prettyPrint . parser . prettyPrint) = id
>>>
>>> for a valid input.
>>>
>>> Simplifying, (parser ∷ String → something), and (prettyPrint ∷
>>> something → String).
>>>
>>> Therefore, (parser . prettyPrint . parser ∷ String → something) and 
>>> (prettyPrint
>>> . parser . prettyPrint ∷ something → String).
>>>
>>> Therefore, both criteria could only apply for (something ~ String). But
>>> as pretty printing adds quotation marks, not even that is true.
>>>
>>> There are four formulations that might be applicable:
>>>
>>>1.
>>>
>>>parser . prettyPrint ≍ id
>>>2.
>>>
>>>prettyPrint . parser ≍ id -- ∷ String → String, useless here
>>>3.
>>>
>>>prettyPrint . parser . prettyPrint ≍ prettyPrint
>>>4.
>>>
>>>parser . prettyPrint . parser ≍ parser
>>>5. Well, you could go beyond to (prettyPrint . parser . prettyPrint
>>>. parser ≍ prettyPrint . parser) etc…
>>>
>>> I don't think 1 (or 2) follow from one of the last two. But 1 does imply
>>> them. So it is a stronger criterion than both, and therefore probably not
>>> the 

Re: Operating on HsSyn

2017-07-28 Thread Shayan Najd
MarLinn,
   Thanks for correcting me, and spelling this out.
   I did mean what Alan mentioned: "re-parsing a pretty printed parse tree
gives you back a parse tree identical to the original (ignoring SrcSpans)".
   As I recall, we had to go a bit further to give 'Something' some more
structure to take into account things like "(ignoring SrcSpans)" (e.g., to
define exact-printers, etc).
   Provided I have failed twice to properly recall the invariant, I refrain
from trying to recall the rest tonight :)

Not diverging from my point above, as far as I understand, an ideal
`Outputable` machinery is going to be a bit different from the traditional
pretty printers.
I believe with a proper design we can even reuse `Outputable` machinery and
provide it as a pretty printer for Haskell terms.
It resembles the scenario in Section 3.7 compared to Section 3.6 of Trees
that Grow [1].

Having said all these, we ARE diverging from the original thread, and
Simon's questions.

How about taking printer-design related discussion to the following wiki
page (and/or a new ghc-dev thread if needed):
  https://ghc.haskell.org/trac/ghc/wiki/HaskellSyntaxPrinters

Cheers,
  Shayan

[1]
http://www.jucs.org/jucs_23_1/trees_that_grow/jucs_23_01_0042_0062_najd.pdf

On Fri, Jul 28, 2017 at 8:43 PM, Alan & Kim Zimmerman 
wrote:

> I agree. 4 is the current GHC invariant.
>
> i.e., re-parsing a pretty printed parse tree gives you back a parse tree
> identical to the original (ignoring SrcSpans)
>
> Alan
>
> On 28 July 2017 at 20:34, MarLinn  wrote:
>
>> by
>>
>>  (parser . prettyPrint . parser) = id
>>
>> I meant
>>
>> (prettyPrint . parser . prettyPrint) = id
>>
>> for a valid input.
>>
>> Simplifying, (parser ∷ String → something), and (prettyPrint ∷ something
>> → String).
>>
>> Therefore, (parser . prettyPrint . parser ∷ String → something) and 
>> (prettyPrint
>> . parser . prettyPrint ∷ something → String).
>>
>> Therefore, both criteria could only apply for (something ~ String). But
>> as pretty printing adds quotation marks, not even that is true.
>>
>> There are four formulations that might be applicable:
>>
>>1.
>>
>>parser . prettyPrint ≍ id
>>2.
>>
>>prettyPrint . parser ≍ id -- ∷ String → String, useless here
>>3.
>>
>>prettyPrint . parser . prettyPrint ≍ prettyPrint
>>4.
>>
>>parser . prettyPrint . parser ≍ parser
>>5. Well, you could go beyond to (prettyPrint . parser . prettyPrint .
>>parser ≍ prettyPrint . parser) etc…
>>
>> I don't think 1 (or 2) follow from one of the last two. But 1 does imply
>> them. So it is a stronger criterion than both, and therefore probably not
>> the one to choose. Assuming the parser is internally consistent, 3 just
>> says something about the internal consistency of the pretty printer, while
>> 4 says something about the relationship of the pretty printer to the
>> parser. Thus 4 looks like the best candidate for a criterion. Possibly with
>> 3 as a secondary target.
>>
>> Cheers,
>> MarLinn
>>
>> ___
>> ghc-devs mailing list
>> ghc-devs@haskell.org
>> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>>
>>
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Operating on HsSyn

2017-07-28 Thread Alan & Kim Zimmerman
I agree. 4 is the current GHC invariant.

i.e., re-parsing a pretty printed parse tree gives you back a parse tree
identical to the original (ignoring SrcSpans)

Alan

On 28 July 2017 at 20:34, MarLinn  wrote:

> by
>
>  (parser . prettyPrint . parser) = id
>
> I meant
>
> (prettyPrint . parser . prettyPrint) = id
>
> for a valid input.
>
> Simplifying, (parser ∷ String → something), and (prettyPrint ∷ something
> → String).
>
> Therefore, (parser . prettyPrint . parser ∷ String → something) and 
> (prettyPrint
> . parser . prettyPrint ∷ something → String).
>
> Therefore, both criteria could only apply for (something ~ String). But
> as pretty printing adds quotation marks, not even that is true.
>
> There are four formulations that might be applicable:
>
>1.
>
>parser . prettyPrint ≍ id
>2.
>
>prettyPrint . parser ≍ id -- ∷ String → String, useless here
>3.
>
>prettyPrint . parser . prettyPrint ≍ prettyPrint
>4.
>
>parser . prettyPrint . parser ≍ parser
>5. Well, you could go beyond to (prettyPrint . parser . prettyPrint .
>parser ≍ prettyPrint . parser) etc…
>
> I don't think 1 (or 2) follow from one of the last two. But 1 does imply
> them. So it is a stronger criterion than both, and therefore probably not
> the one to choose. Assuming the parser is internally consistent, 3 just
> says something about the internal consistency of the pretty printer, while
> 4 says something about the relationship of the pretty printer to the
> parser. Thus 4 looks like the best candidate for a criterion. Possibly with
> 3 as a secondary target.
>
> Cheers,
> MarLinn
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Operating on HsSyn

2017-07-28 Thread MarLinn

by

  (parser . prettyPrint . parser) = id

I meant

(prettyPrint . parser . prettyPrint) = id

for a valid input.

Simplifying, (parser ∷ String → something), and (prettyPrint ∷ something 
→ String).


Therefore, (parser . prettyPrint . parser ∷ String → something) and 
(prettyPrint . parser . prettyPrint ∷ something → String).


Therefore, both criteria could only apply for (something ~ String). But 
as pretty printing adds quotation marks, not even that is true.


There are four formulations that might be applicable:

1.

   parser . prettyPrint ≍ id

2.

   prettyPrint . parser ≍ id -- ∷ String → String, useless here

3.

   prettyPrint . parser . prettyPrint ≍ prettyPrint

4.

   parser . prettyPrint . parser ≍ parser

5. Well, you could go beyond to (prettyPrint . parser . prettyPrint.
   parser ≍prettyPrint. parser) etc…

I don't think 1 (or 2) follow from one of the last two. But 1 does imply 
them. So it is a stronger criterion than both, and therefore probably 
not the one to choose. Assuming the parser is internally consistent, 3 
just says something about the internal consistency of the pretty 
printer, while 4 says something about the relationship of the pretty 
printer to the parser. Thus 4 looks like the best candidate for a 
criterion. Possibly with 3 as a secondary target.


Cheers,
MarLinn

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Operating on HsSyn

2017-07-28 Thread Shayan Najd
by

 (parser . prettyPrint . parser) = id


I meant

 (prettyPrint . parser . prettyPrint) = id

for a valid input.


On Fri, Jul 28, 2017 at 4:32 PM, Shayan Najd <sh.n...@gmail.com> wrote:

> On the contrary, inside GHC I /do/ want to print them. Otherwise how can I
>> see what the renamer has done?
>
>
> Right. So if I understand correctly, with this semantics, `Outputable` is
> somewhere between pretty printing as often used in program manipulation
> libraries (like Haskell-Src-Exts (HSE)) which is closer to syntax, and
> `Show` which is closer to Haskell representation.
> (There are also "exact printers" (as in HSE) that are even closer to
> syntax in some sense.)
> Often, pretty printers generate only grammatically valid terms, not the
> ones polluted with extra annotations (hence grammatically invalid), e.g.,
> what is the grammatically valid form of `OpApp` printed via `Outputable`
> that includes the fixity annotation.
>
> If I recall correctly, we have briefly studied these in the past summer,
> we came up with some roundtrip correctness criteria, like the following
> (bar error handling; assume valid input):
>
>  (parser . prettyPrint . parser) = id
>
> [paging in Jacques]
>
> The reason I am trying to flesh out the semantics is the /big/ gains on
> code reuse later on in the process: one does not need to define a separate
> pretty printing library for Haskell syntax, and can reuse the well-tested
> and well-maintained one in GHC.
>
> Reformulating part of your concern, based on my understanding (if I may),
> the questions is: what is the proper design of an "outputer"
> (debug-printer?) where /annotated/ terms can be pretty-printed including
> any printable (pretty?showable?) annotations.
> In particular, we may want to take advantage of extensibility of data
> types for this.
> Am I far off?
>
> Note: with proper design, an extensible debug-printer can still subsume
> corresponding pretty-printers.
>
>
> On Fri, Jul 28, 2017 at 2:36 PM, Simon Peyton Jones <simo...@microsoft.com
> > wrote:
>
>> I have been under the impression that we don't even want to print those.
>>
>> On the contrary, inside GHC I /do/ want to print them. Otherwise how can
>> I see what the renamer has done?
>>
>>
>>
>> Simon
>>
>>
>>
>> *From:* Shayan Najd [mailto:sh.n...@gmail.com]
>> *Sent:* 28 July 2017 12:20
>> *To:* Simon Peyton Jones <simo...@microsoft.com>
>> *Cc:* ghc-devs@haskell.org; Alan & Kim Zimmerman <alan.z...@gmail.com>
>> *Subject:* Re: Operating on HsSyn
>>
>>
>>
>> Before all this, we may need to discuss a bit about the intended
>> semantics of
>>
>> `Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
>>
>> suffixed constructors?  If not, then we only need to write a set of
>> instances
>>
>> for the base growable AST, once and for all.  Such instances will be
>> polymorphic
>>
>> on the extension descriptor `p`, and do not need to mention the
>> constraints like
>>
>> `(PostRn p (IdP p)`, since these are just extensions and not part of the
>> base
>>
>> growable AST.  Or, am I missing something about the intended semantics of
>>
>> `Outputable`?
>>
>>
>>
>> You write
>>
>>
>>
>> So today we never print these annotations, to avoid bloating the instance
>> contexts, which can be painful.
>>
>>
>>
>> I have been under the impression that we don't even want to print those.
>>
>>
>>
>> Of course, there are scenarios (like `Show` instances) where we do want
>> to write
>>
>> compositional / generic functions that take into account the extensions.
>>
>> Here is my abstract overview of the scenario, that may help the
>> discussion.
>>
>> Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler
>> phase) that
>>
>>  are defined as extensions to a base datatype `T`:
>>
>>
>>
>> > A = T XA
>>
>> > B = T XB
>>
>> > C = T XC
>>
>>
>>
>> where `X*`s are extension descriptors.
>>
>> Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`,
>> `B`, and `C`.
>>
>> We have two main alternatives:
>>
>> (a) either we write these  (manually or using the deriving mechanism)
>> separately
>>
>> (b) or we write a generic / parametric function `g` over `T`, and reuse
>> that to define `f_*`s
>>
>>
>>
>> Of course, (b) is p

Re: Operating on HsSyn

2017-07-28 Thread Shayan Najd
>
> On the contrary, inside GHC I /do/ want to print them. Otherwise how can I
> see what the renamer has done?


Right. So if I understand correctly, with this semantics, `Outputable` is
somewhere between pretty printing as often used in program manipulation
libraries (like Haskell-Src-Exts (HSE)) which is closer to syntax, and
`Show` which is closer to Haskell representation.
(There are also "exact printers" (as in HSE) that are even closer to syntax
in some sense.)
Often, pretty printers generate only grammatically valid terms, not the
ones polluted with extra annotations (hence grammatically invalid), e.g.,
what is the grammatically valid form of `OpApp` printed via `Outputable`
that includes the fixity annotation.

If I recall correctly, we have briefly studied these in the past summer, we
came up with some roundtrip correctness criteria, like the following (bar
error handling; assume valid input):

 (parser . prettyPrint . parser) = id

[paging in Jacques]

The reason I am trying to flesh out the semantics is the /big/ gains on
code reuse later on in the process: one does not need to define a separate
pretty printing library for Haskell syntax, and can reuse the well-tested
and well-maintained one in GHC.

Reformulating part of your concern, based on my understanding (if I may),
the questions is: what is the proper design of an "outputer"
(debug-printer?) where /annotated/ terms can be pretty-printed including
any printable (pretty?showable?) annotations.
In particular, we may want to take advantage of extensibility of data types
for this.
Am I far off?

Note: with proper design, an extensible debug-printer can still subsume
corresponding pretty-printers.


On Fri, Jul 28, 2017 at 2:36 PM, Simon Peyton Jones <simo...@microsoft.com>
wrote:

> I have been under the impression that we don't even want to print those.
>
> On the contrary, inside GHC I /do/ want to print them. Otherwise how can I
> see what the renamer has done?
>
>
>
> Simon
>
>
>
> *From:* Shayan Najd [mailto:sh.n...@gmail.com]
> *Sent:* 28 July 2017 12:20
> *To:* Simon Peyton Jones <simo...@microsoft.com>
> *Cc:* ghc-devs@haskell.org; Alan & Kim Zimmerman <alan.z...@gmail.com>
> *Subject:* Re: Operating on HsSyn
>
>
>
> Before all this, we may need to discuss a bit about the intended semantics
> of
>
> `Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
>
> suffixed constructors?  If not, then we only need to write a set of
> instances
>
> for the base growable AST, once and for all.  Such instances will be
> polymorphic
>
> on the extension descriptor `p`, and do not need to mention the
> constraints like
>
> `(PostRn p (IdP p)`, since these are just extensions and not part of the
> base
>
> growable AST.  Or, am I missing something about the intended semantics of
>
> `Outputable`?
>
>
>
> You write
>
>
>
> So today we never print these annotations, to avoid bloating the instance
> contexts, which can be painful.
>
>
>
> I have been under the impression that we don't even want to print those.
>
>
>
> Of course, there are scenarios (like `Show` instances) where we do want to
> write
>
> compositional / generic functions that take into account the extensions.
>
> Here is my abstract overview of the scenario, that may help the discussion.
>
> Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler
> phase) that
>
>  are defined as extensions to a base datatype `T`:
>
>
>
> > A = T XA
>
> > B = T XB
>
> > C = T XC
>
>
>
> where `X*`s are extension descriptors.
>
> Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `
> B`, and `C`.
>
> We have two main alternatives:
>
> (a) either we write these  (manually or using the deriving mechanism)
> separately
>
> (b) or we write a generic / parametric function `g` over `T`, and reuse
> that to define `f_*`s
>
>
>
> Of course, (b) is preferable in theory , but not always possible or
> preferable in practice.
>
> In which case, we can always resort to (a).
>
> The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more
> parametric should
>
> `g` get, as this is the case for any generic function.
>
>
>
> With a correct design, I believe, these are all independent of Trees that
> Grow story itself:
>
> we are now not only trying to reuse data types, and functions agnostic
> towards extensions
>
> (pretty printers in my view of their semantics), but also reuse functions
> with parametric /
>
> varying behaviour with respect to extensions.
>
>
>
> /Shayan
>
>
>
>
>
>
>
> On Fri, Jul 28, 2017 

RE: Operating on HsSyn

2017-07-28 Thread Simon Peyton Jones via ghc-devs
I have been under the impression that we don't even want to print those.
On the contrary, inside GHC I /do/ want to print them. Otherwise how can I see 
what the renamer has done?

Simon

From: Shayan Najd [mailto:sh.n...@gmail.com]
Sent: 28 July 2017 12:20
To: Simon Peyton Jones <simo...@microsoft.com>
Cc: ghc-devs@haskell.org; Alan & Kim Zimmerman <alan.z...@gmail.com>
Subject: Re: Operating on HsSyn

Before all this, we may need to discuss a bit about the intended semantics of
`Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
suffixed constructors?  If not, then we only need to write a set of instances
for the base growable AST, once and for all.  Such instances will be polymorphic
on the extension descriptor `p`, and do not need to mention the constraints like
`(PostRn p (IdP p)`, since these are just extensions and not part of the base
growable AST.  Or, am I missing something about the intended semantics of
`Outputable`?

You write

So today we never print these annotations, to avoid bloating the instance 
contexts, which can be painful.

I have been under the impression that we don't even want to print those.

Of course, there are scenarios (like `Show` instances) where we do want to write
compositional / generic functions that take into account the extensions.
Here is my abstract overview of the scenario, that may help the discussion.
Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler 
phase) that
 are defined as extensions to a base datatype `T`:

> A = T XA
> B = T XB
> C = T XC

where `X*`s are extension descriptors.
Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `B`, 
and `C`.
We have two main alternatives:
(a) either we write these  (manually or using the deriving mechanism) separately
(b) or we write a generic / parametric function `g` over `T`, and reuse that to 
define `f_*`s

Of course, (b) is preferable in theory , but not always possible or preferable 
in practice.
In which case, we can always resort to (a).
The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more 
parametric should
`g` get, as this is the case for any generic function.

With a correct design, I believe, these are all independent of Trees that Grow 
story itself:
we are now not only trying to reuse data types, and functions agnostic towards 
extensions
(pretty printers in my view of their semantics), but also reuse functions with 
parametric /
varying behaviour with respect to extensions.

/Shayan



On Fri, Jul 28, 2017 at 10:18 AM, Simon Peyton Jones 
<simo...@microsoft.com<mailto:simo...@microsoft.com>> wrote:
Devs,

Shayan is working away on “Trees that grow”… do keep it on your radar:

To: ghc-devs
Sent: 25 May 2017 23:49
Do take a look at this:


• We propose to re-engineer HsSyn itself.  This will touch a lot of 
code.

• But it’s very neat, and will bring big long-term advantages

• And we can do it a bit at a time

The wiki page https://ghc.haskell.org/trac/ghc/wiki/ImplementingTreesThatGrow 
has the details.  It’s entirely an internal change, not a change to GHC’s 
specification, so it’s independent of the GHC proposals process.  But I’d value 
the opinion of other GHC devs

Meanwhile I have a question. When pretty-printing HsSyn we often have a 
situation like this:


  data Pass = Parsed | Renamed | Typechecked



  data HsExpr (p :: Pass) = HsVar (IdP p) | 



  type famliy IdP p where

 IdP Parsed  = RdrName

 IdP Renamed = Name

 IdP Typechecked = Id



  instance (Outputable (IdP p)) => Outputable (HsExpr p) where

 ppr (HsVar v) = ppr v

The (ppr v) requires (Outputable (IdP p)), hence the context.

Moreover, and more seriously, there are things we just can't pretty-print
right now.  For example, HsExpr has this data constructor:


  data HsExpr p = ...

| OpApp   (LHsExpr p)

  (LHsExpr p)

  (PostRn p Fixity)

  (LHsExpr p)

To pretty-print the third argument, we'd need to add


  instance (Outputable (IdP p),

Outputable (PostRn p Fixity))   -- New

=> Outputable (HsExpr p) where

 ppr (HsVar v) = ppr v


and that gets onerous.  So today we never print these annotations, to avoid 
bloating the instance contexts, which can be painful.  It bit me yesterday.

We have bitten that bullet for the Data class: look at HsExtension.DataId, 
which abbreviates the long list of dictionaries:


  type DataId p =

( Data p

, ForallX Data p

, Data (NameOrRdrName (IdP p))

, Data (IdP p)

, Data (PostRn p (IdP p))

, Data (PostRn p (Located Name))

, Data (PostRn p Bool)

, Data (PostRn p Fixity)

 ,..and nine more... )

Let me note in passing that 
[wiki:QuantifiedContexts<https://ghc.haskell.org/trac/ghc/wiki/QuantifiedContexts>]
 would make this somewhat shorter


  type DataId p =

( Data p

,

Re: Operating on HsSyn

2017-07-28 Thread Shayan Najd
Before all this, we may need to discuss a bit about the intended semantics
of
`Outputable`: does it need to print `PostRn`, or `PostTc` fields; or `Out`
suffixed constructors?  If not, then we only need to write a set of
instances
for the base growable AST, once and for all.  Such instances will be
polymorphic
on the extension descriptor `p`, and do not need to mention the constraints
like
`(PostRn p (IdP p)`, since these are just extensions and not part of the
base
growable AST.  Or, am I missing something about the intended semantics of
`Outputable`?

You write

So today we never print these annotations, to avoid bloating the instance
> contexts, which can be painful.
>

I have been under the impression that we don't even want to print those.

Of course, there are scenarios (like `Show` instances) where we do want to
write
compositional / generic functions that take into account the extensions.
Here is my abstract overview of the scenario, that may help the discussion.
Consider data types `A`, `B`, and `C` (say, one AST datatype per compiler
phase) that
 are defined as extensions to a base datatype `T`:

> A = T XA
> B = T XB
> C = T XC

where `X*`s are extension descriptors.
Now, say we want to a define functions `f_A`, `f_B`, and `f_C` over `A`, `B`,
and `C`.
We have two main alternatives:
(a) either we write these  (manually or using the deriving mechanism)
separately
(b) or we write a generic / parametric function `g` over `T`, and reuse
that to define `f_*`s

Of course, (b) is preferable in theory , but not always possible or
preferable in practice.
In which case, we can always resort to (a).
The more varying are the definitions of `f_A`, `f_B`, and `f_C` the more
parametric should
`g` get, as this is the case for any generic function.

With a correct design, I believe, these are all independent of Trees that
Grow story itself:
we are now not only trying to reuse data types, and functions agnostic
towards extensions
(pretty printers in my view of their semantics), but also reuse functions
with parametric /
varying behaviour with respect to extensions.

/Shayan



On Fri, Jul 28, 2017 at 10:18 AM, Simon Peyton Jones 
wrote:

> Devs,
>
>
>
> Shayan is working away on “Trees that grow”… do keep it on your radar:
>
>
>
> *To:* ghc-devs
> *Sent:* 25 May 2017 23:49
>
> Do take a look at this:
>
>
>
> · We propose to re-engineer HsSyn itself.  This will touch a *lot*
> of code.
>
> · But it’s very neat, and will bring big long-term advantages
>
> · And we can do it a bit at a time
>
>
>
> The wiki page https://ghc.haskell.org/trac/ghc/wiki/
> ImplementingTreesThatGrow has the details.  It’s entirely an internal
> change, not a change to GHC’s specification, so it’s independent of the GHC
> proposals process.  But I’d value the opinion of other GHC devs
>
>
>
> Meanwhile I have a question. When pretty-printing HsSyn we often have a
> situation like this:
>
>
>
>   data Pass = Parsed | Renamed | Typechecked
>
>
>
>   data HsExpr (p :: Pass) = HsVar (IdP p) | 
>
>
>
>   type famliy IdP p where
>
>  IdP Parsed  = RdrName
>
>  IdP Renamed = Name
>
>  IdP Typechecked = Id
>
>
>
>   instance (Outputable (IdP p)) => Outputable (HsExpr p) where
>
>  ppr (HsVar v) = ppr v
>
>
>
> The (ppr v) requires (Outputable (IdP p)), hence the context.
>
>
>
> Moreover, and more seriously, there are things we just can't pretty-print
>
> right now.  For example, HsExpr has this data constructor:
>
>
>
>   data HsExpr p = ...
>
> | OpApp   (LHsExpr p)
>
>   (LHsExpr p)
>
>   (PostRn p Fixity)
>
>   (LHsExpr p)
>
>
>
> To pretty-print the third argument, we'd need to add
>
>
>
>   instance (Outputable (IdP p),
>
> Outputable (PostRn p Fixity))   -- New
>
> => Outputable (HsExpr p) where
>
>  ppr (HsVar v) = ppr v
>
>
>
> and that gets onerous.  *So today we never print these annotations*, to
> avoid bloating the instance contexts, which can be painful.  It bit me
> yesterday.
>
>
>
> We have bitten that bullet for the Data class: look at HsExtension.DataId,
> which abbreviates the long list of dictionaries:
>
>
>
>   type DataId p =
>
> ( Data p
>
> , ForallX Data p
>
> , Data (NameOrRdrName (IdP p))
>
> , Data (IdP p)
>
> , Data (PostRn p (IdP p))
>
> , Data (PostRn p (Located Name))
>
> , Data (PostRn p Bool)
>
> , Data (PostRn p Fixity)
>
>  ,..and nine more... )
>
>
>
> Let me note in passing that [wiki:QuantifiedContexts
> ] would make
> this somewhat shorter
>
>
>
>   type DataId p =
>
> ( Data p
>
> , ForallX Data p
>
> , Data (NameOrRdrName (IdP p))
>
> , Data (IdP p)
>
> , forall t. Data t => Data (PostRn p t))
>
>
>
> But we still need one item in this list for each type function,
>
> and I am worried about how this scales to the
>
>