Re: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread Alfredo Di Napoli
Morning all,

*Richard*: sorry! Unfortunately MR !4798 is the cornerstone of this
refactoring work but it's also gargantuan. Let's discuss a plan to attack
it, but fundamentally there is a critical mass of changes that needs to
happen atomically or it wouldn't make much sense, and alas this doesn't
play in our favour when it comes to MR size and ease of review. However, to
quickly reply to your remak: currently (for the sake of the
"minimum-viable-product") I am trying to stabilise the external interfaces,
by which I mean giving functions their final type signature while I do
what's easiest to make things typecheck. In this phase what I think is the
easiest is to wrap the majority of diagnostics into the `xxUnknownxx`
constructor, and change them gradually later. A fair warning, though: you
say "I would think that a DsMessage would later be wrapped in an envelope."
This might be true for Ds messages (didn't actually invest any brain cycles
to check that) but in general we have to turn a message into an envelope as
soon as we have a chance to do so, because we need to grab the `SrcSpan`
and the `DynFlags` *at the point of creation* of the diagnostics. Carrying
around a message and make it bubble up at some random point won't be a good
plan (even for Ds messages). Having said that, I clearly have very little
knowledge about this area of GHC, so feel free to disagree :)

*John*: Although it's a bit hard to predict how well this is going to
evolve, my current embedding, to refresh everyone's memory, is the
following:

data DsMessage =

DsUnknownMessage !DiagnosticMessage

  -- ^ Stop-gap constructor to ease the migration.

  | DsLiftedTcRnMessage !TcRnMessage

  -- ^ A diagnostic coming straight from the Typecheck-renamer.

  -- More messages added in the future, of course


At first I thought this was the wrong way around, due to Simon's comment,
but this actually creates pleasant external interfaces. To give you a bunch
of examples from MR !4798:


deSugar :: HscEnv -> ModLocation -> TcGblEnv -> IO (Messages DsMessage,
Maybe ModGuts)
deSugarExpr :: HscEnv -> LHsExpr GhcTc -> IO (Messages DsMessage, Maybe
CoreExpr)

Note something interesting: the second function actually calls
`runTcInteractive` inside the body, but thanks to the `DsLiftedTcRnMessage`
we can still expose to the consumer an opaque `DsMessage` , which is what I
would expect to see from a function called "deSugarExpr". Conversely, I
would be puzzled to find those functions returning a `TcRnDsMessage`.


Having said all of that, I am not advocating this design is "the best". I
am sure we will iterate on it. I am just reporting that even this baseline
seems to be decent from an API perspective :)


On Wed, 31 Mar 2021 at 05:45, John Ericson 
wrote:

> Alfredo also replied to this pointing his embedding plan. I also prefer
> that, because I really wish TH didn't smear together the phases so much.
> Moreover, I hope with
>
>  - GHC proposals https://github.com/ghc-proposals/ghc-proposals/pull/412
> / https://github.com/ghc-proposals/ghc-proposals/pull/243
>
>  - The parallelism work currently be planned in
> https://gitlab.haskell.org/ghc/ghc/-/wikis/Plan-for-increased-parallelism-and-more-detailed-intermediate-output
>
> we might actually have an opportunity/extra motivation to do that. Splices
> and quotes will still induce intricate inter-phase dependencies, but I hope
> that could be mediated by the driver rather than just baked into each phase.
>
> (One final step would be the "stuck macros" technique of
> https://www.youtube.com/watch?v=nUvKoG_V_U0 /
> https://github.com/gelisam/klister, where TH splices would be able to
> making "blocking queries" of the the compiler in ways that induce more of
> these fine-grained dependencies.)
>
> Anyways, while we could also do a "RnTsDsError" and split later, I hope
> Alfredo's alternative of embedding won't be too much harder and prepare us
> for these exciting areas of exploration.
>
> John
> On 3/30/21 10:14 AM, Richard Eisenberg wrote:
>
>
>
> On Mar 30, 2021, at 4:57 AM, Alfredo Di Napoli 
> wrote:
>
> I'll explore the idea of adding a second IORef.
>
>
> Renaming/type-checking is already mutually recursive. (The renamer must
> call the type-checker in order to rename -- that is, evaluate -- untyped
> splices. I actually can't recall why the type-checker needs to call the
> renamer.) So we will have a TcRnError. Now we see that the desugarer ends
> up mixed in, too. We could proceed how Alfredo suggests, by adding a second
> IORef. Or we could just make TcRnDsError (maybe renaming that).
>
> What's the disadvantage? Clients will have to potentially know about all
> the different error forms with either approach (that is, using my combined
> type or using multiple IORefs). The big advantage to separating is maybe
> module dependencies? But my guess is that the dependencies won't be an
> issue here, due to the fact that these components are already leaning on
> each other. Maybe the advantage is 

Re: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread John Ericson
Alfredo also replied to this pointing his embedding plan. I also prefer 
that, because I really wish TH didn't smear together the phases so much. 
Moreover, I hope with


 - GHC proposals 
https://github.com/ghc-proposals/ghc-proposals/pull/412 / 
https://github.com/ghc-proposals/ghc-proposals/pull/243


 - The parallelism work currently be planned in 
https://gitlab.haskell.org/ghc/ghc/-/wikis/Plan-for-increased-parallelism-and-more-detailed-intermediate-output 



we might actually have an opportunity/extra motivation to do that. 
Splices and quotes will still induce intricate inter-phase dependencies, 
but I hope that could be mediated by the driver rather than just baked 
into each phase.


(One final step would be the "stuck macros" technique of 
https://www.youtube.com/watch?v=nUvKoG_V_U0 / 
https://github.com/gelisam/klister, where TH splices would be able to 
making "blocking queries" of the the compiler in ways that induce more 
of these fine-grained dependencies.)


Anyways, while we could also do a "RnTsDsError" and split later, I hope 
Alfredo's alternative of embedding won't be too much harder and prepare 
us for these exciting areas of exploration.


John

On 3/30/21 10:14 AM, Richard Eisenberg wrote:



On Mar 30, 2021, at 4:57 AM, Alfredo Di Napoli 
mailto:alfredo.dinap...@gmail.com>> wrote:


I'll explore the idea of adding a second IORef.


Renaming/type-checking is already mutually recursive. (The renamer 
must call the type-checker in order to rename -- that is, evaluate -- 
untyped splices. I actually can't recall why the type-checker needs to 
call the renamer.) So we will have a TcRnError. Now we see that the 
desugarer ends up mixed in, too. We could proceed how Alfredo 
suggests, by adding a second IORef. Or we could just make TcRnDsError 
(maybe renaming that).


What's the disadvantage? Clients will have to potentially know about 
all the different error forms with either approach (that is, using my 
combined type or using multiple IORefs). The big advantage to 
separating is maybe module dependencies? But my guess is that the 
dependencies won't be an issue here, due to the fact that these 
components are already leaning on each other. Maybe the advantage is 
just in having smaller types? Maybe.


I don't have a great sense as to what to do here, but I would want a 
clear reason that e.g. the TcRn monad would have two IORefs, while 
other monads will work with GhcMessage (instead of a whole bunch of 
IORefs).


Richard

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Type inference of singular matches on GADTs

2021-03-30 Thread Alexis King

On 3/28/21 9:17 PM, Richard Eisenberg wrote:



I think this is the key part of Alexis's plea: that the type checker 
take into account exhaustivity in choosing how to proceed.


[…]

Does this match your understanding?


Yes, precisely. :) Without GADTs, exhaustivity doesn’t yield any useful 
information to the typechecker, but with them, it can.


I agree with Simon that it seems tricky—his examples are good ones—and I 
agree with Richard that I don’t know if this is actually a good or 
fruitful idea. I’m certainly not demanding anyone else produce a 
solution! But I was curious if anyone had explored this before, and it 
sounds like perhaps the answer is “no.” Fair enough! I still appreciate 
the discussion.


Alexis

___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread Alfredo Di Napoli
Hello folks,

Richard: as I was in the middle of some other refactoring by the time Simon
replied, you can see a potential refactoring that *doesn't* use the double
IORef, but rather this idea of having a `DsMessage` embed `TcRnMessage`(s)
via a new costructor:

https://gitlab.haskell.org/ghc/ghc/-/merge_requests/4798/diffs#6eaba7424490cb26d74e0dab0f6fd7bc3537dca7

(Just grep for "DsMessage", "DsUnknownMessage", and `DsLiftedTcRnMessage`
to see the call sites).

The end result is not bad, I have to say. Or, at least, it doesn't
strike me as totally horrid :)

A.



On Tue, 30 Mar 2021 at 16:14, Richard Eisenberg  wrote:

>
>
> On Mar 30, 2021, at 4:57 AM, Alfredo Di Napoli 
> wrote:
>
> I'll explore the idea of adding a second IORef.
>
>
> Renaming/type-checking is already mutually recursive. (The renamer must
> call the type-checker in order to rename -- that is, evaluate -- untyped
> splices. I actually can't recall why the type-checker needs to call the
> renamer.) So we will have a TcRnError. Now we see that the desugarer ends
> up mixed in, too. We could proceed how Alfredo suggests, by adding a second
> IORef. Or we could just make TcRnDsError (maybe renaming that).
>
> What's the disadvantage? Clients will have to potentially know about all
> the different error forms with either approach (that is, using my combined
> type or using multiple IORefs). The big advantage to separating is maybe
> module dependencies? But my guess is that the dependencies won't be an
> issue here, due to the fact that these components are already leaning on
> each other. Maybe the advantage is just in having smaller types? Maybe.
>
> I don't have a great sense as to what to do here, but I would want a clear
> reason that e.g. the TcRn monad would have two IORefs, while other monads
> will work with GhcMessage (instead of a whole bunch of IORefs).
>
> Richard
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread Richard Eisenberg


> On Mar 30, 2021, at 4:57 AM, Alfredo Di Napoli  
> wrote:
> 
> I'll explore the idea of adding a second IORef.

Renaming/type-checking is already mutually recursive. (The renamer must call 
the type-checker in order to rename -- that is, evaluate -- untyped splices. I 
actually can't recall why the type-checker needs to call the renamer.) So we 
will have a TcRnError. Now we see that the desugarer ends up mixed in, too. We 
could proceed how Alfredo suggests, by adding a second IORef. Or we could just 
make TcRnDsError (maybe renaming that).

What's the disadvantage? Clients will have to potentially know about all the 
different error forms with either approach (that is, using my combined type or 
using multiple IORefs). The big advantage to separating is maybe module 
dependencies? But my guess is that the dependencies won't be an issue here, due 
to the fact that these components are already leaning on each other. Maybe the 
advantage is just in having smaller types? Maybe.

I don't have a great sense as to what to do here, but I would want a clear 
reason that e.g. the TcRn monad would have two IORefs, while other monads will 
work with GhcMessage (instead of a whole bunch of IORefs).

Richard___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Multiple versions of happy

2021-03-30 Thread Oleg Grenrus
One way is

cabal install happy --program-suffix=-1.19 --constraint='happy ^>=1.19'
cabal install happy --program-suffix=-1.20 --constraint='happy ^>=1.20'

happy-1.19 --version
Happy Version 1.19.12 Copyright (c) 1993-1996 Andy Gill, Simon Marlow
(c) 1997-2005 Simon Marlow

happy-1.20 --version
Happy Version 1.20.0 Copyright (c) 1993-1996 Andy Gill, Simon Marlow (c)
1997-2005 Simon Marlow

- Oleg

On 30.3.2021 16.22, Simon Peyton Jones via ghc-devs wrote:
>
> That’s (2), thanks.  How about (1)?
>
>  
>
> *From:*Sebastian Graf 
> *Sent:* 30 March 2021 14:22
> *To:* Simon Peyton Jones 
> *Cc:* ghc-devs@haskell.org
> *Subject:* Re: Multiple versions of happy
>
>  
>
> Hi Simon,
>
>  
>
> According to the configure script, you can use the HAPPY env variable.
> e.g.
>
>  
>
> $ HAPPY=/full/path/to/happy ./configure
>
>  
>
> Hope that helps. Cheers,
>
> Sebastian
>
>  
>
> Am Di., 30. März 2021 um 15:19 Uhr schrieb Simon Peyton Jones via
> ghc-devs mailto:ghc-devs@haskell.org>>:
>
> What’s the approved mechanism to install multiple versions of
> happy/alex etc?  Eg I tried to build ghc-9.0 and got this:
>
> checking for makeinfo... no
>
> checking for python3... /usr/bin/python3
>
> checking for ghc-pkg matching /opt/ghc/bin/ghc... /opt/ghc/bin/ghc-pkg
>
> checking for happy... /home/simonpj/.cabal/bin/happy
>
> checking for version of happy... 1.20.0
>
> configure: error: Happy version 1.19 is required to compile GHC.
>
>  
>
> I so I have to
>
>  1. Install happy-1.19 without overwriting the installed happy-1.20
>  2. Tell configure to use happy-1.19
>
> What’s the best way to do those two things?
>
> Thanks
>
> Simon
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org 
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
> 
> 
>
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: Multiple versions of happy

2021-03-30 Thread Simon Peyton Jones via ghc-devs
That's (2), thanks.  How about (1)?

From: Sebastian Graf 
Sent: 30 March 2021 14:22
To: Simon Peyton Jones 
Cc: ghc-devs@haskell.org
Subject: Re: Multiple versions of happy

Hi Simon,

According to the configure script, you can use the HAPPY env variable. e.g.

$ HAPPY=/full/path/to/happy ./configure

Hope that helps. Cheers,
Sebastian

Am Di., 30. März 2021 um 15:19 Uhr schrieb Simon Peyton Jones via ghc-devs 
mailto:ghc-devs@haskell.org>>:
What's the approved mechanism to install multiple versions of happy/alex etc?  
Eg I tried to build ghc-9.0 and got this:

checking for makeinfo... no

checking for python3... /usr/bin/python3

checking for ghc-pkg matching /opt/ghc/bin/ghc... /opt/ghc/bin/ghc-pkg

checking for happy... /home/simonpj/.cabal/bin/happy

checking for version of happy... 1.20.0

configure: error: Happy version 1.19 is required to compile GHC.


I so I have to

  1.  Install happy-1.19 without overwriting the installed happy-1.20
  2.  Tell configure to use happy-1.19
What's the best way to do those two things?
Thanks
Simon
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Multiple versions of happy

2021-03-30 Thread Sebastian Graf
Hi Simon,

According to the configure script, you can use the HAPPY env variable. e.g.

$ HAPPY=/full/path/to/happy ./configure

Hope that helps. Cheers,
Sebastian

Am Di., 30. März 2021 um 15:19 Uhr schrieb Simon Peyton Jones via ghc-devs <
ghc-devs@haskell.org>:

> What’s the approved mechanism to install multiple versions of happy/alex
> etc?  Eg I tried to build ghc-9.0 and got this:
>
> checking for makeinfo... no
>
> checking for python3... /usr/bin/python3
>
> checking for ghc-pkg matching /opt/ghc/bin/ghc... /opt/ghc/bin/ghc-pkg
>
> checking for happy... /home/simonpj/.cabal/bin/happy
>
> checking for version of happy... 1.20.0
>
> configure: error: Happy version 1.19 is required to compile GHC.
>
>
>
> I so I have to
>
>1. Install happy-1.19 without overwriting the installed happy-1.20
>2. Tell configure to use happy-1.19
>
> What’s the best way to do those two things?
>
> Thanks
>
> Simon
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Multiple versions of happy

2021-03-30 Thread Simon Peyton Jones via ghc-devs
What's the approved mechanism to install multiple versions of happy/alex etc?  
Eg I tried to build ghc-9.0 and got this:

checking for makeinfo... no

checking for python3... /usr/bin/python3

checking for ghc-pkg matching /opt/ghc/bin/ghc... /opt/ghc/bin/ghc-pkg

checking for happy... /home/simonpj/.cabal/bin/happy

checking for version of happy... 1.20.0

configure: error: Happy version 1.19 is required to compile GHC.


I so I have to

  1.  Install happy-1.19 without overwriting the installed happy-1.20
  2.  Tell configure to use happy-1.19
What's the best way to do those two things?
Thanks
Simon
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread Alfredo Di Napoli
Right, I see, thanks.

This is what I was attempting so far:

data DsMessage =
DsUnknownMessage !DiagnosticMessage
  | DsLiftedTcRnMessage !TcRnMessage
  -- ^ A diagnostic coming straight from the Typecheck-renamer.

and later:

liftTcRnMessages :: MonadIO m => IORef (Messages TcRnMessage) -> m (IORef
(Messages DsMessage))
liftTcRnMessages ref = liftIO $ do
  oldContent <- readIORef ref
  newIORef (DsLiftedTcRnMessage <$> oldContent)

...

mkDsEnvsFromTcGbl :: MonadIO m
  => HscEnv -> IORef (Messages TcRnMessage) -> TcGblEnv
  -> m (DsGblEnv, DsLclEnv)
mkDsEnvsFromTcGbl hsc_env msg_var tcg_env
  = do { cc_st_var   <- liftIO $ newIORef newCostCentreState
   ...
   ; msg_var' <- liftTcRnMessages msg_var
   ; return $ mkDsEnvs unit_env this_mod rdr_env type_env fam_inst_env
   msg_var' cc_st_var complete_matches
   }


While this typechecks, I wonder if that's the right way to think about it
-- from your reply, it seems like the dependency is in the opposite
direction -- we need to store desugaring diagnostics in the TcM due to TH
splicing, not the other way around.

I'll explore the idea of adding a second IORef.

Thanks!


On Tue, 30 Mar 2021 at 10:51, Simon Peyton Jones 
wrote:

> I think the main reason is that for Template Haskell the
> renamer/type-checker need to run the desugarer.  See the call to initDsTc
> in GHC.Tc.Gen.Splice.
>
>
>
> I suppose an alternative is that the TcGblEnv could have a second IORef to
> use for error messages that come from desugaring during TH splices.
>
>
>
> Nothing deeper than that I think.
>
>
>
> Simon
>
>
>
> *From:* ghc-devs  *On Behalf Of *Alfredo Di
> Napoli
> *Sent:* 30 March 2021 08:42
> *To:* Simon Peyton Jones via ghc-devs 
> *Subject:* Why TcLclEnv and DsGblEnv need to store the same IORef for
> errors?
>
>
>
> Hello folks,
>
>
>
> as some of you might know me and Richard are reworking how GHC constructs,
> emits and deals with errors and warnings (See
> https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values
> 
> and #18516).
>
>
>
> To summarise very briefly the spirit, we will have (eventually) proper
> domain-specific types instead of SDocs. The idea is to have very precise
> and "focused" types for the different phases of the compilation pipeline,
> and a "catch-all" monomorphic `GhcMessage` type used for the final
> pretty-printing and exception-throwing:
>
>
>
> data GhcMessage where
>
>   GhcPsMessage  :: PsMessage -> GhcMessage
>
>   GhcTcRnMessage:: TcRnMessage -> GhcMessage
>
>   GhcDsMessage  :: DsMessage -> GhcMessage
>
>   GhcDriverMessage  :: DriverMessage -> GhcMessage
>
>   GhcUnknownMessage :: forall a. (Diagnostic a, Typeable a) => a ->
> GhcMessage
>
>
>
> While starting to refactor GHC to use these types, I have stepped into
> something bizarre: the `DsGblEnv` and `TcLclEnv` envs both share the same
> `IORef` to store the diagnostics (i.e. warnings and errors) accumulated
> during compilation. More specifically, a function like
> `GHC.HsToCore.Monad.mkDsEnvsFromTcGbl` simply receives as input the `IORef`
> coming straight from the `TcLclEnv`, and stores it into the `DsGblEnv`.
>
>
>
> This is unfortunate, because it would force me to change the type of this
> `IORef` to be
>
> `IORef (Messages GhcMessage)` to accommodate both diagnostic types, but
> this would bubble up into top-level functions like `initTc`, which would
> now return a `Messages GhcMessage`. This is once again unfortunate, because
> is "premature": ideally it might still be nice to return `Messages
> TcRnMessage`, so that GHC API users could get a very precise diagnostic
> type rather than the bag `GhcMessage` is. It also violates an implicit
> contract: we are saying that `initTc` might return (potentially) *any* GHC
> diagnostic message (including, for example, driver errors/warnings), which
> I think is misleading.
>
>
>
> Having said all of that, it's also possible that returning `Messages
> GhcMessage` is totally fine here and we don't need to be able to do this
> fine-grained distinction for the GHC API functions. Regardless, I would
> like to ask the audience:
>
>
>
> * Why `TcLclEnv` and `DsGblEnv` need to share the same IORef?
>
> * Is this for efficiency reasons?
>
> * Is this because we need the two monads to independently accumulate
> errors into the
>
>   same IORef?
>
>
>
> Thanks!
>
>
>
> Alfredo
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
___
ghc-devs mailing list
ghc

RE: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread Simon Peyton Jones via ghc-devs
I think the main reason is that for Template Haskell the renamer/type-checker 
need to run the desugarer.  See the call to initDsTc in GHC.Tc.Gen.Splice.

I suppose an alternative is that the TcGblEnv could have a second IORef to use 
for error messages that come from desugaring during TH splices.

Nothing deeper than that I think.

Simon

From: ghc-devs  On Behalf Of Alfredo Di Napoli
Sent: 30 March 2021 08:42
To: Simon Peyton Jones via ghc-devs 
Subject: Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

Hello folks,

as some of you might know me and Richard are reworking how GHC constructs, 
emits and deals with errors and warnings (See 
https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values
 and #18516).

To summarise very briefly the spirit, we will have (eventually) proper 
domain-specific types instead of SDocs. The idea is to have very precise and 
"focused" types for the different phases of the compilation pipeline, and a 
"catch-all" monomorphic `GhcMessage` type used for the final pretty-printing 
and exception-throwing:

data GhcMessage where
  GhcPsMessage  :: PsMessage -> GhcMessage
  GhcTcRnMessage:: TcRnMessage -> GhcMessage
  GhcDsMessage  :: DsMessage -> GhcMessage
  GhcDriverMessage  :: DriverMessage -> GhcMessage
  GhcUnknownMessage :: forall a. (Diagnostic a, Typeable a) => a -> GhcMessage

While starting to refactor GHC to use these types, I have stepped into 
something bizarre: the `DsGblEnv` and `TcLclEnv` envs both share the same 
`IORef` to store the diagnostics (i.e. warnings and errors) accumulated during 
compilation. More specifically, a function like 
`GHC.HsToCore.Monad.mkDsEnvsFromTcGbl` simply receives as input the `IORef` 
coming straight from the `TcLclEnv`, and stores it into the `DsGblEnv`.

This is unfortunate, because it would force me to change the type of this 
`IORef` to be
`IORef (Messages GhcMessage)` to accommodate both diagnostic types, but this 
would bubble up into top-level functions like `initTc`, which would now return 
a `Messages GhcMessage`. This is once again unfortunate, because is 
"premature": ideally it might still be nice to return `Messages TcRnMessage`, 
so that GHC API users could get a very precise diagnostic type rather than the 
bag `GhcMessage` is. It also violates an implicit contract: we are saying that 
`initTc` might return (potentially) *any* GHC diagnostic message (including, 
for example, driver errors/warnings), which I think is misleading.

Having said all of that, it's also possible that returning `Messages 
GhcMessage` is totally fine here and we don't need to be able to do this 
fine-grained distinction for the GHC API functions. Regardless, I would like to 
ask the audience:

* Why `TcLclEnv` and `DsGblEnv` need to share the same IORef?
* Is this for efficiency reasons?
* Is this because we need the two monads to independently accumulate errors 
into the
  same IORef?

Thanks!

Alfredo












___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


RE: Type inference of singular matches on GADTs

2021-03-30 Thread Simon Peyton Jones via ghc-devs
I'm not saying this is a good idea for GHC or that it's implementable. But the 
idea of having type inference account for exhaustivity in this way does not 
seem, a priori, unspecified.

No, but I’m pointing out that specifying it might be tricky, involving some 
highly non-local reasoning.  I can’t yet see how to write a formal 
specification.   Note “yet”  -- growth mindset!

Simon

From: Richard Eisenberg 
Sent: 30 March 2021 04:58
To: Simon Peyton Jones 
Cc: Alexis King ; ghc-devs@haskell.org
Subject: Re: Type inference of singular matches on GADTs

As usual, I want to separate out the specification of a feature from the 
implementation. So let's just focus on specification for now -- with the caveat 
that there might be no possible implementation of these ideas.

The key innovation I see lurking here is the idea of an *exhaustive* function, 
where we know that any pattern-match on an argument is always exhaustive. I 
will write such a thing with @->, in both the type and in the arrow that 
appears after the lambda. The @-> type is a subtype of -> (and perhaps does not 
need to be written differently from ->).

EX1: \x @-> case x of HNil -> blah

This is easy: we can infer HList '[] @-> blah's type, because the pattern match 
is declared to be exhaustive, and no other type grants that property.

EX2: \x @-> (x, case x of HNit -> blah)

Same as EX1.

EX3: \x @-> case x of { HNil1 -> blah; HNil2 -> blah }

Same as EX1. There is still a unique type for which the patten-match is 
exhaustive.

EX4: Reject. There are multiple valid types, and we don't know which one to 
pick. This is like classic untouchable-variables territory.

EX5: This is hard. A declarative spec would probably choose HL2 [a] -> ... as 
you suggest, but there may be no implementation of such an idea.

EX6: Reject. No type leads to exhaustive matches.

I'm not saying this is a good idea for GHC or that it's implementable. But the 
idea of having type inference account for exhaustivity in this way does not 
seem, a priori, unspecified.

Richard




On Mar 29, 2021, at 5:00 AM, Simon Peyton Jones 
mailto:simo...@microsoft.com>> wrote:

I haven't thought about how to implement such a thing. At the least, it would 
probably require some annotation saying that we expect `\HNil -> ()` to be 
exhaustive (as GHC won't, in general, make that assumption). Even with that, 
could we get type inference to behave? Possibly.

As I wrote in another post on this thread, it’s a bit tricky.

What would you expect of (EX1)

   \x -> case x of HNil -> blah

Here the lambda and the case are separated

Now (EX2)

\x -> (x, case x of HNil -> blah)

Here the lambda and the case are separated more, and x is used twice.
What if there are more data constructors that share a common return type? (EX3)

data HL2 a where
HNil1 :: HL2 []
HNil2 :: HL2 []
HCons :: …blah…

\x -> case x of { HNil1 -> blah; HNil 2 -> blah  }

Here HNil1 and HNil2 both return HL2 [].   Is that “singular”?

What if one was a bit more general than the other?   Do we seek the least 
common generalisation of the alternatives given? (EX4)

data HL3 a where
HNil1 :: HL2 [Int]
HNil2 :: HL2 [a]
HCons :: …blah…

\x -> case x of { HNil1 -> blah; HNil 2 -> blah  }

What if the cases were incompatible?  (EX5)

data HL4 a where
HNil1 :: HL2 [Int]
HNil2 :: HL2 [Bool]
HCons :: …blah…

\x -> case x of { HNil1 -> blah; HNil 2 -> blah  }

Would you expect that to somehow generalise to `HL4 [a] -> blah`?

What if x matched multiple times, perhaps on different constructors (EX6)


\x -> (case s of HNil1 -> blah1;  case x of HNil2 -> blah)


The water gets deep quickly here.  I don’t (yet) see an obviously-satisfying 
design point that isn’t massively ad-hoc.

Simon


From: ghc-devs 
mailto:ghc-devs-boun...@haskell.org>> On Behalf 
Of Richard Eisenberg
Sent: 29 March 2021 03:18
To: Alexis King mailto:lexi.lam...@gmail.com>>
Cc: ghc-devs@haskell.org
Subject: Re: Type inference of singular matches on GADTs





On Mar 26, 2021, at 8:41 PM, Alexis King 
mailto:lexi.lam...@gmail.com>> wrote:

If there’s a single principal type that makes my function well-typed and 
exhaustive, I’d really like GHC to pick it.

I think this is the key part of Alexis's plea: that the type checker take into 
account exhaustivity in choosing how to proceed.

Another way to think about this:

f1 :: HList '[] -> ()
f1 HNil = ()

f2 :: HList as -> ()
f2 HNil = ()

Both f1 and f2 are well typed definitions. In any usage site where both are 
well-typed, they will behave the same. Yet f1 is exhaustive while f2 is not. 
This isn't really about an open-world assumption or the possibility of extra 
cases -- it has to do with what the runtime behaviors of the two functions are. 
f1 never fails, while f2 must check a constructor tag and perhaps throw an 
exception.

If we just see \HNil -> (), Alexis seems to be suggesting we prefer the f1 
interpretation over the f2 interpretation. Why? Because f1 is exhaustive, and 
whe

Why TcLclEnv and DsGblEnv need to store the same IORef for errors?

2021-03-30 Thread Alfredo Di Napoli
Hello folks,

as some of you might know me and Richard are reworking how GHC constructs,
emits and deals with errors and warnings (See
https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values
and #18516).

To summarise very briefly the spirit, we will have (eventually) proper
domain-specific types instead of SDocs. The idea is to have very precise
and "focused" types for the different phases of the compilation pipeline,
and a "catch-all" monomorphic `GhcMessage` type used for the final
pretty-printing and exception-throwing:

data GhcMessage where
  GhcPsMessage  :: PsMessage -> GhcMessage
  GhcTcRnMessage:: TcRnMessage -> GhcMessage
  GhcDsMessage  :: DsMessage -> GhcMessage
  GhcDriverMessage  :: DriverMessage -> GhcMessage
  GhcUnknownMessage :: forall a. (Diagnostic a, Typeable a) => a ->
GhcMessage

While starting to refactor GHC to use these types, I have stepped into
something bizarre: the `DsGblEnv` and `TcLclEnv` envs both share the same
`IORef` to store the diagnostics (i.e. warnings and errors) accumulated
during compilation. More specifically, a function like
`GHC.HsToCore.Monad.mkDsEnvsFromTcGbl` simply receives as input the `IORef`
coming straight from the `TcLclEnv`, and stores it into the `DsGblEnv`.

This is unfortunate, because it would force me to change the type of this
`IORef` to be
`IORef (Messages GhcMessage)` to accommodate both diagnostic types, but
this would bubble up into top-level functions like `initTc`, which would
now return a `Messages GhcMessage`. This is once again unfortunate, because
is "premature": ideally it might still be nice to return `Messages
TcRnMessage`, so that GHC API users could get a very precise diagnostic
type rather than the bag `GhcMessage` is. It also violates an implicit
contract: we are saying that `initTc` might return (potentially) *any* GHC
diagnostic message (including, for example, driver errors/warnings), which
I think is misleading.

Having said all of that, it's also possible that returning `Messages
GhcMessage` is totally fine here and we don't need to be able to do this
fine-grained distinction for the GHC API functions. Regardless, I would
like to ask the audience:

* Why `TcLclEnv` and `DsGblEnv` need to share the same IORef?
* Is this for efficiency reasons?
* Is this because we need the two monads to independently accumulate errors
into the
  same IORef?

Thanks!

Alfredo
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs