Re: [Haskell-cafe] ghc-api Static Semantics?

2012-02-06 Thread JP Moresmau
As a side note, buildwrapper version 0.4.0 and above follows the
approach you outline. When a file is modified, we call GHC to build
it, and we store the GHC AST as a JSON object in a hidden file. Then
all subsequent calls make use of the JSON data (in EclipseFP,
this would be to show you a tooltip of the type of the object you're
hovering over) without calling GHC again, so it's much faster, even
though buildwrapper is a pure one-shot executable with no concept of
a session. The JSON file could also be read by a process other than
buildwrapper itself, so maybe Christopher could use this approach.
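
To make the idea concrete, here is a minimal sketch of such a cache. It is
not BuildWrapper's actual code: the names (TypeInfo, saveInfo, loadInfo,
typeAt) and the file format are assumptions made for illustration, and
show/read stands in for JSON so that the example needs nothing beyond base.

```haskell
import Data.List (maximumBy)
import Data.Ord (Down (..), comparing)
import Text.Read (readMaybe)

-- Source position as (line, column); a span is (start, end), both inclusive.
type Pos = (Int, Int)
type Span = (Pos, Pos)

-- Hypothetical cache format: one (span, type) pair per typed AST node.
type TypeInfo = [(Span, String)]

-- Serialise and parse the cache. show/read is a stand-in for JSON here.
encodeInfo :: TypeInfo -> String
encodeInfo = show

decodeInfo :: String -> Maybe TypeInfo
decodeInfo = readMaybe

-- Written once, right after the compiler has produced the typed AST.
saveInfo :: FilePath -> TypeInfo -> IO ()
saveInfo path = writeFile path . encodeInfo

-- Read by any later query (or by another process); no compiler call needed.
loadInfo :: FilePath -> IO (Maybe TypeInfo)
loadInfo path = do
  contents <- readFile path
  -- Force the whole file so the handle is closed before we return.
  length contents `seq` return (decodeInfo contents)

-- Type of the innermost node whose span contains the given position.
typeAt :: Pos -> TypeInfo -> Maybe String
typeAt pos info =
  case [ entry | entry@((start, end), _) <- info, start <= pos, pos <= end ] of
    []   -> Nothing
    hits -> Just (snd (maximumBy (comparing innermost) hits))
  where
    -- Among nested spans the innermost starts latest and ends earliest.
    innermost ((start, end), _) = (start, Down end)
```

A tooltip request then becomes loadInfo followed by typeAt on the cursor
position, with no compiler session involved.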

JP


Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-26 Thread Christopher Brown
Hi Thomas,

By static semantics I mean use and bind locations for every name in the AST.

For example:


f x = let x = x + 1 in x

Should parse as something like

HsMatch (f (HsPat x (1,2) (1,2)) (HsBody (HsExp (HsLet (HsMatch (x (8,1) 
(8,1)) (HsExp (HsInfix (+) (1) (x) (12,1) (8,1)) (x (16,1) (8,1
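
As a rough sketch of the information being asked for (not the output of
any existing tool), the following computes a bind location for every use
of a name in a toy expression language. The Expr and Decl types, the
function names and the (line, column) positions are all made up for this
illustration; the operator + is not tracked as a name, and let is treated
as recursive, as it is in Haskell.

```haskell
-- Source position as (line, column).
type Pos = (Int, Int)

-- A toy expression language, just big enough for the example above.
data Expr
  = Var String Pos             -- a use of a name
  | Lit Integer
  | Plus Expr Expr
  | Let String Pos Expr Expr   -- let x = rhs in body; x scopes over both
  deriving Show

-- f x = body, with the position of the function name and each parameter.
data Decl = Fun String Pos [(String, Pos)] Expr
  deriving Show

-- One use of a name: the name, where it is used, and where it is bound
-- (Nothing if it is not bound inside this declaration).
type Use = (String, Pos, Maybe Pos)

uses :: [(String, Pos)] -> Expr -> [Use]
uses env (Var x p)          = [(x, p, lookup x env)]
uses _   (Lit _)            = []
uses env (Plus a b)         = uses env a ++ uses env b
uses env (Let x p rhs body) = uses env' rhs ++ uses env' body
  where env' = (x, p) : env   -- the innermost binding shadows outer ones

usesInDecl :: Decl -> [Use]
usesInDecl (Fun _ _ params body) = uses params body

-- f x = let x = x + 1 in x
example :: Decl
example =
  Fun "f" (1, 1) [("x", (1, 3))]
    (Let "x" (1, 11)
       (Plus (Var "x" (1, 15)) (Lit 1))
       (Var "x" (1, 24)))
```

Here usesInDecl example reports that both uses of x, at (1,15) and (1,24),
are bound by the let at (1,11); the parameter at (1,3) is shadowed and
never used.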

I'm steering towards haskell-src-exts right now as the sheer complexity of the 
ghc-api is putting me off. I need something simple, as I can't be spending all 
my time learning the ghc-api and hacking it together to do what I want. It does 
look a bit of a mess. Just trying to do simple things like parsing a file and 
showing its output proved to be much more complicated than it really needed to 
be.


 
 Let me know if you decide to take on this project.
 

We have decided to take it on. :)

Chris.




 


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-26 Thread Thomas Schilling
On 26 January 2012 09:24, Christopher Brown cm...@st-andrews.ac.uk wrote:
 Hi Thomas,

 By static semantics I mean use and bind locations for every name in the
AST.

Right, that's what the renamer does in GHC.  The GHC AST is parameterised
over the type of identifiers used.  The three different identifier types
are:


   - RdrName: is the name as it occurred in source code. This is the output
   of the parser.
   - Name: is basically RdrName + unique ID, so you can distinguish two
   xs bound at different locations (this is what you want). This is the
   output of the renamer.
   - Id: is Name + Type information and consequently is the output of the
   type checker.

Diagram:

   String  --parser-->  HsModule RdrName  --renamer-->  HsModule Name
 --type-checker-->  HsBinds Id

Since you can't hook in-between renamer and type checker, it's perhaps more
accurately depicted as:

   String  --parser-->  HsModule RdrName  --renamer+type-checker-->
 (HsModule Name,  HsBinds Id)
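
A small sketch of what "parameterised over the type of identifiers" means
in practice. The types below are toy stand-ins invented for this example
(ToyRdrName, ToyName, Expr); they are not GHC's definitions, and the
renamer is deliberately naive.

```haskell
-- Toy stand-ins for the identifier types; these are NOT GHC's definitions.
type ToyRdrName = String            -- the name as written in the source
data ToyName = ToyName String Int   -- source name plus a unique
  deriving (Show, Eq)

-- An AST parameterised over its identifier type.
data Expr ident
  = Var ident
  | Lam ident (Expr ident)
  | App (Expr ident) (Expr ident)
  deriving (Show, Eq)

-- A toy renamer: every binder gets a fresh unique and every use is
-- resolved to the innermost enclosing binder of the same source name.
rename :: Expr ToyRdrName -> Either String (Expr ToyName)
rename expr = fst <$> go [] 0 expr
  where
    go :: [(ToyRdrName, ToyName)] -> Int -> Expr ToyRdrName
       -> Either String (Expr ToyName, Int)
    go env next (Var x) =
      case lookup x env of
        Just name -> Right (Var name, next)
        Nothing   -> Left ("not in scope: " ++ x)
    go env next (Lam x body) = do
      let name = ToyName x next
      (body', next') <- go ((x, name) : env) (next + 1) body
      return (Lam name body', next')
    go env next (App f a) = do
      (f', next1) <- go env next f
      (a', next2) <- go env next1 a
      return (App f' a', next2)
```

Renaming \x -> (\x -> x) x, written as
Lam "x" (App (Lam "x" (Var "x")) (Var "x")), gives the outer x the unique 0
and the inner x the unique 1, so the two xs can be told apart afterwards.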

The main reasons why it's tricky to use the GHC API are:


   1. You need to set up the environment of packages etc.  E.g., the renamer
   needs to look up imported modules to correctly resolve imported names (or
   give an error).
   2. The second is that the current API is not designed for external use.
   As I mentioned, you cannot run renamer and typechecker independently,
   there are dozens of invariants, there are environments being updated by the
   various phases, etc.  For example, if you want to generate code it's
   probably best to either generate HsModule RdrName or perhaps use the
   Template Haskell API (never tried that path).


 I'm steering towards haskell-src-exts right now as the sheer complexity
of the ghc-api is putting me off. I need something simple, as I can't be
spending all my time learning the ghc-api and hacking it together to do
what I want. It does look a bit of a mess. Just trying to do simple things
like parsing a file and showing its output proved to be much more
complicated than it really needed to be

 We have decided to take it on. :)

Could you clarify that?  Are you doing everything in haskell-src-exts or
are you using the GHC API and translating the result into haskell-src-exts?
The former might be easier to implement; the latter could later be extended
to give you type info as well (without the need to implement a whole type
checker that most likely will bit-rot compared to GHC sooner or later).

/ Thomas

-- 
Push the envelope. Watch it bend.


Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-26 Thread JP Moresmau
Thomas, thank you for that explanation about the different types of
identifiers in the different phases of analysis. I've never seen that
information so clearly laid out before; can it be added to the wikis
(in http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/API
or http://www.haskell.org/haskellwiki/GHC/As_a_library maybe)? I think
it would be helpful to all people that want to dive into the GHC API.

On a side note, I'm going to do something very similar in my
BuildWrapper project (which is now the backend of the EclipseFP IDE
plugins): instead of going back to the API every time the user
requests to know the type of something in the AST, I'm thinking of
sending the whole typed AST to the Java code. Maybe that's something
Christopher could use. Both the BuildWrapper code and Thomas's scion
code are available on GitHub, and they provide examples of how to use
the GHC API.

JP






-- 
JP Moresmau
http://jpmoresmau.blogspot.com/



Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-26 Thread Thomas Schilling
On 26 January 2012 16:33, JP Moresmau jpmores...@gmail.com wrote:

 Thomas, thank you for that explanation about the different type of
 identifiers in the different phases of analysis. I've never seen that
 information so clearly laid out before, can it be added to the wikis
 (in http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/API
 or http://www.haskell.org/haskellwiki/GHC/As_a_library maybe)? I think
 it would be helpful to all people that want to dive into the GHC API.


Will do.



 On a side note, I'm going to do something very similar in my
 BuildWrapper project (which is now the backend of the EclipseFP IDE
 plugins): instead of going back to the API every time the user
 requests to know the type of something in the AST, I'm thinking of
 sending the whole typed AST to the Java code. Maybe that's something
 Christopher could use. Both the BuildWrapper code and Thomas's scion
 code are available on GitHub, as they provide examples on how to use
 the GHC API.


I really don't think you want to do much work on the front-end as that will
just need to be duplicated for each front-end.  That was the whole point of
building Scion in the first place.  I understand, of course, that Scion is
not useful enough at this time.

Well, I currently don't have much time to work on Scion, but the plan is as
follows:

  - Scion becomes a multi-process architecture.  It has to be since it's
not safe to run multiple GHC sessions inside the same process.  Even if
that were possible, you wouldn't be able to, say, have a profiling compiler
and a release compiler in the same process due to how static flags work.
Separate processes have the additional advantage that you can kill them if
they use too much memory (e.g., because you can't unload loaded interfaces).

  - Scion will be based on Shake and GHC will mostly be used in one-shot
mode (i.e., not --make).  This makes it easier to handle preprocessed
files.  It also allows us to generate and update meta-information on
demand.  I.e., instead of parsing and typechecking a file and then caching
the result for the current file, Scion will simply generate meta
information whenever it (re-)compiles a source file and write that meta
information to a file.  Querying or caching that meta information then is
completely orthogonal to generating it.  The most basic meta information
would be a type-annotated version of the compiled AST (possibly + warnings
and errors from the last time it was compiled).  Any other meta information
can then be generated from that.

 - The GHCi debugger probably needs to be treated specially.  There also
should be automatic detection of files that aren't supported by the
bytecode compiler (e.g., those using UnboxedTuples) and force compilation
to machine code for those.

 - The front-end protocol should be specified somewhere.  I'm thinking
about using protobuf specifications and then use ways to generate custom
formats from that (e.g., JSON, Lisp S-Expressions, XML?).  And if the
frontend supports protocol buffers, then it can use that and be fast.  That
also means that all serialisation code can be auto-generated.
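
As a hand-written illustration of the "one specification, several wire
formats" idea in the last point (this is not Scion code, and no protobuf
tooling is involved), the sketch below renders one message value both as
JSON and as an S-expression. The Value type and both renderers are
assumptions made for this example, and string escaping is simplified to
double quotes and backslashes.

```haskell
import Data.List (intercalate)

-- A minimal, format-neutral message representation (hypothetical).
data Value
  = VStr String
  | VInt Integer
  | VList [Value]
  | VRecord [(String, Value)]
  deriving (Show, Eq)

-- Quote a string, escaping only double quotes and backslashes.
quote :: String -> String
quote s = "\"" ++ concatMap escape s ++ "\""
  where
    escape '"'  = "\\\""
    escape '\\' = "\\\\"
    escape c    = [c]

-- Render a message as JSON.
toJSON :: Value -> String
toJSON (VStr s)     = quote s
toJSON (VInt n)     = show n
toJSON (VList vs)   = "[" ++ intercalate "," (map toJSON vs) ++ "]"
toJSON (VRecord fs) =
  "{" ++ intercalate "," [ quote k ++ ":" ++ toJSON v | (k, v) <- fs ] ++ "}"

-- Render the same message as a Lisp-style association list.
toSExpr :: Value -> String
toSExpr (VStr s)     = quote s
toSExpr (VInt n)     = show n
toSExpr (VList vs)   = "(" ++ unwords (map toSExpr vs) ++ ")"
toSExpr (VRecord fs) =
  "(" ++ unwords [ "(" ++ k ++ " . " ++ toSExpr v ++ ")" | (k, v) <- fs ] ++ ")"
```

For VRecord [("name", VStr "x"), ("type", VStr "Int"), ("line", VInt 3)],
toJSON yields {"name":"x","type":"Int","line":3} and toSExpr yields
((name . "x") (type . "Int") (line . 3)). With a real specification
language both renderers would be generated rather than written by hand.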

I won't have time to work on this before the ICFP deadline (and only very
little afterwards), but Scion is not dead (just hibernating).




Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-25 Thread Thomas Schilling
I assume by static semantics you mean the renamed Haskell source code.
Due to Template Haskell it is (currently) not possible to run the
renamer and type checker separately.  Note that the type checker
output is very different in shape from the renamed output.  The
renamed output mostly follows the original source definitions, but the
type checker output is basically only top-level definitions (no
instances, classes have become data types, etc.) so it can be a bit
tricky to map types back to the input terms.

Still, I think the solution with the best trade-off between
maintainability and usability is to use the GHC API and annotate a
haskell-src-exts representation of the given input file.  The GHC AST
structures are volatile, and have lots of ugly invariants that need to
be maintained.  E.g., some fields are only defined after renaming and
others may no longer be defined after renaming.  If you look at
those fields when they are not defined, you get an error.
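
The kind of invariant meant here can be illustrated with a toy record.
This is a sketch only: Binding and its fields are invented for the
example and are not GHC data types. The point is that a field filled in
by a later phase holds a placeholder until then, and forcing the
placeholder raises an error.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Toy AST node; freeVars plays the role of a field that only the renamer
-- fills in. This is an illustration, not a GHC data type.
data Binding = Binding
  { bindName :: String
  , freeVars :: [String]
  }

-- What a parser might produce: the later field is just a placeholder.
parsedBinding :: String -> Binding
parsedBinding name =
  Binding name (error "freeVars: not available before renaming")

-- What a renamer might produce: the placeholder is replaced by real data.
renamedBinding :: Binding -> [String] -> Binding
renamedBinding b vars = b { freeVars = vars }

-- Force the field and report whether doing so raised an error.
fieldIsDefined :: Binding -> IO Bool
fieldIsDefined b = do
  result <- try (evaluate (length (freeVars b))) :: IO (Either ErrorCall Int)
  return (either (const False) (const True) result)
```

fieldIsDefined returns False for parsedBinding "f" and True once
renamedBinding has supplied the field, which is why code walking such an
AST has to know which phase produced it.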

Let me know if you decide to take on this project.





-- 
Push the envelope. Watch it bend.



Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-24 Thread Christopher Brown
Hi Ozgur,

Yes I've looked at haskell-src-exts and it does look *much* easier to use.

I need this to build a new refactoring tool for Haskell (for the Paraphrase 
project). One advantage to using the ghc-api directly is that it's always 
cutting edge and maintained by the ghc team. Having one more library 
dependency to worry about is not always a good thing.

At the moment (and after spending half a day yesterday just working out how 
to 'show' an AST from the ghc-api) I'm veering towards haskell-src-exts. I 
think extending it to contain use and bind locations in the AST would be the 
best option for me. There's also a question of having types in the AST as well.

 I don't know what you actually need, but if haskell-src-exts is an option, it 
 is quite a bit easier to use (definitely easier to understand for me!). 
 Especially when used together with Uniplate.
 
 For example, for a given piece of AST one can get all the identifiers used 
 like so:
 
 [ x | Ident x <- universeBi ast ]
 

Uniplate isn't a powerful enough generic system to design a full refactoring 
engine, as we need top down/bottom up/ full/stop/once plus preservation and 
unification, much in the style of Strafunski. I think SYB is better for our 
needs. Perhaps we could use a combination of uniplate+SYB depending on what 
traversals/rewrites we need to do.
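
To show why the choice of traversal scheme matters, here is a minimal
sketch of a bottom-up and a top-down "apply everywhere" strategy written
directly against Data.Data from base. It mirrors the style of SYB's
combinators but is not the syb package itself; Expr, simplify, bottomUp
and topDown are names invented for this example.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}

import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

data Expr = Lit Int | Add Expr Expr
  deriving (Show, Eq, Data)

-- Lift a rewrite on one type to a generic transformation that leaves
-- every other type untouched (the same idea as SYB's mkT).
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Rewrite the children first, then the node itself.
bottomUp :: Data b => (forall a. Data a => a -> a) -> b -> b
bottomUp f x = f (gmapT (bottomUp f) x)

-- Rewrite the node first, then the children of the result.
topDown :: Data b => (forall a. Data a => a -> a) -> b -> b
topDown f x = gmapT (topDown f) (f x)

-- A single rewrite rule: 0 + e  ==>  e
simplify :: Expr -> Expr
simplify (Add (Lit 0) e) = e
simplify e               = e
```

On Add (Lit 0) (Add (Lit 0) (Lit 1)), bottomUp (mkT simplify) yields
Lit 1, while topDown (mkT simplify) stops at Add (Lit 0) (Lit 1): after
the outer rewrite the new root is never revisited. A refactoring engine
needs to pick between such strategies explicitly.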


 Finding where they are bound shouldn't be very hard either.
 

No, but it would be much easier if the information was already there, of 
course! :)


Thanks for your response! 
Chris.




Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-24 Thread JP Moresmau
On Tue, Jan 24, 2012 at 11:04 AM, Christopher Brown
cm...@st-andrews.ac.uk wrote:

 At the moment (and spending half a day yesterday just working out how to
 'show' an AST from the ghc-api) I'm veering towards haskell-src-exts. I
 think extending it to contain use and bind locations in the AST would be the
 best option for me. There's also a question of having types in the AST as
 well.


Have you looked at ghc-syb-utils, which gives a neat way to print an AST?

http://hackage.haskell.org/packages/archive/ghc-syb-utils/0.2.1.0/doc/html/GHC-SYB-Utils.html

-- 
JP Moresmau
http://jpmoresmau.blogspot.com/



Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-24 Thread Christopher Brown
 
 
 Have you looked at ghc-syb-utils, which gives a neat way to print an AST?
 
 http://hackage.haskell.org/packages/archive/ghc-syb-utils/0.2.1.0/doc/html/GHC-SYB-Utils.html
 

Yes I found that yesterday! 

Chris.






[Haskell-cafe] ghc-api Static Semantics?

2012-01-23 Thread Christopher Brown
Hi,

I was wondering if anyone could tell me if it's possible to get an AST from the 
ghc-api decorated with static-semantics? 
In particular, I am interested in use and bind locations for all names in the 
AST, together with the module in which they are bound, etc.

Looking through the online docs, there doesn't seem to be a way to do this. 
Even if I can only tell from the AST where a variable is bound, that would be 
enough; if this is done by making all names unique and qualified, that would 
be better than nothing.

Hope someone can help,
Chris.



Re: [Haskell-cafe] ghc-api Static Semantics?

2012-01-23 Thread Ozgur Akgun
Hi,

I don't know what you actually need, but if haskell-src-exts is an option,
it is quite a bit easier to use (definitely easier to understand for me!).
Especially when used together with Uniplate.

For example, for a given piece of AST one can get all the identifiers used
like so:

[ x | Ident x <- universeBi ast ]

Finding where they are bound shouldn't be very hard either.
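
For readers without Uniplate at hand, the comprehension above can be
tried with a small stand-in written against Data.Data from base. Note the
assumptions: universeOf is a naive substitute for universeBi, and Name
and Exp are toy types, not the haskell-src-exts AST.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Data.Data (Data, gmapQ)
import Data.Maybe (maybeToList)
import Data.Typeable (Typeable, cast)

-- Toy AST; the real haskell-src-exts types are much larger.
data Name = Ident String | Symbol String
  deriving (Show, Eq, Data)

data Exp = Var Name | App Exp Exp | Lit Int
  deriving (Show, Eq, Data)

-- Collect every value of the requested type inside a structure,
-- visiting a node before its children, left to right.
universeOf :: (Data a, Typeable b) => a -> [b]
universeOf x = maybeToList (cast x) ++ concat (gmapQ universeOf x)

-- The comprehension from the mail, with universeOf in place of universeBi.
identifiers :: Exp -> [String]
identifiers ast = [ x | Ident x <- universeOf ast ]
```

For the AST of x + y, written as
App (App (Var (Symbol "+")) (Var (Ident "x"))) (Var (Ident "y")),
identifiers returns ["x","y"]; the operator is skipped because it is a
Symbol rather than an Ident.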

Hope this helps,
Ozgur

