Re: Dataflow analysis for Cmm

2016-10-24 Thread Michal Terepeta
On Fri, Oct 21, 2016 at 4:04 PM Simon Marlow  wrote:

> On 16 October 2016 at 14:03, Michal Terepeta 
> wrote:
>
> Hi,
>
> I was looking at cleaning up a bit the situation with dataflow analysis
> for Cmm.
> In particular, I was experimenting with rewriting the current
> `cmm.Hoopl.Dataflow` module:
> - To only include the functionality to do analysis (since GHC doesn’t seem
> to use
>   the rewriting part).
>   Benefits:
>   - Code simplification (we could remove a lot of unused code).
>   - Makes it clear what we’re actually using from Hoopl.
> - To have an interface that works with transfer functions operating on a
> whole
>   basic block (`Block CmmNode C C`).
>   This means that it would be up to the user of the algorithm to traverse
> the
>   whole block.
>
>
> Ah! This is actually something I wanted to do but didn't get around to.
> When I was working on the code generator I found that using Hoopl for
> rewriting was prohibitively slow, which is why we're not using it for
> anything right now, but I think that pulling out the basic block
> transformation is possibly a way forwards that would let us use Hoopl.
>

Right, I've also seen:
https://plus.google.com/107890464054636586545/posts/dBbewpRfw6R
https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/HooplPerformance
but it seems that there weren't any follow-ups/conclusions on that.
Also, I haven't started writing anything for the rewriting yet (only
analysis for
now).

Btw. I'm currently experimenting with the GHC's fork of Dataflow module -
and for now I'm not planning on pushing the changes to the upstream Hoopl.
There are already projects that depend on the current interface of Hoopl
(it's
on Hackage after all) and it's going to be hard to make certain changes
there.
Hope that's ok with everyone!
(also, we can always revisit this question later)

A lot of the code you're removing is my attempt at "optimising" the Hoopl
> dataflow algorithm to make it usable in GHC.  (I don't mind removing this,
> it was a failed experiment really)
>

Thanks for saying that!


>   Benefits:
>   - Further simplifications.
>   - We could remove `analyzeFwdBlocks` hack, which AFAICS is just a
> copy&paste
> of `analyzeFwd` but ignores the middle nodes (probably for efficiency
> of
> analyses that only look at the blocks).
>
>
> Aren't we using this in dataflowAnalFwdBlocks, that's used by
> procpointAnalysis?
>

Yes, sorry for confusion! What I meant is that
analyzeFwdBlocks/dataflowAnalFwdBlocks is currently a special case of
analyzeFwd/dataflowAnalFwd that only looks at first and last nodes. So if we
move to block-oriented interface, it simply stops being a special case and
fits
the new interface (since it's the analysis that decides whether to look at
the
whole block or only parts of it). So it's removed in the sense of "removing
a
special case".

Cheers,
Michal
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-21 Thread Simon Marlow
On 16 October 2016 at 14:03, Michal Terepeta 
wrote:

> Hi,
>
> I was looking at cleaning up a bit the situation with dataflow analysis
> for Cmm.
> In particular, I was experimenting with rewriting the current
> `cmm.Hoopl.Dataflow` module:
> - To only include the functionality to do analysis (since GHC doesn’t seem
> to use
>   the rewriting part).
>   Benefits:
>   - Code simplification (we could remove a lot of unused code).
>   - Makes it clear what we’re actually using from Hoopl.
> - To have an interface that works with transfer functions operating on a
> whole
>   basic block (`Block CmmNode C C`).
>   This means that it would be up to the user of the algorithm to traverse
> the
>   whole block.
>

Ah! This is actually something I wanted to do but didn't get around to.
When I was working on the code generator I found that using Hoopl for
rewriting was prohibitively slow, which is why we're not using it for
anything right now, but I think that pulling out the basic block
transformation is possibly a way forwards that would let us use Hoopl.

A lot of the code you're removing is my attempt at "optimising" the Hoopl
dataflow algorithm to make it usable in GHC.  (I don't mind removing this,
it was a failed experiment really)


>   Benefits:
>   - Further simplifications.
>   - We could remove `analyzeFwdBlocks` hack, which AFAICS is just a
> copy&paste
> of `analyzeFwd` but ignores the middle nodes (probably for efficiency
> of
> analyses that only look at the blocks).
>

Aren't we using this in dataflowAnalFwdBlocks, that's used by
procpointAnalysis?

Cheers
Simon

  - More flexible (e.g., the clients could know which block they’re
> processing;
> we could consider memoizing some per block information, etc.).
>
> What do you think about this?
>
> I have a branch that implements the above:
> https://github.com/michalt/ghc/tree/dataflow2/1
> It’s introducing a second parallel implementation (`cmm.Hoopl.Dataflow2`
> module), so that it's possible to run ./validate while comparing the
> results of
> the old implementation with the new one.
>
> Second question: how could we merge this? (assuming that people are
> generally
> ok with the approach) Some ideas:
> - Change cmm/Hoopl/Dataflow module itself along with the three analyses
> that use
>   it in one step.
> - Introduce the Dataflow2 module first, then switch the analyses, then
> remove
>   any unused code that still depends on the old Dataflow module, finally
> remove
>   the old Dataflow module itself.
> (Personally I'd prefer the second option, but I'm also ok with the first
> one)
>
> I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
> the
> recommended workflow for code that’s not ready for review…
>
> Thanks,
> Michal
>
>
> ___
> ghc-devs mailing list
> ghc-devs@haskell.org
> http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs
>
>
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-20 Thread Ben Gamari
Jan Stolarek  writes:

>> I don't entirely agree. I personally find it very hard to review large
>> patches as the amount of mental context generally seems to grow
>> super-linearly in the amount of code touched. Moreover, I think it's
>> important to remember that the need to read patches does not vanish the
>> moment the patch is committed. To the contrary, review is merely the
>> first of many instances in which a patch will be read. Other instances
>> include,
>
> I wholeheartedly agree with everything you say. I don't see it as 
> contradicting in any way 
> principles that I outlined. It's just that sometimes doing a single logical 
> change to the code 
> requires a large patch and breaking it artificially into smaller patches 
> might actually make 
> matters worse. I believe this would be the case for this scenario. And 
> honestly speaking I don't 
> think that the patch here will be very big. But like you say, there's a 
> compromise to be struck.
>
Ahh, it looks like I was probably reading more into what you wrote than
you intended; my apologies!

Cheers,

- Ben


signature.asc
Description: PGP signature
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-20 Thread Jan Stolarek
> I don't entirely agree. I personally find it very hard to review large
> patches as the amount of mental context generally seems to grow
> super-linearly in the amount of code touched. Moreover, I think it's
> important to remember that the need to read patches does not vanish the
> moment the patch is committed. To the contrary, review is merely the
> first of many instances in which a patch will be read. Other instances
> include,
I wholeheartedly agree with everything you say. I don't see it as contradicting 
in any way 
principles that I outlined. It's just that sometimes doing a single logical 
change to the code 
requires a large patch and breaking it artificially into smaller patches might 
actually make 
matters worse. I believe this would be the case for this scenario. And honestly 
speaking I don't 
think that the patch here will be very big. But like you say, there's a 
compromise to be struck.

Janek


>
>  * when the patch is backported
>
>  * when someone is trying to rebase one of their own changes on top of
>the patch
>
>  * when another contributor is trying to follow the evolution of a piece
>of the compiler
>
>  * when someone is later trying to understand a bug in the patch
>
> Consequently, I think it is fairly important not to throw out the
> structure that multiple, sensibly sized commits provides. Of course
> there is a compromise to be struck here: we don't want dozens of
> five-line patches; however I think that one mega-patch swings too far to
> the other extreme in the case of most features.
>
> Cheers,
>
> - Ben


 

--- 
Politechnika Łódzka 
Lodz University of Technology 

Treść tej wiadomości zawiera informacje przeznaczone tylko dla adresata. 
Jeżeli nie jesteście Państwo jej adresatem, bądź otrzymaliście ją przez pomyłkę 
prosimy o powiadomienie o tym nadawcy oraz trwałe jej usunięcie. 

This email contains information intended solely for the use of the individual 
to whom it is addressed. 
If you are not the intended recipient or if you have received this message in 
error, 
please notify the sender and delete it from your system. 


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-20 Thread Ben Gamari
Jan Stolarek  writes:

>> (did you intend to send two identical links?)
> No :-) The branches should be js-hoopl-cleanup-v1 and js-hoopl-cleanup-v2
>
>> Yes, the end result would be the same - I'm merely asking what would be
>> preferred by GHC devs (i.e., I don't know how fine grained patches to GHC
>> usually are).
> I don't think this should be visible in your final patch. In a perfect world 
> each repositrory 
> commit shouyld provide exactly one functionality - not more and not less. So 
> your final patch 
> should not reflect intermediate steps you took to implement some 
> functionality because they are 
> not really relevant. For the purpose of development and review it is fine to 
> have a branch with 
> lots of small commits but before merging you should just squash them into one.
>
I don't entirely agree. I personally find it very hard to review large
patches as the amount of mental context generally seems to grow
super-linearly in the amount of code touched. Moreover, I think it's
important to remember that the need to read patches does not vanish the
moment the patch is committed. To the contrary, review is merely the
first of many instances in which a patch will be read. Other instances
include,

 * when the patch is backported

 * when someone is trying to rebase one of their own changes on top of
   the patch

 * when another contributor is trying to follow the evolution of a piece
   of the compiler

 * when someone is later trying to understand a bug in the patch

Consequently, I think it is fairly important not to throw out the
structure that multiple, sensibly sized commits provides. Of course
there is a compromise to be struck here: we don't want dozens of
five-line patches; however I think that one mega-patch swings too far to
the other extreme in the case of most features.

Cheers,

- Ben


signature.asc
Description: PGP signature
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-18 Thread Jan Stolarek
> (did you intend to send two identical links?)
No :-) The branches should be js-hoopl-cleanup-v1 and js-hoopl-cleanup-v2

> Yes, the end result would be the same - I'm merely asking what would be
> preferred by GHC devs (i.e., I don't know how fine grained patches to GHC
> usually are).
I don't think this should be visible in your final patch. In a perfect world 
each repositrory 
commit shouyld provide exactly one functionality - not more and not less. So 
your final patch 
should not reflect intermediate steps you took to implement some functionality 
because they are 
not really relevant. For the purpose of development and review it is fine to 
have a branch with 
lots of small commits but before merging you should just squash them into one.

Janek
 

--- 
Politechnika Łódzka 
Lodz University of Technology 

Treść tej wiadomości zawiera informacje przeznaczone tylko dla adresata. 
Jeżeli nie jesteście Państwo jej adresatem, bądź otrzymaliście ją przez pomyłkę 
prosimy o powiadomienie o tym nadawcy oraz trwałe jej usunięcie. 

This email contains information intended solely for the use of the individual 
to whom it is addressed. 
If you are not the intended recipient or if you have received this message in 
error, 
please notify the sender and delete it from your system. 


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-17 Thread Ben Gamari
Michal Terepeta  writes:

> On Mon, Oct 17, 2016 at 10:57 AM Jan Stolarek 
> wrote:
>
>> Second question: how could we merge this? (...)
>> I'm not sure if I understand. The end result after merging will be exactly
>> the same, right? Are
>> you asking for advice what is the best way of doing this from a technical
>> point if view? I would
>> simply edit the existing module. Introducing a temporary second module
>> seems like unnecessary
>> extra work and perhaps complicating the patch review.
>>
>
> Yes, the end result would be the same - I'm merely asking what would be
> preferred by GHC devs (i.e., I don't know how fine grained patches to GHC
> usually are).
>
It varies quite wildly. In general I would prefer fine-grained patches
(but of course atomic) over coarse patches as they are easier to
understand during review and after merge. Moreover, it's generally much
easier to squash together patches that are too fine-grained than it is
to split up a large patch, so I generally err on the side of finer
rather than coarser during development.

Cheers,

- Ben



signature.asc
Description: PGP signature
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-17 Thread Michal Terepeta
On Mon, Oct 17, 2016 at 10:57 AM Jan Stolarek 
wrote:

> Michał,
>
> Dataflow module could indeed use cleanup. I have made two attempts at this
> in the past but I don't
> think any of them was merged - see [1] and [2]. [2] was mostly
> type-directed simplifications. It
> would be nice to have this included in one form or another. It sounds like
> you also have a more
> in-depth refactoring in mind. Personally as long as it is semantically
> correct I think it will be
> a good thing. I would especially support removing dead code that we don't
> really use.
>
> [1] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2
> [2] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2


Ok, I'll have a look at this!
(did you intend to send two identical links?)

> Second question: how could we merge this? (...)
> I'm not sure if I understand. The end result after merging will be exactly
> the same, right? Are
> you asking for advice what is the best way of doing this from a technical
> point if view? I would
> simply edit the existing module. Introducing a temporary second module
> seems like unnecessary
> extra work and perhaps complicating the patch review.
>

Yes, the end result would be the same - I'm merely asking what would be
preferred by GHC devs (i.e., I don't know how fine grained patches to GHC
usually are).


> > I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
> > the recommended workflow for code that’s not ready for review…
> This is OK but please remember to set status of revision to "Planned
> changes" after uploading it
> to Phab so it doesn't sit in reviewing queue.
>

Cool, I didn't know about the "Planned changes" status.
Thanks for mentioning it!

Cheers,
Michal
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Dataflow analysis for Cmm

2016-10-17 Thread Jan Stolarek
Michał,

Dataflow module could indeed use cleanup. I have made two attempts at this in 
the past but I don't 
think any of them was merged - see [1] and [2]. [2] was mostly type-directed 
simplifications. It 
would be nice to have this included in one form or another. It sounds like you 
also have a more 
in-depth refactoring in mind. Personally as long as it is semantically correct 
I think it will be 
a good thing. I would especially support removing dead code that we don't 
really use.

[1] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2
[2] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2

> Second question: how could we merge this? (...)
I'm not sure if I understand. The end result after merging will be exactly the 
same, right? Are 
you asking for advice what is the best way of doing this from a technical point 
if view? I would 
simply edit the existing module. Introducing a temporary second module seems 
like unnecessary 
extra work and perhaps complicating the patch review.

> I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
> the recommended workflow for code that’s not ready for review…
This is OK but please remember to set status of revision to "Planned changes" 
after uploading it 
to Phab so it doesn't sit in reviewing queue.

Janek

Dnia niedziela, 16 października 2016, Michal Terepeta napisał:
> Hi,
>
> I was looking at cleaning up a bit the situation with dataflow analysis for
> Cmm.
> In particular, I was experimenting with rewriting the current
> `cmm.Hoopl.Dataflow` module:
> - To only include the functionality to do analysis (since GHC doesn’t seem
> to use
>   the rewriting part).
>   Benefits:
>   - Code simplification (we could remove a lot of unused code).
>   - Makes it clear what we’re actually using from Hoopl.
> - To have an interface that works with transfer functions operating on a
> whole
>   basic block (`Block CmmNode C C`).
>   This means that it would be up to the user of the algorithm to traverse
> the
>   whole block.
>   Benefits:
>   - Further simplifications.
>   - We could remove `analyzeFwdBlocks` hack, which AFAICS is just a
> copy&paste
> of `analyzeFwd` but ignores the middle nodes (probably for efficiency
> of analyses that only look at the blocks).
>   - More flexible (e.g., the clients could know which block they’re
> processing;
> we could consider memoizing some per block information, etc.).
>
> What do you think about this?
>
> I have a branch that implements the above:
> https://github.com/michalt/ghc/tree/dataflow2/1
> It’s introducing a second parallel implementation (`cmm.Hoopl.Dataflow2`
> module), so that it's possible to run ./validate while comparing the
> results of
> the old implementation with the new one.
>
> Second question: how could we merge this? (assuming that people are
> generally
> ok with the approach) Some ideas:
> - Change cmm/Hoopl/Dataflow module itself along with the three analyses
> that use
>   it in one step.
> - Introduce the Dataflow2 module first, then switch the analyses, then
> remove
>   any unused code that still depends on the old Dataflow module, finally
> remove
>   the old Dataflow module itself.
> (Personally I'd prefer the second option, but I'm also ok with the first
> one)
>
> I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
> the
> recommended workflow for code that’s not ready for review…
>
> Thanks,
> Michal


 

--- 
Politechnika Łódzka 
Lodz University of Technology 

Treść tej wiadomości zawiera informacje przeznaczone tylko dla adresata. 
Jeżeli nie jesteście Państwo jej adresatem, bądź otrzymaliście ją przez pomyłkę 
prosimy o powiadomienie o tym nadawcy oraz trwałe jej usunięcie. 

This email contains information intended solely for the use of the individual 
to whom it is addressed. 
If you are not the intended recipient or if you have received this message in 
error, 
please notify the sender and delete it from your system. 


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Dataflow analysis for Cmm

2016-10-16 Thread Michal Terepeta
Hi,

I was looking at cleaning up a bit the situation with dataflow analysis for
Cmm.
In particular, I was experimenting with rewriting the current
`cmm.Hoopl.Dataflow` module:
- To only include the functionality to do analysis (since GHC doesn’t seem
to use
  the rewriting part).
  Benefits:
  - Code simplification (we could remove a lot of unused code).
  - Makes it clear what we’re actually using from Hoopl.
- To have an interface that works with transfer functions operating on a
whole
  basic block (`Block CmmNode C C`).
  This means that it would be up to the user of the algorithm to traverse
the
  whole block.
  Benefits:
  - Further simplifications.
  - We could remove `analyzeFwdBlocks` hack, which AFAICS is just a
copy&paste
of `analyzeFwd` but ignores the middle nodes (probably for efficiency of
analyses that only look at the blocks).
  - More flexible (e.g., the clients could know which block they’re
processing;
we could consider memoizing some per block information, etc.).

What do you think about this?

I have a branch that implements the above:
https://github.com/michalt/ghc/tree/dataflow2/1
It’s introducing a second parallel implementation (`cmm.Hoopl.Dataflow2`
module), so that it's possible to run ./validate while comparing the
results of
the old implementation with the new one.

Second question: how could we merge this? (assuming that people are
generally
ok with the approach) Some ideas:
- Change cmm/Hoopl/Dataflow module itself along with the three analyses
that use
  it in one step.
- Introduce the Dataflow2 module first, then switch the analyses, then
remove
  any unused code that still depends on the old Dataflow module, finally
remove
  the old Dataflow module itself.
(Personally I'd prefer the second option, but I'm also ok with the first
one)

I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
the
recommended workflow for code that’s not ready for review…

Thanks,
Michal
___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs