Re: Improving GHC GC for latency-sensitive networked services

2016-10-17 Thread Ben Gamari
Christopher Allen writes:

> It'd be unfortunate if more companies trying out Haskell came to the
> same result: 
> https://blog.pusher.com/latency-working-set-ghc-gc-pick-two/#comment-2866985345
> (They gave up and rewrote the service in Golang)
>
Aside: Go strikes me as an odd choice here; I would have thought they
would just move to something like Rust or C++ to avoid GC entirely and
still benefit from a reasonably expressive type system. Anyways, moving
along...

> Most of the state of the art I'm aware of (such as from Azul Systems)
> is from when I was using a JVM language, which isn't necessarily
> applicable for GHC.
>
> I understand Marlow's thread-local heaps experiment circa 7.2/7.4 was
> abandoned because it penalized performance too much. Does the
> impression that there isn't the labor to maintain two GCs still hold?
> It seems like thread-local heaps would be pervasive.
>
Yes, I believe that this indeed still holds. In general the RTS lacks
hands and garbage collectors (especially parallel implementations)
require a fair bit of background knowledge to maintain.

> Does anyone know what could be done in GHC itself to improve this
> situation? Stop-the-world is pretty painful when the otherwise
> excellent concurrency primitives are much of why you're using Haskell.
>
Indeed it is quite painful. However, I suspect that compact regions
(coming in 8.2) could help in many workloads.

In the case of Pusher's workload (which isn't very precisely described,
so I'm guessing here) I suspect you could take batches of N messages and
add them to a compact region, essentially reducing the number of live
heap objects (and hence work that the GC must perform) by a factor of N.
Of course, in doing this you give up the ability to "retire" messages
individually. To recover this ability one could introduce a Haskell
"garbage collector" task to scan the active regions and copy messages
that should be kept into a new region, dropping those that
should be retired. Here you benefit from the fact that copying into a
compact region can be done in parallel (IIRC), allowing us to
essentially implement a copying, non-stop-the-world GC in our Haskell
program. This allows the runtime's GC to handle a large, static heap as
though it were a constant factor smaller, hopefully reducing pause
duration. That being said, this is all just wild speculation; I could be
wrong, YMMV, etc.
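
To make the batching idea concrete, here is a minimal sketch. It assumes
the `GHC.Compact` module from the `ghc-compact` package as shipped with
GHC 8.2 (`compact` and `getCompact`). The `Message` type and the
`sealBatch` and `sweep` helpers are hypothetical names invented for
illustration, not anything from Pusher's code, and whether this actually
reduces pause times for a given workload is untested.

```haskell
import GHC.Compact (Compact, compact, getCompact)

-- Hypothetical message type. Values placed in a compact region must be
-- plain, fully evaluable data (no functions or mutable references).
data Message = Message
  { msgId   :: !Int
  , expired :: !Bool
  , payload :: !String
  } deriving (Eq, Show)

-- One compact region holding a whole batch of messages. The runtime's GC
-- treats the region as a single object instead of tracing every message.
type Batch = Compact [Message]

-- Seal a batch of N messages into a single region.
sealBatch :: [Message] -> IO Batch
sealBatch = compact

-- User-level "collector": gather the messages worth keeping from several
-- old batches and compact them into one fresh region. Once the old Batch
-- values become unreachable, their regions can be freed as a whole.
sweep :: (Message -> Bool) -> [Batch] -> IO Batch
sweep keep old = compact (filter keep (concatMap getCompact old))
```

A long-running service would presumably call `sweep` from a background
thread at some interval, trading a bit of retained garbage for fewer
live heap objects.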

Of course, another option is splitting your workload across multiple
runtime systems. Cloud Haskell is a very nice tool for this which I've
used on client projects with very good results. Obviously it isn't
always possible to segment your heap as required by this approach, but
it is quite effective when possible.
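
Segmenting the heap this way needs a stable rule for deciding which
runtime owns which piece of state. The following is only a sketch of one
such rule in plain Haskell; it does not use the Cloud Haskell API, and
`Key`, `hashKey` and `shardFor` are hypothetical names. It assumes the
number of shards is positive and fixed.

```haskell
import Data.Char (ord)
import Data.List (foldl')

-- Hypothetical key identifying a unit of state, e.g. a channel name.
type Key = String

-- Simple polynomial string hash; any stable hash would do here.
hashKey :: Key -> Int
hashKey = foldl' (\h c -> (h * 31 + ord c) `mod` 1000003) 7

-- Index of the runtime (OS process) that owns the state for this key.
-- Assumes nShards > 0.
shardFor :: Int -> Key -> Int
shardFor nShards key = hashKey key `mod` nShards
```

Each process then only ever holds roughly 1/nShards of the working set,
so each runtime's collector has correspondingly less live data to
traverse.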

While clearly neither of these is as convenient as a more scalable
garbage collector, they are both things we can (nearly) do today.

Looking farther into the future, I know there is a group looking to add
linear types to GHC/Haskell with a separate linear heap (which needn't
be garbage collected). I'll let them elaborate if they so desire.

Cheers,

- Ben


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Compact regions in users guide

2016-10-17 Thread Ben Gamari
Hello Compact Regions authors,

It occurs to me that the compact regions support that is due to be
included in GHC 8.2 is lacking any discussion in the users guide. At the
very least we should have a mention in the release notes (this is one of
the major features of 8.2, after all) and a brief overview of the feature
elsewhere. It's a bit hard to say where the overview would fit
(parallel.rst is an option, albeit imperfect; glasgow_exts.rst is
another). I'll leave this up to you.

I've opened #12413 [1] to track this task. Do you suppose one of you
could take a few minutes to finish this off?

Thanks!

Cheers,

- Ben


[1] https://ghc.haskell.org/trac/ghc/ticket/12413




Improving GHC GC for latency-sensitive networked services

2016-10-17 Thread Christopher Allen
It'd be unfortunate if more companies trying out Haskell came to the
same result: 
https://blog.pusher.com/latency-working-set-ghc-gc-pick-two/#comment-2866985345
(They gave up and rewrote the service in Golang)

Most of the state of the art I'm aware of (such as from Azul Systems)
is from when I was using a JVM language, which isn't necessarily
applicable for GHC.

I understand Marlow's thread-local heaps experiment circa 7.2/7.4 was
abandoned because it penalized performance too much. Does the
impression that there isn't the labor to maintain two GCs still hold?
It seems like thread-local heaps would be pervasive.

Does anyone know what could be done in GHC itself to improve this
situation? Stop-the-world is pretty painful when the otherwise
excellent concurrency primitives are much of why you're using Haskell.

--- Chris Allen


Re: Dataflow analysis for Cmm

2016-10-17 Thread Ben Gamari
Michal Terepeta writes:

> On Mon, Oct 17, 2016 at 10:57 AM Jan Stolarek wrote:
>
>> Second question: how could we merge this? (...)
>> I'm not sure if I understand. The end result after merging will be
>> exactly the same, right? Are you asking for advice on what is the best
>> way of doing this from a technical point of view? I would simply edit
>> the existing module. Introducing a temporary second module seems like
>> unnecessary extra work and perhaps complicating the patch review.
>>
>
> Yes, the end result would be the same - I'm merely asking what would be
> preferred by GHC devs (i.e., I don't know how fine grained patches to GHC
> usually are).
>
It varies quite wildly. In general I would prefer fine-grained (but of
course atomic) patches over coarse ones, as they are easier to
understand during review and after merge. Moreover, it's generally much
easier to squash together patches that are too fine-grained than it is
to split up a large patch, so I generally err on the side of finer
rather than coarser during development.

Cheers,

- Ben





Re: Status of GHC testsuite driver on Windows

2016-10-17 Thread Ben Gamari
Ben Gamari writes:

> So I spent my weekend in the jungles of Windows compatibility layers. I'll
> spare you the details as they are gruesome but here's a brief summary,
>
>  * There are a few nasty bugs currently in msys2 which affect the GHC
>    testsuite driver:
>
>     * Mingw Python packages are terribly broken (#12554)
>
>     * Msys Python packages are also broken, but differently and only
>       with msys2-runtime >= 2.5.1 (#12660)
>
My apologies, this was supposed to read #12661.

Cheers,

- Ben




Re: Dataflow analysis for Cmm

2016-10-17 Thread Michal Terepeta
On Mon, Oct 17, 2016 at 10:57 AM Jan Stolarek wrote:

> Michał,
>
> Dataflow module could indeed use cleanup. I have made two attempts at
> this in the past but I don't think either of them was merged - see [1]
> and [2]. [2] was mostly type-directed simplifications. It would be nice
> to have this included in one form or another. It sounds like you also
> have a more in-depth refactoring in mind. Personally as long as it is
> semantically correct I think it will be a good thing. I would especially
> support removing dead code that we don't really use.
>
> [1] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2
> [2] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2


Ok, I'll have a look at this!
(did you intend to send two identical links?)

> Second question: how could we merge this? (...)
> I'm not sure if I understand. The end result after merging will be
> exactly the same, right? Are you asking for advice on what is the best
> way of doing this from a technical point of view? I would simply edit
> the existing module. Introducing a temporary second module seems like
> unnecessary extra work and perhaps complicating the patch review.
>

Yes, the end result would be the same - I'm merely asking what would be
preferred by GHC devs (i.e., I don't know how fine grained patches to GHC
usually are).


> > I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
> > the recommended workflow for code that’s not ready for review…
> This is OK but please remember to set status of revision to "Planned
> changes" after uploading it to Phab so it doesn't sit in reviewing queue.
>

Cool, I didn't know about the "Planned changes" status.
Thanks for mentioning it!

Cheers,
Michal


Re: GHC Trac spam filter is rejecting new registrations

2016-10-17 Thread Robert Henderson
Thanks for fixing that, registration seems to be working fine now.

Cheers,
Rob

On 15/10/16 16:38, Ben Gamari wrote:
> Robert Henderson writes:
> 
>> Hi,
>>
>> I've been trying to register a new account on GHC Trac in order to
>> submit a bug report, and I'm getting the following error:
>>
>>   Submission rejected as potential spam
>>   SpamBayes determined spam probability of 90.82%
>>
> Oh dear, very sorry about that. I've adjusted the spam filter
> configuration; can you try again?
> 
>> Could this be a bug or issue with a recent release of the Trac software?
>> I've noticed people complaining about the same problem on other websites
>> that use Trac, e.g.:
>>
> It's not a bug; it's just that spammers are quite good at emulating
> humans and unfortunately Trac doesn't have very strong tools for
> catching them. We use a Bayesian spam classifier to catch Trac spam, but
> sadly it's imperfect.
> 
> Thanks for bringing up your issue!
> 
> Cheers,
> 
> - Ben
> 


Re: Dataflow analysis for Cmm

2016-10-17 Thread Jan Stolarek
Michał,

Dataflow module could indeed use cleanup. I have made two attempts at this in
the past but I don't think either of them was merged - see [1] and [2]. [2] was
mostly type-directed simplifications. It would be nice to have this included in
one form or another. It sounds like you also have a more in-depth refactoring
in mind. Personally as long as it is semantically correct I think it will be a
good thing. I would especially support removing dead code that we don't really
use.

[1] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2
[2] https://github.com/jstolarek/ghc/commits/js-hoopl-cleanup-v2

> Second question: how could we merge this? (...)
I'm not sure if I understand. The end result after merging will be exactly the
same, right? Are you asking for advice on what is the best way of doing this
from a technical point of view? I would simply edit the existing module.
Introducing a temporary second module seems like unnecessary extra work and
perhaps complicating the patch review.

> I’m happy to export the code to Phab if you prefer - I wasn’t sure what’s
> the recommended workflow for code that’s not ready for review…
This is OK but please remember to set status of revision to "Planned changes"
after uploading it to Phab so it doesn't sit in reviewing queue.

Janek

On Sunday, 16 October 2016, Michal Terepeta wrote:
> Hi,
>
> I was looking at cleaning up a bit the situation with dataflow analysis
> for Cmm. In particular, I was experimenting with rewriting the current
> `cmm.Hoopl.Dataflow` module:
> - To only include the functionality to do analysis (since GHC doesn’t
>   seem to use the rewriting part).
>   Benefits:
>   - Code simplification (we could remove a lot of unused code).
>   - Makes it clear what we’re actually using from Hoopl.
> - To have an interface that works with transfer functions operating on
>   a whole basic block (`Block CmmNode C C`).
>   This means that it would be up to the user of the algorithm to
>   traverse the whole block.
>   Benefits:
>   - Further simplifications.
>   - We could remove `analyzeFwdBlocks` hack, which AFAICS is just a
>     copy of `analyzeFwd` but ignores the middle nodes (probably for
>     efficiency of analyses that only look at the blocks).
>   - More flexible (e.g., the clients could know which block they’re
>     processing; we could consider memoizing some per block information,
>     etc.).
>
> What do you think about this?
>
> I have a branch that implements the above:
> https://github.com/michalt/ghc/tree/dataflow2/1
> It’s introducing a second parallel implementation
> (`cmm.Hoopl.Dataflow2` module), so that it's possible to run ./validate
> while comparing the results of the old implementation with the new one.
>
> Second question: how could we merge this? (assuming that people are
> generally ok with the approach) Some ideas:
> - Change cmm/Hoopl/Dataflow module itself along with the three analyses
>   that use it in one step.
> - Introduce the Dataflow2 module first, then switch the analyses, then
>   remove any unused code that still depends on the old Dataflow module,
>   finally remove the old Dataflow module itself.
> (Personally I'd prefer the second option, but I'm also ok with the
> first one)
>
> I’m happy to export the code to Phab if you prefer - I wasn’t sure
> what’s the recommended workflow for code that’s not ready for review…
>
> Thanks,
> Michal
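
For readers unfamiliar with the proposal quoted above, the following is a
minimal, self-contained sketch of what an analysis-only interface with
block-level transfer functions might look like. It is not the code from
the `dataflow2` branch and does not use Hoopl or GHC's Cmm types: `Block`
here is a hypothetical stand-in for `Block CmmNode C C`, and `Lattice`,
`TransferBlock`, `analyzeFwdBlockLevel` and `assignedVars` are names made
up for illustration. It relies only on `Data.Map` and `Data.Set` from the
`containers` package.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

type Label = Int

-- Hypothetical stand-in for a closed/closed basic block: its label, the
-- nodes it contains, and the labels of its successor blocks.
data Block n = Block
  { blockLabel :: Label
  , blockNodes :: [n]
  , blockSuccs :: [Label]
  }

-- Join-semilattice of facts; only the join is needed in this sketch.
newtype Lattice f = Lattice { factJoin :: f -> f -> f }

-- Block-level transfer function: the client traverses the whole block.
type TransferBlock n f = Block n -> f -> f

-- Forward analysis by worklist iteration. Returns the fact holding at the
-- entry of each reachable block. Termination assumes a monotone transfer
-- function and a lattice of finite height.
analyzeFwdBlockLevel
  :: Eq f
  => Lattice f -> TransferBlock n f -> Label -> f
  -> M.Map Label (Block n) -> M.Map Label f
analyzeFwdBlockLevel lat transfer entry entryFact blocks =
    go [entry] (M.singleton entry entryFact)
  where
    go [] facts = facts
    go (l : rest) facts =
      case (M.lookup l blocks, M.lookup l facts) of
        (Just blk, Just inFact) ->
          let out = transfer blk inFact
              (facts', changed) =
                foldl (propagate out) (facts, []) (blockSuccs blk)
          in go (rest ++ reverse changed) facts'
        _ -> go rest facts

    -- Join `out` into successor `s`; re-queue `s` only if its fact changed.
    propagate out (facts, changed) s =
      let old = M.lookup s facts
          new = maybe out (factJoin lat out) old
      in if Just new == old
           then (facts, changed)
           else (M.insert s new facts, s : changed)

-- Example client: nodes are names of assigned variables, and the fact is
-- the set of variables that may have been assigned so far.
assignedVars :: TransferBlock String (S.Set String)
assignedVars blk fact = fact `S.union` S.fromList (blockNodes blk)
```

Because the transfer function receives the whole block, an analysis that
only cares about block boundaries can simply ignore `blockNodes`, which is
the case the `analyzeFwdBlocks` special case appears to exist for.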


 

--- 
Politechnika Łódzka 
Lodz University of Technology 
