Re: compilation of pattern-matching?

Simon Marlow Thu, 26 Mar 2009 01:58:29 -0700

Lennart Augustsson wrote:

When you tried switching Nil and Cons, did you try it on many examples?
For a single example a 2-3% could be easily attributed to random
effects like different instruction cache hit patterns.  If you get it
consistently over several programs then it seems likely to mean
something, but I'm not sure what.

I've seen cases where simply inserting a couple of nops in a hot functionimproved performance by a significant margin (>10%, IIRC). The only theoryI could come up with was that there were more branches in a cache line thanthe branch prediction cache on this processor could cope with. I don'tthink it was merely an alignment issue.


Cheers,
        Simon

On Wed, Mar 25, 2009 at 6:01 PM, Claus Reinke <claus.rei...@talk21.com> wrote:

Indeed GHC does not attempt to retain the order of alternatives, although
a) it might be possible to do so by paying more attention in numerous
places
b) GHC may do so already, by accident, in certain cases

That adds even more unpredictability. One thing that I don't want whenever I
have to care about performance is small changes
having odd/unexplainable effects (I vaguely recall a case where removing an
unused parameter from a recursion made the program
slower, or eliminating returned constructors by using continuations
made one inner-loop function faster, another slower..).
Lennart is of course right: even if GHC would respect the ordering indicated
in my source program, I might not be able to tune that source to make good
use of even a single target processor (I tried defining a foldl over a
user-defined List type, so that I could switch
the order of Nil/Cons, and hence the test order used by GHC, and found the
Nil-before-Cons order to be 2-3% faster for folding a
very long list than the Cons-before-Nil order I wanted), but it is very
frustrating if I'm not even given the chance because GHC
sorts the alternatives, not even according to its own interpretation
of branching performance, but completely arbitrarily!-)

* The issue at stake is a small one: not the *number of tests* but *which
tests branch, and which fall through*.

Right on the issue, but I'm not quite sure how small it is: the test
case source I attached a few messages ago consistently showed one ordering
to be 5% faster than the other for the extreme case
of one test nearly always failing. There may well be more profitable
optimizations remaining to be implemented first - what disturbs me is that
Haskell code is full of conditionals and matches, which I tend to arrange
according to expected frequency, and GHC simply ignores all those hints.

With the hint about branch prediction, I also found this old ticket (which
seems to have been waiting for the new backend, and
indicates rather larger performance differences):

  Offer control  over branch prediction
  http://hackage.haskell.org/trac/ghc/ticket/849

* Simply ordering the equations doesn't really work, because pattern-match
compilation will match an entire column at once:
 f (x:xs) True = ...
 f []     True = ...
 f []     False = ...
 f (x:xs) False = ...
Which "order" should the (:)/[] test go in?

In the order indicated in the source!? The usual pattern-match
optimizations should not change that, they will just skip the two
'[]' cases if the list isn't empty, or use the constructor tag to jump
directly to a sub-column. Haskell specifies left-to-right, top-down.

* Not only does GHC currently not attempt to retain order, but for a
particular order it makes no guarantees about which falls through.  For
example, given
      case ... of { A -> e1; C -> e2; B -> e3 }
We might test for A and then
either fall though to e1
or     fall through to the test for C

That is the part I missed, and which might give the UNLIKELY
pragma, as suggested in #849, more expressive power than
plain clause ordering. However, since Haskell specifies a match
order, I don't see why that couldn't be used as the basis for
mapping to branches as well, with the clauses listed in decreasing
likelyhood, and GHC generating the branch predictions and fallthroughs to
match this information to the target processor characteristics?

* When the number of constructors is larger, instead of a linear sequence
of tests, GHC may generate a table-jump; or a balanced tree of tests.

The table-jump would make all alternatives equally costly/fast,
with no penalty for adding infrequent alternatives, right? The
balanced tree sounds like one of the pattern-match state machines, and there
would still be room for representing expected frequency in terms of
tree-path/-rotation/-representation.

* Which plan performs best is tremendously architecture dependent, and may
well vary a lot between different chips implementing the same instruction
set.  It's a losing battle to fix the strategy in source code.
* More promising might be to say "this is the hot branch".  That
information about frequency could in principle be used by the back end to
generate better code.  However, I am unsure how
      a) to express this info in source code
      b) retain it throughout optimisation


So it should be specified in the source, after all, just in a way
that gives programmers room to express their knowledge while
leaving GHC free to implement that knowledge on the target.

Things like the UNLIKELY pragma would seem useful, if
attached to decisions: unless GHC can optimize the whole decision away, it
will remain throughout optimization, and come out as some form of branch,
with the hint still attached.

But UNLIKELY only covers the most common case
(marking alternatives with minimal expected frequency) -
if clause ordering was respected, relative frequencies of
alternatives could be specified without pragmas, just by
ordering pattern-match or conditional clauses according
to expected frequency.

Claus, if you think this thread is worth capturing, then do write a
Commentary page, and I'll check its veracity.

Given the existence of #849, I've just linked this thread
from there.

Claus

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Re: compilation of pattern-matching?

Reply via email to