Re: Loop unrolling + fusion ?

Claus Reinke Fri, 27 Mar 2009 14:32:24 -0700

I can't see any issues with this version of the spec.

Thanks. From the silence, we seemed to have lost the innocentbystanders? Anyway, for those who haven't noticed, there is nowa feature request ticket (for that good feeling of closing it when thisis finally implemented;-) as well as a wiki page describing the issues,spec, and examples:


http://hackage.haskell.org/trac/ghc/ticket/3123
http://hackage.haskell.org/trac/ghc/wiki/Inlining

I think in the implementation it makes most sense to do this as a
core2core pass at an early stage in the pipeline, probably via plugins

(so will have to wait until I get those into HEAD).


What are the plans for plugin support? I do think plugins will be
useful, but inlining is pretty central to the existing optimizer 
transformations,
isn't it? Would the transformation code differ much between in-GHC
and via-plugins? Perhaps the transformation pass could be implemented
now, and later moved out into a plugin, possibly along with other passes.

I have also been wondering about the relation between rewrite RULES
and plugins. Assuming we can find a more convenient syntax, aren't
plugin+syb-based rewrites going to be more expressive, with more

control than RULES? Or is the syntactic/compiletime overhead goingto remain so high that both RULES and plugins will be kept in GHC?


(cf the recent thread on "optimization and rewrite rules questions"
http://www.haskell.org/pipermail/glasgow-haskell-users/2009-February/016702.html
 )

In the case of
PEEL, we don't want to identify all call sites directly and do the
substitution in the pass so we should just output some bindings which
will certainly be inlined into call sites later on. So, the
transformation should be bottom up on the Core syntax tree and when it
meets a recursive group of bindings we should do something like this:

{-# INLINE f g PEEL 3 UNROLL 2 #-}
f = ... g ... f ... h ...
g = ... g ... f ... h ...
h = ... g ... f ... h ...

=(my pass)=>

-- Temporary copies of f and g - dead code
f_old = ... g_old ... f_old ... h ...
g_old = ... g_old ... f_old ... h ...
-- H unchanged for now, might get PEELed stuff inlined later
h = ... g .. f ... h ...


You mean UNROLLed stuff (PEEL is only for entries into the group).

-- Top level unrolled definiiton - if we weren't doing peeling, these
would be the new f and g
f_unrolled = ... g_unrolled_1 ... f_unrolled_1 ... h ...
g_unrolled = ... g_unrolled_1 ... f_unrolled_1 ... h ...

-- Unrolled iteration. Will get inlined into f_unrolled / g_unrolled soon
{-# INLINE f_unrolled_1 g_unrolled_1 #-}
f_unrolled_1 = ... g_unrolled ... f_unrolled ... h ...
g_unrolled_1 = ... g_unrolled ... f_unrolled ... h ...


Ah, yes, we need to be unambiguous about the interpretation of the

counters:-) I was thinking of n+1 (adding n copies to the original), youare thinking of n (adding copies until there are n).

-- One level of peeling
{-# INLINE f_1 g_1 #-}
f_1 = ... g_unrolled ... f_unrolled ... h ...
g_1 = ... g_unrolled ... f_unrolled ... h ...

-- Second level of peeling
{-# INLINE f_2 g_2 #-}
f_2 = ... g_1 ... f_1 ... h ...
g_2 = ... g_1 ... f_1 ... h ...

-- Final level of peeling and new definitions for f and g. Inline pragmas
-- make sure all of this gets inlined at the call site
{-# INLINE f g #-}
f = ... g_2 ... f_2 ... h ...
g = ... g_2 ... f_2 ... h ...


Wait, now you are counting to n+1 for PEEL and to n for UNROLL?

=(after the simplifier has run - effectively - there are a few
harmless lies here)=>

-- NB: I haven't shown inlining of the new f and g here, but it /will/ happen
h = ... g .. f ... h ...


Since we are interpreting recursive groups as single entities, and there
is usually no inlining into definitions that will get inlined, we will have to
specify this carefully.

-- I've inlined the inner unrolled iteration at every /call site/
within the top level unrolled iteration, as per
-- the pragmas. Noone actualy calls this unrolled thing directly
though, since we used PEEL as well
f_unrolled = ... (... g_unrolled ... f_unrolled ... h ...) ... (...
g_unrolled ... f_unrolled ... h ...) ... h ...
g_unrolled = ... (... g_unrolled ... f_unrolled ... h ...) ... (...
g_unrolled ... f_unrolled ... h ...) ... h ...


Again, we have to make sure of this interpretation.

-- This huge chunk of code gets inlined at every call site, which in
turn call through to the unrolled bodies
{-# INLINE f g #-}
f = ... (... (... g_unrolled ... f_unrolled ... h ...) ... (...
g_unrolled ... f_unrolled ... h ...) ... h ...) ... (... (...
g_unrolled ... f_unrolled ... h ...) ... (... g_unrolled ...
f_unrolled ... h ...) ... h ...) ... h ...
g = ... (... (... g_unrolled ... f_unrolled ... h ...) ... (...
g_unrolled ... f_unrolled ... h ...) ... h ...) ... (... (...
g_unrolled ... f_unrolled ... h ...) ... (... g_unrolled ...
f_unrolled ... h ...) ... h ...) ... h ...

So this would be the result of inlining all the PEEL instances into 'f' and 'g'.

By ensuring that f and g are tagged INLINE we get the existing INLINE
restrictions automatically in later Core passes.


So the INLINE gets added after your pass is through, so that it isn't
affected, but later passes are. But what if there are multiple such PEEL/
UNROLL definitions handled by the one pass? Since the pass doesn't
do general INLINE, that is out of the way, but wouldn't it still PEEL
stuff from one group into the definitions from another group, even if
those definitions themselves are about to be PEELed/INLINEd? And
do we want that or not?

I think that this example transformation matches your spec - am I right?

Looks mostly right, apart from the ambiguities I mentioned. Couldyou please add your implementation sketch to the wiki page?


Claus

_______________________________________________
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

Re: Loop unrolling + fusion ?

Reply via email to