The thesis suggests that using Uniplate with the fastest instances
will be around 30% slower than manual operations, but using SYB based
instances will be around 300% slower. These, however, are benchmarks
of generic traversals in isolation, and may not reflect how people
use the libraries in practice - the bottlenecks may be elsewhere.

As one data point, when we decided to use Strafunski (SYB predecessor)
for HaRe [1], it seemed that something had to be done about the various
sources of inefficiency in those generic traversals, but the reduction in
code size and improved readability/maintainability made it a winner anyway, turning an impractical implementation effort into something manageable. In practice, it never became the main performance bottleneck, so we never did anything about it.

That was for one-at-a-time traversals, though (you do a refactoring,
which will do a very small number of traversals for analysis and transformation, and then you are back in the GUI). The situation
is likely to be different when the Ast types are very complex and
large, and when many traversals are chained in sequence.

No boilerplate removal code:                    1:30 min
Uniplate with PlateDirect (fastest method):     1:45 min
Uniplate with PlateData (based on SYB):         5:00 min

And I suspect directly using SYB would have been somewhere in the
10-15min range.

It would be useful to have more information about the causes of
those timings. As I recall, the main sources of inefficiency
in SYB were:
(a) traversing irrelevant parts of the structure (because
    everything is treated generically);
(b) combining results in simple but inefficient ways ((++)
    nesting with the structure);
(c) repeated runtime type checks to determine which function to apply.

While runtime type checks are inherent in SYB (and alternatives
have been proposed that claim to avoid them), all of (a)-(c) can
be addressed to some extent when defining traversals using the
library. There tends to be a trade-off between (a) and (c), as
avoiding senseless generic traversals of specific substructures
implies additional runtime type checks to identify those
substructures (see the various special cases in showData for GHC,
at [2]).
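
To illustrate that trade-off with a sketch (toy types, not GHC's):
one extra runtime type check per node - cost (c) - buys skipping
whole annotation subtrees - cost (a):

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics

data Ann  = Ann String                     deriving (Show, Data, Typeable)
data Expr = Var String | App Ann Expr Expr deriving (Show, Data, Typeable)

-- collect variable names, but never descend into Ann subtrees:
-- the extra 'cast' per node (a type check, cost (c)) avoids
-- generically walking the irrelevant annotations (cost (a))
collectVars :: Data a => a -> [String]
collectVars x
  | Just (Ann _) <- cast x = []   -- prune: don't traverse annotations
  | otherwise = here ++ concat (gmapQ collectVars x)
  where
    here = case cast x of
             Just (Var v) -> [v]
             _            -> []
```

Whether the extra checks pay off depends on how large the pruned
substructures are relative to the rest of the tree.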

To avoid that trade-off, one could specialise often-used
traversal schemes, building in more knowledge about the
layout of GHC Ast types (instead of rediscovering said layout
at runtime in a fully generic traversal). As an extreme step, one
could prototype using SYB, then switch to hand-tuned traversals for
successful but inefficient prototypes. As Neil points out, that
tuning need not even change the interfaces.
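
For instance (again with a hypothetical toy Ast), a generic
renaming prototype and a hand-tuned replacement can share the same
type, so callers never notice the switch:

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Generics

data Expr = Var String | App Expr Expr | Lam String Expr
  deriving (Show, Eq, Data, Typeable)

-- prototype: a one-liner via SYB, paying generic-traversal costs
rename :: (String -> String) -> Expr -> Expr
rename f = everywhere (mkT var)
  where var (Var v) = Var (f v)
        var e       = e

-- hand-tuned drop-in: same type, knows the Expr layout directly
rename' :: (String -> String) -> Expr -> Expr
rename' f (Var v)   = Var (f v)
rename' f (App a b) = App (rename' f a) (rename' f b)
rename' f (Lam v b) = Lam v (rename' f b)
```

(This ignores binders for simplicity; the point is only that the two
definitions are interchangeable at their shared interface.)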

My view is that anyone manipulating an abstract syntax tree inside a
compiler without using some form of generic programming is doing it
wrong.

This also applies to other Ast-manipulating tools. Having Data/
Typeable instances enables tools and experiments that are simply
not practical without them. We should keep an eye on the performance
issues, however.

Claus

[1] 
http://www.cs.kent.ac.uk/projects/refactor-fp/publications/tool-support-for-rfp.pdf
[2] http://hackage.haskell.org/trac/ghc/wiki/GhcApiAstTraversals

_______________________________________________
Cvs-ghc mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/cvs-ghc
