Using "Generators" to remove intermediate lists

Kevin Atkinson Mon, 05 Jul 1999 17:37:13 -0400
This is a multi-part message in MIME format.
--------------EF4673DF4E1172F415EEF3AF
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Attached is my idea for removing intermediate lists using what I call
Generators.  I would be very interested if any one has tried something
like this.  I would also be interested in how it compares with Andrew
John Gill's idea presented in his Cheap Deforestation theses and as
currently implemented with CVS version of GHC.

For numeric instance calculations my idea seems to make a BIG
difference.

Ghc compiler writers: As usual, I am having some trouble getting rules
to behave.  Any help would be appreciated.
-- 
Kevin Atkinson
[EMAIL PROTECTED]
http://metalab.unc.edu/kevina/
--------------EF4673DF4E1172F415EEF3AF
Content-Type: text/plain; charset=us-ascii;
 name="Opt.hs"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="Opt.hs"


{------ Using so called Generators to remove intermediate lists. ------}
{------ Kevin Atkinson, [EMAIL PROTECTED]                        ------}

module Opt where

import Prelude hiding (foldr, foldl, map, filter, take, drop, iterate)
import Maybe (fromJust)

{- The basic idea behind my idea is the Generator type: -}

type Gen h t = (t -> Maybe (h,t), -- generator function
                t )               -- initial value

{- where the generator function takes in an initial state (say a list) 
   and returns a value (the head of a list for example) plus a new state 
   (the tail of the list) or Nothing if there is nothing to return (an
    empty list).  It is the exact same type on function one
   would feed to the unfoldr (defined in the List module) and is very
   similar to an Ananorphism pair except that the generator and the 
   predicate function is wrapped into one function.  

   Now all functions which normal deal with lists will instead work 
   with nothing but generators. -}
   
foldrG :: (a -> b -> b) -> b -> Gen a t -> b
foldrG f v (g,i) = f' i
    where f' i = case g i of 
                 Just (h,t) -> f h (f' t)
                 Nothing    -> v

foldr1G :: (a -> a -> a) -> Gen a t -> a
foldr1G f (g,i) = f' $ fromJust $ g i
    where f' (h,t) = case g t of 
                     Nothing -> h
                     Just v  -> f h (f' v)

foldlG :: (b -> a -> b) -> b -> Gen a t -> b
foldlG f v (g,i) = f' v i
    where f' v i = case g i of
                   Just (h, t) -> f' (f v h) t
                   Nothing     -> v


foldl1G :: (a -> a -> a) -> Gen a t -> a
foldl1G f (g,i) = foldlG f v (g,i') where (v,i') = fromJust$ g i

mapFst f (a,b) = (f a,b)
mapSnd f (a,b) = (a,f b)

mapG :: (h -> i) -> Gen h t -> Gen i t 
mapG f (g,i) = ((\a->fmap (mapFst f) (g a)),i)
               
data TakeG v = TakeG !Int v

takeG :: Int -> Gen h t -> Gen h (TakeG t)
takeG num (g,i) = (g',TakeG 0 i) 
    where g' (TakeG i v) | i == num  = Nothing
                         | otherwise = fmap (mapSnd $ TakeG $ i+1) (g v)

filterG :: (h -> Bool) -> Gen h t -> Gen h t
filterG f (g,i) = (g',i)
    where g' i = case g i of
                 Just (h,t) | f h       -> Just (h,t)
                            | otherwise -> g' t
                 Nothing                -> Nothing

rangeG (a,z) = numRangeG (a, a+1, z)

numRangeG (a,b,z) = (g,a)
    where g i | i > z     = Nothing
              | otherwise = Just (i,i+(b-a))

iterateG f i = ((\i->Just (i,f i)),i)

headG (g,i) = fst$ fromJust$ g i
tailG (g,i) = snd$ fromJust$ g i

lastG = foldr1G (flip const)

{- And to work with normal lists the normal list functions will simply 
   convert the generator to and from a list as necessary  -}

fromG :: Gen a b -> [a]
fromG = foldrG (:) [] 

toG   :: [a] -> Gen a [a]
toG   l   = (g,l) 
    where g (h:t) = Just (h,t)
          g _     = Nothing

foldr f v = foldrG f v . toG

foldl f v = foldlG f v . toG

map f = fromG . mapG f . toG

take n = fromG . takeG n . toG

filter f = fromG . filterG f . toG

intRange :: (Int, Int) -> [Int]
intRange = fromG . rangeG

numRange :: (Ord a, Num a) => (a,a,a) -> [a]
numRange = fromG . numRangeG

iterate f i = fromG $ iterateG f i

{- and finally use the rule: -}

{-# RULES
"toG/fromG"    forall a.  toG (fromG a) = a
 #-}

{- to remove any intermediate conversion to list -}

{- the following inlines seam necessary to make rule work out -}

{-# INLINE fromG #-}
{-# INLINE toG #-}
{-# INLINE foldr #-}
{-# INLINE foldl #-}
{-# INLINE map #-}
{-# INLINE take #-}
{-# INLINE filter #-}
{-# INLINE intRange #-}
{-# INLINE numRange #-}
{-# INLINE iterate #-}

{- 

   Although I have not had a chance to look into Andrew John Gill's
   _Cheap Deforestation for Non-strict Functional Languages_ (1996) in 
   detail, as I see my approach semas to have two distinct advantages:

   1) Generators work with tail-recursive foldl.

   2) They are easier to generalize to other data structures (simply
      create a functions to convert the data structure to and from 
      a generator.

   For numeric instance calculations such as finding numeric approximations
   for integrals, and solutions to differential equations the tail-recursive
   foldl can make a big difference.

   For a demonstration of the difference it can make so the file Main.cc. 
   The integrate with a foldlG is only 25% slower than a direct c or c++
   approach as demonstrated in ctest.cc.

-}

--------------EF4673DF4E1172F415EEF3AF
Content-Type: text/plain; charset=us-ascii;
 name="Main.hs"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="Main.hs"


module Main where
    
import Prelude hiding (foldr, foldl, map, filter, take, drop, iterate)
import System
import Opt

-- usage ./main <command> <integral> [<rts options>]

{- Use "+RTS -s -RTS" when running the program  to benchmark it.  
   The stats will be dumped in main.stat -}

main = do [command, integral'] <- getArgs
          let integral = read integral'
          print$ case command of
                 "l"  -> integratel  (**2) (integral,1) integral
                 "r"  -> integrater  (**2) (integral,1) integral
                 "l2" -> integratel2 (**2) (integral,1) integral
                 "r2" -> integrater2 (**2) (integral,1) integral
                 "l3" -> integratel3 (**2) (integral,1) integral
                 "r3" -> integrater3 (**2) (integral,1) integral
                 _    -> error "Invalid Command\n"

infixl 1 $>
($>) = flip ($)

integratel :: (Double -> Double) -> (Double,Double) -> Double -> Double
integratel f (a,z) i = numRangeG (a,a+i,z) $> 
                       mapG f              $>
                       mapG (*i)           $>
                       foldlG (+) 0    

integrater :: (Double -> Double) -> (Double,Double) -> Double -> Double
integrater f (a,z) i = numRangeG (a,a+i,z) $> 
                       mapG f              $>
                       mapG (*i)           $>
                       foldrG (+) 0    

--these two replay on the compiler to eliminate toG/fromGs based on the
--rule that it is an idenity however.  However the compiler is not fully
--doing this job as thee are intergates are several times slower

integratel2 :: (Double -> Double) -> (Double,Double) -> Double -> Double
integratel2 f (a,z) i = numRange (a,a+i,z) $> 
                        map f              $>
                        map (*i)           $>
                        foldl (+) 0    

integrater2 :: (Double -> Double) -> (Double,Double) -> Double -> Double
integrater2 f (a,z) i = numRange (a,a+i,z) $> 
                        map f              $>
                        map (*i)           $>
                        foldr (+) 0    

-- these two use standard prelude functions.  You can very easilly run
-- out of stack/heap space when using these for small values of integral

integratel3 :: (Double -> Double) -> (Double,Double) -> Double -> Double
integratel3 f (a,z) i = [a,a+i .. z]       $>
                        Prelude.map f      $>
                        Prelude.map (*i)   $>
                        Prelude.foldl (+) 0 

integrater3 :: (Double -> Double) -> (Double,Double) -> Double -> Double
integrater3 f (a,z) i = [a,a+i .. z]       $>
                        Prelude.map f      $>
                        Prelude.map (*i)   $>
                        Prelude.foldr (+) 0

--------------EF4673DF4E1172F415EEF3AF
Content-Type: text/plain; charset=us-ascii;
 name="ctest.cc"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="ctest.cc"


#include <iostream>
#include <math.h>
#include <time.h>

// To use simply run ctest.  It will prompt you for the integral and will
// report on the time it took to estimate the integration of x^2 from 
// 0 to 1

int main(int argc, const char * argv[]) {
  double integral;
  cout << "Integral: ";
  cin >> integral;
  double accum = 0;
  clock_t start = clock();
  for (double i = integral; i < 1; i = i + integral) {
    accum = accum + pow(i,2) * integral;
  }
  cout << accum << endl;
  cout << "Time: " << double(clock() - start)/CLOCKS_PER_SEC << endl;
}

--------------EF4673DF4E1172F415EEF3AF
Content-Type: text/plain; charset=us-ascii;
 name="makefile"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="makefile"


HC      = ghc
HC_OPTS = -recomp -cpp -fglasgow-exts -O2-for-C\
          -O -funfolding-use-threshold24 -fmax-simplifier-iterations6 \
          $(EXTRA_HC_OPTS)

#note: the unfolding-use-threshold really does need to be this high for
#      maxim performance.  I am not sure if setting 
#      max-simplifier-iterations to 6 does any good but the simplifier 
#      was bailing out when it was set to 4 (the default) or 5.
#      Adding -O2-for-c semas to make performace sligtly worse!

SRCS = $(wildcard *.hs)
OBJ2 = $(SRCS:.hs=.o)
OBJS = $(OBJ2:.lhs=.o)

.SUFFIXES : .o .hi .lhs .hc .s

world : main ctest

main : $(OBJS)
        rm -f $@
        $(HC) -o $@ $(HC_OPTS) $(OBJS)

ctest : ctest.cc
        g++ -O2 ctest.cc -o ctest

clean :
        rm -f *.o *.hi 

realclean: clean
        rm -f main ctest main.* core *~ *.bak

depend :
        mkdependHS -- $(HC_OPTS) -- $(SRCS)

# Standard suffix rules
%.hi : %.o
        @:

%.o : %.hs
        $(HC) -c $< $(HC_OPTS)

%.o : %.lhs
        $(HC) -c $< $(HC_OPTS)



























# DO NOT DELETE: Beginning of Haskell dependencies
Main.o : Main.hs
Main.o : ./Opt.hi
Opt.o : Opt.hs
# DO NOT DELETE: End of Haskell dependencies

--------------EF4673DF4E1172F415EEF3AF--
Using "Generators" to remove intermediate lists

Reply via email to