Thanks that is *exactly* the problem I am having. The missing line breaks are kind of weird.
I will try the newest head (which will mean getting about 20 libs to compile...) and report back. I left the core out intentionally because the vector library tends to produce ugly core when it can not optimize. Should the problem *not* go away with HEAD (right now I am using 6.12.3) then I will create a self-contained set of files that reproduce the problem. Thanks again, Christian ----- Original message ----- > It's hard to read your code because the line breaks have been removed. > Moreover I think your code depends on some unspecified Hackage package. > And I couldn't find the enclosed Core dumps. Nor do you specify which > version of GHC you are using. > > Still I believe that the nub is this. You have a class with a default > method declaration > > class FoldFunctions a where > opt = <blah> > > and you get different behaviour if you say > > instance FoldFunctions Int > from > instance FoldFunctions Int where > opt = <blah> > > That's surprising because the second case simply fills in the method > with the identical code to the code in the class declaration. > > Is that right? > > If so, I think you might want to try the HEAD. Can you do that (use a > binary snapshot). I think I fixed this default-method-inlining stuff so > that there is no difference between the above two. > > But I could have failed, or there could be something else going on in > your example. > > Simon > > > | -----Original Message----- > | From: [email protected] > [mailto:glasgow-haskell- | [email protected]] On Behalf Of > Christian Höner zu Siederdissen | Sent: 26 August 2010 10:36 > | To: Glasgow-Haskell-Users > | Subject: late optimization of typeclass callers > | > | Hi, > | > | I do have the problem that my code is not completely optimized. We > begin with | > | -- Ring.hs > | class Ring a where > | rmult :: a -> a -> a > | zero :: a > | > | -- PrimitiveArray.hs > | class PrimArrayOps a b where > | data PrimArray a b :: * > | unsafeIndex :: PrimArray a b -> a -> b > | > | -- PAInstances.hs > | -- for any 'a' of Data.Primitive.Types by Roman, have unboxed arrays > instance | (Prim a) => PrimArrayOps where > | data PrimArray (Int,Int) a = PaIIxI {-# UNPACK #-} !(Int,Int) {-# > UNPACK #- | } !ByteArray > | unsafeIndex (PaIIxI (mI,mJ) arr) (i,j) = {-# CORE "IIxIunsafeIndex" > #-} | case (i*(mJ+1)+j) of idx -> indexByteArray arr idx > | > | -- RNAfoldFunctions.hs > | -- VU.Unbox a, because of the vector package import > Data.Vector.Unboxed as VU | class (Ring a, VU.Unbox a, Prim a) => > FoldFunctions a where | opt = VU.foldl' rmult zero $ base turnertables > inp table table i j | > | base trnr inp m m1 i j = VU.zipWith rmult ms m1s where > | cnt = j-i-2 -- TODO when to stop? > | ms = VU.map (\ik -> m `unsafeIndex` ik) $ VU.generate cnt (\k > -> > | (i,i+k)) > | m1s = VU.map (\kj -> m1 `unsafeIndex` kj) $ VU.generate cnt (\k -> > | (k+1+i,j)) {-# INLINE multibranchCloseBase #-} > | > | > | > | If I now use this stuff... > | -- [1] > | instance Ring Int where > | rmult = min > | zero = 10000 > | instance FoldFunctions Int > | module MyProgram where > | main = do > | let val = opt trnr inp myM myM1 15 78 > | > | I get this core [1]. If I do this > | -- [2] > | instance Ring Int where > | rmult = min > | zero = 10000 > | instance FoldFunctions Int > | opt = VU.foldl' rmult zero $ base turnertables inp table table i j > module | MyProgram where main = do > | let myM = PrimArray of (Int,Int) with Int values > | ... > | let val = opt trnr inp myM myM1 15 78 > | > | Is there a way to get the program without an explicit FoldFunctions > instance | to specialize to the same Core as the second? The runtime > (Criterion was | used) is 7.1us (or worse) for [1] and 2.7us for [2]. I > could put nice INLINEs | everywhere but they do not help optimizing. > These functions run O(n^2) times, | with n between 100 and 10000. The > core shows that temporary arrays are even | created, filled, and then > the fold run over them. | > | So basically, can I get code running through several class instances > to be | optimized at the caller, where all instances are known? > Otherwise I could | live with [2], but as the code will almost always be > the same for | FoldFunctions instances, it would be really nice to be > able to use the | defaults that were defined on (Ring a, VU.Unbox a, > Prim a). | > | Thanks, > | Christian _______________________________________________ Glasgow-haskell-users mailing list [email protected] http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
