Re: [Haskell-cafe] vector operations

2012-06-11 Thread Evan Laforge
On Tue, May 29, 2012 at 12:52 PM, Roman Leshchinskiy r...@cse.unsw.edu.au 
wrote:
 On 29/05/2012, at 19:49, Evan Laforge wrote:

 Good question.. I copied both to a file and tried ghc-core, but it
 inlines big chunks of Data.Vector and I can't read it very well, but
 it looks like the answer is no, it still builds the the list of sums.
 I guess the next step is to benchmark and see how busy the gc is on
 each version.

 Vector should definitely fuse this, if it doesn't it's a bug. Please report 
 if it doesn't for you. To verify, just count the number of letrecs in the 
 optimised Core. You'll see one letrec if it has been fused and two if it 
 hasn't.

I see two letrecs in find_before2, but both of them are on findIndex.
I only have one findIndex so I'm not sure what's going on.  The first
one calls the second, but there's an boxed Either argument in there,
which must be coming out of vector internals.

I had to stick NOINLINE on the functions so I could find them in the
core.  I don't think this should affect the optimization of the
contents, though.

The fold_abort version is shorter and simpler, only has one letrec
that takes all unboxed arguments, and I think I can more or less
follow what it's doing.

Of course that doesn't mean it's any faster, I could be just
misreading the core.  I could do a bug report, but maybe someone else
should look at the core first to make sure I'm not just confused?  I
appended the file below, just run ghc-core and search for find_before.

On Tue, May 29, 2012 at 12:54 PM, Duncan Coutts
duncan.cou...@googlemail.com wrote:
 Note that foldr allows early abort so that's fine. Also, there's no
 fundamental restriction due to stream fusion. Stream fusion can be
 used for lazy lists afterall and can implement Data.List.foldr just
 fine.

But can foldr do a sum running from left to right?  I thought you had
to be left-biased for that.

And as for early abort with foldr, I can think of how to do so if I'm
generating lazy data with a right-biased constructor like (:), but how
could you do that for, say, a sum?  The obvious version, 'foldr (\x v
- if v  10 then v else v + x) 0' will still run the function on
every element.

I suppose if fusion works its magic then early abort with foldl or
scanl should happen.  If the generating loop gets fused with the
consuming loop, and the consuming loop only consumes part of the
input, as it would with findIndex.



import qualified Data.Vector.Unboxed as Unboxed

-- | Find the index of the last value whose running sum is still below the
-- given number.
{-# NOINLINE find_before #-}
find_before :: Int - Unboxed.Vector Int - Int
find_before n = fst . fold_abort go (0, 0)
where
go (i, total) a
| total + a = n = Just (i+1, total+a)
| otherwise = Nothing

fold_abort :: (Unboxed.Unbox a) =
(accum - a - Maybe accum) - accum - Unboxed.Vector a - accum
fold_abort f accum vec = go 0 accum
where go i accum = maybe accum (go (i+1)) $ f accum = vec Unboxed.!? i

{-# NOINLINE find_before2 #-}
find_before2 :: Int - Unboxed.Vector Int - Int
find_before2 n vec = case Unboxed.findIndex (n) sums of
Just i - max 0 (i-1)
Nothing - 0
where sums = Unboxed.scanl' (+) 0 vec

main :: IO ()
main = do
print (t0 find_before)
print (t0 find_before2)

t0 :: (Int - Unboxed.Vector Int - Int) - [Int]
t0 f = [f n (Unboxed.fromList [2, 2, 2, 2]) | n - [0..6]]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector operations

2012-06-11 Thread Roman Leshchinskiy
On 11/06/2012, at 18:52, Evan Laforge wrote:

 On Tue, May 29, 2012 at 12:52 PM, Roman Leshchinskiy r...@cse.unsw.edu.au 
 wrote:
 
 Vector should definitely fuse this, if it doesn't it's a bug. Please report 
 if it doesn't for you. To verify, just count the number of letrecs in the 
 optimised Core. You'll see one letrec if it has been fused and two if it 
 hasn't.
 
 I see two letrecs in find_before2, but both of them are on findIndex.
 I only have one findIndex so I'm not sure what's going on.  The first
 one calls the second, but there's an boxed Either argument in there,
 which must be coming out of vector internals.

Hmm, which version of GHC and what compiler flags are you using? I'm not 
familiar with ghc-core, maybe that's doing something wrong. Just run ghc -O2 
-ddump-simpl and look at the output. Below is the code I'm getting for 
find_before2 with 7.4.2. As you can see, everything has been fused (although I 
notice that GHC isn't pushing x_a11p and y1_a124 into the branches for some 
reason, looks like a new regression but not a particularly bad one and nothing 
to do with fusion).

find_before2_rkk :: Int - Vector Int - Int
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LU(LLL)m]
find_before2_rkk =
  \ (n_arE :: Int) (vec_arF :: Vector Int) -
case vec_arF `cast` ...
of _ { Vector ipv_s2Jf ipv1_s2Jg ipv2_s2Jh -
case n_arE of _ { I# y_a11t -
case # 0 y_a11t of _ {
  False -
letrec {
  $sfindIndex_loop_s2Qz [Occ=LoopBreaker]
:: Int# - Int# - Int# - Id (Maybe Int)
  [LclId, Arity=3, Str=DmdType LLL]
  $sfindIndex_loop_s2Qz =
\ (sc_s2Q8 :: Int#) (sc1_s2Q9 :: Int#) (sc2_s2Qa :: Int#) -
  case =# sc_s2Q8 ipv1_s2Jg of _ {
False -
  case indexIntArray# ipv2_s2Jh (+# ipv_s2Jf sc_s2Q8)
  of wild_a2JM { __DEFAULT -
  let {
x_a11p [Dmd=Just L] :: Int#
[LclId, Str=DmdType]
x_a11p = +# sc1_s2Q9 wild_a2JM } in
  case # x_a11p y_a11t of _ {
False -
  $sfindIndex_loop_s2Qz (+# sc_s2Q8 1) x_a11p (+# sc2_s2Qa 
1);
True - (Just @ Int (I# sc2_s2Qa)) `cast` ...
  }
  };
True - (Nothing @ Int) `cast` ...
  }; } in
case ($sfindIndex_loop_s2Qz 0 0 1) `cast` ... of _ {
  Nothing - lvl_r2QO;
  Just i_arH -
case i_arH of _ { I# x_a11Q -
let {
  y1_a124 [Dmd=Just L] :: Int#
  [LclId, Str=DmdType]
  y1_a124 = -# x_a11Q 1 } in
case =# 0 y1_a124 of _ {
  False - lvl_r2QO;
  True - I# y1_a124
}
}
};
  True - lvl_r2QO
}
}
}

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector operations

2012-06-11 Thread Evan Laforge
On Mon, Jun 11, 2012 at 1:29 PM, Roman Leshchinskiy r...@cse.unsw.edu.au 
wrote:
 Hmm, which version of GHC and what compiler flags are you using? I'm not 
 familiar with ghc-core, maybe that's doing something wrong. Just run ghc -O2 
 -ddump-simpl and look at the output. Below is the code I'm getting for 
 find_before2 with 7.4.2. As you can see, everything has been fused (although 
 I notice that GHC isn't pushing x_a11p and y1_a124 into the branches for some 
 reason, looks like a new regression but not a particularly bad one and 
 nothing to do with fusion).

I'm using 7.0.3, but I tried with 7.4.2 and it looks like what you
got, with only one letrec.

So it probably has to do with new optimizations introduced since 7.0.

Thanks for the help, and I will use this count the letrecs technique
in the future if I have questions.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector operations

2012-05-29 Thread Evan Laforge
Good question.. I copied both to a file and tried ghc-core, but it
inlines big chunks of Data.Vector and I can't read it very well, but
it looks like the answer is no, it still builds the the list of sums.
I guess the next step is to benchmark and see how busy the gc is on
each version.

But my impression was that stream fusion can't handle early aborts,
which was why I was wondering why Vector lacks a foldAbort type
function.

On Wed, May 23, 2012 at 5:13 AM, Jake McArthur jake.mcart...@gmail.com wrote:
 Have you already verified that stream fusion won't just do this for you?

 On May 23, 2012 12:35 AM, Evan Laforge qdun...@gmail.com wrote:

 So I wanted to find the first index in a vector whose running sum is
 greater than a given number.

 The straightforward way is to create the running sum and then search:

 Vector.findIndex (=target) (Vector.scanl' (+) 0 vector)

 But vectors are strict so it could do extra work, and what if I don't
 want to generate garbage?  I could do it with a fold, but it would
 have to have the ability to abort early.  Of course I could write such
 a fold myself using indexing:

 import qualified Data.Vector.Generic as Vector

 fold_abort :: (Vector.Vector v a) = (accum - a - Maybe accum) - accum
    - v a - accum
 fold_abort f accum vec = go 0 accum
    where go i accum = maybe accum (go (i+1)) $ f accum = vec Vector.!? i

 find_before :: (Vector.Vector v a, Num a, Ord a) = a - v a - Int
 find_before n = fst . fold_abort go (0, 0)
    where
    go (i, total) a
        | total + a = n = Nothing
        | otherwise = Just (i+1, total+a)

 So it's bigger and clunkier, but I would think it would be much more
 efficient (provided using Data.Vector.Generic won't inhibit inlining
 and unboxing).  But I'm a bit surprised there isn't already something
 like fold_abort... or is there?

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector operations

2012-05-29 Thread Roman Leshchinskiy
On 29/05/2012, at 19:49, Evan Laforge wrote:

 Good question.. I copied both to a file and tried ghc-core, but it
 inlines big chunks of Data.Vector and I can't read it very well, but
 it looks like the answer is no, it still builds the the list of sums.
 I guess the next step is to benchmark and see how busy the gc is on
 each version.

Vector should definitely fuse this, if it doesn't it's a bug. Please report if 
it doesn't for you. To verify, just count the number of letrecs in the 
optimised Core. You'll see one letrec if it has been fused and two if it hasn't.

 But my impression was that stream fusion can't handle early aborts,
 which was why I was wondering why Vector lacks a foldAbort type
 function.

Stream fusion easily handles early aborts. There isn't anything like foldAbort 
precisely because it can be built out of existing operations at no extra cost.

Roman

 On Wed, May 23, 2012 at 5:13 AM, Jake McArthur jake.mcart...@gmail.com 
 wrote:
 Have you already verified that stream fusion won't just do this for you?
 
 On May 23, 2012 12:35 AM, Evan Laforge qdun...@gmail.com wrote:
 
 So I wanted to find the first index in a vector whose running sum is
 greater than a given number.
 
 The straightforward way is to create the running sum and then search:
 
 Vector.findIndex (=target) (Vector.scanl' (+) 0 vector)
 
 But vectors are strict so it could do extra work, and what if I don't
 want to generate garbage?  I could do it with a fold, but it would
 have to have the ability to abort early.  Of course I could write such
 a fold myself using indexing:
 
 import qualified Data.Vector.Generic as Vector
 
 fold_abort :: (Vector.Vector v a) = (accum - a - Maybe accum) - accum
- v a - accum
 fold_abort f accum vec = go 0 accum
where go i accum = maybe accum (go (i+1)) $ f accum = vec Vector.!? i
 
 find_before :: (Vector.Vector v a, Num a, Ord a) = a - v a - Int
 find_before n = fst . fold_abort go (0, 0)
where
go (i, total) a
| total + a = n = Nothing
| otherwise = Just (i+1, total+a)
 
 So it's bigger and clunkier, but I would think it would be much more
 efficient (provided using Data.Vector.Generic won't inhibit inlining
 and unboxing).  But I'm a bit surprised there isn't already something
 like fold_abort... or is there?
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector operations

2012-05-29 Thread Duncan Coutts
On 29 May 2012 11:49, Evan Laforge qdun...@gmail.com wrote:
 Good question.. I copied both to a file and tried ghc-core, but it
 inlines big chunks of Data.Vector and I can't read it very well, but
 it looks like the answer is no, it still builds the the list of sums.
 I guess the next step is to benchmark and see how busy the gc is on
 each version.

 But my impression was that stream fusion can't handle early aborts,
 which was why I was wondering why Vector lacks a foldAbort type
 function.

Note that foldr allows early abort so that's fine. Also, there's no
fundamental restriction due to stream fusion. Stream fusion can be
used for lazy lists afterall and can implement Data.List.foldr just
fine.

Duncan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector operations

2012-05-23 Thread Jake McArthur
Have you already verified that stream fusion won't just do this for you?
On May 23, 2012 12:35 AM, Evan Laforge qdun...@gmail.com wrote:

 So I wanted to find the first index in a vector whose running sum is
 greater than a given number.

 The straightforward way is to create the running sum and then search:

 Vector.findIndex (=target) (Vector.scanl' (+) 0 vector)

 But vectors are strict so it could do extra work, and what if I don't
 want to generate garbage?  I could do it with a fold, but it would
 have to have the ability to abort early.  Of course I could write such
 a fold myself using indexing:

 import qualified Data.Vector.Generic as Vector

 fold_abort :: (Vector.Vector v a) = (accum - a - Maybe accum) - accum
- v a - accum
 fold_abort f accum vec = go 0 accum
where go i accum = maybe accum (go (i+1)) $ f accum = vec Vector.!? i

 find_before :: (Vector.Vector v a, Num a, Ord a) = a - v a - Int
 find_before n = fst . fold_abort go (0, 0)
where
go (i, total) a
| total + a = n = Nothing
| otherwise = Just (i+1, total+a)

 So it's bigger and clunkier, but I would think it would be much more
 efficient (provided using Data.Vector.Generic won't inhibit inlining
 and unboxing).  But I'm a bit surprised there isn't already something
 like fold_abort... or is there?

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] vector operations

2012-05-22 Thread Evan Laforge
So I wanted to find the first index in a vector whose running sum is
greater than a given number.

The straightforward way is to create the running sum and then search:

Vector.findIndex (=target) (Vector.scanl' (+) 0 vector)

But vectors are strict so it could do extra work, and what if I don't
want to generate garbage?  I could do it with a fold, but it would
have to have the ability to abort early.  Of course I could write such
a fold myself using indexing:

import qualified Data.Vector.Generic as Vector

fold_abort :: (Vector.Vector v a) = (accum - a - Maybe accum) - accum
- v a - accum
fold_abort f accum vec = go 0 accum
where go i accum = maybe accum (go (i+1)) $ f accum = vec Vector.!? i

find_before :: (Vector.Vector v a, Num a, Ord a) = a - v a - Int
find_before n = fst . fold_abort go (0, 0)
where
go (i, total) a
| total + a = n = Nothing
| otherwise = Just (i+1, total+a)

So it's bigger and clunkier, but I would think it would be much more
efficient (provided using Data.Vector.Generic won't inhibit inlining
and unboxing).  But I'm a bit surprised there isn't already something
like fold_abort... or is there?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe