Re: [Caml-list] Optimizing Float Ref's

2010-04-14 Thread Goswin von Brederlow
Dmitry Bely dmitry.b...@gmail.com writes:

 rewrite the function to get rid of it? My best achievement so far is

 let max_val (a:float array) =
   let m = [|-.min_float|] in
   for i = 0 to (Array.length a) - 1 do
 if a.(i)  m.(0)
 then m.(0) - a.(i)
   done;
   m.(0)

 but it looks ugly...

 - Dmitry Bely

Why not store the index of the max element instead of the float itself?
Recursively, to avoid the ref as well, this could look like this:

let max_val (a:float array) =
  let len = Array.length a in
  let rec loop max i =
if i = len
then a.(max)
else
  if a.(max)  a.(i)
  then loop max (i + 1)
  else loop i (i + 1)
  in
if len = 0
then -.min_float
else loop 0 1

MfG
Goswin

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2010-03-31 Thread Dmitry Bely
On Thu, Sep 3, 2009 at 1:44 PM, Xavier Leroy xavier.le...@inria.fr wrote:
(...)
 But, I thought that float ref's were automatically unboxed by the compiler
 when they didn't escape the local context.

 Yes, if all uses of the float ref are unboxed, which is the case in
 your code.

let max_val (a:float array) =
  let m = ref min_float in
  for i = 0 to (Array.length a) - 1 do
if a.(i)  !m
then m := a.(i)
  done;
  !m

!m is not unboxed (Ocaml 3.11). Should it?

- Dmitry Bely

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2010-03-31 Thread Dmitry Bely
On Wed, Mar 31, 2010 at 9:55 PM, Jake Donham j...@donham.org wrote:
 On Wed, Mar 31, 2010 at 10:21 AM, Dmitry Bely dmitry.b...@gmail.com wrote:
 let max_val (a:float array) =
  let m = ref min_float in
  for i = 0 to (Array.length a) - 1 do
    if a.(i)  !m
    then m := a.(i)
  done;
  !m

 !m is not unboxed (Ocaml 3.11). Should it?

 Why do you say that? It looks unboxed in the lambda code:

  (let
    (max_val/58
       (function a/59
         (let (m/60 (field 13 (global Pervasives!)))
           (seq
             (for i/61 0 to (- (array.length a/59) 1)
               (if (. (array.get a/59 i/61) m/60)
                 (assign m/60 (array.get a/59 i/61)) 0a))
             m/60

No. assign is translated into memory allocation inside the loop.

 My x86 assembler is not very good but it looks to me like m is kept in %esi.

Ok, probably returning float makes allocation place not obvious. Try
this variant:

let max_val (a:float array) =
  let m = ref min_float in
  for i = 0 to (Array.length a) - 1 do
if a.(i)  !m
then m := a.(i)
  done;
  truncate !m

It's translated into the following C-- code (-unsafe flag):

(function camlT__max_val_58 (a/59: addr)
 (let m/60 (load (+a camlPervasives 52))
   (let (i/61 1 bound/65 (+ (or (u (load (+a a/59 -4)) 10) 1) -2))
 (catch
   (if ( i/61 bound/65) (exit 3)
 (loop
   (if
 (f (load float64u (+a (+a a/59 ( i/61 2)) -4))
   (load float64u m/60))
 (assign m/60
   (alloc 2301
 (load float64u (+a (+a a/59 ( i/61 2)) -4
 [])
   (let i/64 i/61 (assign i/61 (+ i/61 2))
 (if (== i/64 bound/65) (exit 3) []
 with(3) []))
   (+ ( (intoffloat (load float64u m/60)) 1) 1)))

See alloc 2301 inside the loop? That's boxed float.

- Dmitry Bely

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2010-03-31 Thread Alain Frisch

On 31/03/2010 19:21, Dmitry Bely wrote:

On Thu, Sep 3, 2009 at 1:44 PM, Xavier Leroyxavier.le...@inria.fr  wrote:
(...)

But, I thought that float ref's were automatically unboxed by the compiler
when they didn't escape the local context.


Yes, if all uses of the float ref are unboxed, which is the case in
your code.


let max_val (a:float array) =
   let m = ref min_float in
   for i = 0 to (Array.length a) - 1 do
 if a.(i)  !m
 then m := a.(i)
   done;
   !m

!m is not unboxed (Ocaml 3.11). Should it?


As far as I can tell, the boxing for the reference cell is removed by 
the compiler (that is, the reference is implemented as a mutable local 
variable), but not the boxing for the float contained in the reference. 
Since a is a float array, it is needed to box the float before putting 
it in the reference, hence the allocation.



-- Alain


___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2010-03-31 Thread Dmitry Bely
On Wed, Mar 31, 2010 at 11:52 PM, Ted Sandler ted.sand...@gmail.com wrote:
 rpretation of Xavier's words was that the compiler would
 completely optimize out memory allocation and keep !m in register. I
 can live with the allocation but not at every iteration... How to
 rewrite the function to get rid of it? My best achievement so far is

 Try tail-recursion.

Won't work. Show the code.

- Dmitry Bely

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2009-09-03 Thread Xavier Leroy
I'm running OCaml 3.11.1, and I noticed something strange in some native 
code for matrix multiply today.  The code was

[...]
[Local float ref being unboxed or not? ]


You omitted the definition of dims, but after adding the obvious
definition, the float ref sum is indeed completely unboxed and is
kept in a float register (on x86-64 bits) or stack location (on x86-32
bits).  No modify takes place in the inner loop.  So, I don't
understand the problem you observed.  Feel free to post a report on
the BTS with a *complete* piece of code that reproduces the problem.

But, I thought that float ref's were automatically unboxed by the 
compiler when they didn't escape the local context.


Yes, if all uses of the float ref are unboxed, which is the case in
your code.

- Xavier Leroy

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2009-09-03 Thread Will M Farr

Xavier,

Thanks for the comments.  I thought that float ref's were unboxed by  
default!  In fact, I find that breaking out the code into a stand- 
alone example which loops through matrix multiplies only indeed does  
not have any calls to caml_modify; everything is unboxed and stored  
on the stack (I'm on x86) as you say.  It must be the remainder of my  
application that is producing the calls to caml_modify (the profile I  
was using was embedded in a larger application with lots of inlining  
going on, so maybe something that calls caml_modify is getting inlined  
around the matrix multiply and confusing the profiler).


Thanks,
Will

On Sep 3, 2009, at 5:44 AM, Xavier Leroy wrote:

I'm running OCaml 3.11.1, and I noticed something strange in some  
native code for matrix multiply today.  The code was

[...]
[Local float ref being unboxed or not? ]


You omitted the definition of dims, but after adding the obvious
definition, the float ref sum is indeed completely unboxed and is
kept in a float register (on x86-64 bits) or stack location (on x86-32
bits).  No modify takes place in the inner loop.  So, I don't
understand the problem you observed.  Feel free to post a report on
the BTS with a *complete* piece of code that reproduces the problem.

But, I thought that float ref's were automatically unboxed by the  
compiler when they didn't escape the local context.


Yes, if all uses of the float ref are unboxed, which is the case in
your code.

- Xavier Leroy




PGP.sig
Description: This is a digitally signed message part
___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2009-08-31 Thread Till Varoquaux
True. All float records and float arrays are unboxed so modifications
don't have to pass through caml_modify. A single cell array will still
have dynamic bound checking so I would recommend going for the record.
Till

On Sun, Aug 30, 2009 at 3:43 PM, Yaron Minskyymin...@gmail.com wrote:
 Float refs are not unboxed automatically, because refs
 Are polymorphic containers. If you create your own pseudo-ref, i.e., a
 record with a single mutable float field, then I believe you should get the
 behaviour you expect.

 Come to think of it, I wonder if it would be better to implement ref on top
 of a single-cell array, since then everyone would get the float unboxing
 whenever applicable. I imagine there is some runtime overhead to this,
 though.

 y


 On Aug 28, 2009, at 4:32 PM, Will M Farr f...@mit.edu wrote:

 Hello all,

 I'm running OCaml 3.11.1, and I noticed something strange in some native
 code for matrix multiply today.  The code was

 let mmmul store m1 m2 =
  let (ni,nk) = dims m1 and
     (nk2,nj) = dims m2 and
     (sni,snj) = dims store in
  assert(nk=nk2);
  assert(ni=sni);
  assert(nj=snj);
  for i = 0 to ni - 1 do
   let row1 = m1.(i) and
       srow = store.(i) in
   for j = 0 to nj - 1 do
     let sum = ref 0.0 in   (* Un-boxed float ref? *)
     for k = 0 to nk - 1 do
       let row2 = Array.unsafe_get m2 k in
       let x = Array.unsafe_get row1 k and
           y = Array.unsafe_get row2 j in
       sum := !sum +. x*.y
     done;
     Array.unsafe_set srow j !sum
   done
  done;
  store

 (I compiled with ocamlopt.)  It multiplies the matrices (represented as
 arrays of arrays of floats) m1 and m2 together and puts the result into the
 matrix store.  Profiling the code, I noticed a call to caml_modify during
 the execution of this function!  Turns out that the culprit was the float
 ref sum.  Changing to the following code (which eliminates the float ref,
 and uses the - and .( ) operators instead of unsafe_set and unsafe_get)
 eliminated that call, and sped things up tremendously:

 let mmmul store m1 m2 =
  let (ni,nk) = dims m1 and
     (nk2,nj) = dims m2 in
  for i = 0 to ni - 1 do
   let row1 = m1.(i) and
       srow = store.(i) in
   for j = 0 to nj - 1 do
     srow.(j) - 0.0;
     for k = 0 to nk - 1 do
       let row2 = Array.unsafe_get m2 k in
       let x = row1.(k) and
           y = row2.(j) in
       srow.(j) - srow.(j) +. x*.y
     done
   done
  done;
  store

 But, I thought that float ref's were automatically unboxed by the compiler
 when they didn't escape the local context.  Is this a complier bug, is there
 a bad interaction with unsafe_get and unsafe_set, or is there something else
 going on that I don't understand?  Any enlightenment would be appreciated.

 Thanks!
 Will
 ___
 Caml-list mailing list. Subscription management:
 http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
 Archives: http://caml.inria.fr
 Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
 Bug reports: http://caml.inria.fr/bin/caml-bugs

 ___
 Caml-list mailing list. Subscription management:
 http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
 Archives: http://caml.inria.fr
 Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
 Bug reports: http://caml.inria.fr/bin/caml-bugs


___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2009-08-31 Thread Will M Farr
Thanks to all who have replied.  I didn't realize that references were  
always boxed (I thought that, internally, a reference was implemented  
as a one-element array, and therefore float refs would benefit from  
the automatic unboxing of float arrays).  That's good to know.


Will

On Aug 31, 2009, at 10:09 AM, Till Varoquaux wrote:


True. All float records and float arrays are unboxed so modifications
don't have to pass through caml_modify. A single cell array will still
have dynamic bound checking so I would recommend going for the record.
Till

On Sun, Aug 30, 2009 at 3:43 PM, Yaron Minskyymin...@gmail.com  
wrote:

Float refs are not unboxed automatically, because refs
Are polymorphic containers. If you create your own pseudo-ref,  
i.e., a
record with a single mutable float field, then I believe you should  
get the

behaviour you expect.

Come to think of it, I wonder if it would be better to implement  
ref on top
of a single-cell array, since then everyone would get the float  
unboxing
whenever applicable. I imagine there is some runtime overhead to  
this,

though.

y


On Aug 28, 2009, at 4:32 PM, Will M Farr f...@mit.edu wrote:


Hello all,

I'm running OCaml 3.11.1, and I noticed something strange in some  
native

code for matrix multiply today.  The code was

let mmmul store m1 m2 =
 let (ni,nk) = dims m1 and
(nk2,nj) = dims m2 and
(sni,snj) = dims store in
 assert(nk=nk2);
 assert(ni=sni);
 assert(nj=snj);
 for i = 0 to ni - 1 do
  let row1 = m1.(i) and
  srow = store.(i) in
  for j = 0 to nj - 1 do
let sum = ref 0.0 in   (* Un-boxed float ref? *)
for k = 0 to nk - 1 do
  let row2 = Array.unsafe_get m2 k in
  let x = Array.unsafe_get row1 k and
  y = Array.unsafe_get row2 j in
  sum := !sum +. x*.y
done;
Array.unsafe_set srow j !sum
  done
 done;
 store

(I compiled with ocamlopt.)  It multiplies the matrices  
(represented as
arrays of arrays of floats) m1 and m2 together and puts the result  
into the
matrix store.  Profiling the code, I noticed a call to caml_modify  
during
the execution of this function!  Turns out that the culprit was  
the float
ref sum.  Changing to the following code (which eliminates the  
float ref,
and uses the - and .( ) operators instead of unsafe_set and  
unsafe_get)

eliminated that call, and sped things up tremendously:

let mmmul store m1 m2 =
 let (ni,nk) = dims m1 and
(nk2,nj) = dims m2 in
 for i = 0 to ni - 1 do
  let row1 = m1.(i) and
  srow = store.(i) in
  for j = 0 to nj - 1 do
srow.(j) - 0.0;
for k = 0 to nk - 1 do
  let row2 = Array.unsafe_get m2 k in
  let x = row1.(k) and
  y = row2.(j) in
  srow.(j) - srow.(j) +. x*.y
done
  done
 done;
 store

But, I thought that float ref's were automatically unboxed by the  
compiler
when they didn't escape the local context.  Is this a complier  
bug, is there
a bad interaction with unsafe_get and unsafe_set, or is there  
something else
going on that I don't understand?  Any enlightenment would be  
appreciated.


Thanks!
Will
___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs





PGP.sig
Description: This is a digitally signed message part
___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2009-08-31 Thread Jon Harrop
On Friday 28 August 2009 21:32:55 Will M Farr wrote:
 Hello all,

 I'm running OCaml 3.11.1, and I noticed something strange in some
 native code for matrix multiply today.

I'm running OCaml 3.11.0 and I get the opposite result on both x86 and x64:

$ ./will
Took 31.445965s
Took 39.942496s

$ ./will
Took 25.457591s
Took 40.414526s

 But, I thought that float ref's were automatically unboxed by the
 compiler when they didn't escape the local context.  Is this a
 complier bug, is there a bad interaction with unsafe_get and
 unsafe_set, or is there something else going on that I don't
 understand?  Any enlightenment would be appreciated.

That appears to occur in this case on my machine but it can be hampered 
sometimes if the accumulator is returned immediately and the problem worked 
around by performing a redundant floating point operation (such as 1.0 *. x) 
on the result.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs


Re: [Caml-list] Optimizing Float Ref's

2009-08-31 Thread Jon Harrop
On Sunday 30 August 2009 20:43:17 Yaron Minsky wrote:
 Float refs are not unboxed automatically, because refs 
 Are polymorphic containers. If you create your own pseudo-ref, i.e., a
 record with a single mutable float field, then I believe you should
 get the behaviour you expect.

I believe you are talking at cross purposes. Will wants his accumulator in a 
register. You are referring to float references in data structures being 
boxed because records are boxed.

Look at the compiled forms of the inner loops of these dot products, for 
example:

let dot a b =
  let x = ref 0.0 in
  for i=0 to Array.length a - 1 do
x := !x +. a.(i) *. b.(i)
  done;
  !x

.L101:
.L103:  movlcaml_young_ptr, %eax
subl$12, %eax
movl%eax, caml_young_ptr
cmplcaml_young_limit, %eax
jb  .L104
leal4(%eax), %eax
movl$2301, -4(%eax)
fldl-4(%edi, %ecx, 4)
fmull   -4(%ebx, %ecx, 4)
faddl   (%esi)
fstpl   (%eax)
movl%eax, %esi
movl%ecx, %eax
addl$2, %ecx
cmpl%edx, %eax
jne .L101

let dot2 a b =
  let x = ref 0.0 in
  for i=0 to Array.length a - 1 do
x := !x +. a.(i) *. b.(i)
  done;
  1.0 *. !x

.L107:
fldl-4(%eax, %ecx, 4)
fmull   -4(%ebx, %ecx, 4)
faddl   0(%esp)
fstpl   0(%esp)
movl%ecx, %edx
addl$2, %ecx
cmpl%esi, %edx
jne .L107

In the latter case, x is unboxed into a register.

 Come to think of it, I wonder if it would be better to implement ref
 on top of a single-cell array, since then everyone would get the float
 unboxing whenever applicable. I imagine there is some runtime overhead
 to this, though.

All-float (including one-float) records are unboxed anyway.

Boxing was discussed in the book OCaml for Scientists and the OCaml Journal 
articles about optimization and the SciMark2 benchmark.

-- 
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e

___
Caml-list mailing list. Subscription management:
http://yquem.inria.fr/cgi-bin/mailman/listinfo/caml-list
Archives: http://caml.inria.fr
Beginner's list: http://groups.yahoo.com/group/ocaml_beginners
Bug reports: http://caml.inria.fr/bin/caml-bugs