Re: [polyml] FFI overhead

Artella Coding Sun, 20 Sep 2015 22:54:46 -0700

Hi, thanks for the pointers on the ffi.

"I strongly suspect that SML functions like IntArray.update take less time
than the FFI overhead, so no improvement is possible by using C functions."


Yes it seems so. I was hoping that I could somehow achieve the extremely
low overheads that I get when using the haskell ffi (because I think ghc
haskell compiler also uses libffi ;
https://github.com/ghc/ghc/commit/39e206a7badd18792a7c8159cff732c63a9b19e7)
but it is not obvious how I can do that.

It seems that "val previous = IntArray.sub(data,i-1)" and
"IntArray.update(data,i,use+1);" are bottlenecks. For example the following
program :


**********************************************************
**********************************************************
**********************************************************

val size:int = 50000;
val loops:int = 30000;
val cap:int = 50000;

val data = IntArray.array(size,0);

fun loop () =
  let
    fun loopI i =
      if i = size then
        let val _ = () in
          IntArray.update(data,0,IntArray.sub(data,size-1));
          ()
        end
      else
        let val previous = IntArray.sub(data,i-1)
            val use = if previous > cap then 0 else previous in
          IntArray.update(data,i,use+1);
          loopI (i+1)
      end
  in loopI 1 end

fun benchmarkRun () =
  let
    fun bench i =
      if i = loops then ()
      else let val _ = () in
             loop ();
             bench (i+1)
           end
  in bench 1 end

fun sum (i,value) =
  if i = size then value
  else sum(i+1,value+Array.sub(data,i))

fun main () = let val _ = () in
  benchmarkRun();
  print (Int.toString (sum (0,0)));
  print "\n"
  end

**********************************************************
**********************************************************
**********************************************************


takes about 52 seconds. Now if I modify "loop" (by commenting out "val
previous = IntArray.sub(data,i-1)" and "IntArray.update(data,i,use+1);") to
:


**********************************************************
**********************************************************
**********************************************************

fun loop () =
  let
    fun loopI i =
      if i = size then
        let val _ = () in
          IntArray.update(data,0,IntArray.sub(data,size-1));
          ()
        end
      else
        let (*val previous = IntArray.sub(data,i-1)*)
            val previous = 0
            val use = if previous > cap then 0 else previous in
          (*IntArray.update(data,i,use+1);*)
          loopI (i+1)
      end
  in loopI 1 end

**********************************************************
**********************************************************
**********************************************************


then the program takes 8 seconds.





On Sun, Sep 20, 2015 at 5:33 PM, Phil Clayton <[email protected]>
wrote:

> Poly/ML is using libffi to call C functions.  To determine the FFI
> overhead, you could create an example that just makes the same number of
> calls to empty C functions with the same number of arguments.
>
> To see what happens when a C function is called, look at call_sym function
> in foreign.cpp:
> https://github.com/polyml/polyml/blob/master/libpolyml/foreign.cpp#L874
> For a C function to improve efficiency, the C function needs to be faster
> by more than the FFI overhead.  I strongly suspect that SML functions like
> IntArray.update take less time than the FFI overhead, so no improvement is
> possible by using C functions.  (I suspect such simple SML functions take
> much less time, so I would expect use of C functions to be much slower.)
>
> Phil
>
> P.S. I think that there is scope for efficiency improvement in Poly/ML but
> with some upheaval.  For example, if call_sym first took parameters that
> indicate the types of arguments and the return value, then, for each call
> site, arg_values and arg_types could be created once and the call to
> ffi_prep_cif made once.  Still, the arguments have to be filled in on each
> call:
>     PolyWord p = arg_list;
>     for (POLYUNSIGNED i=0; i<num_args; i++,p=Tail(p))
>     {
>         arg_values[i] = DEREFVOL(taskData, Head(p).AsObjPtr()->Get(1));
>         arg_types[i] = ctypeToFfiType(taskData,
> Head(p).AsObjPtr()->Get(0));
>
>     }
>
>
> 19/09/15 16:17, Artella Coding wrote:
>
>> Hi, I thought that I would try to speed up the SML code at
>>
>> http://stackoverflow.com/questions/32425267/how-to-improving-array-benchmark-performance-in-polyml
>> by using the FFI, but this results in significant slowdown.
>>
>> Non ffi code :
>>
>> *************************************************************************
>> *************************************************************************
>> *************************************************************************
>>
>> val size:int = 50000;
>> val loops:int = 30;
>> val cap:int = 50000;
>>
>> val data = IntArray.array(size,0);
>>
>> fun loop () =
>>    let
>>      fun loopI i =
>>        if i = size then
>>          let val _ = () in
>>            IntArray.update(data,0,IntArray.sub(data,size-1));
>>            ()
>>          end
>>        else
>>          let val previous = IntArray.sub(data,i-1)
>>              val use = if previous > cap then 0 else previous in
>>            IntArray.update(data,i,use+1);
>>            loopI (i+1)
>>        end
>>    in loopI 1 end
>>
>> fun benchmarkRun () =
>>    let
>>      fun bench i =
>>        if i = loops then ()
>>        else let val _ = () in
>>               loop ();
>>               bench (i+1)
>>             end
>>    in bench 1 end
>>
>> fun sum (i,value) =
>>    if i = size then value
>>    else sum(i+1,value+Array.sub(data,i))
>>
>> fun main () = let val _ = () in
>>    benchmarkRun();
>>    print (Int.toString (sum (0,0)));
>>    print "\n"
>>    end
>>
>> (*val _ = main ()*)
>>
>> *************************************************************************
>> *************************************************************************
>> *************************************************************************
>>
>> FFI code :
>>
>> c code :
>>
>> *************************************************************************
>> *************************************************************************
>> *************************************************************************
>>
>> //intArray.c
>> #include <stdlib.h>
>> #include <stdio.h>
>>
>> typedef struct _intArray {
>>    int size;
>>    int* arr;
>> } intArray;
>>
>> intArray* createIntArray(int size){
>>    int i;
>>    intArray* p = (intArray*) malloc (sizeof(intArray));
>>    p->arr = (int*) malloc (size*sizeof(int));
>>    for(i=0; i<size; i++){
>>      p->arr[i] = 0;
>>    }
>>    p->size = size;
>>    return p;
>> }
>>
>> void destroyIntArray(intArray* p){
>>    free (p->arr);
>>    free (p);
>> }
>>
>> void setIntArray(intArray* p, int elem, int val){
>>    p->arr[elem] = val;
>> }
>>
>> int getIntArray(intArray *p, int elem){
>>    return p->arr[elem];
>> }
>>
>> int getSumIntArray(intArray* p){
>>    int sum = 0;
>>    int i;
>>    int size = p->size;
>>    for(i=0; i<size; i++){
>>      sum += p->arr[i];
>>    }
>>    return sum;
>> }
>>
>> *************************************************************************
>> *************************************************************************
>> *************************************************************************
>>
>> ml code :
>>
>> *************************************************************************
>> *************************************************************************
>> *************************************************************************
>>
>> open CInterface;
>>
>> val lib = load_lib "./intArray.so";
>> val get = get_sym "./intArray.so";
>>
>> val PINTARR = POINTER;
>>
>> val c1 = call1 (get "createIntArray") INT PINTARR
>> val c2 = call3 (get "setIntArray") (PINTARR,INT,INT) VOID
>> val c3 = call2 (get "getIntArray") (PINTARR,INT) INT
>> val c4 = call1 (get "getSumIntArray") (PINTARR) INT
>>
>> fun c_createIntArray (size) = c1 (size);
>> fun c_setIntArray (p,elem,value) = c2 (p,elem,value);
>> fun c_getIntArray (p,elem) = c3 (p,elem);
>> fun c_getSumIntArray (p) = c4 (p);
>>
>>
>> val size:int = 50000;
>> val loops:int = 30;
>> val cap:int = 50000;
>>
>> fun loop (pData2) =
>>    let
>>      fun loopI i =
>>        if i = size then
>>          let val _ = () in
>>            c_setIntArray(pData2,0,c_getIntArray(pData2,size-1));
>>            ()
>>          end
>>        else
>>          let
>>              val previous = c_getIntArray(pData2,i-1);
>>              val use = if previous > cap then 0 else previous in
>>            c_setIntArray(pData2,i,use+1);
>>            loopI (i+1)
>>        end
>>    in loopI 1 end
>>
>> fun benchmarkRun (pData2) =
>>    let
>>      fun bench i =
>>        if i = loops then ()
>>        else let val _ = () in
>>               loop (pData2);
>>               bench (i+1)
>>             end
>>    in bench 1 end
>>
>> fun main () =
>>    let
>>      val pData = c_createIntArray(size);
>>      val final = load_sym lib "destroyIntArray";
>>    in
>>    setFinal final pData;
>>    benchmarkRun(pData);
>>    print (Int.toString (c_getSumIntArray (pData)));
>>    print "\n"
>>    end
>>
>> *************************************************************************
>> *************************************************************************
>> *************************************************************************
>>
>> The times are :
>>
>> a)for non ffi sml : 0.09s
>> b)for ffi sml : 11.8s
>>
>> Is there any way I can improve the speeds on the ffi code? Thanks
>>
>>
>> _______________________________________________
>> polyml mailing list
>> [email protected]
>> http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
>>
>>
> _______________________________________________
> polyml mailing list
> [email protected]
> http://lists.inf.ed.ac.uk/mailman/listinfo/polyml
>

_______________________________________________
polyml mailing list
[email protected]
http://lists.inf.ed.ac.uk/mailman/listinfo/polyml

Re: [polyml] FFI overhead

Reply via email to