On 2020-08-23 21:19, Kevin Li wrote:
Alternatively, I could define another function in the C file:

void f_list(double *x, long len) {
    for (long i = 0; i < len; i++) x[i] = f(x[i]);
}

and expose this in the *.so.  Then I could define another verb

fs =: 'path.to.library.so f_list > n *d x'&cd

and call that instead. I find that fs is usually 2-3 times faster than f,
presumably because of the overhead in calling into the shared library.
However, fs has two disadvantages:

1. It modifies the array in place, which makes my code less functional

Because fs passes a *d rather than a &d, f_list is modifying a copy in
place, and since it doesn't return anything the work is lost.

   a =: i.5
   fs a;#a
0
   a
0 1 2 3 4

J arrays have value semantics, not reference semantics.
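
The effect can be sketched in plain C. This is only an illustration, assuming (as described above) that the callee is handed a temporary copy of the array. call_on_copy is a hypothetical stand-in for the calling layer, not anything J provides, and f is assumed to be x * 1.5 as in the D code later in this message.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed element function: x * 1.5, as defined later in this thread. */
static double f(double x) { return x * 1.5; }

/* In-place version: overwrites each element with f of that element. */
void f_list(double *x, long len) {
    for (long i = 0; i < len; i++) x[i] = f(x[i]);
}

/* Hypothetical stand-in for a caller that passes a temporary copy:
   the copy is updated and then discarded, so the array a never changes. */
void call_on_copy(const double *a, long len) {
    assert(len > 0);
    double *tmp = malloc((size_t)len * sizeof *tmp);
    if (tmp == NULL) return;
    memcpy(tmp, a, (size_t)len * sizeof *tmp);
    f_list(tmp, len);   /* the work happens on the copy */
    free(tmp);          /* and is lost here */
}
```

Calling call_on_copy on an array leaves it untouched, while calling f_list on it directly does change it, which is the difference between the two calling conventions.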

This is a bit awkward, especially the n =. line.
2. It makes it harder to work with atoms (I have to convert an atom into a
1 dimensional array first).

fs2i =: './f_list.so f_list2 > x &d x'&cd
fs2 =: 3 : 0"1
  n =. ,:`] @.(*@:#@:$) y
  a =. fs2i n;#n
  r =. memr a,0,(#n),8
  memf a
  r
)

Where f_list2 returns a malloc'd pointer as an integer:

extern(C) double f(double x) {
    return x * 1.5;
}

extern(C) size_t f_list2(double* x, int len) {
    import core.stdc.stdlib: malloc;
    double* ret = cast(double*)malloc(double.sizeof * len);
    foreach (i; 0 .. len) ret[i] = f(x[i]);
    return cast(size_t)ret;
}
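
For anyone without a D compiler, roughly the same thing in plain C might look like the sketch below. The allocation check, the use of long for the length (to mirror f_list above) and the f_list2_free helper are my own additions for illustration, not part of the library shown here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Same element function as the D version. */
static double f(double x) { return x * 1.5; }

/* Returns a malloc'd buffer of len doubles as an integer, or 0 if the
   allocation fails. The caller owns the buffer and must release it. */
uintptr_t f_list2(const double *x, long len) {
    assert(len > 0);
    double *ret = malloc((size_t)len * sizeof *ret);
    if (ret == NULL) return 0;
    for (long i = 0; i < len; i++) ret[i] = f(x[i]);
    return (uintptr_t)ret;
}

/* Hypothetical helper: frees a buffer returned by f_list2 using the
   same C runtime that allocated it. */
void f_list2_free(uintptr_t p) {
    free((void *)p);
}
```

Returning the pointer as an integer keeps the input untouched and leaves the read and the free to the caller.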

I also suspect there's a lot of unnecessary copying.
But usage is OK:

   fs2 5
7.5
   fs2 i.5
0 1.5 3 4.5 6
   fs2 i.5 5
   0  1.5    3  4.5    6
 7.5    9 10.5   12 13.5
  15 16.5   18 19.5   21
22.5   24 25.5   27 28.5
  30 31.5   33 34.5   36

Alternately:

fs3 =: 3 : 0
  n =. ,:y
  a =. fs2i n;*/$n
  r =. ($n) $ memr a,0,(*/$n),8
  memf a
  r
)

This has one FFI call regardless of y rank, as shown by
adding some noise to the library's functions:

   fs2 5
f_list2 called
7.5
   fs2 i.5
f_list2 called
0 1.5 3 4.5 6
   fs2 i.5 5
f_list2 called
f_list2 called
f_list2 called
f_list2 called
f_list2 called
   0  1.5    3  4.5    6
 7.5    9 10.5   12 13.5
  15 16.5   18 19.5   21
22.5   24 25.5   27 28.5
  30 31.5   33 34.5   36

vs.

   fs3 5
f_list2 called
7.5
   fs3 i.5
f_list2 called
0 1.5 3 4.5 6
   fs3 i.5 5
f_list2 called
   0  1.5    3  4.5    6
 7.5    9 10.5   12 13.5
  15 16.5   18 19.5   21
22.5   24 25.5   27 28.5
  30 31.5   33 34.5   36
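
The same comparison can be reproduced entirely in C, as a rough sketch. The counter and the two driver functions are hypothetical and only mimic what a row-at-a-time wrapper and a flattened wrapper would do. f_list_noisy writes into a caller-supplied buffer rather than allocating, to keep the example short.

```c
#include <assert.h>
#include <stdio.h>

static int calls = 0;   /* hypothetical call counter, for illustration only */

static double f(double x) { return x * 1.5; }

/* Writes f of each input element into out and reports that it was called. */
void f_list_noisy(const double *x, double *out, long len) {
    assert(len >= 0);
    calls++;
    puts("f_list2 called");
    for (long i = 0; i < len; i++) out[i] = f(x[i]);
}

/* One call per row, as a rank-1 wrapper would make. Returns the call count. */
int calls_per_row(const double *x, double *out, long rows, long cols) {
    calls = 0;
    for (long r = 0; r < rows; r++)
        f_list_noisy(x + r * cols, out + r * cols, cols);
    return calls;
}

/* One call over the whole array, treated as flat. Returns the call count. */
int calls_flat(const double *x, double *out, long rows, long cols) {
    calls = 0;
    f_list_noisy(x, out, rows * cols);
    return calls;
}
```

For a 5 by 5 input the row-wise driver makes five calls and the flat driver makes one, with identical results in the output buffer.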


Is there a way to get the best of both worlds?

Thanks!
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm