Hello Clément,
Concerning my other question about the new Scilab API: while trying to port a Scilab interface to Scilab 6 standards, I could not find the equivalent of the deprecated SciFunction. What is the new function, and where is it documented? How does one call a Scilab function from within gateway code, since "SciFunction" seems to belong to the past now?
Thanks in advance,
Stéphane
On 26/02/2018 at 12:36, Clément David wrote:
Hello Stéphane,
AFAIK, as in the old interface, the new C++ one allows you to get pointers to the raw double* values, and so lets you write anything to the inputs. However, this kind of behavior is considered a bug, since from the user's point of view (e.g. Scilab beginners) each assignment is supposed to have inputs (unmodified data) and outputs (computed data).
Note: about the += operator, this is indeed a way to tell the user that the argument is modified in place with a specific + operation.
Thanks,
--
Clément
On Thursday 22 February 2018 at 14:18 +0100, Stéphane Mottelet wrote:
Really, nobody knows ?
S.
On 20/02/2018 at 11:57, Stéphane Mottelet wrote:
Hello,
Continuing on this subject, I discovered that the new Scilab API allows modifying the input parameters of a function (in-place assignment). For example, I have modified the previous daxpy such that the expression
daxpy(2,X,Y)
has no output but formally does "Y += 2*X", if such an operator existed in Scilab. In this case there is no matrix copy at all, hence no memory overhead.
Was it possible to do this with the previous API?
S.
On 19/02/2018 at 19:15, Stéphane Mottelet wrote:
Hello,
After some tests, for the intended use (multiplying a matrix by a scalar), dgemm is not faster than dscal; but in the C code of "iMultiRealScalarByRealMatrix", the part which takes most of the CPU time is the call to "dcopy". For example, on my machine, for a 10000x10000 matrix, the call to dcopy takes 540 milliseconds and the call to dscal 193 milliseconds.
Continuing my explorations today, I tried to see how Scilab expressions such as
Y+2*X
are parsed and executed. For this purpose I have written an interface (daxpy.sci and daxpy.c attached) to the BLAS function "daxpy", which does "y <- y+a*x", and a script comparing the above expression to
daxpy(2,X,Y)
for two 10000x10000 matrices. Here are the results (MacBook Air, Core i7 @ 1.7 GHz):
daxpy(2,X,Y)
(dcopy: 582 ms)
(daxpy: 211 ms)
elapsed time: 793 ms
Y+2*X
elapsed time: 1574 ms
Considering the above values, the explanation is that in "Y+2*X" there are *two* copies of a 10000x10000 matrix instead of only one in "daxpy(2,X,Y)". In "Y+2*X+3*Z" there will be three copies, although there could be only one if daxpy were used twice.
I am not blaming Scilab here, I am just blaming "vectorization", which can be inefficient when large objects are used. That's why explicit loops can sometimes be faster than vectorized operations in Matlab or Julia (which both use JIT compilation).
S.
On 15/02/2018 at 17:11, Antoine ELIAS wrote:
Hello Stéphane,
Interesting ...
In release, we don't ship the headers of the BLAS/LAPACK functions.
But you can declare them in your C file as extern (and let the linker do its job):
extern int C2F(dgemm)(char *_pstTransA, char *_pstTransB, int *_piN, int *_piM, int *_piK,
                      double *_pdblAlpha, double *_pdblA, int *_piLdA,
                      double *_pdblB, int *_piLdB, double *_pdblBeta,
                      double *_pdblC, int *_piLdC);
and
extern int C2F(dscal)(int *_iSize, double *_pdblVal, double *_pdblDest, int *_iInc);
Other BLAS/LAPACK prototypes can be found at
http://cgit.scilab.org/scilab/tree/scilab/modules/elementary_functions/includes/elem_common.h?h=6.0
Regards,
Antoine
On 15/02/2018 at 16:50, Stéphane Mottelet wrote:
Hello all,
Following the recent discussion with fujimoto, I discovered that Scilab does not (seem to) fully use SIMD operations in BLAS as it should. Besides the bottlenecks of its code, there are also many operations of the kind
scalar*matrix
Although this operation is correctly delegated to the DSCAL BLAS function, as can be seen in the C function iMultiRealScalarByRealMatrix in modules/ast/src/c/operations/matrix_multiplication.c:
int iMultiRealScalarByRealMatrix(
    double _dblReal1,
    double *_pdblReal2, int _iRows2, int _iCols2,
    double *_pdblRealOut)
{
    int iOne = 1;
    int iSize2 = _iRows2 * _iCols2;
    C2F(dcopy)(&iSize2, _pdblReal2, &iOne, _pdblRealOut, &iOne);
    C2F(dscal)(&iSize2, &_dblReal1, _pdblRealOut, &iOne);
    return 0;
}
in the code below, the product "A*1" likely uses only one processor core, as seen from the CPU usage graph and the elapsed time:
A=rand(20000,20000);
tic; for i=1:10; A*1; end; toc
ans =
25.596843
but this second piece of code is more than 8 times faster and uses 100% of the CPU:
ONE=ones(20000,1);
tic; for i=1:10; A*ONE; end; toc
ans =
2.938314
with roughly the same number of multiplications. This second computation is
delegated to
DGEMM (C<-alpha*A*B + beta*C, here with alpha=1 and beta=0)
int iMultiRealMatrixByRealMatrix(
    double *_pdblReal1, int _iRows1, int _iCols1,
    double *_pdblReal2, int _iRows2, int _iCols2,
    double *_pdblRealOut)
{
    double dblOne = 1;
    double dblZero = 0;
    C2F(dgemm)("n", "n", &_iRows1, &_iCols2, &_iCols1, &dblOne,
               _pdblReal1, &_iRows1,
               _pdblReal2, &_iRows2, &dblZero,
               _pdblRealOut, &_iRows1);
    return 0;
}
Maybe my intuition is wrong, but I have the feeling that using dgemm with beta=0 would be faster than dscal. I plan to test this by making a quick and dirty code linked to Scilab, so my question to the devs is: which #includes should be added at the top of the C source to be able to call dgemm and dscal?
Thanks for your help
S.
_______________________________________________
dev mailing list
[email protected]
http://lists.scilab.org/mailman/listinfo/dev
--
Stéphane Mottelet
Research Engineer
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet