On 12/20/06, Francesc Altet <[EMAIL PROTECTED]> wrote:

On Wednesday 20 December 2006 03:36, David Cournapeau wrote:
> Francesc Altet wrote:
> > On Tuesday 19 December 2006 08:12, David Cournapeau wrote:
> >> Hi,
> >>


<snip>

@fname@_copyswap (void *dst, void *src, int swap, void *arr)
{

         if (src != NULL) /* copy first if needed */
                memcpy(dst, src, sizeof(@type@));

[where the numpy code generator replaces @fname@ with DOUBLE]

we see that memcpy is called under the hood (I don't know why oprofile
is not able to detect this call anymore).

After looking at the function, and remembering what Charles Harris
said in a previous message about the convenience of using a simple
type-specific assignment, I ended up replacing the memcpy. Here is the
patch:

--- numpy/core/src/arraytypes.inc.src   (revision 3487)
+++ numpy/core/src/arraytypes.inc.src   (working copy)
@@ -997,11 +997,11 @@
}

static void
-@fname@_copyswap (void *dst, void *src, int swap, void *arr)
+@fname@_copyswap (@type@ *dst, @type@ *src, int swap, void *arr)
{

         if (src != NULL) /* copy first if needed */
-                memcpy(dst, src, sizeof(@type@));
+                *dst = *src;

         if (swap) {
                 register char *a, *b, c;
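
For readers who want to try the copy step of the patch in isolation, here is a minimal sketch with @type@ expanded to double. The function names copy_memcpy and copy_assign are hypothetical and do not appear in numpy. The sketch assumes both pointers are suitably aligned for double, which the typed assignment requires and the memcpy version does not.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the original copy step: works on raw
 * bytes, so it makes no assumption about alignment. */
static void copy_memcpy(void *dst, const void *src)
{
    if (src != NULL) /* copy first if needed */
        memcpy(dst, src, sizeof(double));
}

/* Hypothetical stand-in for the patched copy step: a plain typed
 * assignment. Assumes dst and src are both aligned for double. */
static void copy_assign(double *dst, const double *src)
{
    if (src != NULL) /* copy first if needed */
        *dst = *src;
}
```

Both functions leave the same value in dst for aligned inputs; whether the assignment is actually faster depends on the compiler, since some compilers already turn a fixed-size memcpy into a plain load and store.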


We could get rid of the register keyword too; it is considered obsolete
these days. Also, on most architectures

#if [EMAIL PROTECTED]@ == 4
               b = a + 3;
               c = *a; *a++ = *b; *b-- = c;
               c = *a; *a++ = *b; *b   = c;

will be notably slower than

#if [EMAIL PROTECTED]@ == 4
               c = a[0]; a[0] = a[3]; a[3] = c;
               c = a[1]; a[1] = a[2]; a[2] = c;

because each indexed access is a single instruction when a is held in a
register.
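
To check that the two variants are interchangeable, they can be wrapped in standalone functions as in the sketch below. The names swap4_walk and swap4_indexed are hypothetical. The sketch only shows that both reverse the same four bytes; it says nothing about speed, which has to be measured, because an optimizing compiler may well generate identical code for both.

```c
/* Pointer-walking variant: a moves forward, b moves backward. */
static void swap4_walk(char *a)
{
    char *b = a + 3;
    char c;

    c = *a; *a++ = *b; *b-- = c;   /* exchange bytes 0 and 3 */
    c = *a; *a++ = *b; *b   = c;   /* exchange bytes 1 and 2 */
}

/* Indexed variant: a stays fixed and every access is a[constant]. */
static void swap4_indexed(char *a)
{
    char c;

    c = a[0]; a[0] = a[3]; a[3] = c;   /* exchange bytes 0 and 3 */
    c = a[1]; a[1] = a[2]; a[2] = c;   /* exchange bytes 1 and 2 */
}
```

Calling either function on the bytes {1, 2, 3, 4} leaves {4, 3, 2, 1} in the buffer.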

Inlining would also be good, but it can be tricky and compiler-dependent. If
all the code is in one big chunk, things aren't so bad and a simple inline
directive should do the trick. We would also want to break the subroutine up
into smaller pieces so that the common case is inlined and the more
complicated cases remain function calls.
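
A rough sketch of that split, with @type@ expanded to double, might look like the following. The names double_copyswap and double_copyswap_slow are hypothetical, and the sketch assumes a compiler that accepts C99 "static inline"; older compilers need their own spelling, which is the compiler-dependent part.

```c
#include <stddef.h>

/* Hypothetical slow path: copy, then reverse the bytes in place.
 * Kept as an ordinary function so it is not duplicated at call sites. */
static void double_copyswap_slow(double *dst, const double *src)
{
    char *a = (char *)dst;
    char c;
    size_t i;

    if (src != NULL) /* copy first if needed */
        *dst = *src;

    for (i = 0; i < sizeof(double) / 2; i++) {
        c = a[i];
        a[i] = a[sizeof(double) - 1 - i];
        a[sizeof(double) - 1 - i] = c;
    }
}

/* Hypothetical fast path: the common no-swap case is a single
 * assignment that the compiler can inline; swapping falls through
 * to the out-of-line function above. */
static inline void double_copyswap(double *dst, const double *src, int swap)
{
    if (swap) {
        double_copyswap_slow(dst, src);
        return;
    }
    if (src != NULL)
        *dst = *src;
}
```

With this layout a caller that never swaps pays only for the test on swap and one assignment, while the byte-reversal loop stays in a single place.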

Chuck
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
