Re: 0.9.8: cfb_enc.c bug? and AES speed on Win64/x64

Brian Hurt Thu, 07 Jul 2005 12:20:27 -0700


On Thu, 7 Jul 2005, Jack Lloyd wrote:

On Thu, Jul 07, 2005 at 07:42:37PM +0200, Andy Polyakov wrote:

1) In openssl-0.9.8/crypto/des/cfb_enc.c line 170 there is "memcpy
(ovec,ovec+num,8);" and since ovec and ovec+num will overlap sometimes,
this function relies on undocumented/undefined behavior of memcpy?


The original reason for choosing memcpy was a) it's commonly inlined
by compilers [most notably gcc], but not memmove, b) I fail to imagine
how it can fail with overlapping regions if num is guaranteed to be
positive, even if the routine is super-optimized, inlined, whatever. Can
you?


This doesn't make any sense - if memcpy can handle overlapping regions
without any slowdown, then wouldn't it make sense to implement
memmove as a #define (or inline call to) memcpy? Either memcpy does
not handle overlaps while memmove does, or memcpy and memmove work at
the same speed, because the ability to handle overlapping memory
regions is the only difference between the two. The only other
alternative is that memcpy and memmove do the exact same thing, but
memmove is slower. That seems quite unlikely.

If the regions overlap, the behavior is undefined according to the standard, which means that the compiler or produced code can do something odd, segfault, or whistle dixie and explode, and still be conformant.
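For an overlapping copy like the one in the question, memmove is the call the standard defines for that case. A minimal sketch, assuming a 16-byte buffer; the name shift_down and the constant OVEC_SIZE are made up for illustration and are not taken from cfb_enc.c:

```c
#include <assert.h>
#include <string.h>

enum { OVEC_SIZE = 16 };  /* assumed buffer size, for illustration only */

/* Shift the 8 bytes starting at ovec[num] down to ovec[0].
   The source and destination overlap whenever num < 8, so this
   uses memmove rather than memcpy. */
static void shift_down(unsigned char ovec[OVEC_SIZE], int num) {
    /* Keep the 8-byte source window inside the buffer. */
    assert(num >= 0 && num <= OVEC_SIZE - 8);
    memmove(ovec, ovec + num, 8);
}
```

Whether a given compiler inlines memmove as readily as memcpy is a separate question; this only shows the form that stays within defined behavior.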

And it can fail with overlapping arguments. Consider the "normal" implementation (which is in no way guaranteed) of memcpy:


#include <stddef.h>  /* size_t */

/* Copies one byte at a time, lowest address first. */
void * memcpy(void * dst, const void * src, size_t len) {
    char * d = (char *) dst;
    const char * s = (const char *) src;

    while (len-- > 0) {
        *d++ = *s++;
    }

    return dst;
}

Now, call the above code the following way:
    {
        char mem[] = "Hello, world!";
        memcpy(mem+1, mem, sizeof(mem)-1);
    }

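Instead of doing what was intended, moving the string up one place, the code reads back the byte it has just written on every iteration, so the leading 'H' is copied over every following byte in the buffer.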
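To make the difference concrete, here is a small sketch that runs the same one-place shift two ways. The byte-at-a-time loop is renamed naive_copy (a made-up name) so that it does not clash with the library memcpy, and the two shift_up_* helpers are likewise only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The forward byte-at-a-time copy from above, under a hypothetical
   name so it does not replace the library memcpy. */
static void *naive_copy(void *dst, const void *src, size_t len) {
    char *d = dst;
    const char *s = src;
    while (len-- > 0) {
        *d++ = *s++;
    }
    return dst;
}

/* Shift the first len bytes of buf up one place with the naive copy.
   buf must have room for at least len + 1 bytes. */
static void shift_up_naive(char *buf, size_t len) {
    assert(buf != NULL);
    naive_copy(buf + 1, buf, len);
}

/* The same shift with memmove, which is defined for overlapping regions. */
static void shift_up_memmove(char *buf, size_t len) {
    assert(buf != NULL);
    memmove(buf + 1, buf, len);
}
```

Starting from a zero-padded 16-byte buffer holding "Hello, world!" and shifting 13 bytes, shift_up_naive leaves fourteen 'H' characters, while shift_up_memmove leaves "HHello, world!". Calling the real memcpy here instead would be undefined, so its result could match either of these or neither.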

One other comment I will make: I can write a faster memcpy for aligned or alignable medium to large copies (which is where, generally, performance is important) than "rep movsb" on the x86, which, for those who don't know x86 assembly, is the hardware equivalent of my implementation above.


Brian
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       [email protected]
Automated List Manager                           [EMAIL PROTECTED]
