Alan Burlison created MAPREDUCE-6417:
----------------------------------------

             Summary: MapReduceClient's primitives.h is toxic and should be 
extirpated
                 Key: MAPREDUCE-6417
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6417
             Project: Hadoop Map/Reduce
          Issue Type: Sub-task
          Components: client
    Affects Versions: 2.7.0
            Reporter: Alan Burlison
            Priority: Blocker


MapReduceClient's primitives.h attempts to provide optimised versions of 
standard library memory copy and comparison functions. It has been the subject 
of several portability-related bugs:

* HADOOP-11505 hadoop-mapreduce-client-nativetask uses bswap where be32toh is 
needed, doesn't work on non-x86
* HADOOP-11665 Provide and unify cross platform byteorder support in native code
* MAPREDUCE-6397 MAPREDUCE makes many endian-dependent assumptions
* HADOOP-11484 hadoop-mapreduce-client-nativetask fails to build on ARM AARCH64 
due to x86 asm statements

At present it only works on x86 and ARM64 as it lacks definitions for bswap and 
bswap64 for any platforms other than those.

However it has even more serious problems on non-866 architectures, for example 
on SPARC simple_memcpy simply doesn't work at all:

{code}
$cat bang.cc
#include <string.h>
#define SIMPLE_MEMCPY
#include "primitives.h"
int main(int argc, char **argv)
{
    char b1[9];
    char b2[9];
    simple_memcpy(b2, b1, sizeof(b1));
}
$ gcc -o bang bang.cc && ./bang
Bus Error (core dumped)
{code}
That's because simple_memcpy does pointer fiddling that results in misaligned 
accesses, which are illegal on SPARC.

fmemcmp is also broken. Even if a definition of bswap is provided,on big-endian 
architectures the result is simply wrong because of it's unconditional use of 
bswap:

{code}
$ cat thud.cc
#include <stdio.h>
#include <strings.h>
#include "primitives.h"
int main(int argc, char **argv)
{
    char a[] = { 0,1,2,0 };
    char b[] = { 0,2,1,0 };
    printf("%lld %d\n", fmemcmp(a, b, sizeof(a), memcmp(a, b, sizeof(a))));
}
$ g++ -o thud thud.cc && ./thud
65280 -1
{code}

And in addition fmemcmp suffers from the same misalignment issues as 
simple_memcpy and coredumps on SPARC when asked to compare odd-sized buffers.

primitives.h should simply be deleted and replaced with the standard memory 
copy & compare functions, or with thin wrappers around them where the APIs are 
different.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to