Re: Help on memchr() EGLIBC assembly code
On 07/30/2009 09:29 AM, Aurelien Jarno wrote: Thanks for this patch I have tried it, and it does not have the original problem I have reported. Unfortunately it does not pass the glibc testsuite. I'll try to debug the problem later (I don't own an alpha machine, and need to have internet access to debug it remotely). This one passes the test suite (on x86, with replacements for the alpha builtins). r~ typedef unsigned long word; #ifdef __alpha__ #define cmpbeq0(X) __builtin_alpha_cmpbge(0, (X)) #define extql __builtin_alpha_extql #define extqh __builtin_alpha_extqh #define insbl __builtin_alpha_insbl #define zap __builtin_alpha_zap #else static word cmpbeq0(word y) { word i, r = 0; for (i = 0; i 8; ++i) { unsigned char yc = (y i*8); if (yc == 0) r |= 1 i; } return r; } static word extql(word x, word y) { return x ((y 7) * 8); } static word extqh(word x, word y) { return x (64 - ((y 7) * 8)); } static word zap(word x, word y) { word i, mask = 0; for (i = 0; i 8; ++i) if (y (1 i)) mask |= (word)0xff (i * 8); return x ~mask; } static word insbl(word x, word y) { word byte_mask = 1ul (y 7); word byte_loc = (y 7) * 8; return zap (x byte_loc, ~byte_mask); } #endif static inline word ldq_u(word s) { return *(const word *)(s -8); } #define unlikely(X) __builtin_expect ((X), 0) #define prefetch(X) __builtin_prefetch ((void *)(X), 0) #define find(X, Y) cmpbeq0 ((X) ^ (Y)) void * memchr (const void *xs, int xc, word n) { const word *s_align; const word s = (word) xs; word t, current, found; if (unlikely (n == 0)) return 0; current = ldq_u (s); /* Replicate low byte of XC into all bytes of C. */ t = insbl (xc, 1); /* 00c0 */ t = (xc 0xff) | t; /* 00cc */ t = (t 16) | t; /* */ const word c = (t 32) | t; /* */ /* When N = 8, we can perform the entire comparison in one go. Load the N bytes into the low part of CURRENT, via an unaligned quadword load sequence, and treat it as the last aligned word read. */ if (unlikely (n = 8)) { /* Tweak the standard unaligned quadword load sequence by issuing the second ldq_u at (s + n - 1) instead of (s + 8 - 1). This avoids crossing a page boundary when S+N doesn't. */ word last = extqh (ldq_u (s + n - 1), s); word first = extql (current, s); current = first | last; s_align = (const word *) s; goto last_quad; } /* Align the source, and decrement the count by the number of bytes searched in the first word. */ s_align = (const word *)(s -8); n += (s 7); /* Deal with misalignment in the first word for the comparison. */ { word mask = -1ul (s 7); found = find (current, c) mask; if (unlikely (found)) goto found_it; } s_align++; n -= 8; /* If the block is sufficiently large, align to cacheline and prefetch. */ if (unlikely (n = 256)) { /* Prefetch 3 cache lines beyond the one we're working on. */ prefetch (s_align + 8); prefetch (s_align + 16); prefetch (s_align + 24); while ((word)s_align 63) { current = *s_align; found = find (current, c); if (found) goto found_it; s_align++; n -= 8; } /* Within each cacheline, advance the load for the next word before the test for the previous word is complete. This allows us to hide the 3 cycle L1 cache load latency. We only perform this advance load within a cacheline to prevent reading across page boundary. */ #define CACHELINE_LOOP\ do { \ word i, next = s_align[0]; \ for (i = 0; i 7; ++i) \ { \ current = next; \ next = s_align[1]; \ found = find (current, c); \ if (unlikely (found)) \ goto found_it; \ s_align++; \ } \ current = next; \ found = find (current, c); \ if (unlikely (found)) \ goto found_it; \ s_align++;\ n -= 64;\ } while (0) /* While there's still lots more data to potentially be read, continue issuing prefetches for the 4th cacheline out. */ while (n = 256) { prefetch (s_align + 24); CACHELINE_LOOP; } /* Up to 3 cache lines remaining. Continue issuing advanced loads, but stop prefetching. */ while (n = 64) CACHELINE_LOOP; } /* Quadword aligned loop. */ current = *s_align; while (n 8) { found = find (current, c); if (unlikely (found)) goto found_it; current = *++s_align; n -= 8; } last_quad: { word mask = -1ul n; found = find (current, c) ~mask; if (found == 0) return 0; } found_it: { word offset; #ifdef __alpha_cix__ offset = __builtin_alpha_cttz (found); #else /* Extract LSB. */ found = -found; /* Binary search for the LSB. */ offset = (found 0x0f ? 0 : 4); offset += (found 0x33 ? 0 : 2); offset += (found
Re: Help on memchr() EGLIBC assembly code
On Wed, Jul 29, 2009 at 05:24:59PM -0700, Richard Henderson wrote: On 07/26/2009 04:45 PM, Aurelien Jarno wrote: Knowing that $31 could be used for prefetch, I have modified the assembly code from memchr.S to use it. It passes all the testsuite. This isn't intended to be a prefetch instruction, it's meant to be fetching the data for the next word. I.e. we've unrolled the loop and there's at least 8 bytes left in the search. Note the # At least two quads remain to be accessed. comment. At that point we're supposed to be more than 16 bytes away from the end of the input buffer. Actually, the confusion I see is farther upthread: The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I didn't realize this when I wrote the function. The entire function should be rewritten, since there's little point in using a prefetch instruction that close to the load. Prefetch instructions should be used to move data into the L1 cache, not hide the 3 cycle load delay. Thus a prefetch, if used, should be several cache lines ahead, not just a single word. Perhaps a better solution would be to read words until we get cacheline aligned, then read the entire line into 8 registers, prefetch the next line, then process each register one by one. Try this. Thanks for this patch I have tried it, and it does not have the original problem I have reported. Unfortunately it does not pass the glibc testsuite. I'll try to debug the problem later (I don't own an alpha machine, and need to have internet access to debug it remotely). -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Help on memchr() EGLIBC assembly code
On 07/26/2009 04:45 PM, Aurelien Jarno wrote: Knowing that $31 could be used for prefetch, I have modified the assembly code from memchr.S to use it. It passes all the testsuite. This isn't intended to be a prefetch instruction, it's meant to be fetching the data for the next word. I.e. we've unrolled the loop and there's at least 8 bytes left in the search. Note the # At least two quads remain to be accessed. comment. At that point we're supposed to be more than 16 bytes away from the end of the input buffer. Actually, the confusion I see is farther upthread: The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I didn't realize this when I wrote the function. The entire function should be rewritten, since there's little point in using a prefetch instruction that close to the load. Prefetch instructions should be used to move data into the L1 cache, not hide the 3 cycle load delay. Thus a prefetch, if used, should be several cache lines ahead, not just a single word. Perhaps a better solution would be to read words until we get cacheline aligned, then read the entire line into 8 registers, prefetch the next line, then process each register one by one. Try this. r~ typedef unsigned long word; #define ldq(X) (*(const word *)(X)) #define ldq_u(X) (*(const word *)((X) -8)) #define cmpbge __builtin_alpha_cmpbge #define extql __builtin_alpha_extql #define extqh __builtin_alpha_extqh #define insbl __builtin_alpha_insbl #define insqh __builtin_alpha_insqh #define zap __builtin_alpha_zap #define unlikely(X) __builtin_expect ((X), 0) #define prefetch(X) __builtin_prefetch ((void *)(X), 0) #define find(X, Y) cmpbge (0, (X) ^ (Y)) void *memchr (const void *xs, int xc, word n) { word s = (word)xs; word c; word current, found; word s_align; if (unlikely (n == 0)) return 0; current = ldq_u (s); /* Replicate low byte of C into all bytes. */ { word t = insbl (xc, 1); /* 00c0 */ c = (xc 0xff) | t; /* 00cc */ c = (c 16) | c; /* */ c = (c 32) | c; /* */ } if (unlikely (n 9)) goto only_quad; /* Align the source, and decrement the count by the number of bytes searched in the first word. */ s_align = s -8; n += (s 7); n -= 8; /* Deal with misalignment in the first word for the comparison. */ { word mask = insqh (-1, s); found = cmpbge (0, (current ^ c) | mask); if (found) goto found_it; } s_align += 8; /* If the block is sufficiently large, align to cacheline (minimum 64-bytes), prefetch the next line, and read entire cachelines at a time. */ if (unlikely (n = 256)) { prefetch (s_align + 64); while (s_align 63) { current = ldq (s_align); found = find (current, c); if (found) goto found_it; s_align += 8; n -= 8; } while (n 64) { word c0, c1, c2, c3, c4, c5, c6, c7; prefetch (s_align + 64); c0 = ldq (s_align + 0*8); c1 = ldq (s_align + 1*8); c2 = ldq (s_align + 2*8); c3 = ldq (s_align + 3*8); c4 = ldq (s_align + 4*8); c5 = ldq (s_align + 5*8); c6 = ldq (s_align + 6*8); c7 = ldq (s_align + 7*8); found = find (c0, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c1, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c2, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c3, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c4, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c5, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c6, c); if (unlikely (found)) goto found_it; s_align += 8; found = find (c7, c); if (unlikely (found)) goto found_it; s_align += 8; n -= 64; } } /* Quadword aligned loop. */ while (1) { current = ldq (s_align); if (n 8) goto last_quad; found = find (current, c); if (found) goto found_it; s_align += 8; n -= 8; } only_quad: { word end = zap (n, 0x80) - 1; word last = extqh (ldq_u (s + end), s); word first = extql (current, s); current = first | last; /* We've read one word and aligned it in the register. Thus the bit offset in current is relative to S. */ s_align = s; } last_quad: { word mask = (-1ul -n); found = find (current, c) mask; if (found == 0) return 0; } found_it: { word offset; #ifdef __alpha_cix__ offset = __builtin_alpha_cttz (found); #else /* Extract LSB. */ found = -found; /* Binary search for the LSB. */ offset = (found 0x0f ? 0 : 4); offset += (found 0x33 ? 0 : 2); offset += (found 0x55 ? 0
Re: Help on memchr() EGLIBC assembly code
On Mon, Jul 27, 2009 at 01:45:06AM +0200, Aurelien Jarno wrote: On Sun, Jul 19, 2009 at 04:29:33PM +0200, Aurelien Jarno wrote: On Wed, Jul 15, 2009 at 12:48:02PM -0700, Richard Henderson wrote: On 07/13/2009 03:16 PM, Matt Turner forwarded: The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. That's not supposed to matter -- faults from prefetch are supposed to be ignored; see do_page_fault: The problem is that the prefech is not done with $31, but using $1 and $3. It is called prefetch in the code, but it is more like read a value in advance. Knowing that $31 could be used for prefetch, I have modified the assembly code from memchr.S to use it. It passes all the testsuite. Comments are welcome. Then I'll do the alphaev6 version. Here is the alphaev6 version: --- a/sysdeps/alpha/alphaev6/memchr.S +++ b/sysdeps/alpha/alphaev6/memchr.S @@ -127,7 +127,7 @@ $first_quad: cmpbge $31, $1, $2# E : bne $2, $found_it # U : # At least one byte left to process. - ldq $1, 8($0) # L : + ldq $31, 8($0) # L : subq$5, 1, $18 # E : U L U L addq$0, 8, $0 # E : @@ -143,38 +143,38 @@ $first_quad: and $4, 8, $4 # E : odd number of quads? bne $4, $odd_quad_count # U : # At least three quads remain to be accessed - mov $1, $4 # E : L U L U : move prefetched value to correct reg + nop # E : L U L U : move prefetched value to correct reg .align 4 $unrolled_loop: - ldq $1, 8($0) # L : prefetch $1 - xor $17, $4, $2 # E : - cmpbge $31, $2, $2 # E : - bne $2, $found_it # U : U L U L + ldq $1, 0($0) # L : load quad + xor $17, $1, $2 # E : + ldq $31, 8($0) # L : prefetch next quad + cmpbge $31, $2, $2 # E : U L U L + bne $2, $found_it # U : addq$0, 8, $0 # E : nop # E : nop # E : - nop # E : $odd_quad_count: + ldq $1, 0($0) # L : load quad xor $17, $1, $2 # E : - ldq $4, 8($0) # L : prefetch $4 + ldq $31, 8($0) # L : prefetch $4 cmpbge $31, $2, $2 # E : - addq$0, 8, $6 # E : + addq$0, 8, $6 # E : bne $2, $found_it # U : cmpult $6, $18, $6 # E : addq$0, 8, $0 # E : - nop # E : bne $6, $unrolled_loop # U : - mov $4, $1 # E : move prefetched value into $1 nop # E : nop # E : - -$final:subq$5, $0, $18 # E : $18 - number of bytes left to do nop # E : + +$final:ldq $1, 0($0) # L : load last quad + subq$5, $0, $18 # E : $18 - number of bytes left to do nop # E : bne $18, $last_quad # U : -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Help on memchr() EGLIBC assembly code
On Sun, Jul 19, 2009 at 04:29:33PM +0200, Aurelien Jarno wrote: On Wed, Jul 15, 2009 at 12:48:02PM -0700, Richard Henderson wrote: On 07/13/2009 03:16 PM, Matt Turner forwarded: The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. That's not supposed to matter -- faults from prefetch are supposed to be ignored; see do_page_fault: The problem is that the prefech is not done with $31, but using $1 and $3. It is called prefetch in the code, but it is more like read a value in advance. Knowing that $31 could be used for prefetch, I have modified the assembly code from memchr.S to use it. It passes all the testsuite. Comments are welcome. Then I'll do the alphaev6 version. diff --git a/sysdeps/alpha/memchr.S b/sysdeps/alpha/memchr.S index 5d713d5..87c7fb1 100644 --- a/sysdeps/alpha/memchr.S +++ b/sysdeps/alpha/memchr.S @@ -119,7 +119,7 @@ $first_quad: # At least one byte left to process. - ldq t0, 8(v0) # e0: + ldq zero, 8(v0) # e0: prefetch next quad subqt4, 1, a2 # .. e1 : addqv0, 8, v0 #-e0: @@ -138,19 +138,19 @@ $first_quad: # At least three quads remain to be accessed - mov t0, t3 # e0: move prefetched value to correct reg - .align 4 $unrolled_loop: - ldq t0, 8(v0) #-e0: prefetch t0 - xor a1, t3, t1 # .. e1 : - cmpbge zero, t1, t1# e0: - bne t1, $found_it # .. e1 : + ldq t0, 0(v0) # e0: load quad + xor a1, t0, t1 # .. e1 : + ldq zero, 8(v0) # e0: prefetch next quad + cmpbge zero, t1, t1# .. e1: + bne t1, $found_it # e0: - addqv0, 8, v0 #-e0: + addqv0, 8, v0 #e1 : $odd_quad_count: + ldq t0, 0(v0) # e0: load quad xor a1, t0, t1 # .. e1 : - ldq t3, 8(v0) # e0: prefetch t3 + ldq zero, 8(v0) # e0: prefetch next quad cmpbge zero, t1, t1# .. e1 : addqv0, 8, t5 #-e0: bne t1, $found_it # .. e1 : @@ -159,8 +159,8 @@ $odd_quad_count: addqv0, 8, v0 # .. e1 : bne t5, $unrolled_loop #-e1 : - mov t3, t0 # e0: move prefetched value into t0 -$final:subqt4, v0, a2 # .. e1 : a2 - number of bytes left to do +$final:ldq t0, 0(v0) # e0: load last quad + subqt4, v0, a2 # .. e1 : a2 - number of bytes left to do bne a2, $last_quad # e1: $not_found: -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Help on memchr() EGLIBC assembly code
On 07/13/2009 03:16 PM, Matt Turner forwarded: The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. That's not supposed to matter -- faults from prefetch are supposed to be ignored; see do_page_fault: /* As of EV6, a load into $31/$f31 is a prefetch, and never faults (or is suppressed by the PALcode). Support that for older CPUs by ignoring such an instruction. */ if (cause == 0) { unsigned int insn; __get_user(insn, (unsigned int __user *)regs-pc); if ((insn 21 0x1f) == 0x1f /* ldq ldl ldt lds ldg ldf ldwu ldbu */ (1ul (insn 26) 0x30f1400ul)) { regs-pc += 4; return; } } Can you figure out why that kernel code isn't working? I no longer have working alpha hw... r~ -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Help on memchr() EGLIBC assembly code
Hi all, With a lot of patches (E)GLIBC 2.10 builds on alpha, but it fails on the testsuite for the memchr() function, which is an optimized assembly code on alpha. Unfortunately I don't speak alpha assembly very well, so help is needed. The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I have built a small testcase (see file attached), which contains the code to trigger the bug and the assembly code of the memchr() function, copied from EGLIBC. It would be nice if someone can fix the assembly code so that the prefetching does not create memory faults. Thanks in advance. Cheers, Aurelien -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net test-memchr.tar.gz Description: Binary data
Re: Help on memchr() EGLIBC assembly code
On Mon, Jul 13, 2009 at 1:31 PM, Aurelien Jarnoaurel...@aurel32.net wrote: With a lot of patches (E)GLIBC 2.10 builds on alpha, but it fails on the testsuite for the memchr() function, which is an optimized assembly code on alpha. Unfortunately I don't speak alpha assembly very well, so help is needed. The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I have built a small testcase (see file attached), which contains the code to trigger the bug and the assembly code of the memchr() function, copied from EGLIBC. It would be nice if someone can fix the assembly code so that the prefetching does not create memory faults. Thanks in advance. If you remove: ./sysdeps/alpha/alphaev6/memchr.S ./sysdeps/alpha/memchr.S and allow the build to fallback on string/memchr.c do the tests pass? You can always add memchr.S routines back in a later release if you need an immediate workaround. Cheers, Carlos. -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Help on memchr() EGLIBC assembly code
On Mon, Jul 13, 2009 at 02:24:00PM -0400, Carlos O'Donell wrote: On Mon, Jul 13, 2009 at 1:31 PM, Aurelien Jarnoaurel...@aurel32.net wrote: With a lot of patches (E)GLIBC 2.10 builds on alpha, but it fails on the testsuite for the memchr() function, which is an optimized assembly code on alpha. Unfortunately I don't speak alpha assembly very well, so help is needed. The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I have built a small testcase (see file attached), which contains the code to trigger the bug and the assembly code of the memchr() function, copied from EGLIBC. It would be nice if someone can fix the assembly code so that the prefetching does not create memory faults. Thanks in advance. If you remove: ./sysdeps/alpha/alphaev6/memchr.S ./sysdeps/alpha/memchr.S and allow the build to fallback on string/memchr.c do the tests pass? You can always add memchr.S routines back in a later release if you need an immediate workaround. Yeah, it works, and that's actually my plan if I get no answer. But I would really like to see this fixed now, as otherwise, it will most probably stay like that indefinitely. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Help on memchr() EGLIBC assembly code
On Mon, Jul 13, 2009 at 6:17 PM, Aurelien Jarnoaurel...@aurel32.net wrote: On Mon, Jul 13, 2009 at 02:24:00PM -0400, Carlos O'Donell wrote: On Mon, Jul 13, 2009 at 1:31 PM, Aurelien Jarnoaurel...@aurel32.net wrote: With a lot of patches (E)GLIBC 2.10 builds on alpha, but it fails on the testsuite for the memchr() function, which is an optimized assembly code on alpha. Unfortunately I don't speak alpha assembly very well, so help is needed. The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I have built a small testcase (see file attached), which contains the code to trigger the bug and the assembly code of the memchr() function, copied from EGLIBC. It would be nice if someone can fix the assembly code so that the prefetching does not create memory faults. Thanks in advance. If you remove: ./sysdeps/alpha/alphaev6/memchr.S ./sysdeps/alpha/memchr.S and allow the build to fallback on string/memchr.c do the tests pass? You can always add memchr.S routines back in a later release if you need an immediate workaround. Yeah, it works, and that's actually my plan if I get no answer. But I would really like to see this fixed now, as otherwise, it will most probably stay like that indefinitely. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net I'm copying Richard Henderson and Ivan Kokshayshy on this, as they are without a doubt the most knowledgeable people about this sort of thing. As an aside, please make sure these two bugs are fixed in eglibc (I don't know if they exist in eglibc, but I do realize that you filed them against glibc yourself -- so this is just a reminder). http://sources.redhat.com/bugzilla/show_bug.cgi?id=5350 http://sources.redhat.com/bugzilla/show_bug.cgi?id=5400 Matt -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Re: Help on memchr() EGLIBC assembly code
On Mon, Jul 13, 2009 at 07:16:16PM -0300, Matt Turner wrote: On Mon, Jul 13, 2009 at 6:17 PM, Aurelien Jarnoaurel...@aurel32.net wrote: On Mon, Jul 13, 2009 at 02:24:00PM -0400, Carlos O'Donell wrote: On Mon, Jul 13, 2009 at 1:31 PM, Aurelien Jarnoaurel...@aurel32.net wrote: With a lot of patches (E)GLIBC 2.10 builds on alpha, but it fails on the testsuite for the memchr() function, which is an optimized assembly code on alpha. Unfortunately I don't speak alpha assembly very well, so help is needed. The problem is that the memchr() function on alpha uses prefetch, which can cause a page boundary to be crossed, while the standards (POSIX and C99) says it should stop when a match is found. I have built a small testcase (see file attached), which contains the code to trigger the bug and the assembly code of the memchr() function, copied from EGLIBC. It would be nice if someone can fix the assembly code so that the prefetching does not create memory faults. Thanks in advance. If you remove: ./sysdeps/alpha/alphaev6/memchr.S ./sysdeps/alpha/memchr.S and allow the build to fallback on string/memchr.c do the tests pass? You can always add memchr.S routines back in a later release if you need an immediate workaround. Yeah, it works, and that's actually my plan if I get no answer. But I would really like to see this fixed now, as otherwise, it will most probably stay like that indefinitely. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net I'm copying Richard Henderson and Ivan Kokshayshy on this, as they are without a doubt the most knowledgeable people about this sort of thing. Thanks. As an aside, please make sure these two bugs are fixed in eglibc (I don't know if they exist in eglibc, but I do realize that you filed them against glibc yourself -- so this is just a reminder). http://sources.redhat.com/bugzilla/show_bug.cgi?id=5350 http://sources.redhat.com/bugzilla/show_bug.cgi?id=5400 I have already put a first set of patches needed for alpha in a git repository [1], and sent a pull request on the libc-ports mailing list. I hope it will work better than filling bug reports. If it also fails, I'll send them to EGLIBC. [1] http://git.aurel32.net/?p=glibc-ports.git;a=shortlog;h=refs/heads/alpha-for-upstream -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net -- To UNSUBSCRIBE, email to debian-alpha-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org