Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi Ivan,

Thank you for sharing your insights!

> > It would mean that Linux/e2k can hardly conform to
> > POSIX well, as Bruno said, because POSIX requires different signals for
> > different cases and incompatibilities can't be forgiven on the reason of
> > speculative computations in the CPU.

> The compiler would know how to
> replay the faulty speculative computation, so it would be able to
> generate code to do this non-speculatively and trigger the real fault.

Yes, you need to think about the kernel and the compiler together. As I
understand it, the general approach in such cases is to:

  1) See in the hardware manual whether there is a way to retrieve the
     exception details (exception code, and memory address in case of a
     memory access) from the speculative execution. If so, use it in the
     kernel, in linux//mm/fault.c.

     If not:

  2) Implement a proposed solution in the compiler that results in
     discarding the speculative execution results when there was an
     exception during speculative execution.

  3) Implement another proposed solution in the compiler that completely
     disables speculative execution for instructions that may produce
     exceptions (and leave it enabled only for guaranteed exception-free
     instructions, such as integer arithmetic instructions).
     [It is not unheard of that processor features get completely disabled.
     For example, OpenBSD/x86_64 disables hyperthreading, which many people
     previously thought to be a valuable processor feature.]

  4) Benchmark the performance impact of 2) and 3) on programs. Choose the
     one with less impact.

  5) If the impact is high, then invent a compiler option that allows the
     application developer to choose between POSIX-compliant code and fast
     code.
     [This is the approach used e.g. for floating-point instructions on
     alpha in GCC: The instructions provided by the hardware are not
     IEEE 854 compliant, and the workaround that GCC adds to make them
     IEEE 854 compliant is so much of a performance hit that it is only
     enabled through a compiler option.]

Bruno
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Here is a follow-up to the story, for those curious about what happens on
the similar IA64 architecture. And this should be it. As for the problem on
E2K itself, we should discuss it with MCST and/or investigate whether the
missing information about the faults can be recovered to better satisfy
POSIX.

On Sat, 29 Dec 2018, Ivan Zakharyaschev wrote:

> > > As for the SIGILL peculiarity, it has a reason in the Elbrus
> > > architecture.

> I've studied the assembler code and found the other true
> reason in this specific case: these are faults "hidden" in an explicitly
> "speculative" computation which ultimately result in SIGILL. (The E2K ISA
> is reminiscent of IA64; this can help get the idea.) The specific kind of
> the fault is "forgotten", unfortunately.

> Besides, in many aspects including the newly mentioned by me explicitly
> speculative instructions, E2K resembles IA64.
>
> And it'd be interesting to have a look how they treat faults coming from
> speculative computations in Linux/ia64 to get an idea whether it can be
> done in a manner with better conformance to POSIX.

> * * *
>
> BTW, saving and forgetting the type of the original fault doesn't seem

I meant "not forgetting".

> to be something expensive to implement (after some thought): when a
> register is marked as invalid, it shouldn't matter anymore what value
> it holds. So, the same register can be used to save the information
> about the type of the fault.

As Dmitry Levin pointed out, probably not, because there can be too much
information (the fault, and the associated address) for a single register.

> * * *
>
> I wanted to see how Linux/ia64 handles these complications arising
> from speculative computations possibly causing a fault; and powered on
> such a machine, and had a look at the above examples with SIGILL on
> E2K: the third one, and the fifth one (speculative division by zero).
>
> The third example from above:
>
> imz@rx2620:~/test-speculative-SIGSEGV$ cc -Wall -O3 -xc - -S -o c.s && cat c.s
> int main(int argc, char ** argv) {
>   if (0 < argc)
>     ++*(char*)0xbad;
>   return 0xbeef;
> }
>         .file ""
>         .pred.safe_across_calls p1-p5,p16-p63
>         .section .text.startup,"ax",@progbits
>         .align 16
>         .align 64
>         .global main#
>         .type main#, @function
>         .proc main#
> main:
>         .prologue
>         .body
>         .mmi
>         cmp4.ge p6, p7 = 0, r32
>         addl r14 = 2989, r0
>         addl r8 = 48879, r0
>         ;;
>         .mmi
>         (p7) ld1 r15 = [r14]
>         ;;
>         (p7) adds r15 = 1, r15
>         nop 0
>         ;;
>         .mib
>         (p7) st1 [r14] = r15
>         nop 0
>         br.ret.sptk.many b0
>         .endp main#
>         .ident "GCC: (Debian 4.6.3-14) 4.6.3"
>         .section .note.GNU-stack,"",@progbits
> imz@rx2620:~/test-speculative-SIGSEGV$ cc -Wall -O3 c.s && ./a.out; echo $?
> Segmentation fault
> 139
>
> Notes on the assembler: the possible groupings into VLIWs are
> separated by double semicolons (";;"). Predicated execution of
> instructions is marked by a prefix with the corresponding predicate
> register in parentheses, like "(p7)" in the code above:
>
>         .mmi
>         (p7) ld1 r15 = [r14]
>         ;;
>         (p7) adds r15 = 1, r15
>         nop 0
>         ;;
>         .mib
>         (p7) st1 [r14] = r15
>
> These are the "load", "add", and "store" instructions corresponding to:
>
>   ++*(char*)0xbad
>
> All this shows that gcc-4.6 on IA-64 doesn't generate speculative
> computations for the same examples that had speculative computations
> on E2K. Unfortunately, this means that we couldn't compare the
> interesting bits of the behavior between Linux/e2k and Linux/ia64
> quickly. Perhaps, editing the IA64 assembler code can give a desired
> example.

Cool! Linux/ia64 also produces SIGILL in the same situation; it seems to
have no magic. (But there is a second part of the story!)

imz@rx2620:~/test-speculative-SIGSEGV$ diff c.s c_s.s
18c18
<       (p7) ld1 r15 = [r14]
---
>       (p7) ld1.s r15 = [r14]
imz@rx2620:~/test-speculative-SIGSEGV$ cc c_s.s && ./a.out; echo $?
Illegal instruction
132

"ld1.s" is the "load 1 byte" instruction with the "speculative" flag.

If we do not use the "invalid" register in a "store" instruction, then
there is no fault:

imz@rx2620:~/test-speculative-SIGSEGV$ diff c_s.s c_nost.s
24,25d23
<       (p7) st1 [r14] = r15
<       nop 0
imz@rx2620:~/test-speculative-SIGSEGV$ cc c_nost.s && ./a.out; echo $?
239

And the second part: The problem has a solution on IA64.

The compiler would know how to replay the faulty speculative computation,
so it would be able to generate code to do this non-speculatively and
trigger the real fault.

And there is an instruction that checks whether a register is "valid"[1]
and helps to jump to the recovery code[2]: "chk.s".

I've implemented this approach manually in c_chk.s like this (but I have
not seen what a compiler would do actually; IA64 has other flavors of speculative instruct
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
On Sat, Dec 29, 2018 at 06:03:42PM +0300, Ivan Zakharyaschev wrote:
[...]
> BTW, saving and forgetting the type of the original fault doesn't seem
> to be something expensive to implement (after some thought): when a
> register is marked as invalid, it shouldn't matter anymore what value
> it holds. So, the same register can be used to save the information
> about the type of the fault.

Note that SIGILL, SIGFPE, SIGSEGV, and SIGBUS come with si_addr
specifying the memory location which caused the fault. When a memory fault
is transformed into an illegal operand fault, the location of the original
memory fault is likely lost, too - you can easily check this hypothesis
by installing a signal handler: if si_addr is not 0xbad from your example,
then it's been lost.

-- 
ldv
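
The check suggested above can be sketched roughly as follows. This is only
a minimal illustration, not code from gnulib or from anyone in this thread:
the names probe_fault and probe_handler are made up for the example. The
handler is installed with SA_SIGINFO for SIGSEGV, SIGBUS and SIGILL, the
byte at the given address is incremented (as in ++*(char*)0xbad), and the
signal number and si_addr seen by the handler are handed back to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* Filled in by the handler; read after siglongjmp returns control. */
static sigjmp_buf probe_env;
static volatile sig_atomic_t probe_signo;
static void *volatile probe_addr;

/* Hypothetical handler: remember what the kernel reported, then bail out. */
static void probe_handler(int signo, siginfo_t *info, void *ctx)
{
    (void)ctx;
    probe_signo = signo;
    probe_addr = info->si_addr;   /* address the kernel reports for the fault */
    siglongjmp(probe_env, 1);
}

/* Increment the byte at ADDR and report which signal (if any) arrived.
   Returns 1 if a signal was caught, 0 if the access succeeded. */
int probe_fault(void *addr, int *signo, void **fault_addr)
{
    static const int sigs[] = { SIGSEGV, SIGBUS, SIGILL };
    struct sigaction sa, old[3];
    int caught;
    size_t i;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = probe_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    for (i = 0; i < 3; i++)
        sigaction(sigs[i], &sa, &old[i]);

    if (sigsetjmp(probe_env, 1) == 0) {
        ++*(volatile char *)addr;      /* read, add, write back */
        caught = 0;
    } else {
        caught = 1;
        *signo = probe_signo;
        *fault_addr = probe_addr;
    }

    /* Restore whatever handlers were installed before the probe. */
    for (i = 0; i < 3; i++)
        sigaction(sigs[i], &old[i], NULL);
    return caught;
}
```

Calling probe_fault((void *)0xbad, &signo, &addr) and printing the two
results would show directly whether the platform delivers SIGSEGV/SIGBUS
with the original address, or something else with the address lost.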
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi,

On Sat, 29 Dec 2018, Dmitry V. Levin wrote:

> On Fri, Dec 28, 2018 at 05:23:09PM +0300, Ivan Zakharyaschev wrote:
> > As for the SIGILL peculiarity, it has a reason in the Elbrus architecture.

> No, this particular case (++*argv[argc]) has nothing to do with tagged memory,
> I hope Ivan will share his findings here.

I've done it. Thanks for your hints regarding a test for another kind of
fault (SIGFPE) happening speculatively, and regarding the hexadecimal
values which are easy to detect visually (0xbad etc.)!

-- 
Best regards,
Ivan
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi Bruno,

On Sat, 29 Dec 2018, Bruno Haible wrote:

> > "system in development" is the one which suits
> > Linux/E2k better. The port to E2K (MCST Elbrus general purpose hardware
> > architecture) is quite mature, but not yet released publicly.
>
> Thanks for the info. Based on it, I found a couple of other pointers as well:
> [1][2].

> [1]
> https://linux.slashdot.org/story/99/03/31/2324218/linus-will-move-to-moscow-to-work-with-elbrus

[1] is fun. :)

> > As for the SIGILL peculiarity, it has a reason in the Elbrus architecture.
> > ...
> > And it's not a segmentation fault.

Meanwhile, I have found out that my explanations about it being the
consequence of tagged memory (at least, in this specific case of
test-c-stack.c) were largely incorrect. I'm sorry for that misleading
information.

I've studied the assembler code and found the other true
reason in this specific case: these are faults "hidden" in an explicitly
"speculative" computation which ultimately result in SIGILL. (The E2K ISA
is reminiscent of IA64; this can help get the idea.) The specific kind of
the fault is "forgotten", unfortunately.

Bruno, this discovery makes your claims even stronger and more relevant:
all programs normally expect this kind of fault to be SIGSEGV, and they
cannot be expected to care whether the computation was done speculatively
or not (i.e., with immediate effects).

> I believe you should make it signal a SIGSEGV or SIGBUS, not SIGILL, for
> the following reasons:
>
> * Look at the second table in
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html.
> It defines a couple of signal codes for SIGILL, SIGSEGV, and SIGBUS.
> It implies that SIGILL means an invalid instruction (and "illegal operand"
> means an invalid operand that is in the instruction stream).
> Whereas SIGSEGV and SIGBUS mean a problem with an instruction in combination
> with a memory address.

Thanks for the explanation concerning "illegal operand"!
This rebuttal was more relevant to my first, imagined explanation than to
the actual one, but it is important to know anyway.

> * The main users of SIGSEGV and SIGBUS are catching stack overflow, garbage
> collection, and similar (e.g. by use of GNU libsigsegv). The fact that
> you observe an incompatibility between your Linux adaptation and
> application programs that work fine across Linux/BSD/AIX/Solaris is a sure
> indication that you will encounter similar incompatibilities along the
> lines, until you fix that port, to produce SIGSEGV or SIGBUS instead of
> SIGILL.

That's what I'm feeling now, too. There only remains a question concerning
the hardware: whether it can save the type of the fault that happened in a
speculative computation to give it back when the result of the speculative
computation is actually needed.

> This reminds the segmented architectures, such as the ones used by AIX
> and Linux/ia64. In these OSes, SIGSEGV is produced when a memory address
> is used that does not fit with the instruction.

Thanks for the information about similar conditions (to what I wrote about
tagged memory) in other OSes!

Besides, in many aspects including the newly mentioned by me explicitly
speculative instructions, E2K resembles IA64.

And it'd be interesting to have a look how they treat faults coming from
speculative computations in Linux/ia64 to get an idea whether it can be
done in a manner with better conformance to POSIX.

* * *

Here are the actual facts about what happens on E2k (and a little bit on
IA64) with a set of minimal contrasting examples:

Here are four example programs; the first two write to the memory, the
latter two first read from the memory. (There is an amazing difference
between the last two examples.) Probably, the demonstrated contrasts do not
cover all conditions under which SIGILL can occur.

$ cc -Wall -xc - && ./a.out; echo $?
int main(int argc, char ** argv) {
  *(char*)0 = 175;
  return 0;
}
Segmentation fault
139
$ cc -Wall -xc - && ./a.out; echo $?
int main(int argc, char ** argv) {
  if (0 < argc)
    *(char*)0 = 175;
  return 0;
}
Segmentation fault
139
$ cc -Wall -xc - && ./a.out; echo $?
int main(int argc, char ** argv) {
  if (0 < argc)
    ++*(char*)0;
  return 0;
}
Illegal instruction
132
$ cc -Wall -xc - && ./a.out; echo $?
int main(int argc, char ** argv) {
  ++*(char*)0;
  return 0;
}
Segmentation fault
139
$ cc --version
lcc:1.23.12:Aug--6-2018:e2k-v4-linux
gcc (GCC) 5.5.0 compatible
$

This leads to a suspicion that not only the direction of the memory access
matters (read or write), but also the speculative execution of the memory
access instruction (in the third example) -- for the sake of optimization,
something is done before the actual value of the condition is computed.
(Otherwise, without a speculative computation, it's unclear how a redundant
condition can affect anything.)

The speculative instructions are written explicitly in E2K ISA (and this is also li
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
I wrote: > I believe you should make it signal a SIGSEGV or SIGBUS, not SIGILL, for > the following reasons: A third reason is that the application will want to react depending on the memory address which produced the fault. (I mean the memory address of the data, not of the instruction.) This memory address is available as si_addr in the siginfo struct only for SIGSEGV and SIGBUS, see http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html Bruno
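
To illustrate why the data address matters to an application, here is a
small sketch of the kind of decision a handler might make with si_addr.
It is only an assumption about how such a check could look, not the actual
logic of GNU libsigsegv or gnulib's c-stack; the names fault_kind and
classify_fault, and the idea of a single known guard region, are made up
for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a memory fault by its data address. */
enum fault_kind { FAULT_STACK_OVERFLOW, FAULT_OTHER };

/* Decide whether FAULT_ADDR (as reported in si_addr) lies inside the
   guard region [GUARD_BASE, GUARD_BASE + GUARD_LEN) below the stack.
   A hit is treated as stack overflow, anything else as a program error. */
enum fault_kind classify_fault(const void *fault_addr,
                               const void *guard_base, size_t guard_len)
{
    uintptr_t a = (uintptr_t)fault_addr;
    uintptr_t lo = (uintptr_t)guard_base;

    if (a >= lo && a - lo < guard_len)
        return FAULT_STACK_OVERFLOW;
    return FAULT_OTHER;
}
```

If the signal arrives as SIGILL with the data address missing, a check
like this has nothing to work with, which is the practical consequence of
the third reason.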
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Andrey Savchenko wrote: > This is not possible. Four generations of hardware are already > manufactured and they use SIGILL for such cases. It may be fixed in > future generations if CPU designers will agree to do so The mapping from hardware exception code to Unix signal number is done in software, not in hardware. For an example, look in linux-4.20/arch/sparc/mm/fault_32.c. Bruno
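
The point that this mapping lives in software can be pictured with a toy
table like the one below. Everything in it is hypothetical: the hw_trap
codes are invented for the example and do not correspond to E2K, SPARC or
any real kernel source, and the last entry assumes that the trap still
identifies the original fault as a memory fault, which this thread has not
established for E2K.

```c
#include <signal.h>

/* Hypothetical trap codes; real hardware and kernels define their own. */
enum hw_trap {
    HW_TRAP_BAD_OPCODE,       /* undecodable instruction */
    HW_TRAP_BAD_ADDRESS,      /* access to an unmapped address */
    HW_TRAP_MISALIGNED,       /* misaligned memory access */
    HW_TRAP_DIV_BY_ZERO,      /* integer division by zero */
    HW_TRAP_SPEC_BAD_ADDRESS  /* result of a faulty speculative load was used */
};

/* The choice of signal is an ordinary software decision made per trap. */
int trap_to_signal(enum hw_trap trap)
{
    switch (trap) {
    case HW_TRAP_BAD_OPCODE:       return SIGILL;
    case HW_TRAP_BAD_ADDRESS:      return SIGSEGV;
    case HW_TRAP_MISALIGNED:       return SIGBUS;
    case HW_TRAP_DIV_BY_ZERO:      return SIGFPE;
    case HW_TRAP_SPEC_BAD_ADDRESS: return SIGSEGV;  /* assumption, see above */
    }
    return SIGILL;  /* unknown trap code */
}
```

Changing which signal a given trap produces means editing one line of such
a table in the kernel, not changing the silicon.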
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
On Sat, Dec 29, 2018 at 02:31:11PM +0300, Andrey Savchenko wrote:
> On Sat, 29 Dec 2018 12:17:32 +0100 Bruno Haible wrote:
> > > As for the SIGILL peculiarity, it has a reason in the Elbrus
> > > architecture.
> > > ...
> > > And it's not a segmentation fault.
> >
> > I believe you should make it signal a SIGSEGV or SIGBUS, not SIGILL, for
> > the following reasons:
> >
> > * Look at the second table in
> > http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html.
> > It defines a couple of signal codes for SIGILL, SIGSEGV, and SIGBUS.
> > It implies that SIGILL means an invalid instruction (and "illegal operand"
> > means an invalid operand that is in the instruction stream).
> > Whereas SIGSEGV and SIGBUS mean a problem with an instruction in
> > combination with a memory address.
> >
> > * The main users of SIGSEGV and SIGBUS are catching stack overflow, garbage
> > collection, and similar (e.g. by use of GNU libsigsegv). The fact that
> > you observe an incompatibility between your Linux adaptation and
> > application programs that work fine across Linux/BSD/AIX/Solaris is a sure
> > indication that you will encounter similar incompatibilities along the
> > lines, until you fix that port, to produce SIGSEGV or SIGBUS instead of
> > SIGILL.
>
> This is not possible. Four generations of hardware are already
> manufactured and they use SIGILL for such cases. It may be fixed in
> future generations if CPU designers will agree to do so, but we
> have to deal with already produced and used in production hardware.

It's all up to the kernel what signal to generate in response to that
particular non-SIGSEGV kind of trap.

I agree with Bruno here: as long as the code in question causes SIGILL,
the architecture is not compatible, and its users will suffer more because
of this unneeded incompatibility.

-- 
ldv
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
On Fri, Dec 28, 2018 at 05:23:09PM +0300, Ivan Zakharyaschev wrote:
> Hi Bruno,
>
> On Thu, 20 Dec 2018, Bruno Haible wrote:
>
> > > +# E2K (elbrus) systems send SIGILL on an access to an invalid
> > > address.
> >
> > This is a bug in the system. Access of an invalid address ought to produce a
> > SIGSEGV or SIGBUS.
> >
> > 'elbrus' is not an important OS so far, for which it would be worth adding
> > workarounds in the gnulib source.
> > Is it still in development? -> If so, please fix that bug.
> > Or is it a museum system? -> If so, just bear with the test failure.
>
> Of these descriptions, "system in development" is the one which suits
> Linux/E2k better. The port to E2K (MCST Elbrus general purpose hardware
> architecture) is quite mature, but not yet released publicly.
>
> As for the SIGILL peculiarity, it has a reason in the Elbrus architecture.
> AFAIU, a different protection mechanism comes into play here. It is based
> on tagging values/memory: if an attempt is made to use a value in a way
> which contradicts its tag, then the "illegal operand" condition arises.
> Namely, a "load" instruction can expect a certain tag, and then there can
> be a mismatch between the assumptions of the code and the actual value
> and its tag.

No, this particular case (++*argv[argc]) has nothing to do with tagged memory,
I hope Ivan will share his findings here.

-- 
ldv
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi all!

On Sat, 29 Dec 2018 12:17:32 +0100 Bruno Haible wrote:
> > As for the SIGILL peculiarity, it has a reason in the Elbrus architecture.
> > ...
> > And it's not a segmentation fault.
>
> I believe you should make it signal a SIGSEGV or SIGBUS, not SIGILL, for
> the following reasons:
>
> * Look at the second table in
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html.
> It defines a couple of signal codes for SIGILL, SIGSEGV, and SIGBUS.
> It implies that SIGILL means an invalid instruction (and "illegal operand"
> means an invalid operand that is in the instruction stream).
> Whereas SIGSEGV and SIGBUS mean a problem with an instruction in combination
> with a memory address.
>
> * The main users of SIGSEGV and SIGBUS are catching stack overflow, garbage
> collection, and similar (e.g. by use of GNU libsigsegv). The fact that
> you observe an incompatibility between your Linux adaptation and
> application programs that work fine across Linux/BSD/AIX/Solaris is a sure
> indication that you will encounter similar incompatibilities along the
> lines, until you fix that port, to produce SIGSEGV or SIGBUS instead of
> SIGILL.

This is not possible. Four generations of hardware have already been
manufactured, and they use SIGILL for such cases. It may be fixed in
future generations if the CPU designers agree to do so, but we have to
deal with hardware that has already been produced and is used in
production.

Best regards,
Andrew Savchenko
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi Ivan, > "system in development" is the one which suits > Linux/E2k better. The port to E2K (MCST Elbrus general purpose hardware > architecture) is quite mature, but not yet released publicly. Thanks for the info. Based on it, I found a couple of other pointers as well: [1][2]. > As for the SIGILL peculiarity, it has a reason in the Elbrus architecture. > ... > And it's not a segmentation fault. I believe you should make it signal a SIGSEGV or SIGBUS, not SIGILL, for the following reasons: * Look at the second table in http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html. It defines a couple of signal codes for SIGILL, SIGSEGV, and SIGBUS. It implies that SIGILL means an invalid instruction (and "illegal operand" means an invalid operand that is in the instruction stream). Whereas SIGSEGV and SIGBUS mean a problem with an instruction in combination with a memory address. * The main users of SIGSEGV and SIGBUS are catching stack overflow, garbage collection, and similar (e.g. by use of GNU libsigsegv). The fact that you observe an incompatibility between your Linux adaptation and application programs that work fine across Linux/BSD/AIX/Solaris is a sure indication that you will encounter similar incompatibilities along the lines, until you fix that port, to produce SIGSEGV or SIGBUS instead of SIGILL. > But wait, while writing this explanation, I seem to have come to see a way > how the code in test-c-stack.c: > > ++*argv[argc]; /* Intentionally dereference NULL. */ > > could be rewritten to cause the intended SIGSEGV and not SIGILL like now: If you get SIGSEGV in one case (write to the memory location), you should also get SIGSEGV in the other case (read from the memory location). > AFAIU, a different protection mechanism comes into play here. It is based > on tagging values/memory: if an attempt is made to use a value in a way > which contradicts its tag, then the "illegal operand" condition arises. 
This reminds the segmented architectures, such as the ones used by AIX and Linux/ia64. In these OSes, SIGSEGV is produced when a memory address is used that does not fit with the instruction. Bruno [1] https://linux.slashdot.org/story/99/03/31/2324218/linus-will-move-to-moscow-to-work-with-elbrus [2] http://elbrus2k.wikidot.com/elbrus-operating-system
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi Bruno, On Thu, 20 Dec 2018, Bruno Haible wrote: > > + # E2K (elbrus) systems send SIGILL on an access to an invalid > > address. > > This is a bug in the system. Access of an invalid address ought to produce a > SIGSEGV or SIGBUS. > > 'elbrus' is not an important OS so far, for which it would be worth adding > workarounds in the gnulib source. > Is it still in development? -> If so, please fix that bug. > Or is it a museum system? -> If so, just bear with the test failure. Of these descriptions, "system in development" is the one which suits Linux/E2k better. The port to E2K (MCST Elbrus general purpose hardware architecture) is quite mature, but not yet released publicly. As for the SIGILL peculiarity, it has a reason in the Elbrus architecture. AFAIU, a different protection mechanism comes into play here. It is based on tagging values/memory: if an attempt is made to use a value in a way which contradicts its tag, then the "illegal operand" condition arises. Namely, a "load" instruction can expect a certain tag, and then there can be a mismatch between the assumptions of the code and the actual value and its tag. And it's not a segmentation fault. (This must be just a simple case of the use of tagging in this architecture, whereas--AFAIK--MCST has been developing some smarter protection modes to make use of tags to track the array bounds along with pointers and for other things. The smarter modes are probably not enabled by default in the compiler. Now, I could google up a 2018 report on such recent work by searching for "elbrus" "e2k" "SIGILL", in Russian.) But wait, while writing this explanation, I seem to have come to see a way how the code in test-c-stack.c: ++*argv[argc]; /* Intentionally dereference NULL. */ could be rewritten to cause the intended SIGSEGV and not SIGILL like now: $ ./test-c-stack 1; echo $? Illegal instruction 132 $ The tags that are seen and checked by a "load" instruction must have been stored before. 
So, if we now think about storing values to memory, we see that when
storing a value, one is not checking the tag, but rather writing it
initially. So (at least in the simple protection mode), there can be no
SIGILL when writing.

And I've tested running test-c-stack with this code instead:

      *argv[argc] = 175; /* Intentionally dereference NULL. */

and it indeed causes a SIGSEGV:

$ ./test-c-stack 1; echo $?
test-c-stack: stack overflow
77
$

and with libsigsegv:

$ ./test-c-stack 1; echo $?
test-c-stack: program error
Aborted
134
$ ./test-c-stack2.sh; echo $?
0
$

So, now I suggest a patch that replaces the reading-and-then-writing of a
value at this place with just writing a value. (A complete patch is
attached.) This way we don't need a workaround in the test for the
Linux/E2K platform, and the test shouldn't have got worse.

It is possible to follow the "first-writing" part with a "then-reading"
part, but this doesn't seem to be essential. At least, on E2K and probably
most other architectures it would never come to it. (But that way the new
code would be closer to the old code in the involved operations, and who
knows, there might be some architecture where one needs to read to cause a
fault.)

-- 
Best regards,
Ivan

From 057259bd81fbb60233df00d0a2846304088e1d47 Mon Sep 17 00:00:00 2001
From: Ivan Zakharyaschev
Date: Fri, 28 Dec 2018 17:03:18 +0300
Subject: [PATCH] c-stack tests: Avoid test failure on Linux/E2K.

Reading a value without having initialized it caused a SIGILL on
Linux/E2K rather than SIGSEGV as desired. This made test-c-stack2.sh
fail on E2K.

As for test-c-stack2.sh, its intention is to test whether we can tell
a stack overflow from other cases when SIGSEGV is sent, and the way we
cause a SIGSEGV in this test is just an implementation detail. It
turned out that these implementation details need to be slightly
changed for Linux/E2K.
---
 tests/test-c-stack.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tests/test-c-stack.c b/tests/test-c-stack.c
index 1dae74e6c..14fec8e07 100644
--- a/tests/test-c-stack.c
+++ b/tests/test-c-stack.c
@@ -63,7 +63,9 @@ main (int argc, char **argv)
   if (1 < argc)
     {
       exit_failure = 77;
-      ++*argv[argc]; /* Intentionally dereference NULL. */
+      *argv[argc] = 175; /* Intentionally dereference NULL. Writing an
+                            arbitrary value, because reading without having
+                            initialized it causes a SIGILL on Linux/E2K. */
     }
   return recurse (0);
 }
-- 
2.19.2
Re: [RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
Hi Ivan, > +# E2K (elbrus) systems send SIGILL on an access to an invalid > address. This is a bug in the system. Access of an invalid address ought to produce a SIGSEGV or SIGBUS. 'elbrus' is not an important OS so far, for which it would be worth adding workarounds in the gnulib source. Is it still in development? -> If so, please fix that bug. Or is it a museum system? -> If so, just bear with the test failure. Bruno
[RFC PATCH] test-c-stack2.sh: skip if the platform sent SIGILL on an invalid address.
I can think of two ways to view the purpose of this test:

1. distinguish stack overflow from an access to an invalid address
   ("program error")
2. distinguish stack overflow from other cases when SIGSEGV is sent

Under view 2, the access to an invalid address is just an implementation
detail: a simple way to cause SIGSEGV.

I assume view 2 in this patch and simply consider a platform which
doesn't send a SIGSEGV on this condition (but rather sends SIGILL, as E2K
(i.e., elbrus) does) not suitable for this implementation of the test.
Therefore, the result is skip.

Under view 1, it could even be considered a success: the distinction is
made, not thanks to our code, but thanks to the platform sending a
different signal.

Here is what it looks like on E2K (i.e., elbrus):

$ ./test-c-stack 1; echo $?
Illegal instruction
132
$ ./test-c-stack; echo $?
test-c-stack: stack overflow
1
$
---
 tests/test-c-stack2.sh | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/tests/test-c-stack2.sh b/tests/test-c-stack2.sh
index 0cd49c969..a04d861cd 100755
--- a/tests/test-c-stack2.sh
+++ b/tests/test-c-stack2.sh
@@ -23,6 +23,11 @@ case $? in
       exit 77
     fi
     ;;
+  132) echo 'not applicable if non-SIGSEGV is sent in the case to be told from stack overflow' >&2
+    # E2K (elbrus) systems send SIGILL on an access to an invalid address.
+    # So, this test is skipped:
+    exit 77
+    ;;
   0) (exit 1); exit 1 ;;
 esac
 if grep 'program error' t-c-stack2.tmp >/dev/null ; then
-- 
2.17.1
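
For readers wondering where the 132 in the new case branch comes from: the
common shell convention (bash and dash, among others) reports a command
killed by signal N as exit status 128 + N, so 132 corresponds to signal 4,
which is SIGILL on Linux. A tiny sketch of that decoding, with a made-up
function name, might look like this:

```c
#include <signal.h>

/* Return the signal number encoded in a shell exit status, or 0 if the
   status does not denote death by signal (shell convention: 128 + N).
   The name is hypothetical; this is not part of gnulib. */
int signal_from_shell_status(int status)
{
    return (status > 128 && status < 256) ? status - 128 : 0;
}
```

With this convention, 132 decodes to SIGILL, 139 to SIGSEGV and 134 to
SIGABRT, while 77 (the "skip" status used by the test) is an ordinary exit
code and decodes to 0.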