Re: Ping: C-family stack check for threads
Please post your pings under a more meaningful subject line that indicates that this is an ARM back-end and middle-end patch. It's not a C-family patch, C-family maintainers can't give it any useful review. -- Joseph S. Myers jos...@codesourcery.com
Ping: C-family stack check for threads
ping references http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00310.html http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00216.html http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00281.html http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00149.html http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01872.html http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html gcc/ChangeLog 2011-09-20 Thomas Klein th.r.kl...@web.de * opts.c (common_handle_option): introduce new parameters direct and indirect * flag-types.h (enum stack_check_type): Likewise * explow.c (allocate_dynamic_stack_space): - suppress stack probing if parameter direct, indirect or if a stack-limit is given - do additional read of limit value if parameter indirect and a stack-limit symbol is given - emit a call to a stack_failure function [as an alternative to a trap call] (function probe_stack_range): if allowed to override the range porbe emit generic_limit_check_stack * config/arm/arm.c (stack_check_work_registers): new function to find possible working registers [only used by stack check] (emit_push_regs): add push RTL instruction without keeping regnumber and frame memory in mind. (emit_pop_regs): add pop RTL instruction to revert the above push (emit_stack_check_insns): new function to write RTL instructions for stack check at prologue stage. (arm_expand_prologue): stack check integration for ARM and Thumb-2 (thumb1_output_function_prologue): stack check integration for Thumb-1 * config/arm/arm.md (cbranchsi4_insn): allow compare and branch using stack pointer register [at thumb mode] (arm_cmpsi_insn): allow comparing using stack pointer register [at arm] (probe_stack): do not emit code when parameters direct or indirect is given, emit move code way same as in gcc/explow.c [function emit_stack_probe] (probe_stack_done): dummy to make sure probe_stack insns are not optimized away (generic_limit_check_stack): if stack-limit and parameter generic is given use the limit the same way as in function allocate_dynamic_stack_space (stack_failure): failure call used in stack check functions emit_stack_check_insns, generic_limit_check_stack or allocate_dynamic_stack_space [similar to a trap but avoid conflict with builtin_trap] Index: gcc/opts.c === --- gcc/opts.c (revision 179007) +++ gcc/opts.c (working copy) @@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts, : STACK_CHECK_STATIC_BUILTIN ? STATIC_BUILTIN_STACK_CHECK : GENERIC_STACK_CHECK; + else if (!strcmp (arg, indirect)) + /* This is an other stack checking method. */ + opts-x_flag_stack_check = INDIRECT_STACK_CHECK; + else if (!strcmp (arg, direct)) + /* This is an other stack checking method. */ + opts-x_flag_stack_check = DIRECT_STACK_CHECK; else warning_at (loc, 0, unknown stack check parameter \%s\, arg); break; Index: gcc/flag-types.h === --- gcc/flag-types.h(revision 179007) +++ gcc/flag-types.h(working copy) @@ -153,7 +153,15 @@ enum stack_check_type /* Check the stack and entirely rely on the target configuration files, i.e. do not use the generic mechanism at all. */ - FULL_BUILTIN_STACK_CHECK + FULL_BUILTIN_STACK_CHECK, + + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is directly given e.g. by address + of a symbol */ + DIRECT_STACK_CHECK, + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is given by global variable. */ + INDIRECT_STACK_CHECK }; /* Names for the different levels of -Wstrict-overflow=N. The numeric Index: gcc/explow.c === --- gcc/explow.c(revision 179007) +++ gcc/explow.c(working copy) @@ -1386,7 +1386,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* If needed, check that we have the required amount of stack. Take into account what has already been checked. */ - if (STACK_CHECK_MOVING_SP) + if ( STACK_CHECK_MOVING_SP +#ifdef HAVE_generic_limit_check_stack + || crtl-limit_stack +#endif + || flag_stack_check == DIRECT_STACK_CHECK + || flag_stack_check == INDIRECT_STACK_CHECK) ; else if (flag_stack_check == GENERIC_STACK_CHECK) probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE, @@ -1423,19 +1428,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* Check stack bounds if necessary. */ if (crtl-limit_stack) { + rtx limit_rtx; rtx available; rtx
Re: Ping: C-family stack check for threads
On Mon, Sep 5, 2011 at 1:45 AM, Thomas Klein th.r.kl...@web.de wrote: +static int +stack_check_work_registers (rtx *workreg) +{ + int reg, i, k, n, nregs; + + if (crtl-args.info.pcs_variant = ARM_PCS_AAPCS_LOCAL) +{ + nregs = crtl-args.info.aapcs_next_ncrn; +} + else +nregs = crtl-args.info.nregs; Missing {} for else + n = 0; ... + /* check if we can use one of the argument registers r0..r3 as long as they + * not holding data*/ + for (reg = 0; reg = LAST_ARG_REGNUM i 2; reg++) ... + n = (reg + 1) % 4; Avoid immediate register number. use ARG_REGISTER (1) to replace reg 0 use NUM_ARG_REGS to replace 4 + if (is_non_opt_thumb2 || is_thumb2_hi_reg[0]) + arm_emit_movpair(reg[0], stack_limit_rtx); + else +emit_move_insn(reg[0], stack_limit_rtx); and + rtx lr = gen_rtx_REG (SImode, LR_REGNUM); + emit_push_regs (1, lr); Wrong indent
Re: Ping: C-family stack check for threads
On 09/05/11 09:45, Ye Joey wrote: + /* check if we can use one of the argument registers r0..r3 as long as they + * not holding data*/ + for (reg = 0; reg= LAST_ARG_REGNUM i 2; reg++) ... + n = (reg + 1) % 4; Avoid immediate register number. use ARG_REGISTER (1) to replace reg 0 use NUM_ARG_REGS to replace 4 The 4 is the number of argument registers so you are right to use NUM_ARG_REGS here. The calculation should give the next possible argument register. E.g. if the current register is r0, the next register is r1. Except if the current register is r3, then the next register is r0. I think the ARG_REGISTER macro will not reduce confusion. n = ( ARG_REGISTER(reg+1) + 1) % NUM_ARG_REGS; identical to n = (reg + 1) % NUM_ARG_REGS; regards Thomas Klein gcc/ChangeLog 2011-09-05 Thomas Klein th.r.kl...@web.de * opts.c (common_handle_option): introduce new parameters direct and indirect * flag-types.h (enum stack_check_type): Likewise * explow.c (allocate_dynamic_stack_space): - suppress stack probing if parameter direct, indirect or if a stack-limit is given - do additional read of limit value if parameter indirect and a stack-limit symbol is given - emit a call to a stack_failure function [as an alternative to a trap call] (function probe_stack_range): if allowed to override the range porbe emit generic_limit_check_stack * config/arm/arm.c (stack_check_work_registers): new function to find possible working registers [only used by stack check] (emit_push_regs): add push RTL instruction without keeping regnumber and frame memory in mind. (emit_pop_regs): add pop RTL instruction to revert the above push (emit_stack_check_insns): new function to write RTL instructions for stack check at prologue stage. (arm_expand_prologue): stack check integration for ARM and Thumb-2 (thumb1_output_function_prologue): stack check integration for Thumb-1 * config/arm/arm.md (cbranchsi4_insn): allow compare and branch using stack pointer register [at thumb mode] (arm_cmpsi_insn): allow comparing using stack pointer register [at arm] (probe_stack): do not emit code when parameters direct or indirect is given, emit move code way same as in gcc/explow.c [function emit_stack_probe] (probe_stack_done): dummy to make sure probe_stack insns are not optimized away (generic_limit_check_stack): if stack-limit and parameter generic is given use the limit the same way as in function allocate_dynamic_stack_space (stack_failure): failure call used in stack check functions emit_stack_check_insns, generic_limit_check_stack or allocate_dynamic_stack_space [similar to a trap but avoid conflict with builtin_trap] Index: gcc/opts.c === --- gcc/opts.c (revision 178554) +++ gcc/opts.c (working copy) @@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts, : STACK_CHECK_STATIC_BUILTIN ? STATIC_BUILTIN_STACK_CHECK : GENERIC_STACK_CHECK; + else if (!strcmp (arg, indirect)) + /* This is an other stack checking method. */ + opts-x_flag_stack_check = INDIRECT_STACK_CHECK; + else if (!strcmp (arg, direct)) + /* This is an other stack checking method. */ + opts-x_flag_stack_check = DIRECT_STACK_CHECK; else warning_at (loc, 0, unknown stack check parameter \%s\, arg); break; Index: gcc/flag-types.h === --- gcc/flag-types.h(revision 178554) +++ gcc/flag-types.h(working copy) @@ -153,7 +153,15 @@ enum stack_check_type /* Check the stack and entirely rely on the target configuration files, i.e. do not use the generic mechanism at all. */ - FULL_BUILTIN_STACK_CHECK + FULL_BUILTIN_STACK_CHECK, + + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is directly given e.g. by address + of a symbol */ + DIRECT_STACK_CHECK, + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is given by global variable. */ + INDIRECT_STACK_CHECK }; /* Names for the different levels of -Wstrict-overflow=N. The numeric Index: gcc/explow.c === --- gcc/explow.c(revision 178554) +++ gcc/explow.c(working copy) @@ -1372,7 +1372,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* If needed, check that we have the required amount of stack. Take into account what has already been checked. */ - if (STACK_CHECK_MOVING_SP) + if ( STACK_CHECK_MOVING_SP +#ifdef HAVE_generic_limit_check_stack + || crtl-limit_stack +#endif + || flag_stack_check ==
Re: Ping: C-family stack check for threads
Hello Here is my next try to put the stack check into rtl at prologue stage. To me, it was not as easy as I hoped. I've had little problems to get push/pop and the compare/jump working. Hoping the way i choose is acceptable. With rtl no extra pool to hold pointer or size values is required any more. That's fine. So this movement to rtl dose make sense. Regards Thomas Klein Index: gcc/opts.c === --- gcc/opts.c(revision 176974) +++ gcc/opts.c(working copy) @@ -1644,6 +1644,12 @@ common_handle_option (struct gcc_options *opts, : STACK_CHECK_STATIC_BUILTIN ? STATIC_BUILTIN_STACK_CHECK : GENERIC_STACK_CHECK; + else if (!strcmp (arg, indirect)) +/* This is an other stack checking method. */ +opts-x_flag_stack_check = INDIRECT_STACK_CHECK; + else if (!strcmp (arg, direct)) +/* This is an other stack checking method. */ +opts-x_flag_stack_check = DIRECT_STACK_CHECK; else warning_at (loc, 0, unknown stack check parameter \%s\, arg); break; Index: gcc/flag-types.h === --- gcc/flag-types.h(revision 176974) +++ gcc/flag-types.h(working copy) @@ -153,7 +153,15 @@ enum stack_check_type /* Check the stack and entirely rely on the target configuration files, i.e. do not use the generic mechanism at all. */ - FULL_BUILTIN_STACK_CHECK + FULL_BUILTIN_STACK_CHECK, + + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is directly given e.g. by address + of a symbol */ + DIRECT_STACK_CHECK, + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is given by global variable. */ + INDIRECT_STACK_CHECK }; /* Names for the different levels of -Wstrict-overflow=N. The numeric Index: gcc/explow.c === --- gcc/explow.c(revision 176974) +++ gcc/explow.c(working copy) @@ -1358,7 +1358,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* If needed, check that we have the required amount of stack. Take into account what has already been checked. */ - if (STACK_CHECK_MOVING_SP) + if ( STACK_CHECK_MOVING_SP +#ifdef HAVE_generic_limit_check_stack + || crtl-limit_stack +#endif + || flag_stack_check == DIRECT_STACK_CHECK + || flag_stack_check == INDIRECT_STACK_CHECK) ; else if (flag_stack_check == GENERIC_STACK_CHECK) probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE, @@ -1392,19 +1397,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* Check stack bounds if necessary. */ if (crtl-limit_stack) { + rtx limit_rtx; rtx available; rtx space_available = gen_label_rtx (); + if ( GET_CODE (stack_limit_rtx) == SYMBOL_REF + flag_stack_check == INDIRECT_STACK_CHECK) +limit_rtx = expand_unop (Pmode, mov_optab, +gen_rtx_MEM (Pmode, stack_limit_rtx), +NULL_RTX, 1); + else +limit_rtx = stack_limit_rtx; #ifdef STACK_GROWS_DOWNWARD available = expand_binop (Pmode, sub_optab, -stack_pointer_rtx, stack_limit_rtx, +stack_pointer_rtx, limit_rtx, NULL_RTX, 1, OPTAB_WIDEN); #else available = expand_binop (Pmode, sub_optab, -stack_limit_rtx, stack_pointer_rtx, +limit_rtx, stack_pointer_rtx, NULL_RTX, 1, OPTAB_WIDEN); #endif emit_cmp_and_jump_insns (available, size, GEU, NULL_RTX, Pmode, 1, space_available); +#ifdef HAVE_stack_failure + if (HAVE_stack_failure) +emit_insn (gen_stack_failure ()); + else +#endif #ifdef HAVE_trap if (HAVE_trap) emit_insn (gen_trap ()); @@ -1547,6 +1565,13 @@ probe_stack_range (HOST_WIDE_INT first, rtx size) return; } #endif +#ifdef HAVE_generic_limit_check_stack + else if (HAVE_generic_limit_check_stack) +{ + rtx addr = memory_address (Pmode,stack_pointer_rtx); + emit_insn (gen_generic_limit_check_stack (addr)); +} +#endif /* Otherwise we have to generate explicit probes. If we have a constant small number of them to generate, that's the easy case. */ Index: gcc/config/arm/arm.c === --- gcc/config/arm/arm.c(revision 176974) +++ gcc/config/arm/arm.c(working copy) @@ -15809,6 +15809,299 @@ thumb_set_frame_pointer (arm_stack_offsets *offset RTX_FRAME_RELATED_P (insn) = 1; } +/*search for possible work registers for stack-check operation at prologue + return the number of register that can be used without extra push/pop */ + +static int
Re: Ping: C-family stack check for threads
On Sun, 3 Jul 2011, Thomas Klein wrote: Ye Joey wrote: Thomas, I think your are working on a very useful feature. I have ARM MCU applications running of out stack space and resulting strange behaviors silently. I'd like to try your patch and probably give further comments I also think this will be a very useful feature (not just for threads), and I hope you'll persevere through the review process. ;) Not your first patch and you have copyright assignments in place, so that's covered. The first thing I see is that you need to fix the issues regarding the GCC coding standards, http://gcc.gnu.org/codingconventions.html as that is a hurdle for reviewers, and you don't want that. Be very careful. I haven't ran contrib/check_GNU_style.sh myself but maybe it'll be helpful. The second issue I see is that documentation for the new patterns is missing, that should go in gcc/doc/md.texi, somewhere under @node Standard Names. I can imagine there'll be a thing or two to tweak regarding them and that best reviewed through the documentation. Generally, as much as possible should be general and not ARM-specific. If you need helper functions, add them to libgcc. Regards Thomas Klein gcc/ChangeLog 2011-07-03 Thomas Kleinth.r.kl...@web.de mailto:th.r.kl...@web.de * opts.c (common_handle_option): introduce additional stack checking parameters direct and indirect * flag-types.h (enum stack_check_type): Likewise * explow.c (allocate_dynamic_stack_space): - suppress stack probing if parameter direct, indirect or if a stack-limit is given - do additional read of limit value if parameter indirect and a stack-limit symbol is given - emit a call to a stack_failure function [as an alternative to a trap call] No bullet list in the changelog, please. Individual sentences. Follow the existing format; full sentences with capitalization and all that. brgds, H-P
Re: Ping: C-family stack check for threads
On 07/04/2011 03:25 PM, Thomas Klein wrote: There is a emit_multi_reg_push but is there something like emit_multi_reg_pop, too. There's a multi-reg push because that's one instruction. Are the other operations (compare, branche, ..) still allowed? Of course. Everything is still allowed. See the alpha prologue where the stack is probed in a loop to make sure that each page is present. Yes, if the failure function is taken the info will be wrong. If this is a major problem do I have to add this info after any push and pop operation? Of course it is a major problem. How are you supposed to debug a failure? Guess at the stack trace? Will the rtl push/pop do this already for me? It'll do some of it, yes. r~
Re: Ping: C-family stack check for threads
Richard Henderson wrote: On 07/03/2011 08:06 AM, Thomas Klein wrote: +/* + * Write prolouge part of stack check into asm file. + * For Thumb this may look like this: + * push {rsym,ramn} + * ldr rsym, .LSPCHK0 + * ldr rsym, [rsym] + * ldr ramn, .LSPCHK0 + 4 + * add rsym, rsym, ramn + * cmp sp, rsym + * bhs .LSPCHK1 + * push {lr} + * bl __thumb_stack_failure + * .align 2 + * .LSPCHK0: + * .word symbol_addr_of(stack_limit_rtx) + * .word lenght_of(amount) + * .LSPCHK1: + * pop {rsym,ramn} + */ +void +stack_check_output_function (FILE *f, int reg0, int reg1, unsigned amount, + unsigned numregs) +{ Is there an exceedingly good reason you're emitting this much code as text, rather than as rtl? To me, the stack check is one coherent operation. This is placed after an initial push, which can't be eliminated, but before a major stack adjustment. I have, had some problems with rtl at prologue stage. Is there a way to encapsulate a rtl sequence within prologue. There is a emit_multi_reg_push but is there something like emit_multi_reg_pop, too. Are the other operations (compare, branche, ..) still allowed? In particular, you adjust the stack but not the unwind info. So if one puts a breakpoint at your __thumb_stack_failure function, the unwind information will be incorrect. Yes, if the failure function is taken the info will be wrong. If this is a major problem do I have to add this info after any push and pop operation? Will the rtl push/pop do this already for me? Regards Thomas Klein
Re: Ping: C-family stack check for threads
Ye Joey wrote: Thomas, I think your are working on a very useful feature. I have ARM MCU applications running of out stack space and resulting strange behaviors silently. I'd like to try your patch and probably give further comments - Joey Hi Due to convention of of thumb prologue to rtl, this patch needs to be modified too. Regards Thomas Klein gcc/ChangeLog 2011-07-03 Thomas Kleinth.r.kl...@web.de mailto:th.r.kl...@web.de * opts.c (common_handle_option): introduce additional stack checking parameters direct and indirect * flag-types.h (enum stack_check_type): Likewise * explow.c (allocate_dynamic_stack_space): - suppress stack probing if parameter direct, indirect or if a stack-limit is given - do additional read of limit value if parameter indirect and a stack-limit symbol is given - emit a call to a stack_failure function [as an alternative to a trap call] (function probe_stack_range): if allowed to override the range probe emit generic_limit_check_stack * config/arm/arm.c (stack_check_output_function): new function to write the stack check code sequence to the assember file (inside prologue) (stack_check_work_registers): new function to find possible working registers [only used by stack check] (arm_expand_prologue): stack check integration for ARM and Thumb-2 (thumb1_expand_prologue): stack check integration for Thumb-1 * config/arm/arm.md (probe_stack): do not emit code when parameters direct or indirect given, emit move code as in gcc/explow.c [function emit_stack_probe] (probe_stack_done): dummy to make sure probe_stack insns are not optimized away (generic_limit_check_stack): if stack-limit and parameter generic is given use the limit the same way as in function allocate_dynamic_stack_space (stack_check): ARM/Thumb-2/Thumb-1 insn to output function stack_check_output_function (stack_failure): failure call used in function allocate_dynamic_stack_space [similar to a trap but avoid conflict with builtin_trap] Index: gcc/flag-types.h === --- gcc/flag-types.h(revision 175786) +++ gcc/flag-types.h(working copy) @@ -153,7 +153,15 @@ enum stack_check_type /* Check the stack and entirely rely on the target configuration files, i.e. do not use the generic mechanism at all. */ - FULL_BUILTIN_STACK_CHECK + FULL_BUILTIN_STACK_CHECK, + + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is directly given e.g. by address + of a symbol */ + DIRECT_STACK_CHECK, + /* Check the stack (if possible) before allocation of local variables at + each function entry. The stack limit is given by global variable. */ + INDIRECT_STACK_CHECK }; /* Names for the different levels of -Wstrict-overflow=N. The numeric Index: gcc/explow.c === --- gcc/explow.c(revision 175786) +++ gcc/explow.c(working copy) @@ -1358,7 +1358,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* If needed, check that we have the required amount of stack. Take into account what has already been checked. */ - if (STACK_CHECK_MOVING_SP) + if ( STACK_CHECK_MOVING_SP +#ifdef HAVE_generic_limit_check_stack + || crtl-limit_stack +#endif + || flag_stack_check == DIRECT_STACK_CHECK + || flag_stack_check == INDIRECT_STACK_CHECK) ; else if (flag_stack_check == GENERIC_STACK_CHECK) probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE, @@ -1392,19 +1397,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s /* Check stack bounds if necessary. */ if (crtl-limit_stack) { + rtx limit_rtx; rtx available; rtx space_available = gen_label_rtx (); + if ( GET_CODE (stack_limit_rtx) == SYMBOL_REF + flag_stack_check == INDIRECT_STACK_CHECK) +limit_rtx = expand_unop (Pmode, mov_optab, + gen_rtx_MEM (Pmode, stack_limit_rtx), + NULL_RTX, 1); + else +limit_rtx = stack_limit_rtx; #ifdef STACK_GROWS_DOWNWARD available = expand_binop (Pmode, sub_optab, - stack_pointer_rtx, stack_limit_rtx, + stack_pointer_rtx, limit_rtx, NULL_RTX, 1, OPTAB_WIDEN); #else available = expand_binop (Pmode, sub_optab, - stack_limit_rtx, stack_pointer_rtx, + limit_rtx, stack_pointer_rtx, NULL_RTX, 1, OPTAB_WIDEN); #endif emit_cmp_and_jump_insns (available, size, GEU, NULL_RTX, Pmode, 1, space_available); +#ifdef
Re: Ping: C-family stack check for threads
On 07/03/2011 08:06 AM, Thomas Klein wrote: +/* + * Write prolouge part of stack check into asm file. + * For Thumb this may look like this: + * push {rsym,ramn} + * ldr rsym, .LSPCHK0 + * ldr rsym, [rsym] + * ldr ramn, .LSPCHK0 + 4 + * add rsym, rsym, ramn + * cmp sp, rsym + * bhs .LSPCHK1 + * push {lr} + * bl __thumb_stack_failure + * .align 2 + * .LSPCHK0: + * .word symbol_addr_of(stack_limit_rtx) + * .word lenght_of(amount) + * .LSPCHK1: + * pop {rsym,ramn} + */ +void +stack_check_output_function (FILE *f, int reg0, int reg1, unsigned amount, + unsigned numregs) +{ Is there an exceedingly good reason you're emitting this much code as text, rather than as rtl? In particular, you adjust the stack but not the unwind info. So if one puts a breakpoint at your __thumb_stack_failure function, the unwind information will be incorrect. r~
Re: Ping: C-family stack check for threads
On Fri, Jun 24, 2011 at 11:51 PM, Thomas Klein th.r.kl...@web.de wrote: Hi This is a ping of (http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html). Repeating my request. I would like to have a stack check for threads with small amount of stack space per thread. (I'm using a ARM Cortex-M3 microcontroller with a stack size of a 1 KByte per Thread.) Each thread having its own limit address. The thread scheduler can then calculate the limit and store this value inside of a global variable. The compiler may generate code to check the stack for overflow at function entry. In principal this can be done this way: - push registers as usual - figure out if one or two work registers, that can be used directly without extra push - if not enough registers found push required work registers to stack - load limit address into first working register - load value of limit address (into the same register) - if stack pointer will go to extend the stack (e.g. for local variables) load this size value too (here the second work register can be used) - compare for overflow - if overflow occur call stack_failure function - pop work registers that are pushed before - continue function prologue as usual e.g. extend stack pointer The ARM target has an option -mapcs-stack-check but this is more or less not working. (implementation seems to be missing) There are also architecture independent options like -fstack-check=generic, -fstack-limit-symbol=current_stack_limit or -fstack-limit-register=r6 that can be used. The generic stack check is doing a probe at end of function prologue phase (e.g by writing 12K ahead the current stack pointer position). If this stack space is not available the probe may generates a fault. This require that the CPU is having a MPU or a MMU. For machines with small memory space an additional mechanism should be available. The option -fstack-check can be extend by the switches direct and indirect to emit compare code in function prologue. If switch direct is given the address of -fstack-limit-symbol represents the limit itself. If switch indirect is given -fstack-limit-symbol is a kind of global variable that needs be read before comparison. Thomas, I think your are working on a very useful feature. I have ARM MCU applications running of out stack space and resulting strange behaviors silently. I'd like to try your patch and probably give further comments - Joey