Hi, I've found a serious stack issue when debugging EFI amd64 port. The problem is in grub_parser_cmdline_state, the c code is all right, problem is in the assembly code generated by gcc:
grub_parser_cmdline_state: mov %edi,-0x14(%rsp) movl $0x1,-0xc(%rsp) leaq EXT_C(my_state_transitions)(%rip),%rcx xor %r8d,%r8d jmp loc_31 loc_1b: cmp %edi,%eax jne loc_2d mov 0x8(%rcx),%al cmp %sil,%al je loc_3e test %al,%al cmove %rcx,%r8 loc_2d: add $0x10,%rcx loc_31: mov (%rcx),%eax test %eax,%eax loc_35: jne loc_1b loc_37: jmp loc_4d loc_39: lea -0x18(%rsp),%rcx loc_3e: xor %eax,%eax cmpl $0x0,0xc(%rcx) cmovne %esi,%eax mov %al,(%rdx) mov 0x4(%rcx),%eax addq $0x18, %rsp retq loc_4d: test %r8,%r8 mov %r8,%rcx jne loc_3e jmp loc_39 You can see that it access local variable default_transition using negative offset from %esp. In EFI, hardware interrupt is enabled. When an interrupt fires (such as timer), it uses the same stack, which in effect overwritten default_transition ! The result is that the parser is halted by wrong grub_parser_cmdline_state result, and the menu is truncated. If I subtract %rsp and uses positive offset, this problem goes away. For grub_parser_cmdline_state itself, there is a simple solution, just change default_transition to global variable: diff --git a/kern/parser.c b/kern/parser.c index e931853..a0ab0e7 100644 --- a/kern/parser.c +++ b/kern/parser.c @@ -54,6 +54,7 @@ static struct grub_parser_state_transition state_transitions[] = { 0, 0, 0, 0} }; +static struct grub_parser_state_transition default_transition; /* Determines the state following STATE, determined by C. */ grub_parser_state_t @@ -61,7 +62,6 @@ grub_parser_cmdline_state (grub_parser_state_t state, char c, char *result) { struct grub_parser_state_transition *transition; struct grub_parser_state_transition *next_match = 0; - struct grub_parser_state_transition default_transition; int found = 0; default_transition.to_state = state; But there could be other hidden bombs lurking around. Especially that random reboot issue, I suspect it's caused by the same bug. I've tried a few gcc options, but doesn't seems to work. Anyone knows how to instruct gcc to subtract %rsp pointer for local variable ? -- Bean _______________________________________________ Grub-devel mailing list Grub-devel@gnu.org http://lists.gnu.org/mailman/listinfo/grub-devel