Re: [DWARF] Tracking uninitialized variables
On 07/20/2015 08:11 PM, Jakub Jelinek wrote:
> On Mon, Jul 20, 2015 at 07:55:46PM +0300, Nikolai Bozhenov wrote:
>> On 07/17/2015 08:31 PM, Michael Eager wrote:
>>> On 07/17/2015 03:43 AM, Nikolai Bozhenov wrote:
>>>> Consider the following example:
>>>>
>>>>  1 extern void bar(int *i1, int *i2, int *i3);
>>>>  2
>>>>  3 int __attribute__((noinline)) foo(int i1, int i2) {
>>>>  4   int a, b, c;
>>>>  5   a = i1 << i2;
>>>>  6   b = (i1 + i2) * i1;
>>>>  7   c = (b + i1);
>>>>  8   bar(&a, &b, &c);
>>>>  9 }
>>>> 10
>>>> 11 int main() {
>>>> 12   foo(42, 12);
>>>> 13 }
>
> First of all, -fvar-tracking-uninit is misdesigned mess that really
> should have never been added to GCC, so please don't consider
> DW_OP_GNU_uninit as something that should be used in any approach
> for this.

Thanks for your reply, Jakub! So I take it that -fvar-tracking-uninit is
completely useless/broken at present and is likely to be removed in
future versions of GCC, isn't it?

Anyway, it seems that I was trying to use the wrong tool to solve the
problem. In my initial example I got confusing results because
unconditionally initialized variables were in fact uninitialized at the
point where I expected them to be initialized (at the last line of the
function). But the confusion was really caused by reordering of
operations rather than by uninitialized variables. In the following
slightly modified example I would face the same confusion at line 11
(because, likewise, the new values would not yet have been assigned to
the variables), even though there are no uninitialized variables any
more:

     3 int foo(int i1, int i2) {
     4   int a = getvalue();
     5   int b = getvalue();
     6   int c = getvalue();
     7   if (baz()) {
     8     a = i1 << i2;
     9     b = (i1 + i2) * i1;
    10     c = (b + i1);
    11     bar(&a, &b, &c);
    12   }
       . . .

I mean that when debugging optimized code the main problem is that
operations are heavily reordered. Therefore, to simplify debugging, one
would like to determine which parts of the source code have been fully
executed, which are currently being executed, and which are yet to be
executed.

In the latter example, if one could determine that lines 8 through 10
had not been completely executed when the breakpoint at line 11 was hit,
any unexpected values for variables a, b, c wouldn't cause much
confusion, because it would be clear that they simply haven't got their
new values yet. One could then set a breakpoint at some later
instruction and get the values of the interesting variables after
execution of the corresponding source lines has finished.

Nikolai
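To make the "which lines have fully executed" idea concrete, here is a minimal sketch in C. It is not anything GCC or GDB provides. It hard-codes the offset-to-line annotation for the first foo example that appears elsewhere in this thread, and the names line_row, foo_rows and line_fully_executed are made up for illustration. It also assumes straight-line code, so that every instruction at an offset below the stopped PC has already run; with branches or loops that assumption fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One row of a (hypothetical) flattened line table for foo. */
struct line_row {
    unsigned offset;  /* instruction offset within foo */
    int line;         /* source line the instruction is attributed to */
};

/* Offsets and line numbers copied from the annotated listing in the thread. */
static const struct line_row foo_rows[] = {
    {0x00, 5}, {0x02, 6}, {0x04, 3}, {0x08, 6}, {0x0b, 5}, {0x0d, 8},
    {0x12, 5}, {0x14, 5}, {0x18, 6}, {0x1c, 7}, {0x1e, 8}, {0x23, 7},
    {0x27, 8}, {0x2c, 8}, {0x31, 9}, {0x35, 9},
};

/*
 * Returns true if every instruction attributed to `line` lies strictly
 * below `pc`, i.e. has already executed when we are stopped at `pc`.
 * Assumption: straight-line code with no branches.
 */
static bool line_fully_executed(unsigned pc, int line)
{
    bool seen = false;

    assert(line > 0);
    for (size_t i = 0; i < sizeof foo_rows / sizeof foo_rows[0]; i++) {
        if (foo_rows[i].line != line)
            continue;
        seen = true;
        if (foo_rows[i].offset >= pc)
            return false;  /* part of this line has not run yet */
    }
    return seen;
}
```

With this table, stopping at offset 0x0d (where GDB placed the line 8 breakpoint) reports lines 5, 6 and 7 as incomplete, while stopping at the call instruction at 0x2c reports all three as complete.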
Re: [DWARF] Tracking uninitialized variables
On 07/20/2015 09:55 AM, Nikolai Bozhenov wrote:
> On 07/17/2015 08:31 PM, Michael Eager wrote:
>> A related issue is where the breakpoint is taken. GCC sets
>> breakpoints at the first instruction generated for a statement,
>> which in this case, appears to be before any of the arguments to
>> bar are evaluated. A possibly better location would be after
>> arguments are evaluated, before the call is executed.
>
> In this case GDB set the breakpoint at the instruction at 0x0d where
> evaluation of the first argument for the call is performed. I'm not
> sure that there is a less confusing way to choose an address to set a
> breakpoint. For example I don't think it is a good idea to ignore
> evaluation of function arguments and set a breakpoint right at the
> call instruction. But even if there is a better way, such new feature
> is likely to be implemented in GDB rather than in GCC.

Debugging optimized programs is difficult, and deciding the best
location to set a breakpoint is a matter for some thought. (See
Caroline Tice's dissertation.) If none of the arguments is evaluated
until after the first instruction of a function call, then you cannot
display the argument values if that is your breakpoint.

No matter what location you feel is the best for a breakpoint, it is
the compiler which generates the line number table and indicates where
a breakpoint should be placed, not the debugger. Don't expect anything
in GDB which will do something when GCC provides incomplete or
inaccurate information.

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077
Re: [DWARF] Tracking uninitialized variables
On 07/17/2015 08:31 PM, Michael Eager wrote:
> On 07/17/2015 03:43 AM, Nikolai Bozhenov wrote:
>> Consider the following example:
>>
>>  1 extern void bar(int *i1, int *i2, int *i3);
>>  2
>>  3 int __attribute__((noinline)) foo(int i1, int i2) {
>>  4   int a, b, c;
>>  5   a = i1 << i2;
>>  6   b = (i1 + i2) * i1;
>>  7   c = (b + i1);
>>  8   bar(&a, &b, &c);
>>  9 }
>> 10
>> 11 int main() {
>> 12   foo(42, 12);
>> 13 }
>>
>> Let's compile it:
>>
>> $ gcc-trunk tst.c -g -fvar-tracking-uninit -O2
>>
>> After hitting a breakpoint at line 8 (the last line of the function
>> foo) I have some random (and very confusing) values displayed in gdb
>> for all three variables a, b and c. This is because GCC allocates
>> these three variables on the stack (their addresses are taken) and
>> creates for them DWARF entries like this:
>>
>>  <2><a8>: Abbrev Number: 8 (DW_TAG_variable)
>>     <a9>   DW_AT_name        : a
>>     <ab>   DW_AT_decl_file   : 1
>>     <ac>   DW_AT_decl_line   : 4
>>     <ad>   DW_AT_type        : <0x64>
>>     <b1>   DW_AT_location    : 2 byte block: 91 64 (DW_OP_fbreg: -28)
>
> This (incorrectly) says that the variable is at the specified location
> for the entire scope of the function. This should be a location list,
> which specifies the live range for the variable. At the breakpoint,
> this location list would/should have no location for the variable.
>
> A related issue is where the breakpoint is taken. GCC sets
> breakpoints at the first instruction generated for a statement,
> which in this case, appears to be before any of the arguments to
> bar are evaluated. A possibly better location would be after
> arguments are evaluated, before the call is executed.

I'm not sure that this is a GCC issue. I annotated the instructions
with the line numbers taken from the debug info that GCC generated for
the function, and these numbers look perfectly correct:

    foo:
       0: 89 f1             mov    %esi,%ecx        // 5
       2: 01 fe             add    %edi,%esi        // 6
       4: 48 83 ec 18       sub    $0x18,%rsp       // 3
       8: 0f af f7          imul   %edi,%esi        // 6
       b: 89 f8             mov    %edi,%eax        // 5
       d: 48 8d 54 24 0c    lea    0xc(%rsp),%rdx   // 8
      12: d3 e0             shl    %cl,%eax         // 5
      14: 89 44 24 04       mov    %eax,0x4(%rsp)   // 5
      18: 89 74 24 08       mov    %esi,0x8(%rsp)   // 6
      1c: 01 fe             add    %edi,%esi        // 7
      1e: 48 8d 7c 24 04    lea    0x4(%rsp),%rdi   // 8
      23: 89 74 24 0c       mov    %esi,0xc(%rsp)   // 7
      27: 48 8d 74 24 08    lea    0x8(%rsp),%rsi   // 8
      2c: e8 00 00 00 00    callq  31 <foo+0x31>    // 8
      31: 48 83 c4 18       add    $0x18,%rsp       // 9
      35: c3                retq                    // 9

In this case GDB set the breakpoint at the instruction at 0x0d, where
evaluation of the first argument for the call is performed. I'm not
sure that there is a less confusing way to choose an address for the
breakpoint. For example, I don't think it is a good idea to ignore
evaluation of function arguments and set the breakpoint right at the
call instruction. But even if there is a better way, such a new feature
is likely to be implemented in GDB rather than in GCC.

Nikolai
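As an illustration of the location-list idea raised in the quoted reply, here is a rough C model of a per-variable live range for the annotated listing above. It is a sketch under several assumptions, not what GCC emits: each range is assumed to start right after the instruction that stores the variable to its stack slot (0x14, 0x18 and 0x23 in the listing) and to run to the end of the function at 0x36. The offsets -24 and -20 for b and c are derived from a's DW_OP_fbreg -28 and the 4-byte slot spacing, not taken from any DWARF dump in the thread. The names var_loc, foo_locs and find_location are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A single-entry "location list": valid for PCs in [lo, hi). */
struct var_loc {
    const char *name;
    unsigned lo, hi;   /* offsets within foo */
    int fbreg_offset;  /* frame-base-relative slot, as in DW_OP_fbreg */
};

/* Hypothetical ranges: each begins just after the first store to the slot. */
static const struct var_loc foo_locs[] = {
    {"a", 0x18, 0x36, -28},  /* store at 0x14: mov %eax,0x4(%rsp) */
    {"b", 0x1c, 0x36, -24},  /* store at 0x18: mov %esi,0x8(%rsp) */
    {"c", 0x27, 0x36, -20},  /* store at 0x23: mov %esi,0xc(%rsp) */
};

/*
 * Returns the location valid at `pc`, or NULL when the variable has no
 * location there (a debugger would then report it as optimized out).
 */
static const struct var_loc *find_location(const char *name, unsigned pc)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof foo_locs / sizeof foo_locs[0]; i++) {
        const struct var_loc *v = &foo_locs[i];
        if (strcmp(v->name, name) == 0 && pc >= v->lo && pc < v->hi)
            return v;
    }
    return NULL;
}
```

Under these assumed ranges, a lookup at 0x0d (the breakpoint address) finds no location for any of a, b or c, while a lookup at the call instruction at 0x2c finds all three.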
Re: [DWARF] Tracking uninitialized variables
On Mon, Jul 20, 2015 at 07:55:46PM +0300, Nikolai Bozhenov wrote:
> On 07/17/2015 08:31 PM, Michael Eager wrote:
>> On 07/17/2015 03:43 AM, Nikolai Bozhenov wrote:
>>> Consider the following example:
>>>
>>>  1 extern void bar(int *i1, int *i2, int *i3);
>>>  2
>>>  3 int __attribute__((noinline)) foo(int i1, int i2) {
>>>  4   int a, b, c;
>>>  5   a = i1 << i2;
>>>  6   b = (i1 + i2) * i1;
>>>  7   c = (b + i1);
>>>  8   bar(&a, &b, &c);
>>>  9 }
>>> 10
>>> 11 int main() {
>>> 12   foo(42, 12);
>>> 13 }
>>>
>>> Let's compile it:
>>>
>>> $ gcc-trunk tst.c -g -fvar-tracking-uninit -O2

First of all, -fvar-tracking-uninit is a misdesigned mess that really
should never have been added to GCC, so please don't consider
DW_OP_GNU_uninit as something that should be used in any approach for
this.

As for the reduction of ranges, you've chosen a testcase where it is
perhaps possible to track down the first writes to an addressable
variable and reduce the .debug_loc extents where the corresponding
memory is documented as holding the value of the variable. Generally,
it is significantly harder though. Consider

    void
    foo (int x, int y, int z)
    {
      int a;
      if (x)
        a = x + 3;
      if (y)
        a = y + 2;
      if (z)
        a = z - 1;
      bar (&a, 0, 0);
    }

Where would you consider the range of var a to start? It is initialized
conditionally; in some cases it might not be initialized at all before
calling bar. Or would you consider a function taking the address of a
variable known not to be initialized yet as the start of the range?
That function might store a value there, but might not (conditionally
or unconditionally).

Also, think about loop unrolling, if you have:

    for (int i = 0; i < 16; i++)
      {
        {
          int a;
          bar (0);
          a = i;
          bar (&a);
        }
        baz ();
      }

If this is unrolled and a, as an addressable var, is assigned some
stack slot, how would you find out which unrolled statement touching
the var is from which iteration (and whether to represent the location
info as multiple shorter ranges from the a = i; statements to bar (&a);
and then again make it optimized away)?

	Jakub
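The point about conditional initialization can be checked with a small C sketch that mirrors only the control flow of the three-branch foo above. The function name first_store_site and the numbering of the store sites are made up for illustration. It reports which of the three assignments is the first one to execute for a given set of arguments, with 0 meaning that a is never written before the call.

```c
#include <assert.h>

/*
 * Mirrors the branches of the conditional-initialization example.
 * Returns the index (1, 2 or 3) of the first assignment to `a` that
 * executes, or 0 if none does.
 */
static int first_store_site(int x, int y, int z)
{
    int site = 0;

    if (x)
        site = 1;       /* a = x + 3; */
    else if (y)
        site = 2;       /* a = y + 2; */
    else if (z)
        site = 3;       /* a = z - 1; */

    assert(site >= 0 && site <= 3);
    return site;
}
```

Because the result depends on run-time values, there is no single instruction address that could serve as the start of the range for a on every path, which is the difficulty described above.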
Re: [DWARF] Tracking uninitialized variables
On 07/17/2015 02:16 PM, Kyrill Tkachov wrote:
> Have you tried the -Og option?

Well, with -Og operations are not rearranged, so the binary code is
very close to the source code and debugging is very straightforward.

As for the debug info, it is generally the same: when I try to inspect
a variable that has not been initialized yet, some random value is
printed without any warning that the variable is uninitialized (despite
the -fvar-tracking-uninit option). If the variables were marked with
DW_OP_GNU_uninit, gdb would print something like this:

    a = [uninitialized] 64
    b = [uninitialized] 4198512
    c = [uninitialized] 0

Nikolai
Re: [DWARF] Tracking uninitialized variables
On 17/07/15 11:43, Nikolai Bozhenov wrote:
> Hello!
>
> It is certainly true that debugging an optimized code is an
> inherently difficult task. Though, I wonder if the compiler could
> make such debugging experience slightly less surprising. Consider
> the following example:
>
>  1 extern void bar(int *i1, int *i2, int *i3);
>  2
>  3 int __attribute__((noinline)) foo(int i1, int i2) {
>  4   int a, b, c;
>  5   a = i1 << i2;
>  6   b = (i1 + i2) * i1;
>  7   c = (b + i1);
>  8   bar(&a, &b, &c);
>  9 }
> 10
> 11 int main() {
> 12   foo(42, 12);
> 13 }
>
> Let's compile it:
>
> $ gcc-trunk tst.c -g -fvar-tracking-uninit -O2

Just a drive-by thought. Have you tried the -Og option?
The documentation for it says:

    Optimize debugging experience. -Og enables optimizations that do
    not interfere with debugging. It should be the optimization level
    of choice for the standard edit-compile-debug cycle, offering a
    reasonable level of optimization while maintaining fast compilation
    and a good debugging experience.

Kyrill
[DWARF] Tracking uninitialized variables
Hello!

It is certainly true that debugging optimized code is an inherently
difficult task. Still, I wonder whether the compiler could make the
debugging experience slightly less surprising. Consider the following
example:

     1 extern void bar(int *i1, int *i2, int *i3);
     2
     3 int __attribute__((noinline)) foo(int i1, int i2) {
     4   int a, b, c;
     5   a = i1 << i2;
     6   b = (i1 + i2) * i1;
     7   c = (b + i1);
     8   bar(&a, &b, &c);
     9 }
    10
    11 int main() {
    12   foo(42, 12);
    13 }

Let's compile it:

    $ gcc-trunk tst.c -g -fvar-tracking-uninit -O2

After hitting a breakpoint at line 8 (the last line of the function
foo) I have some random (and very confusing) values displayed in gdb
for all three variables a, b and c. This is because GCC allocates these
three variables on the stack (their addresses are taken) and creates
DWARF entries like this for them:

     <2><a8>: Abbrev Number: 8 (DW_TAG_variable)
        <a9>   DW_AT_name        : a
        <ab>   DW_AT_decl_file   : 1
        <ac>   DW_AT_decl_line   : 4
        <ad>   DW_AT_type        : <0x64>
        <b1>   DW_AT_location    : 2 byte block: 91 64 (DW_OP_fbreg: -28)

That is, the actual values of the variables are supposed to reside in
fixed stack slots throughout the whole function. But in fact, by the
time the breakpoint is hit (the last line of the function, evaluation
of the first argument for the call), none of the values has been stored
on the stack:

    Dump of assembler code for function foo:
       0x004004c0 <+0>:   mov    %esi,%ecx
       0x004004c2 <+2>:   add    %edi,%esi
       0x004004c4 <+4>:   sub    $0x18,%rsp
       0x004004c8 <+8>:   imul   %edi,%esi
       0x004004cb <+11>:  mov    %edi,%eax
    => 0x004004cd <+13>:  lea    0xc(%rsp),%rdx
       0x004004d2 <+18>:  shl    %cl,%eax
       0x004004d4 <+20>:  mov    %eax,0x4(%rsp)
       0x004004d8 <+24>:  mov    %esi,0x8(%rsp)
       0x004004dc <+28>:  add    %edi,%esi
       0x004004de <+30>:  lea    0x4(%rsp),%rdi
       0x004004e3 <+35>:  mov    %esi,0xc(%rsp)
       0x004004e7 <+39>:  lea    0x8(%rsp),%rsi
       0x004004ec <+44>:  callq  0x4004f6 <bar>
       0x004004f1 <+49>:  add    $0x18,%rsp
       0x004004f5 <+53>:  retq
    End of assembler dump.

By contrast, if I didn't take the addresses of these variables, they
would be reported as optimized out until their correct values could be
determined. I believe that such behavior is much better than displaying
wrong values for variables.

Also, GCC seems to be able to produce DW_OP_GNU_uninit operations to
mark uninitialized variables (when -fvar-tracking-uninit is used), but
I've never seen it in generated debug info.

At first glance the problem seems to have something to do with the fact
that there are no gimple_debug/BIND and debug_insn/var_location
instructions for such stack variables in the internal representation.
There are only statements like this:

    gimple_assign <lshift_expr, a.0_3, i1_1(D), i2_2(D), NULL>
    # .MEM_5 = VDEF <.MEM_4(D)>
    gimple_assign <ssa_name, a, a.0_3, NULL, NULL>

and *no* BIND statement like

    gimple_debug BIND <a, a.0_3>

Furthermore, in dwarf2out.c there are a lot of calls to
mem_loc_descriptor where var_init_status is unconditionally set to
VAR_INIT_STATUS_INITIALIZED, so it looks like all values in memory are
currently considered to be initialized by default.
NOTE_VAR_LOCATION_STATUS is taken into account in only a few places
(and in any case there are no such notes for variables a, b, c here).

So, here are my questions:

1) Wouldn't it be nice if GCC marked such stack variables either as
   uninitialized (with DW_OP_GNU_uninit) or as optimized out?

2) Is it feasible to implement such tracking for variables?

3) What exactly is GCC supposed to track with the
   -fvar-tracking-uninit option?

Thanks,
Nikolai
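For readers unfamiliar with the "2 byte block: 91 64" notation in the DWARF entry above: 0x91 is the DW_OP_fbreg opcode, and its operand is a signed LEB128 number. The following C sketch decodes that operand and shows how 0x64 comes out as -28. The helper name decode_sleb128 is hypothetical, and the sketch trusts its input (it does not bound-check the buffer), so it is only suitable for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decodes one signed LEB128 value starting at p. If len is non-NULL,
 * the number of bytes consumed is stored there. No bounds checking:
 * the caller must pass a well-formed encoding.
 */
static int64_t decode_sleb128(const uint8_t *p, size_t *len)
{
    uint64_t result = 0;
    unsigned shift = 0;
    size_t i = 0;
    uint8_t byte;

    assert(p != NULL);
    do {
        byte = p[i++];
        if (shift < 64)
            result |= (uint64_t)(byte & 0x7f) << shift;  /* low 7 bits */
        shift += 7;
    } while (byte & 0x80);           /* high bit set: more bytes follow */

    /* Sign-extend if bit 6 of the last byte is set. */
    if (shift < 64 && (byte & 0x40))
        result |= ~(uint64_t)0 << shift;

    if (len)
        *len = i;
    return (int64_t)result;
}
```

For the single byte 0x64 the low seven bits give 100, and since bit 6 is set the value is sign-extended to 100 - 128 = -28, matching the "DW_OP_fbreg: -28" shown in the dump.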
Re: [DWARF] Tracking uninitialized variables
On 07/17/2015 03:43 AM, Nikolai Bozhenov wrote:
> Hello!
>
> It is certainly true that debugging an optimized code is an
> inherently difficult task. Though, I wonder if the compiler could
> make such debugging experience slightly less surprising. Consider
> the following example:
>
>  1 extern void bar(int *i1, int *i2, int *i3);
>  2
>  3 int __attribute__((noinline)) foo(int i1, int i2) {
>  4   int a, b, c;
>  5   a = i1 << i2;
>  6   b = (i1 + i2) * i1;
>  7   c = (b + i1);
>  8   bar(&a, &b, &c);
>  9 }
> 10
> 11 int main() {
> 12   foo(42, 12);
> 13 }
>
> Let's compile it:
>
> $ gcc-trunk tst.c -g -fvar-tracking-uninit -O2
>
> After hitting a breakpoint at line 8 (the last line of the function
> foo) I have some random (and very confusing) values displayed in gdb
> for all three variables a, b and c. This is because GCC allocates
> these three variables on the stack (their addresses are taken) and
> creates for them DWARF entries like this:
>
>  <2><a8>: Abbrev Number: 8 (DW_TAG_variable)
>     <a9>   DW_AT_name        : a
>     <ab>   DW_AT_decl_file   : 1
>     <ac>   DW_AT_decl_line   : 4
>     <ad>   DW_AT_type        : <0x64>
>     <b1>   DW_AT_location    : 2 byte block: 91 64 (DW_OP_fbreg: -28)

This (incorrectly) says that the variable is at the specified location
for the entire scope of the function. This should be a location list,
which specifies the live range for the variable. At the breakpoint,
this location list would/should have no location for the variable.

A related issue is where the breakpoint is taken. GCC sets breakpoints
at the first instruction generated for a statement, which in this case,
appears to be before any of the arguments to bar are evaluated. A
possibly better location would be after arguments are evaluated, before
the call is executed.

> That is, actual values for variables are supposed to reside in fixed
> stack slots throughout the whole function. But in fact, by the time
> the breakpoint is hit none of the values is stored on the stack (the
> last line of the function, evaluation of the first argument for the
> call):
>
> Dump of assembler code for function foo:
>    0x004004c0 <+0>:   mov    %esi,%ecx
>    0x004004c2 <+2>:   add    %edi,%esi
>    0x004004c4 <+4>:   sub    $0x18,%rsp
>    0x004004c8 <+8>:   imul   %edi,%esi
>    0x004004cb <+11>:  mov    %edi,%eax
> => 0x004004cd <+13>:  lea    0xc(%rsp),%rdx
>    0x004004d2 <+18>:  shl    %cl,%eax
>    0x004004d4 <+20>:  mov    %eax,0x4(%rsp)
>    0x004004d8 <+24>:  mov    %esi,0x8(%rsp)
>    0x004004dc <+28>:  add    %edi,%esi
>    0x004004de <+30>:  lea    0x4(%rsp),%rdi
>    0x004004e3 <+35>:  mov    %esi,0xc(%rsp)
>    0x004004e7 <+39>:  lea    0x8(%rsp),%rsi
>    0x004004ec <+44>:  callq  0x4004f6 <bar>
>    0x004004f1 <+49>:  add    $0x18,%rsp
>    0x004004f5 <+53>:  retq
> End of assembler dump.
>
> By contrast, If I didn't take addresses of these variables, they
> would be optimized out until we can determine their correct values.
> I believe that such behavior is much better than displaying wrong
> values for variables.
>
> Also, GCC seems to be able to produce DW_OP_GNU_uninit operations to
> mark uninitialized variables (when -fvar-tracking-uninit is used) but
> I've never seen it in generated debug info.
>
> At first glance the problem seems to have something to do with the
> fact that there's no gimple_debug/BIND and debug_insn/var_location
> instructions for such stack variables in internal representation.
> There are only statements like this:
>
>   gimple_assign <lshift_expr, a.0_3, i1_1(D), i2_2(D), NULL>
>   # .MEM_5 = VDEF <.MEM_4(D)>
>   gimple_assign <ssa_name, a, a.0_3, NULL, NULL>
>
> and *no* BIND statement like
>
>   gimple_debug BIND <a, a.0_3>
>
> Furthermore, in dwarf2out.c there are a lot of calls to
> mem_loc_descriptor where var_init_status is unconditionally set to
> VAR_INIT_STATUS_INITIALIZED. And it looks like currently all values
> in memory are considered to be initialized by default. And only in
> few places NOTE_VAR_LOCATION_STATUS is taken into account (anyway,
> there's no such notes for variables a, b, c in this case).
>
> So, here are my questions:
>
> 1) Wouldn't it be nice if GCC marked such stack variables either as
>    uninitialized (with DW_OP_GNU_uninit) or as optimized out?
>
> 2) Is it feasible to implement such tracking for variables?
>
> 3) What exactly is GCC supposed to track with -fvar-tracking-uninit
>    option?
>
> Thanks,
> Nikolai

-- 
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077