[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #16 from bergner at gcc dot gnu dot org 2008-01-17 03:22 --- Subject: Bug 33796 Author: bergner Date: Thu Jan 17 03:21:36 2008 New Revision: 131589 URL: http://gcc.gnu.org/viewcvs?root=gccview=revrev=131589 Log: PR rtl-optimization/33796 * sparseset.c (sparseset_alloc): Use xcalloc rather than xmalloc. Modified: trunk/gcc/ChangeLog trunk/gcc/sparseset.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #17 from bergner at gcc dot gnu dot org 2008-01-17 03:28 --- Ok, after talking with Kenny offline, we agreed to commit Janis' patch that changed xmalloc to xcalloc. As the comment I added to the changes says, if future users of this code see a performance issue, we can revisit this change. -- bergner at gcc dot gnu dot org changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution||FIXED http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #15 from bonzini at gnu dot org 2008-01-13 12:02 --- Considering that we allocate exactly 1 sparseset per function, it cannot change the performance in any way to initialize the memory, so maybe we do want to go for that. But it is just stupid IMHO if the beauty of the data structure is that it allows O(1) creation. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #9 from zadeck at naturalbridge dot com 2008-01-12 16:34 --- i personally think that this patch in #8 is not the right way to go. unless there is a compelling argument that initializing this is going to have some negative performance effect, we should properly initialize this datastructure as we do (or at least try to do) for every other data structure in the compiler. While the current code that peter has written is correct, it is quite tricky. This compiler has way too much tricky code, which only adds to the long term maintainance cost of the compiler. the patch in 8 is not acceptable because it requires that the compiler be built in a particular way to avoid the valgrind errors. The last thing that you want to have to do if you have a heisenbug is to rebuild your compiler in a way that will change the layout and alignment of things. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #10 from bonzini at gnu dot org 2008-01-12 16:52 --- an alternative is to prepare a suppression file for valgrind, and distribute it with gcc. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #11 from zadeck at naturalbridge dot com 2008-01-12 17:05 --- until someone has the slightest bit of evidence that initializing the datastructure is costly, this is just a waste of time. peter wrote the code this way to be cute, not because there was any reason to believe that getting a zerod data structure was going to be noticeable. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #12 from bonzini at gnu dot org 2008-01-12 21:09 --- well if you can enjoy O(n) initialization (and O(1) clearing as in Peter's code), you had better rewrite the code completely to query an item with one (not two) memory accesses. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #13 from dcb314 at hotmail dot com 2008-01-12 22:17 --- (In reply to comment #9) i personally think that this patch in #8 is not the right way to go. unless there is a compelling argument that initializing this is going to have some negative performance effect, we should properly initialize this datastructure as we do (or at least try to do) for every other data structure in the compiler. While the current code that peter has written is correct, it is quite tricky. This compiler has way too much tricky code, which only adds to the long term maintainance cost of the compiler. Well said Sir. If the tricky code bought a 10% speedup, I'd advocate it. No one seems to have measured and documented how much it really saves. Unmeasured claims really don't count. Machines double in performance every few years, hence the prudent engineer codes in the default case for clarity, not performance. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #14 from hp at gcc dot gnu dot org 2008-01-12 23:05 --- (In reply to comment #9) i personally think that this patch in #8 is not the right way to go. unless there is a compelling argument that initializing this is going to have some negative performance effect, we should properly initialize this datastructure as we do (or at least try to do) for every other data structure in the compiler. While the current code that peter has written is correct, it is quite tricky. This compiler has way too much tricky code, which only adds to the long term maintainance cost of the compiler. the patch in 8 is not acceptable because it requires that the compiler be built in a particular way to avoid the valgrind errors. The last thing that you want to have to do if you have a heisenbug is to rebuild your compiler in a way that will change the layout and alignment of things. You know, that patch *enables*, it does not *require*. Nothing forces it to be the end of the story regarding this code, or that this PR be closed with it applied. It doesn't stop anyone (like you) from actually changing the way that code looks. It does however, enable people to build and test using --enable-checking=valgrind, like it could before (making the current behavior a regression). Other people, in particular, write-privilege maintainers of the code in question, have expressed sentiments that the patch was the way to go. You want to *stop* the patch because it doesn't *help* with an odd special case? Well, it helps with other debug cases, so let's have progress, and not have the perfect stand in way of gradual improvement. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #8 from bonzini at gnu dot org 2007-10-31 13:21 --- Reopening and marking as enhancement. A patch like this should work: Index: sparseset.c === --- sparseset.c (revision 129768) +++ sparseset.c (working copy) @@ -21,6 +21,18 @@ along with GCC; see the file COPYING3. #include libiberty.h #include sparseset.h +#ifdef ENABLE_VALGRIND_CHECKING +# ifdef HAVE_VALGRIND_MEMCHECK_H +# include valgrind/memcheck.h +# elif defined HAVE_MEMCHECK_H +# include memcheck.h +# else +# include valgrind.h +# endif +#else +/* Avoid #ifdef:s when we can help it. */ +#define VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE(x, y) +#endif /* Allocate and clear a n_elms SparseSet. */ @@ -33,6 +45,8 @@ sparseset_alloc (SPARSESET_ELT_TYPE n_el sparseset set = (sparseset) xmalloc (n_bytes); set-dense = (set-elms[0]); set-sparse = (set-elms[n_elms]); + VALGRIND_MAKE_MEM_DEFINED_IF_ADDRESSABLE +(set-sparse, n_elms * sizeof (SPARSESET_ELT_TYPE)); set-size = n_elms; sparseset_clear (set); return set; but only if the compiler is compiled with --enable-checking=valgrind or --enable-checking=yes,valgrind. There are data structures that *rely* on safe uninitialized data reads for speed, this is one of them. -- bonzini at gnu dot org changed: What|Removed |Added Severity|normal |enhancement Status|RESOLVED|REOPENED GCC target triplet|x86_64-linux-gnu|{i86,x86_64}-linux-gnu Resolution|INVALID | http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #7 from bergner at gcc dot gnu dot org 2007-10-25 18:46 --- So yes we do some uninitialized accesses to the sparse array, but that's ok. So absolutely *any* value is fine ? Yes, absolutely *any* value is fine. If you look at the code, you'll see that the result of the uninitialized read is used to index into the dense array, but only after guaranteeing that it is within bounds of the dense array. Therefore, we're not just widely accessing random memory locations. Aren't there some machines where unaligned accesses to words fail ? These are uninitialized reads, not unaligned reads. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #6 from dcb314 at hotmail dot com 2007-10-18 17:33 --- (In reply to comment #2) Although valgrind is correct that we are doing an uninitialized read, the code is actually working as designed and is correct. I wish I had a pound for every time I've heard that ;- So yes we do some uninitialized accesses to the sparse array, but that's ok. Sure ? So absolutely *any* value is fine ? Aren't there some machines where unaligned accesses to words fail ? It's also a benefit of sparseset, given that we don't have to memset/clear the whole sparseset data structure before using it, so it's fast. I'd appreciate seeing some performance measurement numbers which tell us just how fast, so we can make a comparison of just how much speed this tricky code is buying us. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #3 from zadeck at naturalbridge dot com 2007-10-17 11:25 --- Subject: Re: valgrind error with -O2 for linux kernel code bergner at gcc dot gnu dot org wrote: --- Comment #2 from bergner at gcc dot gnu dot org 2007-10-17 04:46 --- Although valgrind is correct that we are doing an uninitialized read, the code is actually working as designed and is correct. When we allocate a sparseset, we only need to set set-members to 0 to clear the set. The arrays set-sparse[] and set-dense[] are not and do not need to be initialized. To test a value n for membership in set, it needs to statisfy two properties: set-sparse[n] set-members and set-dense[set-sparse[n]] == n The uninitialized read occurs when n is not (and never has been) a member of set. In this case, set-sparse[n] will be uninitialized and could be any value. If set-sparse[n] happens to be = set-members, we luckily (but correctly) return that n is not a member of the set. If the uninitialized set-sparse[n] is set-members, we continue on to verify that set-dense[set-sparse[n]] == n. This test cannot be true since all set-dense[i] entries for i set-members are initialized and n is not a member of the set. So yes we do some uninitialized accesses to the sparse array, but that's ok. It's also a benefit of sparseset, given that we don't have to memset/clear the whole sparseset data structure before using it, so it's fast. peter, i think that this is clever and nice but it is not going to fly. people will be running valgrind and this will hit them over and over again. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #4 from bergner at gcc dot gnu dot org 2007-10-17 14:21 --- People using valgrind already have to deal with false positives and actual uninitialized uses (like this one) that are harmless. If you look at your valgrind install, you'll see that there are several error suppression files installed with valgrind: [EMAIL PROTECTED]:~/gcc/bugs rpm -ql valgrind | grep '.supp$' /usr/lib64/valgrind/default.supp /usr/lib64/valgrind/glibc-2.2.supp /usr/lib64/valgrind/glibc-2.3.supp /usr/lib64/valgrind/glibc-2.4.supp /usr/lib64/valgrind/glibc-2.5.supp /usr/lib64/valgrind/xfree-3.supp /usr/lib64/valgrind/xfree-4.supp To suppress this particular error/warning, you just need to create another suppression file for this error with the contents below and pass --suppressions=/path/to/sparseset.supp to valgrind and all should be well. [EMAIL PROTECTED]:~/gcc/bugs cat sparseset.supp { SparseSet-Cond-0 Memcheck:Cond fun:global_conflicts fun:rest_of_handle_global_alloc fun:execute_one_pass fun:execute_pass_list fun:execute_pass_list fun:tree_rest_of_compilation fun:cgraph_expand_function fun:cgraph_optimize fun:c_write_global_declarations fun:toplev_main fun:main } { SparseSet-Value4-0 Memcheck:Value4 fun:global_conflicts fun:rest_of_handle_global_alloc fun:execute_one_pass fun:execute_pass_list fun:execute_pass_list fun:tree_rest_of_compilation fun:cgraph_expand_function fun:cgraph_optimize fun:c_write_global_declarations fun:toplev_main fun:main } -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #5 from wilson at specifix dot com 2007-10-17 20:53 --- Subject: Re: valgrind error with -O2 for linux kernel code bergner at gcc dot gnu dot org wrote: --- Comment #2 from bergner at gcc dot gnu dot org 2007-10-17 04:46 --- Although valgrind is correct that we are doing an uninitialized read, the code is actually working as designed and is correct. A comment in the code mentioning that the uninit reads are intentional would be useful. Otherwise, you will keep getting this same question, and have to keep answering it. Also, it might be a good idea to make sure that the sparse field is marked as volatile, to ensure that reads are never optimized away. As the gcc optimizer gets better, e.g. LTO and aggressive inlining, it might be possible to end up with a case where gcc can prove that it has an uninit read of this field and optimize it away. This will result in a use of an uninit register. On IA-64, an uninit register may have the NaT (Not a Thing) bit set which is used for speculation. Hence use of an uninit register may cause an unexpected speculation failure, which will cause a program to crash. We can avoid this chain of events by making this a volatile field, which will make it impossible for gcc to optimize away reads from this field, even if uninitialized. It also serves as a clue to other people reading the code that this field is special. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
-- bergner at gcc dot gnu dot org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever Confirmed|0 |1 Last reconfirmed|-00-00 00:00:00 |2007-10-17 04:01:49 date|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796
[Bug rtl-optimization/33796] valgrind error with -O2 for linux kernel code
--- Comment #2 from bergner at gcc dot gnu dot org 2007-10-17 04:46 --- Although valgrind is correct that we are doing an uninitialized read, the code is actually working as designed and is correct. When we allocate a sparseset, we only need to set set-members to 0 to clear the set. The arrays set-sparse[] and set-dense[] are not and do not need to be initialized. To test a value n for membership in set, it needs to statisfy two properties: set-sparse[n] set-members and set-dense[set-sparse[n]] == n The uninitialized read occurs when n is not (and never has been) a member of set. In this case, set-sparse[n] will be uninitialized and could be any value. If set-sparse[n] happens to be = set-members, we luckily (but correctly) return that n is not a member of the set. If the uninitialized set-sparse[n] is set-members, we continue on to verify that set-dense[set-sparse[n]] == n. This test cannot be true since all set-dense[i] entries for i set-members are initialized and n is not a member of the set. So yes we do some uninitialized accesses to the sparse array, but that's ok. It's also a benefit of sparseset, given that we don't have to memset/clear the whole sparseset data structure before using it, so it's fast. -- bergner at gcc dot gnu dot org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||INVALID http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33796