On Tue, 2019-12-10 at 16:00 -0700, Martin Sebor wrote: > On 12/10/19 3:07 PM, David Malcolm wrote: > > On Tue, 2019-12-03 at 15:41 -0700, Martin Sebor wrote: > > > After allocating a new chunk of memory hash_table::expand() copy- > > > assigns elements from the current array over the newly allocated > > > elements without constructing them. > > > > > > Similarly, after destroying existing elements, hash_table:: > > > empty_slow() assigns a new value to them. This bug was > > > introduced in r249234 when trying to deal with -Wclass-memaccess > > > instances when the warning was first added. > > > > > > Neither of these is a problem for PODs but both cause trouble > > > when > > > the hash_table contains elements of a type with a user-defined > > > copy > > > assignment operator. There are at least two such instances in > > > GCC > > > already and a third is under review. > > > > > > The attached patch avoids this by using placement new to > > > construct > > > new elements in uninitialized storage and restoring the original > > > call to memset in hash_table::empty_slow(), analogously to what > > > was done in r272893 for PR90923, > > > > > > Longer term, to make these templates safely and efficiently > > > usable > > > with non-POD types with user-defined copy ctors and copy > > > assignment > > > operators I think these classes will probably need to be enhanced > > > to make use of "assign" and "move" traits functions to > > > efficiently > > > assign and move objects. > > > > > > Martin > > > diff --git a/gcc/hash-table.h b/gcc/hash-table.h > > > index ba5d64fb7a3..26bac624a08 100644 > > > --- a/gcc/hash-table.h > > > +++ b/gcc/hash-table.h > > > @@ -818,8 +818,7 @@ hash_table<Descriptor, Lazy, > > > Allocator>::expand () > > > if (!is_empty (x) && !is_deleted (x)) > > > { > > > value_type *q = find_empty_slot_for_expand > > > (Descriptor::hash (x)); > > > - > > > - *q = x; > > > + new ((void*) q) value_type (x); > > > } > > > > > > p++; > > > @@ -869,14 +868,8 @@ hash_table<Descriptor, Lazy, > > > Allocator>::empty_slow () > > > m_size_prime_index = nindex; > > > } > > > else > > > - { > > > -#ifndef BROKEN_VALUE_INITIALIZATION > > > - for ( ; size; ++entries, --size) > > > - *entries = value_type (); > > > -#else > > > - memset (entries, 0, size * sizeof (value_type)); > > > -#endif > > > - } > > > + memset ((void *) entries, 0, size * sizeof (value_type)); > > > + > > > m_n_deleted = 0; > > > m_n_elements = 0; > > > } > > > > On attempting to rebase my analyzer branch I found that this patch > > uncovered a bug in it, but also possibly a bug in hash-table.h. > > > > In the analyzer branch I have a hash_map with a key where the > > "empty" > > value isn't all-zero-bits. > > There's a test case for it in comment #3 in 92762.
Thanks. I've adapted that into a selftest in this patch. > > Specifically, in gcc/analyzer/program-state.h: sm_state_map has a > > hash_map <svalue_id, entry_t> map_t, where svalue_id, the key type, > > has > > hash traits: > > > > template <> > > inline void > > pod_hash_traits<svalue_id>::mark_empty (value_type &v) > > { > > v = svalue_id::null (); > > } > > > > which is a -1 int (all ones). > > > > memset to zero bits populates the "empty" slots with a key with a > > non- > > empty value for this key type, effectively corrupting the data > > structure (luckily a selftest caught this). > > > > Looking at the above patch, it looks like I was unwittingly relying > > on > > two things: > > (a) #ifndef BROKEN_VALUE_INITIALIZATION, and > > (b) that the default ctor for my key type was the "empty" value. > > > > However, shouldn't empty_slow be calling the Descriptor::mark_empty > > function? Doesn't this memset code assume that the "empty" value > > of > > the hash_table entry is all-zeroes (and thus imposing the same > > assumption on all hash_maps' key and value types?) - which isn't > > the > > case for my code. I don't remember seeing that assumption > > documented. Correcting myself, I *think* this would impose the assumption that all hash_maps' keys types "empty" value is all-zeroes; I don't think that holds for the value types though. > > > > Or am I misreading this? > > IIRC, I had tried the below but it caused problems during bootstrap > that I didn't spend enough time investigating. > > for (size_t i = size - 1; i < size; i--) > if (!is_empty (entries[i]) && !is_deleted (entries[i])) > { > Descriptor::remove (entries[i]); > mark_empty (entries[i]); > } > > (I think it also caused compilation error in one of the IPA passes > because of a missing mark_empty function in its traits class; maybe > ipa_bit_ggc_hash_traits in ipa-prop.c). > > To be honest, I'm not sure I understand why the memset is even there. > It seems like a leak to me (I can reproduce leaks with a user-defined > type). I was going to get back to it at some point. > > Martin Thanks for all the info. Considering the change you tried to hash_table::empty_slow, AIUI, an entry in a hash_table can be in one of three states: (a) "empty" (b) "deleted" (c) a properly constructed object If I'm reading the change you tried above correctly, it's only marking as empty the entries that are in state (c); it's leaving "deleted" entries untouched. Later in empty_slow, control flow splits into resize vs non-resize cases. In the resize case, alloc_entries is called, and alloc_entries has this loop: for (size_t i = 0; i < n; i++) mark_empty (nentries[i]); thus ensuring that every entry is in the "empty" state. However, for the non-resize case, we now currently reach the code touched in r279139 (the above patch): memset (entries, 0, size * sizeof (value_type)); Hence if a hash_map is emptied without being resized, we end up with either entries still being marked as deleted despite m_n_deleted == 0 (the fix you tried above), or all the entries being zero-filled and thus effectively things if zero isn't an "empty" state (current trunk). The following patch updates hash_table::empty_slow so that the non-resize case uses the same loop as the resize case, thus marking all entries as empty, without assuming that in must be all-zeroes. The patch is actually on top of: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01514.html which merely adds more selftests to hash-map-tests.c and which Jeff already approved (the hash_map in that new test used an EMPTY value of -1, but the various ops happened not to call empty_slow). Hopefully I'm understanding things correctly. Is it OK for a hash_map key to have a "empty" value that isn't all-zeroes (and thus the same for a hash_table entry)? Is the following patch OK for trunk? Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu (albeit in conjuction with the rest of the analyzer patch kit, which it fixes; I'll retest cleanly before committing if approved). Thanks Dave gcc/ChangeLog: * hash-map-tests.c (selftest::test_nonzero_empty_key): New selftest. (selftest::hash_map_tests_c_tests): Call it. * hash-table.h (hash_table<Descriptor, Lazy, Allocator>::empty_slow): When not resizing the entries buffer, replace the memset to zero with a loop that calls mark_empty on all elements. --- gcc/hash-map-tests.c | 22 ++++++++++++++++++++++ gcc/hash-table.h | 5 ++++- 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/gcc/hash-map-tests.c b/gcc/hash-map-tests.c index 9a13a80442c8..9a92425df5db 100644 --- a/gcc/hash-map-tests.c +++ b/gcc/hash-map-tests.c @@ -278,6 +278,27 @@ test_map_of_type_with_ctor_and_dtor () } } +/* Test calling empty on a hash_map that has a key type with non-zero + "empty" value. */ + +static void +test_nonzero_empty_key () +{ + typedef int_hash<int, INT_MIN, INT_MAX> IntHash; + hash_map<int, int, simple_hashmap_traits<IntHash, int> > x; + + for (int i = 1; i != 32; ++i) + x.put (i, i); + + ASSERT_EQ (x.get (0), NULL); + ASSERT_EQ (*x.get (1), 1); + + x.empty (); + + ASSERT_EQ (x.get (0), NULL); + ASSERT_EQ (x.get (1), NULL); +} + /* Run all of the selftests within this file. */ void @@ -286,6 +307,7 @@ hash_map_tests_c_tests () test_map_of_strings_to_int (); test_map_of_int_to_strings (); test_map_of_type_with_ctor_and_dtor (); + test_nonzero_empty_key (); } } // namespace selftest diff --git a/gcc/hash-table.h b/gcc/hash-table.h index 26bac624a082..126bd6d22ad5 100644 --- a/gcc/hash-table.h +++ b/gcc/hash-table.h @@ -868,7 +868,10 @@ hash_table<Descriptor, Lazy, Allocator>::empty_slow () m_size_prime_index = nindex; } else - memset ((void *) entries, 0, size * sizeof (value_type)); + { + for (size_t i = 0; i < size; i++) + mark_empty (entries[i]); + } m_n_deleted = 0; m_n_elements = 0; -- 2.21.0