Eric Blake wrote:
> + /* Guarantee that once we start moving entries into the new table,
> + we will not need any further memory allocations. We only need
> + new allocations if the hasher consolidates multiple buckets from
> + the old size into one bucket with the new size (a sign of a bad
> + hasher, but we must deal with it). Therefore, we need merely
> + check that there are at least table->n_buckets_used-1 entries on
> + the free entry list for the worst case collision (everything gets
> + combined into one bucket during the rehash). */
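For reference, the quoted precondition amounts to a check like the following. This is a minimal sketch; the struct and function names are invented stand-ins, not gnulib's actual hash.c internals:

```c
#include <stdbool.h>
#include <stddef.h>

/* Invented stand-in for gnulib's overflow-entry type.  */
struct entry { void *data; struct entry *next; };

static size_t
free_list_length (struct entry const *free_entry_list)
{
  size_t n = 0;
  for (struct entry const *e = free_entry_list; e; e = e->next)
    n++;
  return n;
}

/* The quoted worst case: every one of the n_buckets_used old buckets
   rehashes into a single new bucket, so each entry except the one
   bucket head needs an overflow cell from the free list.  */
static bool
rehash_cannot_fail (size_t n_buckets_used,
                    struct entry const *free_entry_list)
{
  return free_list_length (free_entry_list) >= n_buckets_used - 1;
}
```

The objection below is that this reserves for the absolute worst case, which a careful migration order can almost always avoid.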
That strikes me as pessimistic.
Imagine that we have a full table, with n_buckets_used == 20
and n_buckets == 23.
When rehashing to, say, 31, odds are very good that we can
get by with *no* entries on the free list -- if we're careful.
So requiring 19 spare entries is a recipe for unnecessary failure.
Right after your initial report, I began rewriting hash_rehash
to work as follows:
allocate new table
[ optimization to minimize possibility of unnecessary failure:
in the old table,
liberate any overflow (chained) entries by moving "data"
into empty buckets ]
iterate through old table members, hashing them into the new table,
[Possible optimization: process all overflow entries before
any single-entry bucket, but that means two passes.
Alternatively: one pass through the buckets, but for chains of
length > 1, process the heap-allocated overflow entries before
the first (non-allocated) one, since only that last one might
require an allocation in the new table. ]
If, in spite of all that, an allocation is still required, restore
to the original table any entries that we've moved into the new
one, free the new table, and return false. However, restoring
them may be tricky, so we'd have to be careful there, too.
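The per-chain ordering above can be sketched as follows. This is a toy model with invented names, not gnulib's actual hash.c: the key point is that each heap-allocated overflow cell can serve as its own spare when relinked into the new table, so only a bucket head's data can ever demand a fresh allocation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a bucket head is embedded in the bucket array (never
   heap-allocated); overflow cells are heap-allocated and chained.  */
struct cell
{
  void *data;               /* NULL: this head is unused */
  struct cell *next;        /* overflow chain */
};

struct table
{
  struct cell *bucket;      /* array of N heads */
  size_t n;
};

/* Store DATA in bucket SLOT of TAB.  SPARE, if non-NULL, is an
   already-allocated cell we may relink instead of allocating.
   Returns false only when an overflow cell is needed but no SPARE
   is available, i.e. when a real allocation would be required.  */
static bool
put (struct table *tab, size_t slot, void *data, struct cell *spare)
{
  struct cell *head = &tab->bucket[slot];
  if (head->data == NULL)
    {
      head->data = data;    /* empty head: no cell needed; an unused
                               SPARE would go back on the free list
                               in real code */
      return true;
    }
  if (spare == NULL)
    return false;           /* would have to allocate */
  spare->data = data;
  spare->next = head->next; /* relink the reused cell */
  head->next = spare;
  return true;
}

/* Migrate one old chain into NEW_TAB: overflow cells first, since
   each is its own spare and thus cannot fail; the head's data goes
   last and is the only step that can report failure.  */
static bool
migrate_chain (struct table *new_tab, struct cell *old_head,
               size_t (*hasher) (void const *, size_t))
{
  for (struct cell *c = old_head->next; c; )
    {
      struct cell *next = c->next;
      bool ok = put (new_tab, hasher (c->data, new_tab->n), c->data, c);
      assert (ok);          /* SPARE is non-NULL, so PUT cannot fail */
      c = next;
    }
  old_head->next = NULL;
  return put (new_tab, hasher (old_head->data, new_tab->n),
              old_head->data, NULL);
}

/* Sample hasher for illustration: value modulo table size.  */
static size_t
mod_hash (void const *p, size_t n)
{
  return *(int const *) p % n;
}
```

Note how only the final put can return false; the overflow cells are relinked in place rather than freed and reallocated, which is also where the footprint saving mentioned below comes from.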
This approach has the added advantage of decreasing the memory
footprint slightly, since we're more likely to reuse existing
allocated entries than to allocate new ones that would eventually
end up on the free list.