Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Andrew Stitcher Tue, 28 Apr 2015 10:04:39 -0700


> On April 28, 2015, 4:42 p.m., Andrew Stitcher wrote:
> > I can't see anything obviously wrong with this code, but that doesn't mean 
> > too much - so I agree some unit tests are pretty essential.
> > 
> > Incidentally, I think the strategy I would have used would be to mark the 
> > entry deleted, and then only actually discard it when and if the table gets 
> > resized, but it sounds like you maty have tried that approach. And it 
> > certainly has the problem of expanding the table perhaps unnecessarily.
> 
> Gordon Sim wrote:
>     I initially tried introducing an extra 'deleted' state, but still 
> rehashed chains (rather arbitrarily if they ever contained 2 or more 
> 'deleted' entries). Something like that is I think workable, but is a more 
> complex change. (I'll revisit, but I'd like to get a simpler fix first).
>     
>     If you don't reuse or free deleted entries then the chains can get longer 
> which can then (in theory) impact lookup. If you only rehash when expanding 
> then I think this is exarcerbated as the length of chains increases with 
> increased capacity.


Well you can reuse the deleted entries as if they were empty and that won't 
affect the chains. But you are correct that leaving them in the chains without 
reinserting the following entries will make the chains longer and longer thus 
impacting lookup and insertion time.

In any case the whole coalesced strategy is not going to work well for a hash 
table that has a lot of deletes compared to inserts and lookups. I think this 
is generally true as a comparison of the efficiency of hash tables against some 
form of tree structure.


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33623/#review81828
-----------------------------------------------------------


On April 28, 2015, 12:50 p.m., Gordon Sim wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33623/
> -----------------------------------------------------------
> 
> (Updated April 28, 2015, 12:50 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Andrew Stitcher, and Rafael Schloming.
> 
> 
> Bugs: PROTON-858
>     https://issues.apache.org/jira/browse/PROTON-858
> 
> 
> Repository: qpid-proton-git
> 
> 
> Description
> -------
> 
> The internal map implementation for proton-c uses coalesced hashing, but the 
> current deletion algorithm does not take account of this. On deletion of 
> entries, the map can lose its ability to locate other entries by key.
> 
> This patch is the simplest possible fix. It frees the deleted entry, makes 
> any previous entry that points to this tha tail of its chain, and then 
> relocates any entries in a chain following the deleted entry. Though I 
> believe this to be correct and safe, it does involve more work on some 
> deletions. The approach can be optimised further, either be reducing the 
> amount of chain that needs relocated, or by marking records as deleted in the 
> first instance and only rehashing later on some criteria.
> 
> (I started off trying to do the latter, but it is more complicated and I 
> still haven't got my patch fully correct. When I get that working I'll post 
> that also. It is a more involved change though, so perhaps we may want to 
> consider this simpler approach for 0.9.1).
> 
> 
> Diffs
> -----
> 
>   proton-c/src/object/map.c 1a758a1 
> 
> Diff: https://reviews.apache.org/r/33623/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Gordon Sim
> 
>

Re: Review Request 33623: Ensure deletion doesn't break coalesced hashing

Reply via email to