On Wed, Mar 13, 2024 at 11:23 AM Peter Smith <smithpb2...@gmail.com> wrote: > > On Wed, Mar 13, 2024 at 12:48 PM Masahiko Sawada <sawada.m...@gmail.com> > wrote: > > > > On Wed, Mar 13, 2024 at 10:15 AM Peter Smith <smithpb2...@gmail.com> wrote: > > > > > > On Tue, Mar 12, 2024 at 4:23 PM Masahiko Sawada <sawada.m...@gmail.com> > > > wrote: > > > > > > > > On Fri, Mar 8, 2024 at 12:58 PM Peter Smith <smithpb2...@gmail.com> > > > > wrote: > > > > > > > > ... > > > > > > > 5. > > > > > > > + * > > > > > > > + * If 'indexed' is true, we create a hash table to track of each > > > > > > > node's > > > > > > > + * index in the heap, enabling to perform some operations such > > > > > > > as removing > > > > > > > + * the node from the heap. > > > > > > > */ > > > > > > > binaryheap * > > > > > > > -binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > > > void *arg) > > > > > > > +binaryheap_allocate(int capacity, binaryheap_comparator compare, > > > > > > > + bool indexed, void *arg) > > > > > > > > > > > > > > BEFORE > > > > > > > ... enabling to perform some operations such as removing the node > > > > > > > from the heap. > > > > > > > > > > > > > > SUGGESTION > > > > > > > ... to help make operations such as removing nodes more efficient. > > > > > > > > > > > > > > > > > > > But these operations literally require the indexed binary heap as we > > > > > > have an assertion: > > > > > > > > > > > > void > > > > > > binaryheap_remove_node_ptr(binaryheap *heap, bh_node_type d) > > > > > > { > > > > > > bh_nodeidx_entry *ent; > > > > > > > > > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > > > > > Assert(heap->bh_indexed); > > > > > > > > > > > > > > > > I didn’t quite understand -- the operations mentioned are "operations > > > > > such as removing the node", but binaryheap_remove_node() also removes > > > > > a node from the heap. So I still felt the comment wording of the patch > > > > > is not quite correct. > > > > > > > > Now I understand your point. That's a valid point. > > > > > > > > > > > > > > Now, if the removal of a node from an indexed heap can *only* be done > > > > > using binaryheap_remove_node_ptr() then: > > > > > - the other removal functions (binaryheap_remove_*) probably need some > > > > > comments to make sure nobody is tempted to call them directly for an > > > > > indexed heap. > > > > > - maybe some refactoring and assertions are needed to ensure those > > > > > *cannot* be called directly for an indexed heap. > > > > > > > > > > > > > If the 'index' is true, the caller can not only use the existing > > > > functions but also newly added functions such as > > > > binaryheap_remove_node_ptr() and binaryheap_update_up() etc. How about > > > > something like below? > > > > > > > > > > You said: "can not only use the existing functions but also..." > > > > > > Hmm. Is that right? IIUC those existing "remove" functions should NOT > > > be called directly if the heap was "indexed" because they'll delete > > > the node from the heap OK, but any corresponding index for that > > > deleted node will be left lying around -- i.e. everything gets out of > > > sync. This was the reason for my original concern. > > > > > > > All existing binaryheap functions should be available even if the > > binaryheap is 'indexed'. For instance, with the patch, > > binaryheap_remote_node() is: > > > > void > > binaryheap_remove_node(binaryheap *heap, int n) > > { > > int cmp; > > > > Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > > Assert(n >= 0 && n < heap->bh_size); > > > > /* compare last node to the one that is being removed */ > > cmp = heap->bh_compare(heap->bh_nodes[--heap->bh_size], > > heap->bh_nodes[n], > > heap->bh_arg); > > > > /* remove the last node, placing it in the vacated entry */ > > replace_node(heap, n, heap->bh_nodes[heap->bh_size]); > > > > /* sift as needed to preserve the heap property */ > > if (cmp > 0) > > sift_up(heap, n); > > else if (cmp < 0) > > sift_down(heap, n); > > } > > > > The replace_node(), sift_up() and sift_down() update node's index as > > well if the binaryheap is indexed. When deleting the node from the > > binaryheap, it will also delete its index from the hash table. > > > > I see now. Thanks for the information. > > ~~~ > > Some more review comments for v8-0002 > > ====== > > 1. > +/* > + * Remove the node's index from the hash table if the heap is indexed. > + */ > +static bool > +delete_nodeidx(binaryheap *heap, bh_node_type node) > +{ > + if (!binaryheap_indexed(heap)) > + return false; > + > + return bh_nodeidx_delete(heap->bh_nodeidx, node); > +} > > I wasn't sure if having this function was a good idea. Yes, it makes > code more readable, but I felt the heap code ought to be as efficient > as possible so maybe it is better for the index check to be done at > the caller, instead of incurring any overhead of function calls that > might do nothing. > > SUGGESTION > if (binaryheap_indexed(heap)) > found = bh_nodeidx_delete(heap->bh_nodeidx, node);
I think we can have the function inlined, instead of doing the same things in multiple places. I've changed it in the v9 patch. > ~~~ > > 2. > +/* > + * binaryheap_update_up > + * > + * Sift the given node up after the node's key is updated. The caller must > + * ensure that the given node is in the heap. O(log n) worst case. > + * > + * This function can be used only if the heap is indexed. > + */ > +void > +binaryheap_update_up(binaryheap *heap, bh_node_type d) > +{ > + bh_nodeidx_entry *ent; > + > + Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > + Assert(binaryheap_indexed(heap)); > + > + ent = bh_nodeidx_lookup(heap->bh_nodeidx, d); > + Assert(ent); > + Assert(ent->index >= 0 && ent->index < heap->bh_size); > + > + sift_up(heap, ent->index); > +} > + > +/* > + * binaryheap_update_down > + * > + * Sift the given node down after the node's key is updated. The caller must > + * ensure that the given node is in the heap. O(log n) worst case. > + * > + * This function can be used only if the heap is indexed. > + */ > +void > +binaryheap_update_down(binaryheap *heap, bh_node_type d) > +{ > + bh_nodeidx_entry *ent; > + > + Assert(!binaryheap_empty(heap) && heap->bh_has_heap_property); > + Assert(binaryheap_indexed(heap)); > + > + ent = bh_nodeidx_lookup(heap->bh_nodeidx, d); > + Assert(ent); > + Assert(ent->index >= 0 && ent->index < heap->bh_size); > + > + sift_down(heap, ent->index); > +} > > Since those functions are almost identical, wouldn't it be better to > combine them, passing the sift direction? > > SUGGESTION > binaryheap_resift(binaryheap *heap, bh_node_type d, bool sift_dir_up) > { > ... > > if (sift_dir_up) > sift_up(heap, ent->index); > else > sift_down(heap, ent->index); > } I'm not really sure binaryheap_resift() is a better API than binaryheap_update_up() and _down(). Having different APIs for different behavior makes sense to me. On the other hand, I see your point that these two functions have duplicated codes, so I created a common function for them to remove the duplication. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com