All the time since https://list.orgmode.org/orgmode/ch3pr84mb3424dc5a13943a4a500fee3cc5...@ch3pr84mb3424.namprd84.prod.outlook.com/ where the clock sum code has been rewritten to use org-element API, I have been hitting numerous errors, ranging from AVL-tree not maintaining its ordering, to plain infinite loops. Even though it seems to be triggered by the patch, it is not the patch implementation that is introducing the bugs. The patch merely exposed problems in org-element.
The first is the known problem with org-element-cache-map going into infinite loop. The reproducer is known - https://orgmode.org/list/87seeoznr6.fsf@localhost, but it is not the first problem with that function. The underlying cause is complexity of managing constantly rebalancing AVL-tree and various edge cases in the org-element-cache-map code itself. It does not help that org-element-cache-map should be able to handle AVL tree being modified under the feet when the buffer changes, or we need to parse new items. The second is AVL-tree data structure itself. When used for clock sums, it tends to accumulate large number of CLOCK elements into cache - one after others. Each insertion into AVL tree is a log N operation. Since avl-tree.el does not have any locality optimizations, we end up with significant slowdowns. The third, is the ordering dance we have to do when updating the cache partially (`org-element-cache-get-key'). Normally, the cache is ordered by :begin values, but :begin values are not always correct while the cache is in the middle of updating as the user inserts/deletes text. So, Org has to roll out intermediate key that are only relevant during cache updates. That makes AVL tree rebalancing tricky and leads to subtle bugs. To properly fix the above, I am thinking to abandon AVL tree entirely, and switch to skip lists that do not have to re-balance. Lists are also much easier to traverse, even in the presence of changes. https://git.sr.ht/~yantar92/org-mode/tree/feature/cache-skip-list/item/lisp/org-skip-list.el But such a switch is not at all trivial. So, for the time being, I will revert the changes to org-clock.el. More drastic changes will have to wait until after the next Org release. Otherwise, the next release will have to be postponed indefinitely. I now temporarily reverted the org-clock-sum implementation using org-element API. https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=cde2b1e99 https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=af587bcc5 https://git.savannah.gnu.org/cgit/emacs/org-mode.git/commit/?id=e85ecd813 -- Ihor Radchenko // yantar92, Org mode maintainer, Learn more about Org mode at <https://orgmode.org/>. Support Org development at <https://liberapay.com/org-mode>, or support my work at <https://liberapay.com/yantar92>
