Hello, I am interested in contributing code for persistent binary search trees to the GLib project. This would actually count as credit towards a grad level data structures class for me, but I have a lot of experience working with GLib in the past, and developing C-based open source projects. First some background on what persistence means and why it would be nice to have available in an open source data structures library.
A persistent binary search tree is simply a BST that is versioned over changes, such that you can do a lookup in any previous version of the tree. An efficient implementation of this data structure requires O(1) additional space per insertion into/deletion from the tree. That is, after n insertions, creating n different versions of the tree, the total space required would be roughly 2*(space for a single version), while allowing you to search in any version of the tree. Searching in any version takes the same number of comparisons and pointer redirections as a tree with only a single version. Persistent BSTs are used for operations such as a Plane Sweep  in Computational Geometry. At the same time, the code behind the persistence can be generalized to provide an additional data structure that performs Fractional Cascading  for no extra cost to the BST implementation. There are currently no readily available open source implementations of a persistent BST. This would be a great tool for students, academic researchers, and other developers. I have noticed that GLib has a Binary Search Tree data structure, which currently uses an AVL tree. AVL trees can be made persistent, but the space requirement becomes O(log n) per insert/delete instead of O(1). There are 2 other BSTs that would work effectively to provide a persistent BST, while also maintaining a very efficient implementation for when persistence is not used. They are: - Treaps - Red-Black Trees Treaps can be expected to outperform Red-Black Trees for any sized tree. The constant in the search cost for a Treap is Approximately equal to AVL trees (and slightly less for large trees), and is always smaller than Red-Black Trees. (It's a small constant regardless.) Treaps are also much simpler to implement, requiring less complicated code. However Treaps are a randomized data structure, which some people try to stay away from. My proposed changes to the GTree API are the addition of the following functions: // returns the latest version of the tree available (starts from 0) guint g_tree_current_version (GTree *tree); // creates a new version of the tree, changes to the tree will no longer // affect the previous versions. returns the new current version guint g_tree_next_version (GTree *tree); For each lookup/traversal function, a new function which takes arguments (GTree *tree, guint version, ...) would be added, with existing functions becoming macros pointing to the new ones. This API allows a version to contain more than a single insert/delete, if desired, making them more of a logical entity. So my questions are this: 1) Is there interest in having a persistent binary search tree in the GLib library ? 2) If so, would there be resistance to using a random data strucuture, i.e. a Treap to implement them ? Thanks very much, Dana Jansens p.s. I sent this before subscribing, but its been stuck in the moderator queue for more than a week now, so I'm sending again. Apologies if you see this twice.  http://en.wikipedia.org/wiki/Sweep_line_algorithm  http://en.wikipedia.org/wiki/Fractional_cascading _______________________________________________ gtk-devel-list mailing list firstname.lastname@example.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list