On Mon, Oct 5, 2009 at 3:05 PM, Dana Jansens <d...@cg.scs.carleton.ca> wrote: > Hello, > > I am interested in contributing code for persistent binary search > trees to the GLib project. This would actually count as credit > towards a grad level data structures class for me, but I have a lot of > experience working with GLib in the past, and developing C-based open > source projects. First some background on what persistence means and > why it would be nice to have available in an open source data > structures library. > > A persistent binary search tree is simply a BST that is versioned over > changes, such that you can do a lookup in any previous version of the > tree. An efficient implementation of this data structure requires > O(1) additional space per insertion into/deletion from the tree. That > is, after n insertions, creating n different versions of the tree, the > total space required would be roughly 2*(space for a single version), > while allowing you to search in any version of the tree. Searching in > any version takes the same number of comparisons and pointer > redirections as a tree with only a single version. > > Persistent BSTs are used for operations such as a Plane Sweep [1] in > Computational Geometry. At the same time, the code behind the > persistence can be generalized to provide an additional data structure > that performs Fractional Cascading [2] for no extra cost to the BST > implementation. > > There are currently no readily available open source implementations > of a persistent BST. This would be a great tool for students, > academic researchers, and other developers. > > I have noticed that GLib has a Binary Search Tree data structure, > which currently uses an AVL tree. AVL trees can be made persistent, > but the space requirement becomes O(log n) per insert/delete instead > of O(1). > > There are 2 other BSTs that would work effectively to provide a > persistent BST, while also maintaining a very efficient implementation > for when persistence is not used. They are: > - Treaps > - Red-Black Trees > > Treaps can be expected to outperform Red-Black Trees for any sized > tree. The constant in the search cost for a Treap is Approximately > equal to AVL trees (and slightly less for large trees), and is always > smaller than Red-Black Trees. (It's a small constant regardless.) > Treaps are also much simpler to implement, requiring less complicated > code. However Treaps are a randomized data structure, which some > people try to stay away from. > > My proposed changes to the GTree API are the addition of the following > functions: > > // returns the latest version of the tree available (starts from 0) > guint g_tree_current_version (GTree *tree); > > // creates a new version of the tree, changes to the tree will no longer > // affect the previous versions. returns the new current version > guint g_tree_next_version (GTree *tree); > > For each lookup/traversal function, a new function which takes > arguments (GTree *tree, guint version, ...) would be added, with > existing functions becoming macros pointing to the new ones. > > This API allows a version to contain more than a single insert/delete, > if desired, making them more of a logical entity. > > So my questions are this: > > 1) Is there interest in having a persistent binary search tree in the > GLib library ? > 2) If so, would there be resistance to using a random data strucuture, > i.e. a Treap to implement them ? > > > Thanks very much, > Dana Jansens > > p.s. I sent this before subscribing, but its been stuck in the > moderator queue for more than a week now, so I'm sending again. > Apologies if you see this twice. > > [1] http://en.wikipedia.org/wiki/Sweep_line_algorithm > [2] http://en.wikipedia.org/wiki/Fractional_cascading > _______________________________________________ > gtk-devel-list mailing list > gtk-devel-list@gnome.org > http://mail.gnome.org/mailman/listinfo/gtk-devel-list >
Hi, I'm just a glib user as well, not a dev, but let me get this right. For example lets take a directory with files in it and all those file names are stored in the tree. Now i rename a file. would that mean a new tree gets created with the new filename and the rest remains the same? kinda like version control stuff works. Example: --Somedir -- - file 1 -- - file 2 < rename file 1 to file 01 > -- Somedir -- - file 01 -- - file 2 And how i understand it this is what would be in the tree: -- Somedir -- - file 01 -- (older tree version) file 1 -- - file 2 So, resulting in a tree history per file..? I can imagine this being efficient for something like mac's time machine only then for linux ^_^ Or am i completely wrong here? _______________________________________________ gtk-devel-list mailing list gtk-devel-list@gnome.org http://mail.gnome.org/mailman/listinfo/gtk-devel-list