On Sun, May 9, 2010 at 11:54 PM, Stefan Seelmann <[email protected]> wrote:

> No objection at all.
>
> I updated XDBM to use an <ID> type parameter to be flexible for
> different ID types. The reason was that I wanted to use UUID for the
> HBase partition. If we would use UUID in general for all partitions we
> can remove that type parameter again.
>

Hmmm, I guess you tried this out with the HBase partition already? I was
wondering how it worked, since the increment for the long is used to
update the value stored on disk. I would have thought that the ID
parameter extended Numeric or something.

Alex

>
> Alex Karasulu wrote:
> > Hi all,
> >
> > Any thoughts about using the globally visible UUID in the XDBM partition
> > design for the primary key for Entries instead of using a partition
> > specific Long ID?
> >
> > I'm thinking that one day we will need to implement certain features.
> > Let me list them and also point out why using the globally unique UUID
> > might be advantageous:
> >
> > (1) System wide DN and Entry Cache
> >
> > Rather than having each partition manage its own cache, a central
> > DN and Entry cache makes sense. In this case a global identifier for an
> > entry might come in handy for hashing cached values.
> >
> > (2) Nested Partitions, Default Root Partition, Hash Partitioning and
> > Range Partitioning
> >
> > At some point we will want to have nestable partitions. This means
> > we can have one ADS Partition mounted under another ADS Partition with
> > operation routing taking place properly to the nested partition where
> > appropriate.
> >
> > Nested partitions will also allow us to have a default root
> > partition from which we can mount other partitions. The default root
> > partition is nice to have since it allows us to add administrative areas
> > and their administrative points with subentries onto the root empty
> > string DN. It also makes it so the RootDSE is now stored in this
> > partition properly with persistence. Right now the RootDSE is generated
> > and not mutable.
> >
> > Hash partitioning and range partitioning entail distributing
> > entries across partitions under some container entry based on some
> > value. Hash partitioning uses the value's hash to distribute entries
> > whereas range partitioning uses ranges of values to distribute the
> > entries. So it's not really the DN that determines which partition the
> > entry is pushed into but this hash or range value. This makes it so we
> > can scale to very large numbers of entries in the DIT while also
> > distributing the disk access load across several disk spindles, as does
> > Oracle's RDBMS in these kinds of configurations.
> >
> > (3) Global Indices
> >
> > If we use a globally unique UUID instead of a partition specific
> > Long ID then we can expose index segments managed by partitions to
> > higher layers to construct global indices. These global indices can
> > then be used to conduct searches outside of the partition, one step
> > higher. This makes it possible for us to implement certain virtual
> > directory strategies regardless of the partition implementations used
> > in a server's configuration. The XDBM search algorithm can leverage
> > these global indices or delegate sub partition search to a partition if
> > a partition uses its own search mechanism. There's a lot to be said
> > here but this is neither the time nor the place to expand on this topic.
> > But global indices are a key factor for several things including
> > virtualization.
> >
> > Thoughts?
> >
> > --
> > Alex Karasulu
> > My Blog :: http://www.jroller.com/akarasulu/
> > Apache Directory Server :: http://directory.apache.org
> > Apache MINA :: http://mina.apache.org
> > To set up a meeting with me: http://tungle.me/AlexKarasulu
>

-- 
Alex Karasulu
My Blog :: http://www.jroller.com/akarasulu/
Apache Directory Server :: http://directory.apache.org
Apache MINA :: http://mina.apache.org
To set up a meeting with me: http://tungle.me/AlexKarasulu
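
For readers following the thread, here is a minimal sketch of the question
raised about the <ID> type parameter: a Long ID is produced by incrementing
the last persisted value, while a UUID has no notion of "increment" and has
to be generated some other way. All names below (IdGeneratorSketch,
IdGenerator, LongIdGenerator, UuidIdGenerator) are hypothetical and are not
taken from the XDBM code; persistence of the counter is left out, and the
bound on ID is only one possible choice.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not the actual XDBM API.
class IdGeneratorSketch {

    // ID generation is pulled out of the store, so the ID type does not
    // have to be numeric. Comparable is assumed here because IDs are
    // typically used as keys in sorted index structures.
    interface IdGenerator<ID extends Comparable<ID>> {
        ID nextId();
    }

    // Long IDs: increment the last value that was handed out. A real
    // store would also write the new value back to disk (omitted here).
    static final class LongIdGenerator implements IdGenerator<Long> {
        private final AtomicLong last;

        LongIdGenerator(long lastPersistedValue) {
            this.last = new AtomicLong(lastPersistedValue);
        }

        @Override
        public Long nextId() {
            return last.incrementAndGet();
        }
    }

    // UUID IDs: nothing to increment and no counter to persist.
    static final class UuidIdGenerator implements IdGenerator<UUID> {
        @Override
        public UUID nextId() {
            return UUID.randomUUID();
        }
    }

    public static void main(String[] args) {
        IdGenerator<Long> longIds = new LongIdGenerator(41L);
        IdGenerator<UUID> uuidIds = new UuidIdGenerator();
        System.out.println(longIds.nextId()); // prints 42
        System.out.println(uuidIds.nextId()); // a random UUID
    }
}
```

With this split, a store parameterized on <ID> would only ever call
nextId() and would never do arithmetic on the ID itself.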
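
The difference between hash partitioning and range partitioning described
in point (2) of the quoted message can be sketched roughly as follows. This
is only an illustration under assumptions: the class and method names are
made up, partitions are identified by plain strings, and the choice of
String.hashCode() and of lexicographic ranges is arbitrary.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not ApacheDS code.
class PartitionRoutingSketch {

    // Hash partitioning: the hash of the value picks the partition index.
    // floorMod keeps the result in [0, partitionCount) even when the
    // hash code is negative.
    static int hashPartition(String value, int partitionCount) {
        return Math.floorMod(value.hashCode(), partitionCount);
    }

    // Range partitioning: each partition owns the values starting at its
    // lower bound; the greatest lower bound <= value picks the partition.
    static String rangePartition(TreeMap<String, String> lowerBounds, String value) {
        Map.Entry<String, String> entry = lowerBounds.floorEntry(value);
        if (entry == null) {
            throw new IllegalArgumentException("no partition covers: " + value);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        TreeMap<String, String> bounds = new TreeMap<>();
        bounds.put("a", "partition-0"); // values from "a" up to, not including, "n"
        bounds.put("n", "partition-1"); // values from "n" upward

        System.out.println(hashPartition("karasulu", 4));
        System.out.println(rangePartition(bounds, "karasulu")); // partition-0
        System.out.println(rangePartition(bounds, "seelmann")); // partition-1
    }
}
```

In both cases it is the chosen value, not the DN, that decides where the
entry ends up.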
