On Sat, May 8, 2010 at 11:09 AM, Emmanuel Lecharny <[email protected]> wrote:
> On 5/8/10 9:43 AM, Alex Karasulu wrote:
>>
>> Hi all,
>>
>> Any thoughts about using the globally visible UUID in the XDBM partition
>> design as the primary key for entries, instead of a partition-specific
>> Long ID?
>>
>> I'm thinking we will one day need to implement certain features. Let me
>> list them and also point out why using the globally unique UUID might be
>> advantageous:
>>
>> (1) System-wide DN and Entry Cache
>>
>> Rather than having each partition manage its own cache, a central DN
>> and Entry cache makes sense. In this case a global identifier for an
>> entry might come in handy for hashing cached values.
>>
>> (2) Nested Partitions, Default Root Partition, Hash Partitioning and
>> Range Partitioning
>>
>> At some point we will want to have nestable partitions. This means we
>> can have one ADS Partition mounted under another ADS Partition, with
>> operations routed properly to the nested partition where appropriate.
>>
>> Nested partitions will also allow us to have a default root partition
>> from which we can mount other partitions. The default root partition is
>> nice to have since it allows us to add administrative areas and their
>> administrative points with subentries onto the root empty-string DN. It
>> also means the RootDSE is stored in this partition properly, with
>> persistence. Right now the RootDSE is generated and not mutable.
>>
>> Hash partitioning and range partitioning entail distributing entries
>> across partitions under some container entry based on some value. Hash
>> partitioning uses the value's hash to distribute entries, whereas range
>> partitioning uses ranges of values to distribute the entries. So it's
>> not really the DN that determines which partition the entry is pushed
>> into but this hash or range value.
>> This lets us scale to very large numbers of entries in the DIT while
>> also distributing the disk access load across several disk spindles,
>> as Oracle's RDBMS does in these kinds of configurations.
>>
>> (3) Global Indices
>>
>> If we use a globally unique UUID instead of a partition-specific Long
>> ID, then we can expose index segments managed by partitions to higher
>> layers to construct global indices. These global indices can then be
>> used to conduct searches one step higher, outside of the partition.
>> This makes it possible for us to implement certain virtual directory
>> strategies regardless of the partition implementations used in a
>> server's configuration. The XDBM search algorithm can leverage these
>> global indices, or delegate sub partition search to a partition if a
>> partition uses its own search mechanism. There's a lot to be said here,
>> but this is neither the time nor the place to expand on this topic.
>> Global indices are a key factor for several things, including
>> virtualization.
>>
>> Thoughts?
>>
>
> One other advantage is that we will no longer need to store an increment
> on disk. Atm, each time we add an element in the backend, we have to ask
> for a Long, which has to be unique. This is potentially a bottleneck, and
> it's costly, as this unique Long has to be stored on disk.

Besides this, I see some more advantages
*if* we keep the entryUUID of the entry as the ID of the entry as well: then
building the DN using the RDN index will be a lot easier (because finding
the parent of an entry currently requires a full DN construction, which can
be avoided by doing a reverse lookup in the RDN index if we know the
entry's ID).

> I don't yet see any other negative impact from using UUID instead of
> Long, except that it will require (slightly) more disk space.

Yep, and the RDN index also takes more disk space now.

Kiran Ayyagari
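
To make the reverse-lookup point concrete, here is a minimal in-memory sketch. It is not the ApacheDS or XDBM API: the class `RdnIndexSketch`, its methods, and the `dc=example,dc=com` suffix are all hypothetical names chosen for illustration. It assumes the RDN index is keyed by the entry's UUID and stores the parent's UUID next to the RDN, so finding a parent is a single lookup and a DN is built by walking up the chain.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch only: an in-memory stand-in for an RDN index keyed by
// entry UUID. None of these names come from the ApacheDS code base.
class RdnIndexSketch {

    // One index record: the entry's own RDN plus its parent's ID
    // (null for the top entry of the partition).
    static final class ParentIdAndRdn {
        final UUID parentId;
        final String rdn;

        ParentIdAndRdn(UUID parentId, String rdn) {
            this.parentId = parentId;
            this.rdn = rdn;
        }
    }

    private final Map<UUID, ParentIdAndRdn> index = new HashMap<>();

    // Adds an entry and returns its ID. The ID is generated locally, so no
    // persisted counter has to be read and incremented for each add.
    UUID add(UUID parentId, String rdn) {
        UUID id = UUID.randomUUID();
        index.put(id, new ParentIdAndRdn(parentId, rdn));
        return id;
    }

    // Finding the parent is one lookup by ID; no DN has to be constructed.
    UUID parentOf(UUID id) {
        return lookup(id).parentId;
    }

    // Builds the DN by following parent IDs up to the top entry.
    String buildDn(UUID id) {
        List<String> rdns = new ArrayList<>();
        UUID current = id;
        while (current != null) {
            ParentIdAndRdn rec = lookup(current);
            rdns.add(rec.rdn);
            current = rec.parentId;
        }
        return String.join(",", rdns);
    }

    private ParentIdAndRdn lookup(UUID id) {
        ParentIdAndRdn rec = index.get(id);
        if (rec == null) {
            throw new IllegalArgumentException("unknown entry ID: " + id);
        }
        return rec;
    }
}
```

In this sketch the top entry stores the whole suffix as its RDN, which is a simplification. A real index would live on disk, which is where the extra space cost of 16-byte UUID keys over 8-byte Long keys mentioned above would show up.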
