On Thu, Aug 1, 2013 at 11:47 PM, Emmanuel Lécharny <[email protected]> wrote:
> Hi guys,
>
> I did some more profiling today. Here are some thoughts about what I have
> found, and places where we can improve the code.
>
> 1) Comparators
>
> While processing a search, we compare keys with what we have in the trees
> over and over. If the BTree has N levels and M elements per page, finding
> an entry takes N * log2(M) comparisons. In my test, for 200 000 searches,
> we are doing around 4 x 5 comparisons per search, so around 4 million
> calls to the compare() method. In fact, way more than 4 million :
>
> 44 881 820 99384.1 5.9
> org.apache.mavibot.btree.AbstractPage:compare(Ljava/lang/Object;Ljava/lang/Object;)I
>
> So the comparator is called plenty of times. Let's have a look at where
> it gets called :
>
> o DefaultSearchEngine.computeResult() calls the db.getEntryId( baseDn ) method
> This checks that the baseDn exists. getEntryId() uses the RdnIndex to do
> that. The problem is that this index can only handle RDNs, not DNs, so
> depending on how deep the baseDN is (ie, how many RDNs it has), we will
> have to do many searches in this index.
>
> In any case, we will do many comparisons to find the right Dn.
>
> o DefaultOptimizer.annotate() calls the idx.count() method
> Here, we are trying to know how many candidates we will get using one
> index. If the AttributeType is single-valued, we will get either 1 (if
> the index has the ExprNode value) or 0 (if it does not). If the
> AttributeType is multi-valued, we will get the number of associated
> values for the given key. This is for an equalityScan ; for a <= or >=,
> the count is computed differently.
>
> Again, we have many comparisons to do here (and this is multiplied by
> the number of indexes we have in the filter).
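As an aside on the getEntryId( baseDn ) cost above: since the same baseDn is looked up over and over, a small recency cache in front of the RdnIndex walk would absorb most of it. A minimal sketch, assuming a plain LinkedHashMap in access order; the class name DnUuidCache and the size limit are illustrative, not ApacheDS API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch, not ApacheDS code: an MRU cache mapping a normalized
// DN string to its entryUUID, so repeated getEntryId( baseDn ) calls skip
// the per-RDN index walk and its N * log2(M) comparisons.
public class DnUuidCache extends LinkedHashMap<String, String> {
    private static final int MAX_ENTRIES = 1024; // arbitrary illustrative limit

    public DnUuidCache() {
        // accessOrder = true: iteration order follows recency of access,
        // so the eldest entry is the least recently used one
        super(16, 0.75f, true);
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        // evict the least recently used mapping once the cache is full
        return size() > MAX_ENTRIES;
    }

    public static void main(String[] args) {
        DnUuidCache cache = new DnUuidCache();
        cache.put("cn=entry123,ou=users,dc=example,dc=com", "a-uuid");
        // a cache hit here replaces a multi-level RdnIndex lookup
        System.out.println(cache.get("cn=entry123,ou=users,dc=example,dc=com"));
    }
}
```

The real fix would of course have to handle invalidation on rename/delete, which this sketch ignores.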
>
> At some point, the cost of doing all those comparisons largely exceeds
> the cost of fetching useless candidates from the MasterTable (we are
> spending 8 times more fetching data from the indexes than grabbing the
> entries from the MasterTable).
>
> How can we improve this ?
> -------------------------
>
> o The first step is trying to retrieve the entryUUID associated with the
> baseDN. This is likely to be done many times, so we could benefit from
> using a <Dn, entryUUID> cache to avoid those costly operations (keep in
> mind that we not only do comparisons, we also potentially fetch pages
> from the disk if they are not present in memory). An MRU cache will save
> us a lot of time here.

we have a DN cache (DnFactory) but we are not using it effectively

> o At this point, we compare normalized values with normalized values.
> That means we don't have to normalize the values again. Sadly, this is
> what we do :
>
> PrepareString.insignifiantSpacesString(String, boolean) line: 4803
> PrepareString.normalize(String, PrepareString$StringType) line: 244
> DeepTrimToLowerNormalizer.normalize(String) line: 103
> CachingNormalizer.normalize(String) line: 124
> DeepTrimToLowerCachingNormalizingComparator(NormalizingComparator).compare(String, String) line: 76
> DeepTrimToLowerCachingNormalizingComparator.compare(String, String) line: 32
> DeepTrimToLowerCachingNormalizingComparator(NormalizingComparator).compare(Object, Object) line: 36
> SerializableComparator<E>.compare(E, E) line: 86
> Node<K,V>(AbstractPage<K,V>).compare(K, K) line: 254
> Node<K,V>(AbstractPage<K,V>).findPos(K) line: 189
> Node<K,V>.getValues(K) line: 858
> BTree<K,V>.getValues(K) line: 962
> MavibotTable<K,V>.count(K) line: 526
> MavibotIndex<K,V>.count(K) line: 260
> DefaultOptimizer<E>.getEqualityScan(SimpleNode<V>) line: 304
> DefaultOptimizer<E>.annotate(ExprNode) line: 148
> DefaultOptimizer<E>.getConjunctionScan(BranchNode) line: 245
> DefaultOptimizer<E>.annotate(ExprNode) line: 185
> DefaultSearchEngine.computeResult(SchemaManager, SearchOperationContext) line: 220
> MavibotPartition(AbstractBTreePartition).search(SearchOperationContext) line: 1032
>
> As we can see, we are using the AT comparator (here, for 'cn'), which
> normalizes the value, even though it was already normalized by the
> NormalizerInterceptor (both the filter value and the stored value). There
> is no need to normalize either the key we are looking for or the stored
> key. For the record, PrepareString.insignifiantSpacesString() is
> extremely voracious :
>
> 12046499 303 480 18%
> org.apache.directory.api.ldap.model.schema.PrepareString:insignifiantSpacesString(Ljava/lang/String;Z)Ljava/lang/String;
>
> The first number is the number of times the method is called, the second
> the time it takes to execute, and the third the percentage of global CPU
> it takes.
>
> Now, it's a bit hard to get rid of this computation : the comparator is
> associated with the AttributeType, and we don't have one comparator that
> skips normalization for the server and another that normalizes the
> values for the client (keep in mind that the SchemaManager is used on
> both the client and the server). So how do we distinguish which use case
> we are in ? That's the key ; if we are in the server, we should not
> normalize, and if we are on the client, we must normalize...
>
> Atm, I have no idea how to do that. The only thing I know is that it's
> extremely critical to avoid this extra computation.

thinking about it, I realized that a flag-based check to avoid normalization is dangerous and will break an embedded server (where the user uses the CoreSession API)

> 2) Annotate() and build() methods
>
> Those two methods are called by the computeResult() method. They are
> approximately as expensive. They mainly do the same thing :
> - find the number of candidates
> - construct a set of entryUUIDs based on the smallest index.
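Coming back to the normalization cost in 1) for a moment: the saving can be pictured as a comparator variant that simply trusts its inputs to be pre-normalized. This is only a sketch of the idea, with invented class names, and it deliberately does not answer the server/client question above (as noted, a naive flag-based switch would break embedded servers using the CoreSession API):

```java
import java.util.Comparator;

// Invented names, not the real ApacheDS classes: a comparator that assumes
// both values were already normalized (e.g. by the NormalizerInterceptor),
// so it skips the costly PrepareString-based pass on every single call.
public class PreNormalizedStringComparator implements Comparator<String> {
    @Override
    public int compare(String v1, String v2) {
        // direct comparison: no deep-trim / lower-case normalization here
        return v1.compareTo(v2);
    }

    public static void main(String[] args) {
        Comparator<String> cmp = new PreNormalizedStringComparator();
        // both sides already normalized: equal without re-normalizing
        System.out.println(cmp.compare("entry123", "entry123"));
    }
}
```

With ~12 million insignifiantSpacesString() calls at 18% of CPU in the profile above, even this trivial change to the index comparison path would be significant, if we can find a safe way to select it only where values are guaranteed pre-normalized.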
>
> In the test I'm running, the filter is like (cn=entryNNN) where NNN is a
> random number. Basically, we are spending as much time finding the
> number of candidates (1) in the annotate() method as fetching it in the
> build() method.
>
> How can we improve this ?
> -------------------------
>
> It's not simple. One can think that an equality filter will return only
> a few values, or even 1 or 0. This is true for the kind of filter I'm
> using in this test, but it's not true for a filter like
> (ObjectClass=person), which can select thousands of candidates.
>
> One possible way would be to get rid of the count() method, and to fetch
> the candidates immediately instead, up to a number of candidates (100 ?).
> The fetched candidates will be stored associated with the annotated
> node, and can be used immediately by the build() method, instead of
> being fetched a second time.

the count() method in Mavibot is efficient and accurate (because it stores the number of elements present in the tree), so I would still lean towards using count

> For a single-valued AttributeType, it's even simpler : either the index
> contains the value or not, so the count will be zero or one. In this
> case, it's obvious that we must fetch the entryUUID immediately and
> store it with the ExprNode, as it will be selected as the smallest
> index.
>
> For a multi-valued AttributeType, as the values are stored in a
> sub-btree, we can decide to fetch the set of candidates if this
> sub-btree contains fewer than a given number of candidates, or just
> store the number of candidates.
>
> Implementing those two improvements is not a big deal, and the gain will
> be huge.
>
> Conclusion
> ----------
>
> I'm not done with the analysis of the profiling session, but if we can
> work on the two items I have mentioned in this mail, we would already
> get a great deal of improvement. Once done, we can conduct another
> profiling session to find which parts need to be worked on.
>
> thoughts ?
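To make sure I read the annotate()/build() proposal right, here is how I picture it; all names below are invented, only the mechanism (fetch up to a threshold during annotate(), reuse in build()) comes from your mail:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the proposal above (all names invented): during annotate(),
// when an index returns few candidates, fetch their entryUUIDs right away
// and keep them on the node, so build() does not scan the index a second time.
public class AnnotatedNode {
    static final int FETCH_THRESHOLD = 100; // the cut-off suggested in the mail

    private long count;
    private List<String> prefetched; // entryUUIDs, null when above the threshold

    // 'candidatesFromIndex' stands in for the <value, entryUUID> index lookup
    public void annotate(List<String> candidatesFromIndex) {
        count = candidatesFromIndex.size();
        if (count <= FETCH_THRESHOLD) {
            // small result set: remember the UUIDs for build()
            prefetched = new ArrayList<>(candidatesFromIndex);
        }
    }

    public List<String> build(List<String> fetchAgain) {
        // reuse the prefetched set when available; otherwise fall back
        // to fetching from the index as today
        return (prefetched != null) ? prefetched : fetchAgain;
    }

    public long getCount() {
        return count;
    }

    public static void main(String[] args) {
        AnnotatedNode node = new AnnotatedNode();
        node.annotate(List.of("uuid-1")); // single-valued case: 0 or 1 candidate
        System.out.println(node.build(null)); // no second index scan needed
    }
}
```

If we keep Mavibot's stored count (as I suggest above), annotate() could still use it to decide cheaply whether prefetching is worth it before touching the sub-btree.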
I can join you tomorrow to experiment with these. Thanks for the excellent analysis.

> --
> Regards,
> Cordialement,
> Emmanuel Lécharny
> www.iktek.com

--
Kiran Ayyagari
http://keydap.com
