There are plenty of improvements :) I'm actively working on some implementation details that should benefit the overall performance of MongoMK. Switching to / supporting WT would be beneficial, but there are other improvements being discussed.
N.

On Fri, Jun 12, 2015 at 8:01 PM, Ian Boston <[email protected]> wrote:

> Hi Norberto,
>
> Thank you. That saved me a lot of time, and I learnt something in the
> process.
>
> So in your opinion, is there anything that can or should be done in the
> DocumentNodeStore from a schema point of view to improve the read or write
> performance of Oak on MongoDB without resorting to sharding or upgrading
> to 3.0 and WiredTiger? I am interested in JCR nodes, not including blobs.
>
> Best Regards
> Ian
>
> On 12 June 2015 at 18:54, Norberto Leite <[email protected]> wrote:
>
> > Hi Ian,
> >
> > Indexes are bound per collection. That means that if you have a large
> > collection, its indexes will be correspondingly large. In the case of
> > *_id*, which is the primary key of every collection in MongoDB, the
> > index size is proportional to the number of documents in that
> > collection.
> > Spreading a large data set across different collections makes those
> > indexes individually smaller but larger in combination (we need to
> > account for the overhead of each index entry and the header information
> > that each index carries).
> > Also take into account that every time you switch between collections
> > to perform different queries (there are no joins in MongoDB) you will
> > need to reload the index structures of all collections affected by your
> > query into memory, which comes with some penalties if you do not have
> > enough RAM for the full amount.
> > That said, in MongoDB all information is handled in one single big file
> > per database (although spread across different extents on disk) on the
> > MMAPv1 storage engine (the current default for both 3.0 and 2.6). With
> > WiredTiger this is broken down into individual files per collection and
> > per index structure.
> >
> > Bottom line: there would be a marginal benefit to insert rates if you
> > broke the JCR nodes collection into different collections, because each
> > insert would have smaller index and data structures to traverse and
> > update, but there would be a lot more inefficiency on the query side,
> > since you would be page faulting more often during the traversal of
> > both indexes and collection data.
> >
> > So yes, Chetan is right to state that the actual size occupied by the
> > indexes would not be smaller; it would actually increase.
> >
> > What is important to mention is that sharding takes care of this by
> > spreading the load between instances, which reflects immediately in the
> > size of the data each individual shard has to handle (smaller data
> > collections = smaller indexes) and allows parallel workloads when
> > serving query requests.
> >
> > Another aspect to consider is that fragmentation of the data set will
> > affect reads and writes in the long term. I'm going to be delivering a
> > talk soon at http://www.connectcon.ch/2015/en.html where I address how
> > to handle and detect these situations in JCR implementations (if you
> > are interested in attending).
> >
> > To complete the description, the concurrency control mechanism (often
> > referred to as locking) is more granular in the 3.0 MMAPv1
> > implementation, going from database level to collection level.
> >
> > N.
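To make the index-size point above easy to check on a running instance, here is a minimal pymongo sketch (not from the thread) that compares the total index size of one large collection with the combined index sizes of a hypothetical split. The `oak` database and the `nodes`, `nodes_a`, `nodes_b` collection names are assumptions for illustration only.

```python
# Sketch: compare index sizes of one large collection vs. several split
# collections, using the collstats command. Names are illustrative, not the
# actual DocumentNodeStore layout.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
db = client["oak"]

def index_bytes(coll_name):
    """Total index size in bytes for one collection, as reported by collstats."""
    stats = db.command("collstats", coll_name)
    return sum(stats.get("indexSizes", {}).values())

single = index_bytes("nodes")
split = sum(index_bytes(c) for c in ("nodes_a", "nodes_b"))
print("single collection index bytes:", single)
print("split collections index bytes:", split)
# Per the explanation above, `split` tends to be >= `single` once the
# per-collection index overhead is added.
```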
> > On Fri, Jun 12, 2015 at 7:31 PM, Ian Boston <[email protected]> wrote:
> >
> > > Hi Norberto,
> > >
> > > Thank you for the feedback on the questions. I see you work as an
> > > Evangelist for MongoDB, so you will probably know the answers and can
> > > save me time. I agree it's not worth doing anything about concurrency
> > > even if the logs indicate there is contention on locks in 2.6, as the
> > > added complexity would likely make reads worse. And if an upgrade to
> > > 3.0 has been done, anything collection based is a waste of time
> > > anyway, given the availability of WiredTiger.
> > >
> > > Could you confirm that separating one large collection into a number
> > > of smaller collections will not reduce the size of the indexes that
> > > have to be consulted for queries of the form that Chetan shared
> > > earlier?
> > >
> > > I'll try to clarify that question. DocumentNodeStore has one
> > > collection, "nodes", containing all Documents. Some queries are only
> > > interested in a key space representing a certain part of the "nodes"
> > > collection, eg n:/largelystatic/**. If those Documents were stored in
> > > nodes_x, and count(nodes_x) <= 0.001*count(nodes), would there be any
> > > performance advantage, or does MongoDB, under the covers, treat all
> > > collections as a single massive collection from an index and query
> > > point of view?
> > >
> > > If you have any pointers to how 2.6 scales relative to collection
> > > size, number of collections and index size, that would help me
> > > understand more about its behaviour.
> > >
> > > Best Regards
> > > Ian
> > >
> > > On 12 June 2015 at 17:08, Norberto Leite <[email protected]>
> > > wrote:
> > >
> > > > Hi Ian,
> > > >
> > > > Your proposal would not be very efficient.
> > > > The concurrency control mechanism that 2.6 (the currently supported
> > > > version) offers, although not negligible, would not make this that
> > > > beneficial for the write load. And the reading part, which we can
> > > > assume is the bulk of the workload that JCR will be doing, is not
> > > > affected by it.
> > > > One needs to consider that every time you read from the JCR you
> > > > would be issuing something like a complex M/R operation, which is
> > > > designed to span the full set of documents in a given collection,
> > > > and it would need to recur over all affected collections. Not very
> > > > effective.
> > > >
> > > > The existing mechanism is much simpler and more efficient.
> > > > With the upcoming support for WiredTiger, the concurrency control
> > > > (a potential issue) becomes totally irrelevant.
> > > >
> > > > Also don't forget that you cannot predict the number of child nodes
> > > > that a given system would create to define its content tree.
> > > > If you do have a very nested (at a specific level) set of documents,
> > > > you would need to treat that collection separately (when needing to
> > > > scale, just shard that collection and not the others), bringing in
> > > > more operational complexity.
> > > >
> > > > What could be a good discussion point is separating the blobs
> > > > collection into its own database, given the flexibility that JCR
> > > > offers when treating these 2 different data types.
> > > > Actually, this reminded me that I was pending on submitting a jira
> > > > request on this matter <https://issues.apache.org/jira/browse/OAK-2984>.
> > > >
> > > > As Chetan mentions, sharding comes into play once we have to scale
> > > > the write throughput of the system.
> > > >
> > > > N.
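For context on the kind of query Ian describes, below is a small pymongo sketch of a child-node lookup done as a single _id range scan over one "nodes" collection. The "depth:path" key scheme is only an approximation of how DocumentNodeStore keys its documents, and the database and collection names are assumptions for illustration.

```python
# Sketch: fetch the children of a path from a single "nodes" collection with
# an _id range scan. Key scheme and names are illustrative only.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")
nodes = client["oak"]["nodes"]

def children(path, depth):
    # All children of `path` share the key prefix "<depth+1>:<path>/", so one
    # range scan on the primary key (_id) covers them.
    lo = "%d:%s/" % (depth + 1, path)
    hi = "%d:%s0" % (depth + 1, path)  # '0' sorts immediately after '/'
    return nodes.find({"_id": {"$gt": lo, "$lt": hi}})

for doc in children("/largelystatic", 1):
    print(doc["_id"])
```

With a single collection this remains one index range scan; splitting the tree across collections such as nodes_x would force the caller to route each lookup to the right collection without shrinking the combined index size, which matches the point made above.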
> > > >
> > > > On Fri, Jun 12, 2015 at 4:15 PM, Chetan Mehrotra
> > > > <[email protected]> wrote:
> > > >
> > > > > On Fri, Jun 12, 2015 at 7:32 PM, Ian Boston <[email protected]> wrote:
> > > > > > Initially I was thinking about the locking behaviour, but I
> > > > > > realised 2.6.* is still locking at the database level, and that
> > > > > > this only changes to collection level in 3.0 with MMAPv1, and to
> > > > > > document (row) level if you switch to WiredTiger [1].
> > > > >
> > > > > I initially thought the same, and then we benchmarked the
> > > > > throughput by placing the BlobStore in a separate database
> > > > > (OAK-1153), but did not observe any significant gains. So that
> > > > > approach was not pursued further. If we have some benchmark which
> > > > > can demonstrate that write throughput increases if we _shard_ the
> > > > > node collection into a separate database on the same server, then
> > > > > we can look further there.
> > > > >
> > > > > Chetan Mehrotra
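As a rough illustration of the "blobs in their own database" idea referenced in OAK-1153 and OAK-2984, here is a pymongo sketch that writes node documents and binaries to separate databases on the same server. The database, collection, and field names are assumptions, not Oak's actual layout.

```python
# Sketch of keeping binaries in their own database while node documents stay
# in the main one. Names and document shapes are illustrative only.
from pymongo import MongoClient
from bson.binary import Binary

client = MongoClient("mongodb://localhost:27017")
nodes = client["oak"]["nodes"]        # JCR node documents
blobs = client["oak_blobs"]["blobs"]  # binaries kept in a separate database

def write_node_with_blob(node_id, blob_id, data):
    # Store the binary by content id; the node document only references it.
    blobs.replace_one({"_id": blob_id},
                      {"_id": blob_id, "data": Binary(data)},
                      upsert=True)
    nodes.replace_one({"_id": node_id},
                      {"_id": node_id, "blobRef": blob_id},
                      upsert=True)

write_node_with_blob("2:/content/image", "sha1-0a1b2c", b"\x89PNG...")
# As noted above, benchmarking this split (OAK-1153) did not show a
# significant write-throughput gain under MMAPv1's per-database locking.
```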
