Hi Ian,

Indexes are bound to a collection. That means that if you have a large collection, its indexes will be correspondingly large. In the case of *_id*, which is the primary key of every collection in MongoDB, the index size is proportional to the number of documents the collection contains. Spreading a large data set across different collections makes each index individually smaller but their combined size larger, since we need to account for the overhead of each index entry plus the header information that every index carries. Also take into account that every time you switch between collections to perform different queries (there are no joins in MongoDB) you need to bring the index structures of all individual collections affected by your query into memory, which comes with a penalty if you do not have enough RAM for the full working set. That said, with MMAPv1 (the current default storage engine for both 2.6 and 3.0) all data for a database is kept in a shared set of data files (allocated as extents on disk), while WiredTiger breaks this down into individual files per collection and per index structure.
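To make that concrete, you can inspect the per-collection index footprint yourself with the collStats command. A minimal sketch in Python/pymongo; the database name "oak" and the collection names below are just placeholders, not the actual Oak schema:

    # Sketch: compare document counts and index sizes across collections.
    # "oak", "nodes", "nodes_0", "nodes_1" are placeholder names.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client["oak"]

    for name in ["nodes", "nodes_0", "nodes_1"]:
        stats = db.command("collStats", name)
        print(name,
              "docs:", stats["count"],
              "totalIndexSize:", stats["totalIndexSize"],  # bytes, all indexes combined
              "perIndex:", stats["indexSizes"])            # includes the _id_ index

Summing totalIndexSize across the split collections versus the single large one is the quickest way to see the per-index overhead I mention above.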
Bottom line: there would be a marginal benefit to insert rates if you broke the JCR nodes collection into different collections, because each insert would have smaller index and data structures to traverse and update, but you would pay for it with a lot more inefficiency on the query side, since you would be page faulting more often while traversing both the indexes and the collection data. So yes, Chetan is right in stating that the actual space occupied by the indexes would not be smaller; it would actually increase. What is important to mention is that sharding takes care of this by spreading the load between instances, which immediately reduces the amount of data each individual shard has to handle (smaller data collections = smaller indexes) and allows the query workload to be parallelized.

Another aspect to consider is that fragmentation of the data set will affect reads and writes in the long term. I'm going to be delivering a talk soon at http://www.connectcon.ch/2015/en.html (if you are interested in attending) where I address how to detect and handle these situations in JCR implementations.

To complete the description, the concurrency control mechanism (usually referred to as locking) is more granular in the 3.0 MMAPv1 implementation, going from database level to collection level.
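To illustrate the sharding point above, this is roughly what it looks like from a driver. A sketch only; you would normally do this from the mongo shell against a mongos, and the database name, namespace and shard key here are assumptions, not a recommendation for Oak:

    # Sketch: shard the nodes collection so each shard holds only a subset of
    # the documents and therefore a smaller _id index. Names and shard key are
    # placeholders.
    from pymongo import MongoClient

    client = MongoClient("mongodb://mongos-host:27017")  # must go through mongos
    admin = client.admin

    admin.command("enableSharding", "oak")                         # shard the database
    admin.command("shardCollection", "oak.nodes", key={"_id": 1})  # choose a shard key
    print(admin.command("listShards"))                             # confirm the shards

Each shard then only maintains indexes over the chunks it owns, which is where the "smaller data collections = smaller indexes" effect comes from.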
N.

On Fri, Jun 12, 2015 at 7:31 PM, Ian Boston <[email protected]> wrote:

> Hi Norberto,
>
> Thank you for the feedback on the questions. I see you work as an
> Evangelist for MongoDB, so you will probably know the answers and can save
> me time. I agree it's not worth doing anything about concurrency even if
> logs indicate there is contention on locks in 2.6, as the added complexity
> would make things worse. If an upgrade to 3.0 has been done, anything
> collection based is a waste of time due to the availability of WiredTiger.
>
> Could you confirm that separating one large collection into a number of
> smaller collections will not reduce the size of the indexes that have to
> be consulted for queries of the form that Chetan shared earlier?
>
> I'll try and clarify that question. DocumentNodeStore has 1 collection
> containing all Documents, "nodes". Some queries are only interested in a
> key space representing a certain part of the "nodes" collection, e.g.
> n:/largelystatic/**. If those Documents were stored in nodes_x, and
> count(nodes_x) <= 0.001*count(nodes), would there be any performance
> advantage, or does MongoDB, under the covers, treat all collections as a
> single massive collection from an index and query point of view?
>
> If you have any pointer to how 2.6 scales relative to collection size,
> number of collections and index size, that would help me understand more
> about its behaviour.
>
> Best Regards
> Ian
>
>
> On 12 June 2015 at 17:08, Norberto Leite <[email protected]> wrote:
>
> > Hi Ian,
> >
> > Your proposal would not be very efficient.
> > The concurrency control mechanism that 2.6 offers (the currently
> > supported version), although not negligible, would not be that
> > beneficial for the write load. The reading part, which we can assume is
> > the bulk of the workload that JCR will be doing, is not affected by it.
> > One needs to consider that every time you read from the JCR you would be
> > running a complex M/R operation, which is designed to span the full set
> > of documents in a given collection, and it would need to be repeated
> > across all affected collections. Not very effective.
> >
> > The existing mechanism is much simpler and more efficient.
> > With the upcoming support for WiredTiger, the concurrency control
> > (potential issue) becomes totally irrelevant.
> >
> > Also don't forget that you cannot predict the number of child nodes that
> > a given system would implement to define its content tree.
> > If you do have a very nested (at a specific level) set of documents you
> > would need to treat that collection separately (when needing to scale,
> > just shard that collection and not the others), bringing in more
> > operational complexity.
> >
> > What could be a good discussion point would be to separate the blobs
> > collection into its own database, given the flexibility that JCR offers
> > when treating these 2 different data types.
> > Actually, this reminded me that I had a pending jira request to submit
> > on this matter <https://issues.apache.org/jira/browse/OAK-2984>.
> >
> > As Chetan is mentioning, sharding comes into play once we have to scale
> > the write throughput of the system.
> >
> > N.
> >
> >
> > On Fri, Jun 12, 2015 at 4:15 PM, Chetan Mehrotra
> > <[email protected]> wrote:
> >
> > > On Fri, Jun 12, 2015 at 7:32 PM, Ian Boston <[email protected]> wrote:
> > > > Initially I was thinking about the locking behaviour but I realise
> > > > 2.6.* is still locking at a database level, and that only changes to
> > > > collection level in 3.0 with MMAPv1, and row level if you switch to
> > > > WiredTiger [1].
> > >
> > > I initially thought the same and then we benchmarked the throughput by
> > > placing the BlobStore in a separate database (OAK-1153). But we did
> > > not observe any significant gains. So that approach was not pursued
> > > further. If we have some benchmark which can demonstrate that write
> > > throughput increases if we _shard_ the node collection into a separate
> > > database on the same server, then we can look further there.
> > >
> > > Chetan Mehrotra
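PS, to Ian's question in the quoted thread above about whether MongoDB treats all collections as one massive structure for queries: it does not, and on a test system you can measure the difference directly with explain. A sketch only; the _id range filter below is a stand-in for whatever key range Chetan's query actually uses, and "nodes_x" is the hypothetical subset collection from Ian's example:

    # Sketch: compare the index work done by the same range query against the
    # full collection and a hypothetical subset collection. All names and the
    # _id range are placeholders. The explain command needs MongoDB 3.0+;
    # on 2.6 use cursor.explain() instead.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    db = client["oak"]
    key_range = {"_id": {"$gte": "/largelystatic/", "$lt": "/largelystatic0"}}

    for name in ["nodes", "nodes_x"]:
        plan = db.command("explain",
                          {"find": name, "filter": key_range},
                          verbosity="executionStats")
        es = plan["executionStats"]
        print(name,
              "keysExamined:", es["totalKeysExamined"],
              "docsExamined:", es["totalDocsExamined"],
              "ms:", es["executionTimeMillis"])

The per-query numbers for the small collection will be lower, but as described above, the total index size across all the split collections, and the paging behaviour when your queries touch many of them, is what ends up dominating.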
