Hi,

> Currently we are storing blobs by breaking them into small chunks and
> then storing those chunks in MongoDB as part of blobs collection. This
> approach would cause issues as Mongo maintains a global exclusive
> write locks on a per database level [1]. So even writing multiple
> small chunks of say 2 MB each would lead to write lock contention.

So far we have observed high lock contention primarily when there are a
lot of updates. Inserts were not that big of a problem, because you can
batch them. It would probably be good to have a test to see how big the
impact is when blobs come into play.

> Mongo also provides GridFS[2]. However it also uses a similar strategy
> like we are currently using and such a support is built into the
> Driver. For server they are just collection entries.
>
> So to minimize contentions for write locks for use cases where big
> assets are being stored in Oak we can opt for following strategies
>
> 1. Store the blobs collection in a different database. As Mongo write
> locks [1] are taken per db level then storing the blobs in different
> db would allow the read/write of node data (majority usecase) to
> continue.

Sounds reasonable. What is the impact of such a design when it comes to
map-reduce features? I was thinking that we could use it e.g. for
garbage collection, but I don't know if this is still an option when
data is spread across multiple databases.

> 2. For more asset/binary heavy usecase use a separate database server
> itself to serve the binaries.

Connecting to a second server would add quite some complexity to the
system. Wouldn't it be easier to just leverage standard MongoDB sharding
to distribute the load?

> 3. Bring back the JR2 DataStore implementation and just save metadata
> related to binaries in Mongo. We already have S3 based implementation
> there and they would continue to work with Oak also

That was one of my initial thoughts as well, but I was wondering what
the impact of such a deployment is on data store garbage collection.

regards
 marcel

> Chetan Mehrotra
> [1] http://docs.mongodb.org/manual/faq/concurrency/#how-granular-are-locks-in-mongodb
> [2] http://docs.mongodb.org/manual/core/gridfs/
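
To make option 1 a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not Oak or MongoDB driver code. It only models the per-database write lock described in [1] and counts how many write operations compete for each database's lock. The database names `oak` and `oak_blobs`, the helper names, and the assumption of one lock acquisition per unbatched write are all made up for illustration; batched inserts would lower the blob numbers.

```python
from collections import Counter

# 2 MB, the chunk size used as an example in the quoted proposal.
CHUNK_SIZE = 2 * 1024 * 1024


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """Split a blob into fixed-size chunks; the last chunk may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def lock_acquisitions(node_writes: int, blob: bytes,
                      nodes_db: str = "oak", blobs_db: str = "oak") -> dict:
    """Count write-lock acquisitions per database.

    Simplifying assumption (hypothetical model): every node write and every
    chunk insert takes the write lock of its database exactly once.
    """
    counts = Counter()
    counts[nodes_db] += node_writes
    counts[blobs_db] += len(split_into_chunks(blob))
    return dict(counts)


if __name__ == "__main__":
    blob = bytes(5 * 1024 * 1024)  # a 5 MB blob gives three chunks

    # Blobs share the database with node data: all writes hit one lock.
    print(lock_acquisitions(10, blob))

    # Blobs in their own database: node writes no longer wait on chunk writes.
    print(lock_acquisitions(10, blob, blobs_db="oak_blobs"))
```

With the blobs collection in its own database, the chunk inserts still contend with each other, but node reads and writes only compete with other node writes.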
