Ah, yeah, this makes sense to me. I think this has great potential!

On Thu, Nov 23, 2017 at 4:56 AM Mike Rhodes <mrho...@linux.vnet.ibm.com> wrote:
>
> > On 22 Nov 2017, at 18:39, Geoffrey Cox <redge...@gmail.com> wrote:
> >
> > Hi Mike, this sounds like a pretty cool enhancement. Just to clarify,
> > you're also proposing modifying the PUT/POST doc, etc... so that you can
> > specify a shard key per doc so that the doc can be stored on a specific
> > shard?
>
> Yes, sort of. A document create request specifies a shard key as part of
> the document ID. The guarantee with respect to document placement then is:
>
> "All documents with the same shard key are stored in the same shard".
>
> By means of contrast, this *isn't* a way of saying "Put document on
> specific shard X". I don't find that ability very compelling for a user
> (why would they care that their doc was in range 000000000-abababab or
> whatever?), but introducing this grouping mechanism as a higher level
> abstraction on things meaningful within a data model I think does offer
> substantial benefit.
>
> To elaborate on why this is useful a couple use-cases might help.
>
> The first example is along the lines of using a user ID as a shard key.
> All documents for that user then end up on the same shard. A query can then
> be scoped by user ID (as it's the shard key), which means that queries for a
> single user's data can be efficiently served from a single shard rather
> than asking all shards. This would significantly improve performance of an
> application from the point of view of that user.
>
> Or, in an IoT use case, you might use the device ID as the shard key
> enabling fast retrieval of measurements from a single device.
>
> It's important to note too that a shard may store documents from many
> different shard keys, so long as the above guarantee holds. In addition,
> the shard key needs to have high cardinality and to effectively spread
> requests over the shards.
>
> An example that doesn't work is using the date as the shard key for the
> IoT case: while this has a high cardinality, at any given time, only a
> single shard will be in the write path.
>
> Mike.
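
For anyone following along, here is a rough sketch of the placement guarantee Mike describes, and of why a date makes a poor shard key. Nothing here is taken from the proposal itself: the "shardkey:dockey" ID layout with a ":" separator, the shard count of 8, the use of SHA-256 as the hash, and the names split_doc_id and shard_for are all assumptions made up for illustration.

```python
import hashlib

NUM_SHARDS = 8  # hypothetical shard count, chosen only for this example


def split_doc_id(doc_id):
    """Split an ID of the assumed form 'shardkey:dockey' into its two parts."""
    shard_key, sep, doc_key = doc_id.partition(":")
    if not sep or not shard_key or not doc_key:
        raise ValueError(f"expected 'shardkey:dockey', got {doc_id!r}")
    return shard_key, doc_key


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Pick a shard from the shard key alone, ignoring the rest of the ID.

    Hashing only the shard key is what gives the guarantee that all
    documents with the same shard key land in the same shard.
    SHA-256 is a stand-in for whatever stable hash a real system uses.
    """
    shard_key, _ = split_doc_id(doc_id)
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_shards


# Same user ID as shard key: both documents map to the same shard.
same_user = {shard_for("user42:profile"), shard_for("user42:order-1001")}

# Device ID as shard key: writes from 100 devices spread over the shards.
by_device = {shard_for(f"device{i}:2017-11-23T04:56") for i in range(100)}

# Date as shard key: every write made on that day hits a single shard.
by_date = {shard_for(f"2017-11-23:device{i}") for i in range(100)}

print(len(same_user), len(by_device), len(by_date))
```

Note that several shard keys can still hash to the same shard, which matches the point that a shard may hold documents from many different shard keys.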