I've worked with folks using partitioned databases, so I thought I'd share my experience here:

- Partitioned databases can definitely give a performance boost (in CouchDB < 4 scenarios) when the main "read" use-case can be directed to a single partition. In such cases, only a fraction of the shards are exercised in answering the query, so there are scalability benefits.
- Not everyone who wanted to migrate from non-partitioned to partitioned ended up doing so. Migrating involves mutating the document _id, and replication can't help. On top of that, having to rethink indexing and access patterns is too much for some. It seemed much better suited to "green field" projects.
- In some cases partitioned databases made performance worse, by directing a large proportion of traffic to one or a handful of partitions. This may not be obvious at the design stage; you only find out when real-world traffic arrives!
- It would have been nice to have a "per partition changes feed", which would allow a "one partition per user" model, with all the data in the same database for reporting purposes.

On Mon, 11 May 2020 at 12:35, Garren Smith <gar...@apache.org> wrote:

> Coming back to this. I still think we should support it fully in 4.x so
> that anyone using it in 3.x will not experience any api changes when moving
> to 4.x. Once we have had more people use it in 3.x we can make a call on
> deprecating it for 5.x or look at adding more features to it.
>
> On Tue, Apr 21, 2020 at 11:01 PM Robert Samuel Newson <rnew...@apache.org>
> wrote:
>
> > On Adam's point that the partitioned query api encourages good choices
> > ("discourages hot spots"), that's only true for folks that read the
> > documentation, which in my experience is a low percentage of folks. I've
> > encountered a heavy user of partitioned dbs that had precisely four
> > partitions in mind, for millions of docs (They chose "doc_type" as their
> > partition value).
> >
> > My view for 4.0 is;
> >
> > 1) ignore the partitioned flag when creating databases
> >
> I don't think we should ignore it.
>
> > 2) the "partitioned" property no longer reported in GET /dbname
> >
> I would prefer we report the partitioned flag. It seems confusing to not
> report a setting a user intentionally set.
>
> > 3) the various _partition endpoints still work
> > 4) all views work either "global" or "partitioned" depending on the
> > endpoint used.
> >
> > for 5.0 I'm +0 on removing the _partition endpoints, but we can take that
> > vote at the time based on contemporary feedback.
> >
> > B.
> >
> > > On 21 Apr 2020, at 21:35, Robert Samuel Newson <rnew...@apache.org>
> > > wrote:
> > >
> > > Hi,
> > >
> > > Good points on both sides of this. One thing we can hopefully get
> > > agreement on is the ?partitioned=true flag on creation and, deeper, the
> > > lack of distinction between the two "types" of database going forward?
> > >
> > > B.
> > >
> > >> On 21 Apr 2020, at 18:51, Garren Smith <gar...@apache.org> wrote:
> > >>
> > >> I'm on the fence when it comes to removing it. In terms of the original
> > >> plan of making querying faster by querying fewer shards that obviously
> > >> isn't needed. But I think it does create a nice mental model/design
> > >> pattern when building an application in CouchDB. Splitting your data into
> > >> partitions that contain similar documents makes sense. And once we on FDB
> > >> it would be awesome to see if we could have a changes feed per partition.
> > >> That would be a really nice feature.
> > >>
> > >> Cheers
> > >> Garren
> > >>
> > >> On Tue, Apr 21, 2020 at 5:51 PM Adam Kocoloski <kocol...@apache.org> wrote:
> > >>
> > >>> I think it’s difficult to make a call when 3.0 is still so new.
> > >>>
> > >>> The case for deprecation here is basically less code to maintain, right?
> > >>> It’s not like a user of partitioned databases is causing pain for an
> > >>> FDB-based CouchDB; if anything, there’s a second-order benefit because
> > >>> the partitioning discourages hot spots from forming in the
> > >>> (range-partitioned) FDB keyspace.
> > >>>
> > >>> Cheers, Adam
> > >>>
> > >>>> On Apr 20, 2020, at 11:51 PM, Kyle Snavely <kjsnav...@gmail.com> wrote:
> > >>>>
> > >>>> My two cents is the same. Let's allow 3.* users migrate to 4.* without
> > >>>> needing to e.g. change the PQ part of their application and remove the
> > >>>> PQ endpoints in 5.0.
> > >>>>
> > >>>> Best,
> > >>>> Kyle
> > >>>>
> > >>>> On Mon, Apr 20, 2020, 4:16 PM Ilya Khlopotov <iil...@apache.org> wrote:
> > >>>>
> > >>>>> Given that it unlikely that there are too many people using it and it is
> > >>>>> being noop in FDB world. I think we should deprecate and remove
> > >>>>> _partition endpoint.
> > >>>>>
> > >>>>> On 2020/04/20 21:04:58, Robert Samuel Newson <rnew...@apache.org> wrote:
> > >>>>>> Hi All,
> > >>>>>>
> > >>>>>> I'd like to get views on whether we should preserve the _partition
> > >>>>>> endpoints in CouchDB 4.0 or remove them. In CouchDB 4.0 all _view and
> > >>>>>> _find queries will automatically benefit from the same performance
> > >>>>>> boost that the "partitioned database" feature brings, by virtue of
> > >>>>>> FoundationDB.
> > >>>>>>
> > >>>>>> If we're preserving it, are we also deprecating it (so it's gone in 5.0)?
> > >>>>>>
> > >>>>>> If we're ditching it, what will the endpoint return instead (404 Not
> > >>>>>> Found, 410 Gone?)
> > >>>>>>
> > >>>>>> Thoughts welcome.
> > >>>>>>
> > >>>>>> B.
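
To make the migration and hot-partition points in my reply more concrete, here is a rough sketch of what "mutating the document _id" looks like in practice. It assumes the CouchDB 3.x convention that a document in a partitioned database has an _id of the form "<partition>:<doc key>". The helper names (to_partitioned_doc, partition_histogram) and the sample "user" field are made up for illustration, and the validation rules are my assumption of what a usable partition key looks like, not an exhaustive list. It does no network I/O; writing the rewritten documents into a new partitioned database is left out.

```python
from collections import Counter
from typing import Callable, Iterable


def to_partitioned_doc(doc: dict, partition_key: Callable[[dict], str]) -> dict:
    """Return a copy of doc with _id rewritten to '<partition>:<old _id>'.

    _rev is dropped because the copy is a brand-new document in the
    target database, which is why replication cannot do this for you.
    """
    partition = partition_key(doc)
    # Assumed rules: non-empty, no ':' (the first colon separates the
    # partition from the doc key), and no leading underscore.
    if not partition or ":" in partition or partition.startswith("_"):
        raise ValueError(f"unusable partition key: {partition!r}")
    new_doc = {k: v for k, v in doc.items() if k != "_rev"}
    new_doc["_id"] = f"{partition}:{doc['_id']}"
    return new_doc


def partition_histogram(docs: Iterable[dict]) -> Counter:
    """Count documents per partition, to spot skew before going live."""
    return Counter(doc["_id"].split(":", 1)[0] for doc in docs)


if __name__ == "__main__":
    # Hypothetical sample data, partitioned by a "user" field.
    docs = [
        {"_id": "a1", "_rev": "1-x", "user": "alice"},
        {"_id": "b1", "user": "bob"},
        {"_id": "a2", "user": "alice"},
    ]
    migrated = [to_partitioned_doc(d, lambda d: d["user"]) for d in docs]
    print(partition_histogram(migrated).most_common())
```

Running the histogram over a sample of real documents with a candidate partition key (for example "doc_type", as in the case Robert mentions) shows how many partitions you would actually get and how uneven they are. It only measures document counts, though, so traffic skew can still surprise you once real queries arrive.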