Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

Robert Newson Thu, 31 Jan 2019 06:10:25 -0800

Hi Nick,

I don't think anyone responded to your points yet.

I think it would significantly complicate this work to make it a per-database 
decision. I think it has to be a wholesale cutover to a new backend with 
appropriate warnings in release notes and guidance on migration. This is why 
the plan is for a major 3.0 release before the fdb-based release (likely to be 
4.0) as a pathway to that. To your specific point about the pluggable storage 
engine, I believe none of that code makes it over to the couchdb-on-fdb release.

I called out the problems with reduce functionality in the first post of this 
thread specifically to shake out people's concerns there, so thank you for 
voicing yours. The current approach to reduce only works because we control the 
writing of the b+tree nodes, including when they're split, etc, so we're able 
to maintain intermediate values on the inner nodes as the data changes over 
time. This is not something we can do with FoundationDB directly (or, indeed, 
anything else). We're looking for a solution here. The best we have so far 
preserves the group level reduces (including group level of 0, i.e., the reduce 
value of everything in the view). Those group level reduces will be at least as 
efficient as today, I think. For arbitrary start/endkey reduce we might decide 
to not support them, or to support them the expensive way (without the benefit 
of precomputed intermediate values). 
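
As a rough illustration of the trade-off described above, here is a minimal 
Python sketch. It is not CouchDB or FoundationDB code: the class 
GroupLevelView and its methods are hypothetical names, it assumes keys are 
tuples, and it only handles inserts (updates and deletes are left out). It 
keeps one precomputed reduce value per key prefix, so group level queries are 
lookups, while an arbitrary start/endkey reduce has to re-run the reduce 
function over every row in the range.

```python
from bisect import insort


class GroupLevelView:
    """Toy view index: sorted rows plus one precomputed reduce value per key prefix."""

    def __init__(self, reduce_fn, rereduce_fn, max_level):
        self.reduce_fn = reduce_fn      # reduces a list of emitted values
        self.rereduce_fn = rereduce_fn  # combines already-reduced values
        self.max_level = max_level      # deepest group level that is maintained
        self.rows = []                  # sorted list of (key_tuple, value)
        self.groups = {}                # key prefix (tuple) -> reduced value

    def insert(self, key, value):
        """Store a row and fold it into every group level it belongs to."""
        insort(self.rows, (key, value))
        partial = self.reduce_fn([value])
        # Level 0 is the empty prefix, i.e. the reduce of the whole view.
        for level in range(min(self.max_level, len(key)) + 1):
            prefix = key[:level]
            if prefix in self.groups:
                self.groups[prefix] = self.rereduce_fn([self.groups[prefix], partial])
            else:
                self.groups[prefix] = partial

    def group_reduce(self, level):
        """Cheap path: read back the precomputed values for one group level."""
        return {p: v for p, v in self.groups.items() if len(p) == level}

    def range_reduce(self, startkey, endkey):
        """Expensive path: reduce every row in [startkey, endkey] from scratch."""
        values = [v for k, v in self.rows if startkey <= k <= endkey]
        return self.reduce_fn(values)


view = GroupLevelView(sum, sum, max_level=2)
view.insert((2019, 1), 5)
view.insert((2019, 2), 7)
view.insert((2018, 12), 3)

total = view.group_reduce(0)                       # whole-view reduce
by_year = view.group_reduce(1)                     # one value per year
window = view.range_reduce((2018, 12), (2019, 1))  # no precomputed help
```

The sketch uses sum for both reduce and rereduce to stand in for a custom 
reduce function. What it does not capture is the b+tree case, where 
intermediate values on the inner nodes let a range reduce skip most rows.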

B.

-- 
  Robert Newson
  rnew...@apache.org

On Sun, 27 Jan 2019, at 14:45, nicholas a. evans wrote:
> Thanks Jan,
> 
> On Sun, Jan 27, 2019, 3:43 AM Jan Lehnardt <m...@jan.io wrote:
> 
> > The FDB proposal is starting at a higher level than the pluggable storage
> > engines. This isn't just about storage, but also about having a new
> > abstraction over the distributed systems aspects of CouchDB.
> >
> 
> Right, FoundationDB would *also* replace all of the
> internal-cluster-replication, since it is already a scalable distributed
> database. I was just curious if we'd be able to leverage the pluggable
> storage engine work. I.e. could the other parts that change (fabric, etc)
> *also* be swapped out such that foundationdb or legacy
> couch_bt_engine+fabric could be selected on a per db basis. Maybe not, but
> IMO it'd still be interesting if we could somehow try out a foundationdb
> pluggable storage engine as a proof of concept.
> 
> > As for reduce: CouchDB will *not* lose reduce. Details are TBD, so let's
> > wait to discuss them for when the technical proposal for that part is out,
> > please.
> >
> 
> > So far, all IBM has mentioned is that in their preliminary exploration of
> > this, they couldn't find a trivial way to support *efficient querying* for
> > *custom reduce functions* (anything that isn't _sum/_count/_stats).
> >
> 
> Yes, I understand all of that. But I *really* *really* need efficient
> querying for custom reduce functions. Ideally, I'd like group_level queries
> to be even *more* efficient than they currently are.
> 
> I wasn't trying to jump into a deep technical proposal. I was just putting
> forward a naive napkin sketch level idea, and wondering if it could
> possibly work as a trivial way to support efficient queries on custom
> reduce functions.
> 
> I either need that to stay (or improve) or else I need some new
> features (i.e. view changes feed, dbcopy, etc) that make it easier and
> worthwhile to rewrite all of my code that relies on it. Because if I had to
> significantly change my codebase to migrate to CouchDB 4.0, I
> honestly think my bosses might opt to replace our storage layer with
> something other than CouchDB rather than upgrade.
> 
> Thanks,
> Nick Evans
> 
> >
