davisp opened a new pull request #496: Couchdb 3287 pluggable storage engines
URL: https://github.com/apache/couchdb/pull/496
 
 
   ## Overview
   
   Pluggable Storage Engines (Oh my!)
   
   I've finally covered enough bases to start asking for reviews on the 
pluggable storage engine work. I know that this is a fairly large change so I 
don't expect this to actually be merged for a number of weeks. Anyone who 
wants to review any part of this should feel free to take their time and be thorough.
   
   I highly suggest reviewing this work a commit at a time as there are a few 
fairly large commits. A few signposts for the intrepid reader:
   
   1. Add couch_db_engine module
   
   This is a single file addition that just introduces the new pluggable 
storage engine API. It's fairly thoroughly documented, and hopefully the level 
that this API is geared at makes sense. When I initially started this work 
I spent a lot of time trying to figure out what level of the API would be the 
best place to separate the storage engine from database logic. Go too high and 
every engine is reimplementing core bits of logic. Go too low and there's not 
enough room for interesting changes to the actual storage engine algorithms. I 
believe this is a happy middle ground that gives storage engines the room to 
play and invent alternative implementations while also not requiring an 
extremely large amount of reimplementation for various bits of behavior that 
are required.
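   
   To illustrate the general shape of such a split, here is a minimal sketch of 
an Erlang behaviour that database logic could dispatch through. Everything in it 
is hypothetical: the module name `toy_db_engine`, the three callbacks, and their 
signatures are simplified stand-ins chosen for illustration, not the actual 
couch_db_engine API, which is documented in the commit itself.
   
```erlang
%% Hypothetical, heavily simplified sketch. NOT the real couch_db_engine API.
-module(toy_db_engine).
-export([init/3, open_docs/2, write_docs/2]).

%% Callbacks that every (toy) storage engine module must implement.
-callback init(Path :: string(), Options :: [term()]) ->
    {ok, EngineState :: term()}.
-callback open_docs(EngineState :: term(), DocIds :: [binary()]) ->
    [term() | not_found].
-callback write_docs(EngineState :: term(), Docs :: [{binary(), term()}]) ->
    {ok, NewEngineState :: term()}.

%% Database logic holds an {Engine, EngineState} pair and only ever
%% reaches storage through these wrappers.
init(Engine, Path, Options) ->
    {ok, St} = Engine:init(Path, Options),
    {ok, {Engine, St}}.

open_docs({Engine, St}, DocIds) ->
    Engine:open_docs(St, DocIds).

write_docs({Engine, St}, Docs) ->
    {ok, NewSt} = Engine:write_docs(St, Docs),
    {ok, {Engine, NewSt}}.
```
   
   In a layout like this, callers never touch storage details directly, so 
swapping engines comes down to passing a different module name to `init/3`.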
   
   2. Add legacy storage engine implementation
   
   This commit is quite big but it's important to note that all it's doing is 
copying the existing parts of couch_db.erl and couch_db_updater.erl and moving 
them to a new set of modules named couch_bt_engine_*.erl. Nothing uses this 
code yet. I have this in its own commit since it's fairly large. However, if you 
mostly just want to spot-check it, you should be able to see that the 
implementation is just pulled from existing functions. The behavior of this 
engine is identical to the "pre-PSE" engine because that's what it is. It's just 
been reformatted a bit and had its name changed.
   
   Also, I've kept this and couch_db_engine.erl in the main couch application 
since we'll always want to have at least one storage engine. That decision 
predates the monorepo, though, when adding another git repo to our builds seemed 
awfully heavy. Now that we're a monorepo, we're just creating a folder, which is 
easy enough. I'd like reviewers to keep this in mind, as one thing I expect to 
discuss is whether we should split this out into its own application.
   
   3. Implement pluggable storage engines
   
   This is the doozy. There are two main bits of work going on in this commit. 
First, the removal from couch_db and couch_db_updater of all the code that was 
copied in the previous commit and second, the addition of all the code to have 
couch_db and couch_db_updater start using the couch_db_engine APIs. It's long 
and big, but really, if you take your time there's nothing magical going on 
here. As for the core bits to study, I'd recommend spending some extra time on 
couch_server to see how engines are configured and chosen at runtime.
   
   4. Add storage engine test suite
   
   This is a fairly complete test suite for the entire storage engine API. I 
don't remember what the coverage of couch_bt_engine was, but I know it's fairly 
high. The useful bit is that devs creating new storage engines can just pull 
this in with a single eunit case to test their implementations. 
I'll show a few examples of this below.
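   
   As a rough idea of what "a single eunit case" can look like, here is a 
self-contained sketch of a shared suite parameterized by an engine module. All 
names in it are hypothetical (`toy_engine_tests`, `create_tests/1`, and the 
`init/1`, `write/3`, `read/2` callbacks); it is not the test suite added in this 
commit, and the toy in-memory engine lives in the same module only to keep the 
example in one file.
   
```erlang
%% Hypothetical sketch of an engine-agnostic EUnit suite.
%% NOT the suite from this PR; all names are made up for illustration.
-module(toy_engine_tests).
-include_lib("eunit/include/eunit.hrl").
-export([init/1, write/3, read/2]).

%% Toy in-memory "engine" (kept in this module for brevity).
init(_Options) -> {ok, #{}}.
write(St, Id, Doc) -> {ok, St#{Id => Doc}}.
read(St, Id) -> maps:get(Id, St, not_found).

%% Shared suite: returns a list of EUnit tests that exercise whichever
%% module is passed in as Engine.
create_tests(Engine) ->
    [
        {"reading a missing doc returns not_found", fun() ->
            {ok, St} = Engine:init([]),
            ?assertEqual(not_found, Engine:read(St, <<"a">>))
        end},
        {"a written doc can be read back", fun() ->
            {ok, St0} = Engine:init([]),
            {ok, St1} = Engine:write(St0, <<"a">>, <<"body">>),
            ?assertEqual(<<"body">>, Engine:read(St1, <<"a">>))
        end}
    ].

%% The single test generator an engine author would need to add.
toy_engine_test_() ->
    create_tests(?MODULE).
```
   
   With a structure along these lines, a new engine would reuse the whole 
suite by calling the generator with its own module name instead of `?MODULE`.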
   
   5. Ensure deterministic revisions for attachments
   
   I may end up squashing this into the implementation commit. This was a fix 
I made while developing PSE that I ended up refixing after the fact slightly 
differently. The original goal was to make sure everything compiled and all 
test suites ran for each commit. I think I've just convinced myself to squash 
this. So if it's missing, it's because I did that and then forgot to edit this 
description to remove this paragraph...
   
   ## Testing recommendations
   
   $ make check
   
   ## JIRA issue number
   
   https://issues.apache.org/jira/browse/COUCHDB-3287
   
   ## Related Pull Requests
   
   This PR is based on the previous mixed cluster upgrade PR. However, when 
merge time comes around, we'll merge #495 first and then merge this into master, 
so that users can deploy those changes before these changes. 
(Uptime is fun time).
   
   https://github.com/apache/couchdb/pull/495
   
   ## Checklist
   
   - [ ] Code is written and works correctly;
   - [ ] Changes are covered by tests;
   - [ ] Documentation reflects the changes;
   
   (We should probably remove the todo item for updating rebar.config since 
we're doing almost all work on the monorepo now)
   
 
