On Apr 23, 2010, at 10:12 AM, Miles Fidelman wrote:

> Adam,
>
> Adam Kocoloski wrote:
>> On Apr 23, 2010, at 8:52 AM, Miles Fidelman wrote:
>>
>>> - notes on the replication process (step-by-step, what happens when
>>> replication is invoked - what code modules are involved and so forth),
>>> and/or,
>>>
>> couch_rep_* modules handle replication. How familiar are you with
>> Erlang/OTP? couch_rep_sup is a supervisor for all replications, each of
>> which has a couch_rep gen_server and changes_feed, missing_revs, reader, and
>> writer processes. Each of those processes handles one part of the
>> "conversation" on the slide I pointed out to you two days ago. Data flows
>> from changes_feed -> missing_revs -> reader -> writer.
>>
>
> Pretty familiar with Erlang at a conceptual/system level; starting to take
> the time to get fluent in programming. Haven't done functional languages in
> a long time.
>
>>> - an overview of the code for someone new to the project - what lives in
>>> what modules, how they string together - anything that might shortcut
>>> having to read through every module and make sense of things from scratch
>>>
>>> Anything - handwritten notes, slides from a code walkthrough, that kind of
>>> thing.
>>>
>> Hi Miles, not to sound critical, but I don't think such a broad request will
>> get you very far. If you have specific questions I'll be happy to answer
>> them.
>>
> With all due respect... lots of projects maintain documentation of internals,
> particularly efforts focused on platform technologies intended for long-term
> and broad-based application. Certainly in the world of commercial software
> development it's the rare project that doesn't have documentation providing a
> high level view of a large software system -- it's pretty hard to either
> bring new team members on board, or to perform long-term maintenance of code.
> Granted that it's a bit harder to maintain this level of documentation on
> open-source projects without steady funding, but I will point at some
> examples:
> - Linux Kernel Internals: somewhat old (2.4), but
> http://tldp.org/LDP/lki/index.html (I know there are updates)
> - Apache HTTPD: http://httpd.apache.org/docs/2.2/developer/
> - MongoDB, documentation of replication internals:
> http://www.mongodb.org/display/DOCS/Replication+Internals
> - or even http://wiki.github.com/erlang/otp/routemap-source-tree - providing
> a basic overview of Erlang's internals
>
>> Please, take a shot at reading the code for the part you're interested in.
>> If you come across something you don't understand, send an email or join
>> #couchdb on IRC. Many of the devs hang out there regularly and can walk you
>> through the code. Best,
>>
>
> It doesn't seem that unreasonable to at least ask whether Couch has some
> similar documentation floating around - if only at the level of notes put
> together by an individual developer, or for discussion among developers.
>
> Couch is certainly aiming at long-term viability as a platform for
> broad-based use, and seems to be aiming at being a broad-based open-source
> effort. To succeed over the long term, it will NEED to have a good set of
> developer-level documentation. "Read the code" is not a long-term solution.
>
> Re. replication, in specific, the couch_rep_* modules do not contain much
> in the way of comments.
>
> Personally, I've been involved in a LOT of network protocol-related work
> (BBN, back to the ARPANET days). I've yet to see any kind of protocol work
> where someone hasn't jotted down at least a sequence diagram and some kind of
> dataflow diagram showing how all the pieces fit together. More common is a
> full-blown ASN.1 description, and eventually an RFC in full gory detail.
>
> It does not seem unreasonable to ask if someone has jotted down notes about
> the full set of steps executed, and code modules involved, when Couch
> receives a "POST /_replicate" transaction.
>
> At the very least, it sure would be helpful to have something like:
> http://httpd.apache.org/docs/2.2/developer/request.html, or
> http://www.apachetutor.org/dev/request
> to detail the sequence of events and code involved in request processing.
>
> If, in fact, that kind of information has never been put on "paper," and
> lives only in the source code and a few people's heads, that scares me a lot
> vis-a-vis committing to Couch as a platform for any kind of serious project.
>
> Miles Fidelman
Hi Miles,

I wasn't calling your request unreasonable, and I wasn't vouching for reading the code as the optimal source of developer documentation. But it is what we have right now when you want to learn about things at module-level granularity.

In terms of broader architectural overviews, you may find Ricky Ho's set of articles useful:

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Regards, Adam
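
For readers following the thread, the data flow quoted above (changes_feed -> missing_revs -> reader -> writer) can be pictured as a chain of stages, each consuming the previous stage's output. The sketch below is a toy model in Python, not CouchDB code: the in-memory `Database` type, the function bodies, and the `replicate` helper are all assumptions made for illustration. Only the four stage names and their order come from the description in this thread.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical in-memory "database": doc id -> (revision, body).
# This stands in for a real source or target and is purely illustrative.
Database = Dict[str, Tuple[str, dict]]


def changes_feed(source: Database) -> Iterator[Tuple[str, str]]:
    """Stage 1: announce (doc_id, rev) for every document in the source."""
    for doc_id, (rev, _body) in sorted(source.items()):
        yield doc_id, rev


def missing_revs(changes: Iterable[Tuple[str, str]],
                 target: Database) -> Iterator[Tuple[str, str]]:
    """Stage 2: keep only the revisions the target does not already hold."""
    for doc_id, rev in changes:
        current = target.get(doc_id)
        if current is None or current[0] != rev:
            yield doc_id, rev


def reader(missing: Iterable[Tuple[str, str]],
           source: Database) -> Iterator[Tuple[str, str, dict]]:
    """Stage 3: fetch the document bodies for the missing revisions."""
    for doc_id, rev in missing:
        _rev, body = source[doc_id]
        yield doc_id, rev, body


def writer(docs: Iterable[Tuple[str, str, dict]], target: Database) -> int:
    """Stage 4: store the fetched documents in the target; return the count."""
    written = 0
    for doc_id, rev, body in docs:
        target[doc_id] = (rev, dict(body))
        written += 1
    return written


def replicate(source: Database, target: Database) -> int:
    """Chain the four stages in the order named in the thread."""
    return writer(reader(missing_revs(changes_feed(source), target), source),
                  target)
```

In this model, calling `replicate(source, target)` a second time writes nothing, because the missing_revs stage filters out every revision the target already has. The real modules run as separate Erlang processes under a supervisor; the generators here only mimic the direction of the data flow.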