On Apr 23, 2010, at 10:12 AM, Miles Fidelman wrote:

> Adam,
> 
> Adam Kocoloski wrote:
>> On Apr 23, 2010, at 8:52 AM, Miles Fidelman wrote:
>>   
>>> - notes on the replication process (step-by-step, what happens when 
>>> replication is invoked - what code modules are involved and so forth), 
>>> and/or,
>>>     
>> couch_rep_* modules handle replication.  How familiar are you with 
>> Erlang/OTP?  couch_rep_sup is a supervisor for all replications, each of 
>> which has a couch_rep gen_server and changes_feed, missing_revs, reader, and 
>> writer processes.  Each of those processes handles one part of the 
>> "conversation" on the slide I pointed out to you two days ago.  Data flows 
>> from changes_feed ->  missing_revs ->  reader ->  writer.
>>   
> 
> Pretty familiar with Erlang at a conceptual/system level; starting to take 
> the time to get fluent in programming.  Haven't done functional languages in 
> a long time.
> 
>>> - an overview of the code for someone new to the project - what lives in 
>>> what modules, how they string together - anything that might shortcut 
>>> having to read through every module and make sense of things from scratch
>>> 
>>> Anything - handwritten notes, slides from a code walkthrough, that kind of 
>>> thing.
>>>     
>> Hi Miles, not to sound critical, but I don't think such a broad request will 
>> get you very far.  If you have specific questions I'll be happy to answer 
>> them.
>>   
> With all due respect... lots of projects maintain documentation of internals, 
> particularly efforts focused on platform technologies intended for long-term 
> and broad-based application.  Certainly in the world of commercial software 
> development it's the rare project that doesn't have documentation providing a 
> high level view of a large software system -- it's pretty hard to either 
> bring new team members on board, or to perform long-term maintenance of code. 
>  Granted that it's a bit harder to maintain this level of documentation on 
> open-source projects without steady funding, but I will point at some 
> examples:
> - Linux Kernel Internals: somewhat old (2.4), but 
> http://tldp.org/LDP/lki/index.html (I know there are updates)
> - Apache HTTPD: http://httpd.apache.org/docs/2.2/developer/
> - MongoDB, documentation of replication internals: 
> http://www.mongodb.org/display/DOCS/Replication+Internals
> - or even http://wiki.github.com/erlang/otp/routemap-source-tree - providing 
> a basic overview of Erlang's internals
> 
>> Please, take a shot at reading the code for the part you're interested in.  
>> If you come across something you don't understand, send an email or join 
>> #couchdb on IRC.  Many of the devs hang out there regularly and can walk you 
>> through the code.  Best,
>>   
> 
> It doesn't seem that unreasonable to at least ask whether Couch has some 
> similar documentation floating around - if only at the level of notes put 
> together by an individual developer, or for discussion among developers.
> 
> Couch is certainly aiming at long-term viability as a platform for 
> broad-based use, and seems to be aiming at being a broad-based open-source 
> effort.  To succeed over the long term, it will NEED to have a good set of 
> developer-level documentation.  "Read the code" is not a long-term solution.
> 
> Re. replication, in particular, the couch_rep_* modules do not contain much 
> in the way of comments.
> 
> Personally, I've been involved in a LOT of network protocol-related work 
> (BBN, back to the ARPANET days).  I've yet to see any kind of protocol work 
> where someone hasn't jotted down at least a sequence diagram and some kind of 
> dataflow diagram showing how all the pieces fit together.  More common is a 
> full-blown ASN.1 description, and eventually an RFC in full gory detail.
> 
> It does not seem unreasonable to ask if someone has jotted down notes about 
> the full set of steps executed, and code modules involved, when Couch 
> receives a "POST /_replicate" transaction.
> 
> At the very least, it sure would be helpful to have something like:
> http://httpd.apache.org/docs/2.2/developer/request.html, or
> http://www.apachetutor.org/dev/request
> to detail the sequence of events and code involved in request processing.
> 
> If, in fact, that kind of information has never been put on "paper," and 
> lives only in the source code and a few people's heads, that scares me a lot 
> vis-a-vis committing to Couch as a platform for any kind of serious project.
> 
> Miles Fidelman

Hi Miles, I wasn't calling your request unreasonable, and I wasn't vouching for 
reading the code as the optimal source of developer documentation.  But it is 
what we have right now when you want to learn about things at module-level 
granularity.

In terms of broader architectural overviews, you may find Ricky Ho's set of 
articles useful:

http://horicky.blogspot.com/2008/10/couchdb-implementation.html

Regards, Adam
