[jira] [Commented] (COUCHDB-1342) Asynchronous file writes

Paul Joseph Davis (Commented) (JIRA) Wed, 16 Nov 2011 17:42:19 -0800

    [ 
https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151723#comment-13151723
 ]


Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------

@Damien

That's an awful lot of disappointment packed into a single comment. First, 
resorting to an ad hominem attack to insinuate that I'm not intelligent enough 
to work on databases is quite disconcerting. Secondly, it's an egregious fallacy 
to suggest that because a patch appears to be technically correct that it 
should be committed. Thirdly, declaring what is and isn't a valid reason to 
hold up a patch is not how the ASF works.

And now back to the regularly scheduled technical discussion.

First, couch_file:flush/1. Unless I'm missing something extremely subtle here, 
its existence is so that clients can read their own writes. Yet the couch_file 
gen_server has all the knowledge it needs to know if it has to flush to service 
a read call. If the requested read position is between #file.eof and #file.eof 
+ #file.queued_write_bytes, then it can call flush and move on with its life. 
Not only does this mean that clients don't have to remember to call flush, but 
it removes the unnecessary message passing that every unconditional call to 
flush would generate.
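
As a rough sketch of that check (the module and function names are 
hypothetical and not taken from the patch; it assumes #file.eof is the number 
of bytes already on disk and #file.queued_write_bytes is what is still waiting 
in the writer):

```erlang
-module(flush_on_read_sketch).
-export([needs_flush/3]).

%% Eof    : bytes already written to disk (assumed meaning of #file.eof)
%% Queued : bytes accepted but not yet written
%%          (assumed meaning of #file.queued_write_bytes)
%% Pos    : offset the client asked to read from
%%
%% Returns true only when Pos falls inside the region that is still queued,
%% which is the one case where the server has to flush before reading.
needs_flush(Eof, Queued, Pos) ->
    Pos >= Eof andalso Pos < Eof + Queued.
```

The read handler would call something like this first and flush only when it 
returns true, so reads of data already on disk never pay for a flush.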

Second, this is doubling the number of file descriptors required for anything 
that isn't a database. On the first production machine I checked that's an 
increase of 75% from 40K to 70K file descriptors. That's a fairly serious 
change that ought to be discussed. At the very least it ought to be mentioned 
somewhere so ops teams know to expect it.

Third, this is spawning long-lived processes that aren't looping on exported 
functions. After two code upgrades this would crash every couch_file in the VM 
simultaneously.
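
For reference, a minimal sketch of a loop that does survive upgrades (module 
name and message shapes are made up for illustration). The VM keeps only two 
versions of a module, and a process still running the older one is killed when 
that version is purged. A fully qualified call to an exported function moves 
the process onto the newest version each time around the loop:

```erlang
-module(writer_loop_sketch).
-export([start/0, loop/1]).

%% Spawn via module/function/args rather than a fun, so the process does
%% not hold a reference to one specific version of the code.
start() ->
    spawn(?MODULE, loop, [[]]).

%% Chunks is the list of queued writes, newest first.
loop(Chunks) ->
    receive
        {write, Chunk} ->
            %% Fully qualified call: picks up the current module version.
            ?MODULE:loop([Chunk | Chunks]);
        {flush, From} ->
            From ! {flushed, lists:reverse(Chunks)},
            ?MODULE:loop([])
    end.
```

A plain loop(...) call in the same place would keep the process in whatever 
version it started with.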

Fourth, as I've mentioned numerous times before, the proper way to 
synchronously start a process that might fail to initialize is to use 
proc_lib:start_link and proc_lib:init_ack.
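
A minimal sketch of that pattern, assuming a hypothetical writer process that 
opens a file in init (none of these names come from the patch):

```erlang
-module(init_ack_sketch).
-export([start_link/1, init/2, loop/1]).

%% Blocks until init/2 calls proc_lib:init_ack/2 (or the process dies),
%% and returns whatever value was passed to init_ack.
start_link(Path) ->
    proc_lib:start_link(?MODULE, init, [self(), Path]).

init(Parent, Path) ->
    case file:open(Path, [append, raw, binary]) of
        {ok, Fd} ->
            %% Initialization worked: tell the parent, then start looping.
            proc_lib:init_ack(Parent, {ok, self()}),
            ?MODULE:loop(Fd);
        {error, Reason} ->
            %% Initialization failed: report it and exit normally, so the
            %% caller sees {error, Reason} instead of hanging or crashing.
            proc_lib:init_ack(Parent, {error, Reason})
    end.

loop(Fd) ->
    receive
        {write, Bin} ->
            ok = file:write(Fd, Bin),
            ?MODULE:loop(Fd);
        stop ->
            file:close(Fd)
    end.
```

The caller gets {ok, Pid} or {error, Reason} back from start_link/1 without 
needing any hand-rolled message exchange to find out whether the open worked.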

Fifth, has anyone considered using a write buffer outside of the couch_file API 
that would allow clients more precise control? For instance, thinking briefly 
on the view updater, you could buffer writes for a single add_remove call. This 
also leads to the possibility that mostly-read views aren't needlessly holding 
open a writer fd.


                
> Asynchronous file writes
> ------------------------
>
>                 Key: COUCHDB-1342
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>             Project: CouchDB
>          Issue Type: Improvement
>          Components: Database Core
>            Reporter: Jan Lehnardt
>             Fix For: 1.3
>
>         Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to the process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and writes them to the file (batching them
> when possible). After issuing a series of write requests
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes use of the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit: 
> https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554

