[
https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13151430#comment-13151430
]
Paul Joseph Davis commented on COUCHDB-1342:
--------------------------------------------
> These are deliberate. Currently we are maintaining an Fd/couch_file pair for
> reads and writes in a bit of a clumsy way in the #btree.fd record and the
> couch_file module and do a switcheroo for writes vs. reads. Now all of that
> is moved into couch_file and couch_db doesn't have to concern itself with the
> details of efficient file access.
Oh I hadn't noticed that weirdness before, but I see it now.
> Good point, API-wise this is a bit leaky and we should definitely look into
> making this better, but I don't think this blocks the bulk of the
> improvements that this patch introduces. I'm happy to open a new ticket about
> this. Good starting points are calling flush() inside couch_file 1) after
> writing a header and 2) if a read fails with {error, eof} (fwiw, Damien tried
> the latter before but removed it again, the details elude me).
I disagree fairly strongly on this point. couch_file is about as core of an API
as it gets. couch_file:flush/1 shouldn't even exist as far as I'm concerned.
This isn't a mediocre API that should be improved later, its a bad API that
shouldn't go in at all. Its 100% foot-gun.
> The difference is that delayed_write does buffering whereas we just want to
> keep writing to the file all the time. delayed_write would not write until
> the buffer is full or the timeout is hit. In addition, we wouldn't be able to
> tell for delayed_commits=false writes when data hit the file reliably.
Oh right, I remember that discussion about clearing the buffer being an issue
now. Good call.
> The whole point of this patch is to make couch_file smarter and move the
> reader/writer abstraction further down the stack and reap the associated
> performance benefits. Unless we have an alternate patch, we can't really
> compare this.
Code Review 2.0:
Q: "This looks complicated, can we think about trying to simplify some of the
organization?"
A: "No. Write it yourself if you care that much."
I don't need another patch to know that this one is complicated and could be
less complicated.
> This doesn't change the error scenarios. We already rely on monitoring to
> tell the request process a couch_file write didn't work
You're pointing at code that's about three abstractions away from couch_file's
writer loop. Suffice it to say, erlang:monitor/2 on a random process in the
write path does little to assuage my fears that we're entering dangerous
territory relying on our own file writing buffers.
> Asynchronous file writes
> ------------------------
>
> Key: COUCHDB-1342
> URL: https://issues.apache.org/jira/browse/COUCHDB-1342
> Project: CouchDB
> Issue Type: Improvement
> Components: Database Core
> Reporter: Jan Lehnardt
> Fix For: 1.3
>
> Attachments: COUCHDB-1342.patch
>
>
> This change updates the file module so that it can do
> asynchronous writes. Basically it replies immediately
> to process asking to write something to the file, with
> the position where the chunks will be written to the
> file, while a dedicated child process keeps collecting
> chunks and write them to the file (and batching them
> when possible). After issuing a series of write request
> to the file module, the caller can call its 'flush'
> function which will block the caller until all the
> chunks it requested to write are effectively written
> to the file.
> This maximizes the IO subsystem, as for example, while
> the updater is traversing and modifying the btrees and
> doing CPU bound tasks, the writes are happening in
> parallel.
> Originally described at http://s.apache.org/TVu
> Github Commit:
> https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira