On Nov 17, 2011, at 10:06 PM, Damien Katz (Commented) (JIRA) wrote:

> [ https://issues.apache.org/jira/browse/COUCHDB-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13152608#comment-13152608 ]
>
> Damien Katz commented on COUCHDB-1342:
> --------------------------------------
>
> Paul, what I mean by "Apache users concerns" is that #3 isn't something that
> vanilla Apache CouchDB users deal with, but third parties who modify the code
> or embed in interesting way might (I suppose Cloudant has to deal with this).
> Perhaps I'm mistaken about that. I do think patches should only be concerned
> with the vanilla use cases in order to be considered check-in quality.
>
> #4 is a style issue, not a correctness issue, or at least you haven't made a
> case that it's a correctness issues. I have no problems with you changing it
> to a style you prefer, but we should not expect that submitters of patches
> conform to an undocumented style.
>
> There is no urgency around this patch, at Couchbase we can keep adding
> performance enhancements and drift our codebase further and further from
> Apache. I don't want to see that happen, but it only hurts the Apache project.

Damien,

I agree with both these points: your codebase at Couchbase is drifting, but you're not alone in that, and we do need a culture where more correct, fast code is checked in. I've only had a couple of days to look at this and I've not had the time to read your Couchbase work.

As I look at this patch, almost every concern Paul is raising is technically valid. We do have to consider more than vanilla CouchDB, as it gets embedded in BigCouch for example, and CouchDB is designed to be distributed, right? I first ran a simple test, adding 10K empty docs, and noticed a 40K difference in the db file size. Probably harmless, but I don't know why. There's no real way to independently verify whether this patch changes the db layout other than via the semantics of the code.

Databases are hard, as you mention, very hard. Without good performance they are next to useless, but a lack of correctness is also problematic, certainly in some domains.

I share others' frustration with patches languishing. The patches I've submitted to date have all been small and have often had to be refactored as the code migrated away (I think I have 3 now, 2 of them bugs). COUCHDB-911 for example is a real bug, involving both couch_db and couch_db_updater, and as Adam notes is not just a bulk docs issue. It reports a conflict but adds data to the db anyway. Can you believe that? I tried a couple of fixes to minimize the surface area touched but there was no real way to solve it correctly without adding to the data structures. When I saw this patch my first reaction was wow, but now I'll have to rework 911 again as your patch also touches the same files. It's totally orthogonal so no big deal. I mention this only to point out that the review process is awesome and when taken seriously makes for a better result. This isn't just people's pet concerns. It takes time to do this. Fortunately it's not rocket science, it's just databases.

The solution to the culture problem is "best practices".
Best practices have to be practiced, and someone (Jan, as the project lead I'm looking at you :) needs to crack the whip and set the tone. Of course I'm assuming that we're talking about a process to produce "production" quality code. I quote "production" as that phrase has evolved considerably over the years. If master is deemed acceptable for prototypes, proofs of concept, etc. then fine, but otherwise I'd suggest we follow Randall's lead and work this patch on a branch first.

Anyway, I'm sure you know these things, I don't mean to prattle on.

Best Regards,

Bob

> And I do see we have some culture problems in the Apache project. We need a
> culture where useful, correct, fast code is verified and checked in, and then
> is improved incrementally. Right now we have a culture of everyone's pet
> concerns must addressed before code gets checked in, which is demoralizing
> and slows things down, which is a very big problem the project has right now.
> I want your help in trying to change that.
>
>> Asynchronous file writes
>> ------------------------
>>
>> Key: COUCHDB-1342
>> URL: https://issues.apache.org/jira/browse/COUCHDB-1342
>> Project: CouchDB
>> Issue Type: Improvement
>> Components: Database Core
>> Reporter: Jan Lehnardt
>> Fix For: 1.3
>>
>> Attachments: COUCHDB-1342.patch
>>
>>
>> This change updates the file module so that it can do
>> asynchronous writes. Basically it replies immediately
>> to process asking to write something to the file, with
>> the position where the chunks will be written to the
>> file, while a dedicated child process keeps collecting
>> chunks and write them to the file (and batching them
>> when possible). After issuing a series of write request
>> to the file module, the caller can call its 'flush'
>> function which will block the caller until all the
>> chunks it requested to write are effectively written
>> to the file.
>> This maximizes the IO subsystem, as for example, while
>> the updater is traversing and modifying the btrees and
>> doing CPU bound tasks, the writes are happening in
>> parallel.
>> Originally described at http://s.apache.org/TVu
>> Github Commit:
>> https://github.com/fdmanana/couchdb/commit/e82a673f119b82dddf674ac2e6233cd78c123554
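
For anyone following the thread without the patch open, the write path described in the quoted issue can be sketched roughly as below. This is only an illustration in Python, not the Erlang code from the patch: the class name `AsyncAppendFile` and its `append`, `flush` and `close` methods are hypothetical, a thread stands in for the dedicated child process, and `flush` here waits for every queued chunk rather than only the caller's own.

```python
import os
import queue
import threading


class AsyncAppendFile:
    """Append-only file whose writes are done by a background thread.

    Hypothetical illustration of the idea in COUCHDB-1342, not CouchDB code.
    """

    def __init__(self, path):
        self._file = open(path, "ab")
        # Position at which the next chunk will land.
        self._eof = self._file.seek(0, os.SEEK_END)
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def append(self, chunk):
        """Reserve a position for chunk and return it without waiting for IO."""
        with self._lock:
            pos = self._eof
            self._eof += len(chunk)
            # Enqueue under the lock so queue order matches reserved positions.
            self._queue.put(chunk)
        return pos

    def flush(self):
        """Block until every chunk queued so far has been written."""
        self._queue.join()

    def close(self):
        self.flush()
        self._queue.put(None)  # sentinel: tell the writer thread to stop
        self._writer.join()
        self._file.close()

    def _drain(self):
        while True:
            chunk = self._queue.get()
            if chunk is None:
                self._queue.task_done()
                return
            batch = [chunk]
            stop = False
            # Batch whatever else is already waiting in the queue.
            while True:
                try:
                    nxt = self._queue.get_nowait()
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            self._file.write(b"".join(batch))
            self._file.flush()
            # Mark chunks done only after they reached the file object.
            for _ in batch:
                self._queue.task_done()
            if stop:
                self._queue.task_done()
                return
```

The point of the sketch is that `append` can hand back a file position before any IO happens, because positions in an append-only file are determined by the sizes of the chunks queued ahead of it. Whether that reservation logic leaves the on-disk layout unchanged is exactly the kind of thing the review has to establish from the code.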
