The Erlang view server does sound like it would be a big performance win, although I think it will make CouchDB meet more resistance to mass adoption if it becomes the preferred view path, simply because Erlang is a much less common language than, say, Javascript, and Javascript is a more natural fit for JSON data. Right now I don't have to train my team on a new language merely to be clients of CouchDB, and Javascript is far less intimidating and much better documented for the novice. For us there are additional benefits, since most of our codebase is in Javascript anyway, so our views can reuse functionality by including code from our app. But if the Erlang view server's performance turns out to be significantly better, we'll probably have to bite the bullet and rewrite all of our views.
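For concreteness, here's the sort of view I mean; just a minimal sketch, with made-up field names (type, user_id, created_at, text) rather than our actual code:

    // An ordinary CouchDB Javascript map function. emit() is supplied by the
    // view server; the document fields here are hypothetical, purely to show
    // how naturally Javascript handles the JSON documents.
    function(doc) {
      if (doc.type === "comment") {
        emit([doc.user_id, doc.created_at], doc.text);
      }
    }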
Still, I'm pretty convinced you could optimize the Javascript view server to be nearly as fast as the Erlang one, especially since, when you really think about it, data comes into CouchDB as JSON, gets run through a Javascript view server, and gets returned to the client as JSON. With JSON in and out (and CouchDB itself barely touching it, other than _id and _rev), it does seem like you could avoid most of the conversion overhead. And you could certainly re-architect couchjs to have a more efficient communication path (there's a rough sketch of that path after the quoted thread below).

I know the standard response is 'you want it? well, patches are welcome!', but I'm speaking as a client of CouchDB, one that is very busy trying to get a product launched (and not blessed with enough free time at the moment to really learn Erlang, let alone contribute significant amounts of code). So I'm just trying to be an advocate for the many developers who would love to use (or are using) CouchDB as a 'better SQL': a persistent storage backend that makes a lot more sense for storing and dealing with the kinds of data most web apps face. I think there are a lot more people like me out there, too, and tons more who are waiting with bated breath to see how CouchDB develops.

On Mon, Jul 6, 2009 at 1:53 AM, Brian Candler<[email protected]> wrote:
> On Sat, Jul 04, 2009 at 03:26:23PM -0700, Scott Shumaker wrote:
>> 53K docs apparently take 68s to be converted to JSON, and received by
>> the dummy server (with no docs emitted) - or about 780 docs/second.
>> couchjs is slower than bork.rb in this case (unsurprising - bork.rb
>> not really parsing the data)
>> filtering on the couch side is an enormous win for our test case.
>>
>> K/V inserts - (5*53K in (200-68)s) = ~2000 per second
>>
>> This is a pretty big difference from Brian's results (8000/sec),
>> although we're dealing with many more docs, and without comparing
>> hardware specs, it's difficult to draw conclusions.
>
> Hmm, my hardware is pretty lame (Thinkpad X30 laptop, P3-M 1.2GHz, 1GB RAM,
> 2.5" IDE drive). Maybe it's because you're using a substantially larger
> dataset than mine, e.g. more Btree nodes to update or more flushes to disk
> or something. I'll try increasing the size of my benchmark test, but
> annoyingly RestClient inherits Net::HTTP's 60-second request timeout, and
> the high-level RestClient.get / RestClient.put API doesn't allow it to be
> overridden. I'll probably monkey-patch Net::HTTP for simplicity.
>
> Regards,
>
> Brian.
>
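To illustrate the communication path I mentioned above: the view server is a separate process that reads one JSON command per line on stdin and writes one JSON line back on stdout. Below is a very rough, simplified sketch of that loop in the spirit of couchjs's main.js; it is a reconstruction rather than the actual code, assuming the SpiderMonkey shell's readline()/print() built-ins and a JSON helper (couchjs bundles its own), and it leaves out reduce, logging, and error handling entirely:

    // Stripped-down view server loop: read a JSON command per line, answer
    // with a JSON line. Only reset/add_fun/map_doc are handled here.
    var funs = [];    // compiled map functions for the current design doc
    var rows = [];    // rows collected from emit() for one map function

    function emit(key, value) { rows.push([key, value]); }

    var line;
    while ((line = readline()) !== null) {
      var cmd = JSON.parse(line);
      if (cmd[0] === "reset") {
        funs = [];
        print("true");
      } else if (cmd[0] === "add_fun") {
        funs.push(eval("(" + cmd[1] + ")"));   // compile the map source
        print("true");
      } else if (cmd[0] === "map_doc") {
        var doc = cmd[1], results = [];
        for (var i = 0; i < funs.length; i++) {
          rows = [];
          funs[i](doc);                        // each emit() appends to rows
          results.push(rows);
        }
        print(JSON.stringify(results));
      }
    }

Every document makes that full round trip (Erlang terms to JSON text to Javascript objects and back over a pipe), which is presumably where most of the overhead behind the numbers quoted above lives.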
