This is by no means scientific benchmarking, but to get a feel for things, I 
ran timing tests with three different document size classes:

- 100 bytes
- 512 bytes
- 1024 bytes

I’m using our trusty benchbulk.sh[1] script, so the bulk of each document is a 
single long value field padded with `0`s. In no way representative of real 
data, but quick to produce 1M docs.
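
To make that concrete, here is a rough sketch of the kind of doc this produces. The field name and id scheme are placeholders of mine, not necessarily what the script literally emits:

    // rough sketch only: one padded doc per size class; "value" and the
    // id scheme are placeholders, not what benchbulk.sh literally emits
    const benchDoc = (id, size) => ({
      _id: String(id),
      value: "0".repeat(size), // padded towards the 100/512/1024 byte classes
    });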

I’m running this on an 8-core Mac Mini with a very fast SSD, doing three runs 
per version. This is my regular work machine, so other things are going on, 
but the timings are surprisingly stable, down to the second.

I measured how long it takes to build an index over 1M documents in a q=2 
database, so as to leave enough cores for CouchDB and whatever else is going 
on on the box. At no point are CPU or RAM maxed out, nor is disk IO capacity.
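
For reference, the index in question is an ordinary map-only view. The design doc below is a minimal sketch of mine, with a placeholder name and map body rather than the exact ones from my runs:

    // illustrative only: the kind of map-only design doc whose index build
    // gets timed; ddoc name and map body are placeholders
    const ddoc = {
      _id: "_design/bench",
      language: "deno", // "javascript" exercises couchjs; "deno" needs the dev/run hack from the quoted mail below
      views: {
        by_id: {
          map: "function (doc) { emit(doc._id, null); }",
        },
      },
    };

PUT that into the database, request the view once (e.g. `GET /db/_design/bench/_view/by_id?limit=1`), and CouchDB builds the full index; that build is what gets timed.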

Repeated and interleaved runs should shake out any file system caching 
variability (which I couldn’t observe anyway).

100 byte docs:

couchjs is ~10% faster than deno, while using only about 60% of the CPU (40% 
vs. 70%). couchjs RAM usage is rather erratic, springing from 20MB to 180MB 
and back periodically. deno RAM grows very slowly, maxing out at 110MB at the 
end of the run, so I presume the old-generation GC isn’t even kicking in yet.

512 byte docs:

couchjs is ~5% faster than deno, with the same CPU and RAM profiles as above.

1024 byte docs:

couchjs is 20% slower(!) than deno, with the same CPU and RAM profiles as above.

With 512 and 1024 byte docs, deno makes beam.smp work a little harder, at 
about 5% CPU usage.

All of the runs take between one and two minutes, so longer-running effects 
won’t show up here.

As you can see, this is very unscientific, but it gives us an interesting 
direction.

Depending on the workload, the deno query server *might* lead to faster 
indexing on larger docs, while making potentially better use of available CPU 
resources (or, less euphemistically: at the expense of using more CPU time), 
and with a much more stable RAM profile.

Given that I was able to put this together relatively quickly, and that deno 
is still very new, I find this rather promising.

In addition, since distributing this query server is rather easy (install 
deno, download the .js file, set an env var, done), it might be a nice 
community alternative to couchjs for folks who see a benefit in it.

I’d also like to see us take this to the deno folks to find out whether they 
have anything up their sleeve for speeding up stdio, or whether there are 
tricks we can pull on the JS side.
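
For context, the query server speaks a line-based JSON protocol over stdio, so that read/write loop is where any stdio wins would land. A minimal sketch, assuming the deno std `readLines` helper; the dispatch is illustrative and not the actual main-deno.js code:

    // minimal sketch of the line-based query server loop over stdio;
    // real commands include reset, add_fun, map_doc, reduce, ddoc, ...
    import { readLines } from "https://deno.land/std/io/bufio.ts";

    for await (const line of readLines(Deno.stdin)) {
      const [command] = JSON.parse(line);
      if (command === "reset") {
        console.log(JSON.stringify(true)); // one JSON response per request line
      } else {
        console.log(JSON.stringify(["error", "unknown_command", command]));
      }
    }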

* * *

One more interesting point I didn’t mention last night: this is entirely JS on 
top of an existing runtime, as opposed to couchjs, where we currently maintain 
C and C++ integration layers that nobody likes touching.

A pure-JS implementation, in this case my cleaned-up (albeit less 
feature-complete) ~500 LoC of relatively modern JS, might lead to renewed 
innovation in this space. Who doesn’t like a well-defined performance game? :)

Plus, it’d be interesting to see whether the TypeScript compiler could add 
further optimisations once the query server implementation is translated into 
properly type-annotated TypeScript.
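
As a purely hypothetical sketch, the annotated core might look something like this; the names are mine and not the actual API of the gist above:

    // hypothetical TypeScript-flavoured sketch; types and names are
    // illustrative only, not taken from main-deno.js
    interface Doc {
      _id: string;
      _rev?: string;
      [field: string]: unknown;
    }

    type EmitKey = unknown;
    type EmitValue = unknown;
    type MapFun = (doc: Doc) => void;

    const results: Array<[EmitKey, EmitValue]> = [];

    function emit(key: EmitKey, value: EmitValue): void {
      results.push([key, value]);
    }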

Best
Jan
—
[1]: https://github.com/apache/couchdb/blob/master/test/bench/benchbulk.sh

> On 14. May 2020, at 22:01, Jan Lehnardt <j...@apache.org> wrote:
> 
> Hey all,
> 
> I got nerd sniped by Joan this morning:
> 
>    <+Wohali> hmmmmm. https://github.com/denoland/deno
>    <+Wohali> i know i know another runtime but it's focused on security
> 
> I wondered what it would take to make a couchjs variant based on deno. Turns 
> out: about a day if you cut some corners ;)
> 
> One of the interesting aspects, as Joan notes, is its more-secure-by-default, 
> so I have some hopes that this might work out better than our ill-fated 
> nodejs query server experiment from a few years back.
> 
> I started by hacking up a readily generated main.js, then ran `make` again, 
> and did it all again. Overall, it is ~30 LOC changes. Since there is no 
> synchronous `readline()` available and JS code can either be sync or async, 
> we can’t make it so one source could run in our couchjs or deno.
> 
> So I went ahead and ripped all the basics out of our main.js and modernised 
> things a little bit along the way. The result is a main-deno.js that can run 
> map/reduce/rereduce/filter/view_filter/validate_doc_update functions (as 
> validated by the query server spec).
> 
>    https://gist.github.com/janl/c3139bc72efe663e35005d8864c4201f
> 
> I intentionally left out the couchappy functions, as at least lists with the 
> `getRow()` function won’t be implementable without an API break. I also left 
> out legacy compact with esprima/escodegen to keep things more manageable. Oh 
> and no lib/modules, given today’s JS packaging tooling, it’s an easy choice 
> to leave out.
> 
> I haven’t done any sort of benchmarking, but I’d love for someone here to 
> give this a try. Here’s how to hack up `./dev/run` to add support for `deno` 
> design docs:
> 
>   https://gist.github.com/janl/01559f8617ef44afd5ceec39ec8389e8
> 
> If you want to run this on a regular CouchDB setup, set up this env var 
> before launching CouchDB:
> 
>    COUCHDB_QUERY_SERVER_DENO="deno run --allow-write /path/to/main-deno.js"
> 
>    `--allow-write` is only required for the debug log (/tmp/deno-qs.log), but 
> won’t be required during operation, adding to the sandboxed nature of it all.
> 
> And some proof of operation:
> 
>   https://gist.github.com/janl/8636d469420a1fd2de481ae8f5780854
> 
> It’d be nice to see how stable this is in practice and if there are any 
> meaningful performance / resource-usage differences. Any takers? I’ll answer 
> any and all setup questions.
> 
> Now I’m passing the nerd-snipe torch to Paul:
> 
>    <+jan____> uh, and it is embeddable https://deno.land/manual/embedding_deno
> 
> Best
> Jan
> —
