On Thu, Aug 15, 2013 at 11:38 AM, Jan Lehnardt <j...@apache.org> wrote:

>
> On Aug 15, 2013, at 10:09 , Robert Newson <rnew...@apache.org> wrote:
>
> > A big +1 to Jason's clarification of "erlang" vs "native". CouchDB
> > could have shipped an erlang view server that worked in a separate
> > process and had the stdio overhead, to combine the slowness of the
> > protocol with the obtuseness of erlang. ;)
> >
> > Evaluating Javascript within the erlang VM process intrigues me, Jens,
> > how is that done in your case? I've not previously found the assertion
> > that V8 would be faster than SpiderMonkey for a view server compelling
> > since the bottleneck is almost never in the code evaluation, but I do
> > support CouchDB switching to it for the synergy effects of a closer
> > binding with node.js, but if it's running in the same process, that
> > would change (though I don't immediately see why the same couldn't be
> > done for SpiderMonkey). Off the top of my head, I don't know a safe
> > way to evaluate JS in the VM. A NIF-based approach would either be
> > quite elaborate or would trip all the scheduling problems that
> > long-running NIF's are now notorious for.
> >
> > At a step removed, the view server protocol itself seems like the
> > thing to improve on, it feels like that's the principal bottleneck.
>
> The code is here:
> https://github.com/couchbase/couchdb/tree/master/src/mapreduce
>
> I’d love for someone to pick this up and give CouchDB, say, a ./configure
> --enable-native-v8 option or a plugin that allows people to opt into the
> speed improvements made there. :)
>
> The choice for V8 was made because of easier integration API and more
> reliable releases as a standalone project, which I think was a smart move.
>
> IIRC it relies on a change to CouchDB-y internals that has not made it
> back from Couchbase to CouchDB (Filipe will know, but I doubt he’s reading
> this thread), but we should look into that and get us “native JS views”, at
> least as an option or plugin.
>
> CCing dev@.
>
> Jan
> --
>
>
Well, at first glance NIFs look like a good idea, but they can be very
problematic:

- when a view computation takes a long time, it would block the VM's
scheduling. This can be mitigated by using a pool of threads to execute the
work asynchronously, but that can create other problems, such as memory leaks.
- NIFs can't easily be upgraded during a hot upgrade
- when a NIF crashes, the whole VM crashes.

(Note that we have the same problem when using a NIF to decode/encode JSON;
it only works well with medium-sized documents.)

Another way to improve the JS handling would be to remove the main
bottleneck, i.e. the serialization/deserialization we do at each step. I'm
not sure whether this exists yet, but it looks feasible: why not pass Erlang
terms from Erlang to JS and from JS to Erlang? The deserialization would then
happen only on the JS side, i.e. instead of having

get erlang term
encode to json
send to js
decode json
process
encode json
send json
decode json to erlang term
store

we would just have

get erlang term
send over STDIO
decode erlang term to JS object
process
encode to erlang term
send erlang term
store

Erlang serialization is also very optimised.
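
To make the "decode erlang term to JS object" step a bit more concrete, here
is a rough sketch of what the receiving side would have to do. It is written
in Python only as a stand-in for the JS side, and it handles just a small
subset of the external term format produced by term_to_binary/1. The function
names (decode_term, ejson_to_python) are made up for illustration, and the
{[{Key, Value}]} document shape is an assumption about how the docs would
arrive; this is not code from CouchDB or Couchbase.

```python
import struct

# Atoms that map naturally onto JSON literals; other atoms stay as strings.
ATOMS = {"true": True, "false": False, "null": None}


def decode_term(data: bytes):
    """Decode a binary from term_to_binary/1 (small subset of tags only)."""
    if data[0] != 131:
        raise ValueError("missing external term format version byte (131)")
    value, _ = _decode(data, 1)
    return value


def _decode_seq(buf, pos, count):
    """Decode `count` consecutive terms starting at `pos`."""
    items = []
    for _ in range(count):
        item, pos = _decode(buf, pos)
        items.append(item)
    return items, pos


def _decode(buf, pos):
    tag = buf[pos]
    pos += 1
    if tag == 97:                      # SMALL_INTEGER_EXT: one unsigned byte
        return buf[pos], pos + 1
    if tag == 98:                      # INTEGER_EXT: 32-bit signed, big-endian
        return struct.unpack_from(">i", buf, pos)[0], pos + 4
    if tag == 70:                      # NEW_FLOAT_EXT: IEEE 754 double
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if tag == 109:                     # BINARY_EXT: 32-bit length + bytes
        (n,) = struct.unpack_from(">I", buf, pos)
        pos += 4
        return bytes(buf[pos:pos + n]), pos + n
    if tag in (100, 118):              # ATOM_EXT / ATOM_UTF8_EXT: 16-bit length
        (n,) = struct.unpack_from(">H", buf, pos)
        pos += 2
        name = buf[pos:pos + n].decode("utf-8" if tag == 118 else "latin-1")
        return ATOMS.get(name, name), pos + n
    if tag == 119:                     # SMALL_ATOM_UTF8_EXT: 8-bit length
        n = buf[pos]
        pos += 1
        name = buf[pos:pos + n].decode("utf-8")
        return ATOMS.get(name, name), pos + n
    if tag == 104:                     # SMALL_TUPLE_EXT: 8-bit arity
        items, pos = _decode_seq(buf, pos + 1, buf[pos])
        return tuple(items), pos
    if tag == 106:                     # NIL_EXT: the empty list
        return [], pos
    if tag == 108:                     # LIST_EXT: 32-bit length, items, tail
        (n,) = struct.unpack_from(">I", buf, pos)
        items, pos = _decode_seq(buf, pos + 4, n)
        tail, pos = _decode(buf, pos)  # a proper list ends with NIL_EXT
        if tail != []:
            raise ValueError("improper lists are not handled in this sketch")
        return items, pos
    raise ValueError("tag %d is not handled in this sketch" % tag)


def ejson_to_python(term):
    """Turn an assumed {[{Key, Value}]} document into dicts and strings."""
    if isinstance(term, tuple) and len(term) == 1 and isinstance(term[0], list):
        return {k.decode("utf-8"): ejson_to_python(v) for k, v in term[0]}
    if isinstance(term, list):
        return [ejson_to_python(v) for v in term]
    if isinstance(term, bytes):
        return term.decode("utf-8")
    return term
```

With something like this on the receiving side, the sender would only need
term_to_binary/1, and no JSON text would be produced or parsed in between.
Whether that is actually faster than the current JSON path is exactly what
would need to be benchmarked.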


Both solutions could co-exist; it may be worth trying and benchmarking each...


- benoit
