I knew that someone would make a jq query server, and here it is. Nice work, Matthieu!
Do you plan to implement jq in Golang? That would significantly improve your query server and would allow others to embed jq into their apps.

--
,,,^..^,,,

On Sat, Mar 28, 2015 at 6:12 PM, Matthieu Rakotojaona <[email protected]> wrote:
> Hello guys,
>
> I'd like to announce a jq-based view server for couchdb. It's extremely
> rudimentary, but works as a proof of concept of what can be achieved:
>
> https://github.com/rakoo/jqouch
>
> A bit of background: jq is a cli tool to extract and render information
> from any json you give it, with a custom but powerful syntax:
>
> $ curl localhost:5984 | jq '.vendor .version'
> "1.6.1"
>
> $ curl localhost:5984/mydb | jq '.disk_size - .data_size'
> 80892224
>
> Looks like I'd better compact !
>
> If you're dabbling with json and not using it already, I encourage you
> to check it out.
>
> Basically jq is invoked with a filter (that's the '.vendor .version'
> from the example above); you then feed jq with a JSON document in stdin,
> and it gives you all matches and transformations on stdout. jqouch
> works by taking the function given in "add_fun" and spawning an external
> process with this fun as a filter, and forwarding documents in "map_doc"
> to it. All output from jq is then sent back to CouchDB through jqouch
> (jq processes are not killed after each doc, they stay alive as long as
> the stdin is not closed, which jqouch never does until it dies)
>
> I have included some examples in the repo, here they are. I'm using some
> examples from a dump of... I don't know exactly what, but a sample is
> here:
>
> https://github.com/rakoo/jqouch/blob/master/sample.json
>
> taken from http://parltrack.euwiki.org/dumps/eurlex.json.xz. That's
> 22925 documents.
> I made some benchmarks on CouchDB 1.6:
>
> Here's a really simple view in js:
>
> function(doc) {
>   emit(doc.title, null)
> }
>
> it maps all docs in ~ 35s
>
> And the equivalent in jq:
>
> [ [.title, null] ]
>
> it maps all docs in ~ 19s
>
> Each map function emits a list of kv pairs, there's no more emit(); it's
> actually the format of what a query server has to return for each
> mapping function. It may not be ideal, but it works.
>
> Here's another, more "useful" set of views:
>
> function(doc) {
>   for (var i = 0; i < doc.dates.length; i++) {
>     emit([doc.dates[i].type, doc.dates[i].date], null)
>   }
> }
>
> runs in ~ 32s
>
> [ .dates[] | [[.type, .date], null] ]
>
> runs in ~ 19s
>
> There are a few things we can say:
>
> * For all 4 pairs of example views (see repo), jq is consistently almost
>   twice as fast as the equivalent js. Moreover the couchjs process is
>   always eating a large part of my CPU when running, whereas the jq
>   process is never over 30%. This indicates some overhead is spent on
>   passing documents between processes, which I'm going to investigate
>   with the jq C API.
>
> * jq views can be hard to understand and write, but they can be tested
>   through the cli jq tool directly, or even online with jqplay
>   (https://jqplay.org/)
>
> * using jq doesn't (AFAIK) allow one to output non-deterministic values,
>   by default
>
> * jq is "sandboxed" in that it can't do anything other than transform
>   documents, contrary to standard languages
>
> * jq filters are in my opinion very clear on what they do, such that a
>   one-line filter can be enough in most cases
>
> Of course, it's not all rainbows and unicorns:
>
> * there are still some quirks in the jq views, they can output something
>   like [null, null] when they should not return anything because the
>   view doesn't apply to the doc.
>
> * jqouch currently doesn't understand anything other than "reset",
>   "add_fun" and "map_doc"
>
> * I don't see the jq language as being enough for more generic functions
>   such as show and list, but who knows
>
> Anyway, there may be some value in using jq to define basic views, the
> ones that just index a document on some value and don't do much more. As
> a non-serious CouchDB user I've never had to use really fancy views.
>
> Thoughts ?
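
For anyone following along, the "reset" / "add_fun" / "map_doc" flow described in the quoted mail can be sketched roughly as below. This is a Python illustration only, not jqouch's actual code: the names ViewServer, jq_filter and serve are made up here, the error reply shape is my assumption based on the CouchDB 1.x query server protocol, and jq_filter assumes the filter prints exactly one JSON value per input document (as `[ [.title, null] ]` does).

```python
import json
import subprocess


def jq_filter(source):
    """Hypothetical helper: start one long-lived jq process for a filter.

    Assumption: the filter emits exactly one JSON value per input document,
    so a single output line can be read back for each document sent.
    """
    proc = subprocess.Popen(
        ["jq", "-c", "--unbuffered", source],  # compact, flushed output
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )

    def run(doc):
        # Feed one document on stdin, read the matching result from stdout.
        proc.stdin.write(json.dumps(doc) + "\n")
        proc.stdin.flush()
        return json.loads(proc.stdout.readline())

    return run


class ViewServer:
    """Handles the three commands mentioned in the mail, nothing more."""

    def __init__(self, compile_filter=jq_filter):
        self.compile_filter = compile_filter
        self.funs = []

    def handle(self, command):
        name, args = command[0], command[1:]
        if name == "reset":
            # Forget every map function registered so far.
            self.funs = []
            return True
        if name == "add_fun":
            # Keep the compiled filter; it is reused for every later doc.
            self.funs.append(self.compile_filter(args[0]))
            return True
        if name == "map_doc":
            # One entry per registered function, each a list of [key, value].
            return [fun(args[0]) for fun in self.funs]
        # Assumed 1.x-style error object for anything else.
        return {"error": "unknown_command", "reason": str(name)}


def serve(lines, out, compile_filter=jq_filter):
    """Read one JSON command per line, write one JSON reply per line."""
    server = ViewServer(compile_filter)
    for line in lines:
        out.write(json.dumps(server.handle(json.loads(line))) + "\n")
        out.flush()
```

To try it against a real CouchDB 1.x, something like serve(sys.stdin, sys.stdout) in a script referenced from the [query_servers] config section should be the general shape, with jq installed on the PATH. Passing a different compile_filter makes it possible to exercise the command handling without jq at all.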
