Hi, Jan. Do you mean the Node.js view server? That one is called "couchjs" and my fork is here: https://github.com/jhs/couchjs
If you mean something else, hm, can you remind me? Thanks! :)

On Tue, Mar 17, 2015 at 4:28 AM, Jan Lehnardt <[email protected]> wrote:

> Dear Buddhika,
>
> thank you for your interest in CouchDB and the CouchDB View Server!
>
> This is an area where you can make significant contributions to CouchDB.
>
> It is also a little bit involved, but you seem to have all the skills
> required to pull this off :)
>
> I’m happy to mentor you.
>
> On 16 Mar 2015, at 10:03, Buddhika Jayawardhana <[email protected]> wrote:
> >
> > Hi,
> > I am an undergraduate in the Department of Computer Science and
> > Engineering, University of Moratuwa. I have been subscribed to the
> > couchdb mailing list for months and I have been trying to learn some
> > Erlang to work with couchdb. I noticed that the project "COUCHDB-1743
> > Make the view server & protocol faster" is related to GSoC. I am
> > willing to submit a project proposal for this project.
> >
> > I have theoretical knowledge of software process, design patterns, and
> > other engineering concepts. I've been using Java and C++ for
> > high-level programming, C and a little bit of assembly for low-level
> > programming, and PHP and JavaScript for web development. I also have
> > sound knowledge of Erlang. I would be very thankful if you could guide
> > me to get familiar with the project as soon as possible.
> >
> > Here are the questions on my mind:
> >
> > - Are there other programming languages that I should get familiar with?
>
> Erlang and JavaScript will do, some knowledge of C to understand the
> current system will help.
>
> > - What are the technologies I should get familiar with?
>
> General knowledge of Unix/POSIX fundamentals (processes, fds, stdio etc.)
> will be required. Windows equivalent APIs too (but not strictly a
> requirement just yet).
>
> > - I can work 40 hours per week for the project. Would that be enough to
> > successfully complete the project?
>
> I can’t estimate whether you’d be able to complete this 100%, but I’m sure
> that this is enough time to make a significant contribution that the
> community can then take and finish up, should you not get to the end.
> I.e. don’t worry too much about this :)
>
> > - What are the other resources that I should read before submitting the
> > proposal?
>
> Familiarity with the CouchDB source can’t hurt. More in-depth knowledge of
> Erlang as well: http://learnyousomeerlang.com is a great free resource and
> the main Erlang docs are worth a read, as are the various print books
> that are available from various publishers.
>
> It will definitely also help to read through the CouchDB Guide:
> http://guide.couchdb.org
>
> Some parts have already been integrated into http://docs.couchdb.org,
> which you should also read, especially the bits about Design Documents,
> Views and List, Show, Validation, Filter and Update functions.
>
> In addition, check out the query_server_spec, it codifies the current query
> server protocol:
>
> https://github.com/apache/couchdb/blob/master/test/view_server/query_server_spec.rb
>
> > Hope you will guide me through the project.
>
> Again, thanks for taking an interest in this! :)
>
> To get things rolling, here’s my rough idea for how this could play out:
>
> Generally, there are three components, the Erlang and the JavaScript part
> and the JavaScript runtime or couchjs.
>
> We call all these things Query Server or View Server.
>
> The Erlang part lives in https://github.com/apache/couchdb-couch-mrview
>
> The JavaScript part lives in
> https://github.com/apache/couchdb/tree/master/share/server
>
> The current JavaScript runtime is Spidermonkey.
> We have our own C-wrapper
> around Spidermonkey, to make it a CLI tool that talks stdio:
>
> https://github.com/apache/couchdb-couch/tree/master/priv/couch_js
>
>
> We’d generally like to move away from the custom C-wrapped Spidermonkey and
> have V8 be the execution engine. We’d also like to get away from having to
> maintain C/C++. It’d probably be simplest to use Node.js as a wrapper,
> because then many more people can contribute to this. Also, Node.js is good
> at streaming protocols, so it is a natural fit.
>
>
> Here is how I would start:
>
> 1. Create a new Query Server that *only* handles Show, List, Filter,
>    Validation and Update functions, as that is a lot simpler on both the
>    Erlang and JavaScript side.
>
> 2. As part of 1: Design a new Query Server protocol that works in a
>    streaming fashion. The current one is request/response based and both
>    sides are waiting for one another while one of them is doing actual
>    work. It’d be nice if both could just keep working on whatever they
>    need to do.
>
> 3. Once 1. and 2. are in place and working correctly, expand the new
>    Query Server to also handle Views. At this point, adding view support
>    should not be too complicated anymore.
>
>
> Things to watch out for:
>
> - map/reduce functions for CouchDB views need to be “pure”, i.e. we need
>   to guarantee they stay the same unless CouchDB can see any changes (and
>   then invalidate the view index). This means we need some extra
>   isolation of the JS execution, and some limitation or observation of
>   the require() system.
>
>   There is a project that demonstrated we can do this. Jason Smith has
>   run this, but I can’t seem to find it on his GitHub. Jason, do you have
>   any pointers?
>
> - A couchjs process can be used for multiple databases and different
>   access control can be configured per database. Data MUST NOT leak
>   between databases. E.g.
>   Errors that are thrown when requesting a view result on database A
>   must not show any process state data that comes from database B (and
>   vice versa).
>
> - The current system works much like CGI. A single process can handle one
>   concurrent request; if there are two concurrent requests, a new process
>   is spawned. The new Query Server should be able to handle multiple
>   concurrent requests. But there will be a time when a single process is
>   saturated; at that point, we should be able to spawn more Query Servers
>   to help with the load. In the 1./2./3. list above, I’d either solve
>   this upfront, or after 3., depending on what you are more comfortable
>   with. It might be easier to get started without this, but it might be
>   harder to add later and easier overall to have thought this through
>   upfront.
>
> - Windows stdio can be troublesome, beware :)
>
> - Windows process handling can also be troublesome, that’s why we are
>   using https://github.com/apache/couchdb-couch/tree/master/priv/spawnkillable
>   to kill/reap couchjs processes there. Not sure we still need this when
>   we use Node.js, but worth checking out.
>
> - I had a bit of time last year to experiment with streaming
>   Erlang/Node.js communication. It worked fine, but I didn’t get very far
>   (the JavaScript part just echoes commands back to Erlang). The projects
>   could help as inspiration:
>
>   https://github.com/janl/couch_query_server2
>   https://github.com/janl/node-couch-query-server2
>
>   Key code is in src/couch_query_server2_sup.erl
>
>   It uses the Erlang pid as a stream marker so we can interleave requests.
>
>   Please excuse the lack of a README or other instructions!
>
>
> This is all I have for now. Other folks may want to chime in with their
> opinions :)
>
> If you have any more questions, let me know. If you want to take this into
> JIRA, let’s open a new ticket.
>
> Best
> Jan
> --
>
> >
> > Thank You.
> >
> > --
> > *Buddhika Jayawardhana*
> > Undergraduate | Department of Computer Science & Engineering
> > University of Moratuwa
> > *[email protected] <[email protected]>* | LinkedIn
> > <http://lk.linkedin.com/in/buddhikajay/>
>
> --
> Professional Support for Apache CouchDB:
> http://www.neighbourhood.ie/couchdb-support/
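
P.S. For anyone skimming the thread: the "stream marker" idea in the quoted mail (tagging every message with the Erlang pid so several requests can share one stream) could look roughly like the sketch below. This is only an illustration under my own assumptions. The frame format (one JSON array `[marker, payload]` per line), the function names, and the pid strings are all hypothetical; none of it is the actual wire format of CouchDB or of couch_query_server2. It is written in Python only to keep it short and language-neutral.

```python
import json

# Hypothetical frame format: one JSON array per line, [marker, payload].
# This is an assumption for illustration, NOT the real CouchDB or
# couch_query_server2 wire format.


def encode_frame(marker, payload):
    """Serialize one frame as a newline-terminated JSON line."""
    return json.dumps([marker, payload]) + "\n"


def decode_frames(data):
    """Yield (marker, payload) pairs from newline-delimited JSON text."""
    for line in data.splitlines():
        if line.strip():
            marker, payload = json.loads(line)
            yield marker, payload


def demultiplex(data):
    """Group interleaved payloads by marker, keeping per-marker order."""
    streams = {}
    for marker, payload in decode_frames(data):
        streams.setdefault(marker, []).append(payload)
    return streams


# Two requests whose chunks arrive interleaved on the same stream.
# The markers are made-up strings that merely look like Erlang pids.
wire = (
    encode_frame("<0.101.0>", "chunk A1")
    + encode_frame("<0.102.0>", "chunk B1")
    + encode_frame("<0.101.0>", "chunk A2")
)
result = demultiplex(wire)
```

Because each frame carries its own marker, neither side has to wait for a full response before sending the next request, which is the point of moving away from the strict request/response protocol.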
