On Wed, Sep 23, 2009 at 11:27 AM, Debasish Ghosh <[email protected]> wrote:
> Looks like I found the problem .. is this an intended change in CouchDB ..
>
> CouchDB wiki documents the following for processing "reset" by the view
> server :
>
> """
> CouchDB sends:
> ["reset"]\n
> The view server responds:
> true\n
> """
> Accordingly I was doing a pattern match in my query server, expecting
> ["reset"] ..
>
> In the latest snapshot, I did a trace and found that CouchDB actually
> sends ["reset", {"reduce_limit":true}] .. hence I was getting an error
> and the query server closes every time ..
>
> Is this the current specification of "reset" ? I changed my code to do
> the corresponding pattern match .. and it now runs fine!
>
> Please confirm.
>

The query_server_spec.rb should cover this, but perhaps it doesn't.

I guess part of the issue is that JS is flexible about function arity
in a way that other languages are not, which makes it really easy to
absorb these sorts of differences.

The command is currently ["reset", query_server_options] although the
JS server works fine if it just gets ["reset"] as well.

Chris

> Thanks.
> - Debasish
>
> On Wed, Sep 23, 2009 at 12:07 AM, Debasish Ghosh
> <[email protected]> wrote:
>> Thanks for the suggestions. I have not yet tried query_server_spec.rb.
>> Will do soon to check. Though I logged everything that goes between
>> couch server and the query server. The query server does get null from
>> readLine of System.in with the later snapshots of the codebase that
>> shuts it down. I need to investigate more on how it gets this. But as
>> I mentioned before as well, the same query server runs fine with the
>> earlier snapshot.
>>
>> Will let u know if I find anything meaningful.
>>
>> Thanks.
>> - Debasish
>>
>> On Tue, Sep 22, 2009 at 11:19 PM, Paul Davis
>> <[email protected]> wrote:
>>> On Tue, Sep 22, 2009 at 1:11 PM, Debasish Ghosh
>>> <[email protected]> wrote:
>>>>> It may be that we're flushing the socket with no data, and the Scala
>>>>> server is interpreting that as null input. The JS client uses
>>>>> readline() implemented in C, so it shouldn't have access to data until
>>>>> a line break has been sent by CouchDB.
>>>>
>>>> readLine blocks .. right .. and only comes out with the null input.
>>>> The question is how it gets this null string with the new version of
>>>> CouchDB.
>>>> Is there something different that you were doing in earlier versions.
>>>> Just wondering how it still runs with an earlier snapshot of CouchDB
>>>> ..
>>>>
>>>> Thanks.
>>>> - Debasish
>>>>
>>>
>>> I'm still leaning towards the theory that your server is returning
>>> something that CouchDB doesn't expect.
>>> When this happens the Erlang
>>> process controller will shut down the view server by closing its input
>>> stream. Though, theoretically, couchspawnkillable should kill -KILL
>>> the process too, unless there's a tad bit of delay that occurs during
>>> which you're spinning over the stdin stream returning NULL.
>>>
>>> Did you ever try adding tests to query_server_spec.rb and running that
>>> way? I still need to modify that to make it more friendly to run
>>> external view engines, but with a bit of hacking it should at least
>>> point to the inconsistency.
>>>
>>> Paul Davis
>>>
>>>> On Mon, Sep 21, 2009 at 6:07 PM, Debasish Ghosh
>>>> <[email protected]> wrote:
>>>>>
>>>>> The actual code is something like this ..
>>>>> var s = isr.readLine
>>>>> while (s != null) {
>>>>>   // do stuff
>>>>>   s = isr.readLine
>>>>> }
>>>>> I wrote the other version just to log what I get back. Now this same
>>>>> version works ok with the earlier version of the couchdb server. That's
>>>>> what beats me here ..
>>>>> Thanks.
>>>>> - Debasish
>>>>>
>>>>> On Mon, Sep 21, 2009 at 5:46 PM, Robert Newson <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>> I claim you are ignoring null here because of your comment;
>>>>>>
>>>>>> while (true) {
>>>>>>   s = inputstreamreader.readLine
>>>>>>   if (s == null) // ignore
>>>>>>   else
>>>>>>     toJson(s) match {
>>>>>>       //.. process reset, add_fun etc.
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> When System.in is closed this loop will spin; readLine() will always
>>>>>> return null. Since System.in is only closed when the JVM is exiting,
>>>>>> it is never correct to ignore it and continue processing.
>>>>>>
>>>>>> The loop I presented is not the same as yours as mine will correctly
>>>>>> exit on process termination.
>>>>>>
>>>>>> readLine() *cannot* return null under any circumstance but the close
>>>>>> of the stream (couchdb cannot pass you null this way). System.in is
>>>>>> never closed unless the process itself is exiting, and it is never
>>>>>> reopened.
>>>>>>
>>>>>> The mishandling of readLine() is probably hiding the real problem. I
>>>>>> would guess you pass invalid JSON to couchdb, or fail to return
>>>>>> anything at all, under some conditions. Couch then kills your view
>>>>>> server (and would then restart it). The view server, rather than
>>>>>> gracefully exiting when this happens, will simply spin, never exiting.
>>>>>>
>>>>>> B.
>>>>>>
>>>>>> On Mon, Sep 21, 2009 at 8:19 AM, Debasish Ghosh
>>>>>> <[email protected]> wrote:
>>>>>> > It's in fact referring to a reader that wraps System.in.
>>>>>> > readLine returns null on end of file, but the earlier version of the
>>>>>> > snapshot handles it and does not close the query server process, while
>>>>>> > the new server seems to get throttled in the while loop. In fact this
>>>>>> > is one difference that I forgot to mention. In the earlier version the
>>>>>> > query server does not close, while in the new version it gets closed
>>>>>> > and restarted for every view operation. Maybe it's getting closed
>>>>>> > because of the null. I can figure that out from the logs. Is this an
>>>>>> > intentional change in implementation ?
>>>>>> > Robert -
>>>>>> > I am not ignoring null. The while loop is very similar to what u
>>>>>> > mention. I switched to the while true version just to log and see if
>>>>>> > nulls are getting returned.
>>>>>> > Thanks.
>>>>>> > - Debasish
>>>>>> >
>>>>>> > On Mon, Sep 21, 2009 at 3:53 AM, Paul Davis
>>>>>> > <[email protected]> wrote:
>>>>>> >>
>>>>>> >> On Sun, Sep 20, 2009 at 1:34 AM, Debasish Ghosh
>>>>>> >> <[email protected]> wrote:
>>>>>> >> > Chris -
>>>>>> >> > In my query server code, I logged everything that gets exchanged
>>>>>> >> > between the couchdb server process and the query server.
>>>>>> >> > The difference that I noticed with the new changes is that the
>>>>>> >> > couchdb server sends a huge number of null strings to the view
>>>>>> >> > server which chokes the latter. In the snippet that I wrote
>>>>>> >> > before ..
>>>>>> >> >
>>>>>> >> > while (true) {
>>>>>> >> >   s = inputstreamreader.readLine // this reads from stdin
>>>>>> >> >   if (s == null) // ignore
>>>>>> >> >   else
>>>>>> >> >     toJson(s) match {
>>>>>> >> >       //.. process reset, add_fun etc.
>>>>>> >> >     }
>>>>>> >> > }
>>>>>> >> >
>>>>>> >>
>>>>>> >> Does inputstreamreader.readLine refer to this function:
>>>>>> >>
>>>>>> >> http://java.sun.com/j2se/1.5.0/docs/api/java/io/BufferedReader.html#readLine%28%29
>>>>>> >>
>>>>>> >> If so, and that's returning null, then is it signaling that CouchDB
>>>>>> >> has tried to close the input stream?
>>>>>> >>
>>>>>> >> Paul
>>>>>> >>
>>>>>> >> > I put logs in the true branch of if (s == null) and moments later I
>>>>>> >> > found a log created of size 10 MB where the view server gets null
>>>>>> >> > strings from stdin. This may give some clues towards the problem.
>>>>>> >> >
>>>>>> >> > Hope this helps.
>>>>>> >> > - Debasish
>>>>>> >> >
>>>>>> >> > On Sun, Sep 20, 2009 at 10:56 AM, Chris Anderson <[email protected]>
>>>>>> >> > wrote:
>>>>>> >> >
>>>>>> >> >> On Sat, Sep 19, 2009 at 10:09 PM, Debasish Ghosh
>>>>>> >> >> <[email protected]> wrote:
>>>>>> >> >> > Yes, actually the reason I brought it up is that the same query
>>>>>> >> >> > server runs fine with the earlier version, while it stumbles with
>>>>>> >> >> > the changes incorporated later. Actually there is a really really
>>>>>> >> >> > big difference in performance which is primarily because of the
>>>>>> >> >> > timeouts.
>>>>>> >> >> > Thanks for deciding to look into it. I will currently stick around
>>>>>> >> >> > with the April snapshot. Please post your findings on this list - I
>>>>>> >> >> > will be happy to upgrade to the latest.
>>>>>> >> >> > Thanks.
>>>>>> >> >> > - Debasish
>>>>>> >> >>
>>>>>> >> >> I think what we'll need is a way to get visibility between the beam
>>>>>> >> >> process and the query server. This could be accomplished with a
>>>>>> >> >> simple log wrapper around the query server, logging both stdin and
>>>>>> >> >> stdout to individual files.
>>>>>> >> >>
>>>>>> >> >> I like the idea of implementing it as a wrapper because then we can
>>>>>> >> >> wrap it around the scala as well as the JS query server (and other
>>>>>> >> >> languages), and get complete transparency into what's going over
>>>>>> >> >> the wire.
>>>>>> >> >>
>>>>>> >> >> This is definitely turning into dev@ territory so I'm moving this
>>>>>> >> >> thread there.
>>>>>> >> >>
>>>>>> >> >> Chris
>>>>>> >> >>
>>>>>> >> >> >
>>>>>> >> >> > On Sun, Sep 20, 2009 at 3:41 AM, Chris Anderson
>>>>>> >> >> > <[email protected]> wrote:
>>>>>> >> >> >
>>>>>> >> >> >> On Sat, Sep 19, 2009 at 11:40 AM, Debasish Ghosh
>>>>>> >> >> >> <[email protected]> wrote:
>>>>>> >> >> >> > Here are some additional behavior changes that I am noticing
>>>>>> >> >> >> > between the 2 versions ..
>>>>>> >> >> >>
>>>>>> >> >> >> The other big change is in couch_os_process, the addition of
>>>>>> >> >> >> couchspawnkillable - maybe this is acting up on your system.
>>>>>> >> >> >>
>>>>>> >> >> >> Partially I'm interested in getting to the bottom of this
>>>>>> >> >> >> because it could be that it's inefficient with the JS query
>>>>>> >> >> >> server, but not causing errors, and we just haven't noticed.
>>>>>> >> >> >>
>>>>>> >> >> >> > In the newer version, I notice lots of null strings being sent
>>>>>> >> >> >> > continuously from the couchdb server to the view server. My view
>>>>>> >> >> >> > server loop looks like the following :-
>>>>>> >> >> >> >
>>>>>> >> >> >> > while (true) {
>>>>>> >> >> >> >   s = inputstreamreader.readLine
>>>>>> >> >> >> >   toJson(s) match {
>>>>>> >> >> >> >     //.. process reset, add_fun etc.
>>>>>> >> >> >> >   }
>>>>>> >> >> >> > }
>>>>>> >> >> >> >
>>>>>> >> >> >> > With the new version, I find lots of null strings coming in to
>>>>>> >> >> >> > "s", which makes me include something like the following ..
>>>>>> >> >> >> >
>>>>>> >> >> >> > while (true) {
>>>>>> >> >> >> >   s = inputstreamreader.readLine
>>>>>> >> >> >> >   if (s == null) // ignore
>>>>>> >> >> >> >   else
>>>>>> >> >> >> >     toJson(s) match {
>>>>>> >> >> >> >       //.. process reset, add_fun etc.
>>>>>> >> >> >> >     }
>>>>>> >> >> >> > }
>>>>>> >> >> >> >
>>>>>> >> >> >> > And this null business is really huge. Has there been any change
>>>>>> >> >> >> > in the protocol between the couchdb server and the view server ?
>>>>>> >> >> >> > I suspect that these null exchanges are taking up lots of cycles
>>>>>> >> >> >> > which result in process time out in the new version. I do not
>>>>>> >> >> >> > get this null stuff with the older version. Is there any chance
>>>>>> >> >> >> > of such happening with the changes that have been done in
>>>>>> >> >> >> > couch_query_servers.erl ?
>>>>>> >> >> >> >
>>>>>> >> >> >> > Thanks.
>>>>>> >> >> >> > - Debasish
>>>>>> >> >> >> >
>>>>>> >> >> >> > On Sat, Sep 19, 2009 at 11:34 PM, Debasish Ghosh
>>>>>> >> >> >> > <[email protected]> wrote:
>>>>>> >> >> >> >
>>>>>> >> >> >> >> actually my ["reset"] is not expensive at all .. it just has an
>>>>>> >> >> >> >> array.clear kind of call.
>>>>>> >> >> >> >> Just another observation when I run in debug mode I find that
>>>>>> >> >> >> >> there are quite a few cases of OS Process Error
>>>>>> >> >> >> >> {os_process_error, "OS process timed out."} being recorded in
>>>>>> >> >> >> >> couch.log. I do not get this when I am running the earlier
>>>>>> >> >> >> >> version. However no unnatural things appear in couchdb.stderr.
>>>>>> >> >> >> >> My current setting of os_process_timeout is 20000 .. I guess
>>>>>> >> >> >> >> that's 20 secs ..
>>>>>> >> >> >> >>
>>>>>> >> >> >> >> Thanks.
>>>>>> >> >> >> >> - Debasish
>>>>>> >> >> >> >>
>>>>>> >> >> >> >> On Sat, Sep 19, 2009 at 10:27 PM, Chris Anderson
>>>>>> >> >> >> >> <[email protected]> wrote:
>>>>>> >> >> >> >>
>>>>>> >> >> >> >>> On Sat, Sep 19, 2009 at 5:13 AM, Debasish Ghosh
>>>>>> >> >> >> >>> <[email protected]> wrote:
>>>>>> >> >> >> >>> > Hi -
>>>>>> >> >> >> >>> > As I have mentioned previously I have been working on a Scala
>>>>>> >> >> >> >>> > driver for CouchDB, which also includes a Query Server. I was
>>>>>> >> >> >> >>> > working with an April snapshot of 2009/04/23.
>>>>>> >> >> >> >>> > This worked fine for all the views and validations that I have
>>>>>> >> >> >> >>> > written. Things were running fine and I could write map/reduce
>>>>>> >> >> >> >>> > and validation functions in Scala.
>>>>>> >> >> >> >>> > Recently I tried to upgrade to trunk. Suddenly the views and
>>>>>> >> >> >> >>> > validations became very very slow. After some fact finding, I
>>>>>> >> >> >> >>> > tried to poke into *couch_query_servers.erl*, since that seemed
>>>>>> >> >> >> >>> > to be the obvious area to look into. I may be wrong though, but
>>>>>> >> >> >> >>> > it was a blind guess.
>>>>>> >> >> >> >>> > I noticed that previously I was working with *revision 749852*
>>>>>> >> >> >> >>> > of the file, which delivered the goods for me. Then when I faced
>>>>>> >> >> >> >>> > problems with the trunk, I started doing a git reset to earlier
>>>>>> >> >> >> >>> > versions of this file. Now I find that it looks like the
>>>>>> >> >> >> >>> > performance problem starts from *revision 780165* of this file.
>>>>>> >> >> >> >>> > Have a look at
>>>>>> >> >> >> >>> > http://svn.apache.org/viewvc/couchdb/trunk/src/couchdb/couch_query_servers.erl?r1=780165&r2=749852&diff_format=h
>>>>>> >> >> >> >>> > for the difference. Looks like there have been some major
>>>>>> >> >> >> >>> > changes. I am just wondering if this change has anything to do
>>>>>> >> >> >> >>> > with the performance issue.
>>>>>> >> >> >> >>> >
>>>>>> >> >> >> >>>
>>>>>> >> >> >> >>> A quick scan of that diff suggests that the only real behavior
>>>>>> >> >> >> >>> change that should affect you is the ["reset"] call for recycled
>>>>>> >> >> >> >>> processes. Maybe reset is expensive in your implementation?
>>>>>> >> >> >> >>>
>>>>>> >> >> >> >>> BTW, have you tried running:
>>>>>> >> >> >> >>>
>>>>>> >> >> >> >>> spec test/query_server_spec.rb -f specdoc --color
>>>>>> >> >> >> >>>
>>>>>> >> >> >> >>> It should be simple to extend that test suite to test your scala
>>>>>> >> >> >> >>> server. If there are patches we can make to make it easier to
>>>>>> >> >> >> >>> integrate outside projects with the query server test suite, I'm
>>>>>> >> >> >> >>> happy to help there as well.
>>>>>> >> >> >> >>>
>>>>>> >> >> >> >>> > Any help, pointer will be appreciated.
>>>>>> >> >> >> >>> >
>>>>>> >> >> >> >>> > Thanks.
>>>>>> >> >> >> >>> > - Debasish
>>>>>> >> >> >> >>> >
>>>>>> >> >> >> >>>
>>>>>> >> >> >> >>> --
>>>>>> >> >> >> >>> Chris Anderson
>>>>>> >> >> >> >>> http://jchrisa.net
>>>>>> >> >> >> >>> http://couch.io
>>>>>> >> >> >> >>>

--
Chris Anderson
http://jchrisa.net
http://couch.io

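For anyone writing a non-JS query server, the two points from this thread can be put together in a rough Scala sketch: stop reading when readLine returns null (end of stream) instead of skipping it, and accept "reset" both as ["reset"] and as ["reset", query_server_options]. Everything named here (QueryServerSketch, Command, Reset, Other, parse, serve) is hypothetical and not taken from the actual Scala driver, and the regex-based matching is only a stand-in for a real JSON parser.

```scala
import java.io.{BufferedReader, PrintWriter}

// Hypothetical sketch, not the actual Scala driver code.
object QueryServerSketch {
  sealed trait Command
  // options carries the raw text of the optional second array element, if present
  final case class Reset(options: Option[String]) extends Command
  final case class Other(raw: String) extends Command

  // Accepts ["reset"] as well as ["reset", <options>].
  // Assumption: a regex is enough for illustration; use a JSON parser in real code.
  private val ResetPattern = """^\[\s*"reset"\s*(?:,\s*(.*\S))?\s*\]$""".r

  def parse(line: String): Command = line.trim match {
    case ResetPattern(opts) => Reset(Option(opts)) // opts is null when no options were sent
    case other              => Other(other)
  }

  // Reads one command per line until end of stream and writes one response per line.
  // readLine returns null only at end of stream, so null ends the loop
  // instead of being ignored. Returns the number of commands handled.
  def serve(in: BufferedReader, out: PrintWriter)(handle: Command => String): Int = {
    var handled = 0
    var line = in.readLine()
    while (line != null) {
      out.println(handle(parse(line)))
      out.flush()
      handled += 1
      line = in.readLine()
    }
    handled
  }
}
```

A caller would wrap System.in in a BufferedReader, wrap System.out in a PrintWriter, and pass a handler that answers true for Reset. What to answer for other commands is left to the handler, since this sketch does not model the rest of the protocol.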