Re: "view_conflicts" test fail

Adam Kocoloski Sun, 05 Apr 2009 06:12:40 -0700


On Apr 4, 2009, at 6:53 PM, Patric Fors wrote:

On 5 Apr 2009, at 00:23, Adam Kocoloski wrote:
On Apr 4, 2009, at 5:44 PM, Patric Fors wrote:
Hi,
Should I be worried that the "view_conflicts" test fails in the Test Suite? I mean, is it the test that fails, or is it couchdb that fails the test. :-)
Hi Patric, did you happen to run that test with Safari? view_conflicts fails for me in Safari 4, but passes in Firefox 3 and in the command-line runner. In other words, I think it's the test that fails, not Couch :-)
Aha, thanks!
And, yes, Safari was the browser I used, I confess :-)
Ran it again with Firefox and it's all good: 44 of 44 test(s) run, 0 failures (55178 ms)
Hm... Command-line runner? Must have missed that one.
Well, while we are on the command line, I guess these errors are also part of the Test Suite tests?
[info] [<0.10759.0>] 127.0.0.1 - - 'POST' /test_suite_db/_ensure_full_commit 201


<snipped file descriptor traceback>

[info] [<0.10759.0>] 127.0.0.1 - - 'POST' /_restart 200


/Patric

Hi Patric, funny you should bring that up. I've been trying to understand the source of those tracebacks myself. Short answer is that you probably don't have anything to worry about. Long answer follows ...

CouchDB uses a single file on disk for each database it creates, and all access to that file goes through a reference-counted gen_server using couch_file as the callback module. The tracebacks in the logs occur when a couch_file gen_server terminates abnormally, where "abnormally" just means that the reason given in the exit signal is something other than "normal". It happens rarely, and only when a database is deleted or the server is restarted, both of which occur much more frequently in the test suite than they do in normal operation. It's not necessarily indicative of a problem.

I believe the issue is one of message ordering. In normal operation couch_ref_counter is supposed to stop couch_file when the DB is deleted or the server restarted. In your log, couch_ref_counter is the neighbour at <0.10708.0>. Take a look at the ref_counter's message queue:


{messages, [{'DOWN',#Ref<0.0.0.128931>,process,<0.10644.0>,killed}]},

When the ref_counter processes that message I believe it will trigger a normal shutdown of the couch_file. Unfortunately, couch_file got the message about the couch_server at <0.10644.0> going down first, so you see what looks like a crash. The reason this is not a problem is that couch_file doesn't do anything differently for a normal or abnormal termination. The only difference is that the Erlang logger pukes out this stacktrace if it's an abnormal termination.
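
To make the ordering argument concrete, here is a toy model in Python. It is not CouchDB or Erlang code, only a sketch of the race as I understand it; the class name, event names, and the run() helper are all made up for illustration. It shows that the same three events give couch_file a "normal" or a "killed" exit reason depending only on which one is handled first.

```python
from collections import deque


class FileProc:
    """Toy stand-in for the couch_file process (hypothetical, not CouchDB code)."""

    def __init__(self):
        self.exit_reason = None  # None while the process is still running

    def handle(self, msg):
        if self.exit_reason is not None:
            return  # already terminated; anything arriving later is ignored
        if msg == "stop":
            self.exit_reason = "normal"  # shutdown requested by the ref counter
        elif msg == "server_killed":
            self.exit_reason = "killed"  # abnormal reason, so a traceback is logged


def run(delivery_order):
    """Handle the events in the given order; return couch_file's exit reason."""
    file_proc = FileProc()
    ref_counter_mailbox = deque()
    for event in delivery_order:
        if event == "down_to_ref_counter":
            # The 'DOWN' message is queued but not yet processed.
            ref_counter_mailbox.append("DOWN")
        elif event == "ref_counter_runs":
            # The ref counter drains its mailbox and stops the file normally.
            while ref_counter_mailbox:
                ref_counter_mailbox.popleft()
                file_proc.handle("stop")
        elif event == "exit_to_file":
            # couch_file hears directly that the server went down.
            file_proc.handle("server_killed")
    return file_proc.exit_reason


intended = ["down_to_ref_counter", "ref_counter_runs", "exit_to_file"]
observed = ["down_to_ref_counter", "exit_to_file", "ref_counter_runs"]

print(run(intended))  # normal
print(run(observed))  # killed
```

In the "observed" ordering the 'DOWN' message is still sitting in the ref counter's mailbox when couch_file terminates, which matches the message queue shown in the log above.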

I think we should look into refactoring the couch_file/couch_ref_counter stuff a bit; the current workflow (server spawns file and unlinks, server spawns ref_counter, ref_counter links to file) is pretty tough to follow and opens us up to these occasional tracebacks in the logs. Anyway, thanks for listening. Cheers,


Adam

P.S. Any Erlangers out there might have noticed something odd here. couch_server spawn_links a couch_file and then unlinks it, so why does couch_file terminate when the server does? Chandru Mullaparthi (of ibrowse fame) pointed out an undocumented OTP feature that seems to be responsible in this thread:


http://groups.google.com/group/erlang-programming/browse_thread/thread/8ab392fedcad19b6
