I should point out that my test does this:

1) PUT _config/uuid/algorithm with "random"
2) insert some documents
3) PUT _config/uuid/algorithm with "sequential"
4) insert some documents
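
For anyone who wants to try this without my Java client, the four steps above can be sketched roughly as follows. This is not the actual test, only an approximation in Python. The database name "testdb", the base URL, and the function names are hypothetical; the config path is copied as written above and may need adjusting to match the server's config section name. The database is assumed to exist already.

```python
import json
import urllib.request


def plan_iteration(docs_per_phase=10,
                   config_path="/_config/uuid/algorithm",  # as written in this message
                   db="testdb"):                            # hypothetical database name
    """Return the (method, path, body) requests for one pass through steps 1) to 4)."""
    requests = []
    for algorithm in ("random", "sequential"):
        # Steps 1) and 3): switch the UUID algorithm. The body is a JSON string.
        requests.append(("PUT", config_path, json.dumps(algorithm)))
        # Steps 2) and 4): insert some documents, letting the server assign the ids.
        for i in range(docs_per_phase):
            requests.append(("POST", "/" + db, json.dumps({"n": i})))
    return requests


def send(base_url, method, path, body):
    """Send one request and return the HTTP status code."""
    req = urllib.request.Request(
        base_url + path,
        data=body.encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def run(base_url="http://127.0.0.1:5984", iterations=1000):
    """Loop the four steps. A refused connection surfaces as urllib.error.URLError."""
    for _ in range(iterations):
        for method, path, body in plan_iteration():
            send(base_url, method, path, body)
```

Calling run() against a local instance loops until the iteration count is reached or a request fails. The plan is built separately from the sending code so the request sequence can be inspected without a server.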

If you loop that, and insert as few as 10 documents at 2) and 4), you
will get a connection refused error and the stack trace output within
60 seconds.

On Sat, Oct 3, 2009 at 2:33 PM, Robert Newson <[email protected]> wrote:
> Ok, I've got a little further. If I change my test to much shorter runs
> (even 10 documents), I can reproduce the connection refused symptom
> and the stacktrace I pasted originally in under a minute, every time.
>
> What appears to be happening is that the couch_uuids gen_server is
> failing (being restarted too frequently), part of the supervision tree
> is torn down and rebuilt, and a concurrent write operation fails while
> that is happening. Since I'm pretty sure that's not what should happen
> with Erlang/OTP, it's hopefully a straightforward bug.
>
> Alas, my test client is in Java (using httpclient 4.0, fwiw), so I
> can't easily post a unit test for this right now.
>
> B.
>
> On Sat, Oct 3, 2009 at 1:52 PM, Robert Newson <[email protected]> wrote:
>> A subsequent run that encountered the connection refused error did not
>> cause the couch_uuids supervisor to restart it, so the two problems
>> are unrelated.
>>
>> On Sat, Oct 3, 2009 at 1:50 PM, Robert Newson <[email protected]>
>> wrote:
>>> Hi,
>>>
>>> Jan suggested I start a thread on dev about a problem I'm encountering
>>> on couchdb trunk. I'm performing long running insertion tests (that
>>> is, millions of inserts) in order to quantify the differences between
>>> batch vs. sync and random identifiers vs. sequential ones. I find it
>>> hard to complete a 5 million insertion run as my client eventually
>>> (and randomly) gets a "connection refused" error from couchdb.
>>> Immediately after that occurs, I can successfully hit couchdb with
>>> curl, so it's transitory. I found the following errors in the log
>>> around the time of the problem;
>>>
>>> =SUPERVISOR REPORT==== 3-Oct-2009::13:32:18 ===
>>>      Supervisor: {local,couch_secondary_services}
>>>      Context:    shutdown
>>>      Reason:     reached_max_restart_intensity
>>>      Offender:   [{pid,<0.5273.0>},
>>>                   {name,uuids},
>>>                   {mfa,{couch_uuids,start,[]}},
>>>                   {restart_type,permanent},
>>>                   {shutdown,brutal_kill},
>>>                   {child_type,worker}]
>>>
>>> [error] [<0.76.0>] {error_report,<0.30.0>,
>>>     {<0.76.0>,supervisor_report,
>>>      [{supervisor,{local,couch_server_sup}},
>>>       {errorContext,child_terminated},
>>>       {reason,shutdown},
>>>       {offender,
>>>           [{pid,<0.2218.0>},
>>>            {name,couch_secondary_services},
>>>            {mfa,{couch_server_sup,start_secondary_services,[]}},
>>>            {restart_type,permanent},
>>>            {shutdown,infinity},
>>>            {child_type,supervisor}]}]}}
>>>
>>> =SUPERVISOR REPORT==== 3-Oct-2009::13:32:18 ===
>>>      Supervisor: {local,couch_server_sup}
>>>      Context:    child_terminated
>>>      Reason:     shutdown
>>>      Offender:   [{pid,<0.2218.0>},
>>>                   {name,couch_secondary_services},
>>>                   {mfa,{couch_server_sup,start_secondary_services,[]}},
>>>                   {restart_type,permanent},
>>>                   {shutdown,infinity},
>>>                   {child_type,supervisor}]
>>>
>>>
>>> =ERROR REPORT==== 3-Oct-2009::13:32:18 ===
>>> Error in process <0.5316.0> with exit value:
>>> {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_databases},-1}]},{couch_stats_collector,decrement,1}]}
>>>
>>>
>>> =ERROR REPORT==== 3-Oct-2009::13:32:18 ===
>>> Error in process <0.5312.0> with exit value:
>>> {badarg,[{ets,insert,[stats_hit_table,{{couchdb,open_os_files},-1}]},{couch_stats_collector,decrement,1}]}
>>>
>>
>
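
As background on the reached_max_restart_intensity reason in the first report: an OTP supervisor is documented to give up, terminating its children and then itself, when more than MaxR restarts happen within MaxT seconds. Below is a minimal Python sketch of that bookkeeping only. The class name and the limit values in the usage note are hypothetical; they are not taken from CouchDB's source, and nothing here shows what actually triggers the couch_uuids restarts.

```python
from collections import deque


class RestartWindow:
    """Sliding-window restart counter, modelled on the documented OTP rule:
    more than max_restarts restarts within max_seconds exceeds the limit."""

    def __init__(self, max_restarts, max_seconds):
        self.max_restarts = max_restarts
        self.max_seconds = max_seconds
        self.times = deque()  # timestamps of restarts still inside the window

    def record_restart(self, now):
        """Record a restart at time `now` (in seconds).

        Returns True while the supervisor would keep going, and False once
        the restart intensity is exceeded.
        """
        self.times.append(now)
        # Forget restarts that have fallen out of the window.
        while self.times and now - self.times[0] > self.max_seconds:
            self.times.popleft()
        return len(self.times) <= self.max_restarts
```

With hypothetical limits of 10 restarts per 3600 seconds, a child restarted once every 5 seconds exhausts the budget on its 11th restart, in under a minute. If each algorithm switch in my loop causes one restart (an assumption), that would fit with how quickly the short runs fail.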
