The requests are made asynchronously, so the log does not show the messages in a coherent order. Look at the PID of the reporting/reported process to see how the messages are connected. Try limiting the database names in your requests to a maximum number of characters and run the test again; you should see that the problem does not recur. The no_db_file error shouldn't crash the server, but enametoolong definitely does.
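
To illustrate the point about following PIDs, here is a minimal Python sketch that groups log entries by the PID in the third bracket. It assumes the "[timestamp] [level] [<pid>] message" layout seen in the logs quoted in this thread; the names HEADER and group_by_pid are made up for the example and are not part of CouchDB.

```python
import re
from collections import defaultdict

# Header of a log entry as it appears in this thread:
# [timestamp] [level] [<pid>] message   (layout assumed from the pasted logs)
HEADER = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>\w+)\] \[(?P<pid>[^\]]+)\] ?(?P<msg>.*)$"
)

def group_by_pid(lines):
    """Return {pid: [message, ...]}, folding wrapped continuation lines
    into the entry that precedes them."""
    groups = defaultdict(list)
    current = None  # pid of the most recent header line
    for raw in lines:
        line = raw.rstrip("\n")
        m = HEADER.match(line)
        if m:
            current = m.group("pid")
            groups[current].append(m.group("msg"))
        elif current is not None and line.strip():
            # Continuation of the previous entry (no header of its own).
            joined = groups[current][-1] + " " + line.strip()
            groups[current][-1] = joined.strip()
    return dict(groups)

sample = [
    "[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22636.1>] OAuth Params: []",
    "[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.84.0>] Unexpected message,",
    "restarting couch_server: {'EXIT',<0.22638.1>,",
    "[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.22636.1>] Uncaught error",
    "in HTTP request: {exit,",
]
grouped = group_by_pid(sample)
for pid, msgs in grouped.items():
    print(pid, msgs)
```

Reading the entries for one PID together (for example <0.22636.1>) makes it easier to see which request was in flight when couch_server (<0.84.0>) went down.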

The error presented here is an example of how the supervisor uses its restart strategy to bring a child back to life after registering a crash (see the tuple {status, running} within the crash report).

CGS




On 12/07/2011 03:28 PM, Ramkrishna Kulkarni wrote:
Thanks for your responses, but I'm still not able to understand what's
going on, because I do not see the couch_server restarting and terminating
messages for every request that exceeds the DB name limit. In most
cases I receive '{"error":"error","reason":"enametoolong"}' or
'{"error":"not_found","reason":"no_db_file"}'. But then one of the
requests causes it to terminate.

Pasting more detailed logs which highlight the scenario:
http://pastebin.com/BmDsq4mj

Also, the first set of errors in the logs are related to mochiweb and
I'm not sure if these two things are related.

[Wed, 07 Dec 2011 05:03:02 GMT] [error] [<0.18855.1>] {error_report,<0.31.0>,
     {<0.18855.1>,crash_report,
      [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
        {pid,<0.18855.1>},
        {registered_name,[]},
        {error_info,
            {error,
                {case_clause,{error,enotconn}},
                [{mochiweb_request,get,2},
                 {couch_httpd,handle_request_int,5},
                 {mochiweb_http,headers,5},
                 {proc_lib,init_p_do_apply,3}]}},
        {ancestors,
            [couch_httpd,couch_secondary_services,couch_server_sup,<0.32.0>]},
        {messages,[]},
        {links,[<0.104.0>,#Port<0.2245>]},
        {dictionary,[]},
        {trap_exit,false},
        {status,running},
        {heap_size,17711},
        {stack_size,24},
        {reductions,10141}],
       []]}}

[Wed, 07 Dec 2011 05:03:02 GMT] [error] [<0.104.0>] {error_report,<0.31.0>,
     {<0.104.0>,std_error,
      {mochiweb_socket_server,235,
          {child_error,{case_clause,{error,enotconn}}}}}}


Thanks.

On Wed, Dec 7, 2011 at 6:53 PM, CGS<[email protected]>  wrote:
Another way to avoid the situation is to limit the requests your script
sends to a maximum number of characters.
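
A minimal sketch of such a client-side check, in Python. The cap MAX_DB_NAME_LEN and the helper db_path are hypothetical names chosen for this example; the value 128 is an arbitrary assumption, not a CouchDB constant (the actual maximum is not known in this thread).

```python
from urllib.parse import quote

# Hypothetical cap chosen for illustration only; not a CouchDB constant.
MAX_DB_NAME_LEN = 128

def db_path(name, max_len=MAX_DB_NAME_LEN):
    """Return the URL path for a database name, or None when the name is
    empty or longer than max_len bytes and should not be sent at all."""
    if not name or len(name.encode("utf-8")) > max_len:
        return None
    return "/" + quote(name, safe="")

print(db_path("users"))
print(db_path("a" * 8000))
```

The test script would skip (or log) any request for which db_path returns None instead of sending it to the server.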

About the terminating signal: yes, it's normal. The supervisor is instructed to
restart the generic server at most x times within y seconds. If that
restart limit is reached or exceeded, the supervisor terminates the child
(in this case, the generic server) for good. It seems your script
stubbornly keeps sending requests that crash the generic server, so the
supervisor stops the generic server permanently because of the
above-mentioned limit.
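
As a rough illustration of that limit, here is a simplified Python model of a restart counter over a sliding time window. It is only a sketch of the idea (x restarts allowed within y seconds), not the supervisor's actual code; RestartLimiter and allow_restart are names invented for the example.

```python
from collections import deque

class RestartLimiter:
    """Simplified model: allow at most max_restarts within period_seconds."""

    def __init__(self, max_restarts, period_seconds):
        self.max_restarts = max_restarts   # "x" in the text above
        self.period = period_seconds       # "y" in the text above
        self.restarts = deque()            # timestamps of recent restarts

    def allow_restart(self, now):
        """Record a crash at time `now`; return False once the limit is exceeded."""
        # Forget restarts that fell out of the time window.
        while self.restarts and now - self.restarts[0] > self.period:
            self.restarts.popleft()
        self.restarts.append(now)
        return len(self.restarts) <= self.max_restarts

limiter = RestartLimiter(max_restarts=3, period_seconds=10)
# Four crashes in quick succession: the fourth exceeds the limit.
print([limiter.allow_restart(t) for t in (0, 1, 2, 3)])  # [True, True, True, False]
```

Crashes that are spread out further than the window never trip the limit, which is why only a rapid burst of bad requests brings the server down permanently.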

CGS

PS: It could be a nice idea for the developers to implement a limit on the
length of the names, to keep the generic server from crashing. But this is
up to them.






On 12/07/2011 01:55 PM, Ramkrishna Kulkarni wrote:
Thanks. Restarting does solve the problem, but I was hoping there
is a way to avoid ending up in that situation.

As far as the DB name is concerned, I do not have any DB with a name
longer than 10 characters. However, I did notice that the script
made several requests to a non-existent DB with a name of 8000+
characters.

GET /aaaa.... (8000+ a's)

Almost immediately after that I see the couch_server restarting and then
the couch_server terminating messages. Is this normal behavior?

[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.84.0>] Unexpected message,
restarting couch_server: {'EXIT',
...
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.84.0>] ** Generic server
couch_server terminating
...

Thanks.


On Wed, Dec 7, 2011 at 5:18 PM, CGS<[email protected]>    wrote:
Due to too many crashes in quick succession, the server is terminated
permanently, but not the whole process. Restarting CouchDB should allow
users to log in again.
From the error, you have exceeded the maximum number of characters for the
name of your database. I don't know what the maximum allowed is, but that
long run of a's will certainly not work.

CGS




On 12/07/2011 12:32 PM, Ramkrishna Kulkarni wrote:
I would like to add that around that time I find some "generic server
terminating" messages:

-- Logs --

[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22630.1>] 'GET'


/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
{1,
[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22630.1>] OAuth Params: []
[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22630.1>] Minor error in
HTTP request: {not_found,no_db_file}
[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22630.1>] Stacktrace:
[{couch_httpd_db,do_db_req,2},
[Wed, 07 Dec 2011 05:49:47 GMT] [info] [<0.22630.1>] xxx.xxx.xxx.xxx -
- 'GET'

/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
404
[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22630.1>] httpd 404 error
response:
[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22636.1>] 'GET'


/aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
{1,
[Wed, 07 Dec 2011 05:49:47 GMT] [debug] [<0.22636.1>] OAuth Params: []
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.84.0>] Unexpected message,
restarting couch_server: {'EXIT',<0.22638.1>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [emulator] Error in process
<0.22638.1>      with exit value:


{{case_clause,{error,enametoolong}},[{couch_db,open_db_file,2},{couch_file,open,2},{couch_db,start_link,3},{couch_server,'-open_async/5-fun-0-',4}]}
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.84.0>] ** Generic server
couch_server terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.84.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.79.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.22636.1>] Uncaught error
in HTTP request: {exit,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.155.0>] ** Generic server
<0.155.0>      terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.174.0>] ** Generic server
<0.174.0>      terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.155.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.174.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.236.0>] ** Generic server
<0.236.0>      terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.220.0>] ** Generic server
<0.220.0>      terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.236.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.220.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [info] [<0.224.0>] Shutting down view
group server, monitored db is closing.
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.91.0>] ** Generic server
<0.91.0>      terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.91.0>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.18846.1>] ** Generic
server<0.18846.1>      terminating
[Wed, 07 Dec 2011 05:49:47 GMT] [info] [<0.19172.1>] Shutting down
view group server, monitored db is closing.
[Wed, 07 Dec 2011 05:49:47 GMT] [error] [<0.18846.1>]
{error_report,<0.31.0>,
[Wed, 07 Dec 2011 05:49:47 GMT] [info] [<0.18904.1>] Shutting down
view group server, monitored db is closing.
[Wed, 07 Dec 2011 05:49:47 GMT] [info] [<0.22636.1>] Stacktrace:
[{gen_server,call,3},

-- End --


On Wed, Dec 7, 2011 at 4:37 PM, Ramkrishna Kulkarni
<[email protected]>      wrote:
During some rigorous testing, one of our scripts is making around 250K
GET requests (all different paths) with basic auth. Before this test,
all users are able to log in, but after the test, only a couple of them
are able to log in. For all other valid users I see an unauthorized
message in the logs (mentioned below).

I have changed only the following auth settings
auth_cache_size = 50000
timeout = 3600 ;seconds

I'm currently on 1.0.2.

Please help.

--- Logs ----
[Wed, 07 Dec 2011 10:56:21 GMT] [debug] [<0.23516.2>] 'POST' /_session
{1,1}
Headers: [{'Accept',"application/json, text/javascript, */*; q=0.01,
application/json"},
          {'Accept-Charset',"UTF-8,*;q=0.5"},
          {'Accept-Encoding',"gzip,deflate,sdch"},
          {'Accept-Language',"en-US,en;q=0.8,hi;q=0.6"},
          {'Connection',"keep-alive"},
          {'Content-Length',"59"},
          {'Content-Type',"application/x-www-form-urlencoded"},
          {'Cookie',"AuthSession="},
          {'Host',"xxx.xxx.xx.xxx:5984"},
          {"Origin","http://xxx.xxx.xx.xxx:5984"},
          {'Referer',"http://xxx.xxx.xx.xxx:5984/"},
          {'User-Agent',"Mozilla/5.0 (X11; Linux i686)
AppleWebKit/535.2 (KHTML, like Gecko) Chrome/15.0.874.120
Safari/535.2"},
          {"X-Requested-With","XMLHttpRequest"}]

[Wed, 07 Dec 2011 10:56:21 GMT] [debug] [<0.23516.2>] OAuth Params: []

[Wed, 07 Dec 2011 10:56:21 GMT] [debug] [<0.23516.2>] Attempt Login:
FXXXXXXXe16b2658b5c8f8ed3dcd09d36d08f107

[Wed, 07 Dec 2011 10:56:21 GMT] [info] [<0.23516.2>] yyy.yyy.yyy.yyy -
- 'POST' /_session 401

[Wed, 07 Dec 2011 10:56:21 GMT] [debug] [<0.23516.2>] httpd 401 error
response:
  {"error":"unauthorized","reason":"Name or password is incorrect."}

--- end ---

