[ 
https://issues.apache.org/jira/browse/COUCHDB-1212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059255#comment-13059255
 ] 

Robert Newson commented on COUCHDB-1212:
----------------------------------------

Jan,

Don't worry about it. I'm not aware of the degree to which the CouchDB source 
code included in Couchbase Server varies from stock, though I believe it's very 
little (perhaps not at all?).

As to the proposed patch, Filipe is suggesting that your system is slowed 
significantly enough during compaction (it must read 4.5 GB of data, and not 
always sequentially, in addition to writing all live data to a new file) that 
you hit timeouts in places where we never anticipated them. A gen_server call 
defaults to a 5 second timeout, so it seems plausible that your GET to _users 
took longer. We have a history of extending the timeout on any gen_server:call 
to infinity if it has to consult the disk, as we cannot predict how slowly a 
disk will respond.
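
To illustrate the timeout behaviour described above, here is a minimal sketch. 
It is not the CouchDB code and not the attached patch: the module name 
slow_call_demo and the {read, DelayMs} message are hypothetical, and 
timer:sleep/1 merely stands in for a slow disk read.

```erlang
-module(slow_call_demo).
-behaviour(gen_server).

-export([start_link/0, read_default/2, read_infinity/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% gen_server:call/2 uses the default 5000 ms timeout. If the reply
%% takes longer, the *caller* exits with {timeout, {gen_server, call, _}}.
read_default(Pid, DelayMs) ->
    gen_server:call(Pid, {read, DelayMs}).

%% gen_server:call/3 with 'infinity' waits for as long as the server needs.
read_infinity(Pid, DelayMs) ->
    gen_server:call(Pid, {read, DelayMs}, infinity).

init([]) ->
    {ok, nostate}.

%% Hypothetical request: sleep to simulate a disk that is busy
%% (for example, during compaction), then reply.
handle_call({read, DelayMs}, _From, State) ->
    timer:sleep(DelayMs),
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.
```

Assuming a Pid obtained from slow_call_demo:start_link(), evaluating 
catch slow_call_demo:read_default(Pid, 6000) in a shell should produce an exit 
whose reason has the form {timeout,{gen_server,call,[...]}}, the same shape as 
the termination reason in the log quoted below, whereas 
slow_call_demo:read_infinity(Pid, 6000) simply waits and returns ok.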

I'm confident that the bug you report is real and present in core CouchDB (and 
therefore in Couchbase Server, which has a small or zero delta from CouchDB). 
It may or may not be present in other products like BigCouch which have a 
larger delta, which is why it's important to file tickets appropriately. Anyone 
using CouchDB in their product will monitor this issue tracker in addition to 
monitoring the repository itself for fixes.

 

> Newly created user accounts cannot sign-in after _user database crashes 
> ------------------------------------------------------------------------
>
>                 Key: COUCHDB-1212
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-1212
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core, HTTP Interface
>    Affects Versions: 1.0.2
>         Environment: Ubuntu 10.10, Erlang R14B02 (erts-5.8.3)
>            Reporter: Jan van den Berg
>            Priority: Critical
>              Labels: _users, authentication
>         Attachments: couchdb-1212.patch
>
>
> We have one (4.5 GB) couch database and we use the (default) _users database 
> to store user accounts for a website. Once a week we need to restart couchdb 
> because newly signed-up user accounts cannot log in any more. They get an HTTP 
> status code 401 from the _session HTTP interface. We update and compact the 
> database three times a day.
> This is the stack trace I see in the couch database log prior to when these 
> issues occur.
> ----------- couch.log ---------------
> [Wed, 29 Jun 2011 22:02:46 GMT] [info] [<0.117.0>] Starting compaction for db 
> "fbm"
> [Wed, 29 Jun 2011 22:02:46 GMT] [info] [<0.5753.79>] 127.0.0.1 - - 'POST' 
> /fbm/_compact 202
> [Wed, 29 Jun 2011 22:02:46 GMT] [info] [<0.5770.79>] 127.0.0.1 - - 'POST' 
> /fbm/_view_cleanup 202
> [Wed, 29 Jun 2011 22:10:19 GMT] [info] [<0.5773.79>] 86.9.246.184 - - 'GET' 
> /_session 200
> [Wed, 29 Jun 2011 22:24:39 GMT] [info] [<0.6236.79>] 85.28.105.161 - - 'GET' 
> /_session 200
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.84.0>] ** Generic server 
> couch_server terminating 
> ** Last message in was {open,<<"fbm">>,
>                              [{user_ctx,{user_ctx,null,[],undefined}}]}
> ** When Server state == {server,"/opt/couchbase-server/var/lib/couchdb",
>                             {re_pattern,0,0,
>                                 <<69,82,67,80,116,0,0,0,16,0,0,0,1,0,0,0,0,0,
>                                   0,0,0,0,0,0,40,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>                                   0,93,0,72,25,77,0,0,0,0,0,0,0,0,0,0,0,0,254,
>                                   255,255,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
>                                   77,0,0,0,0,16,171,255,3,0,0,0,128,254,255,
>                                   255,7,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,69,26,
>                                   84,0,72,0>>},
>                             100,2,"Sat, 18 Jun 2011 14:00:44 GMT"}
> ** Reason for termination == 
> ** {timeout,{gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.84.0>] {error_report,<0.31.0>,
>     {<0.84.0>,crash_report,
>      [[{initial_call,{couch_server,init,['Argument__1']}},
>        {pid,<0.84.0>},
>        {registered_name,couch_server},
>        {error_info,
>            {exit,
>                {timeout,
>                    {gen_server,call,
>                        [<0.116.0>,{open_ref_count,<0.10417.79>}]}},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[couch_primary_services,couch_server_sup,<0.32.0>]},
>        {messages,[]},
>        {links,[<0.91.0>,<0.483.0>,<0.116.0>,<0.79.0>]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,6765},
>        {stack_size,24},
>        {reductions,206710598}],
>       []]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.79.0>] {error_report,<0.31.0>,
>     {<0.79.0>,supervisor_report,
>      [{supervisor,{local,couch_primary_services}},
>       {errorContext,child_terminated},
>       {reason,
>           {timeout,
>               {gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}},
>       {offender,
>           [{pid,<0.84.0>},
>            {name,couch_server},
>            {mfargs,{couch_server,sup_start_link,[]}},
>            {restart_type,permanent},
>            {shutdown,1000},
>            {child_type,worker}]}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.91.0>] ** Generic server <0.91.0> 
> terminating 
> ** Last message in was {'EXIT',<0.84.0>,
>                            {timeout,
>                                {gen_server,call,
>                                    [<0.116.0>,
>                                     {open_ref_count,<0.10417.79>}]}}}
> ** When Server state == {db,<0.91.0>,<0.92.0>,nil,<<"1308405644393791">>,
>                             <0.90.0>,<0.94.0>,
>                             {db_header,5,91,0,
>                                 {378285,{30,9}},
>                                 {380466,39},
>                                 nil,0,nil,nil,1000},
>                             91,
>                             {btree,<0.90.0>,
>                                 {378285,{30,9}},
>                                 #Fun<couch_db_updater.7.10053969>,
>                                 #Fun<couch_db_updater.8.35220795>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.9.107593676>},
>                             {btree,<0.90.0>,
>                                 {380466,39},
>                                 #Fun<couch_db_updater.10.30996817>,
>                                 #Fun<couch_db_updater.11.96515267>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.12.117826253>},
>                             {btree,<0.90.0>,nil,#Fun<couch_btree.0.83553141>,
>                                 #Fun<couch_btree.1.30790806>,
>                                 #Fun<couch_btree.2.124754102>,nil},
>                             91,<<"_users">>,
>                             
> "/opt/couchbase-server/var/lib/couchdb/_users.couch",
>                             [#Fun<couch_doc.7.50754398>],
>                             [],nil,
>                             {user_ctx,null,[],undefined},
>                             nil,1000,
>                             [before_header,after_header,on_file_open],
>                             true}
> ** Reason for termination == 
> ** {timeout,{gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.91.0>] {error_report,<0.31.0>,
>     {<0.91.0>,crash_report,
>      [[{initial_call,{couch_db,init,['Argument__1']}},
>        {pid,<0.91.0>},
>        {registered_name,[]},
>        {error_info,
>            {exit,
>                {timeout,
>                    {gen_server,call,
>                        [<0.116.0>,{open_ref_count,<0.10417.79>}]}},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[<0.89.0>]},
>        {messages,[]},
>        {links,[]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,610},
>        {stack_size,24},
>        {reductions,8797798}],
>       []]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [info] [<0.300.0>] Shutting down view group 
> server, monitored db is closing.
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.10417.79>] Uncaught error in HTTP 
> request: {exit,
>                                  {{timeout,
>                                    {gen_server,call,
>                                     [<0.116.0>,
>                                      {open_ref_count,<0.10417.79>}]}},
>                                   {gen_server,call,
>                                    [couch_server,
>                                     {open,<<"fbm">>,
>                                      [{user_ctx,
>                                        {user_ctx,null,[],undefined}}]},
>                                     infinity]}}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.483.0>] ** Generic server 
> <0.483.0> terminating 
> ** Last message in was {'EXIT',<0.84.0>,
>                            {timeout,
>                                {gen_server,call,
>                                    [<0.116.0>,
>                                     {open_ref_count,<0.10417.79>}]}}}
> ** When Server state == {db,<0.483.0>,<0.484.0>,nil,<<"1308405937993370">>,
>                             <0.4643.19>,<0.4645.19>,
>                             {db_header,5,890453,0,
>                                 {3279126950,{752003,0}},
>                                 {3279118313,752003},
>                                 {3279132318,[]},
>                                 0,nil,3279127184,1000},
>                             890453,
>                             {btree,<0.4643.19>,
>                                 {3279126950,{752003,0}},
>                                 #Fun<couch_db_updater.7.10053969>,
>                                 #Fun<couch_db_updater.8.35220795>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.9.107593676>},
>                             {btree,<0.4643.19>,
>                                 {3279118313,752003},
>                                 #Fun<couch_db_updater.10.30996817>,
>                                 #Fun<couch_db_updater.11.96515267>,
>                                 #Fun<couch_btree.5.124754102>,
>                                 #Fun<couch_db_updater.12.117826253>},
>                             {btree,<0.4643.19>,
>                                 {3279132318,[]},
>                                 #Fun<couch_btree.0.83553141>,
>                                 #Fun<couch_btree.1.30790806>,
>                                 #Fun<couch_btree.2.124754102>,nil},
>                             890453,<<"fbm_full">>,
>                             
> "/opt/couchbase-server/var/lib/couchdb/fbm_full.couch",
>                             [#Fun<couch_doc.7.50754398>],
>                             [{<<"admins">>,
>                               {[{<<"names">>,[]},
>                                 {<<"roles">>,[<<"import">>]}]}},
>                              {<<"readers">>,
>                               {[{<<"names">>,[]},{<<"roles">>,[]}]}}],
>                             3279127184,
>                             {user_ctx,null,[],undefined},
>                             nil,1000,
>                             [before_header,after_header,on_file_open],
>                             false}
> ** Reason for termination == 
> ** {timeout,{gen_server,call,[<0.116.0>,{open_ref_count,<0.10417.79>}]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [error] [<0.483.0>] {error_report,<0.31.0>,
>     {<0.483.0>,crash_report,
>      [[{initial_call,{couch_db,init,['Argument__1']}},
>        {pid,<0.483.0>},
>        {registered_name,[]},
>        {error_info,
>            {exit,
>                {timeout,
>                    {gen_server,call,
>                        [<0.116.0>,{open_ref_count,<0.10417.79>}]}},
>                [{gen_server,terminate,6},{proc_lib,init_p_do_apply,3}]}},
>        {ancestors,[<0.480.0>]},
>        {messages,[]},
>        {links,[]},
>        {dictionary,[]},
>        {trap_exit,true},
>        {status,running},
>        {heap_size,6765},
>        {stack_size,24},
>        {reductions,1389}],
>       []]}}
> [Wed, 29 Jun 2011 22:25:06 GMT] [info] [<0.2984.19>] Shutting down view group 
> server, monitored db is closing.
> [Wed, 29 Jun 2011 22:25:06 GMT] [info] [<0.10417.79>] Stacktrace: 
> [{gen_server,call,3},
>              {couch_server,open,2},
>              {couch_db,open,2},
>              {couch_httpd_db,do_db_req,2},
>              {couch_httpd,handle_request_int,5},
>              {mochiweb_http,headers,5},
>              {proc_lib,init_p_do_apply,3}]
> ------- end --------
> Here's the log file of me signing in as an admin, creating a new user, and 
> trying to sign in as the newly created user. 
> ------ couch.log -------
> [Fri, 01 Jul 2011 18:37:16 GMT] [info] [<0.20439.91>] 93.92.103.118 - - 
> 'POST' /_session 200
> [Fri, 01 Jul 2011 18:37:16 GMT] [info] [<0.20457.91>] checkpointing view 
> update at seq 91 for _users _design/_auth
> [Fri, 01 Jul 2011 18:37:16 GMT] [info] [<0.20439.91>] 93.92.103.118 - - 'GET' 
> /_users/_design/_auth/_list/secure/users 200
> [Fri, 01 Jul 2011 18:38:35 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'PUT' 
> /_users/org.couchdb.user:[email protected] 201
> [Fri, 01 Jul 2011 18:38:35 GMT] [info] [<0.20457.91>] checkpointing view 
> update at seq 92 for _users _design/_auth
> [Fri, 01 Jul 2011 18:38:35 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'GET' 
> /_users/_design/_auth/_list/secure/users 200
> [Fri, 01 Jul 2011 18:38:47 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'GET' 
> /_users/_design/_auth/_list/secure/users?key=%22org.couchdb.user:[email protected]%22
>  200
> [Fri, 01 Jul 2011 18:38:47 GMT] [info] [<0.20456.91>] 93.92.103.118 - - 'PUT' 
> /_users/org.couchdb.user:[email protected] 201
> [Fri, 01 Jul 2011 18:39:00 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 'GET' 
> /_session 200
> [Fri, 01 Jul 2011 18:39:01 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 'GET' 
> /fbm/_design/api/_list/secure/competitions 200
> [Fri, 01 Jul 2011 18:39:12 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 
> 'POST' /_session 401
> [Fri, 01 Jul 2011 18:39:22 GMT] [info] [<0.20547.91>] 93.92.103.118 - - 
> 'POST' /_session 401
> ------- end ---------
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
