Re: CouchDB 1.0.2 errors under load

Paul Davis Fri, 25 Feb 2011 09:06:06 -0800

On Fri, Feb 25, 2011 at 4:18 AM, Pasi Eronen <[email protected]> wrote:
> Hi,
>
> I had a big batch job (inserting 10M+ documents and generating views for them)
> that ran just fine for about 6 hours, but then I got this error:
>
> [Thu, 24 Feb 2011 19:42:57 GMT] [error] [<0.276.0>] ** Generic server
> <0.276.0> terminating
> ** Last message in was delayed_commit
> ** When Server state == {db,<0.275.0>,<0.276.0>,nil,<<"1298547642391489">>,
>                            <0.273.0>,<0.277.0>,
>                            {db_header,5,739828,0,
>                                {4778613011,{663866,0}},
>                                {4778614954,663866},
>                                nil,0,nil,nil,1000},
>                            739828,
>                            {btree,<0.273.0>,
>                                {4778772755,{663866,0}},
>                                #Fun<couch_db_updater.7.10053969>,
>                                #Fun<couch_db_updater.8.35220795>,
>                                #Fun<couch_btree.5.124754102>,
>                                #Fun<couch_db_updater.9.107593676>},
>                            {btree,<0.273.0>,
>                                {4778774698,663866},
>                                #Fun<couch_db_updater.10.30996817>,
>                                #Fun<couch_db_updater.11.96515267>,
>                                #Fun<couch_btree.5.124754102>,
>                                #Fun<couch_db_updater.12.117826253>},
>                            {btree,<0.273.0>,nil,
>                                #Fun<couch_btree.0.83553141>,
>                                #Fun<couch_btree.1.30790806>,
>                                #Fun<couch_btree.2.124754102>,nil},
>                            739831,<<"foo_replication_tmp">>,
>                            "/data/foo/couchdb-data/foo_replication_tmp.couch",
>                            [],[],nil,
>                            {user_ctx,null,[],undefined},
>                            #Ref<0.0.1793.256453>,1000,
>                            [before_header,after_header,on_file_open],
>                            false}
> ** Reason for termination ==
> ** {{badmatch,{error,emfile}},
>    [{couch_file,sync,1},
>     {couch_db_updater,commit_data,2},
>     {couch_db_updater,handle_info,2},
>     {gen_server,handle_msg,5},
>     {proc_lib,init_p_do_apply,3}]}
>
> (+lot of other messages with the same timestamp -- can send if they're useful)
>
> Exactly at this time, the client got HTTP 500 status code; the request
> was a bulk get (POST /foo_replication_tmp/_all_docs?include_docs=true).
>
> Just before this request, the client had made a PUT (updating an existing
> document) that got 200 status code, but apparently was not successfully
> committed to the disk (I'm using "delayed_commits=true" - for my app,
> this is just fine). The client had received the new _rev value, but when
> it tried updating the same document a minute later, there was a conflict
> (and it's not possible that somebody else updated this same document).
>
> About four hours later, there was a different error ("accept_failed"
> sounds like some temporary problem with sockets?):
>
> [Thu, 24 Feb 2011 23:55:42 GMT] [error] [<0.20693.4>] {error_report,<0.31.0>,
>              {<0.20693.4>,std_error,
>               [{application,mochiweb},
>                "Accept failed error","{error,emfile}"]}}
>
> [Thu, 24 Feb 2011 23:55:42 GMT] [error] [<0.20693.4>] {error_report,<0.31.0>,
>    {<0.20693.4>,crash_report,
>     [[{initial_call,{mochiweb_socket_server,acceptor_loop,['Argument__1']}},
>       {pid,<0.20693.4>},
>       {registered_name,[]},
>       {error_info,
>           {exit,
>               {error,accept_failed},
>               [{mochiweb_socket_server,acceptor_loop,1},
>                {proc_lib,init_p_do_apply,3}]}},
>       {ancestors,
>           [couch_httpd,couch_secondary_services,couch_server_sup,<0.32.0>]},
>       {messages,[]},
>       {links,[<0.106.0>]},
>       {dictionary,[]},
>       {trap_exit,false},
>       {status,running},
>       {heap_size,233},
>       {stack_size,24},
>       {reductions,200}],
>      []]}}
>
> (+lots of other messages within the next couple of minutes)
>
> The same error occurred once more, about four hours later.
>
> I'm quite new to CouchDB, so I'd appreciate any help in interpreting
> what these error messages mean. (BTW, are these something I should
> report as bugs in JIRA? I can do that, but I'd like to at least understand
> which parts of the error messages are actually relevant here :-)
>
> I'm running CouchDB 1.0.2 with Erlang R14B on 64-bit RHEL 5.6.
>
> Best regards,
> Pasi
>


Both errors are {error,emfile}, which means CouchDB is running out of
available file descriptors. Try increasing the open-file limit for the
user running CouchDB.
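
To see how close a process is to that limit, something like the
following rough sketch may help. It is not part of CouchDB; it assumes
Linux (it reads /proc) and a Python interpreter, and the function names
nofile_limits and count_open_fds are made up for illustration.

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(pid):
    """Count open descriptors of a process via /proc (Linux only).

    Returns None if /proc/<pid>/fd is missing or not readable.
    """
    try:
        return len(os.listdir("/proc/%d/fd" % pid))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft, "hard limit:", hard)
    print("open descriptors in this process:", count_open_fds(os.getpid()))
```

The limits printed are those of the process running the script, so run
it as the CouchDB user to see what that user gets; the soft value is the
same number "ulimit -n" prints in a shell. Passing the pid of the
CouchDB Erlang VM to count_open_fds should show how many descriptors it
currently holds, provided you have permission to read its /proc entry.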
