Dave,
Thank you for the quick reply. The issues appear in a production environment
where I don't have access to modify the configuration or the design documents.
The log level at the moment is "error".
Below is a lengthy log dump we got when os_process_timeout was 5000. After
increasing the timeout to 30000, there have been no log entries at all when
indexing stops.
-----
[Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15656.0>] OS Process Error
<0.15657.0> :: {os_process_error, "OS process timed out."}
[Thu, 31 May 2012 17:42:17 GMT] [error] [emulator] Error in process <0.15656.0>
with exit value: {{nocatch,{os_process_error,"OS process timed
out."}},[{couch_os_process,prompt,2},{couch_query_servers,map_doc_raw,2},{couch_view_updater,'-do_maps/3-fun-0-',3},{couch_view_updater,do_maps,3}]}
[Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15648.0>] ** Generic server
<0.15648.0> terminating
** Last message in was {'EXIT',<0.15653.0>,
{{nocatch,
{os_process_error,"OS process timed out."}},
[{couch_os_process,prompt,2},
{couch_query_servers,map_doc_raw,2},
{couch_view_updater,'-do_maps/3-fun-0-',3},
{couch_view_updater,do_maps,3}]}}
** When Server state == {group_state,undefined,<<"mutka_replicated">>,
{"/data/mutka/couchdb-index",<<"mutka_replicated">>,
{group,
<<223,185,95,248,235,18,77,64,18,164,253,96,95,237,
204,20>>,
nil,<<"_design/transactionA-1.2.0">>,
<<"javascript">>,[],
[{view,0,0,0,
[<<"transactionByPaymentInstrument">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.paymentInstrumentId) {
emit([doc.paymentInstrumentId,doc.startTimestamp], null); } }">>,
nil,[],[]},
{view,1,0,0,
[<<"transactionByTerminal">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.paymentTerminalId) {
emit([doc.paymentTerminalId,doc.startTimestamp], null); } }">>,
nil,[],[]},
{view,2,0,0,
[<<"transactionBySession">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.protocolSessionId) {
emit(doc.protocolSessionId,doc.protocolTransactionId); } }">>,
nil,[],[]},
{view,3,0,0,
[<<"transactionByRayId">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.cId) { emit([-(-doc.cId),doc.startTimestamp],
null); } }">>,
nil,[],[]}],
{[]},
nil,0,0,nil,nil}},
{group,
<<223,185,95,248,235,18,77,64,18,164,253,96,95,237,
204,20>>,
<0.15650.0>,<<"_design/transactionA-1.2.0">>,
<<"javascript">>,[],
[{view,0,236439939,0,
[<<"transactionByPaymentInstrument">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.paymentInstrumentId) {
emit([doc.paymentInstrumentId,doc.startTimestamp], null); } }">>,
{btree,<0.15650.0>,
{47573274456,{8694059,[]},257926106},
#Fun<couch_btree.3.71804109>,
#Fun<couch_btree.4.115144917>,
#Fun<couch_view.less_json_ids.2>,
#Fun<couch_view_group.10.26766604>,snappy},
[],[]},
{view,1,236439939,0,
[<<"transactionByTerminal">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.paymentTerminalId) {
emit([doc.paymentTerminalId,doc.startTimestamp], null); } }">>,
{btree,<0.15650.0>,
{47574093427,{33638477,[]},942288018},
#Fun<couch_btree.3.71804109>,
#Fun<couch_btree.4.115144917>,
#Fun<couch_view.less_json_ids.2>,
#Fun<couch_view_group.10.26766604>,snappy},
[],[]},
{view,2,236439939,0,
[<<"transactionBySession">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.protocolSessionId) {
emit(doc.protocolSessionId,doc.protocolTransactionId); } }">>,
{btree,<0.15650.0>,
{47574114746,{9241366,[]},131141244},
#Fun<couch_btree.3.71804109>,
#Fun<couch_btree.4.115144917>,
#Fun<couch_view.less_json_ids.2>,
#Fun<couch_view_group.10.26766604>,snappy},
[],[]},
{view,3,236433956,0,
[<<"transactionByRayId">>],
<<"function(doc) { if (doc.objectType ==
\"ProtocolTransaction\" && doc.cId) { emit([-(-doc.cId),doc.startTimestamp],
null); } }">>,
{btree,<0.15650.0>,
{47559121340,{2250018,[]},76590679},
#Fun<couch_btree.3.71804109>,
#Fun<couch_btree.4.115144917>,
#Fun<couch_view.less_json_ids.2>,
#Fun<couch_view_group.10.26766604>,snappy},
[],[]}],
{[]},
{btree,<0.15650.0>,
{47572622835,[],1061098089},
#Fun<couch_btree.3.71804109>,
#Fun<couch_btree.4.115144917>,
#Fun<couch_btree.5.93788370>,nil,snappy},
236439939,0,nil,nil},
<0.15653.0>,nil,false,
[{{<0.15441.0>,#Ref<0.0.0.182446>},409571621}],
<0.15652.0>,false}
** Reason for termination ==
** {os_process_error,"OS process timed out."}
[Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15648.0>] {error_report,<0.31.0>,
{<0.15648.0>,crash_report,
[[{initial_call,
{couch_view_group,init,['Argument__1']}},
{pid,<0.15648.0>},
{registered_name,[]},
{error_info,
{exit,
{os_process_error,"OS process timed out."},
[{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]}},
{ancestors,[<0.15647.0>]},
{messages,[]},
{links,[<0.15650.0>,<0.123.0>]},
{dictionary,[]},
{trap_exit,true},
{status,running},
{heap_size,2584},
{stack_size,24},
{reductions,18059924}],
[]]}}
[Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15441.0>] Uncaught server error:
{os_process_error, <<"OS process timed out.">>}
[Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15650.0>] ** Generic server
<0.15650.0> terminating
** Last message in was {'EXIT',<0.15648.0>, {os_process_error,"OS process timed
out."}}
** When Server state == {file,{file_descriptor,prim_file,{#Port<0.2119>,19}},
47574426987}
** Reason for termination ==
** {os_process_error,"OS process timed out."}
[Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15650.0>] {error_report,<0.31.0>,
{<0.15650.0>,crash_report,
[[{initial_call,{couch_file,init,['Argument__1']}},
{pid,<0.15650.0>},
{registered_name,[]},
{error_info,
{exit,
{os_process_error,"OS process timed out."},
[{gen_server,terminate,6},
{proc_lib,init_p_do_apply,3}]}},
{ancestors,[<0.15648.0>,<0.15647.0>]},
{messages,[{'EXIT',<0.15652.0>,shutdown}]},
{links,[]},
{dictionary,[]},
{trap_exit,true},
{status,running},
{heap_size,2584},
{stack_size,24},
{reductions,27732395236}],
[]]}}
-Sami
On Jun 5, 2012, at 12:23 PM, Dave Cottlehuber wrote:
> On 5 June 2012 11:13, Sami Sierla <[email protected]> wrote:
>> Hi,
>>
>> We have a rather large database (about 90 million documents / 200 GB) running
>> on CouchDB (1.0.3) and we're now updating it to version 1.2.0 due to view
>> compaction problems (large view group compactions never finished).
>>
>> At the moment we are rebuilding (JavaScript) views with 1.2.0, but during
>> this we have stumbled upon a new problem: indexer processes suddenly just
>> disappear. Initially we got "OS Process Timeout" errors in the log, but after
>> adjusting os_process_timeout to 30 seconds, indexing still stops prematurely,
>> now without any log entry.
>>
>> Any ideas what might cause this behavior?
>>
>> CouchDB is running on RHEL 5.8 and is statically linked with SpiderMonkey
>> 1.8.5
>>
>>
>> Regards,
>> Sami Sierla / Poplatek Oy / Finland
>
> Sami,
>
> Have you anything useful in the couch.log file? Are you able to run
> the view generation in debug mode (might not be possible due to disk
> space constraints & performance impact).
>
> Also, if you query the view with ?limit=1&descending=true you'll get
> the last doc that couch successfully processed (I think). Is there
> anything special about that or the subsequent documents? If you
> run the view functions on those docs manually in node or js.exe
> directly [1], does that work?
>
> There are quite a few changes from 1.0.3 to 1.2.0, including better
> detection of ill-formed docs amongst others; more info will help
> narrow this down.
>
> A+
> Dave
>
> [1]: http://wiki.apache.org/couchdb/Troubleshooting#Map.2BAC8-Reduce_debugging
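
As a rough illustration of the suggestion above to run the docs through the view functions by hand, here is a minimal Node sketch. The four map function sources are copied from the log dump. Everything else is an assumption made for illustration: the runMaps helper, the sampleDoc and all its field values, and the stand-in emit(), which only collects rows and does not reproduce what the CouchDB view server does. In practice the sample document would be replaced with the JSON of the document suspected of stalling the indexer.

```javascript
// Map function sources copied from _design/transactionA-1.2.0 in the log dump.
const mapSources = {
  transactionByPaymentInstrument:
    'function(doc) { if (doc.objectType == "ProtocolTransaction" && doc.paymentInstrumentId) { emit([doc.paymentInstrumentId,doc.startTimestamp], null); } }',
  transactionByTerminal:
    'function(doc) { if (doc.objectType == "ProtocolTransaction" && doc.paymentTerminalId) { emit([doc.paymentTerminalId,doc.startTimestamp], null); } }',
  transactionBySession:
    'function(doc) { if (doc.objectType == "ProtocolTransaction" && doc.protocolSessionId) { emit(doc.protocolSessionId,doc.protocolTransactionId); } }',
  transactionByRayId:
    'function(doc) { if (doc.objectType == "ProtocolTransaction" && doc.cId) { emit([-(-doc.cId),doc.startTimestamp], null); } }'
};

// Hypothetical helper: run every map function against one document,
// collecting emitted rows, elapsed time, and any thrown error per view.
function runMaps(doc) {
  const results = {};
  for (const [name, source] of Object.entries(mapSources)) {
    const rows = [];
    // Stand-in for the view server's emit(); it only records the row.
    const emit = (key, value) => rows.push({ key, value });
    // Compile the source with emit in scope.
    const mapFn = new Function('emit', 'return (' + source + ');')(emit);
    const started = Date.now();
    try {
      mapFn(doc);
      results[name] = { rows, ms: Date.now() - started };
    } catch (err) {
      results[name] = { rows, error: String(err), ms: Date.now() - started };
    }
  }
  return results;
}

// Hypothetical sample document; all values are made up for illustration.
const sampleDoc = {
  _id: 'example-1',
  objectType: 'ProtocolTransaction',
  paymentInstrumentId: 'pi-1',
  paymentTerminalId: 'pt-1',
  protocolSessionId: 'ps-1',
  protocolTransactionId: 'tx-1',
  cId: '42',
  startTimestamp: 1338486137000
};

console.log(JSON.stringify(runMaps(sampleDoc), null, 2));
```

A document that throws or takes unusually long here would be a candidate for closer inspection, although a clean run in Node does not prove the document is harmless under SpiderMonkey 1.8.5.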