Hi,

Actually we are not sure if we are hitting the timeout or not since there are 
no log entries anymore after increasing os_process_timeout. It might be that 
some view code is not well guarded though (you can actually see the code from 
server state part of log) - in any case I think there should be some error log 
if the JS code causes indexer failure. Compacting views/dbs seem to be fine 
with 1.2.0. We have no access to curl on production environment so debugging is 
a bit difficult.

In the testing environment we haven't seen this issue but the data is not 
identical and the number of documents is much smaller (about 1/100).

View indexing is periodically re-triggered by our java application so the views 
will get eventually finished after multiple resumes.

Regards,
Sami

On Jun 5, 2012, at 2:20 PM, Robert Newson wrote:

> You can increase that timeout with;
> 
> curl -XPUT localhost:5984/_config/couchdb/os_process_timeout -d '"60000"'
> 
> B.
> 
> On 5 June 2012 12:18, Robert Newson <[email protected]> wrote:
>> Sounds like https://issues.apache.org/jira/browse/COUCHDB-994
>> 
>> B.
>> 
>> On 5 June 2012 11:07, Sami Sierla <[email protected]> wrote:
>>> Dave,
>>> 
>>> Thank You for quick reply. The issues appear in a production environment to 
>>> which I don't have access to modify configuration or design documents. Log 
>>> level at the moment is "error"
>>> 
>>> Below is a lengthy log dump we got when the os_process_timeout was 5000, 
>>> after increasing timeout to 30000 there has been no log entries at all when 
>>> indexing stops.
>>> 
>>> -----
>>> 
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15656.0>] OS Process Error 
>>> <0.15657.0> :: {os_process_error, "OS process timed out."}
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [emulator] Error in process 
>>> <0.15656.0> with exit value: {{nocatch,{os_process_error,"OS process timed 
>>> out."}},[{couch_os_process,prompt,2},{couch_query_servers,map_doc_raw,2},{couch_view_updater,'-do_maps/3-fun-0-',3},{couch_view_updater,do_maps,3}]}
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15648.0>] ** Generic server 
>>> <0.15648.0> terminating
>>> ** Last message in was {'EXIT',<0.15653.0>,
>>>                           {{nocatch,
>>>                                {os_process_error,"OS process timed out."}},
>>>                            [{couch_os_process,prompt,2},
>>>                             {couch_query_servers,map_doc_raw,2},
>>>                             {couch_view_updater,'-do_maps/3-fun-0-',3},
>>>                             {couch_view_updater,do_maps,3}]}}
>>> ** When Server state == {group_state,undefined,<<"mutka_replicated">>,
>>>                         {"/data/mutka/couchdb-index",<<"mutka_replicated">>,
>>>                          {group,
>>>                           
>>> <<223,185,95,248,235,18,77,64,18,164,253,96,95,237,
>>>                             204,20>>,
>>>                           nil,<<"_design/transactionA-1.2.0">>,
>>>                           <<"javascript">>,[],
>>>                           [{view,0,0,0,
>>>                             [<<"transactionByPaymentInstrument">>],
>>>                             <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.paymentInstrumentId) { 
>>> emit([doc.paymentInstrumentId,doc.startTimestamp], null); } }">>,
>>>                             nil,[],[]},
>>>                            {view,1,0,0,
>>>                             [<<"transactionByTerminal">>],
>>>                             <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.paymentTerminalId) { 
>>> emit([doc.paymentTerminalId,doc.startTimestamp], null); } }">>,
>>>                             nil,[],[]},
>>>                            {view,2,0,0,
>>>                             [<<"transactionBySession">>],
>>>                             <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.protocolSessionId) { 
>>> emit(doc.protocolSessionId,doc.protocolTransactionId); } }">>,
>>>                             nil,[],[]},
>>>                            {view,3,0,0,
>>>                             [<<"transactionByRayId">>],
>>>                             <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.cId) { 
>>> emit([-(-doc.cId),doc.startTimestamp], null); } }">>,
>>>                             nil,[],[]}],
>>>                           {[]},
>>>                           nil,0,0,nil,nil}},
>>>                         {group,
>>>                          <<223,185,95,248,235,18,77,64,18,164,253,96,95,237,
>>>                            204,20>>,
>>>                          <0.15650.0>,<<"_design/transactionA-1.2.0">>,
>>>                          <<"javascript">>,[],
>>>                          [{view,0,236439939,0,
>>>                            [<<"transactionByPaymentInstrument">>],
>>>                            <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.paymentInstrumentId) { 
>>> emit([doc.paymentInstrumentId,doc.startTimestamp], null); } }">>,
>>>                            {btree,<0.15650.0>,
>>>                             {47573274456,{8694059,[]},257926106},
>>>                             #Fun<couch_btree.3.71804109>,
>>>                             #Fun<couch_btree.4.115144917>,
>>>                             #Fun<couch_view.less_json_ids.2>,
>>>                             #Fun<couch_view_group.10.26766604>,snappy},
>>>                            [],[]},
>>>                           {view,1,236439939,0,
>>>                            [<<"transactionByTerminal">>],
>>>                            <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.paymentTerminalId) { 
>>> emit([doc.paymentTerminalId,doc.startTimestamp], null); } }">>,
>>>                            {btree,<0.15650.0>,
>>>                             {47574093427,{33638477,[]},942288018},
>>>                             #Fun<couch_btree.3.71804109>,
>>>                             #Fun<couch_btree.4.115144917>,
>>>                             #Fun<couch_view.less_json_ids.2>,
>>>                             #Fun<couch_view_group.10.26766604>,snappy},
>>>                            [],[]},
>>>                           {view,2,236439939,0,
>>>                            [<<"transactionBySession">>],
>>>                            <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.protocolSessionId) { 
>>> emit(doc.protocolSessionId,doc.protocolTransactionId); } }">>,
>>>                            {btree,<0.15650.0>,
>>>                             {47574114746,{9241366,[]},131141244},
>>>                             #Fun<couch_btree.3.71804109>,
>>>                             #Fun<couch_btree.4.115144917>,
>>>                             #Fun<couch_view.less_json_ids.2>,
>>>                             #Fun<couch_view_group.10.26766604>,snappy},
>>>                            [],[]},
>>>                           {view,1,236439939,0,
>>>                            [<<"transactionByTerminal">>],
>>>                            <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.paymentTerminalId) { 
>>> emit([doc.paymentTerminalId,doc.startTimestamp], null); } }">>,
>>>                            {btree,<0.15650.0>,
>>>                             {47574093427,{33638477,[]},942288018},
>>>                             #Fun<couch_btree.3.71804109>,
>>>                             #Fun<couch_btree.4.115144917>,
>>>                             #Fun<couch_view.less_json_ids.2>,
>>>                             #Fun<couch_view_group.10.26766604>,snappy},
>>>                            [],[]},
>>>                           {view,2,236439939,0,
>>>                            [<<"transactionBySession">>],
>>>                            <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.protocolSessionId) { 
>>> emit(doc.protocolSessionId,doc.protocolTransactionId); } }">>,
>>>                            {btree,<0.15650.0>,
>>>                             {47574114746,{9241366,[]},131141244},
>>>                             #Fun<couch_btree.3.71804109>,
>>>                             #Fun<couch_btree.4.115144917>,
>>>                             #Fun<couch_view.less_json_ids.2>,
>>>                             #Fun<couch_view_group.10.26766604>,snappy},
>>>                            [],[]},
>>>                           {view,3,236433956,0,
>>>                            [<<"transactionByRayId">>],
>>>                            <<"function(doc) { if (doc.objectType == 
>>> \"ProtocolTransaction\" && doc.cId) { 
>>> emit([-(-doc.cId),doc.startTimestamp], null); } }">>,
>>>                            {btree,<0.15650.0>,
>>>                             {47559121340,{2250018,[]},76590679},
>>>                             #Fun<couch_btree.3.71804109>,
>>>                             #Fun<couch_btree.4.115144917>,
>>>                             #Fun<couch_view.less_json_ids.2>,
>>>                             #Fun<couch_view_group.10.26766604>,snappy},
>>>                            [],[]}],
>>>                          {[]},
>>>                          {btree,<0.15650.0>,
>>>                           {47572622835,[],1061098089},
>>>                           #Fun<couch_btree.3.71804109>,
>>>                           #Fun<couch_btree.4.115144917>,
>>>                           #Fun<couch_btree.5.93788370>,nil,snappy},
>>>                          236439939,0,nil,nil},
>>>                         <0.15653.0>,nil,false,
>>>                         [{{<0.15441.0>,#Ref<0.0.0.182446>},409571621}],
>>>                         <0.15652.0>,false}
>>> ** Reason for termination ==
>>> ** {os_process_error,"OS process timed out."}
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15648.0>] 
>>> {error_report,<0.31.0>,
>>>                       {<0.15648.0>,crash_report,
>>>                        [[{initial_call,
>>>                           {couch_view_group,init,['Argument__1']}},
>>>                          {pid,<0.15648.0>},
>>>                          {registered_name,[]},
>>>                          {error_info,
>>>                           {exit,
>>>                            {os_process_error,"OS process timed out."},
>>>                            [{gen_server,terminate,6},
>>>                             {proc_lib,init_p_do_apply,3}]}},
>>>                          {ancestors,[<0.15647.0>]},
>>>                          {messages,[]},
>>>                          {links,[<0.15650.0>,<0.123.0>]},
>>>                          {dictionary,[]},
>>>                          {trap_exit,true},
>>>                          {status,running},
>>>                          {heap_size,2584},
>>>                          {stack_size,24},
>>>                          {reductions,18059924}],
>>>                         []]}}
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15441.0>] Uncaught server 
>>> error: {os_process_error, <<"OS process timed out.">>}
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15650.0>] ** Generic server 
>>> <0.15650.0> terminating
>>> ** Last message in was {'EXIT',<0.15648.0>, {os_process_error,"OS process 
>>> timed out."}}
>>> ** When Server state == 
>>> {file,{file_descriptor,prim_file,{#Port<0.2119>,19}}, 47574426987}
>>> ** Reason for termination ==
>>> ** {os_process_error,"OS process timed out."}
>>> [Thu, 31 May 2012 17:42:17 GMT] [error] [<0.15650.0>] 
>>> {error_report,<0.31.0>,
>>>                       {<0.15650.0>,crash_report,
>>>                        [[{initial_call,{couch_file,init,['Argument__1']}},
>>>                          {pid,<0.15650.0>},
>>>                          {registered_name,[]},
>>>                          {error_info,
>>>                           {exit,
>>>                            {os_process_error,"OS process timed out."},
>>>                            [{gen_server,terminate,6},
>>>                             {proc_lib,init_p_do_apply,3}]}},
>>>                          {ancestors,[<0.15648.0>,<0.15647.0>]},
>>>                          {messages,[{'EXIT',<0.15652.0>,shutdown}]},
>>>                          {links,[]},
>>>                          {dictionary,[]},
>>>                          {trap_exit,true},
>>>                          {status,running},
>>>                          {heap_size,2584},
>>>                          {stack_size,24},
>>>                          {reductions,27732395236}],
>>>                         []]}}
>>> 
>>> 
>>> -Sami
>>> On Jun 5, 2012, at 12:23 PM, Dave Cottlehuber wrote:
>>> 
>>>> On 5 June 2012 11:13, Sami Sierla <[email protected]> wrote:
>>>>> Hi,
>>>>> 
>>>>> We have a rather large database (about 90 million documents /200GB) 
>>>>> running on CouchDB (1.0.3) and we're now updating it to version 1.2.0 due 
>>>>> to view compaction problems (large view group compactions never finished).
>>>>> 
>>>>> At the moment we are rebuilding (JavaScript) views with 1.2.0 but during 
>>>>> this we have stumbled upon to new problem : indexer processes suddenly 
>>>>> just disappear. Initially we got "OS Process Timeout" -errors to log but 
>>>>> after adjusting os_process_timeout to 30secs indexing still prematurely 
>>>>> stops but without any log entry.
>>>>> 
>>>>> Any ideas what might cause this behavior?
>>>>> 
>>>>> CouchDB is running on RHEL 5.8 and is statically linked with SpiderMonkey 
>>>>> 1.8.5
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Sami Sierla / Poplatek Oy / Finland
>>>> 
>>>> Sami,
>>>> 
>>>> Have you anything useful in the couch.log file? Are you able to run
>>>> the view generation in debug mode (might not be possible due to disk
>>>> space constraints  & performance impact).
>>>> 
>>>> Also, if you query the view with ?limit=1&descending=true you'll get
>>>> the last doc that couch successfully processed (I think). Is there
>>>> anything special about that or the subsequent documents? If you
>>>> process the view & those docs manually into node or js.exe directly
>>>> [1] does that work?
>>>> 
>>>> There's quite a few changes in 1.0.3 -> 1.2.0 including better
>>>> detection of ill-formed docs amongst others, more info will help
>>>> narrow this down.
>>>> 
>>>> A+
>>>> Dave
>>>> 
>>>> [1]: 
>>>> http://wiki.apache.org/couchdb/Troubleshooting#Map.2BAC8-Reduce_debugging
>>> 

Reply via email to