Are you running with delayed_commits=true or false?
B.
On 7 Aug 2012, at 18:27, stephen bartell wrote:
>
>> Hi Stephen,
>>
>> Can you tell us any more about the context, or did you start seeing these in
>> the logs?
>
> Sure, here's some context. This couch is part of a demo server. It travels
> a lot and is cycled a lot. There is one physical server; it consists of
> nginx (serving web apps and reverse proxying for couch), couchdb for
> persistence, and numerous programs which read and write to couch. Traffic on
> couch can get very heavy.
>
> I didn't first see this in the logs. Some of the web apps would grind to a
> halt, nginx would return 404, and then eventually couch would restart. This
> would happen every couple of minutes.
>
>> By chance do you have a scenario that reproduces this? Was this db compacted
>> or replicated from elsewhere?
>
> I wish I had a pliable scenario other than sending the server through taxi
> cabs, airlines, and pulling the power cord several times a day. We haven't
> seen this on any of our production servers.
> This server was not subject to any replication. Most databases on it are
> compacted often.
>
> Last night we were able to drill down to one particular program which was
> triggering the crash. One by one, we backed up, deleted, and rebuilt the
> databases that program touched. There was one database which seemed to be
> the culprit, let's call it History. History is a dumping ground for stale
> docs from another db. History is almost always written to, and rarely read
> from. We don't compact History since all docs in it are one revision deep.
> We never replicate to or from it. The only reason we deem History the
> culprit is because after rebuilding it, there hasn't been a crash for over 12
> hours.
>
> I have an additional question. Is it possible to turn couch logging off
> entirely, or would redirecting to /dev/null suffice? When couch would crash,
> hundreds of MB of crap would get dumped to the log. (
> {{badmatch,{ok,<<32,50,48,48,10 … 'hundreds of MB of crap' … ,0,3,232>>}}).
> Right when this dump occurred, the CPU spiked and the server began its
> downward descent.
>
> Best
>
>>
>> Thanks,
>>
>> Bob
>> On Aug 7, 2012, at 2:06 AM, stephen bartell wrote:
>>
>>> Hi all, could someone help shed some light on this crash I'm having? I'm
>>> on v1.2, Ubuntu 11.04.
>>>
>>> [Mon, 06 Aug 2012 18:29:16 GMT] [error] [<0.492.0>] ** Generic server
>>> <0.492.0> terminating
>>> ** Last message in was {pread_iolist,88385709}
>>> ** When Server state ==
>>> {file,{file_descriptor,prim_file,{#Port<0.2899>,79}},
>>> 93302896}
>>> ** Reason for termination ==
>>> ** {{badmatch,{ok,<<32,50,48,48,10 … huge dump … ,0,3,232>>}},
>>> [{couch_file,read_raw_iolist_int,3},
>>> {couch_file,maybe_read_more_iolist,4},
>>> {couch_file,handle_call,3},
>>> {gen_server,handle_msg,5},
>>> {proc_lib,init_p_do_apply,3}]}
>>>
>>> I'm not too familiar with Erlang, but what I gathered from the src was that
>>> the `pread_iolist` function is used when reading anything from the disk. So I
>>> think this might be a corrupt db problem.
>>>
>>> Thanks,
>>> Stephen Bartell
>>
>