On Nov 1, 2013, at 12:10 AM, Dave Cottlehuber <[email protected]> wrote:

>> On Oct 31, 2013, at 5:13 PM, Nathan Vander Wilt wrote:
>> 
>> Aaaand my Couch committed suicide again today. Unless this is  
>> something different, I may have finally gotten lucky and had  
>> CouchDB leave a note [eerily unfinished!] in the logs this time:  
>> https://gist.github.com/natevw/fd509978516499ba128b  
>> 
>> ```
>> ** Reason == {badarg,
>> [{io,put_chars,
>> [<0.93.0>,unicode,
>> <<"[Thu, 31 Oct 2013 19:48:48 GMT] [info] [<0.31789.2>] 66.249.66.216  
>> - - GET 
>> /public/_design/glob/_list/posts/by_path?key=%5B%222012%22%2C%2203%22%2C%22metakaolin_geojson_editor%22%5D&include_docs=true&path1=2012&path2=03&path3=metakaolin_geojson_editor
>>   
>> 200\n">>],
>> []},
>> ```
>> 
>> So…now what? I have a rebuilt version of CouchDB I'm going to try  
>> [once I figure out why *it* isn't starting] but this is still really  
>> upsetting — I'm aware I could add my own cronjob or something to  
>> check and restart if needed every minute, but a) the shell script  
>> is SUPPOSED to be keeping CouchDB and b) it's NOT and c) this is  
>> embarrassing and aggravating.
>> 
>> thanks,
>> -natevw
> 
> So there’s 2 things here
> 
> - why the couch doesn’t get restarted?
> 
> Sounds very much like the aforementioned pid race condition. Wendall, do you 
> know any more about this? I thought you had some ideas about it IIRC.
> 


I think I figured out the answer to this one, at least for the latest crash. The 
Erlang process the shell script watches was still running, just not accepting 
connections. I didn't notice this the previous times, though…I only realized it 
this time because when I went to restart, the shell script acted like it was 
already running. So maybe there are actually two crashes: a silent heartbeat 
one and this unicode one?
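
A watchdog that only looks at the pid can't see this failure mode, since the process is alive but deaf. Here is a minimal sketch of a check that probes HTTP instead. It assumes CouchDB is listening on the default http://127.0.0.1:5984/ and answers `GET /` with its usual welcome JSON (an object with a "couchdb" key); the function name and timeout are purely illustrative, not anything CouchDB ships:

```python
import http.client
import json
import urllib.request


def couch_is_responsive(url="http://127.0.0.1:5984/", timeout=5.0):
    """Return True only if the server answers HTTP with CouchDB's welcome JSON.

    A live pid is not enough: the request must connect, complete within
    `timeout` seconds, and return a JSON object containing a "couchdb" key.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections, timeouts and HTTP error statuses
        # (URLError/HTTPError); ValueError covers a non-JSON or undecodable
        # body; HTTPException covers a malformed or truncated response.
        return False
    return isinstance(body, dict) and "couchdb" in body
```

A cron job or wrapper script could call `couch_is_responsive()` every minute and restart CouchDB when it returns False, which would catch the "running but not accepting connections" case as well as an outright exit.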



> - why io:put_chars/2 has trouble writing to a boring log file, which obviously 
> works most of the time.
> 
> <0.93.0>,unicode, <<"[Thu, 31 Oct 2013 19:48:48 GMT...">>
> 
> io:put_chars(Fd, unicode, <<Binary>>) doesn’t look right — there’s no 
> io:put_chars/3. 
> 
> This unicode looks weird and from a quick look I can’t see where it should 
> come from.
> 
> Can you get more of the logfile (like hundreds of lines) and stick it 
> somewhere? email is fine.
> 
> I’d like to see what happens to <0.93.0> (the process wrapping the log fd), 
> and also if the unicode atom turns up anywhere else prior.


You want more of the log *up to* the crash? Because I have nothing *beyond* 
what is in that gist, that's the thing! The end of the log was cut off, I did 
not snip it. The log as it sits now has these exact lines in it:

```
                             {line,173}]},
                           {gen_event,ser
Apache CouchDB 1.4.0 (LogLevel=info) is starting.
```

(The subsequent "starting" is due to my intervention.)

-nvw


