Just a comment from the peanut gallery... It may still be worth a look to make sure y'all are not leaking file descriptors somewhere. I've been bitten by that particular problem a number of times.
Thanks
Joe

On Sat, Feb 11, 2012 at 2:16 PM, Ali Lown <[email protected]> wrote:

> ulimit -n returned 4096.
>
> So that was probably my problem.
>
> I have opened it up to 'unlimited' now, so it shouldn't be a problem
> anymore...
>
> On 11 February 2012 15:10, Michael MacFadden
> <[email protected]> wrote:
> > One thing to mention is that on most Unix / Linux systems there is
> > usually a system-wide maximum number of file descriptors as well as a
> > per-user / per-shell limit. If you increased your system-level number
> > of file descriptors to 100,000, you may still only have a smaller
> > amount available per user. Setting the system-level number may not
> > affect your user-level limit. For example, if I issue the ulimit -n
> > command on my machine I see 1024. Can you verify this number for the
> > user running WIAB? 100,000 seems like plenty.
> >
> > On Feb 11, 2012, at 3:23 AM, Yuri Z wrote:
> >
> >> I think I had something similar. It is related to ulimit; just make
> >> sure you set it correctly for your Linux user.
> >> On Feb 11, 2012 11:38 AM, "Ali Lown" <[email protected]> wrote:
> >>
> >>> I have now had this error occur a few times, and it results in a
> >>> variety of problems, from an inability to log in (can't open user
> >>> data file) to can't create new waves (can't create a new wavelet
> >>> persistence delta) to can't open waves (can't open user wavelet
> >>> deltas...), as well as the more common failure to persist deltas
> >>> (because it can't write the file).
> >>>
> >>> I have upped the system limit to 100000 in an attempt to prevent it
> >>> occurring again.
> >>>
> >>> Restarting the WIAB server only adds a few more hours before it
> >>> repeats. The only way to clear it for a while is to restart the
> >>> whole server it is running on (clears open inode tables?)
> >>>
> >>> Is the WIAB code failing to close its file handles somewhere? My
> >>> suggestion would be that it doesn't when I SIGKILL the server to
> >>> restart it.
> >>>
> >>> Has anyone else come across this problem yet?
> >>>
> >>> Ali
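
For anyone who wants to compare the limits discussed in this thread with what a process is actually using, here is a rough Python sketch. It is not part of WIAB. It assumes a Linux box with /proc mounted, and the function names (fd_limits, system_file_max, count_open_fds) are made up for illustration. Pointing count_open_fds at another process (e.g. the WIAB server's pid) needs permission to read that process's /proc entry.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) open-file limits of the current process.

    The soft value is the one `ulimit -n` reports in a shell.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def system_file_max():
    """Return the system-wide file handle limit, or None if unavailable.

    Assumes the Linux-specific /proc/sys/fs/file-max file.
    """
    try:
        with open("/proc/sys/fs/file-max") as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def count_open_fds(pid="self"):
    """Count open descriptors of a process by listing /proc/<pid>/fd.

    Linux only. For pid="self" the count includes the descriptor that
    os.listdir opens temporarily to read the directory.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"per-process limit: soft={soft} hard={hard}")
    print(f"system-wide limit: {system_file_max()}")
    print(f"open fds in this process: {count_open_fds()}")
```

If the count for the server process keeps climbing toward the soft limit over a few hours, that points at a descriptor leak rather than a limit that is simply too low.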
