Send inn-workers mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.isc.org/mailman/listinfo/inn-workers
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of inn-workers digest..."
Today's Topics:
1. Re: INN 2.5.4 strange crash (Petr Novopashenniy)
2. Re: INN 2.5.4 strange crash (Julien ?LIE)
----------------------------------------------------------------------
Message: 1
Date: Wed, 4 Feb 2015 17:52:26 +0300 (MSK)
From: Petr Novopashenniy <[email protected]>
To: Julien ?LIE <[email protected]>
Cc: [email protected]
Subject: Re: INN 2.5.4 strange crash
Message-ID: <[email protected]>
Content-Type: TEXT/PLAIN; charset=US-ASCII
On Tue, 3 Feb 2015, Julien ?LIE wrote:
[dd]
J?? > I saw "blocked sleeping" before (for my two channels, innfeed-neva1 and
J?? > innfeed-neva2 at varios times), but not in the time of crash.
J??
J?? They do not seem related then.
J??
May be..
[dd]
J?? When you have such logs, please paste them so as to check the closing is
J?? fine.
J??
I looked more closely my logs, and this is partial information from them:
Feb 4 08:34:33 ns innd: innfeed-neva1!:26:proc:15113 cant write: Resource
temporarily unavailable
Feb 4 08:34:33 ns innd: innfeed-neva1!:26:proc:15113 blocked sleeping 5
[dd]
Feb 4 08:34:34 ns innd: innfeed-neva1!:26:proc:15113 cant write: Resource
temporarily unavailable
Feb 4 08:34:34 ns innd: innfeed-neva1!:26:proc:15113 blocked sleeping 5
[dd]
Feb 4 08:34:39 ns innd: innfeed-neva1!:26:proc:15113 wakeup
Feb 4 08:34:39 ns last message repeated 33 times
[dd]
Feb 4 08:34:39 ns innd: innfeed-neva1!:26:proc:15113 wakeup
Feb 4 08:34:39 ns last message repeated 85 times
$ cat news.notice |grep " wakeup"|wc -l
11103
$
And EAGAIN only twice.
No "sleeping without Waker" messages.
I not see messages with "wakeup" and EAGAIN (temporarily unavailable) in
errlog and news.err, it surprises me.
Continued logs:
We receive control message:
Feb 4 10:32:46 ns innd: innfeed-neva1!:26:proc:15113 wakeup
Feb 4 10:32:46 ns last message repeated 55 times
Feb 4 10:32:46 ns controlchan[15112]: control_newgroup,
alt.binaries.history [email protected] [email protected]
@03006E657773310000000003051400065DD3
@, , , UTF-8, neva-out.feeder.erje.net, doit, 1
Feb 4 10:32:46 ns innd: ctlinnd command
k:alt.binaries.history:y:[email protected]
Feb 4 10:32:47 ns innd: controlchan! closed
Feb 4 10:32:47 ns innd: innfeed-neva1! closed
Feb 4 10:32:47 ns innfeed[15113]: ME source lost . Exiting
[dd]
Feb 4 10:33:08 ns innd: controlchan! exit 0 elapsed 14017 pid 15112
Feb 4 10:33:08 ns innd: controlchan!:26:proc:26571 26 sleeping without
Waker
Feb 4 10:33:08 ns innd: controlchan!:26:proc:26571 closed
Feb 4 10:33:08 ns innd: controlchan! exit 0 elapsed 0 pid 26571
Feb 4 10:33:08 ns innd: controlchan!:99:file closed
Feb 4 10:33:09 ns innd: controlchan! spawned controlchan!:99:proc:26578
Feb 4 10:33:09 ns innd: controlchan! restarted
[dd]
Feb 4 10:33:09 ns innd: goblin1.stu.neva.ru:26 26 sleeping without Waker
Feb 4 10:33:09 ns innd: goblin1.stu.neva.ru:26 checkpoint seconds 0
accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size
0 rejected size 0
Feb 4 10:33:09 ns innd: goblin1.stu.neva.ru:26 closed seconds 0 accepted
0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0
rejected size 0
Feb 4 10:33:09 ns innd: goblin3.stu.neva.ru:26 26 sleeping without Waker
Feb 4 10:33:09 ns innd: goblin3.stu.neva.ru:26 checkpoint seconds 0
accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size
0 rejected size 0
Feb 4 10:33:09 ns innd: goblin3.stu.neva.ru:26 closed seconds 0 accepted
0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0
rejected size 0
[and same for several peers]
Feb 4 10:33:10 ns innd: free:-1 26 free but was in SMASK
And no errors after that.
Looks like problem with my system, I see same fd 26.
In my old logs, I see different fd numbers, but problem is same.
--pety
------------------------------
Message: 2
Date: Wed, 04 Feb 2015 21:07:06 +0100
From: Julien ?LIE <[email protected]>
To: [email protected]
Subject: Re: INN 2.5.4 strange crash
Message-ID: <[email protected]>
Content-Type: text/plain; charset=windows-1252; format=flowed
Hi Petr,
> Feb 4 08:34:33 ns innd: innfeed-neva1!:26:proc:15113 cant write: Resource
> temporarily unavailable
> Feb 4 08:34:33 ns innd: innfeed-neva1!:26:proc:15113 blocked sleeping 5
> [dd]
> Feb 4 08:34:34 ns innd: innfeed-neva1!:26:proc:15113 cant write: Resource
> temporarily unavailable
> Feb 4 08:34:34 ns innd: innfeed-neva1!:26:proc:15113 blocked sleeping 5
Strange that the log is written twice (the second one should have been
written at least 5 seconds later).
> Feb 4 08:34:39 ns innd: innfeed-neva1!:26:proc:15113 wakeup
> Feb 4 08:34:39 ns last message repeated 33 times
> [dd]
> Feb 4 08:34:39 ns innd: innfeed-neva1!:26:proc:15113 wakeup
> Feb 4 08:34:39 ns last message repeated 85 times
A lot of wake up at the same second...
> Feb 4 10:33:08 ns innd: controlchan!:26:proc:26571 26 sleeping without
> Waker
> Feb 4 10:33:08 ns innd: controlchan!:26:proc:26571 closed
> Feb 4 10:33:08 ns innd: controlchan! exit 0 elapsed 0 pid 26571
> Feb 4 10:33:08 ns innd: controlchan!:99:file closed
> Feb 4 10:33:09 ns innd: controlchan! spawned controlchan!:99:proc:26578
> Feb 4 10:33:09 ns innd: controlchan! restarted
That's the behaviour of the patch: it closes the channel if found in an
unexpected state.
Seems fine here.
> Feb 4 10:33:09 ns innd: goblin1.stu.neva.ru:26 26 sleeping without Waker
> Feb 4 10:33:09 ns innd: goblin1.stu.neva.ru:26 checkpoint seconds 0
> accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size
> 0 rejected size 0
> Feb 4 10:33:09 ns innd: goblin1.stu.neva.ru:26 closed seconds 0 accepted
> 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0
> rejected size 0
> Feb 4 10:33:09 ns innd: goblin3.stu.neva.ru:26 26 sleeping without Waker
> Feb 4 10:33:09 ns innd: goblin3.stu.neva.ru:26 checkpoint seconds 0
> accepted 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size
> 0 rejected size 0
> Feb 4 10:33:09 ns innd: goblin3.stu.neva.ru:26 closed seconds 0 accepted
> 0 refused 0 rejected 0 duplicate 0 accepted size 0 duplicate size 0
> rejected size 0
>
> [and same for several peers]
>
> Feb 4 10:33:10 ns innd: free:-1 26 free but was in SMASK
>
> And no errors after that.
> Looks like problem with my system, I see same fd 26.
It should normally be fine as long as goblin1.stu.neva.ru:26 is closed
before goblin3.stu.neva.ru:26, and so on.
What I do not explain is why the sleeping state of these channels is not
removed by the call of SCHANremove(cp); it sounds as though
CHANsleeping(cp) were false when entering into SCHANremove.
Maybe you could change
void
SCHANremove(CHANNEL *cp)
{
if (!CHANsleeping(cp))
return;
to
void
SCHANremove(CHANNEL *cp)
{
if (!CHANsleeping(cp)) {
notice("%s was not sleeping", CHANname(cp));
return;
}
so as to verify that SCHANremove does the right thing when called after
a "sleeping without Waker" warning?
At least fd 26 is freed at 10:33:10, which solves the issue you are
facing. But I also do not explain why it had not been freed sooner...
--
Julien ?LIE
? Ce qui m'int?resse chez une femme, c'est ce que je n'ose pas lui
demander. ? (Wolinski)
------------------------------
_______________________________________________
inn-workers mailing list
[email protected]
https://lists.isc.org/mailman/listinfo/inn-workers
End of inn-workers Digest, Vol 69, Issue 3
******************************************