On Tue, Feb 23, 2016 at 11:50:10PM +0100, Harald Dunkel wrote:
> Hi Lukas,
> On 02/23/16 13:46, Lukas Slebodnik wrote:
> > On (23/02/16 13:01), Harald Dunkel wrote:
> >> On 02/23/2016 11:58 AM, Lukas Slebodnik wrote:
> >>> I would rather focus on different thing. Why is sssd_be process blocked
> >>> for long time?
> >> I have no idea. Was it really blocked?
> > It needn't be blocked itself. But it was busy with some non-blocking
> > operation which main process considered as bad state.
> Do you think this is OK? Did it try to terminate the unresponsive
> sssd_be, or did it just try to start a new one and ran into a
> conflict with the old?
We should have started a new one. Again, I'm speculating, but I /think/
that because the system might have been under load, the sssd_be took too
long to restart and the monitor (sssd itself) gave up on it. Of course,
it's something we should fix, but without a better idea how to
reproduce the error in the first place, I'm not sure how to start to be
> > Would you mind to share sssd log files with high debug level?
> Surely I can increase the log level for sssd. I wonder why
> sssd_be doesn't write its own log file?
Do you have debug_level=N in the [domain] section?
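As a rough sketch, assuming your domain is named example.com (a
placeholder, use whatever name your sssd.conf already has), the section
would look something like this:

    # /etc/sssd/sssd.conf (example.com is an assumed domain name)
    [domain/example.com]
    debug_level = 9

After restarting sssd, the back end should by default log to
/var/log/sssd/sssd_example.com.log. Level 9 is very verbose, so you'll
probably want to lower it again once we have the logs.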
> >> Does it really have to be watched? Wouldn't it be the job of systemd to
> >> restart the service when it dies?
> > sssd also works on non-systemd distributions. We plan to rely on systemd.
> > If you want to speed up the process then patches are always welcome.
> I highly appreciate your effort on providing compatibility with
> sysv init and others, but do you know that ipa-client-install (4.0.5)
> dies without systemd? I cannot tell for more recent ipa versions,
> since they are not available for Debian 8.
> > And moreover, systemd would not solve the main issue. We should try to find
> > out why sssd_be did not respond for a long time.
> Maybe it would help to improve the way the monitor checks for
> unresponsive threads instead? We have no indication that sssd_be had
> any problem, except for sssd trying to start a new one. Since sssd
> couldn't, I would assume that the old sssd_be was still up and running
> and that sssd was the buggy part.
In the future we would like to make heavier use of systemd features; as a
first step we need to socket-activate the parts. Using systemd's watchdog
would also be nice, but we're not there yet.