** Description changed:

  [Impact]
  sssd can switch to an offline mode of operation when it cannot reach
  the authentication or ID backend. It uses several methods to assess the
  situation, and one of them is monitoring the /etc/resolv.conf file for
  changes.
  
  In Ubuntu that file is a symlink to
  /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at
  all times during boot. The symlink is expected to be broken for a while
  during boot.
  
  It turns out that the monitoring sssd was doing on /etc/resolv.conf
  didn't take into account that what could change was the *target* of the
  symlink. It ignored that case entirely and didn't notice when the
  resolv.conf contents actually changed in this scenario, which left sssd
  in offline mode when it shouldn't have been.
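
The dangling-symlink situation described above can be reproduced in a
throwaway directory. This is only an illustrative sketch, not sssd's own
code: the helper name link_state and the temporary paths are made up, and
192.0.2.1 is a documentation-only address. It shows that the link itself
is identical before and after the target appears, so a watcher that only
looks at the link has nothing to notice.

```shell
#!/bin/sh
# Illustration only: every path lives in a temporary directory.
workdir=$(mktemp -d)
link="$workdir/resolv.conf"
target="$workdir/resolv.conf.target"

# Create a symlink whose target does not exist yet.
ln -s "$target" "$link"

# link_state PATH (hypothetical helper): -L tests the link itself,
# -e follows the link and tests its target.
link_state() {
    if [ -L "$1" ] && [ ! -e "$1" ]; then
        echo broken
    elif [ -e "$1" ]; then
        echo ok
    else
        echo missing
    fi
}

before=$(link_state "$link")
link_before=$(readlink "$link")

# "Unbreak" the link by creating the target; the link is not modified.
echo "nameserver 192.0.2.1" > "$target"

after=$(link_state "$link")
link_after=$(readlink "$link")

echo "state before: $before, state after: $after"
echo "link before:  $link_before"
echo "link after:   $link_after"

rm -rf "$workdir"
```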
  
  There are two fixes being pulled in for this SRU:
  a) fix the monitoring of the target of the /etc/resolv.conf symlink
  b) change the fallback polling code to keep trying, instead of giving
  up right away
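
The idea behind fix (b) can be sketched as a retry loop. This is a
minimal shell illustration under stated assumptions, not the actual sssd
change (which is in C): the function name wait_for_file, the retry count
and the one-second interval are all hypothetical.

```shell
#!/bin/sh
# wait_for_file PATH TRIES INTERVAL (hypothetical helper):
# keep polling until PATH exists (symlinks are followed), instead of
# giving up after the first failed check.
wait_for_file() {
    path=$1
    tries=$2
    interval=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        if [ -e "$path" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$interval"
    done
    return 1
}

# Demo: the file shows up about one second after polling starts.
demo_dir=$(mktemp -d)
( sleep 1; echo "nameserver 192.0.2.1" > "$demo_dir/resolv.conf.target" ) &

if wait_for_file "$demo_dir/resolv.conf.target" 10 1; then
    poll_result=found
else
    poll_result=gave-up
fi
wait
echo "poll result: $poll_result"

rm -rf "$demo_dir"
```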
  
  [Test Case]
  It's recommended to test this in an LXD container or a VM.
  
  Preparation steps:
  $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq
  
  Become root:
  $ sudo su -
  
  Detect your ip:
  # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,')
  # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1)
  
  Confirm the $ip variable is correct for your case:
  # echo $ip
  
  Create /etc/dnsmasq.d/sssd-test.conf using your real ip:
  # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF
  host-record=ldap01.example.com,$ip
  listen-address=$ip
  EOF
  
  Restart dnsmasq:
  # systemctl restart dnsmasq
  
  a) inotify test
  Create /etc/sssd/sssd.conf:
  # cat > /etc/sssd/sssd.conf <<EOF
  [sssd]
  config_file_version = 2
  services = nss, pam, ifp
  domains = LDAP
- #debug_level = 6
+ debug_level = 6
  
  [domain/LDAP]
  id_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  EOF
  
  # chmod 0600 /etc/sssd/sssd.conf
  
  # rm /etc/resolv.conf
  # ln -s /etc/resolv.conf.target /etc/resolv.conf
  
  Create a good resolv.conf:
  # echo "nameserver $ip" > /etc/resolv.conf.good
  
  Confirm /etc/resolv.conf is a broken symlink:
  # ll /etc/resolv.conf*
  lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target
  -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good
+ 
+ Open another terminal/screen and tail the sssd logs with a grep:
+ # tail -f /var/log/sssd/sssd.log | grep resolv
  
  Start sssd:
  # systemctl restart sssd
  
  Repeat the sssctl call until it shows the offline mode persistently:
  # sssctl domain-status LDAP
  Online status: Offline
  
  Active servers:
  LDAP: not connected
  
  Discovered LDAP servers:
  - ldap01.example.com
  
  "Unbreak" the symlink:
  # cp /etc/resolv.conf.good /etc/resolv.conf.target
  
  Run sssctl again; it should switch to online almost immediately:
  # sssctl domain-status LDAP
  Online status: Online
  
  Active servers:
  LDAP: ldap01.example.com
  
  Discovered LDAP servers:
  - ldap01.example.com
  
  b) polling test
  Repeat the previous test, but with "try_inotify = false" in sssd.conf,
  like this:
  # cat > /etc/sssd/sssd.conf <<EOF
  [sssd]
  config_file_version = 2
  services = nss, pam, ifp
  domains = LDAP
  #debug_level = 6
  try_inotify = false
  
  [domain/LDAP]
  id_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  EOF
  
  With the fixed packages, sssctl should show the domain as online within
  a few seconds (5s at most) of unbreaking the symbolic link.
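
Instead of re-running sssctl by hand in either test, the check can be
wrapped in a small polling helper. This is a sketch only: the function
name wait_online and its timeout argument are hypothetical, and it
assumes the "Online status: Online" line appears exactly as in the
sssctl output shown in test (a).

```shell
#!/bin/sh
# wait_online DOMAIN TIMEOUT (hypothetical helper):
# run "sssctl domain-status DOMAIN" once per second until it reports
# the domain as online, or until TIMEOUT seconds have passed.
wait_online() {
    domain=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if sssctl domain-status "$domain" 2>/dev/null |
                grep -q 'Online status: Online'; then
            echo "online after ${elapsed}s"
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    echo "still offline after ${timeout}s"
    return 1
}
```

For example, running "wait_online LDAP 10" as root right after the cp
command that unbreaks the symlink should report the domain as online
within the 5s mentioned above when the fixed packages are installed.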
  
  [Regression Potential]
  TBD
  
  [Other Info]
  Not at this time.
  
  [Original Description]
  
  SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were
  also affected) is offline on boot and seems to stay offline forever (I
  waited over 20 minutes).
  
  sssd_nss.log:
  (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  ...
  
  SSSD immediately returns to normal operation after restarting it or
  after sending SIGUSR2.
  
  A workaround for the problem is creating the file
  /etc/systemd/system/sssd.service.d/override.conf with the contents:
  [Unit]
  Requires=network-online.target
  After=network-online.target
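
The workaround above can be scripted roughly as follows. This is a
sketch, not part of the original report: the function name
install_override and its root-prefix argument are hypothetical, added
so the drop-in can be written into a scratch directory for inspection
before touching the real system.

```shell
#!/bin/sh
# install_override ROOT (hypothetical helper):
# write the drop-in below ROOT; pass "" to target the real system.
install_override() {
    dropin_dir="$1/etc/systemd/system/sssd.service.d"
    mkdir -p "$dropin_dir"
    cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
Requires=network-online.target
After=network-online.target
EOF
}

# Dry run into a scratch directory and show the result.
scratch=$(mktemp -d)
install_override "$scratch"
cat "$scratch/etc/systemd/system/sssd.service.d/override.conf"
```

On a real system this would be run as root with install_override ""
followed by "systemctl daemon-reload", so that systemd picks up the new
drop-in before sssd is next started.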

** Description changed:

  [Impact]
  sssd can switch to an offline mode of operation when it cannot reach
  the authentication or ID backend. It uses several methods to assess the
  situation, and one of them is monitoring the /etc/resolv.conf file for
  changes.
  
  In Ubuntu that file is a symlink to
  /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at
  all times during boot. The symlink is expected to be broken for a while
  during boot.
  
  It turns out that the monitoring sssd was doing on /etc/resolv.conf
  didn't take into account that what could change was the *target* of the
  symlink. It ignored that case entirely and didn't notice when the
  resolv.conf contents actually changed in this scenario, which left sssd
  in offline mode when it shouldn't have been.
  
  There are two fixes being pulled in for this SRU:
  a) fix the monitoring of the target of the /etc/resolv.conf symlink
  b) change the fallback polling code to keep trying, instead of giving
  up right away
  
  [Test Case]
  It's recommended to test this in an LXD container or a VM.
  
  Preparation steps:
  $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq
  
  Become root:
  $ sudo su -
  
  Detect your ip:
  # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,')
  # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1)
  
  Confirm the $ip variable is correct for your case:
  # echo $ip
  
  Create /etc/dnsmasq.d/sssd-test.conf using your real ip:
  # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF
  host-record=ldap01.example.com,$ip
  listen-address=$ip
  EOF
  
  Restart dnsmasq:
  # systemctl restart dnsmasq
  
  a) inotify test
  Create /etc/sssd/sssd.conf:
  # cat > /etc/sssd/sssd.conf <<EOF
  [sssd]
  config_file_version = 2
  services = nss, pam, ifp
  domains = LDAP
  debug_level = 6
  
  [domain/LDAP]
  id_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  EOF
  
  # chmod 0600 /etc/sssd/sssd.conf
  
  # rm /etc/resolv.conf
  # ln -s /etc/resolv.conf.target /etc/resolv.conf
  
  Create a good resolv.conf:
  # echo "nameserver $ip" > /etc/resolv.conf.good
  
  Confirm /etc/resolv.conf is a broken symlink:
  # ll /etc/resolv.conf*
  lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target
  -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good
  
  Open another terminal/screen and tail the sssd logs with a grep:
  # tail -f /var/log/sssd/sssd.log | grep resolv
  
  Start sssd:
  # systemctl restart sssd
  
  Repeat the sssctl call until it shows the offline mode persistently:
  # sssctl domain-status LDAP
  Online status: Offline
  
  Active servers:
  LDAP: not connected
  
  Discovered LDAP servers:
  - ldap01.example.com
  
  "Unbreak" the symlink:
  # cp /etc/resolv.conf.good /etc/resolv.conf.target
  
  Run sssctl again; it should switch to online almost immediately:
  # sssctl domain-status LDAP
  Online status: Online
  
  Active servers:
  LDAP: ldap01.example.com
  
  Discovered LDAP servers:
  - ldap01.example.com
  
+ For the above steps, the log file being tailed will show this for the startup of sssd:
+ (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0
+ 
+ And this for when the symlink is fixed:
+ (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc
+ 
+ 
  b) polling test
  Repeat the previous test, but with "try_inotify = false" in sssd.conf,
  like this:
  # cat > /etc/sssd/sssd.conf <<EOF
  [sssd]
  config_file_version = 2
  services = nss, pam, ifp
  domains = LDAP
  #debug_level = 6
  try_inotify = false
  
  [domain/LDAP]
  id_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  EOF
  
  With the fixed packages, sssctl should show the domain as online within
  a few seconds (5s at most) of unbreaking the symbolic link.
  
  [Regression Potential]
  TBD
  
  [Other Info]
  Not at this time.
  
  [Original Description]
  
  SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were
  also affected) is offline on boot and seems to stay offline forever (I
  waited over 20 minutes).
  
  sssd_nss.log:
  (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  ...
  
  SSSD immediately returns to normal operation after restarting it or
  after sending SIGUSR2.
  
  A workaround for the problem is creating the file
  /etc/systemd/system/sssd.service.d/override.conf with the contents:
  [Unit]
  Requires=network-online.target
  After=network-online.target

** Description changed:

  [Impact]
  sssd can switch to an offline mode of operation when it cannot reach
  the authentication or ID backend. It uses several methods to assess the
  situation, and one of them is monitoring the /etc/resolv.conf file for
  changes.
  
  In Ubuntu that file is a symlink to
  /run/systemd/resolve/stub-resolv.conf, but the target doesn't exist at
  all times during boot. The symlink is expected to be broken for a while
  during boot.
  
  It turns out that the monitoring sssd was doing on /etc/resolv.conf
  didn't take into account that what could change was the *target* of the
  symlink. It ignored that case entirely and didn't notice when the
  resolv.conf contents actually changed in this scenario, which left sssd
  in offline mode when it shouldn't have been.
  
  There are two fixes being pulled in for this SRU:
  a) fix the monitoring of the target of the /etc/resolv.conf symlink
  b) change the fallback polling code to keep trying, instead of giving
  up right away
  
  [Test Case]
  It's recommended to test this in an LXD container or a VM.
  
  Preparation steps:
+ $ sudo apt update && sudo apt dist-upgrade
  $ sudo apt install sssd-ldap sssd-tools sssd-dbus slapd dnsmasq
  
  Become root:
  $ sudo su -
  
  Detect your ip:
  # export interface=$(ip route | grep default | sed -r 's,^default via .* dev ([a-z0-9]+) .*,\1,')
  # export ip=$(ip addr show dev $interface | grep "inet [0-9]" | awk '{print $2}' | cut -d / -f 1)
  
  Confirm the $ip variable is correct for your case:
  # echo $ip
  
  Create /etc/dnsmasq.d/sssd-test.conf using your real ip:
  # cat > /etc/dnsmasq.d/sssd-test.conf <<EOF
  host-record=ldap01.example.com,$ip
  listen-address=$ip
  EOF
  
  Restart dnsmasq:
  # systemctl restart dnsmasq
  
  a) inotify test
  Create /etc/sssd/sssd.conf:
  # cat > /etc/sssd/sssd.conf <<EOF
  [sssd]
  config_file_version = 2
  services = nss, pam, ifp
  domains = LDAP
  debug_level = 6
  
  [domain/LDAP]
  id_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  EOF
  
  # chmod 0600 /etc/sssd/sssd.conf
  
  # rm /etc/resolv.conf
  # ln -s /etc/resolv.conf.target /etc/resolv.conf
  
  Create a good resolv.conf:
  # echo "nameserver $ip" > /etc/resolv.conf.good
  
  Confirm /etc/resolv.conf is a broken symlink:
  # ll /etc/resolv.conf*
  lrwxrwxrwx 1 root root 23 May 13 20:48 /etc/resolv.conf -> /etc/resolv.conf.target
  -rw-r--r-- 1 root root 24 May 13 20:48 /etc/resolv.conf.good
  
  Open another terminal/screen and tail the sssd logs with a grep:
  # tail -f /var/log/sssd/sssd.log | grep resolv
  
  Start sssd:
  # systemctl restart sssd
  
  Repeat the sssctl call until it shows the offline mode persistently:
  # sssctl domain-status LDAP
  Online status: Offline
  
  Active servers:
  LDAP: not connected
  
  Discovered LDAP servers:
  - ldap01.example.com
  
  "Unbreak" the symlink:
  # cp /etc/resolv.conf.good /etc/resolv.conf.target
  
  Run sssctl again; it should switch to online almost immediately:
  # sssctl domain-status LDAP
  Online status: Online
  
  Active servers:
  LDAP: ldap01.example.com
  
  Discovered LDAP servers:
  - ldap01.example.com
  
  For the above steps, the log file being tailed will show this for the startup of sssd:
  (Mon May 18 17:17:06 2020) [sssd] [_snotify_create] (0x0400): Added a watch for /etc/resolv.conf.target with inotify flags 0x8D88 internal flags 0x1 using function resolv_conf_inotify_cb after delay 1.0
  
  And this for when the symlink is fixed:
  (Mon May 18 17:18:06 2020) [sssd] [process_dir_event] (0x0400): received notification for watched file [resolv.conf.target] under /etc
  
- 
  b) polling test
  Repeat the previous test, but with "try_inotify = false" in sssd.conf,
  like this:
  # cat > /etc/sssd/sssd.conf <<EOF
  [sssd]
  config_file_version = 2
  services = nss, pam, ifp
  domains = LDAP
  #debug_level = 6
  try_inotify = false
  
  [domain/LDAP]
  id_provider = ldap
  ldap_uri = ldap://ldap01.example.com
  cache_credentials = True
  ldap_search_base = dc=example,dc=com
  EOF
  
  With the fixed packages, sssctl should show the domain as online within
  a few seconds (5s at most) of unbreaking the symbolic link.
  
  [Regression Potential]
  TBD
  
  [Other Info]
  Not at this time.
  
  [Original Description]
  
  SSSD 1.15.3-2ubuntu1 on 17.10/artful (previous versions on artful were
  also affected) is offline on boot and seems to stay offline forever (I
  waited over 20 minutes).
  
  sssd_nss.log:
  (Fri Oct 13 09:49:50 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  (Fri Oct 13 09:49:51 2017) [sssd[nss]] [sss_dp_get_reply] (0x0010): The Data Provider returned an error [org.freedesktop.sssd.Error.DataProvider.Offline]
  ...
  
  SSSD immediately returns to normal operation after restarting it or
  after sending SIGUSR2.
  
  A workaround for the problem is creating the file
  /etc/systemd/system/sssd.service.d/override.conf with the contents:
  [Unit]
  Requires=network-online.target
  After=network-online.target

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1723350

Title:
  sssd offline on boot, stays offline forever

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/sssd/+bug/1723350/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs