On Tue, Jan 06, 2015 at 05:06:39PM -0700, Orion Poplawski wrote:
> We're having some trouble with sssd on centos 7 under load on a VPS. 389ds
> ldap server for id/auth. Part may be an issue with the VPS, but I'm trying
> to track down all possible issues.
>
> Also, we realized that we were running in a bit of a bad state - the primary
> ldap server was not available, but the backup was.
>
> Some logs:
>
> General question, is this bad?:
> (Tue Jan 6 23:17:43 2015) [sssd[be[default]]] [sdap_get_users_done]
> (0x0040): Failed to retrieve users

This can be caused by many things; we need more context. In general, the
LDAP search has failed and SSSD would fall back to cached entries.

>
> I see that fairly frequently.
>
> Trouble:
> (Tue Jan 6 22:30:31 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:30:31 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:30:36 2015) [sssd[be[default]]] [resolv_gethostbyname_done]
> (0x0040): querying hosts database failed [5]: Input/output error
> (Tue Jan 6 22:30:36 2015) [sssd[be[default]]] [fo_resolve_service_done]
> (0x0020): Failed to resolve server 'server.com': Timeout while contacting
> DNS servers
> (Tue Jan 6 22:30:36 2015) [sssd[be[default]]] [be_resolve_server_process]
> (0x0080): Couldn't resolve server (server.com), resolver returned (5)

This seems like the core issue. Can you resolve server.com from outside
SSSD (with dig, maybe)?

> (Tue Jan 6 22:30:45 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:30:45 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:30:45 2015) [sssd[be[default]]] [fo_resolve_service_send]
> (0x0020): No available servers for service 'LDAP'
> (Tue Jan 6 22:30:45 2015) [sssd[be[default]]] [sdap_id_op_connect_done]
> (0x0020): Failed to connect, going offline (5 [Input/output error])
> (Tue Jan 6 22:30:45 2015) [sssd[be[default]]] [be_run_offline_cb] (0x0080):
> Going offline. Running callbacks.

Here SSSD goes offline and the front end would switch to using the cache.

> (Tue Jan 6 22:31:52 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:31:52 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:31:52 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:31:52 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:32:00 2015) [sssd[be[default]]] [fo_resolve_service_send]
> (0x0020): No available servers for service 'LDAP'
> (Tue Jan 6 22:32:00 2015) [sssd[be[default]]] [sdap_id_op_connect_done]
> (0x0020): Failed to connect, going offline (5 [Input/output error])
> (Tue Jan 6 22:32:00 2015) [sssd[be[default]]] [be_run_offline_cb] (0x0080):
> Going offline. Running callbacks.
> (Tue Jan 6 22:33:07 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:33:07 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:33:07 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:33:07 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:33:08 2015) [sssd[be[default]]] [get_single_value_as_string]
> (0x0080): More than one value found.
> (Tue Jan 6 22:33:08 2015) [sssd[be[default]]]
> [sdap_set_config_options_with_rootdse] (0x0020): get_naming_context failed.
> (Tue Jan 6 22:33:14 2015) [sssd[be[default]]] [get_single_value_as_string]
> (0x0080): More than one value found.

This is weird as well: here SSSD is complaining that the
defaultNamingContext attribute of the rootDSE contains multiple values. But
I don't see SSSD grabbing the rootDSE anywhere at all. What log level did
you use? You can read the rootDSE manually using:

    ldapsearch -x -H ldap://server.com -s base -b "" defaultNamingContext

> (Tue Jan 6 22:33:14 2015) [sssd[be[default]]]
> [sdap_set_config_options_with_rootdse] (0x0020): get_naming_context failed.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [be_resolve_server_process]
> (0x0040): The fail over cycled through all available servers
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [be_run_offline_cb] (0x0080):
> Going offline. Running callbacks.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [be_resolve_server_process]
> (0x0040): The fail over cycled through all available servers
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [be_run_offline_cb] (0x0080):
> Going offline. Running callbacks.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [be_resolve_server_process]
> (0x0040): The fail over cycled through all available servers
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [be_run_offline_cb] (0x0080):
> Going offline. Running callbacks.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:34:06 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:34:16 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:34:16 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
> (Tue Jan 6 22:34:16 2015) [sssd[be[default]]]
> [sss_ldap_init_sys_connect_done] (0x0020): sdap_async_sys_connect request
> failed.
> (Tue Jan 6 22:34:16 2015) [sssd[be[default]]] [sdap_sys_connect_done]
> (0x0020): sdap_async_connect_call request failed.
>
> I don't know why it wasn't able to reconnect to the backup, or perhaps it did,
> but just not logged.
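
To narrow down whether the resolver or the network is at fault, it may help to test name resolution and a plain TCP connect to each LDAP server from the client, outside SSSD. Below is a rough sketch only: check_server is a made-up helper name, port 389 assumes plain LDAP (use whatever port your ldap_uri actually points at), and a successful TCP connect says nothing about whether a bind or search would succeed.

```python
import socket


def check_server(host, port=389, timeout=5.0):
    """Resolve host, then try a plain TCP connect to each resolved address.

    Returns (addresses, status), where status is "connected" or a short
    description of the first stage that failed. Hypothetical helper, not
    part of SSSD.
    """
    try:
        # Uses the system resolver (nsswitch, /etc/hosts, DNS).
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return [], f"DNS lookup failed: {exc}"

    addresses = sorted({info[4][0] for info in infos})
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return addresses, "connected"
        except OSError as exc:
            # Remember the error and try the next address, if any.
            last_error = exc
    return addresses, f"connect failed: {last_error}"
```

Calling check_server("server.com") and then the same for the backup server's name, ideally while the machine is under load, should show whether the lookup itself times out or whether only the connection fails.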
