On Mon, Nov 25, 2013 at 09:23:22AM +1000, Matt Bryant wrote:
> All,
> 
> Was wondering if anyone can help out or point us the in right
> direction. Ever since we updated from IPA v2.1 to IPA v3.0 have been
> seeing some intermittent errors when trying to change passwords etc.
> Getting the error cannot change password since system offline.
> 
> Have increased the logging and found that quite frequently the
> sasl_bind using the host principle is timing out and failing ...
> (whether this is red herring or not dont know but cant be good
> anyhow)
> 
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_kinit_send] (0x0400): Attempting kinit (default,
> host/tardis.ipa.server-noc.com, SERVER-IPA, 86400)
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_kinit_next_kdc] (0x1000): Resolving next KDC for service IPA
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [fo_resolve_service_send] (0x0100): Trying to resolve service 'IPA'
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [get_server_status] (0x1000): Status of server
> 'tardis.ipa.server-noc.com' is 'working'
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [fo_resolve_service_activate_timeout] (0x2000): Resolve timeout set
> to 10 seconds
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [get_server_status] (0x1000): Status of server
> 'tardis.ipa.server-noc.com' is 'working'
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [be_resolve_server_process] (0x1000): Saving the first resolved
> server
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [be_resolve_server_process] (0x0200): Found address for server
> tardis.ipa.server-noc.com: [203.147.190.30] TTL 7200
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_kinit_kdc_resolved] (0x1000): KDC resolved, attempting to get
> TGT...
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [create_tgt_req_send_buffer] (0x1000): buffer size: 56
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [child_handler_setup] (0x2000): Setting up signal handler up for pid
> [3734]
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [child_handler_setup] (0x2000): Signal handler set up for pid [3734]
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [set_tgt_child_timeout] (0x0400): Setting 6 seconds timeout for tgt
> child
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_process_result] (0x2000): Trace: sh[0xa45960], connected[1],
> ops[(nil)], ldap[0xa12200]
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_process_result] (0x2000): Trace: ldap_result found nothing!
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [write_pipe_handler] (0x0400): All data has been sent!
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [child_sig_handler] (0x1000): Waiting for child [3734].
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [child_sig_handler] (0x0100): child [3734] finished successfully.
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sss_child_handler] (0x2000): waitpid failed [10]: No child
> processes
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [read_pipe_handler] (0x0400): EOF received, client finished
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_get_tgt_recv] (0x0400): Child responded: 0
> [FILE:/var/lib/sss/db/ccache_SERVER-IPA], expired on [1385420045]
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_cli_auth_step] (0x0100): expire timeout is 900
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_cli_auth_step] (0x1000): the connection will expire at
> 1385334545
> (Mon Nov 25 08:54:05 2013) [sssd[be[ipa.server-noc.com]]]
> [sasl_bind_send] (0x0100): Executing sasl bind mech: GSSAPI, user:
> host/tardis.ipa.server-noc.com
> (Mon Nov 25 08:54:11 2013) [sssd[be[ipa.server-noc.com]]]
> [sasl_bind_send] (0x0020): ldap_sasl_bind failed (-5)[Timed out]
> (Mon Nov 25 08:54:11 2013) [sssd[be[ipa.server-noc.com]]]
> [sasl_bind_send] (0x0080): Extended failure message: [unknown error]
> (Mon Nov 25 08:54:11 2013) [sssd[be[ipa.server-noc.com]]]
> [fo_set_port_status] (0x0100): Marking port 0 of server
> 'tardis.ipa.server-noc.com' as 'not working'
> (Mon Nov 25 08:54:11 2013) [sssd[be[ipa.server-noc.com]]]
> [sdap_handle_release] (0x2000): Trace: sh[0xa45960], connected[1],
> ops[(nil)], ldap[0xa12200], destructor_lock[0], release_memory[0]
> ..
> ..
> 
> it then goes on to connect to fail over server and succeed and
> shortly after the master server is tested and marked as working
> again ... works for a couple of times then time out again .. any
> pointers gratefully received.

According to the logs I would say that this timeout happens on the
LDAP server side. Do you have a chance to check the server logs to see
what happens at this time on the server?

You can increase the timeout value on the client by setting
ldap_opt_timeout to a larger value then 6 (the default). But please note
that the operation might take longer then letting SSSD fail over to the
next server.

HTH

bye,
Sumit

> 
> 
> packages used ...
> 
> ipa-server-3.0.0-25.el6.x86_64
> sssd-1.9.2-82.11.el6_4.x86_64
> 
> rgds
> 
> Matt Bryant
> 
> _______________________________________________
> Freeipa-users mailing list
> Freeipa-users@redhat.com
> https://www.redhat.com/mailman/listinfo/freeipa-users

_______________________________________________
Freeipa-users mailing list
Freeipa-users@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-users

Reply via email to