Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

Petr Spacek Wed, 24 Aug 2016 00:13:52 -0700

On 23.8.2016 18:44, Rakesh Rajasekharan wrote:
> I think there's something seriously wrong with my system.
> 
> I am not able to run any IPA commands.
> 
> klist
> Ticket cache: KEYRING:persistent:0:0
> Default principal: [email protected]
> 
> Valid starting       Expires              Service principal
> 2016-08-23T16:26:36  2016-08-24T16:26:22  krbtgt/[email protected]
> 
> 
> [root@prod-ipa-master-1a :~] ipactl status
> Directory Service: RUNNING
> krb5kdc Service: RUNNING
> kadmin Service: RUNNING
> ipa_memcached Service: RUNNING
> httpd Service: RUNNING
> pki-tomcatd Service: RUNNING
> ipa-otpd Service: RUNNING
> ipa: INFO: The ipactl command was successful
> 
> 
> 
> [root@prod-ipa-master :~] ipa user-find p-testuser
> ipa: ERROR: Kerberos error: ('Unspecified GSS failure.  Minor code may
> provide more information', 851968)/("Cannot contact any KDC for realm '
> XYZ.COM'", -1765328228)
>


This is weird because the server seems to be up.

Please follow
http://www.freeipa.org/page/Troubleshooting#Authentication.2FKerberos
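
As a rough first check before going through that page, one could test whether
the KDC port accepts TCP connections at all. This is only a sketch:
kdc_reachable is a made-up helper name, it assumes bash and the coreutils
timeout command are available, and it tests TCP only (Kerberos clients may
also use UDP on port 88).

```shell
# Hypothetical helper: exit 0 if a TCP connection to HOST:PORT can be opened
# within WAIT seconds, non-zero otherwise. Port 88 is the standard Kerberos port.
kdc_reachable() {
    host=$1
    port=${2:-88}
    wait_s=${3:-3}
    # bash's /dev/tcp pseudo-device opens a TCP connection;
    # timeout(1) bounds how long we wait for it.
    timeout "$wait_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

For example, with ipa.example.com standing in for the real server name:
kdc_reachable ipa.example.com && echo "port 88 answers" || echo "port 88 unreachable"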

Petr^2 Spacek

> 
> 
> Thanks
> 
> Rakesh
> 
> On Tue, Aug 23, 2016 at 10:01 PM, Rakesh Rajasekharan <
> [email protected]> wrote:
> 
>> I changed the logging level to 4 by modifying nsslapd-accesslog-level.
>>
>> But the hang is still there, though I don't see the segfault now.
>>
>>
>>
>>
>> On Tue, Aug 23, 2016 at 9:02 PM, Rakesh Rajasekharan <
>> [email protected]> wrote:
>>
>>> My disk was getting filled too fast.
>>>
>>> The logs under /var/log/dirsrv were growing to around 5 GB, quickly filling
>>> the disk.
>>>
>>> Is there a way to make the logging less verbose?
>>>
>>>
>>>
>>> On Tue, Aug 23, 2016 at 6:41 PM, Petr Spacek <[email protected]> wrote:
>>>
>>>> On 23.8.2016 15:07, Rakesh Rajasekharan wrote:
>>>>> I was able to fix that, maybe temporarily. When I checked the network,
>>>>> there was another process running and consuming a lot of network
>>>>> bandwidth (I have no idea who started it; I need to seriously start
>>>>> restricting people's access to this machine).
>>>>>
>>>>> After killing that process, performance improved drastically.
>>>>>
>>>>> But now I have suddenly started experiencing the same hang.
>>>>>
>>>>> This time, I get the following error when I check dmesg:
>>>>>
>>>>> [  301.236976] ns-slapd[3124]: segfault at 0 ip 00007f1de416951c sp
>>>>> 00007f1dee1dba70 error 4 in libcos-plugin.so[7f1de4166000+b000]
>>>>> [ 1116.248431] TCP: request_sock_TCP: Possible SYN flooding on port 88.
>>>>> Sending cookies.  Check SNMP counters.
>>>>> [11831.397037] ns-slapd[22550]: segfault at 0 ip 00007f533d82251c sp
>>>>> 00007f5347894a70 error 4 in libcos-plugin.so[7f533d81f000+b000]
>>>>> [11832.727989] ns-slapd[22606]: segfault at 0 ip 00007f6231eb951c sp
>>>>> 00007f623bf2ba70 error 4 in libcos-plugin.so[7f6231eb6000+b00
>>>>
>>>> Okay, this one is serious. The LDAP server crashed.
>>>>
>>>> 1. Make sure all your packages are up-to-date.
>>>>
>>>> Please see
>>>> http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#debugging-crashes
>>>> for further instructions on how to debug this.
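
To get a quick count of how often the crash shows up, and in which shared
object, the kernel log lines could be summarized along these lines. A minimal
sketch: segfaults_by_object is a hypothetical helper name, and it assumes each
segfault message sits on a single line, as dmesg prints it (the lines quoted in
this thread were wrapped by the mail client).

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<count> <object>" for each object named in a segfault message.
segfaults_by_object() {
    # Keep only the object name that follows " in " and precedes "[addr+size]".
    sed -n 's/.*segfault at.* in \([^[]*\)\[.*/\1/p' |
        sort | uniq -c | awk '{ print $1, $2 }'
}
```

Typical use would be: dmesg | segfaults_by_object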
>>>>
>>>> Petr^2 Spacek
>>>>
>>>>>
>>>>> and in /var/log/dirsrv/example-com/errors
>>>>>
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291138 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291139 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291140 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291141 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291142 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291143 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291144 (rc: 32)
>>>>> [23/Aug/2016:12:49:36 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3291145 (rc: 32)
>>>>> [23/Aug/2016:12:49:50 +0000] - Retry count exceeded in delete
>>>>> [23/Aug/2016:12:49:50 +0000] DSRetroclPlugin - delete_changerecord: could not delete change record 3292734 (rc: 51)
>>>>>
>>>>>
>>>>> Can I do something about this error? I tried to restart IPA a couple of
>>>>> times, but that did not help.
>>>>>
>>>>> Thanks
>>>>> Rakesh
>>>>>
>>>>> On Mon, Aug 22, 2016 at 2:27 PM, Petr Spacek <[email protected]> wrote:
>>>>>
>>>>>> On 19.8.2016 19:32, Rakesh Rajasekharan wrote:
>>>>>>> I am running my setup on the AWS cloud, and entropy is low, at around 180.
>>>>>>>
>>>>>>> I plan to increase it by installing haveged. But would low entropy by any
>>>>>>> chance cause this intermittent hang?
>>>>>>> Also, the hang is mostly observed when registering around 20 clients
>>>>>>> together.
>>>>>>
>>>>>> Possibly, I'm not sure. If you want to dig into this, I would do this:
>>>>>> 1. look at which process hangs on the client (using the pstree command or so)
>>>>>> $ pstree
>>>>>>
>>>>>> 2. look at which server and port the hanging client is connected to
>>>>>> $ lsof -p <PID of the hanging process>
>>>>>>
>>>>>> 3. jump to the server and see which process is bound to the target port
>>>>>> $ netstat -pn
>>>>>>
>>>>>> 4. see where the process is hanging
>>>>>> $ strace -p <PID of the hanging process>
>>>>>>
>>>>>> I hope it helps.
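
Step 3 above could be narrowed down with a small filter like the following. It
is only a sketch: owner_of_port is a hypothetical helper name, and it assumes
the net-tools netstat column layout, where the fourth field is the local
address and the last field is PID/Program name.

```shell
# Hypothetical helper: read netstat -pn style output on stdin and print the
# "PID/Program name" column for sockets whose local address ends in :PORT.
owner_of_port() {
    awk -v port="$1" '$4 ~ (":" port "$") { print $NF }'
}
```

For example, run as root on the server so that netstat can show PIDs:
netstat -tlnp | owner_of_port 88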
>>>>>>
>>>>>> Petr^2 Spacek
>>>>>>
>>>>>>> On Fri, Aug 19, 2016 at 7:24 PM, Rakesh Rajasekharan <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Yes, there seems to be something worrying. I have faced this today
>>>>>>>> as well.
>>>>>>>> There are a few hosts left, around 280, and when I try adding them to
>>>>>>>> IPA, the slowness begins.
>>>>>>>>
>>>>>>>> All the ipa commands, such as ipa user-find, become very slow to
>>>>>>>> respond.
>>>>>>>>
>>>>>>>> There are not many connections in SYN_RECV, though: just around 80-90,
>>>>>>>> and today only around 20.
>>>>>>>>
>>>>>>>>
>>>>>>>> I have for now increased tcp_max_syn_backlog to 5000.
>>>>>>>> For now the slowness seems to have gone, but I will try adding the
>>>>>>>> clients again tomorrow and see how it goes.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Rakesh
>>>>>>>>
>>>>>>>> The issues
>>>>>>>>
>>>>>>>> On Fri, Aug 19, 2016 at 12:58 PM, Petr Spacek <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> On 18.8.2016 17:23, Rakesh Rajasekharan wrote:
>>>>>>>>>> Hi
>>>>>>>>>>
>>>>>>>>>> I am migrating to FreeIPA from OpenLDAP and have around 4000 clients.
>>>>>>>>>>
>>>>>>>>>> I had opened another thread on that, but chose to start a new one
>>>>>>>>>> here, as it's a separate issue.
>>>>>>>>>>
>>>>>>>>>> I was able to change nsslapd-maxdescriptors by adding an LDIF file:
>>>>>>>>>> cat nsslapd-modify.ldif
>>>>>>>>>> dn: cn=config
>>>>>>>>>> changetype: modify
>>>>>>>>>> replace: nsslapd-maxdescriptors
>>>>>>>>>> nsslapd-maxdescriptors: 17000
>>>>>>>>>>
>>>>>>>>>> and running the ldapmodify command
>>>>>>>>>>
>>>>>>>>>> I have now started moving clients from OpenLDAP to FreeIPA and have
>>>>>>>>>> today moved close to 2000 clients.
>>>>>>>>>>
>>>>>>>>>> However, I have noticed that IPA hangs intermittently.
>>>>>>>>>>
>>>>>>>>>> Running kinit admin returns the error below:
>>>>>>>>>> kinit: Generic error (see e-text) while getting initial credentials
>>>>>>>>>>
>>>>>>>>>> In /var/log/messages, I see this entry:
>>>>>>>>>>
>>>>>>>>>>  prod-ipa-master-int kernel: [104090.315801] TCP: request_sock_TCP:
>>>>>>>>>> Possible SYN flooding on port 88. Sending cookies.  Check SNMP counters.
>>>>>>>>>
>>>>>>>>> I would be worried about this message. Maybe the kernel/firewall is doing
>>>>>>>>> something fishy behind your back and blocking some connections or so.
>>>>>>>>>
>>>>>>>>> Petr^2 Spacek
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Started Session 4885 of user root.
>>>>>>>>>> Aug 18 13:00:01 prod-ipa-master-int systemd[1]: Starting Session 4885 of user root.
>>>>>>>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Started Session 4886 of user root.
>>>>>>>>>> Aug 18 13:01:01 prod-ipa-master-int systemd[1]: Starting Session 4886 of user root.
>>>>>>>>>> Aug 18 13:02:40 prod-ipa-master-int python[28984]: ansible-command Invoked with creates=None executable=None shell=True args= removes=None warn=True chdir=None
>>>>>>>>>> Aug 18 13:04:37 prod-ipa-master-int sssd_be: GSSAPI Error: Unspecified GSS failure.  Minor code may provide more information (KDC returned error string: PROCESS_TGS)
>>>>>>>>>>
>>>>>>>>>> Could it be that it's due to the initial load of adding the clients,
>>>>>>>>>> or is there something else that I need to take care of?

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project
