On 04/09/2015 07:51 AM, Martin Kosek wrote:
On 04/09/2015 05:59 AM, Alexander Frolushkin wrote:
-----Original Message-----
From: thierry bordaz [mailto:tbor...@redhat.com]
Sent: Wednesday, April 08, 2015 6:36 PM
To: Alexander Frolushkin (SIB)
Cc: 'Ludwig Krispenz'; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] Accident upgrade 3.3 to 4.1

On 04/08/2015 02:19 PM, Alexander Frolushkin wrote:
On one of the accidentally upgraded servers I have the following error in the dirsrv logs:

[08/Apr/2015:13:24:12 +0300] connection - conn=1095 fd=131 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
[08/Apr/2015:13:24:12 +0300] connection - conn=1094 fd=124 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
[08/Apr/2015:13:24:12 +0300] connection - conn=1096 fd=124 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
[08/Apr/2015:13:24:12 +0300] connection - conn=1097 fd=131 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
This message is logged when a received message is too large. But here the max size 
is already 200 MB (209715200 bytes).
I cannot imagine such a large message.
Since they were all logged in the same second, it could be a transient error. Have you seen other 
messages like these?
Yes, it is still there.

[08/Apr/2015:14:55:01 +0300] connection - conn=1125 fd=130 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
[08/Apr/2015:14:55:01 +0300] connection - conn=1124 fd=126 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
[08/Apr/2015:14:55:01 +0300] connection - conn=1126 fd=126 Incoming BER Element 
was too long, max allowable is 209715200 bytes. Change the nsslapd-maxbersize 
attribute in cn=config to increase.
Those logs mean the connection (e.g. conn=1125) got closed.
Would you grep for conn=1125 in the access log?
[08/Apr/2015:14:55:00 +0300] conn=1125 fd=130 slot=130 connection from 
10.99.111.42 to 10.163.129.91
[08/Apr/2015:14:55:00 +0300] conn=1125 op=0 SRCH base="" scope=0 
filter="(objectClass=*)" attrs="subschemaSubentry dsservicename namingContexts 
defaultnamingcontext schemanamingcontext configurationnamingcontext 
rootdomainnamingcontext supportedControl supportedLDAPVersion 
supportedldappolicies supportedSASLMechanisms dnshostname ldapservicename servername 
supportedcapabilities"
[08/Apr/2015:14:55:00 +0300] conn=1125 op=0 RESULT err=0 tag=101 nentries=1 
etime=0
No closure log?
Possibly the next operation (op=1) triggered the error and the closure of the connection. Do you know whether some kind of keep-alive mechanism exists that would ping the instance with op=0 and could then send some dummy data?

Looking for periodicity in the 'Incoming BER Element' events could help to find out who opened that connection.
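
A rough sketch of how that check could be scripted, assuming the errors and access logs hold one event per line in the format quoted in this thread. The function names are illustrative helpers, not part of 389-ds or IPA, and the log paths in the usage note are assumptions about a typical install.

```shell
# Count 'Incoming BER Element' events per minute (errors log on stdin).
# Assumes each line starts with a "[DD/Mon/YYYY:HH:MM:SS +ZZZZ]" timestamp.
ber_events_per_minute() {
    awk '/Incoming BER Element/ {
        ts = substr($1, 2)             # drop the leading "["
        sub(/:[0-9][0-9]$/, "", ts)    # drop the seconds
        count[ts]++
    }
    END { for (ts in count) print ts, count[ts] }' | sort   # sorted as text
}

# Print the distinct connection ids (e.g. "conn=1125") that hit the error.
ber_conn_ids() {
    awk '/Incoming BER Element/ {
        for (i = 1; i <= NF; i++) if ($i ~ /^conn=[0-9]+$/) print $i
    }' | sort -u
}

# Print the client address logged when connection $1 was opened
# (access log on stdin).
conn_client() {
    awk -v id="conn=$1" '$3 == id && /connection from/ {
        for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1)
    }'
}
```

For example, `ber_events_per_minute < /var/log/dirsrv/slapd-INSTANCE/errors` would show whether the events repeat at a fixed interval, and `conn_client 1125 < /var/log/dirsrv/slapd-INSTANCE/access` would print the address that opened conn=1125 (INSTANCE is a placeholder).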

[08/Apr/2015:14:55:26 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://cnt-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:14:55:26 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://cnt-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:14:55:26 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://cnt-rhidm01.unix.ad.com:389/o%3Dipaca) failed.


[08/Apr/2015:13:25:11 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://sib-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:11 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://sib-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:11 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://sib-rhidm01.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:15 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://vlg-rhidm02.unix.ad.com:389/o%3Dipaca) failed.
[08/Apr/2015:13:25:15 +0300] attrlist_replace - attr_replace (nsslapd-referral, 
ldap://vlg-rhidm02.unix.ad.com:389/o%3Dipaca) failed.
Here it is likely triggered by a RUV containing duplicated values (multiple replica 
installs?). You may have to use clean-ruv after the upgrade:
ipa-replica-manage list-ruv  and ipa-replica-manage clean-ruv
Do you mean we need to upgrade all 3.3.3 IPA servers to 4.1 first? Or can this 
be cleaned right now on the remaining servers?

BTW:
# ipa-replica-manage list-ruv
Directory Manager password:

sib-rhidm03.unix.ad.com:389: 5
dv-rhidm01.unix.ad.com:389: 17
sib-rhidm02.unix.ad.com:389: 3
sib-rhidm01.unix.ad.com:389: 4
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
....
nw-rhidm01.unix.ad.com:389: 19

This message is harmless. It means that some values of nsds50ruv in the RUV 
have identical referrals.
This should not occur, but replication is smart enough to just log this warning 
and continue working.
I would not recommend a cleanup right now, just clarification of the status.
Would you send all the RUV values returned by 'list-ruv' (here there is no 
duplicate)?
Here is the full command output from the IPA 4.1 server:

# ipa-replica-manage list-ruv
Directory Manager password:

nw-rhidm01.unix.ad.com:389: 19
dv-rhidm02.unix.ad.com:389: 18
vlg-rhidm03.unix.ad.com:389: 12
sib-rhidm01.unix.ad.com:389: 4
dv-rhidm01.unix.ad.com:389: 17
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
cnt-rhidm01.unix.ad.com:389: 14
sib-rhidm03.unix.ad.com:389: 5
vlg-rhidm02.unix.ad.com:389: 13
msk-rhidm-03.unix.ad.com:389: 10
msk-rhidm-01.unix.ad.com:389: 9
vlg-rhidm01.unix.ad.com:389: 8
cnt-rhidm02.unix.ad.com:389: 15
sib-rhidm02.unix.ad.com:389: 3
msk-rhidm-02.unix.ad.com:389: 11

I'm planning to upgrade all the remaining IPA 3.3.3 servers to IPA 4.1.
Ok, that should help.

Am I understanding correctly that the messages above do not mean something is 
terribly wrong in IPA for now?
If you are asking about the attrlist_replace warnings, they should be benign,
caused by the uncleaned RUVs as Thierry indicated. Although the list above
looks OK, without duplicate RUVs.
I agree, those warnings mean something needs to be cleaned, but not that things are broken.
Replication should work fine.

Thierry, does this need to be checked on every IPA server, or are RUVs also
replicated?

I am unsure whether the list-ruv command is hiding something. The following command will dump the RUV of the local instance:

   ldapsearch -D "cn=directory manager" -W -b "$SUFFIX" \
     '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
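
To compare that dump with what list-ruv prints, the nsds50ruv values can be reduced to the same "host:port: id" layout. This is only a sketch: it assumes each replica element appears on a single LDIF line of the form `nsds50ruv: {replica <id> ldap://<host>:<port>} ...`, and `ruv_from_ldif` is a made-up helper name.

```shell
# Read ldapsearch LDIF on stdin and print each nsds50ruv replica element
# as "host:port: id". The {replicageneration} value is skipped.
ruv_from_ldif() {
    awk 'tolower($1) == "nsds50ruv:" && $2 == "{replica" {
        url = $4
        sub(/[}]$/, "", url)            # strip the closing brace
        sub(/^ldaps?:\/\//, "", url)    # strip the URL scheme
        print url ": " $3
    }'
}
```

Piping the ldapsearch output through `ruv_from_ldif | sort` gives a list that can be diffed against the sorted list-ruv output. Long values may be folded by LDIF line wrapping; with OpenLDAP's ldapsearch, adding `-o ldif-wrap=no` should avoid that.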


The 'attrlist_replace' message means that the local instance received a RUV from a remote instance and that the remote RUV contained a duplicated referral. If you want to know which servers need to be cleaned, you would run list-ruv (or the ldapsearch command) on each instance.
I would expect to see duplicates in the RUV of some instances, for example:

nw-rhidm01.unix.ad.com:389: 19
dv-rhidm02.unix.ad.com:389: 18
vlg-rhidm03.unix.ad.com:389: 12
sib-rhidm01.unix.ad.com:389: 4
dv-rhidm01.unix.ad.com:389: 17
url-rhidm01.unix.ad.com:389: 6
url-rhidm02.unix.ad.com:389: 7
cnt-rhidm01.unix.ad.com:389: 14
cnt-rhidm01.unix.ad.com:389: 24
sib-rhidm03.unix.ad.com:389: 5
vlg-rhidm02.unix.ad.com:389: 13
msk-rhidm-03.unix.ad.com:389: 10
msk-rhidm-01.unix.ad.com:389: 9
vlg-rhidm01.unix.ad.com:389: 8
cnt-rhidm02.unix.ad.com:389: 15
sib-rhidm02.unix.ad.com:389: 3
msk-rhidm-02.unix.ad.com:389: 11
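
Spotting such a duplicate by eye across many servers is error-prone, so here is a small sketch that flags it. It assumes the "host:port: id" layout shown in this thread; `list_ruv_duplicates` is an illustrative name, not an IPA command.

```shell
# Read `ipa-replica-manage list-ruv` output on stdin and print every
# host:port that is listed with more than one replica id.
list_ruv_duplicates() {
    awk -F': ' '{ sub(/[ \t\r]+$/, "") }    # trim trailing whitespace
        NF == 2 && $2 ~ /^[0-9]+$/ { ids[$1] = ids[$1] " " $2; n[$1]++ }
        END { for (h in n) if (n[h] > 1) print h ":" ids[h] }' | sort
}
```

Running `ipa-replica-manage list-ruv | list_ruv_duplicates` on each instance prints nothing when every host has a single replica id, and a line such as `cnt-rhidm01.unix.ad.com:389: 14 24` for the example above.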



Martin
