Re: [Freeipa-users] RUVs

2015-04-01 Thread Ludwig Krispenz

Hi,

a RUV (replica update vector) is a structure which each server maintains to 
record the state of updates it has seen from every other server; it is used 
in a replication session to determine which updates have to be sent.
Normally you don't need to deal with it; only if you remove a replica is it 
advisable to remove the references to the no longer existing server 
using clean-ruv
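
For example, a minimal sketch (the replica id 8 below is hypothetical; use an 
id that list-ruv reports for a server that no longer exists):

# list the replica ids and servers this server knows about
ipa-replica-manage list-ruv
# remove the stale references to the removed server with replica id 8
ipa-replica-manage clean-ruv 8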


Ludwig
On 04/01/2015 04:29 PM, Janelle wrote:

Hello again,

This is a more general question, as I am still somewhat new to "dirsrv". I have 
read through a lot of the docs, including the 389-ds ones, but with regard to 
IPA I am not 100% clear, and perhaps this could help others in 
the future.


Are there guidelines or suggestions for RUVs and cleaning, and how do you 
know when you are actually seeing a problem that needs to be fixed? In 
a healthy system, for example my 8 servers, if there are no issues, what 
would I expect to see from a "list-ruv"? What errors would indicate 
the need to run a "clean-ruv id"?


I am thinking that if there were a write-up or FAQ for this, it would go a 
long way toward helping new FreeIPA admins understand all of this. 
Just a suggestion.


Thank you
~J



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Accident upgrade 3.3 to 4.1

2015-04-08 Thread Ludwig Krispenz


On 04/08/2015 12:04 PM, Martin Kosek wrote:

On 04/08/2015 11:52 AM, Alexander Frolushkin wrote:

Hello!
We had a geo-replicated IPA setup on RHEL 7.0, and on one site the IPA servers 
were upgraded by mistake to RHEL 7.1 (ipa-server-4.1.0-18.el7_1.3.x86_64).
Now it is broken globally; in the logs I see these:

[08/Apr/2015:13:06:47 +0600] NSACLPlugin - ACL PARSE ERR(rv=-5): (targetattr="ipaProtectedOperation;write_keys
[08/Apr/2015:13:06:47 +0600] NSACLPlugin - __aclp__init_targetattr: targetattr "ipaProtectedOperation;write_keys" does not exist in schema. Please add attributeTypes "ipaProtectedOperation;write_keys" to schema if necessary.

What can I do to fix this catastrophe, or is it fatal?
As seen from the client servers, HBAC is not working at all, and maybe all 
other things as well :(

With best regards,
Alexander Frolushkin

AFAIK, this particular error message should not be fatal to the function; the
new ACI should just be ignored.
yes, but I don't know if any IPA component would rely on access granted 
by this aci.

Maybe the new schema did not replicate properly. Do you see other DS errors? (CCing DS guys)

is this message logged on all servers ?

Non-working HBAC is also strange, SSSD developers will want logs to analyze,
see https://fedorahosted.org/sssd/wiki/Troubleshooting

In any case, an upgrade from 3.3 to 4.1 should just work; you just need to have
recent enough RHEL-6 servers - at least RHEL-6.6 + z-streams.




Re: [Freeipa-users] understanding RUVs?

2015-04-21 Thread Ludwig Krispenz


On 04/21/2015 01:26 AM, Janelle wrote:

Hello,

When I was working with OpenLDAP and AD, I did not deal with 
"RUVs" the way I am with 389-ds and IPA.


I am trying to understand what is "normal" for these values. If I am looking 
at this (and seem to have no replication problems):


ipa-replica-manage list-ruv

ipa001.example.com:389: 13
ipa002.example.com:389: 12
ipa003.example.com:389: 11
ipa004.example.com:389: 10
ipa005.example.com:389: 7
ipa006.example.com:389: 6
ipa007.example.com:389: 5
ipa008.example.com:389: 3
ipa009.example.com:389: 16
ipa00a.example.com:389: 17
ipa00b.example.com:389: 15
ipa00c.example.com:389: 14
ipa00d.example.com:389: 9
ipa00e.example.com:389: 8
ipa00f.example.com:389: 4

I guess I was wondering: should I be seeing all the same values, or 
should they all be unique based on being "replicated" and the order 
they were added? Or is it telling me something else? Sorry, I guess I 
am still trying to wrap my head around replication metadata.
the output of list-ruv lists the replica ids and the corresponding 
servers this replica knows about. They should be unique and exactly match 
the servers (with their replica ids) deployed in your topology.
If there are more RUVs, you probably have removed a server and should 
clean the RUV; if there are fewer, then replication from the missing 
replica in the list did not get propagated to this server.
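
A quick way to cross-check, as a sketch (run on one of the remaining servers):

# the servers actually deployed in the topology
ipa-replica-manage list
# the replica ids this server has seen updates from
ipa-replica-manage list-ruv

Any id shown by list-ruv without a matching deployed server is a candidate for clean-ruv.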


But the output of list-ruv only shows part of the RUV; the "real" RUV 
looks like this:


ldapsearch -LLL -o ldif-wrap=no -h localhost -p 30522 -x -D "cn=directory manager" -w . -b "cn=config" "objectclass=nsds5replica" nsds50ruv

dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 51dc3bac0064
nsds50ruv: {replica 100 ldap://localhost:30522} 5506ce510064 55254d910064
nsds50ruv: {replica 200 ldap://localhost:4945} 5506cf8e00c8 5506cf8e00c8


The most important part is the last field, e.g. 55254d910064: it is 
the csn of the last change this server has seen for replica id 100 
(0x64). In a replication session the RUVs of the supplier and consumer 
are compared to detect if the supplier has changes the consumer has not 
yet seen.
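
Note that the csn values shown above appear truncated by the mail archive. As a 
reading aid, a sketch of the usual 389-ds csn layout (the full value below is 
illustrative): 8 hex digits of timestamp, 4 of sequence number, 4 of replica 
id, and 4 of sub-sequence number.

55254d91 0000 0064 0000
time     seq  rid  subseq   (rid 0x64 = replica 100)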

So the ruvs have to be managed per server.

Ludwig





Thank you
~J





Re: [Freeipa-users] ipa-replica-manage re-initialize and database size

2015-04-24 Thread Ludwig Krispenz


On 04/24/2015 09:26 AM, Dominik Korittki wrote:

Hello all,

I am running two IPA 3.3.3 instances in replication on CentOS 7 servers.
Yesterday the root partition (where the dirsrv databases are stored) filled 
up because of a big changelog db.
dirsrv managed to do a graceful shutdown. Luckily, the second master 
was still working properly, so I could recover the first one from it.


I resized the partition, booted up again and ran
'ipa-replica-manage re-initialize --from ipa02.internal'

Everything seemed to run fine except for one warning regarding an issue 
with the changelog db. Here is the relevant portion of the log 
/var/log/dirsrv/slapd-INTERNAL/errors on the receiving (first) IPA master:

[...]
[23/Apr/2015:10:41:46 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=internal is going offline; disabling replication
[23/Apr/2015:10:41:47 +0200] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
[23/Apr/2015:10:41:55 +0200] - import userRoot: Workers finished; cleaning up...
[23/Apr/2015:10:41:55 +0200] - import userRoot: Workers cleaned up.
[23/Apr/2015:10:41:55 +0200] - import userRoot: Indexing complete.  Post-processing...
[23/Apr/2015:10:41:55 +0200] - import userRoot: Generating numSubordinates complete.
[23/Apr/2015:10:41:55 +0200] - import userRoot: Flushing caches...
[23/Apr/2015:10:41:55 +0200] - import userRoot: Closing files...
[23/Apr/2015:10:41:55 +0200] - import userRoot: Import complete.  Processed 9983 entries in 8 seconds. (1247.88 entries/sec)
[23/Apr/2015:10:41:55 +0200] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=internal is coming online; enabling replication
[23/Apr/2015:10:41:55 +0200] NSMMReplicationPlugin - replica_reload_ruv: Warning: new data for replica dc=internal does not match the data in the changelog. Recreating the changelog file. This could affect replication with replica's consumers in which case the consumers should be reinitialized.

[...]
this should be normal. At the moment of initialization, a server has a 
database and a changelog. The database is recreated by the initialization, 
and when the replication plugin starts it detects that changelog and db no 
longer match and recreates the changelog.





I am no expert in LDAP or Directory Server, but I noticed a 
significant size difference of the files in 
/var/lib/dirsrv/slapd-INTERNAL/cldb/:

root@ipa01:~ > du -sch /var/lib/dirsrv/slapd-INTERNAL/cldb/*
0       /var/lib/dirsrv/slapd-INTERNAL/cldb/61e65983-718611e4-8059dc1c-48160578.sema
24M     /var/lib/dirsrv/slapd-INTERNAL/cldb/61e65983-718611e4-8059dc1c-48160578_546f45150004.db
0       /var/lib/dirsrv/slapd-INTERNAL/cldb/9b310907-74a711e4-8059dc1c-48160578.sema
6,8M    /var/lib/dirsrv/slapd-INTERNAL/cldb/9b310907-74a711e4-8059dc1c-48160578_547485400060.db
4,0K    /var/lib/dirsrv/slapd-INTERNAL/cldb/DBVERSION
30M     total

root@ipa02:/var/log > du -sch /var/lib/dirsrv/slapd-INTERNAL/cldb/*
0       /var/lib/dirsrv/slapd-INTERNAL/cldb/98ceaf89-74a711e4-910b9512-1512b1dc.sema
4,7G    /var/lib/dirsrv/slapd-INTERNAL/cldb/98ceaf89-74a711e4-910b9512-1512b1dc_546f45150004.db
0       /var/lib/dirsrv/slapd-INTERNAL/cldb/9cfacd4b-74a711e4-910b9512-1512b1dc.sema
3,7M    /var/lib/dirsrv/slapd-INTERNAL/cldb/9cfacd4b-74a711e4-910b9512-1512b1dc_547485400060.db
4,0K    /var/lib/dirsrv/slapd-INTERNAL/cldb/DBVERSION
4,7G    total


Also, I noticed a difference in the actual database size on both servers:

root@ipa01:~ > du -sch /var/lib/dirsrv/slapd-INTERNAL/db/*
4,0K    /var/lib/dirsrv/slapd-INTERNAL/db/DBVERSION
1,3M    /var/lib/dirsrv/slapd-INTERNAL/db/__db.001
544K    /var/lib/dirsrv/slapd-INTERNAL/db/__db.002
9,6M    /var/lib/dirsrv/slapd-INTERNAL/db/__db.003
1,4M    /var/lib/dirsrv/slapd-INTERNAL/db/ipaca
2,2M    /var/lib/dirsrv/slapd-INTERNAL/db/log.124384
101M    /var/lib/dirsrv/slapd-INTERNAL/db/userRoot
115M    total

root@ipa02:/var/log > du -sch /var/lib/dirsrv/slapd-INTERNAL/db/*
4,0K    /var/lib/dirsrv/slapd-INTERNAL/db/DBVERSION
1,7M    /var/lib/dirsrv/slapd-INTERNAL/db/__db.001
544K    /var/lib/dirsrv/slapd-INTERNAL/db/__db.002
9,6M    /var/lib/dirsrv/slapd-INTERNAL/db/__db.003
1,3M    /var/lib/dirsrv/slapd-INTERNAL/db/ipaca
4,3M    /var/lib/dirsrv/slapd-INTERNAL/db/log.074356
175M    /var/lib/dirsrv/slapd-INTERNAL/db/userRoot
193M    total

Besides that, everything seems to be working fine again, 
including the replication. No errors or warnings regarding this issue 
appear in the dirsrv logs. So I'm a bit confused right now whether to 
believe everything worked fine or not.

Is this behaviour of IPA/Directory Server normal? Many thanks in advance!


Greetings and a nice day,
Dominik Korittki





Re: [Freeipa-users] thousands DSRetroclPlugin mesages

2015-04-27 Thread Ludwig Krispenz


On 04/26/2015 10:49 AM, Martin (Lists) wrote:

Hello,

after a reboot I get almost a thousand of the following messages:

DSRetroclPlugin - delete_changerecord: could not delete change record
128755 (rc: 32)
this message comes from changelog trimming and means that an entry 
which should be purged does not exist (any more).
the retrocl maintains a first/last change and trimming starts at 
the first change. if for some reason (a race?) there is an attempt to 
delete the same entry a second time, this message would be logged.
since the changenumbers in the error messages increase, I think 
changelog trimming moves forward. you could do searches on 
"cn=changelog" to verify that trimming works.


The record number changes from 127600 up to 148400. What does this mean?
I have searched the web but did not find any hint on this.

I use Fedora 21 Server with current IPA packages (Version 4.1.4).

Kindly
Martin





Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz


On 04/29/2015 03:14 PM, thierry bordaz wrote:

On 04/29/2015 02:43 PM, Andy Thompson wrote:

-Original Message-
From: Martin Kosek [mailto:mko...@redhat.com]
Sent: Wednesday, April 29, 2015 8:31 AM
To: Andy Thompson;freeipa-users@redhat.com; Ludwig Krispenz; Thierry
Bordaz
Subject: Re: [Freeipa-users] deleting ipa user

On 04/29/2015 01:26 PM, Andy Thompson wrote:

I'm trying to delete an IPA account and I get a generic "operations error" 
when trying to remove it. It looks like something is messed up with the 
group object. The user doesn't show up in the ipausers group and there also 
isn't a group object for the user in question. Here is the error from the 
attempt.

[29/Apr/2015:07:21:32 -0400] referint-plugin - _update_all_per_mod: entry cn=ipausers,cn=groups,cn=accounts,dc=domain,dc=com: deleting "member: uid=,cn=users,cn=accounts,dc=domain,dc=com" failed (16)
[29/Apr/2015:07:21:32 -0400] referint-plugin - _update_all_per_mod: entry ipaUniqueID=3897c894-e764-11e4-b05b-005056a92af3,cn=hbac,dc=domain,dc=com: deleting "memberUser: uid=,cn=users,cn=accounts,dc=domain,dc=com" failed (16)
[29/Apr/2015:07:21:32 -0400] ldbm_back_delete - conn=0 op=0 Turning a tombstone into a tombstone! "nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8,cn=,cn=groups,cn=accounts,dc=domain,dc=com"; e: 0x7fcc84226070, cache_state: 0x0, refcnt: 1
[29/Apr/2015:07:21:32 -0400] managed-entries-plugin - mep_del_post_op: failed to delete managed entry (cn=,cn=groups,cn=accounts,dc=domain,dc=com) - error (1)
[29/Apr/2015:07:21:32 -0400] ldbm_back_delete - conn=0 op=0 Turning a tombstone into a tombstone! "nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8,cn=,cn=groups,cn=accounts,dc=domain,dc=com"; e: 0x7fcc84226070, cache_state: 0x0, refcnt: 1
[29/Apr/2015:07:21:32 -0400] managed-entries-plugin - mep_del_post_op: failed to delete managed entry (cn=,cn=groups,cn=accounts,dc=domain,dc=com) - error (1)

This is the first time I see this error. CCing Ludwig or Thierry to advise.

Andy, please also include FreeIPA and 389-ds-base packages versions so that
Thierry and Ludwig know what to look at.


Here you go

ipa-server-4.1.0-18.el7_1.3.x86_64
389-ds-base-1.3.3.1-15.el7_1.x86_64

Thanks much

-andy



Hello,

I wonder whether it is a similar issue to the one I hit: 
https://fedorahosted.org/389/ticket/48165. What differs is the 
'_update_all_per_mod' logs, but that could be a consequence of the same bug.
I think what differs is that in the ticket there is an attempt to delete an 
existing entry, while in the log snippet provided it attempts to delete a 
tombstone entry (an entry which was already deleted).
So the errors logged by DS seem to be ok, but why does IPA want to 
delete an already deleted user? Or maybe only the mep plugin finds a 
tombstone and tries to delete it.


What was the command executed? Is the result the same if repeated?

I have a non-systematic test case for 48165.
Is it happening systematically in your case?

thanks
thierry



Re: [Freeipa-users] thousands DSRetroclPlugin mesages

2015-04-29 Thread Ludwig Krispenz


On 04/29/2015 03:17 PM, Martin (Lists) wrote:

Am 27.04.2015 um 09:45 schrieb Ludwig Krispenz:

On 04/26/2015 10:49 AM, Martin (Lists) wrote:

Hello,

after a reboot I get almost a thousand of the following messages:

DSRetroclPlugin - delete_changerecord: could not delete change record
128755 (rc: 32)

this message comes from changelog trimming and means that an entry
which should be purged does not exist (any more).
the retrocl maintains a first/last change and trimming starts at
the first change. if for some reason (a race?) there is an attempt to
delete the same entry a second time, this message would be logged.
since the changenumbers in the error messages increase, I think
changelog trimming moves forward. you could do searches on
"cn=changelog" to verify that trimming works.

the changelog is part of the ldbm database plugin and contains various
pieces of information I don't understand (or only partially understand). What
kind of information should I look for?
the changelog keeps track of the changes applied to the database; a 
typical entry looks like:

dn: changenumber=4,cn=changelog
objectClass: top
objectClass: changelogentry
changeNumber: 4
targetDn: cn=tuser,ou=people,dc=example,dc=com
changeTime: 20140411093444Z
changeType: delete

each entry gets a DN made up from the changenumber, so your entries will 
be named:


dn: changenumber=61,cn=changelog
dn: changenumber=62,cn=changelog
dn: changenumber=63,cn=changelog
dn: changenumber=64,cn=changelog

changenumbers only ever increase, and changelog trimming 
removes old entries (depending on config).


so if you do a search like:
ldapsearch .. -b "cn=changelog"
the changenumber of the first entry returned should always increase, 
indicating that trimming works.
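
A concrete sketch of that check (assuming directory manager credentials; -z 1 limits the result to the first entry):

ldapsearch -x -D "cn=directory manager" -W -b "cn=changelog" -z 1 "(objectclass=changelogentry)" changenumber

Run it twice some time apart; the changenumber of the first entry should have increased if trimming is running.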


you said "thousands" of messages, how frequent are they really ?


I only have one server running by the way.

Regards
Martin




Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz


On 04/29/2015 03:40 PM, Andy Thompson wrote:

[...]



Hello,

I wonder whether it is a similar issue to the one I hit:
https://fedorahosted.org/389/ticket/48165. What differs is the
'_update_all_per_mod' logs, but that could be a consequence of the same bug.

I think what differs is that in the ticket there is an attempt to delete an existing
entry, while in the log snippet provided it attempts to delete a tombstone
entry (an entry which was already deleted).
So the errors logged by DS seem to be ok, but why does IPA want to delete
an already deleted user? Or maybe only the mep plugin finds a tombstone
and tries to delete it.

What was the command executed? Is the result the same if repeated?



I attempted using the web interface initially and then tried using 
ipa user-del  to see if it gave any more detail.
were both attempts at 07:21:32? or do you have more errors in the 
error log?


More info though: this is a replicated environment, and I just tried deleting 
it on the replica server and it completed successfully, so it appears I might 
have a replication issue going on?  Hopefully I didn't mess somethin

Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz

can you do the following search on both servers ?

ldapsearch -LLL -o ldif-wrap=no -h xxx -p xxx -x -D "cn=directory manager" -w xxx -b "dc=xxx" "(&(objectclass=nstombstone)(nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8))" nscpentrywsi | grep -i objectClass



[...]

Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz

did you run the searches as directory manager ?

On 04/29/2015 04:34 PM, Andy Thompson wrote:

-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 10:28 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user

can you do the followin search on both servers ?

ldapsearch -LLL -o ldif-wrap=no -h xxx -p xxx -x -D "cn=directory manager" -w xxx -b "dc=xxx" "(&(objectclass=nstombstone)(nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8))" nscpentrywsi | grep -i objectClass

The server that I initially attempted the deletion on returns nothing. The 
second server (the one currently throwing the consumer failed replay error) 
returns this if I remove the nscpentrywsi attribute from the request. If I 
leave the attribute in, I don't get anything:

objectClass: posixgroup
objectClass: ipaobject
objectClass: mepManagedEntry
objectClass: top
objectClass: nsTombstone

-andy




Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz


On 04/29/2015 04:49 PM, Andy Thompson wrote:

-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 10:51 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user

did you run the searches as directory manager ?


Yep sure did
that's weird; as directory manager you should be able to see the 
nscpentrywsi attribute. could you paste your full search request ?



  

[...]




Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz


On 04/29/2015 05:08 PM, Andy Thompson wrote:



-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 10:59 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user


On 04/29/2015 04:49 PM, Andy Thompson wrote:

-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 10:51 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user

did you run the searches as directory manager ?


Yep sure did

that's weird, as directory manager you should be able to see the
nscpentrywsi attribute, could you paste your full search request ?

This returns the object:

ldapsearch -LLL -o ldif-wrap=no -H ldap://mdhixnpipa02 -x -D "cn=directory manager" -W -b "dc=..." "(&(objectclass=nstombstone)(nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8))" | grep -i objectClass

This returns nothing:

ldapsearch -LLL -o ldif-wrap=no -H ldap://mdhixnpipa02 -x -D "cn=directory manager" -W -b "dc=..." "(&(objectclass=nstombstone)(nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8))" nscpentrywsi | grep -i objectClass

and if you omit the grep? still puzzled.
what is logged in the access log for these two searches?






[...]




Re: [Freeipa-users] deleting ipa user

2015-04-29 Thread Ludwig Krispenz


On 04/29/2015 05:35 PM, Andy Thompson wrote:

-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 11:28 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user


On 04/29/2015 05:08 PM, Andy Thompson wrote:

-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 10:59 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user


On 04/29/2015 04:49 PM, Andy Thompson wrote:

-Original Message-
From: Ludwig Krispenz [mailto:lkris...@redhat.com]
Sent: Wednesday, April 29, 2015 10:51 AM
To: Andy Thompson
Cc: thierry bordaz; Martin Kosek; freeipa-users@redhat.com
Subject: Re: [Freeipa-users] deleting ipa user

did you run the searches as directory manager ?


Yep sure did

that's weird, as directory manager you should be able to see the
nscpentrywsi attribute, could you paste your full search request ?

This returns the object

ldapsearch -LLL -o ldif-wrap=no -H ldap://mdhixnpipa02 -x -D "cn=directory manager" -W -b "dc=..." "(&(objectclass=nstombstone)(nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8))" | grep -i objectClass

This returns nothing:

ldapsearch -LLL -o ldif-wrap=no -H ldap://mdhixnpipa02 -x -D "cn=directory manager" -W -b "dc=..." "(&(objectclass=nstombstone)(nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8))" nscpentrywsi | grep -i objectClass

and if you omit the grep ? still puzzled.

Ah if I omit the grep on the second server I get

dn: nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8,cn=username,cn=groups,cn=accounts,dc=mhbenp,dc=lin
nscpentrywsi: dn: nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8,cn=username,cn=groups,cn=accounts,dc=mhbenp,dc=lin
nscpentrywsi: objectClass;vucsn-55364a4200050004: posixgroup
nscpentrywsi: objectClass;vucsn-55364a4200050004: ipaobject
nscpentrywsi: objectClass;vucsn-55364a4200050004: mepManagedEntry
nscpentrywsi: objectClass;vucsn-55364a4200050004: top
nscpentrywsi: objectClass;vucsn-5540deb800030003: nsTombstone
nscpentrywsi: cn;vucsn-55364a4200050004;mdcsn-55364a4200050004: gfeigh
nscpentrywsi: gidNumber;vucsn-55364a4200050004: 124903
nscpentrywsi: description;vucsn-55364a4200050004: User private group for username
nscpentrywsi: mepManagedBy;vucsn-55364a4200050004: uid=username,cn=users,cn=accounts,dc=mhbenp,dc=lin
nscpentrywsi: creatorsName;vucsn-55364a4200050004: cn=Managed Entries,cn=plugins,cn=config
nscpentrywsi: modifiersName;vucsn-55364a4200050004: cn=Managed Entries,cn=plugins,cn=config
nscpentrywsi: createTimestamp;vucsn-55364a4200050004: 20150421130152Z
nscpentrywsi: modifyTimestamp;vucsn-55364a4200050004: 20150421130152Z
nscpentrywsi: nsUniqueId: 7e1a1f87-e82611e4-99f1b343-f0abc1a8
nscpentrywsi: ipaUniqueID;vucsn-55364a4200050004: 94dc1638-e826-11e4-878a-005056a92af3
nscpentrywsi: parentid: 4
nscpentrywsi: entryid: 385
nscpentrywsi: nsParentUniqueId: 3763f193-e76411e4-99f1b343-f0abc1a8
nscpentrywsi: nstombstonecsn: 5540deb800030003
nscpentrywsi: nscpEntryDN: cn=username,cn=groups,cn=accounts,dc=mhbenp,dc=lin
nscpentrywsi: entryusn: 52327

thought I tried that before, apparently not.
ok, so we have the entry on one server; the csn of the objectclass: 
nsTombstone is:

objectClass;vucsn-5540deb800030003: nsTombstone

which matches the csn in the error log:

Consumer failed to replay change (uniqueid 7e1a1f87-e82611e4-99f1b343-f0abc1a8, 
CSN 5540deb800030003): Operations error (1)

so the state of the entry is as expected.

Now we need to find it on the other server. If the search for the & filter with 
nstombstone returns nothing, could you try
- a plain search (nsuniqueid=7e1a1f87-e82611e4-99f1b343-f0abc1a8) (also with 
nscpentrywsi)
or, if this doesn't return anything:
- (objectclass=nstombstone) and grep for your 





what is logged in the access log for these two searches?



[...]

Re: [Freeipa-users] thousands DSRetroclPlugin mesages

2015-04-30 Thread Ludwig Krispenz


On 04/29/2015 05:51 PM, Martin (Lists) wrote:

Am 29.04.2015 um 15:43 schrieb Ludwig Krispenz:

On 04/29/2015 03:17 PM, Martin (Lists) wrote:

Am 27.04.2015 um 09:45 schrieb Ludwig Krispenz:

On 04/26/2015 10:49 AM, Martin (Lists) wrote:

Hello,

after a reboot I get almost a thousand of the following messages:

DSRetroclPlugin - delete_changerecord: could not delete change record
128755 (rc: 32)

this message comes from changelog trimming and means that an entry
which should be purged does not exist (any more).
the retrocl maintains a first/last change and trimming starts at
the first change. if for some reason (a race?) there is an attempt to
delete the same entry a second time, this message would be logged.
since the changenumbers in the error messages increase, I think
changelog trimming moves forward. you could do searches on
"cn=changelog" to verify that trimming works.

the changelog is part of the ldbm database plugin and contains various
pieces of information I don't understand (or only partially understand). What
kind of information should I look for?

the changelog keeps track of the changes applied to the database, a
typical entry looks like:
dn: changenumber=4,cn=changelog
objectClass: top
objectClass: changelogentry
changeNumber: 4
targetDn: cn=tuser,ou=people,dc=example,dc=com
changeTime: 20140411093444Z
changeType: delete

OK, I looked in the wrong directory. Now I have found many changelog
entries, starting with number 152926 and ending with 155512 (ldapsearch
states 2588 numEntries). Should there be that many?

The oldest is about two and a half days old, and it has not changed
within the last few minutes.


each entry gets a DN made up from the changenumber, so your entries will
be named:

dn: changenumber=61,cn=changelog
dn: changenumber=62,cn=changelog
dn: changenumber=63,cn=changelog
dn: changenumber=64,cn=changelog

changenumbers only ever increase, and changelog trimming
removes old entries (depending on config).

so if you do a search like:
ldapsearch .. -b "cn=changelog"
the changenumber of the first entry returned should always increase,
indicating that trimming works.

It seems my trimming is broken, at least partially. Is there
something I can adjust?
no, it seems to be ok. IPA configures the "changelog maxage" as 2d, so 
when changelog trimming runs it removes changes older than two days, then 
it "sleeps" for this time and runs again; so the changes can pile 
up to four days' worth, then get trimmed, and so on ...
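
You can check the configured value with a search like this (a sketch, assuming directory manager credentials):

ldapsearch -x -D "cn=directory manager" -W -b "cn=Retro Changelog Plugin,cn=plugins,cn=config" nsslapd-changelogmaxage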



you said "thousands" of messages, how frequent are they really ?

On every reboot I get these messages. I do not get them during normal
operation.

how frequently do you reboot ? maybe you only see the trimming after startup


Something odd I observed after the last two reboots: ns-slapd keeps my
hard disk busy for several minutes (about 15) after the reboot. This
is the time it takes to log all these change record messages.

Kindly
Martin





Re: [Freeipa-users] IPA RUV unable to decode

2015-05-05 Thread Ludwig Krispenz


On 05/05/2015 01:27 PM, Martin Kosek wrote:

On 05/05/2015 12:38 PM, Vaclav Adamec wrote:

Hi,
I tried to migrate to the newest IPA version, but the result is quite unstable, and
removing old replicas ends with RUVs which cannot be decoded (they are stuck in
the queue forever):

ipa-replica-manage del ipa-master-dmz002.test.com -fc
Cleaning a master is irreversible.
This should not normally be require, so use cautiously.
Continue to clean master? [no]: yes

ipa-replica-manage list-ruv
unable to decode: {replica 8} 5509123900040008 5509123900040008
unable to decode: {replica 7} 552f84cd00030007 552f84cd00030007
unable to decode: {replica 11} 551a42f7000b 551aa3140001000b
unable to decode: {replica 15} 551e82e10001000f 551e82e10001000f
unable to decode: {replica 14} 551e82ec0001000e 551e82ec0001000e
unable to decode: {replica 20} 552f4b7200060014 552f4b7200060014
unable to decode: {replica 10} 551a25af0001000a 551a25af0001000a
unable to decode: {replica 3} 551e864c00030003 551e864c00030003
unable to decode: {replica 5} 55083ad200030005 55083ad200030005
unable to decode: {replica 9} 550913e70009 550913e70009
unable to decode: {replica 19} 5521019300030013 5521019300030013
unable to decode: {replica 12} 551a4829000c 551a48c5000c
ipa-master-dmz001.test.com:389: 25
ipa-master-dmz002.test.com:389: 21

is it possible to clear this queue and leave only the valid servers ?

Thanks in advance

ipa-client-4.1.0-18.el7_1.3.x86_64
ipa-server-4.1.0-18.el7_1.3.x86_64

Ludwig or Thierry, do you know? The questions about RUV cleaning seem to be
recurring; I suspect there is a pattern (bug) and not just a configuration
issue.
we have seen this in a recent thread, and it is clear that the RUV is 
corrupted and cannot be decoded, but we don't have a scenario for how this 
state is reached.




[Freeipa-users] Fwd: Re: IPA RUV unable to decode

2015-05-05 Thread Ludwig Krispenz

let's keep the info on the list, more people, more ideas

-------- Original Message --------
Subject:Re: [Freeipa-users] IPA RUV unable to decode
Date:   Tue, 5 May 2015 18:32:15 +0200
From:   Vaclav Adamec 
To: Ludwig Krispenz 



master:
ipa-replica-manage del  -fc
ipa-replica-prepare ... + copy gpg

replicas:
ipa-server-install --uninstall
ipa-replica-install ...

one by one on all replicas (1 CA master and 5 replicas, which are also 
set in DNS as SRV records). In addition, the whole time there was a script/job 
registering clients (ipa-client-install --uninstall + 
ipa-client-install ... --force-join   against different active replica 
servers, via DNS). It seems that I have some issue with the replicas themselves: 
they seem to work, but when loaded with such a "heavy" operation the replica 
servers go down (the dirsrv process dies), and even the master itself went down 
with minor-code errors in the log.



On Tue, May 5, 2015 at 3:25 PM, Ludwig Krispenz wrote:



   On 05/05/2015 02:24 PM, Vaclav Adamec wrote:

For me it was quite easy to reach this state. I just set up
replicas on 6 servers (two datacenters, with ~15ms latency, same
stable ntp source) and ran test scripts which add/verify random
accounts and add/remove hosts (unregistering seems to be the "heaviest"
operation for replicas). During these tests I removed some
servers from the replication setup and tried to add them back,

   what exactly did you do:
   ipa-server-install --uninstall
   ipa-server-install ?
   any other commands ?


and the result is like this. It seems that with higher load it happens more
often (registering/unregistering multiple servers; with the actual setup it
is about 10 new registrations at the same time against different replica
servers). Sometimes I also got replication conflicts or a complete
directory service failure (dirsrv down, started with a gssapi minor
code error). But even if I stop the tests the queue is not freed; the RUVs
are still there (old replicas).

On Tue, May 5, 2015 at 1:49 PM, Ludwig Krispenz wrote:


On 05/05/2015 01:27 PM, Martin Kosek wrote:

On 05/05/2015 12:38 PM, Vaclav Adamec wrote:

Hi,
I tried to migrate to the newest IPA version, but the result is
quite unstable, and removing old replicas ends with RUVs which
cannot be decoded (they are stuck in the queue forever):

ipa-replica-manage del ipa-master-dmz002.test.com -fc
Cleaning a master is irreversible.
This should not normally be require, so use cautiously.
Continue to clean master? [no]: yes

ipa-replica-manage list-ruv
unable to decode: {replica 8} 5509123900040008 5509123900040008
unable to decode: {replica 7} 552f84cd00030007 552f84cd00030007
unable to decode: {replica 11} 551a42f7000b 551aa3140001000b
unable to decode: {replica 15} 551e82e10001000f 551e82e10001000f
unable to decode: {replica 14} 551e82ec0001000e 551e82ec0001000e
unable to decode: {replica 20} 552f4b7200060014 552f4b7200060014
unable to decode: {replica 10} 551a25af0001000a 551a25af0001000a
unable to decode: {replica 3} 551e864c00030003 551e864c00030003
unable to decode: {replica 5} 55083ad200030005 55083ad200030005
unable to decode: {replica 9} 550913e70009 550913e70009
unable to decode: {replica 19} 5521019300030013 5521019300030013
unable to decode: {replica 12} 551a4829000c 551a48c5000c
ipa-master-dmz001.test.com:389: 25
ipa-master-dmz002.test.com:389: 21

is it possible to clear this queue and leave only the valid servers ?

Thanks in advance

ipa-client-4.1.0-18.el7_1.3.x86_64
ipa-server-4.1.0-18.el7_1.3.x86_64

Ludwig or Thierry, do you know? The questions about RUV
cleaning seem to be recurring; I suspect there is a pattern
(bug) and not just a configuration issue.

we have seen this in a recent thread, and it is clear that the
RUV is corrupted and cannot be decoded, but we don't have a
scenario how 

Re: [Freeipa-users] Problem with replication

2015-05-06 Thread Ludwig Krispenz

Hi,

there seem to be different issues:
- I don't know what ipactl status is looking for when it generates 
the error message about no matching master, but I don't think it is 
related to the retro changelog.

- the retro changelog errors for adding and deleting:
-- the add failures are about aborted transactions because a page cannot 
be accessed; this may be caused by concurrent mods on different backends, 
which all want to update the shared retro cl database.
The changenumber reported seems to be increasing (one error is about 
changenumber 44975, the next about 45577), so it looks like changes are 
written into the changelog and the changenumber increases.
-- I'm not sure about the delete errors, but normally trimming would go 
on after such an error message; the changenumbers it attempts to delete are 
increasing.
Could you verify which changes are in the changelog, and whether these are 
changing:

ldapsearch -b "cn=changelog" dn

On 05/06/2015 09:52 AM, Łukasz Jaworski wrote:

Hi,

One of our replica hanged up morning. Error log after dirsrv restart:
[06/May/2015:09:28:15 +0200] - Retry count exceeded in delete
[06/May/2015:09:28:15 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 38376 (rc: 51)
[06/May/2015:09:28:15 +0200] - Operation error fetching Null DN (6368aeb7-f3c111e4-ae70ce39-9b469c1f), error -30993.
[06/May/2015:09:28:15 +0200] - dn2entry_ext: Failed to get id for changenumber=44975,cn=changelog from entryrdn index (-30993)
[06/May/2015:09:28:15 +0200] - Operation error fetching changenumber=44975,cn=changelog (null), error -30993.
[06/May/2015:09:28:15 +0200] DSRetroclPlugin - replog: an error occured while adding change number 44975, dn = changenumber=44975,cn=changelog: Operations error.
[06/May/2015:09:28:15 +0200] retrocl-plugin - retrocl_postob: operation failure [1]
[06/May/2015:09:28:15 +0200] - ldbm_back_seq deadlock retry BAD 1601, err=0 BDB0062 Successful return: 0
[06/May/2015:09:30:03 +0200] - ldbm_back_seq deadlock retry BAD 1601, err=0 BDB0062 Successful return: 0
[06/May/2015:09:30:06 +0200] - Retry count exceeded in delete
[06/May/2015:09:30:06 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39297 (rc: 51)

I did "re-initialize" from other replica.

Now ipactl doesn't work. Shows: Configured hostname 'replica09.local' does not 
match any master server in LDAP. On lists replica09 is exists (twice)

# ipactl status
Failed to get list of services to probe status!
Configured hostname 'replica09.local' does not match any master server in LDAP:
replica01.local
replica02.local
replica03.local
replica04.local
replica05.local
replica06.local
replica07.local
replica08.local
replica09.local
replica10.local
replica09.local

After dirsrv stop/start:

In error logs there are many:
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39290 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39291 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39292 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39293 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39294 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39295 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not delete change record 39296 (rc: 32)
etc.

[06/May/2015:09:51:08 +0200] - Operation error fetching Null DN (9f51430a-f3c411e4-927ece39-9b469c1f), error -30993.
[06/May/2015:09:51:08 +0200] - dn2entry_ext: Failed to get id for changenumber=45577,cn=changelog from entryrdn index (-30993)
[06/May/2015:09:51:08 +0200] - Operation error fetching changenumber=45577,cn=changelog (null), error -30993.
[06/May/2015:09:51:08 +0200] DSRetroclPlugin - replog: an error occured while adding change number 45577, dn = changenumber=45577,cn=changelog: Operations error.
[06/May/2015:09:51:08 +0200] retrocl-plugin - retrocl_postob: operation failure [1]
[06/May/2015:09:51:08 +0200] - ldbm_back_seq deadlock retry BAD 1601, err=0 BDB0062 Successful return: 0

Packages:
freeipa-server-4.1.3-2.fc21.x86_64
389-ds-base-1.3.3.8-1.fc21.x86_64
389-ds-base-libs-1.3.3.8-1.fc21.x86_64

Best regards,
Ender




Re: [Freeipa-users] Problem with replication

2015-05-06 Thread Ludwig Krispenz

please reply to the mailing list
On 05/06/2015 11:00 AM, Łukasz Jaworski wrote:

Hi,

ipactl stops working after dirsrv-stop/start.

There are many changes in the changelog:
from 39399 to 44397

(…)
# 44393, changelog
dn: changenumber=44393,cn=changelog

# 44394, changelog
dn: changenumber=44394,cn=changelog

# 44395, changelog
dn: changenumber=44395,cn=changelog

# 44396, changelog
dn: changenumber=44396,cn=changelog

# 44397, changelog
dn: changenumber=44397,cn=changelog

# search result
search: 2
result: 11 Administrative limit exceeded

# numResponses: 5001
# numEntries: 5000

Best regards,
Lukasz Jaworski 'Ender'

Message written by Ludwig Krispenz on 6 May 2015 at 10:52:


(…)

Re: [Freeipa-users] Problem with replication

2015-05-06 Thread Ludwig Krispenz


On 05/06/2015 11:10 AM, Łukasz Jaworski wrote:

Hi,

ipactl stops working after dirsrv-stop/start.

There are many changes in the changelog:
from 39399 to 44397

(…)
# 44393, changelog
dn: changenumber=44393,cn=changelog

# 44394, changelog
dn: changenumber=44394,cn=changelog

# 44395, changelog
dn: changenumber=44395,cn=changelog

# 44396, changelog
dn: changenumber=44396,cn=changelog

# 44397, changelog
dn: changenumber=44397,cn=changelog

# search result
search: 2
result: 11 Administrative limit exceeded

# numResponses: 5001
# numEntries: 5000


After some seconds dirsrv stops responding.

In error log:
[06/May/2015:11:00:04 +0200] agmt="cn=cloneAgreement1-replica09.local-pki-tomcat" (replica08:389) - Can't locate CSN 55100d8c069f in the changelog (DB rc=-30988). If replication stops, the consumer may need to be reinitialized.
[06/May/2015:11:00:04 +0200] - ldbm_back_seq deadlock retry BAD 1601, err=0 BDB0062 Successful return: 0

ldapsearch hangs. Dirsrv is not responding now.

if the server is hanging, can you get a pstack?
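
For example (a sketch; pstack comes with gdb, and the process name assumes a standard instance):

# dump the stacks of all ns-slapd threads for analysis
pstack $(pidof ns-slapd) > /tmp/ns-slapd.pstack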


This replica is on a virtual machine (ganeti). We had problems with replication 
to VMs, but after a force-sync all was fine. On the physical servers everything works fine.
the messages indicate there could be many concurrent operations, 
because individual ops are not fast enough; could your VM have 
fewer/slower resources than the physical machines ?


Lukasz Jaworski 'Ender'

Message written by Ludwig Krispenz on 6 May 2015 at 10:52:


Hi,

there seem to be different issues,
- I don't know what the ipactl status is looking for when it generates the 
error message about no matching master,
but I don't think it is related to the retro changelog.

- the retro changelog errors for adding and deleting
-- the add failures are about aborted transactions because a page cannot be 
accessed, this maybe caused by concurrent mods on different backends, which 
want to update teh shared retro cl database.
the changenumber reprted seems to be increasing, one error is about 
changenumber 44975, the next about 45577, so it looks like changes into the 
changelog are written and teh changenumber increases
-- i'm not sure about the delete errors, but normally trimming would go on 
after such an error message, the changenumber attempted to delete are 
increasing.
Could you verify which changes are in the changelog, and if these are changing:
ldapsearch -b "cn=changelog" dn

On 05/06/2015 09:52 AM, Łukasz Jaworski wrote:

Hi,

One of our replica hanged up morning. Error log after dirsrv restart:
[06/May/2015:09:28:15 +0200] - Retry count exceeded in delete
[06/May/2015:09:28:15 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 38376 (rc: 51)
[06/May/2015:09:28:15 +0200] - Operation error fetching Null DN 
(6368aeb7-f3c111e4-ae70ce39-9b469c1f), error -30993.
[06/May/2015:09:28:15 +0200] - dn2entry_ext: Failed to get id for 
changenumber=44975,cn=changelog from entryrdn index (-30993)
[06/May/2015:09:28:15 +0200] - Operation error fetching 
changenumber=44975,cn=changelog (null), error -30993.
[06/May/2015:09:28:15 +0200] DSRetroclPlugin - replog: an error occured while 
adding change number 44975, dn = changenumber=44975,cn=changelog: Operations 
error.
[06/May/2015:09:28:15 +0200] retrocl-plugin - retrocl_postob: operation failure 
[1]
[06/May/2015:09:28:15 +0200] - ldbm_back_seq deadlock retry BAD 1601, err=0 
BDB0062 Successful return: 0
[06/May/2015:09:30:03 +0200] - ldbm_back_seq deadlock retry BAD 1601, err=0 
BDB0062 Successful return: 0
[06/May/2015:09:30:06 +0200] - Retry count exceeded in delete
[06/May/2015:09:30:06 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 39297 (rc: 51)

I did "re-initialize" from another replica.

Now ipactl doesn't work. Shows: Configured hostname 'replica09.local' does not 
match any master server in LDAP. In the list below, replica09 exists (twice)

# ipactl status
Failed to get list of services to probe status!
Configured hostname 'replica09.local' does not match any master server in LDAP:
replica01.local
replica02.local
replica03.local
replica04.local
replica05.local
replica06.local
replica07.local
replica08.local
replica09.local
replica10.local
replica09.local

After dirsrv stop/start:

In error logs there are many:
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 39290 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 39291 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 39292 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 39293 (rc: 32)
[06/May/2015:09:50:30 +0200] DSRetroclPlugin - delete_changerecord: could not 
delete change record 39294 (rc: 32)
[06/May/2015:09:50:30 +0200]

Re: [Freeipa-users] Antwort: Re: more replication fun

2015-05-07 Thread Ludwig Krispenz


On 05/07/2015 10:46 AM, Christoph Kaminski wrote:

> I am curious however. I have been running OpenLDAP configs with 20 or
> more servers in replication for over 5 years. In all that time, I think
> I have had replication issues 5 times.  In the 6 months of working with
> FreeIPA, replication issues are constant. From reading the threads, 
I am

> not the only one in this predicament. Is there any history on why
> replication is so problematic in FreeIPA?
>
same here... with OpenLDAP no problems; since we use IPA we have had constant 
replication issues


I think the replication design is the problem. All IPA servers are masters. I 
think it would be more stable if there were one master and all 
replicas were read-only.


Greetz
i don't think that the multimaster design is a problem in itself (it is 
complex and can make things complicated, yes). As Thierry said, we have 
customers with large 389-ds deployments, with many million entries in 
the db, lots of mods and stable replication.
The current issues with the RUVs seem to come from the dynamics of 
adding and removing replicas, which reveals new problems.






-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Antwort: RE: Known issues with IPA on VM?

2015-05-08 Thread Ludwig Krispenz


On 05/07/2015 08:38 AM, Christoph Kaminski wrote:

> Just a guess, what is your deployment size?
> We have a two ipa domains, one have 3 servers (2 hw and 1 vm, no
> issues with dirsrv yet), another currently includes 16 vm servers,
> and dirsrv hangs and crashes periodically...
>

we have 8 IPA servers, 4 bare metal and 4 vm's. We see the crashes 
only on the vm's.





yes, there have been several reports about problems on VMs, but as 
Martin and Rich said, for investigation we need some data about the crashes
-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] more replication fun

2015-05-08 Thread Ludwig Krispenz


On 05/08/2015 05:30 PM, Rob Crittenden wrote:

Janelle wrote:

On 5/7/15 12:59 AM, thierry bordaz wrote:

On 05/07/2015 05:39 AM, Janelle wrote:

On 5/6/15 8:12 PM, Vaclav Adamec wrote:

Hi,
  Mark Reynolds recommended the cleanallruv script (the "IPA RUV unable to decode"
thread); if you are sure that no live replica server is behind
this id then just try "cleanallruv.pl -w X -b "dc=" -r 9"

Vasek


On Thu, May 7, 2015 at 2:25 AM, Janelle 
wrote:

Hi again..

Seems to be an ongoing theme (replication). How does one remove 
these?


unable to decode: {replica 9} 553ef80e00010009
55402c390009

I am hoping this is a stupid question with a really simple answer
that I am simply missing?

~J

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project





Thank you Vasek,

I am curious however. I have been running OpenLDAP configs with 20 or
more servers in replication for over 5 years. In all that time, I
think I have had replication issues 5 times.  In the 6 months of
working with FreeIPA, replication issues are constant. From reading
the threads, I am not the only one in this predicament. Is there any
history on why replication is so problematic in FreeIPA?

regards
~J


Hi Janelle,

This is a large question and I have no precise answer. My
understanding of OpenLDAP replication vs RHDS replication is that
they are not based on the same approach (syncrepl vs
replica agreement). Both work. Replication is complex, and
when I compare RHDS with other DS implementations using the same
approach (replica agreement) I can say that RHDS is doing a good
job in terms of performance, stability and robustness.

Replication is sensitive to administrative tasks: backup-restore,
reinit, upgrade, schema update. This is possibly your case; we have
seen 'unable to decode' during upgrade/cleanruv and are still
investigating that bug.

thanks
thierry


All of this makes good sense - in fact, even the OpenLDAP vs 389-ds
issues and replication. Yes, in the environment I had, there were a
couple of masters, and the rest were read-only, which meant keeping in
sync - easy.

Now, I was looking through the archives and can't seem to find the
recommended way to delete one of these:

unable to decode  {replica 22} 553eec6400040016 553eec6400040016

I think someone mentioned a script, but I can't find it.   I have
several replicas in this state and would like to try and clean them up.
The interesting thing is - the replicas in this state seem to have a
higher CPU load as based on "uptime". Interesting.

Thanks
~J




See https://www.redhat.com/archives/freeipa-users/2015-May/msg00062.html

hopefully it does, if not maybe Mark can help to get rid of it


It would be nice to know if this style of RUV could be acted on by 
ipa-replica-manage. I added this bit as a catch-all so no RUV would be 
invisibly skipped if it didn't match the regex I wrote. If this kind 
of RUV could indeed still be cleaned it would be nice to know and we 
could make that possible.
I think this kind of RUV should never exist; strangely enough we have a 
hard time reproducing it in the lab, but out in the real world they 
seem to proliferate.


Any help to reproduce is greatly appreciated.

Ludwig


rob



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] more replication issues

2015-05-15 Thread Ludwig Krispenz


On 05/13/2015 06:34 PM, Janelle wrote:

On 5/13/15 9:13 AM, Rich Megginson wrote:

On 05/13/2015 10:04 AM, Janelle wrote:

On 5/13/15 8:49 AM, Rich Megginson wrote:

On 05/13/2015 09:40 AM, Janelle wrote:

Recently I started seeing these crop up across my servers:

slapi_ldap_bind - Error: could not bind id [cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config] authentication 
mechanism [SIMPLE]: error 32 (No such object) errno 0 (Success)


Does that entry exist?

ldapsearch -xLLL -h consumer.host -D "cn=directory manager" -W -s 
base -b "cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config"


Does the parent exist?

ldapsearch -xLLL -h consumer.host -D "cn=directory manager" -W -s 
base -b "ou=csusers,cn=config"


I am finding that there does seem to be a relation to the above 
error and a possible CSN issue:


Can't locate CSN 555131e500020019 in the changelog (DB 
rc=-30988). If replication stops, the consumer may need to be 
reinitialized.


I guess what concerns me is what could be causing this. We don't do 
a lot of changes all the time.


And in answer to the question above - we seem to have lost the 
agreement somehow:


No such object (32)



Is there a DEL operation in the access log for "cn=Replication 
Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config"?


maybe something like

# grep DEL /var/log/dirsrv/slapd-INST/access|grep -i "Replication 
Manager"



nope -- none of the servers have it.

your original message is very clear:

could not bind id [cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config] 
authentication mechanism [SIMPLE]: error 32 (No such object) errno 0 
(Success)


this means that you have a replication agreement with SIMPLE auth which uses a
nsDS5ReplicaBindDN: cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config


which does not exist on the target server of the agreement. Now you say 
it was never deleted, so it was probably never added, but it is used in the 
replication agreements. How do you manage and set up replication agreements?
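
A quick way to see how the agreements are configured on the server logging 
the error - a sketch assuming directory manager access:

ldapsearch -x -D "cn=directory manager" -W -o ldif-wrap=no -b "cn=config" \
  "(objectclass=nsds5replicationagreement)" \
  nsDS5ReplicaHost nsDS5ReplicaBindDN nsDS5ReplicaBindMethod

Any agreement that combines the SIMPLE bind method with the missing 
"cn=Replication Manager ..." bind DN points at the broken path.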


--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] more replication issues

2015-05-15 Thread Ludwig Krispenz


On 05/15/2015 02:45 PM, Janelle wrote:

On 5/15/15 3:30 AM, Ludwig Krispenz wrote:


On 05/13/2015 06:34 PM, Janelle wrote:

On 5/13/15 9:13 AM, Rich Megginson wrote:

On 05/13/2015 10:04 AM, Janelle wrote:

On 5/13/15 8:49 AM, Rich Megginson wrote:

On 05/13/2015 09:40 AM, Janelle wrote:

Recently I started seeing these crop up across my servers:

slapi_ldap_bind - Error: could not bind id [cn=Replication 
Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config] 
authentication mechanism [SIMPLE]: error 32 (No such object) 
errno 0 (Success)


Does that entry exist?

ldapsearch -xLLL -h consumer.host -D "cn=directory manager" -W -s 
base -b "cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config"


Does the parent exist?

ldapsearch -xLLL -h consumer.host -D "cn=directory manager" -W -s 
base -b "ou=csusers,cn=config"


I am finding that there does seem to be a relation to the above 
error and a possible CSN issue:


Can't locate CSN 555131e500020019 in the changelog (DB 
rc=-30988). If replication stops, the consumer may need to be 
reinitialized.


I guess what concerns me is what could be causing this. We don't 
do a lot of changes all the time.


And in answer to the question above - we seem to have lost the 
agreement somehow:


No such object (32)



Is there a DEL operation in the access log for "cn=Replication 
Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config"?


maybe something like

# grep DEL /var/log/dirsrv/slapd-INST/access|grep -i "Replication 
Manager"



nope -- none of the servers have it.

your original message is very clear:

could not bind id [cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config] 
authentication mechanism [SIMPLE]: error 32 (No such object) errno 0 
(Success)


this means that you have a replication agreement with SIMPLE auth which 
uses a
nsDS5ReplicaBindDN: cn=Replication Manager 
masterAgreement1-ipa01.example.com-pki-tomcat,ou=csusers,cn=config


which does not exist on the target server of the agreement. Now you 
say it was never deleted, so it was probably never added, but it is used in 
the replication agreements. How do you manage and set up replication 
agreements?



All replicas are configured simply:

ipa-replica-prepare hostname...
scp ..
ipa-replica-install --no-ntp --setup-ca Replica-file

That is it. NTP is not set because internal NTP servers are used. All 
replicas are CA replicas for safety (no certs are managed)
ok, I was a bit puzzled because ipa uses ldap principals and gssapi for 
the main suffix replication.
But I just verified that after ipa-replica-install --setup-ca, CA 
replication is set up with users in ou=csusers,cn=config and uses them as 
the replica binddn; I have no idea why it would disappear.


when Rich asked to search for a DEL, did you check this on the server 
that logged the message or on the endpoint of the replication agreement 
(it should be there)? And you may have to check in the rotated access 
logs as well.


After a few days to a week the message starts popping up in logs.

~J



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] replication again :-(

2015-05-19 Thread Ludwig Krispenz


On 05/19/2015 08:58 AM, thierry bordaz wrote:

On 05/19/2015 07:47 AM, Martin Kosek wrote:

On 05/19/2015 03:23 AM, Janelle wrote:
Once again, replication/sync has been lost. I really wish the product was more
stable; it has so much potential and yet...

Servers running for 6 days no issues. No new accounts or changes 
(maybe a few
users changing passwords) and again, 5 out of 16 servers are no 
longer in sync.


I can test it easily by adding an account and then waiting a few 
minutes, then
run "ipa  user-show --all username" on all the servers, and only a 
few of them

have the account.  I have now waited 15 minutes, still no luck.

Oh well.. I guess I will go look at alternatives. I had such high 
hopes for
this tool. Thanks so much everyone for all your help in trying to 
get things
stable, but for whatever reason, there is a random loss of sync 
among the

servers and obviously this is not acceptable.


Hello Janelle,

I am very sorry to hear about your troubles. Would you be still OK 
with helping us (mostly Ludwig and Thierry) investigate what is the 
root cause of the instability of the replication agreements? This is 
obviously something that should not be happening at this rate as in 
your deployment, so I would really like to be able to identity and 
fix this issue in the 389 DS.

Hello Janelle,

I can only join my voice to Martin to say how I am sorry to read this.
Would you turn on replication logging level (8192) on the 
master/consumer and provide us the logs(access/error) and config 
(dse.ldif).
The master is the instance where you can see the update and that is 
linked (via a replication agreement) to a replica (aka consumer) where the 
update is not received.
what puzzles me most is that replication is working for quite some time 
and then breaks, so we need to find out about the dynamics which lead to 
that state. You reported errors about invalid credentials or about a 
bind dn entry not found; these credentials don't get changed by ds and 
entries are not deleted by ds, so what triggers these changes?
also for the suggestion by Thierry to debug, we need to determine where 
replication breaks; if you add an account and it is propagated to some 
servers and not to others, where does it stop? This depends on your 
replication topology. You said in another post that you have a ring 
topology - does it mean all 16 servers are connected in a ring only, and 
if two links break the topology is disconnected?


thanks
thierry


-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication again :-(

2015-05-20 Thread Ludwig Krispenz


On 05/20/2015 02:57 AM, Janelle wrote:

On 5/19/15 12:04 AM, thierry bordaz wrote:

On 05/19/2015 03:42 AM, Janelle wrote:

On 5/18/15 6:23 PM, Janelle wrote:
Once again, replication/sync has been lost. I really wish the 
product was more stable; it has so much potential and yet...


Servers running for 6 days no issues. No new accounts or changes 
(maybe a few users changing passwords) and again, 5 out of 16 
servers are no longer in sync.


I can test it easily by adding an account and then waiting a few 
minutes, then run "ipa  user-show --all username" on all the 
servers, and only a few of them have the account.  I have now 
waited 15 minutes, still no luck.


Oh well.. I guess I will go look at alternatives. I had such high 
hopes for this tool. Thanks so much everyone for all your help in 
trying to get things stable, but for whatever reason, there is a 
random loss of sync among the servers and obviously this is not 
acceptable.


regards
~J




All the replicas are happy again. I found these again:

unable to decode  {replica 16} 5535647200030010 5535647200030010
unable to decode  {replica 23} 5553e3a30017 555432430017
unable to decode  {replica 24} 554d53d30018 554d54a400020018

What I also found to be interesting is that I have not deleted any 
masters at all, so this was quite perplexing where the orphaned 
entries came from.  However I did find 3 of the replicas did not show 
complete RUV lists... While most of the replicas had a list of all 16 
servers, a couple of them listed only 4 or 5. (using 
ipa-replica-manage list-ruv)
so this happens "out of the blue" ? Did it happen at the same time, do 
you know when it started ? The maxcsns in the ruv are quite old (r16: 
Apr 21, r23: May 14, r24: May 9) - could it be that there was no change 
applied to these masters for that time ?


Once I re-initialized --from servers that showed the correct RUVS 
everyone is happy again. I have tested replication by creating and 
deleting accounts, changing group members and a few other things. 
Everything is working fine.  I have enabled additional logging.


Now we wait and when it happens again, hopefully we have something.

thanks
~Janelle





-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication again :-(

2015-05-20 Thread Ludwig Krispenz


On 05/20/2015 03:25 PM, Janelle wrote:

On 5/20/15 12:54 AM, Ludwig Krispenz wrote:


On 05/20/2015 02:57 AM, Janelle wrote:

On 5/19/15 12:04 AM, thierry bordaz wrote:

On 05/19/2015 03:42 AM, Janelle wrote:

On 5/18/15 6:23 PM, Janelle wrote:
Once again, replication/sync has been lost. I really wish the 
product was more stable; it has so much potential and yet...


Servers running for 6 days no issues. No new accounts or changes 
(maybe a few users changing passwords) and again, 5 out of 16 
servers are no longer in sync.


I can test it easily by adding an account and then waiting a few 
minutes, then run "ipa  user-show --all username" on all the 
servers, and only a few of them have the account.  I have now 
waited 15 minutes, still no luck.


Oh well.. I guess I will go look at alternatives. I had such high 
hopes for this tool. Thanks so much everyone for all your help in 
trying to get things stable, but for whatever reason, there is a 
random loss of sync among the servers and obviously this is not 
acceptable.


regards
~J




All the replicas are happy again. I found these again:

unable to decode  {replica 16} 5535647200030010 5535647200030010
unable to decode  {replica 23} 5553e3a30017 555432430017
unable to decode  {replica 24} 554d53d30018 554d54a400020018

What I also found to be interesting is that I have not deleted any 
masters at all, so this was quite perplexing where the orphaned 
entries came from.  However I did find 3 of the replicas did not 
show complete RUV lists... While most of the replicas had a list of 
all 16 servers, a couple of them listed only 4 or 5. (using 
ipa-replica-manage list-ruv)
so this happens "out of the blue" ? Did it happen at the same time, 
do you know when it started ? The maxcsns in the ruv are quite old 
(r16: Apr 21, r23: May 14, r24: May 9) - could it be that there was no 
change applied to these masters for that time ?


Indeed yes, that is a correct statement. It seems to be incredibly 
random.
Ok, I give up - how are you finding the date in the strings? And 
really, is May 14th that old?

5535647200030010 is a CSN (ChangeSequenceNumber); it is built of

hextimestamp: 55356472
sequence number: 0003 (numbering of csns generated within the second of 
the time stamp)

replica id: 0010 (==16), the replica where the change was received
subsequence number: used internally if a mod consists of several 
sub-mods


May 14 is not old, but it would mean that there was no change on that 
replica for a couple of days
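
The timestamp part can be decoded with plain shell arithmetic - a sketch 
assuming GNU date:

# convert the hextimestamp 55356472 to a human-readable date (UTC)
date -u -d @"$((16#55356472))"
# -> Mon Apr 20 20:41:22 UTC 2015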




What is odd about the Apr 21st one, is that if you see my previous 
emails, I had cleaned up all of this before, so for that to 
"re-appear" is indeed a mystery.


As of this morning, things remain clean. What will be funny, now that 
I have extended logging enabled, is that they know we are on to them, so the 
servers won't fail again. :-)


~J







-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] confused by ldapsearch results

2015-05-20 Thread Ludwig Krispenz


On 05/21/2015 07:50 AM, Martin Kosek wrote:

On 05/20/2015 04:01 PM, Boyce, George Robert. (GSFC-762.0)[NICS] wrote:

<<
This worked for me:

$ ldapsearch -LLL -Y GSSAPI -b cn=users,cn=accounts,dc=example,dc=cm
"(|(uid=admin)(name=admin))" dn
SASL/GSSAPI authentication started
SASL username: ad...@example.com
SASL SSF: 56
SASL data security layer installed.
dn: uid=admin,cn=users,cn=accounts,dc=example,dc=com

Note that cn is Common Name which is set to the user's full name, in this case likely 
"George Boyce". So that will never match gboyce.

Rob
Rob,

Thanks for your example, it had me test my ldap bind which narrows the problem 
and gives me a workaround.

I used cn=gboyce to pull my group record, so I expected my test to return two 
records for my account and my group. And it does when I authenticate as admin 
as in your test. So the problem is isolated to when I use a dedicated search 
account. I missed this note on setting up system accounts:

<<
Note: IPA 4.0 is going to change the default stance on data from nearly 
everything is readable to nothing is readable, by default. You will eventually 
need to add some Access Control Instructions (ACI's) to grant read access to 
the parts of the LDAP tree you will need.
Looks like I need to do just that. :-)
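
As an illustration, a minimal sketch of such an ACI - the suffix, attribute 
list and system account DN are placeholders to adapt:

ldapmodify -x -D "cn=directory manager" -W <<EOF
dn: cn=users,cn=accounts,dc=example,dc=com
changetype: modify
add: aci
aci: (targetattr = "uid || cn || memberOf")(version 3.0; acl "Read access for the LDAPsearch system account"; allow (read, search, compare) userdn = "ldap:///uid=LDAPsearch,cn=sysaccounts,cn=etc,dc=example,dc=com";)
EOF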

Still the behavior of returning nothing by adding an extra false term,

IIRC, this is done on purpose; there was a CVE and, as a fix, if you are
querying with an attribute you do not have permission to query with, you get no
answers.

correct. It was https://bugzilla.redhat.com/show_bug.cgi?id=979508
and behaviour matches the spec in 13.3.3.3: 
https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/9.0/html/Administration_Guide/Managing_Access_Control-Creating_ACIs_Manually.html#Creating_ACIs_Manually-Defining_Permissions


For the other problem, there is not enough information to judge. If two 
entries are in different subtrees, different acis could apply; we 
need the full set of acis, the full search and possibly access 
control logging (nsslapd-errorlog-level: 128)



or returning one entry when each of the terms returns a unique entry,
seems wrong.

It does return two entries when both are in the same subtree.

This one sounds strange, CCing Ludwig for reference.


###
### everything ok when using admin... two records, one from users, one from 
groups
###
# ldapsearch -Y GSSAPI -b "dc=..." "(|(uid=admin)(cn=gboyce))" dn
SASL/GSSAPI authentication started
SASL username: admin@...
SASL SSF: 56
SASL data security layer installed.
# extended LDIF
#
# LDAPv3
# base  with scope subtree
# filter: (|(uid=admin)(cn=gboyce))
# requesting: dn
#

# admin, users, accounts, ...
dn: uid=admin,cn=users,cn=accounts,dc=...

# gboyce, groups, accounts, ...
dn: cn=gboyce,cn=groups,cn=accounts,dc=...

# search result
search: 4
result: 0 Success

# numResponses: 3
# numEntries: 2

##

###
### system account (without ACLs) returns simple queries, but not correct 
results for compound queries in different subtrees
###

###
### different subtrees fails...
###
# ldapsearch -x  -D "uid=LDAPsearch,cn=sysaccounts,cn=etc,dc=..." -w "..." -b "dc=..." 
"(|(uid=admin)(cn=gboyce))" dn
# extended LDIF
#
# LDAPv3
# base  with scope subtree
# filter: (|(uid=admin)(cn=gboyce))
# requesting: dn
#

# admin, users, accounts, ...
dn: uid=admin,cn=users,cn=accounts,dc=...

# search result
search: 2
result: 0 Success

# numResponses: 2
# numEntries: 1

###
### same subtree works...
###
# l "(|(cn=admins)(cn=gboyce))" dn
# extended LDIF
#
# LDAPv3
# base  with scope subtree
# filter: (|(cn=admins)(cn=gboyce))
# requesting: dn
#

# admins, groups, accounts, ...
dn: cn=admins,cn=groups,cn=accounts,dc=...

# gboyce, groups, accounts, ...
dn: cn=gboyce,cn=groups,cn=accounts,dc=...

# search result
search: 2
result: 0 Success

# numResponses: 3
# numEntries: 2

###
### valid filter from above with extra false term...
###
# l "(|(cn=admins)(cn=gboyce)(name=foobar))" dn
# extended LDIF
#
# LDAPv3
# base  with scope subtree
# filter: (|(cn=admins)(cn=gboyce)(name=foobar))
# requesting: dn
#

# search result
search: 2
result: 0 Success

# numResponses: 1




--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] ruv problem

2015-05-21 Thread Ludwig Krispenz
could you try this: 
https://www.redhat.com/archives/freeipa-users/2015-May/msg00062.html

it was successfully applied before

On 05/21/2015 06:58 AM, Alexander Frolushkin wrote:


Hello again.

Is it now clear how to deal with the problem of ipa-replica-manage list-ruv 
showing


unable to decode: {replica 16} 548a81260010 548a81260010

?

I have this on all of my 17 servers, including a new replica created 
recently, and


ipa-replica-manage clean-ruv 16 says

unable to decode: {replica 16} 548a81260010 
548a81260010 Replica ID 16 not found


WBR,

Alexander Frolushkin








-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] ruv problem

2015-05-21 Thread Ludwig Krispenz


On 05/21/2015 09:50 AM, Alexander Frolushkin wrote:


Thank you. Do I need to run this on each of my 17 IPA servers in the unix 
domain?


no, the cleanallruv task should be propagated to all servers a replication 
agreement exists with
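
For reference, the task entry added on one server looks like this - a sketch 
with a placeholder suffix and the rid 16 from this thread; 
replica-force-cleaning: no makes the task wait for unreachable replicas 
instead of cleaning unconditionally:

ldapmodify -x -D "cn=directory manager" -W -a <<EOF
dn: cn=clean 16, cn=cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
cn: clean 16
replica-base-dn: dc=example,dc=com
replica-id: 16
replica-force-cleaning: no
EOF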


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

From: freeipa-users-boun...@redhat.com 
[mailto:freeipa-users-boun...@redhat.com] On Behalf Of Ludwig Krispenz
Sent: Thursday, May 21, 2015 1:37 PM
To: freeipa-users@redhat.com
Subject: Re: [Freeipa-users] ruv problem

could you try this: 
https://www.redhat.com/archives/freeipa-users/2015-May/msg00062.html

it was successfully applied before

On 05/21/2015 06:58 AM, Alexander Frolushkin wrote:

Hello again.

Is it now clear how to deal with the problem of ipa-replica-manage
list-ruv showing

unable to decode: {replica 16} 548a81260010
548a81260010

?

I have this on all of my 17 servers, including a new replica
created recently, and

ipa-replica-manage clean-ruv 16 says

unable to decode: {replica 16} 548a81260010
548a81260010 Replica ID 16 not found

WBR,

Alexander Frolushkin











-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication again :-(

2015-05-21 Thread Ludwig Krispenz


On 05/21/2015 01:36 PM, Janelle wrote:

And just like that - for no reason, they all reappeared:

unable to decode  {replica 16} 5535647200030010 5535647200030010
unable to decode  {replica 23} 5545d61f00020017 5552f71800030017
unable to decode  {replica 24} 554d53d30018 554d54a400020018

:-(
~J



so it is the same set of rids again. Just to be sure, do servers with 
rid 16, 23 and 24 still exist ? I think last time you cleaned them by 
ldapmodify, so they should no longer exist.


you said you would enable logging, did you find something in the logs ? 
or can we have a look at them ?


Ludwig
-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication again :-(

2015-05-21 Thread Ludwig Krispenz


On 05/21/2015 02:20 PM, thierry bordaz wrote:

On 05/21/2015 01:36 PM, Janelle wrote:


And just like that - for no reason, they all reappeared:

unable to decode  {replica 16} 5535647200030010 5535647200030010
unable to decode  {replica 23} 5545d61f00020017 5552f71800030017
unable to decode  {replica 24} 554d53d30018 554d54a400020018

:-(
~J


Hello Janelle,

Those 3 RIDs were already present in Node dc2-ipa1, correct ? They 
reappeared on other nodes as well ?
Maybe ds2-ipa1 established a replication session with its peers and 
sent those RIDs.
Could you track in all the access logs when the op 
csn=5552f71800030017 was applied.
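
(A direct way to do that tracking - a sketch assuming default log locations 
and that the rotated access logs are still present:

grep -h "csn=5552f71800030017" /var/log/dirsrv/slapd-*/access*
)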


Note that the two hex values of replica 23 changed 
(5545d61f00020017 5552f71800030017 vs 5553e3a30017 
555432430017). Have you recreated a replica 23?


Do you have replication logging enabled ?

thanks
thierry





Hi Thierry, Mark,

I have an idea how this can happen, and now I have an environment where 
these show up.


The changelog contains max and purge ruv, and in my changelog I have:


dbid: 006f
entry count: 304

dbid: 00de
purge ruv:
{replicageneration} 51dc3bac0064
{replica 100} a7590064 a7590064
{replica 200} b3c200c8 b3c200c8
{replica 300} b3c20005012c b3c20005012c

dbid: 014d
max ruv:
{replicageneration} 51dc3bac0064
{replica 100} a7590064 d7730064
{replica 200} b3c200c8 b3c200c8
{replica 300} b3c20005012c b3c20005012c


after restarting I got:
 ldapsearch -LLL -o ldif-wrap=no -h localhost  -p 30522 -x -D 
"cn=directory manager" -w xx -b "cn=config" 
"objectclass=nsds5replica" nsds50ruv

dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 51dc3bac0064
nsds50ruv: {replica 100 ldap://localhost:30522} a7590064 
d7730064
nsds50ruv: {replica 200 ldap://localhost:4945} b3c200c8 
b3c200c8

nsds50ruv: {replica 300} b3c20005012c b3c20005012c

replica 300 is corrupted.

In this env I had played with cleaning the ruv for rid 300, without disabling 
repl agreements from 300 (which I should have done) and by adding 
changes later on replica 300 (which I shouldn't have). Everything looked fine; 
just after stopping (to dump the changelog) and restarting I was in the 
bad state


Need to try to repeat and verify








-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication again :-(

2015-05-21 Thread Ludwig Krispenz


On 05/21/2015 03:04 PM, Janelle wrote:

On 5/21/15 5:49 AM, Rich Megginson wrote:

On 05/21/2015 06:25 AM, Janelle wrote:

On 5/21/15 5:20 AM, thierry bordaz wrote:

Hello Janelle,

Those 3 RIDs were already present in Node dc2-ipa1, correct ? They 
reappeared on other nodes as well ?
Maybe ds2-ipa1 established a replication session with its peers 
and sent those RIDs.
Could you track in all the access logs when the op 
csn=5552f71800030017 was applied.


Note that the two hex values of replica 23 changed 
(5545d61f00020017 5552f71800030017 vs 5553e3a30017 
555432430017). Have you recreated a replica 23?


Do you have replication logging enabled ?

thanks
thierry
Just to help me -- what is the best way to enable the logging level 
you need?


http://www.port389.org/docs/389ds/FAQ/faq.html#troubleshooting
The Replication log level.

I thought I did it correctly adding to dse.ldif, but I don't think 
it took.


You cannot edit dse.ldif while the server is running.  Anyway, 
ldapmodify is the best way to set this value.
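
A sketch of that ldapmodify, setting the replication debug level (8192) 
mentioned earlier in the thread - note the value replaces whatever level was 
set before:

ldapmodify -x -D "cn=directory manager" -W <<EOF
dn: cn=config
changetype: modify
replace: nsslapd-errorlog-level
nsslapd-errorlog-level: 8192
EOF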


I am used to OpenLDAP, so perhaps there is a different way to do it 
with 389-ds. Can you suggest the logging settings you want me to use?



The Replication log level.


~Janelle



How do I kill one of the ldapmodify "cleans" I had started but which seems 
to be stuck:

abort should be done by ldapmodify similar to starting it:

ldapmodify
dn: cn=abort 222, cn=abort cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
cn: abort 222
replica-base-dn: dc=example,dc=com
replica-id: 222
replica-certify-all: no

--> if set to "no" the task does not wait for all the replica servers to have been sent the abort task, or be 
online, before completing.  If set to "yes", the task will run forever until all the configured replicas have 
been aborted.  Note - the original default was "yes", but this was changed to "no" on 4/21/15.  It is 
best to set this attribute anyway, and not rely on what the default is.

if it doesn't work we have to ask Mark :-)


CLEANALLRUV tasks
RID 24  None
No abort CLEANALLRUV tasks running

It has been 45 minutes and still nothing, so I want to kill it and try 
again.


~J



-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication again :-(

2015-05-21 Thread Ludwig Krispenz


On 05/21/2015 03:28 PM, Janelle wrote:

I think I found the problem.

There was a lone replica running in another DC. It was installed as a 
replica some time ago with all the others.  Think of this -- the 
original config had 5 servers, one of them was this server. Then the 
other 4 servers were RE-BUILT from scratch, so all the replication 
agreements were changed AND - this is the important part - the 5th 
server was never added back in. BUT - the 5th server was left running 
and never told it that it was not a member anymore. It still thought 
it had a replication agreement with original "server 1", but server 1 
knew otherwise.


Now, although the first 4 servers were rebuilt, the same domain, 
realm, AND passwords were used.


I am guessing that somehow, this 5th server keeps trying to interject 
its info into the ring of 4 servers, kind of forcing its way in. 
Somehow, because the original credentials still work (but the certs are 
all different), the first 4 servers are left with a "can't decode" 
issue.


There should be some security checks so this can't happen. It should 
also be easy to replicate.


Now I have to go re-initialize all the servers from a good server, so 
everyone is happy again. The "problem" server has been shutdown 
completely. (and yes, there were actually 3 of them in my scenario - I 
just used 1 to simplify my example - but that explains the 3 CSNs that 
just kept "appearing")


What concerns me most about this - were the servers outside of the 
"good ring" somehow able to inject data into replication which might 
have been causing bad data??? This is bad if it is true.

it depends a bit on what you mean by rebuilt from scratch.
A replication session needs to meet three conditions to be able to send 
data:
- the supplier side needs to be able to authenticate, and the 
authenticated user has to be in the list of binddns of the replica
- the data generation of the supplier and consumer side needs to be the same 
(they all have to have the same common origin)
- the supplier needs to have the changes (CSNs) to be able to position 
itself in its changelog to send updates


now if you have 5 servers, forget about one of them, and do not change 
the credentials in the others and do not reinitialize the database by an 
ldif import to generate a new database generation, the fifth server will 
still be able to connect and eventually send updates - how should the 
other servers know that this one is no longer a "good" one?


~Janelle



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] replication again :-(

2015-05-21 Thread Ludwig Krispenz


On 05/21/2015 03:59 PM, Janelle wrote:

On 5/21/15 6:46 AM, Ludwig Krispenz wrote:


On 05/21/2015 03:28 PM, Janelle wrote:

I think I found the problem.

There was a lone replica running in another DC. It was installed as 
a replica some time ago with all the others. Think of this -- the 
original config had 5 servers, one of them was this server. Then the 
other 4 servers were RE-BUILT from scratch, so all the replication 
agreements were changed AND - this is the important part - the 5th 
server was never added back in. BUT - the 5th server was left 
running and never told it that it was not a member anymore. It still 
thought it had a replication agreement with original "server 1", but 
server 1 knew otherwise.


Now, although the first 4 servers were rebuilt, the same domain, 
realm, AND passwords were used.


I am guessing that somehow, this 5th server keeps trying to 
interject its info into the ring of 4 servers, kind of forcing its 
way in. Somehow, because the original credentials still work (but 
the certs are all different), the first 4 servers are left with a 
"can't decode" issue.


There should be some security checks so this can't happen. It should 
also be easy to replicate.


Now I have to go re-initialize all the servers from a good server, 
so everyone is happy again. The "problem" server has been shutdown 
completely. (and yes, there were actually 3 of them in my scenario - 
I just used 1 to simplify my example - but that explains the 3 CSNs 
that just kept "appearing")


What concerns me most about this - were the servers outside of the 
"good ring" somehow able to inject data into replication which might 
have been causing bad data??? This is bad if it is true.

it depends a bit on what you mean by rebuilt from scratch.
A replication session needs to meet three conditions to be able to 
send data:
- the supplier side needs to be able to authenticate, and the 
authenticated user has to be in the list of binddns of the replica
- the data generation of the supplier and consumer side needs to be the 
same (they all have to have the same common origin)
- the supplier needs to have the changes (CSNs) to be able to 
position itself in its changelog to send updates


now if you have 5 servers, forget about one of them, and do not change 
the credentials in the others and do not reinitialize the database by 
an ldif import to generate a new database generation, the fifth 
server will still be able to connect and eventually send updates - 
how should the other servers know that this one is no longer a "good" 
one?


~Janelle



The only problem left now is that, no matter what, this last entry will 
NOT go away, and now I have 2 "stuck" cleanruvs that will not "abort" 
either.


unable to decode  {replica 24} 554d53d30018 554d54a400020018

CLEANALLRUV tasks
RID 24  None
No abort CLEANALLRUV tasks running
=

ldapmodify -D "cn=directory manager" -W -a

dn: cn=abort 24, cn=abort cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
cn: abort 24
replica-id: 24
replica-certify-all: no
adding new entry " cn=abort 24, cn=abort cleanallruv, cn=tasks, 
cn=config"

ldap_add: No such object (32)

in your dse.ldif do you see something like:

nsds5ReplicaCleanRUV: 300::no
in the replica object ?
This is where the task lives as long as it couldn't reach all servers 
for which a replication agreement exists.


If the abort task doesn't work, you could try to stop the server, remove 
these lines from the dse.ldif, and start the server again.
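
A sketch of that procedure - the instance name is a placeholder, and back up 
dse.ldif before touching it:

ipactl stop
cp /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif.bak
sed -i '/^nsds5ReplicaCleanRUV/d' /etc/dirsrv/slapd-EXAMPLE-COM/dse.ldif
ipactl start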





--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Migration error?

2015-06-16 Thread Ludwig Krispenz


On 06/16/2015 05:07 AM, Janelle wrote:

On 6/15/15 1:12 PM, Rob Crittenden wrote:

Janelle wrote:

On 6/15/15 6:36 AM, Rob Crittenden wrote:


Usually means there is a replication conflict entry. You may be able
to get more details on what failed by looking at the LDAP access log
of both LDAP servers, though I guess I'd expect this happened locally
on the IPA box.



Hi again,

I have been trying to follow this procedure for replication conflicts 
regarding "nsds5ReplConflict", where I had the two account duplicates, 
but no matter what, I still get:


modifying rdn of entry 
"nsuniqueid=ffc68a41-86e71c6-71714816-fcf248a0+uid=janelle,cn=users,cn=accounts,dc=example,dc=com"

ldap_rename: Constraint violation
additional info: Another entry with the same attribute value 
already exists (attribute: "uid")


This happens when I try to run the modrdn (ldapmodify) command, which simply 
refuses to work. I have been at it for over a week now with no luck.  
I think this is the last of my issues causing my replication problems. 
What caused this is that I have multiple helpdesk personnel who 
had been updating user accounts. That process has been resolved, but 
we can't seem to remove the last few duplicates.


Any suggestions? Is there a missing step in conflict resolution perhaps?
these entries are already a result of conflict resolution. If you add 
the same entry simultaneously on two servers (meaning add it on A and 
add it on B before B has received the replicated add from A), there 
exist two entries with the same dn, which is not possible. So conflict 
resolution does not arbitrarily throw one away, but renames it and 
leaves it to the admin which one to keep. So you should have one entry

uid=janelle,... and one nsuniqueid=+uid=janelle,
you can delete the nsuniqueid= entry to get rid of it.

There is a request to hide these nsuniqueid+uid entries from regular 
searches; it will be in a future release of 389


Ludwig


~J






--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] replication conflicts

2015-06-16 Thread Ludwig Krispenz


On 06/16/2015 11:42 AM, Alexander Frolushkin wrote:


Hello.

Just a reminder if somebody is still not familiar with our IPA installation :)

We currently have 18 IPA servers in the domain, on 8 sites in different 
regions across Russia.


And now, our new problem.

Regularly we are getting nsds5ReplConflict records on some of our 
servers, very often on servers from a specific site. Usually it is 
simply a double and we can remove the renamed entry to get 
everything back. But why do we have them at all?


Maybe someone could explain how we can detect the cause of these 
replication conflicts?



if you are talking about having two "duplicate" entries,
one: uid=x,
one: nsuniqueid=+uid=x,

these entries appear if the entry uid=x was added, simultaneously, 
on two servers. I think this can happen if a client tries to add an 
entry and, if it doesn't get a response in some time, retries on another 
server.
to find out which client this is you need to check on which servers the 
entries were originally added and then see which client was doing it


Sometimes it is moderately harmful because, for example, HBAC stops 
working on a specific server while the doubles are still present.


Thanks in forward...

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764








-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replication conflicts

2015-06-16 Thread Ludwig Krispenz


On 06/16/2015 12:44 PM, Alexander Frolushkin wrote:


It looks like our duplicates have some "internal" source; its source is 
not a client system, but one of our IPA servers.



to get this kind of conflict two servers have to be involved.
if you say internal source, what kind of entries are affected ? do you 
mean these entries are created internally on a server by a plugin ?


Is it possible to get such duplicate records from a combination of 
replication "multipath" and some clock skew (it is not ideally 
synchronized because of the very big distances between sites)?


the clock skew should have no effect; the replication protocol 
additionally manages its own time used in the generation of CSNs and tries to 
synchronize time. it could affect the order changes are applied during 
replication, but for these conflicts there have to be two independent ADDs


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

From: freeipa-users-boun...@redhat.com 
[mailto:freeipa-users-boun...@redhat.com] On Behalf Of Ludwig Krispenz
Sent: Tuesday, June 16, 2015 3:52 PM
To: freeipa-users@redhat.com
Subject: Re: [Freeipa-users] replication conflicts

On 06/16/2015 11:42 AM, Alexander Frolushkin wrote:

Hello.

Just a reminder if somebody is still not familiar with our IPA
installation :)

We currently have 18 IPA servers in the domain, on 8 sites in
different regions across Russia.

And now, our new problem.

Regularly we are getting nsds5ReplConflict records on some of our
servers, very often on servers from a specific site. Usually it is
simply a double and we can remove the renamed entry to get
everything back. But why do we have them at all?

Maybe someone could explain how we can detect the cause of these
replication conflicts?

if you are talking about having two "duplicate" entries,
one: uid=x,
one: nsuniqueid=+uid=x,

these entries appear if the entry uid=x was added, simultaneously, 
on two servers. I think this can happen if a client tries to add an 
entry and, if it doesn't get a response in some time, retries on another 
server.
to find out which client this is you need to check on which servers 
the entries were originally added and then see which client was doing it


Sometimes it is moderately harmful because, for example, HBAC stops 
working on a specific server while the doubles are still present.


Thanks in forward...

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764










Re: [Freeipa-users] Migration error?

2015-06-16 Thread Ludwig Krispenz


On 06/16/2015 02:08 PM, Janelle wrote:

On Jun 16, 2015, at 01:56, thierry bordaz  wrote:

On 06/16/2015 09:02 AM, Ludwig Krispenz wrote:


On 06/16/2015 05:07 AM, Janelle wrote:

On 6/15/15 1:12 PM, Rob Crittenden wrote:
Janelle wrote:

On 6/15/15 6:36 AM, Rob Crittenden wrote:

Usually means there is a replication conflict entry. You may be able
to get more details on what failed by looking at the LDAP access log
of both LDAP servers, though I guess I'd expect this happened locally
on the IPA box.

Hi again,

I have been trying to follow this procedure for replication conflicts regarding 
"nsds5ReplConflict", where I had the two account duplicates, but no matter 
what, I still get:

modifying rdn of entry 
"nsuniqueid=ffc68a41-86e71c6-71714816-fcf248a0+uid=janelle,cn=users,cn=accounts,dc=example,dc=com"
ldap_rename: Constraint violation
additional info: Another entry with the same attribute value already exists 
(attribute: "uid")

When I am trying to run the modrdn (ldapmodify) command?  Which simply refuses 
to work. I have been at it for over a week now with no luck.  I think this is 
the last of my issues causing my replication problems. What caused this is that 
I do have multiple helpdesk personnel that had been updating user accounts. 
This process has been resolved, but we can't seem to remove the last few 
duplicates.

Any suggestions? Is there a missing step in conflict resolution perhaps?

these entries are already a result of conflict resolution. If you add the same 
entry simultaneously on two servers (meaning add it on A and add it on B 
before B has received the replicated add from A), there exist two entries with 
the same dn, which is not possible. So conflict resolution does not arbitrarily 
throw one away, but renames it and leaves it to the admin which one to keep. So 
you should have one entry
uid=janelle,... and one nsuniqueid=+uid=janelle,

The error you get is coming from 'uid uniqueness'. Like Ludwig mentioned, there 
exist duplicated entries, both of them with 'uid=janelle'.
The 'uid uniqueness' plugin prevents you from doing a direct MODRDN on one of 
them because it finds a duplicated 'uid=janelle'.

you can delete the nsuniqueid= entry to get rid of it.

+1

thierry

There is a request to hide these nsuniqueid+uid entries from regular searches; 
it will be in a future release of 389

Ludwig

~J

--

But everything I try to delete fails.  Is there a procedure in 389-DS I can 
read for this? Maybe I am missing an option in ldapmodify? I am happy to 
delete, if only it would let me.

hm, it should be straightforward:
ldapmodify -D ..
dn: nsuniqueid=ffc68a41-86e71c6-71714816-fcf248a0+uid=janelle,cn=users,cn=accounts,dc=example,dc=com
changetype: delete

if it fails, what is the error you get ?


~J


--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Migration error?

2015-06-16 Thread Ludwig Krispenz


On 06/16/2015 03:54 PM, Janelle wrote:

Good morning,

Just a quick note. I hope that all my questions do not make anyone on 
the DEV Team think that I do not support FreeIPA wholly and 
completely. I am a huge fan of this package and have in fact discussed 
with several of my clients (I'm a consultant of course) who have 
purchased RH support contracts just because of this. The product is 
wonderful and has potential of being even better as you continue to 
add new features.  Thank you so much for all the support you have 
provided. I hope RH understands too that many new customers come from 
recommendations from us consultant-types :-)


Ok, so I just wanted to throw that in this thread -- a big THANK YOU 
to the IPA Team and all the work accomplished so far. You are the best!
thanks, and don't worry. we need people like you, consistently, 
patiently pushing us to resolve things. And believe me, the corrupted 
ruvs haunt me as much as they haunt you


Ludwig


~Janelle



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz
Hi, this is really strange; if these conflict entries get created they 
should be the same on all servers.


could you repeat the two searches requesting the attribute 
"nscpentrywsi" (you have to do it as directory manager, and add -o 
ldif-wrap=no)? It could give info on when and where these entries were created.
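
For example (a sketch only; adjust the base and filter to the entries in question):

ldapsearch -x -D "cn=directory manager" -W -o ldif-wrap=no \
  -b "cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru" \
  "(cn=System: Manage Host Keytab*)" nscpentrywsi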


Ludwig

On 06/17/2015 08:13 AM, Alexander Frolushkin wrote:


Hello.

Another example. Today it appeared on servers of a different site.

Original LDIF:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree


# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: top

objectClass: groupofnames

objectClass: ipapermissionv2

member: cn=Host Enrollment,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

member: cn=Host Administrators,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru


ipaPermDefaultAttr: krbprincipalkey

ipaPermDefaultAttr: krblastpwdchange

ipaPermLocation: cn=computers,cn=accounts,dc=unix,dc=megafon,dc=ru

# search result

search: 2

result: 0 Success

# numResponses: 2

# numEntries: 1

Duplicate:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab+nsuniqueid=708bba65-14a611e5-8a48fd19-df27ff01,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree


# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab + 708bba65-14a611e5-8a48fd19-df27ff01, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab+nsuniqueid=708bba65-14a611e5-8a48fd19-df27ff01,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: top

objectClass: groupofnames

objectClass: ipapermissionv2

member: cn=Host Enrollment,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

member: cn=Host Administrators,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru


ipaPermDefaultAttr: krbprincipalkey

ipaPermDefaultAttr: krblastpwdchange

ipaPermLocation: cn=computers,cn=accounts,dc=unix,dc=megafon,dc=ru

# search result

search: 2

result: 0 Success

# numResponses: 2

# numEntries: 1

No other servers in IPA domain have such duplicates.

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*freeipa-users-boun...@redhat.com 
[mailto:freeipa-users-boun...@redhat.com] *On Behalf Of *Ludwig Krispenz

*Sent:* Tuesday, June 16, 2015 3:52 PM
*To:* freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

On 06/16/2015 11:42 AM, Alexander Frolushkin wrote:

Hello.

Just a reminder, in case somebody is still not familiar with our IPA
installation :)

We currently have 18 IPA servers in the domain, on 8 sites in
different regions across Russia.

And now, our new problem.

We regularly get nsds5ReplConflict records on some of our
servers, very often on servers from a specific site. Usually it is
simply doubles, and we can remove the renamed entry to get
everything back. But why do we have them at all?

Maybe someone could explain how we can detect the cause of these
replication conflicts?

if you are talking about having two "duplicate" entries,
one: uid=x,
one: nsuniqueid=+uid=x,

these entries appear if the entry uid=x was added simultaneously 
on two servers. I think this can happen if a client tries to add an 
entry and, if it doesn't get a response in some time, retries on another 
server.
To find out which client this is you need to check on which servers 
the entries were originally added and then see which client was doing it.
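
A sketch of that lookup (log paths and the uid are illustrative):

# on each master, find the original (non-replicated) ADD of the entry
grep 'ADD dn="uid=x' /var/log/dirsrv/slapd-*/access*
# note the conn=NNN of the match, then see where that connection came from
grep 'conn=NNN fd=' /var/log/dirsrv/slapd-*/access*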


Sometimes it is moderately harmful because, for example, HBAC stops 
working on a specific server while the doubles are still present.


Thanks in advance...

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764





Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz

Hi,

you sent the data directly to me, maybe not wanting to share them with 
everyone. I'll continue the discussion here, trying to be careful.


The "good" entry was created in April on replica 12 "0x0c"
createTimestamp;vucsn-5524d42b0067000c: 20150408070720Z

the "nsuniqueid" entry was created today on replica 26 "0x1a"
createTimestamp;vucsn-5580f321001a: 20150617040801Z

If the original entry had existed on replica 26, the new add would 
have been rejected; if it was not there, the question is why.


Do you have any additional info on replica 26, when was it created, was 
it disconnected for some time ??


Ludwig

On 06/17/2015 08:13 AM, Alexander Frolushkin wrote:


Hello.

Another example. Today it appeared on servers of a different site.

Original LDIF:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree


# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: top

objectClass: groupofnames

objectClass: ipapermissionv2

member: cn=Host Enrollment,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

member: cn=Host Administrators,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru


ipaPermDefaultAttr: krbprincipalkey

ipaPermDefaultAttr: krblastpwdchange

ipaPermLocation: cn=computers,cn=accounts,dc=unix,dc=megafon,dc=ru

# search result

search: 2

result: 0 Success

# numResponses: 2

# numEntries: 1

Duplicate:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab+nsuniqueid=708bba65-14a611e5-8a48fd19-df27ff01,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree


# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab + 708bba65-14a611e5-8a48fd19-df27ff01, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab+nsuniqueid=708bba65-14a611e5-8a48fd19-df27ff01,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: top

objectClass: groupofnames

objectClass: ipapermissionv2

member: cn=Host Enrollment,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

member: cn=Host Administrators,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru


ipaPermDefaultAttr: krbprincipalkey

ipaPermDefaultAttr: krblastpwdchange

ipaPermLocation: cn=computers,cn=accounts,dc=unix,dc=megafon,dc=ru

# search result

search: 2

result: 0 Success

# numResponses: 2

# numEntries: 1

No other servers in IPA domain have such duplicates.

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*freeipa-users-boun...@redhat.com 
[mailto:freeipa-users-boun...@redhat.com] *On Behalf Of *Ludwig Krispenz

*Sent:* Tuesday, June 16, 2015 3:52 PM
*To:* freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

On 06/16/2015 11:42 AM, Alexander Frolushkin wrote:

Hello.

Just a reminder, in case somebody is still not familiar with our IPA
installation :)

We currently have 18 IPA servers in the domain, on 8 sites in
different regions across Russia.

And now, our new problem.

We regularly get nsds5ReplConflict records on some of our
servers, very often on servers from a specific site. Usually it is
simply doubles, and we can remove the renamed entry to get
everything back. But why do we have them at all?

Maybe someone could explain how we can detect the cause of these
replication conflicts?

if you are talking about having two "duplicate" entries,
one: uid=x,
one: nsuniqueid=+uid=x,

these entries appear if the entry uid=x was added simultaneously, 
on two servers. I think this can happen if a client tries to add an 
entry and, if it doesn't get a response in some time, retries on another 
server.
To find out which client this is you need to check on which servers 
the entries were originally added and then see which client was doing it.


Sometimes it is moderately harmful because, for example, HBAC stops 
working on a specific server while the doubles are still present.


Thanks in advance...

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764





Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz


On 06/17/2015 11:03 AM, Alexander Frolushkin wrote:


This is correct, thank you for understanding and for helping!

Replica with id 26 was created today; this is our new server, which was 
included in the domain just a few hours ago. It looks like this dup came 
right after the new replica creation.



so on which servers does the "nsuniqueid" entry exist ?

can you check for 5580f321001a in the access log of replica 26, 
then check the error log around this time and eventually the replica 
install log?
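
For example (log path is illustrative):

grep 5580f321001a /var/log/dirsrv/slapd-*/access*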


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*Ludwig Krispenz [mailto:lkris...@redhat.com]
*Sent:* Wednesday, June 17, 2015 2:58 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

Hi,

you did send the data directly to me, maybe not wanting to share them 
to everyone. I'll continue discussion here, trying to be careful.


The "good" entry was created in April on replica 12 "0x0c"
createTimestamp;vucsn-5524d42b0067000c: 20150408070720Z

the "nsuniqueid" entry was created today on replica 26 "0x1a"
createTimestamp;vucsn-5580f321001a: 20150617040801Z

If the original entry had existed on replica 26, the new add 
would have been rejected; if it was not there, the question is why.


Do you have any additional info on replica 26, when was it created, 
was it disconnected for some time ??


Ludwig

On 06/17/2015 08:13 AM, Alexander Frolushkin wrote:

Hello.

Another example. Today appeared on servers of different site.

Original LDIF:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree

# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: top

objectClass: groupofnames

objectClass: ipapermissionv2

member: cn=Host
Enrollment,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

member: cn=Host
Administrators,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermDefaultAttr: krbprincipalkey

ipaPermDefaultAttr: krblastpwdchange

ipaPermLocation: cn=computers,cn=accounts,dc=unix,dc=megafon,dc=ru

# search result

search: 2

result: 0 Success

# numResponses: 2

# numEntries: 1

Duplicate:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab+nsuniqueid=708bba65-14a611e5-8a48fd19-df27ff01,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree

# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab + 708bba65-14a611e5-8a48fd19-df27ff01, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab+nsuniqueid=708bba65-14a611e5-8a48fd19-df27ff01,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: top

objectClass: groupofnames

objectClass: ipapermissionv2

member: cn=Host
Enrollment,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

member: cn=Host
Administrators,cn=privileges,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermDefaultAttr: krbprincipalkey

ipaPermDefaultAttr: krblastpwdchange

ipaPermLocation: cn=computers,cn=accounts,dc=unix,dc=megafon,dc=ru

# search result

search: 2

result: 0 Success

# numResponses: 2

# numEntries: 1

No other servers in IPA domain have such duplicates.

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*freeipa-users-boun...@redhat.com
<mailto:freeipa-users-boun...@redhat.com>
[mailto:freeipa-users-boun...@redhat.com] *On Behalf Of *Ludwig
Krispenz
*Sent:* Tuesday, June 16, 2015 3:52 PM
*To:* freeipa-users@redhat.com <mailto:freeipa-users@redhat.com>
*Subject:* Re: [Freeipa-users] replication conflicts

On 06/16/2015 11:42 AM, Alexander Frolushkin wrote:

Hello.

Just a reminder, in case somebody is still not familiar with our IPA
installation :)

We currently have 18 IPA servers in the domain, on 8 sites in
different regions across Russia.

And now, our new problem.

We regularly get nsds5ReplConflict records on some of
our servers, very often on servers from a specific site. Usually
it is simply doubles, and we can remove the renamed entry to
get everything back. But why do we have them a

Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz


On 06/17/2015 11:45 AM, thierry bordaz wrote:


On 06/17/2015 11:22 AM, Alexander Frolushkin wrote:


This was a usual "ipa-replica-install --setup-ca --setup-dns" and 
after that ipa-adtrust-install.


No DEL found:

# grep "cn=System: Manage Host 
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru" ./access


[17/Jun/2015:10:08:01 +0600] conn=2 op=89 SRCH base="cn=System: 
Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru" 
scope=0 filter="(objectClass=*)" attrs="ipaPermRight 
ipaPermTargetFilter ipaPermBindRuleType ipaPermissionType cn 
objectClass memberOf member ipaPermTarget ipaPermDefaultAttr 
ipaPermLocation ipaPermIncludedAttr ipaPermExcludedAttr"


[17/Jun/2015:10:08:01 +0600] conn=2 op=91 ADD dn="cn=System: Manage 
Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"




There is something I am missing. conn=2 op=91 was a direct update on 
replica 26 (not replicated) because it received its own 
CSN=5580f321001a. But it created a conflict entry, so at that 
time the same entry already existed (the one created 20150408070720Z). So 
the direct update should have been rejected.
I think the search in op=89 did not return an entry, so it was added in 
op=91; that seems to be ok, but then 4 hrs later there is conn=237 
adding it again.


Alexander,

could you get the complete 'conn=237 op=93' and also the start of conn 
293, to show where the connection comes from?


Would you check if the replicaID=26 is unique in the topology 
(list-ruv for example) ?
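
For example:

ipa-replica-manage list-ruv
# or directly against each master:
ldapsearch -x -D "cn=directory manager" -W -b "cn=config" \
  "(objectclass=nsds5replica)" nsDS5ReplicaId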


[17/Jun/2015:14:39:46 +0600] conn=237 op=93 ADD dn="cn=System: Manage 
Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"


It is also possible that this entry on the affected servers was previously 
duplicated and the deletion was not managed correctly (the more recent dup 
was deleted).


Is there any natural way to fix such issues? Maybe ipa-replica-manage 
force-sync, or ipa-replica-manage re-initialize on affected site 
servers from normal servers could help?


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*thierry bordaz [mailto:tbor...@redhat.com]
*Sent:* Wednesday, June 17, 2015 3:15 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* 'Ludwig Krispenz'; freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

Hello Alexander,

How did you initialize that new replica 26?
Either 'cn=System: Manage Host 
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru' was not part 
of the total init data, or a DEL of that entry happened on replica 26 
(before a new ADD) but the DEL was not replicated to replica 12.

Would you check in replica26 access logs if that entry was deleted ?

thanks
thierry

On 06/17/2015 11:03 AM, Alexander Frolushkin wrote:

This is correct, thank you for understanding and for helping!

Replica with id 26 was created today, this is our new server
which was included in domain just a few hours ago. Looks like
this dup came right after this new replica creation.

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*Ludwig Krispenz [mailto:lkris...@redhat.com]
*Sent:* Wednesday, June 17, 2015 2:58 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* freeipa-users@redhat.com <mailto:freeipa-users@redhat.com>
*Subject:* Re: [Freeipa-users] replication conflicts

Hi,

you sent the data directly to me, maybe not wanting to share
them with everyone. I'll continue the discussion here, trying to be
careful.

The "good" entry was created in April on replica 12 "0x0c"
createTimestamp;vucsn-5524d42b0067000c: 20150408070720Z

the "nsuniqueid" entry was created today on replica 26 "0x1a"
createTimestamp;vucsn-5580f321001a: 20150617040801Z

If the original entry had existed on replica 26, the new add
would have been rejected; if it was not there, the question is why.

Do you have any additional info on replica 26, when was it
created, was it disconnected for some time ??

Ludwig

On 06/17/2015 08:13 AM, Alexander Frolushkin wrote:

Hello.

Another example. Today appeared on servers of different site.

Original LDIF:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree

# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Manage Host Keytab

objectClass: ipapermission

objectClass: to

Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz


On 06/17/2015 11:52 AM, Ludwig Krispenz wrote:


On 06/17/2015 11:45 AM, thierry bordaz wrote:


On 06/17/2015 11:22 AM, Alexander Frolushkin wrote:


This was a usual "ipa-replica-install --setup-ca --setup-dns" and 
after that ipa-adtrust-install.


No DEL found:

# grep "cn=System: Manage Host 
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru" ./access


[17/Jun/2015:10:08:01 +0600] conn=2 op=89 SRCH base="cn=System: 
Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru" 
scope=0 filter="(objectClass=*)" attrs="ipaPermRight 
ipaPermTargetFilter ipaPermBindRuleType ipaPermissionType cn 
objectClass memberOf member ipaPermTarget ipaPermDefaultAttr 
ipaPermLocation ipaPermIncludedAttr ipaPermExcludedAttr"


[17/Jun/2015:10:08:01 +0600] conn=2 op=91 ADD dn="cn=System: Manage 
Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"




There is something I am missing. conn=2 op=91 was a direct update on 
replica 26 (not replicated) because it received its own 
CSN=5580f321001a. But it created a conflict entry, so at that 
time the same entry already existed (the one created 20150408070720Z). So 
the direct update should have been rejected.
I think the search in op=89 did not return an entry, so it was added 
in op=91; that seems to be ok, but then 4 hrs later there is conn=237 
adding it again.


Alexander,

could you get the complete 'conn=237 op=93' and also the start of conn 
293, to show where the connection comes from?

of course, I meant conn=237


Would you check if the replicaID=26 is unique in the topology 
(list-ruv for example) ?


[17/Jun/2015:14:39:46 +0600] conn=237 op=93 ADD dn="cn=System: 
Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"


It is also possible that this entry on the affected servers was previously 
duplicated and the deletion was not managed correctly (the more recent 
dup was deleted).


Is there any natural way to fix such issues? Maybe 
ipa-replica-manage force-sync, or ipa-replica-manage re-initialize 
on affected site servers from normal servers could help?


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*thierry bordaz [mailto:tbor...@redhat.com]
*Sent:* Wednesday, June 17, 2015 3:15 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* 'Ludwig Krispenz'; freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

Hello Alexander,

How did you initialize that new replica 26?
Either 'cn=System: Manage Host 
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru' was not part 
of the total init data, or a DEL of that entry happened on replica 
26 (before a new ADD) but the DEL was not replicated to replica 12.

Would you check in replica26 access logs if that entry was deleted ?

thanks
thierry

On 06/17/2015 11:03 AM, Alexander Frolushkin wrote:

This is correct, thank you for understanding and for helping!

Replica with id 26 was created today, this is our new server
which was included in domain just a few hours ago. Looks like
this dup came right after this new replica creation.

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*Ludwig Krispenz [mailto:lkris...@redhat.com]
*Sent:* Wednesday, June 17, 2015 2:58 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* freeipa-users@redhat.com <mailto:freeipa-users@redhat.com>
*Subject:* Re: [Freeipa-users] replication conflicts

Hi,

you sent the data directly to me, maybe not wanting to share
them with everyone. I'll continue the discussion here, trying to be
careful.

The "good" entry was created in April on replica 12 "0x0c"
createTimestamp;vucsn-5524d42b0067000c: 20150408070720Z

the "nsuniqueid" entry was created today on replica 26 "0x1a"
createTimestamp;vucsn-5580f321001a: 20150617040801Z

If the original entry had existed on replica 26, the new
add would have been rejected; if it was not there, the question
is why.

Do you have any additional info on replica 26, when was it
created, was it disconnected for some time ??

Ludwig

On 06/17/2015 08:13 AM, Alexander Frolushkin wrote:

Hello.

Another example. Today appeared on servers of different site.

Original LDIF:

# extended LDIF

#

# LDAPv3

# base <cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru> with scope subtree

# filter: (objectclass=*)

# requesting: ALL

#

# System: Manage Host Keytab, permissions, pbac, unix.megafon.ru

dn: cn=System: Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru

ipaPermTargetFilter: (objectclass=ipahost)

ipaPermRight: write

ipaPermBindRuleType: permission

ipaPermissionType: V2

ipaPermissionType: MANAGED

ipaPermissionType: SYSTEM

cn: System: Ma

Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz

conn=237 is from 10.99.75.82; which replica is this?

On 06/17/2015 12:13 PM, Alexander Frolushkin wrote:


This is not good news, because replica id 20 has not existed for some 
days already. It was recreated and now has id 23


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*thierry bordaz [mailto:tbor...@redhat.com]
*Sent:* Wednesday, June 17, 2015 4:10 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* 'Ludwig Krispenz'; freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

On 06/17/2015 11:56 AM, Alexander Frolushkin wrote:

Will this be enough?

# grep "conn=237 op=93" ./access

[17/Jun/2015:14:39:46 +0600] conn=237 op=93 ADD dn="cn=System:
Manage Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"

[17/Jun/2015:14:39:46 +0600] conn=237 op=93 RESULT err=0 tag=105
nentries=0 etime=0 csn=555ac9360014

This operation is a replicated one and the CSN is from May 19th. So 
why was a replica (26) created today initialized without that entry?
This update originated from replica 20. Was it stopped and 
restarted recently?



# grep "conn=293" ./access

[17/Jun/2015:15:33:04 +0600] conn=293 fd=75 slot=75 connection from 
10.99.75.82 to 10.61.8.2


[17/Jun/2015:15:33:04 +0600] conn=293 op=0 BIND dn="" method=sasl 
version=3 mech=GSSAPI


[17/Jun/2015:15:33:04 +0600] conn=293 op=0 RESULT err=14 tag=97 
nentries=0 etime=0, SASL bind in progress


[17/Jun/2015:15:33:04 +0600] conn=293 op=1 BIND dn="" method=sasl 
version=3 mech=GSSAPI


[17/Jun/2015:15:33:04 +0600] conn=293 op=1 RESULT err=14 tag=97 
nentries=0 etime=0, SASL bind in progress


[17/Jun/2015:15:33:04 +0600] conn=293 op=2 BIND dn="" method=sasl 
version=3 mech=GSSAPI


[17/Jun/2015:15:33:04 +0600] conn=293 op=2 RESULT err=0 tag=97 
nentries=0 etime=0 
dn="krbprincipalname=ldap/msk-rhidm-03.unix.megafon...@unix.megafon.ru,cn=services,cn=accounts,dc=unix,dc=megafon,dc=ru" 
<mailto:krbprincipalname=ldap/msk-rhidm-03.unix.megafon...@unix.megafon.ru,cn=services,cn=accounts,dc=unix,dc=megafon,dc=ru>


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*Ludwig Krispenz [mailto:lkris...@redhat.com]
*Sent:* Wednesday, June 17, 2015 3:53 PM
*To:* thierry bordaz
*Cc:* Alexander Frolushkin (SIB); freeipa-users@redhat.com 
<mailto:freeipa-users@redhat.com>

*Subject:* Re: [Freeipa-users] replication conflicts

On 06/17/2015 11:45 AM, thierry bordaz wrote:


On 06/17/2015 11:22 AM, Alexander Frolushkin wrote:

This was a usual "ipa-replica-install --setup-ca --setup-dns"
and after that ipa-adtrust-install.

No DEL found:

# grep "cn=System: Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru" ./access

[17/Jun/2015:10:08:01 +0600] conn=2 op=89 SRCH
base="cn=System: Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"
scope=0 filter="(objectClass=*)" attrs="ipaPermRight
ipaPermTargetFilter ipaPermBindRuleType ipaPermissionType cn
objectClass memberOf member ipaPermTarget ipaPermDefaultAttr
ipaPermLocation ipaPermIncludedAttr ipaPermExcludedAttr"

[17/Jun/2015:10:08:01 +0600] conn=2 op=91 ADD dn="cn=System:
Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"


There is something I am missing. conn=2 op=91 was a direct update on
replica 26 (not replicated) because it received its own
CSN=5580f321001a. But it created a conflict entry, so at
that time the same entry already existed (the one created
20150408070720Z). So the direct update should have been rejected.

I think the search in op=89 did not return an entry, so it was added 
in op=91; that seems to be ok, but then 4 hrs later there is conn=237 
adding it again.


Alexander,

could you get the complete 'conn=237 op=93' and also the start of conn 
293, to show where the connection comes from?




Would you check if the replicaID=26 is unique in the topology 
(list-ruv for example) ?




[17/Jun/2015:14:39:46 +0600] conn=237 op=93 ADD dn="cn=System: Manage 
Host Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"


It is also possible that this entry on the affected servers was previously 
duplicated and the deletion was not managed correctly (the more recent 
dup was deleted).


Is there any natural way to fix such issues? Maybe ipa-replica-manage 
force-sync, or ipa-replica-manage re-initialize on affected site 
servers from normal servers could help?


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*thierry bordaz [mailto:tbor...@redhat.com]
*Sent:* Wednesday, June 17, 2015 3:15 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* 'Ludwig Krispenz'; freeipa-users@redhat.com 
<mailto:freeipa-users@redhat.com>

*Subject:* Re: [Freeipa

Re: [Freeipa-users] replication conflicts

2015-06-17 Thread Ludwig Krispenz


On 06/17/2015 12:57 PM, Alexander Frolushkin wrote:


Unfortunately, the number of duplicates is growing dramatically on most 
sites. Some servers already have over 40 duplicates.


Could you please say whether I may use re-initialize on the failing replica 
from the good one to fix this?


If you have a good one, this should work; "dups" are only created when a 
replicated ADD is received for an existing entry.
But what really puzzles me is that you do not have them on all servers. 
Something weird seems to happen: this entry seems to exist with several 
replica ids, and why would replica 10 replicate this 4 hrs after the 
replica installation?
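
A sketch of the re-initialize (run on the broken replica; the hostname is illustrative):

ipa-replica-manage re-initialize --from good-master.unix.megafon.ru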


WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*Ludwig Krispenz [mailto:lkris...@redhat.com]
*Sent:* Wednesday, June 17, 2015 4:35 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* 'thierry bordaz'; freeipa-users@redhat.com
*Subject:* Re: [Freeipa-users] replication conflicts

conn=237 is from 10.99.75.82; which replica is this?

On 06/17/2015 12:13 PM, Alexander Frolushkin wrote:

This is not good news, because replica id 20 has not existed for
some days already. It was recreated and now has id 23

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*thierry bordaz [mailto:tbor...@redhat.com]
*Sent:* Wednesday, June 17, 2015 4:10 PM
*To:* Alexander Frolushkin (SIB)
*Cc:* 'Ludwig Krispenz'; freeipa-users@redhat.com
<mailto:freeipa-users@redhat.com>
*Subject:* Re: [Freeipa-users] replication conflicts

On 06/17/2015 11:56 AM, Alexander Frolushkin wrote:

Will this be enough?

# grep "conn=237 op=93" ./access

[17/Jun/2015:14:39:46 +0600] conn=237 op=93 ADD dn="cn=System:
Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"

[17/Jun/2015:14:39:46 +0600] conn=237 op=93 RESULT err=0
tag=105 nentries=0 etime=0 csn=555ac9360014

This operation is a replicated one and the CSN is from May 19th.
So why was a replica (26) created today initialized without that
entry?
This update originated from replica 20. Was it stopped and
restarted recently?



# grep "conn=293" ./access

[17/Jun/2015:15:33:04 +0600] conn=293 fd=75 slot=75 connection
from 10.99.75.82 to 10.61.8.2

[17/Jun/2015:15:33:04 +0600] conn=293 op=0 BIND dn="" method=sasl
version=3 mech=GSSAPI

[17/Jun/2015:15:33:04 +0600] conn=293 op=0 RESULT err=14 tag=97
nentries=0 etime=0, SASL bind in progress

[17/Jun/2015:15:33:04 +0600] conn=293 op=1 BIND dn="" method=sasl
version=3 mech=GSSAPI

[17/Jun/2015:15:33:04 +0600] conn=293 op=1 RESULT err=14 tag=97
nentries=0 etime=0, SASL bind in progress

[17/Jun/2015:15:33:04 +0600] conn=293 op=2 BIND dn="" method=sasl
version=3 mech=GSSAPI

[17/Jun/2015:15:33:04 +0600] conn=293 op=2 RESULT err=0 tag=97
nentries=0 etime=0

dn="krbprincipalname=ldap/msk-rhidm-03.unix.megafon...@unix.megafon.ru,cn=services,cn=accounts,dc=unix,dc=megafon,dc=ru"

<mailto:krbprincipalname=ldap/msk-rhidm-03.unix.megafon...@unix.megafon.ru,cn=services,cn=accounts,dc=unix,dc=megafon,dc=ru>

WBR,

Alexander Frolushkin

Cell +79232508764

Work +79232507764

*From:*Ludwig Krispenz [mailto:lkris...@redhat.com]
*Sent:* Wednesday, June 17, 2015 3:53 PM
*To:* thierry bordaz
*Cc:* Alexander Frolushkin (SIB); freeipa-users@redhat.com
<mailto:freeipa-users@redhat.com>
*Subject:* Re: [Freeipa-users] replication conflicts

On 06/17/2015 11:45 AM, thierry bordaz wrote:


On 06/17/2015 11:22 AM, Alexander Frolushkin wrote:

This was a usual "ipa-replica-install --setup-ca
--setup-dns" and after that ipa-adtrust-install.

No DEL found:

# grep "cn=System: Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"
./access

[17/Jun/2015:10:08:01 +0600] conn=2 op=89 SRCH
base="cn=System: Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"
scope=0 filter="(objectClass=*)" attrs="ipaPermRight
ipaPermTargetFilter ipaPermBindRuleType ipaPermissionType
cn objectClass memberOf member ipaPermTarget
ipaPermDefaultAttr ipaPermLocation ipaPermIncludedAttr
ipaPermExcludedAttr"

[17/Jun/2015:10:08:01 +0600] conn=2 op=91 ADD
dn="cn=System: Manage Host
Keytab,cn=permissions,cn=pbac,dc=unix,dc=megafon,dc=ru"


There is something I miss. conn=2 op=91 was a direct update on
replica26 (not replicated) because it received its own
CSN=5580f321001a. But it created a conflict entry, so
at that time it exis

Re: [Freeipa-users] WG: Re: Haunted servers?

2015-06-19 Thread Ludwig Krispenz

Hi Christoph,

bad news. So to summarize, you have a procedure to clean up your env, but 
once you restart the master the ghosts are back.


I really want to find out where they are coming from, so if you have to 
restart your server, could you please look up the following data after the 
server is stopped:


 dbscan -f /var/lib/dirsrv/slapd-<instance>/db/userRoot/nsuniqueid.db -k =ffffffff-ffffffff-ffffffff-ffffffff -r

=ffffffff-ffffffff-ffffffff-ffffffff
3
this gives you the id of the RUV entry and you can look it up in the database
[root@elkris scripts]# dbscan -f /var/lib/dirsrv/slapd-<instance>/db/userRoot/id2entry.db -K 3
id 3
rdn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff
nsUniqueId: ffffffff-ffffffff-ffffffff-ffffffff
objectClass: top
objectClass: nsTombstone
objectClass: extensibleobject
nsds50ruv: {replicageneration} 51dc3bac0064
nsds50ruv: {replica 100 ldap://localhost:30522} 557fd5410064 557fd9d30064
nsds50ruv: {replica 200 ldap://localhost:4945} 557fd6e600c8 557fda0e00c8

..

then check the contents of the changelog:
[root@elkris scripts]# dbscan -f /var/lib/dirsrv/slapd-<instance>/changelogdb/ec450682-7c0a11e2-aa0e8005-8430f734_51dc3bac0064.db | more


the first entries contain the ruv data:
dbid: 006f
entry count: 307

dbid: 00de
purge ruv:
{replicageneration} 51dc3bac0064
{replica 100 ldap://localhost:30522}
{replica 200 ldap://localhost:30522}

dbid: 014d
max ruv:
{replicageneration} 51dc3bac0064
{replica 100} 557fd5410064 557fd9d30064
{replica 200} 557fd6e600c8 557fda0e00c8



On 06/12/2015 07:38 AM, Christoph Kaminski wrote:
I was pleased too early :/ After an ipactl restart of our first 
master (the one we re-initialize from), the 'ghost' rids are there again...


I think there is something like a fs backup for dirsrv (the changelog?), 
but where?


>
> we had the same problem (and some more) and yesterday we have
> successfully cleaned the gohst rid's
>
> our fix:
>
> 1. stop all cleanallruv tasks, if it works, with ipa-replica-manage
> abort-clean-ruv. It hasn't worked here. We have done it manually on
> ALL replicas with:
>  a) replica stop
>  b) delete all nsds5ReplicaClean from /etc/dirsrv/slapd-HSO/dse.ldif
>  c) replica start
>
> 2. prepare on EACH ipa a cleanruv ldif file with ALL ghost rids
> inside (really ALL from all ipa replicas, we have had some rids only
> on some replicas...)
> Example:
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV11
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV22
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV37
> ...
>
> 3. do a "ldapmodify -h 127.0.0.1 -D "cn=Directory Manager" -W -x -f
> $your-cleanruv-file.ldif" on all replicas AT THE SAME TIME :) we
> used terminator  for it (https://launchpad.net/terminator). You can
> open multiple shell windows inside one window and send to all at the
> same time the same commands...
>
> 4. we have done a re-initialize of each IPA from our first master
>
> 5. restart of all replicas
>
> we are not sure about the point 3 and 4. Maybe they are not
> necessary, but we have done it.
>
> If something fails, look for defective LDAP entries in the whole ldap; we
> have had some entries with '+nsuniqueid=$HASH' after the 'normal' name.
> We have deleted them.
>
> MfG
> Christoph Kaminski
>
>







-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Antwort: Re: WG: Re: Haunted servers?

2015-06-19 Thread Ludwig Krispenz


On 06/19/2015 11:48 AM, Christoph Kaminski wrote:

freeipa-users-boun...@redhat.com schrieb am 19.06.2015 11:34:21:

> Von: Ludwig Krispenz 
> An: freeipa-users@redhat.com
> Datum: 19.06.2015 11:35
> Betreff: Re: [Freeipa-users] WG: Re:  Haunted servers?
> Gesendet von: freeipa-users-boun...@redhat.com
>
> Hi Christoph,
>
> bad news. So to summarize, you have a procedure to clean up your env,
> but once you restart the master the ghosts are back.
>
> I really want to find out where they are coming from, so if you have
> to restart your server, could you please look up the following data after
> the server is stopped:
>
>  dbscan -f /var/lib/dirsrv/slapd-<instance>/db/userRoot/nsuniqueid.db -k =ffffffff-ffffffff-ffffffff-ffffffff -r
> =ffffffff-ffffffff-ffffffff-ffffffff
> 3
> this gives you the id of the RUV entry and you can look it up in the database
> [root@elkris scripts]# dbscan -f /var/lib/dirsrv/slapd-<instance>/db/userRoot/id2entry.db -K 3
> id 3
> rdn: nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff
> nsUniqueId: ffffffff-ffffffff-ffffffff-ffffffff
> objectClass: top
> objectClass: nsTombstone
> objectClass: extensibleobject
> nsds50ruv: {replicageneration} 51dc3bac0064
> nsds50ruv: {replica 100 ldap://localhost:30522} 557fd5410064 557fd9d30064
> nsds50ruv: {replica 200 ldap://localhost:4945} 557fd6e600c8 557fda0e00c8
> ..
>
> then check the contents of the changelog:
> [root@elkris scripts]# dbscan -f /var/lib/dirsrv/slapd-/
> 
changelogdb/ec450682-7c0a11e2-aa0e8005-8430f734_51dc3bac0064.db | 
more

>
> the first entries contain the ruv data:
> dbid: 006f
> entry count: 307
>
> dbid: 00de
> purge ruv:
> {replicageneration} 51dc3bac0064
> {replica 100 ldap://localhost:30522}
> {replica 200 ldap://localhost:30522}
>
> dbid: 014d
> max ruv:
> {replicageneration} 51dc3bac0064
> {replica 100} 557fd5410064 557fd9d30064
> {replica 200} 557fd6e600c8 557fda0e00c8
>
>

meanwhile we have found another place which can be a reason for this 
problem... see the ldapsearch result at the end of this post (2 
ldapsearch outputs; in both there are dead entries)
in the second search I don't see nsds50ruv attributes for the dead entries, 
so the database ruv seems to be ok.


The first search is for the replication agreements; they keep info 
about the consumer ruv, used in the replication session. You cannot modify 
these online, but they are maintained in the dse.ldif, so you could edit the 
dse.ldif when the server is stopped.
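
A rough outline of that offline edit (instance name as used elsewhere in this thread; which nsds50ruv values to remove depends on your data):

ipactl stop
# in /etc/dirsrv/slapd-HSO/dse.ldif, remove the stale nsds50ruv values
# from the affected nsds5replicationagreement entries, then:
ipactl start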


Info:

we have only this IPA Hosts:

ipa-2.mgmt.hss.int:389: 44
ipa-1.mgmt.testsystem-homemonitoring.int:389: 45
ipa-1.mgmt.biotronik-homemonitoring.int:389: 35
ipa-1.mgmt.hss.int:389: 38
ipa-1.mgmt.datacenter-homemonitoring.int:389: 40

Please pay attention to the rids. We have used the same names for the new 
install of ipa. There are a lot of ghost/dead entries with the same 
name but another (smaller) rid!


The problem is, how can we delete them? A simple delete with an ldap 
browser doesn't work (server is unwilling to perform)


1. ldapsearch output:

ldapsearch -LLL -o ldif-wrap=no -h localhost -p 389 -x -D 
"cn=directory manager" -W -b "cn=config" 
"objectclass=nsds5replicationagreement"


dn: cn=meToipa-1.mgmt.datacenter-homemonitoring.int,cn=replica,cn=dc\3Dhso,cn=mapping tree,cn=config

cn: meToipa-1.mgmt.datacenter-homemonitoring.int
objectClass: nsds5replicationagreement
objectClass: top
nsDS5ReplicaTransportInfo: LDAP
description: me to ipa-1.mgmt.datacenter-homemonitoring.int
nsDS5ReplicaRoot: dc=hso
nsDS5ReplicaHost: ipa-1.mgmt.datacenter-homemonitoring.int
nsds5replicaTimeout: 120
nsDS5ReplicaPort: 389
nsDS5ReplicaBindMethod: SASL/GSSAPI
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE memberof 
idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth 
krbloginfailedcount
nsds5ReplicaStripAttrs: modifiersName modifyTimestamp 
internalModifiersName internalModifyTimestamp
nsDS5ReplicatedAttributeListTotal: (objectclass=*) $ EXCLUDE entryusn 
krblastsuccessfulauth krblastfailedauth krbloginfailedcount

nsds5ReplicaEnabled: on
nsds50ruv: {replicageneration} 548eae680004
nsds50ruv: {replica 40 
ldap://ipa-1.mgmt.datacenter-homemonitoring.int:389} 
5528dda50028 558283390028
nsds50ruv: {replica 35 
ldap://ipa-1.mgmt.biotronik-homemonitoring.int:389} 
5526258800040023 55827e0500020023
nsds50ruv: {replica 41 
ldap://ipa-1.mgmt.testsystem-homemonitoring.int:389} 
554092ff00050029 558275a700100029
nsds50ruv: {replica 33 
ldap://ipa-2.mgmt.biotronik-homemonitoring.int:389} 
552568220021 5582727d00080021
nsds50ruv: 

Re: [Freeipa-users] Antwort: Re: Antwort: Re: WG: Re: Haunted servers?

2015-06-19 Thread Ludwig Krispenz

Hi,
On 06/19/2015 12:32 PM, Christoph Kaminski wrote:

> in the second search I don't see nsds50ruv attributes for dead
> entries, so the database ruv seems to be ok.

these are dead:

nscpentrywsi: nsDS5ReplicaBindDN: 
krbprincipalname=ldap/ipa-2.mgmt.biotronik-h

omemonitoring.int@HSO,cn=services,cn=accounts,dc=hso
nscpentrywsi: nsDS5ReplicaBindDN: 
krbprincipalname=ldap/ipa-2.mgmt.testsystem-

homemonitoring.int@HSO,cn=services,cn=accounts,dc=hso
nscpentrywsi: nsDS5ReplicaBindDN: 
krbprincipalname=ldap/ipa-2.mgmt.datacenter-

homemonitoring.int@HSO,cn=services,cn=accounts,dc=hso
but these are bind dns; ipa adds them when creating a new replica to be 
able to establish gssapi replication. I don't know if and when they 
are removed; they are definitely not in the scope of the cleanallruv task
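
For reference, the bind dns currently configured on the replica entry can be listed like this (base dn as in this thread):

ldapsearch -x -D "cn=directory manager" -W -o ldif-wrap=no \
  -b "cn=replica,cn=dc\3Dhso,cn=mapping tree,cn=config" nsDS5ReplicaBindDN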


> the first search is for the replication agreements and they keep
> info about the consumer ruv, used in replication session. you cannot
> modify these, but they are maintained in the dse.ldif, you could
> edit the dse.ldif when the server is stopped.

big thx, we try it and I let you know if it works!




-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Antwort: Re: Antwort: Re: Antwort: Re: WG: Re: Haunted servers?

2015-06-19 Thread Ludwig Krispenz
From an earlier post it looks like they are from the o=ipaca backend; 
did you clean the ruvs there?


to know which are the correct current rids for this backend you could do 
on each active server a search for
... -b "cn=config" 
"(&(objectclass=nsds5replica)(nsDS5ReplicaRoot=o=ipaca))" nsDS5ReplicaId


then you could search

ldapsearch -h <server> -D "cn=Directory Manager" -W -b "o=ipaca" 
"(&(objectclass=nstombstone)(nsUniqueId=ffffffff-ffffffff-ffffffff-ffffffff))"

to see what you have in the ruv and eventually clean them

On 06/19/2015 01:48 PM, Christoph Kaminski wrote:

Ludwig Krispenz  schrieb am 19.06.2015 13:23:43:

>
> > the first search is for the replication agreements and they keep
> > info about the consumer ruv, used in replication session. you cannot
> > modify these, but they are maintained in the dse.ldif, you could
> > edit the dse.ldif when the server is stopped.
>
> big thx, we try it and I let you know if it works!
>

one thing I still don't understand:

nsds50ruv: {replica 1595 ldap://ipa-2.mgmt.biotronik-homemonitoring.int:389} 55256878063b 5526284d063b
nsds50ruv: {replica 1690 ldap://ipa-1.mgmt.biotronik-homemonitoring.int:389} 552625f2069a 55683c26069a
nsds50ruv: {replica 1695 ldap://ipa-2.mgmt.testsystem-homemonitoring.int:389} 55256ba4069f 555b58ce069f
nsds50ruv: {replica 1490 ldap://ipa-2.mgmt.hss.int:389} 5525600805d2 5525686e000205d2
nsds50ruv: {replica 1395 ldap://ipa-2.mgmt.datacenter-homemonitoring.int:389} 54eca6ed0573 55255ff900010573
nsds50ruv: {replica 1495 ldap://ipa-2.mgmt.hss.int:389} 5523c7f105d7 55250ae205d7
nsds50ruv: {replica 1295 ldap://ipa-2.mgmt.biotronik-homemonitoring.int:389} 54d350ab050f 552512a80001050f
nsds50ruv: {replica 1195 ldap://ipa-2.mgmt.hss.int:389} 54bd18cd04ab 551cd0b3000404ab
nsds50ruv: {replica 97 ldap://ipa-1.mgmt.testsystem-homemonitoring.int:389} 548eaeb80061 551a885400050061
nsds50ruv: {replica 96 ldap://ipa-1.mgmt.biotronik-homemonitoring.int:389} 548eaeb900010060 5523b8fd0060
nsds50ruv: {replica 91 ldap://ipa-2.mgmt.testsystem-homemonitoring.int:389} 5492f60f005b 5509812d0006005b
nsds50ruv: {replica 1095 ldap://ipa-1.mgmt.hss.int:389} 5493f9fe0447 551cd51900040447
nsds50ruv: {replica 1390 ldap://ipa-1.mgmt.datacenter-homemonitoring.int:389} 54ecb4d3056e 552510e6000d056e
nsds50ruv: {replica 1385 ldap://ipa-2.mgmt.testsystem-homemonitoring.int:389} 552512b40569 5525488600040569
nsds50ruv: {replica 1795 ldap://ipa-2.mgmt.datacenter-homemonitoring.int:389} 55262cc30703 552663f30703
nsds50ruv: {replica 1790 ldap://ipa-1.mgmt.testsystem-homemonitoring.int:389} 552669f206fe 552bb124000306fe
nsds50ruv: {replica 1785 ldap://ipa-1.mgmt.hss.int:389} 552677b806f9 555e169406f9
nsds50ruv: {replica 1780 ldap://ipa-2.mgmt.hss.int:389} 5527a56f06f4 5528f6a106f4
nsds50ruv: {replica 1775 ldap://ipa-1.mgmt.datacenter-homemonitoring.int:389} 5528ddd806ef 5528ddd9000106ef
nsds50ruv: {replica 1770 ldap://ipa-1.mgmt.testsystem-homemonitoring.int:389} 554093bb000806ea 555c9bcf000406ea
nsds50ruv: {replica 1765 ldap://ipa-2.mgmt.hss.int:389} 5540aba6000506e5 5540aba8000506e5
nsruvReplicaLastModified: {replica 1595 ldap://ipa-2.mgmt.biotronik-homemonitoring.int:389} 
nsruvReplicaLastModified: {replica 1690 ldap://ipa-1.mgmt.biotronik-homemonitoring.int:389} 
n

Re: [Freeipa-users] WG: Re: Haunted servers?

2015-06-22 Thread Ludwig Krispenz

Hi,
On 06/22/2015 09:48 AM, Christoph Kaminski wrote:

>
> from an earlier post it looks like they are from the o=ipaca
> backend, did you clean the ruvs there ?

we have only done a 'normal' cleanruv... How can I clean them there?

either you try the cleanallruv:

# ldapmodify -D "cn=directory manager" -W -a
dn: cn=clean 8, cn=cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: o=ipaca
replica-id: 8
cn: clean 8

you have to set the replica base dn.

or, if you want to use the method of running cleanruv individually on all 
servers, you have to use the dn of the ipaca replica:

> dn: cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV11


>
> to know which are the correct current rids for this backend you
> could do on each active server a search for
> ... -b "cn=config" "(&(objectclass=nsds5replica)(
> nsDS5ReplicaRoot=o=ipaca))"  nsDS5ReplicaId
>
> then you could search
>
> ldapsearch -h <server> -D "cn=Directory Manager" -W -b "o=ipaca"
> "(&(objectclass=nstombstone)(nsUniqueId=ffffffff-ffffffff-ffffffff-ffffffff))"
> to see what you have in the ruv and eventually clean them

> On 06/19/2015 01:48 PM, Christoph Kaminski wrote:
> Ludwig Krispenz  schrieb am 19.06.2015 13:23:43:
>
> >
> > > the first search is for the replication agreements and they keep
> > > info about the consumer ruv, used in replication session. you cannot
> > > modify these, but they are maintained in the dse.ldif, you could
> > > edit the dse.ldif when the server is stopped.
> >
> > big thx, we try it and I let you know if it works!
> >
>

Greetz



-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] WG: Re: Haunted servers?

2015-06-22 Thread Ludwig Krispenz

Hi,

I have one scenario where I can show the comeback of the "ghost" rids, 
but it requires a server where the rids have been successfully cleaned and 
which is then killed or crashes. In that case, if the "ghost" rids have not 
yet been trimmed from the changelog, they can be recreated from information 
in the changelog. They can then also propagate to other servers.


Could something similar have happened in your environment ?
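
A quick way to check would be to look for one of the cleaned rids in the changelog while the server is stopped (paths are placeholders, as in the earlier mail; rid 11 is only an example):

dbscan -f /var/lib/dirsrv/slapd-<instance>/changelogdb/<changelog-file>.db | grep "replica 11"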

Ludwig

On 06/12/2015 07:38 AM, Christoph Kaminski wrote:
I was pleased too early :/ After an ipactl restart of our first 
master (the one we re-initialize from), the 'ghost' rids are there again...


I think there is something like a fs backup for dirsrv (the changelog?), 
but where?


>
> we had the same problem (and some more) and yesterday we have
> successfully cleaned the gohst rid's
>
> our fix:
>
> 1. stop all cleanallruv tasks, if it works, with ipa-replica-manage
> abort-clean-ruv. It hasn't worked here. We have done it manually on
> ALL replicas with:
>  a) replica stop
>  b) delete all nsds5ReplicaClean from /etc/dirsrv/slapd-HSO/dse.ldif
>  c) replica start
>
> 2. prepare on EACH ipa a cleanruv ldif file with ALL ghost rids
> inside (really ALL from all ipa replicas, we have had some rids only
> on some replicas...)
> Example:
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV11
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV22
>
> dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
> changetype: modify
> replace: nsds5task
> nsds5task:CLEANRUV37
> ...
>
> 3. do a "ldapmodify -h 127.0.0.1 -D "cn=Directory Manager" -W -x -f
> $your-cleanruv-file.ldif" on all replicas AT THE SAME TIME :) we
> used terminator  for it (https://launchpad.net/terminator). You can
> open multiple shell windows inside one window and send to all at the
> same time the same commands...
>
> 4. we have done a re-initialize of each IPA from our first master
>
> 5. restart of all replicas
>
> we are not sure about the point 3 and 4. Maybe they are not
> necessary, but we have done it.
>
> If something fails, look for defective LDAP entries in the whole ldap; we
> have had some entries with '+nsuniqueid=$HASH' after the 'normal' name.
> We have deleted them.
>
> MfG
> Christoph Kaminski
>
>







-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

[Freeipa-users] changing the default for changelog trimmimg

2015-06-30 Thread Ludwig Krispenz

Hi,

389-ds allows configuring the maximum size of the replication changelog, 
either by setting a maximum record number or a maximum age of changes.
freeIPA does not use this setting. In the context of ticket 
https://fedorahosted.org/freeipa/ticket/5086 we are discussing changing 
the default to enable changelog trimming.

Does anyone already use changelog trimming, or is there a scenario where 
you rely on all changes being available?
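
For reference, trimming by age is configured on the changelog entry; a sketch (the value is only an example):

ldapmodify -D "cn=directory manager" -W <<EOF
dn: cn=changelog5,cn=config
changetype: modify
replace: nsslapd-changelogmaxage
nsslapd-changelogmaxage: 30d
EOF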


Thanks for your feedback,
Ludwig

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] changing the default for changelog trimmimg

2015-07-03 Thread Ludwig Krispenz


On 07/03/2015 02:03 PM, Petr Spacek wrote:

On 3.7.2015 11:45, thierry bordaz wrote:

On 06/30/2015 03:54 PM, Ludwig Krispenz wrote:

Hi,

389-ds allows to configure the max size of the replication changelog either
by setting a maximum record number or a maximum age of changes.
freeIPA does not use this setting. In the context of ticket
https://fedorahosted.org/freeipa/ticket/5086 we are discussing to change the
default to
enable changelog trimming.

Does anyone already use changelog trimming, or is there a scenario where you
rely on all changes being available?

Thanks for your feedback,
Ludwig


Hello,

I think it is reasonable to set nsds5ReplicaPurgeDelay and
nsslapd-changelogmaxage to similar values.

When a replica (master or consumer) is down for some time and is
restarted, both attributes determine the ability to get the replica back in
sync with the rest of the topology.
It can work (and likely will) if
nsds5ReplicaPurgeDelay <= nsslapd-changelogmaxage.
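
As an illustration of keeping the two aligned, an ldif sketch (values are examples only; nsds5ReplicaPurgeDelay is in seconds, so 604800 matches 7d):

dn: cn=replica,cn=dc\3Dexample,cn=mapping tree,cn=config
changetype: modify
replace: nsds5ReplicaPurgeDelay
nsds5ReplicaPurgeDelay: 604800

dn: cn=changelog5,cn=config
changetype: modify
replace: nsslapd-changelogmaxage
nsslapd-changelogmaxage: 7d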
I wonder if these values could/should be controlled by the topology plugin. Does
it make sense to have different values on different replicas?
no; and while it would be possible, it would be an extension of the scope of
the topo plugin, so far we only manage agreements between
replicas and not the replicas themselves.




--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] changing the default for changelog trimmimg

2015-07-03 Thread Ludwig Krispenz


On 07/03/2015 02:28 PM, Petr Spacek wrote:

On 3.7.2015 14:21, thierry bordaz wrote:

On 07/03/2015 02:03 PM, Petr Spacek wrote:

On 3.7.2015 11:45, thierry bordaz wrote:

On 06/30/2015 03:54 PM, Ludwig Krispenz wrote:

Hi,

389-ds allows to configure the max size of the replication changelog either
by setting a maximum record number or a maximum age of changes.
freeIPA does not use this setting. In the context of ticket
https://fedorahosted.org/freeipa/ticket/5086 we are discussing to change the
default to
enable changelog trimming.

Does anyone already use changelog trimming, or is there a scenario where you
rely on all changes being available?

Thanks for your feedback,
Ludwig


Hello,

 I think it is reasonable to set nsds5ReplicaPurgeDelay and
 nsslapd-changelogmaxage to similar values.

 When a replica (master or consumer) is down for some time and is
 restarted, both attributes determine the ability to get the replica back in
 sync with the rest of the topology.
 It can work (and likely will) if
 nsds5ReplicaPurgeDelay <= nsslapd-changelogmaxage.
I wonder if these values could/should be controlled by topology plugin. Does
it make sense to have different values on different replicas?


Purgedelay can be different on each replica, but it makes sense that the value
is the same on all replicas. It is used to remove too-old csns and so determines
how far in the past replication can decide which value is more recent than
another one. With different values of purge delay, one replica can decide to keep
a value and another replica can decide the opposite.
Currently purgedelay is identical on all replicas (default value).

I understand that technically it is possible, so the question is more like
'does it even make sense'?
no, it doesn't make sense; at least I can't imagine a scenario where it 
does

Do we want to support it?
what exactly do you mean by this? You can always, as a last resort, edit 
the dse.ldif, even if you catch all online mods with a plugin like the 
topo plugin.

do we offer an easy way to configure and modify it: I think no
does one lose support if changing the default: no




--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-13 Thread Ludwig Krispenz
can you get a pstack of the slapd process along with a top -H to find the 
thread with high cpu usage?
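
For example:

top -b -H -n1 -p $(pidof ns-slapd)
pstack $(pidof ns-slapd) > /tmp/ns-slapd.pstack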


Ludwig

On 07/13/2015 04:46 PM, Andrew E. Bruno wrote:

We have 3 freeipa-replicas. Centos 7.1.1503, ipa-server 4.1.0-18, and
389-ds 1.3.3.1-16.

Recently, the ns-slapd process on one of our replicas started showing higher
than normal CPU usage. ns-slapd is pegged at high CPU usage more or less
constantly.

Seems very similar to this thread:
https://www.redhat.com/archives/freeipa-users/2015-February/msg00281.html

There are a few errors showing in /var/log/dirsrv/slapd-[domain]/errors (not
sure if these are related):


[13/Jul/2015:02:56:49 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]
[13/Jul/2015:04:11:50 -0400] - dn2entry_ext: Failed to get id for 
changenumber=2277387,cn=changelog from entryrdn index (-30993)
[13/Jul/2015:04:11:50 -0400] - Operation error fetching 
changenumber=2277387,cn=changelog (null), error -30993.
[13/Jul/2015:04:11:50 -0400] DSRetroclPlugin - replog: an error occured while 
adding change number 2277387, dn = changenumber=2277387,cn=changelog: 
Operations error.
[13/Jul/2015:04:11:50 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]
[13/Jul/2015:07:41:49 -0400] - Operation error fetching Null DN 
(01de36ac-295411e5-b94db2ab-07afbca6), error -30993.
[13/Jul/2015:07:41:49 -0400] - dn2entry_ext: Failed to get id for 
changenumber=2281464,cn=changelog from entryrdn index (-30993)
[13/Jul/2015:07:41:49 -0400] - Operation error fetching 
changenumber=2281464,cn=changelog (null), error -30993.
[13/Jul/2015:07:41:49 -0400] DSRetroclPlugin - replog: an error occured while 
adding change number 2281464, dn = changenumber=2281464,cn=changelog: 
Operations error.
[13/Jul/2015:07:41:49 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]


access logs seem to be showing normal activity. Here's the number of open
connections:

# ls -al /proc/`cat /var/run/dirsrv/slapd-[domain].pid`/fd|grep socket|wc -l
62

Note: the other two replicas have much higher open connections (>250) and low
cpu load avgs.

Here's some output of logconv.pl from our most recent access log on the replica
with high cpu load:

Start of Logs:13/Jul/2015:04:49:18
End of Logs:  13/Jul/2015:10:06:11

Processed Log Time:  5 Hours, 16 Minutes, 53 Seconds

Restarts: 0
Total Connections:2343
  - LDAP Connections:  2120
  - LDAPI Connections: 223
  - LDAPS Connections: 0
  - StartTLS Extended Ops: 45
  Secure Protocol Versions:
   - TLS1.2 128-bit AES - 45

Peak Concurrent Connections:  22
Total Operations: 111865
Total Results:111034
Overall Performance:  99.3%

Searches: 95585 (5.03/sec)  (301.64/min)
Modifications:3369  (0.18/sec)  (10.63/min)
Adds: 0 (0.00/sec)  (0.00/min)
Deletes:  0 (0.00/sec)  (0.00/min)
Mod RDNs: 0 (0.00/sec)  (0.00/min)
Compares: 0 (0.00/sec)  (0.00/min)
Binds:7082  (0.37/sec)  (22.35/min)

Proxied Auth Operations:  0
Persistent Searches:  0
Internal Operations:  0
Entry Operations: 0
Extended Operations:  5317
Abandoned Requests:   416
Smart Referrals Received: 0

VLV Operations:   96
VLV Unindexed Searches:   0
VLV Unindexed Components: 32
SORT Operations:  64

Entire Search Base Queries:   0
Paged Searches:   3882
Unindexed Searches:   0
Unindexed Components: 5

FDs Taken:2566
FDs Returned: 2643
Highest FD Taken: 249

Broken Pipes: 0
Connections Reset By Peer:0
Resource Unavailable: 0
Max BER Size Exceeded:0

Binds:7082
Unbinds:  2443
  - LDAP v2 Binds: 0
  - LDAP v3 Binds: 6859
  - AUTOBINDs: 223
  - SSL Client Binds:  0
  - Failed SSL Client Binds:   0
  - SASL Binds:6814
 GSSAPI - 6591
 EXTERNAL - 223
  - Directory Manager Binds:   0
  - Anonymous Binds:   6591
  - Other Binds:   491




strace timing on the ns-slapd process:


% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 94.40    0.346659        5977        58           poll
  4.10    0.015057       15057         1           restart_syscall
  0.91    0.003353          57        59        59 getpeername
  0.49    0.001796         150        12           futex
  0.10    0.000364          73         5           read
------ ----------- ----------- --------- --------- ----------------
100.00    0.367229                   135        59 total


top output (with threads 'H'):

   PID USER 

Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-13 Thread Ludwig Krispenz


On 07/13/2015 05:05 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 04:58:46PM +0200, Ludwig Krispenz wrote:

Can you get a pstack of the slapd process, along with a top -H to find the
thread with high cpu usage?

Attached is the full stacktrace of the running ns-slapd proccess. top -H
shows this thread (2879) with high cpu usage:

2879 dirsrv    20   0 3819252 1.962g  11680 R 99.9  3.1   8822:10 ns-slapd
This thread is a replication thread sending updates. What is strange is 
that the current csn_str is quite old (July 7th); I can't tell which 
agreement this thread is handling, but it looks like it is heavily reading 
the changelog and sending updates. Has anything changed recently in the 
replication setup?







On 07/13/2015 04:46 PM, Andrew E. Bruno wrote:

We have 3 freeipa-replicas. Centos 7.1.1503, ipa-server 4.1.0-18, and
389-ds 1.3.3.1-16.

Recently, the ns-slapd process on one of our replicas started showing higher
than normal CPU usage. ns-slapd is pegged at high CPU usage more or less
constantly.

Seems very similar to this thread:
https://www.redhat.com/archives/freeipa-users/2015-February/msg00281.html

There are a few errors showing in /var/log/dirsrv/slapd-[domain]/errors (not
sure if these are related):


[13/Jul/2015:02:56:49 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]
[13/Jul/2015:04:11:50 -0400] - dn2entry_ext: Failed to get id for 
changenumber=2277387,cn=changelog from entryrdn index (-30993)
[13/Jul/2015:04:11:50 -0400] - Operation error fetching 
changenumber=2277387,cn=changelog (null), error -30993.
[13/Jul/2015:04:11:50 -0400] DSRetroclPlugin - replog: an error occured while 
adding change number 2277387, dn = changenumber=2277387,cn=changelog: 
Operations error.
[13/Jul/2015:04:11:50 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]
[13/Jul/2015:07:41:49 -0400] - Operation error fetching Null DN 
(01de36ac-295411e5-b94db2ab-07afbca6), error -30993.
[13/Jul/2015:07:41:49 -0400] - dn2entry_ext: Failed to get id for 
changenumber=2281464,cn=changelog from entryrdn index (-30993)
[13/Jul/2015:07:41:49 -0400] - Operation error fetching 
changenumber=2281464,cn=changelog (null), error -30993.
[13/Jul/2015:07:41:49 -0400] DSRetroclPlugin - replog: an error occured while 
adding change number 2281464, dn = changenumber=2281464,cn=changelog: 
Operations error.
[13/Jul/2015:07:41:49 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]


access logs seem to be showing normal activity. Here's the number of open
connections:

# ls -al /proc/`cat /var/run/dirsrv/slapd-[domain].pid`/fd|grep socket|wc -l
62

Note: the other two replicas have much higher open connections (>250) and low
cpu load avgs.

Here's some output of logconv.pl from our most recent access log on the replica
with high cpu load:

Start of Logs:13/Jul/2015:04:49:18
End of Logs:  13/Jul/2015:10:06:11

Processed Log Time:  5 Hours, 16 Minutes, 53 Seconds

Restarts: 0
Total Connections:2343
  - LDAP Connections:  2120
  - LDAPI Connections: 223
  - LDAPS Connections: 0
  - StartTLS Extended Ops: 45
  Secure Protocol Versions:
   - TLS1.2 128-bit AES - 45

Peak Concurrent Connections:  22
Total Operations: 111865
Total Results:111034
Overall Performance:  99.3%

Searches: 95585 (5.03/sec)  (301.64/min)
Modifications:3369  (0.18/sec)  (10.63/min)
Adds: 0 (0.00/sec)  (0.00/min)
Deletes:  0 (0.00/sec)  (0.00/min)
Mod RDNs: 0 (0.00/sec)  (0.00/min)
Compares: 0 (0.00/sec)  (0.00/min)
Binds:7082  (0.37/sec)  (22.35/min)

Proxied Auth Operations:  0
Persistent Searches:  0
Internal Operations:  0
Entry Operations: 0
Extended Operations:  5317
Abandoned Requests:   416
Smart Referrals Received: 0

VLV Operations:   96
VLV Unindexed Searches:   0
VLV Unindexed Components: 32
SORT Operations:  64

Entire Search Base Queries:   0
Paged Searches:   3882
Unindexed Searches:   0
Unindexed Components: 5

FDs Taken:2566
FDs Returned: 2643
Highest FD Taken: 249

Broken Pipes: 0
Connections Reset By Peer:0
Resource Unavailable: 0
Max BER Size Exceeded:0

Binds:7082
Unbinds:  2443
  - LDAP v2 Binds: 0
  - LDAP v3 Binds: 6859
  - AUTOBINDs: 223
  - SSL Client Binds:  0
  - Failed SSL Client Binds:   0
  - SASL Binds:6814
 GSSAPI - 6591
 EXTERNAL - 223
  - Directory Manager Binds:   0
  - Anonymous Binds:   6591
  - Other Binds:   491




strace timi

Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-14 Thread Ludwig Krispenz


On 07/13/2015 06:36 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 05:29:13PM +0200, Ludwig Krispenz wrote:

On 07/13/2015 05:05 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 04:58:46PM +0200, Ludwig Krispenz wrote:

Can you get a pstack of the slapd process, along with a top -H to find the
thread with high cpu usage?

Attached is the full stacktrace of the running ns-slapd proccess. top -H
shows this thread (2879) with high cpu usage:

2879 dirsrv    20   0 3819252 1.962g  11680 R 99.9  3.1   8822:10 ns-slapd

This thread is a replication thread sending updates. What is strange is that
the current csn_str is quite old (July 7th); I can't tell which agreement
this thread is handling, but it looks like it is heavily reading the changelog
and sending updates. Has anything changed recently in the replication setup?


Yes, we had one replica fail on (6/19) which we removed (not this one
showing high CPU load). Had to perform some manual cleanup of the ipa-ca
RUVs. Then we added the replica back in on 7/1. Since then, replication
appears to have been running normally between the 3 replicas. We've been
monitoring utilization since 7/1 and only recently seen this spike (past
24 hours or so).

Is it still in this state, or was it a spike?

If it is still consuming high cpu, could you
- get a few pstacks like the one before, with some time in between; I 
would like to see if it is progressing with the csns or looping on the 
same one
- check the consumer side: is there anything in the error log? does the 
access log show replication activity from this server?

- optionally, enable replication logging: nsslapd-errorlog-level: 8192
  


On a side note, we get hit with this bug often:

https://www.redhat.com/archives/freeipa-users/2015-July/msg00018.html

(rogue sssd_be process hammering a replica).

This causes high ns-slapd utilization on the replica and restarting sssd
on the client host immediately fixes the issue. However, in this
case, we're not seeing this behavior.










On 07/13/2015 04:46 PM, Andrew E. Bruno wrote:

We have 3 freeipa-replicas. Centos 7.1.1503, ipa-server 4.1.0-18, and
389-ds 1.3.3.1-16.

Recently, the ns-slapd process on one of our replicas started showing higher
than normal CPU usage. ns-slapd is pegged at high CPU usage more or less
constantly.

Seems very similar to this thread:
https://www.redhat.com/archives/freeipa-users/2015-February/msg00281.html

There are a few errors showing in /var/log/dirsrv/slapd-[domain]/errors (not
sure if these are related):


[13/Jul/2015:02:56:49 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]
[13/Jul/2015:04:11:50 -0400] - dn2entry_ext: Failed to get id for 
changenumber=2277387,cn=changelog from entryrdn index (-30993)
[13/Jul/2015:04:11:50 -0400] - Operation error fetching 
changenumber=2277387,cn=changelog (null), error -30993.
[13/Jul/2015:04:11:50 -0400] DSRetroclPlugin - replog: an error occured while 
adding change number 2277387, dn = changenumber=2277387,cn=changelog: 
Operations error.
[13/Jul/2015:04:11:50 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]
[13/Jul/2015:07:41:49 -0400] - Operation error fetching Null DN 
(01de36ac-295411e5-b94db2ab-07afbca6), error -30993.
[13/Jul/2015:07:41:49 -0400] - dn2entry_ext: Failed to get id for 
changenumber=2281464,cn=changelog from entryrdn index (-30993)
[13/Jul/2015:07:41:49 -0400] - Operation error fetching 
changenumber=2281464,cn=changelog (null), error -30993.
[13/Jul/2015:07:41:49 -0400] DSRetroclPlugin - replog: an error occured while 
adding change number 2281464, dn = changenumber=2281464,cn=changelog: 
Operations error.
[13/Jul/2015:07:41:49 -0400] retrocl-plugin - retrocl_postob: operation failure 
[1]


access logs seem to be showing normal activity. Here's the number of open
connections:

# ls -al /proc/`cat /var/run/dirsrv/slapd-[domain].pid`/fd|grep socket|wc -l
62

Note: the other two replicas have much higher open connections (>250) and low
cpu load avgs.

Here's some output of logconv.pl from our most recent access log on the replica
with high cpu load:

Start of Logs:13/Jul/2015:04:49:18
End of Logs:  13/Jul/2015:10:06:11

Processed Log Time:  5 Hours, 16 Minutes, 53 Seconds

Restarts: 0
Total Connections:2343
  - LDAP Connections:  2120
  - LDAPI Connections: 223
  - LDAPS Connections: 0
  - StartTLS Extended Ops: 45
  Secure Protocol Versions:
   - TLS1.2 128-bit AES - 45

Peak Concurrent Connections:  22
Total Operations: 111865
Total Results:111034
Overall Performance:  99.3%

Searches: 95585 (5.03/sec)  (301.64/min)
Modifications:3369  (0.18/sec)  (10.63/min)
Adds: 0 (0.00/sec)  (0.00/min)
Deletes:  0 (0.00/sec)  (0.00/min)
Mod RDNs: 0 (0.

Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-14 Thread Ludwig Krispenz
Hm, the stack traces show csn_str values which correspond to Jul 8th, Jul 4th, 
and Jul 7th - so it looks like it is iterating the changelog over and 
over again.
The consumer side is "cn=meTosrv-m14-24.ccr.buffalo.edu" - is this the 
master?


Can you provide the result of the following search from 
m14-24.ccr.buffalo.edu and the server with the high cpu:


ldapsearch -o ldif-wrap=no -x -D ... -w  -b "cn=config" 
"objectclass=nsds5replica" nsds50ruv


On 07/14/2015 02:35 PM, Andrew E. Bruno wrote:

On Tue, Jul 14, 2015 at 01:41:57PM +0200, Ludwig Krispenz wrote:

On 07/13/2015 06:36 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 05:29:13PM +0200, Ludwig Krispenz wrote:

On 07/13/2015 05:05 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 04:58:46PM +0200, Ludwig Krispenz wrote:

Can you get a pstack of the slapd process, along with a top -H to find the
thread with high cpu usage?

Attached is the full stacktrace of the running ns-slapd proccess. top -H
shows this thread (2879) with high cpu usage:

2879 dirsrv    20   0 3819252 1.962g  11680 R 99.9  3.1   8822:10 ns-slapd

This thread is a replication thread sending updates. What is strange is that
the current csn_str is quite old (July 7th); I can't tell which agreement
this thread is handling, but it looks like it is heavily reading the changelog
and sending updates. Has anything changed recently in the replication setup?

Yes, we had one replica fail on (6/19) which we removed (not this one
showing high CPU load). Had to perform some manual cleanup of the ipa-ca
RUVs. Then we added the replica back in on 7/1. Since then, replication
appears to have been running normally between the 3 replicas. We've been
monitoring utilization since 7/1 and only recently seen this spike (past
24 hours or so).

is it still in this state ? or was it a spike.

Yes same state.


if it still is high cpu consuming, could you
- get a few pstack like the one before with some time in between, I would
like to see if it is progressing with the csns or looping on the same one

Attached are a few stacktraces. The thread pegging the cpu is:

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
2879 dirsrv    20   0 3819252 1.978g  11684 R  99.9  3.2  10148:26 ns-slapd


- check the consumer side. is there anything in the error log ? does the
access log show replication activity from this server


Here's some errors showing up on the first master server rep1 (rep2 is the
server with pegged cpu):

[13/Jul/2015:20:41:51 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-rep2-pki-tomcat" (rep2:389): Consumer failed to 
replay change (uniqueid cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a45ad60060): 
Operations error (1). Will retry later.
[13/Jul/2015:22:41:51 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-rep2-pki-tomcat" (rep2:389): Consumer failed to 
replay change (uniqueid cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a476f60060): 
Operations error (1). Will retry later.
[14/Jul/2015:06:56:51 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-rep2-tomcat" (rep2:389): Consumer failed to replay 
change (uniqueid cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a4eafa0060): 
Operations error (1). Will retry later.


Here's some snips from the access log of the rep2:


[14/Jul/2015:08:22:31 -0400] conn=87 op=9794 EXT oid="2.16.840.1.113730.3.5.5" 
name="Netscape Replication End Session"
[14/Jul/2015:08:22:31 -0400] conn=87 op=9794 RESULT err=0 tag=120 
nentries=0 etime=0
[14/Jul/2015:08:22:31 -0400] conn=87 op=9795 EXT oid="2.16.840.1.113730.3.5.12" 
name="replication-multimaster-extop"
[14/Jul/2015:08:22:31 -0400] conn=87 op=9795 RESULT err=0 tag=120 
nentries=0 etime=0
[14/Jul/2015:08:22:33 -0400] conn=87 op=9796 EXT oid="2.16.840.1.113730.3.5.5" 
name="Netscape Replication End Session"
..
[14/Jul/2015:08:23:38 -0400] conn=782341 op=129 RESULT err=0 tag=103 nentries=0 
etime=0 csn=55a4ff6c0005
..
[14/Jul/2015:08:24:02 -0400] conn=781901 op=1745 RESULT err=0 tag=101 
nentries=1 etime=0
[14/Jul/2015:08:24:03 -0400] conn=87 op=9810 EXT oid="2.16.840.1.113730.3.5.5" 
name="Netscape Replication End Session"
[14/Jul/2015:08:24:03 -0400] conn=87 op=9810 RESULT err=0 tag=120 
nentries=0 etime=0
[14/Jul/2015:08:24:03 -0400] conn=87 op=9811 EXT oid="2.16.840.1.113730.3.5.12" 
name="replication-multimaster-extop"
[14/Jul/2015:08:24:03 -0400] conn=87 op=9811 RESULT err=0 tag=120 
nentries=0 etime=0
[14/Jul/2015:08:24:05 -0400] conn=87 op=9812 EXT oid="2.16.840.1.113730.3.5.5" 
name="Netscape Replication End Session"
[14/Jul/2015:08:24:05 -0400] conn=87 op=9812 RESULT err=0 tag=120 
nentries=0 etime=0
[14/Jul/2015:08:24:08 -0400] conn=87 op=9813 EXT oid="2.16.840.1.113730.3.5.12"

Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-15 Thread Ludwig Krispenz


On 07/14/2015 08:59 PM, Andrew E. Bruno wrote:

On Tue, Jul 14, 2015 at 04:52:10PM +0200, Ludwig Krispenz wrote:

Hm, the stack traces show csn_str values which correspond to Jul 8th, Jul 4th,
and Jul 7th - so it looks like it is iterating the changelog over and over
again.
The consumer side is "cn=meTosrv-m14-24.ccr.buffalo.edu" - is this the
master?

Can you provide the result of the following search from
m14-24.ccr.buffalo.edu and the server with the high cpu:

ldapsearch -o ldif-wrap=no -x -D ... -w  -b "cn=config"
"objectclass=nsds5replica" nsds50ruv


master is srv-m14-24.. here's the results of the ldapsearch:

[srv-m14-24 ~]$ ldapsearch -o ldif-wrap=no -x -D "cn=directory manager" -W  -b 
"cn=config" "objectclass=nsds5replica" nsds50ruv

# replica, dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu, mapping tree, config
dn: cn=replica,cn=dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 5527f7110004
nsds50ruv: {replica 4 ldap://srv-m14-24.ccr.buffalo.edu:389} 
5527f7710004 55a55aed0014
nsds50ruv: {replica 5 ldap://srv-m14-26.ccr.buffalo.edu:389} 
5537c7730005 5591a3d200070005
nsds50ruv: {replica 6 ldap://srv-m14-25-02.ccr.buffalo.edu:389} 
55943dda0006 5594537800020006
So this is really strange: the master m14-24 has the latest change from 
replica 5 (m14-26) as 5591a3d200070005,

which corresponds to Mon, 29 Jun 2015 20:00:18 GMT.
So no update from m14-26 has arrived there since then, or it could not 
update the RUV. m14-26 therefore tries to replicate all the changes back 
from that time, but it looks like it has no success.
Is there anything in the logs of m14-24? Can you see successful mods 
with csn=xxx0005?
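
As an aside, the first eight hex digits of a CSN are the change's epoch
timestamp, so it can be decoded in a shell:

date -u -d @$((16#5591a3d2))
# Mon Jun 29 20:00:18 UTC 2015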


# replica, o\3Dipaca, mapping tree, config
dn: cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 5527f74b0060
nsds50ruv: {replica 96 ldap://srv-m14-24.ccr.buffalo.edu:389} 
5527f7540060 55a557f60060
nsds50ruv: {replica 86 ldap://srv-m14-25-02.ccr.buffalo.edu:389} 
55943e6e0056 55943e6f00010056
nsds50ruv: {replica 91 ldap://srv-m14-26.ccr.buffalo.edu:389} 
5537c7ba005b 5582c7e40004005b


server with high cpu load is srv-m14-26. here's the results of the ldapsearch
from this server:

[srv-m14-26 ~]$ ldapsearch -o ldif-wrap=no -x -D "cn=directory manager" -W  -b 
"cn=config" "objectclass=nsds5replica" nsds50ruv

# replica, dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu, mapping tree, config
dn: cn=replica,cn=dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 5527f7110004
nsds50ruv: {replica 5 ldap://srv-m14-26.ccr.buffalo.edu:389} 
5537c7730005 55a55b4700030005
nsds50ruv: {replica 4 ldap://srv-m14-24.ccr.buffalo.edu:389} 
5527f7710004 55a53eba0004
nsds50ruv: {replica 6 ldap://srv-m14-25-02.ccr.buffalo.edu:389} 
55943dda0006 5594537800020006

# replica, o\3Dipaca, mapping tree, config
dn: cn=replica,cn=o\3Dipaca,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 5527f74b0060
nsds50ruv: {replica 91 ldap://srv-m14-26.ccr.buffalo.edu:389} 
5537c7ba005b 5582c7e40004005b
nsds50ruv: {replica 96 ldap://srv-m14-24.ccr.buffalo.edu:389} 
5527f7540060 55a557f60060
nsds50ruv: {replica 86 ldap://srv-m14-25-02.ccr.buffalo.edu:389} 
55943e6e0056 55943e6f00010056


srv-m14-25-02 is our 3rd replicate which we recently added back in after it
failed (was added back in 7/1).

Let me know if you need anything else. Thanks for the help.

--Andrew


On 07/14/2015 02:35 PM, Andrew E. Bruno wrote:

On Tue, Jul 14, 2015 at 01:41:57PM +0200, Ludwig Krispenz wrote:

On 07/13/2015 06:36 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 05:29:13PM +0200, Ludwig Krispenz wrote:

On 07/13/2015 05:05 PM, Andrew E. Bruno wrote:

On Mon, Jul 13, 2015 at 04:58:46PM +0200, Ludwig Krispenz wrote:

Can you get a pstack of the slapd process, along with a top -H to find the
thread with high cpu usage?

Attached is the full stacktrace of the running ns-slapd proccess. top -H
shows this thread (2879) with high cpu usage:

2879 dirsrv    20   0 3819252 1.962g  11680 R 99.9  3.1   8822:10 ns-slapd

This thread is a replication thread sending updates. What is strange is that
the current csn_str is quite old (July 7th); I can't tell which agreement
this thread is handling, but it looks like it is heavily reading the changelog
and sending updates. Has anything changed recently in the replication setup?

Yes, we had one replica fail on (6/19) which we removed (not this one
showing high CPU load). Had to perform some manual cleanup of the ipa-ca
RUVs. Then we added the replica back in on 7/1. Since then, replication
appears to have been running normally between the 3 replicas. We've been
monitoring utilization since 7/1 and only recently seen this spike (past
24 hour

Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-15 Thread Ludwig Krispenz


On 07/15/2015 04:10 PM, Andrew E. Bruno wrote:

On Wed, Jul 15, 2015 at 03:22:51PM +0200, Ludwig Krispenz wrote:

On 07/14/2015 08:59 PM, Andrew E. Bruno wrote:

On Tue, Jul 14, 2015 at 04:52:10PM +0200, Ludwig Krispenz wrote:

Hm, the stack traces show csn_str values which correspond to Jul 8th, Jul 4th,
and Jul 7th - so it looks like it is iterating the changelog over and over
again.
The consumer side is "cn=meTosrv-m14-24.ccr.buffalo.edu" - is this the
master?

Can you provide the result of the following search from
m14-24.ccr.buffalo.edu and the server with the high cpu:

ldapsearch -o ldif-wrap=no -x -D ... -w  -b "cn=config"
"objectclass=nsds5replica" nsds50ruv

master is srv-m14-24.. here's the results of the ldapsearch:

[srv-m14-24 ~]$ ldapsearch -o ldif-wrap=no -x -D "cn=directory manager" -W  -b 
"cn=config" "objectclass=nsds5replica" nsds50ruv

# replica, dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu, mapping tree, config
dn: cn=replica,cn=dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 5527f7110004
nsds50ruv: {replica 4 ldap://srv-m14-24.ccr.buffalo.edu:389} 
5527f7710004 55a55aed0014
nsds50ruv: {replica 5 ldap://srv-m14-26.ccr.buffalo.edu:389} 
5537c7730005 5591a3d200070005
nsds50ruv: {replica 6 ldap://srv-m14-25-02.ccr.buffalo.edu:389} 
55943dda0006 5594537800020006

So this is really strange: the master m14-24 has the latest change from
replica 5 (m14-26) as 5591a3d200070005,
which corresponds to Mon, 29 Jun 2015 20:00:18 GMT.
So no update from m14-26 has arrived there since then, or it could not update
the RUV. m14-26 therefore tries to replicate all the changes back from that
time, but it looks like it has no success.
Is there anything in the logs of m14-24? Can you see successful mods with
csn=xxx0005?

Here's what I could find from the logs on srv-m14-24:


[srv-m14-24 ~]# grep -r 0005 /var/log/dirsrv/slapd-[domain]/*
access.20150714-014346:[14/Jul/2015:03:10:05 -0400] conn=748529 op=14732 RESULT 
err=0 tag=103 nentries=0 etime=1 csn=55a4b5f00054
OK, so no update originating at replica 5 has been replicated (probably 
since June 29). Did you experience data inconsistency between the servers?



And here's the last few lines the error log on srv-m14-24:
One set of messages refers to the o=ipaca backend and seems to be 
transient; replication continues later.
The other set of messages, "No original tombstone ..", is annoying (and it 
is fixed in ticket https://fedorahosted.org/389/ticket/47912).


The next thing we can do to try to understand what is going on is to 
enable replication logging on m14-26; it will then not only consume all 
cpu, but also write tons of messages to the error log.

But it can be turned on and off:

ldapmodify ...
dn: cn=config
changetype: modify
replace: nsslapd-errorlog-level
nsslapd-errorlog-level: 8192

and let it run for a while, then set it back to 0.



[12/Jul/2015:10:11:14 -0400] ldbm_back_delete - conn=0 op=0 [retry: 1] No 
original_tombstone for changenumber=2456070,cn=changelog!!
[12/Jul/2015:10:11:48 -0400] ldbm_back_delete - conn=0 op=0 [retry: 1] No 
original_tombstone for changenumber=2498441,cn=changelog!!
[13/Jul/2015:07:41:49 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-srv-m14-26.ccr.buffalo.edu-pki-tomcat" 
(srv-m14-26:389): Consumer failed to replay change (uniqueid 
cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a3a4060060): Operations error (1). 
Will retry later.
[13/Jul/2015:11:56:50 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-srv-m14-26.ccr.buffalo.edu-pki-tomcat" 
(srv-m14-26:389): Consumer failed to replay change (uniqueid 
cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a3dfca0060): Operations error (1). 
Will retry later.
[13/Jul/2015:14:26:50 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-srv-m14-26.ccr.buffalo.edu-pki-tomcat" 
(srv-m14-26:389): Consumer failed to replay change (uniqueid 
cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a402f20060): Operations error (1). 
Will retry later.
[13/Jul/2015:15:26:49 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-srv-m14-26.ccr.buffalo.edu-pki-tomcat" 
(srv-m14-26:389): Consumer failed to replay change (uniqueid 
cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a411020060): Operations error (1). 
Will retry later.
[13/Jul/2015:18:26:51 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-srv-m14-26.ccr.buffalo.edu-pki-tomcat" 
(srv-m14-26:389): Consumer failed to replay change (uniqueid 
cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a43b320060): Operations error (1). 
Will retry later.
[13/Jul/2015:18:56:51 -0400] NSMMReplicationPlugin - 
agmt="cn=masterAgreement1-srv-m14-26.ccr.buffalo.edu-pki-tomcat" 
(srv-m14-26:389): Consumer failed to replay change (uniqueid 
cb7acfc1-df9211e4-a351aa45-2e06257b, CSN 55a4423a0

Re: [Freeipa-users] ns-slapd high cpu usage

2015-07-16 Thread Ludwig Krispenz

Thank you for the data, I think I understand now what is going on.

In the error logs we see only message like (from my test env):

[16/Jul/2015:10:12:40 +0200] NSMMReplicationPlugin - agmt="cn=100-300" 
(localhost:9759): replay_update: modifys operation 
(dn="dc=example,dc=com" csn=55a82a2900010064) not sent - empty
[16/Jul/2015:10:12:40 +0200] NSMMReplicationPlugin - agmt="cn=100-300" 
(localhost:9759): replay_update: Consumer successfully sent operation 
with csn 55a82a2900010064
[16/Jul/2015:10:12:40 +0200] NSMMReplicationPlugin - agmt="cn=100-300" 
(localhost:9759): Skipping update operation with no message_id (uniqueid 
7507cb26-e8ac11e2-b2898005-8430f734, CSN 55a82a2900010064):


This happens if fractional replication is configured, as IPA does, and the 
changes affect only attributes which will NOT be replicated. The 
local RUV will be updated, but since no change is really sent, the 
consumer RUV is not updated, and replication will always start from a 
very old starting csn. It is a rare scenario where a server receives 
only mods which are not replicated.


I have opened a ticket for this: https://fedorahosted.org/389/ticket/48225

As a workaround, can you try to apply a mod on m14-26 which will not be 
stripped: either create a dummy user or add a description attribute to 
an existing object. Replication will once again iterate through all the 
changes (which can take a while), but should then replay this latest 
change and define a new offset.
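
Something along these lines should do it (a sketch - the uid is a
placeholder for any existing test entry in your tree):

ldapmodify -x -D "cn=directory manager" -W
dn: uid=testuser,cn=users,cn=accounts,dc=ccr,dc=buffalo,dc=edu
changetype: modify
replace: description
description: replication nudge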


Regards,
Ludwig


On 07/15/2015 07:05 PM, Andrew E. Bruno wrote:

On Wed, Jul 15, 2015 at 04:58:23PM +0200, Ludwig Krispenz wrote:

On 07/15/2015 04:10 PM, Andrew E. Bruno wrote:

On Wed, Jul 15, 2015 at 03:22:51PM +0200, Ludwig Krispenz wrote:

On 07/14/2015 08:59 PM, Andrew E. Bruno wrote:

On Tue, Jul 14, 2015 at 04:52:10PM +0200, Ludwig Krispenz wrote:

Hm, the stack traces show csn_str values which correspond to Jul 8th, Jul 4th,
and Jul 7th - so it looks like it is iterating the changelog over and over
again.
The consumer side is "cn=meTosrv-m14-24.ccr.buffalo.edu" - is this the
master?

Can you provide the result of the following search from
m14-24.ccr.buffalo.edu and the server with the high cpu:

ldapsearch -o ldif-wrap=no -x -D ... -w  -b "cn=config"
"objectclass=nsds5replica" nsds50ruv

master is srv-m14-24.. here's the results of the ldapsearch:

[srv-m14-24 ~]$ ldapsearch -o ldif-wrap=no -x -D "cn=directory manager" -W  -b 
"cn=config" "objectclass=nsds5replica" nsds50ruv

# replica, dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu, mapping tree, config
dn: cn=replica,cn=dc\3Dccr\2Cdc\3Dbuffalo\2Cdc\3Dedu,cn=mapping tree,cn=config
nsds50ruv: {replicageneration} 5527f7110004
nsds50ruv: {replica 4 ldap://srv-m14-24.ccr.buffalo.edu:389} 
5527f7710004 55a55aed0014
nsds50ruv: {replica 5 ldap://srv-m14-26.ccr.buffalo.edu:389} 
5537c7730005 5591a3d200070005
nsds50ruv: {replica 6 ldap://srv-m14-25-02.ccr.buffalo.edu:389} 
55943dda0006 5594537800020006

So this is really strange: the master m14-24 has the latest change from
replica 5 (m14-26) as 5591a3d200070005,
which corresponds to Mon, 29 Jun 2015 20:00:18 GMT.
So no update from m14-26 has arrived there since then, or it could not update
the RUV. m14-26 therefore tries to replicate all the changes back from that
time, but it looks like it has no success.
Is there anything in the logs of m14-24? Can you see successful mods with
csn=xxx0005?

Here's what I could find from the logs on srv-m14-24:


[srv-m14-24 ~]# grep -r 0005 /var/log/dirsrv/slapd-[domain]/*
access.20150714-014346:[14/Jul/2015:03:10:05 -0400] conn=748529 op=14732 RESULT 
err=0 tag=103 nentries=0 etime=1 csn=55a4b5f00054

OK, so no update originating at replica 5 has been replicated (probably
since June 29). Did you experience data inconsistency between the servers?


And here's the last few lines the error log on srv-m14-24:

One set of messages refers to the o=ipaca backend and seems to be transient;
replication continues later.
The other set of messages, "No original tombstone ..", is annoying (and it is
fixed in ticket https://fedorahosted.org/389/ticket/47912).

The next thing we can do to try to understand what is going on is to enable
replication logging on m14-26; it will then not only consume all cpu, but
also write tons of messages to the error log.
But it can be turned on and off:

ldapmodify ...
dn: cn=config
changetype: modify
replace: nsslapd-errorlog-level
nsslapd-errorlog-level: 8192

and let it run for a while, then set it back to: 0

I enabled replication logging and it's running now. I noticed the
default value for nsslapd-errorlog-level was set to 16384 (not 0).

OK to send you the logs off list? Looks like they contain quite a bit of
sensitive data.

Thanks again for all the help looking into this.

Best,

--Andrew




[12/Jul/2015:10:11:14 -0400] ldbm_back_delete - c

Re: [Freeipa-users] Failed to start pki-tomcatd Service

2015-07-22 Thread Ludwig Krispenz


On 07/22/2015 06:40 PM, Alexander Bokovoy wrote:

On Wed, 22 Jul 2015, Alexandre Ellert wrote:


On 22 Jul 2015, at 18:08, Alexander Bokovoy wrote:


On Wed, 22 Jul 2015, Alexandre Ellert wrote:

# fgrep -r 0.9.2342.19200300.100.1.25 /etc/dirsrv
from both servers?


Server 1:
# fgrep -r 0.9.2342.19200300.100.1.25 /etc/dirsrv
/etc/dirsrv/schema/00core.ldif:attributeTypes: ( 
0.9.2342.19200300.100.1.25 NAME ( 'dc' 'domaincomponent' )
/etc/dirsrv/slapd-NUMEEZY-FR/schema/00core.ldif:attributeTypes: ( 
0.9.2342.19200300.100.1.25 NAME ( 'dc' 'domaincomponent' )


Server 2 :
# fgrep -r 0.9.2342.19200300.100.1.25 /etc/dirsrv
/etc/dirsrv/schema/00core.ldif:attributeTypes: ( 
0.9.2342.19200300.100.1.25 NAME ( 'dc' 'domaincomponent' )
/etc/dirsrv/slapd-NUMEEZY-FR/schema/00core.ldif:attributeTypes: ( 
0.9.2342.19200300.100.1.25 NAME ( 'dc' 'domaincomponent' )




With a correct setup, IPA 4.x should show:
/etc/dirsrv/schema/00core.ldif:attributeTypes: ( 
0.9.2342.19200300.100.1.25 NAME ( 'dc' 'domaincomponent' )
/etc/dirsrv/slapd-EXAMPLE-COM/schema/00core.ldif:attributeTypes: ( 
0.9.2342.19200300.100.1.25 NAME ( 'dc' 'domaincomponent' )


I.e. there are two lines -- in the default schema and in the IPA
instance schema.


Seems to be good?

Yes. Can you get a new set of logs on 'ipactl start'?

--
/ Alexander Bokovoy


Sorry, the log is very long…I can format differently if you need.

Thanks, no need for more logs right now.

What I see from these logs:
- Directory server starts just fine but serves only port 389
- krb5kdc starts just fine and works fine with LDAP server
- Dogtag tries to use LDAP server via port 636 and fails

We need to see why port 636 is disabled.

Why do you think so? There is:

[22/Jul/2015:18:14:54 +0200] - slapd started.  Listening on All Interfaces port 
389 for LDAP requests
[22/Jul/2015:18:14:54 +0200] - Listening on All Interfaces port 636 for LDAPS 
requests
[22/Jul/2015:18:14:54 +0200] - Listening on /var/run/slapd-NUMEEZY-FR.socket 
for LDAPI requests

but what is failing is:
agmt="cn=cloneAgreement1-inf-ipa-2.numeezy.fr-pki-tomcat" (inf-ipa:7389): 
Replication bind with SIMPLE auth failed: LDAP error -1 (Can't contact LDAP server) ()

Is dogtag on a different instance? Why do we use port 7389?



Can you grep /etc/dirsrv/slapd-NUMEEZY-FR/dse.ldif for the following
attributes:
nsslapd-security
nsslapd-port

They should be 'on' and '389' correspondingly.
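
For example, one way to check both at once (assuming the instance path
above):

grep -E '^nsslapd-(security|port):' /etc/dirsrv/slapd-NUMEEZY-FR/dse.ldif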



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Kerberos hanging approx. once a day

2015-07-23 Thread Ludwig Krispenz


On 07/23/2015 09:56 AM, Sumit Bose wrote:

On Thu, Jul 23, 2015 at 09:18:43AM +0200, Torsten Harenberg wrote:

Hi Sumit,



The principal looks strange, I would at least expect the fully-qualified
name of the ipa server here. What does the 'hostname' command return? It

[root@ipa slapd-PLEIADES-UNI-WUPPERTAL-DE]# hostname
ipa.pleiades.uni-wuppertal.de


is expected that it will return the fully-qualified name. Additionally if
you added the ipa server to /etc/hosts please only use the
fully-qualified name to be on the safe side (iirc it is ok to have the
short name as a second name, but the fully-qualified one should always
come first).

I removed the entries from /etc/hosts again.


The keytab file /etc/krb5.keytab looks strange here. Later on the right
one /etc/dirsrv/ds.keytab is used. Did you try to run the
/usr/sbin/ns-slapd binary manually at some time?


Yes, once, after it did not come up.

After another reboot, the system came up.

But what I found is

https://fedorahosted.org/freeipa/ticket/2739

and indeed:

[root@ipa slapd-PLEIADES-UNI-WUPPERTAL-DE]# grep WARNING *
errors:[21/Jul/2015:17:15:21 +0200] - WARNING: cache too small,
increasing to 500K bytes
errors:[21/Jul/2015:17:15:21 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[21/Jul/2015:17:15:21 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[21/Jul/2015:17:15:21 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[21/Jul/2015:17:15:21 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[21/Jul/2015:17:15:21 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[21/Jul/2015:17:15:21 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[21/Jul/2015:17:15:21 +0200] - WARNING: userRoot: entry cache
size 512000B is less than db size 4177920B; We recommend to increase the
entry cache size nsslapd-cachememsize.
errors:[21/Jul/2015:17:15:21 +0200] - WARNING: changelog: entry cache
size 512000B is less than db size 18096128B; We recommend to increase
the entry cache size nsslapd-cachememsize.
errors:[22/Jul/2015:11:03:31 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[22/Jul/2015:11:03:31 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[22/Jul/2015:11:03:31 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[22/Jul/2015:11:03:31 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[22/Jul/2015:11:03:31 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[22/Jul/2015:11:03:31 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[22/Jul/2015:11:03:31 +0200] - WARNING: userRoot: entry cache
size 512000B is less than db size 4218880B; We recommend to increase the
entry cache size nsslapd-cachememsize.
errors:[22/Jul/2015:11:03:31 +0200] - WARNING: changelog: entry cache
size 512000B is less than db size 27992064B; We recommend to increase
the entry cache size nsslapd-cachememsize.
errors:[23/Jul/2015:07:33:09 +0200] - WARNING: cache too small,
increasing to 500K bytes
errors:[23/Jul/2015:07:33:09 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[23/Jul/2015:07:33:09 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[23/Jul/2015:07:33:09 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[23/Jul/2015:07:33:09 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[23/Jul/2015:07:33:09 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up
errors:[23/Jul/2015:07:33:09 +0200] - WARNING -- Minimum cache size is
512000 -- rounding up

I'm not a 389ds expert but in my setup nsslapd-cachememsize is set to
10M and since I didn't do any tuning I would expect that this is some
default.
Yes, 10M should be the default, and OOM would be triggered by a memory leak, 
not by the cache size.

Also the server seems to stop and start cleanly, and is not killed by oom
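
To inspect the cache values currently in effect, a search along these lines
should work:

ldapsearch -x -D "cn=directory manager" -W \
    -b "cn=ldbm database,cn=plugins,cn=config" \
    nsslapd-cachememsize nsslapd-dbcachesize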




And what I see is that nodes occasionally lose their users. I hadn't
seen that in the two months of testing (of course there were no real
users during that time, so I'm not 100% sure that it did not happen).

Could that be the cause of the trouble?

The users and groups are delivered to the system via SSSD. If SSSD loses
the connection to the IPA servers, e.g. because the server does not
respond, SSSD cannot look up new users. Nevertheless, SSSD has a cache, and
users and groups are delivered from the cache in this case. But system
users which are important for the services to run, like the users dirsrv,
apache and pkiuser, are defined in /etc/passwd. So I don't expect this
to be the cause of the trouble.

bye,
Sumit


Kind regards,

   Torsten



--
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
<>  <>
<> Dr. Torsten Harenberg harenb...@physik.uni-wuppertal.de  <>
<> Bergische Universitaet   

Re: [Freeipa-users] Kerberos hanging approx. once a day

2015-07-23 Thread Ludwig Krispenz

You can change the cachememsize online:

ldapmodify ...
dn: cn=<backend>,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-cachememsize
nsslapd-cachememsize: <new size in bytes>

But I would also increase the dbcache size, which would require a 
restart to be effective.
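
A sketch of the corresponding change (nsslapd-dbcachesize lives on the ldbm
config entry; the size value is a placeholder, and a restart is needed for it
to take effect):

ldapmodify ...
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-dbcachesize
nsslapd-dbcachesize: <new size in bytes>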


So you could also stop DS, edit /etc/dirsrv/slapd-<instance>/dse.ldif,
search for all *cache* attributes and replace the values.

Ludwig

On 07/23/2015 10:21 AM, Marisa Sandhoff wrote:

Hi Sumit,


I'm not a 389ds expert but in my setup nsslapd-cachememsize is set to
10M and since I didn't do any tuning I would expect that this is some
default.


Perhaps we should start with increasing the nsslapd-cachememsize to 10M
and then see what happens with our server. Actually, how can we increase
this cachememsize?

Thanks for your help,
Torsten and Marisa



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Keeping a Tuesday fun - replication? without replication?

2015-08-04 Thread Ludwig Krispenz


On 08/04/2015 05:40 PM, Rob Crittenden wrote:

Janelle wrote:

Hello again,

Just to keep your Tuesday fun, is this possible:

16 servers.
ipa-replica-manage list  < shows all 16

1 of the servers broke a couple of weeks ago and was removed with
"clean-ruv" but STILL shows up in the replica list, but not a single
master has a replica agreement with it, so there is no way to delete it
since trying to do "ipa-replica-manage del" with any options, including
force, from ANY servers says there is no replica agreement.  How is this
possible and how do I get rid of the phantom replica? and I did try
--cleanup and it took it, but did nothing. And there is NOTHING in the
logs??

To further clarify, it is not a CA either, and never was.

Very confusing indeed. I just like to keep the developers on their toes.
:-)

Don't know if I want to know the answer, but is it contained in the RUVs?


list shows those entries in cn=masters,cn=ipa,cn=etc,$SUFFIX. It 
doesn't show agreements or topology.


What output do you see when --cleanup is used?

You should check the 389-ds access log after this is run as well to 
see what searches and mods were attempted.


rob



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] Keeping a Tuesday fun - replication? without replication?

2015-08-04 Thread Ludwig Krispenz

Hi
On 08/04/2015 06:14 PM, Janelle wrote:



On 8/4/15 9:06 AM, Ludwig Krispenz wrote:


On 08/04/2015 05:40 PM, Rob Crittenden wrote:

Janelle wrote:

Hello again,

Just to keep your Tuesday fun, is this possible:

16 servers.
ipa-replica-manage list  < shows all 16

1 of the servers broke a couple of weeks ago and was removed with
"clean-ruv" but STILL shows up in the replica list, but not a single
master has a replica agreement with it, so there is no way to delete it
since trying to do "ipa-replica-manage del" with any options, including
force, from ANY servers says there is no replica agreement. How is this
possible and how do I get rid of the phantom replica? and I did try
--cleanup and it took it, but did nothing. And there is NOTHING in the
logs??

To further clarify, it is not a CA either, and never was.

Very confusing indeed. I just like to keep the developers on their 
toes.

:-)
Don't know if I want to know the answer, but is it contained in the RUVs?
No. That is why I am baffled. I want to re-add the server to help with 
loading, but obviously not while it still shows up - so weird. Looks like 
ldapmodify is going to be required. I don't even have any strange 
CSNs/replicas that can't be decoded in list-ruv.
You probably ran into this issue: 
https://fedorahosted.org/freeipa/ticket/5019


ipa-replica-manage del failed to delete the master because it did not 
remove all services first. If you want to do it by ldapmodify, check 
what services are there below the master entry and remove these before 
removing the master (see the sketch below).
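
For example (a sketch - hostname and suffix are placeholders, and the
service entries you find below the master may differ):

# list what is still stored below the master entry
ldapsearch -x -D "cn=directory manager" -W \
    -b "cn=ipa001.example.com,cn=masters,cn=ipa,cn=etc,dc=example,dc=com" dn

# delete the service entries first, then the master entry itself
ldapdelete -x -D "cn=directory manager" -W \
    "cn=CA,cn=ipa001.example.com,cn=masters,cn=ipa,cn=etc,dc=example,dc=com" \
    "cn=ipa001.example.com,cn=masters,cn=ipa,cn=etc,dc=example,dc=com"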


~J


list shows the those entries in cn=masters,cn=ipa,cn=etc,$SUFFIX. It 
doesn't show agreements or topology.


What output do you see when --cleanup is used?

You should check the 389-ds access log after this is run as well to 
see what searches and mods were attempted.


rob







--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] stubborn old replicas

2015-08-27 Thread Ludwig Krispenz


On 08/27/2015 09:08 AM, Martin Kosek wrote:

On 08/26/2015 05:31 PM, Simo Sorce wrote:

On Wed, 2015-08-26 at 06:36 -0700, Janelle wrote:

Hello all,

My biggest problem is losing replicas and then trying to delete the
entries and rebuild them. Here is a perfect example, I simply can't get
rid of these  (see below). I have tried (of course after the ORIGINAL
"ipa-replica-manage del hostname --force --clean":

ipa-replica-manage clean-ruv 25

ldapmodify... with:
dn: cn=clean 25, cn=cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
replica-id: 25
cn: clean 25

And yet nothing works. Any suggestions? This is perhaps the most
frustrating part about maintaining IPA.

~J

unable to decode: {replica 12} 5588dc2e000c 559f3de60004000c
unable to decode: {replica 14} 5587aa8d000e 5587aa8d0003000e
unable to decode: {replica 16} 5588f58f0010 55bb7b0800050010
unable to decode: {replica 25} 55a4887b0019 55a4924200040019
unable to decode: {replica 29} 55d199a50001001d 55d199a50001001d
unable to decode: {replica 3} 5587c5c30003 55b8a04900010003
unable to decode: {replica 5} 55cc82ab041d0005 55cc82ab041d0005

Have you tried restarting DS before trying to clean the ruv ?

I run in a similar problem in a test install recently, and I got better
results that way. The bug is known to the DS people and they are working
to get out patches that fix the root issue.

Simo.

CCing DS folks. Wasn't there a recent DS fix that was supposed to improve the
RUV situation?

Looking at 389 DS Trac, I see some interesting RUV fixes in 1.3.4.x releases:

https://fedorahosted.org/389/query?summary=~RUV&status=closed&order=milestone&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone

I see that 389-ds-base-1.3.4.3 is already in Fedora 22+, does the RUV issue
happen there?

it should not, and I think Thierry verified the fix.
The problem we resolved, and which we think is the core of the corrupted 
RUV, was that the cleanallruv task only purged the RUV but did not purge 
the changelog. If cleanallruv was run and the server had a disorderly 
shutdown (crash, or abort when shutdown was hanging), then at restart the 
changelog RUV was rebuilt from the data in the changelog, and if it 
contained a csn from cleaned RIDs this was added to the RUV (but the 
reference to the server was lost, so the url part is missing from this 
RUV).
The fix now removes all references to the cleaned RID from the changelog, 
and the problem should not reoccur with RIDs cleaned with the fix. Of 
course the changelog can still contain references to RIDs cleaned before 
the fix - and if no changelog trimming is configured this is what will 
happen. So even after the fix, old RUVs could pop up and have to be 
(finally) cleaned.


The other source is that these corrupted RUVs can be "imported" from 
another server by exchanging RUVs in the replication protocol. Cleanallruv 
tries to address this and to propagate the cleanallruv tasks to all 
servers it thinks are connected. If there are replication agreements to 
servers which no longer exist, or to servers which cannot be contacted, 
this will delay the RUV cleaning.


--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] replicas unresponsive with increasing file descriptors

2015-09-01 Thread Ludwig Krispenz


On 09/01/2015 04:39 PM, Andrew E. Bruno wrote:

A few months ago we had a replica failure where the system ran out of file
descriptors and the slapd database was corrupted:

https://www.redhat.com/archives/freeipa-users/2015-June/msg00389.html

We now monitor file descriptor counts on our replicas and last night we
had 2 of our 3 replicas fail and become completely unresponsive. Trying
to kinit on the replica resulted in:

[user@ipa-master]$ kinit
kinit: Generic error (see e-text) while getting initial credentials


Snippet from the /var/log/dirsrv/slapd-[domain]/errors:

[31/Aug/2015:17:14:39 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Warning: 
Attempting to release replica, but unable to receive endReplication extended operation 
response from the replica. Error -5 (Timed out)
[31/Aug/2015:17:16:39 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.
[31/Aug/2015:17:18:42 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.
[31/Aug/2015:17:20:42 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.
[31/Aug/2015:17:22:47 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.
[31/Aug/2015:17:24:47 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.
[31/Aug/2015:17:24:47 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Incremental 
protocol: event backoff_timer_expired should not occur in state start_backoff
[31/Aug/2015:17:26:50 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.
[31/Aug/2015:17:28:50 -0400] NSMMReplicationPlugin - 
agmt="cn=meTosrv-m14-30.cbls.ccr.buffalo.edu" (srv-m14-30:389): Unable to 
receive the response for a startReplication extended operation to consumer (Timed out). 
Will retry later.

The access logs were filling up with:

[31/Aug/2015:17:13:17 -0400] conn=1385990 fd=449 slot=449 connection from 
10.106.14.29 to 10.113.14.30
[31/Aug/2015:17:13:18 -0400] conn=1385991 fd=450 slot=450 connection from 
10.104.9.137 to 10.113.14.30
[31/Aug/2015:17:13:18 -0400] conn=1385992 fd=451 slot=451 connection from 
10.104.16.19 to 10.113.14.30
[31/Aug/2015:17:13:21 -0400] conn=1385993 fd=452 slot=452 connection from 
10.111.11.30 to 10.113.14.30
[31/Aug/2015:17:13:24 -0400] conn=1385994 fd=453 slot=453 connection from 
10.113.27.115 to 10.113.14.30
[31/Aug/2015:17:13:27 -0400] conn=1385995 fd=454 slot=454 connection from 
10.111.8.116 to 10.113.14.30
[31/Aug/2015:17:13:27 -0400] conn=1385996 fd=514 slot=514 connection from 
10.113.25.40 to 10.113.14.30
[31/Aug/2015:17:13:29 -0400] conn=1385997 fd=515 slot=515 connection from 
10.106.14.27 to 10.113.14.30
[31/Aug/2015:17:13:29 -0400] conn=1385998 fd=516 slot=516 connection from 
10.111.10.141 to 10.113.14.30
[31/Aug/2015:17:13:30 -0400] conn=1385999 fd=528 slot=528 connection from 
10.104.14.27 to 10.113.14.30
[31/Aug/2015:17:13:31 -0400] conn=1386000 fd=529 slot=529 connection from 
10.106.13.132 to 10.113.14.30
[31/Aug/2015:17:13:31 -0400] conn=1386001 fd=530 slot=530 connection from 
10.113.25.11 to 10.113.14.30
[31/Aug/2015:17:13:31 -0400] conn=1386002 fd=531 slot=531 connection from 
10.104.15.11 to 10.113.14.30
[31/Aug/2015:17:13:32 -0400] conn=1386003 fd=533 slot=533 connection from 
10.104.7.136 to 10.113.14.30
[31/Aug/2015:17:13:33 -0400] conn=1386004 fd=534 slot=534 connection from 
10.113.24.23 to 10.113.14.30
[31/Aug/2015:17:13:33 -0400] conn=1386005 fd=535 slot=535 connection from 
10.106.12.105 to 10.113.14.30
[31/Aug/2015:17:13:33 -0400] conn=1386006 fd=536 slot=536 connection from 
10.104.16.41 to 10.113.14.30
[31/Aug/2015:17:13:34 -0400] conn=1386007 fd=537 slot=537 connection from 
10.104.16.4 to 10.113.14.30
[31/Aug/2015:17:13:35 -0400] conn=1386008 fd=538 slot=538 connection from 
10.111.8.12 to 10.113.14.30
[31/Aug/2015:17:13:36 -0400] conn=1386009 fd=539 slot=539 connection from 
10.111.8.17 to 10.113.14.30



Seems like clients were connecting to the replicas but file descriptors were
not getting released. Our monitoring showed increasing file descriptor counts
on both re

Re: [Freeipa-users] stubborn old replicas

2015-09-02 Thread Ludwig Krispenz

Hi Janelle,
On 09/01/2015 06:17 PM, Janelle wrote:

On 8/28/15 8:17 AM, Vaclav Adamec wrote:
You could try this (RH recommended way). It works for me better than 
cleanallruv.pl, as that sometimes leads to an LDAP freeze:


unable to decode: {replica 30} 5548fa20001e 5548fa20001e
unable to decode: {replica 26} 5548a9a8001a 5548a9a8001a


for all of them, one-by-one:

ldapmodify -x -D "cn=directory manager" -w XXX
dn: cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config
changetype: modify
replace: nsds5task
nsds5task: CLEANRUV30



On Fri, Aug 28, 2015 at 4:55 PM, Guillermo Fuentes 
 wrote:


Hi Janelle,

Using the cleanallruv.pl tool was the
only way I was able to get rid of the "unable to decode:
{replica x}" entries.

This is how I used it, cleaning one replica ID at a time:

# For replica id: 40
cleanallruv.pl -v -D "cn=directory manager" -w - -b 'dc=example,dc=com' -r 40

Note that the "-w -" will make the tool prompt you for the
directory manager password.

Hope this helps,
Guillermo


On Thu, Aug 27, 2015 at 10:27 AM, Janelle
 wrote:

On 8/27/15 1:05 AM, thierry bordaz wrote:

On 08/27/2015 09:41 AM, Ludwig Krispenz wrote:


On 08/27/2015 09:08 AM, Martin Kosek wrote:

On 08/26/2015 05:31 PM, Simo Sorce wrote:

On Wed, 2015-08-26 at 06:36 -0700, Janelle wrote:

Hello all,

My biggest problem is losing replicas and then trying to
delete the
entries and rebuild them. Here is a perfect example, I
simply can't get
rid of these  (see below). I have tried (of course after
the ORIGINAL
"ipa-replica-manage del hostname --force --clean":

ipa-replica-manage clean-ruv 25

ldapmodify... with:
dn: cn=clean 25, cn=cleanallruv, cn=tasks, cn=config
objectclass: extensibleObject
replica-base-dn: dc=example,dc=com
replica-id: 25
cn: clean 25

And yet nothing works. Any suggestions? This is perhaps
the most
frustrating part about maintaining IPA.

~J

unable to decode: {replica 12} 5588dc2e000c 559f3de60004000c
unable to decode: {replica 14} 5587aa8d000e 5587aa8d0003000e
unable to decode: {replica 16} 5588f58f0010 55bb7b0800050010
unable to decode: {replica 25} 55a4887b0019 55a4924200040019
unable to decode: {replica 29} 55d199a50001001d 55d199a50001001d
unable to decode: {replica 3} 5587c5c30003 55b8a04900010003
unable to decode: {replica 5} 55cc82ab041d0005 55cc82ab041d0005

Have you tried restarting DS before trying to clean the
ruv ?

I run in a similar problem in a test install recently,
and I got better
results that way. The bug is known to the DS people and
they are working
to get out patches that fix the root issue.

Simo.

CCing DS folks. Wasn't there a recent DS fix that was
supposed to improve the
RUV situation?

Looking at 389 DS Trac, I see some interesting RUV fixes
in 1.3.4.x releases:


https://fedorahosted.org/389/query?summary=~RUV&status=closed&order=milestone&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone

<https://fedorahosted.org/389/query?summary=%7ERUV&status=closed&order=milestone&col=id&col=summary&col=status&col=owner&col=type&col=priority&col=milestone>


I see that 389-ds-base-1.3.4.3 is already in Fedora 22+,
does the RUV issue
happen there?

it should not, and I think Thierry verified the fix.
The problem we resolved and which we think is the core of
the corrupted RUV was that the cleanallruv task did only
purge the RUV, but dit not purge the changelog. If
cleanallruv was run and the server had a disorderly
shutdown (crash or abort when shutdown was hanging) then at
restart the changelog RUV was rebuilt from the data in the
changelog and if it contained a csn from cleaned RIDs this
was added to the RUV (but the reference to the server was
lost and so the url part is missing from this RUV.
The fix now does remove all references to the cleaned RID
from the changelog and the problem should not reoccur with
RIDs cleaned with the fix, of course th echangelog can
still can contain references to RIDs cleaned before the fix
- and if no changelog trimming is configured this

Re: [Freeipa-users] Problem with replication?

2015-09-04 Thread Ludwig Krispenz


On 09/04/2015 04:37 PM, Christoph Kaminski wrote:

Hi

we have a lot of these messages in the error log of dirsrv... What can 
be the problem and how can we fix it?


our (first) master (ipa-1.mgmt.biotronik-homemonitoring.int):
[04/Sep/2015:16:06:41 +0200] ipalockout_postop - [file ipa_lockout.c, 
line 503]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.datacenter-homemonitoring.int-pki-tomcat,ou=csusers,cn=config":32 

[04/Sep/2015:16:08:00 +0200] ipalockout_preop - [file ipa_lockout.c, 
line 749]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config": 32
[04/Sep/2015:16:08:00 +0200] ipalockout_postop - [file ipa_lockout.c, 
line 503]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config": 32
[04/Sep/2015:16:11:41 +0200] ipalockout_preop - [file ipa_lockout.c, 
line 749]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.datacenter-homemonitoring.int-pki-tomcat,ou=csusers,cn=config":32 

[04/Sep/2015:16:11:41 +0200] ipalockout_postop - [file ipa_lockout.c, 
line 503]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.datacenter-homemonitoring.int-pki-tomcat,ou=csusers,cn=config":32 

[04/Sep/2015:16:13:00 +0200] ipalockout_preop - [file ipa_lockout.c, 
line 749]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config": 32
[04/Sep/2015:16:13:00 +0200] ipalockout_postop - [file ipa_lockout.c, 
line 503]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config": 32
[04/Sep/2015:16:16:40 +0200] ipalockout_preop - [file ipa_lockout.c, 
line 749]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.datacenter-homemonitoring.int-pki-tomcat,ou=csusers,cn=config":32 

[04/Sep/2015:16:16:40 +0200] ipalockout_postop - [file ipa_lockout.c, 
line 503]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.datacenter-homemonitoring.int-pki-tomcat,ou=csusers,cn=config":32 

[04/Sep/2015:16:18:00 +0200] ipalockout_preop - [file ipa_lockout.c, 
line 749]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config": 32
[04/Sep/2015:16:18:00 +0200] ipalockout_postop - [file ipa_lockout.c, 
line 503]: Failed to retrieve entry "cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config": 32


one of our other ipa's (ipa-1.mgmt.datacenter-homemonitoring.int):
[04/Sep/2015:16:21:41 +0200] slapi_ldap_bind - Error: could not bind 
id [cn=Replication Manager 
masterAgreement1-ipa-1.mgmt.datacenter-homemonitoring.int-pki-tomcat,ou=csusers,cn=config] 
authentication mechanism [SIMPLE]: error 32 (No such object) errno 0 
(Success)
this means you somehow lost the user used for authentication in replication. 
You could try to add it back; as a template, use an existing user in 
ou=csusers,cn=config
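For illustration, such an entry typically looks roughly like the sketch
below. The attribute values are assumptions - copy the objectclasses and the
password setup from a surviving entry on a healthy master rather than
trusting this template, since the password must match what the replication
agreement on the other side uses:

dn: cn=Replication Manager masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat,ou=csusers,cn=config
objectClass: top
objectClass: person
cn: Replication Manager masterAgreement1-ipa-1.mgmt.hss.int-pki-tomcat
sn: manager
userPassword: ...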


Greetz
Christoph Kaminski






-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Faulty LDAP record

2015-09-04 Thread Ludwig Krispenz


On 09/04/2015 04:49 PM, Christoph Kaminski wrote:

Hi All,

how can I delete a faulty user in IPA 4.1? The record in LDAP looks 
like this:

nsuniqueid=a69f868e-4b4411e5-99ef9ac3-776749aa+uid=zimt,cn=users,cn=accounts,dc=hso
this is a replication conflict entry; the user uid=zimt was added in 
parallel on two servers. You should be able to delete it with ldapmodify:


ldapmodify -x -D "cn=directory manager" -W <<EOF
dn: nsuniqueid=a69f868e-4b4411e5-99ef9ac3-776749aa+uid=zimt,cn=users,cn=accounts,dc=hso
changetype: delete
EOF
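To find any remaining conflict entries in the tree, a search on the
nsds5ReplConflict attribute should work - a sketch, assuming the suffix
dc=hso:

ldapsearch -x -D "cn=directory manager" -W -b "dc=hso" "(nsds5ReplConflict=*)" nsds5ReplConflict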



It is not possible to delete it over the WebUI and with LDAP Browser I 
get this error:


Deleting is not possible, the following error appears:
Error while deleting entry LDAP: error code 32 - No Such Object

Greetz
Christoph Kaminski







-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] user delete command hangs kdc and ldap stop responding

2015-09-18 Thread Ludwig Krispenz


On 09/18/2015 12:24 AM, HECTOR LOPEZ wrote:

This is rhel 7.1 with ipa version 4.1.0

user-show shows the user. However, if the user contains the 
ipaNTSecurityIdentifier attribute, user-del hangs with no response.


Meanwhile, the KDC and 389ds stop working. The only way to recover 
functionality is to reboot the machine.  ipactl restart does nothing.

If it hangs again, could you get a pstack of the slapd process?
If you then kill slapd, does ipactl restart work?
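For example - a sketch, assuming the gdb package (which provides pstack on
RHEL) is installed:

pstack $(pidof ns-slapd) > /tmp/slapd-stacks.txt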


In the ldap access log I see this when trying to delete user sclown:

[14/Sep/2015:09:28:27 -0700] conn=326 op=18 RESULT err=0 tag=101 
nentries=0 etime=0
[14/Sep/2015:09:28:27 -0700] conn=326 op=19 DEL 
dn="uid=sclown,cn=users,cn=accounts,dc=some,dc=domain,dc=org"
[14/Sep/2015:09:30:03 -0700] conn=12 op=442 MOD 
dn="cn=MasterCRL,ou=crlIssuingPoints,ou=ca,o=ipaca"
[14/Sep/2015:09:30:03 -0700] conn=12 op=442 RESULT err=1 tag=103 
nentries=0 etime=0
[14/Sep/2015:09:30:06 -0700] conn=20 op=288 SRCH 
base="ou=sessions,ou=Security Domain,o=ipaca" scope=2 
filter="(objectClass=securityDomainSessionEntry)" attrs="cn"
[14/Sep/2015:09:30:06 -0700] conn=20 op=288 RESULT err=32 tag=101 
nentries=0 etime=0
[14/Sep/2015:09:30:08 -0700] conn=12 op=444 SRCH 
base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 
filter="(certStatus=INVALID)" attrs="objectClass serialno notBefore 
notAfter duration extension subjectName userCertificate version 
algorithmId signingAlgorithmId publicKeyData"

[14/Sep/2015:09:30:08 -0700] conn=12 op=444 SORT notBefore
[14/Sep/2015:09:30:08 -0700] conn=12 op=444 VLV 200:0:20150914093009Z 
1:0 (0)
[14/Sep/2015:09:30:08 -0700] conn=12 op=444 RESULT err=0 tag=101 
nentries=0 etime=0
[14/Sep/2015:09:30:08 -0700] conn=12 op=445 SRCH 
base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 
filter="(certStatus=VALID)" attrs="objectClass serialno notBefore 
notAfter duration extension subjectName userCertificate version 
algorithmId signingAlgorithmId publicKeyData"

[14/Sep/2015:09:30:08 -0700] conn=12 op=445 SORT notAfter
[14/Sep/2015:09:30:08 -0700] conn=12 op=445 VLV 200:0:20150914093009Z 
1:10 (0)
[14/Sep/2015:09:30:08 -0700] conn=12 op=445 RESULT err=0 tag=101 
nentries=1 etime=0
[14/Sep/2015:09:30:08 -0700] conn=12 op=446 SRCH 
base="ou=certificateRepository,ou=ca,o=ipaca" scope=1 
filter="(certStatus=REVOKED)" attrs="objectClass revokedOn serialno 
revInfo notAfter notBefore duration extension subjectName 
userCertificate version algorithmId signingAlgorithmId publicKeyData"
[14/Sep/2015:09:30:08 -0700] conn=12 op=446 VLV 200:0:20150914093009Z 
0:0 (0)
[14/Sep/2015:09:30:08 -0700] conn=12 op=446 RESULT err=0 tag=101 
nentries=0 etime=0 notes=U
[14/Sep/2015:09:30:08 -0700] conn=12 op=447 SRCH 
base="ou=certificateRepository,ou=ca,o=ipaca" scope=0 
filter="(|(objectClass=*)(objectClass=ldapsubentry))" attrs="description"
[14/Sep/2015:09:30:08 -0700] conn=12 op=447 RESULT err=0 tag=101 
nentries=1 etime=0

[14/Sep/2015:09:30:19 -0700] conn=322 op=6 UNBIND

Then in the ldap error log I see this, which makes me think there is a 
problem with the changelog:


[14/Sep/2015:09:30:03 -0700] - dn2entry_ext: Failed to get id for 
changenumber=91314,cn=changelog from entryrdn index (-30993)
[14/Sep/2015:09:30:03 -0700] - Operation error fetching 
changenumber=91314,cn=changelog (null), error -30993.
[14/Sep/2015:09:30:03 -0700] DSRetroclPlugin - replog: an error 
occured while adding change number 91314, dn = 
changenumber=91314,cn=changelog: Operations error.
[14/Sep/2015:09:30:03 -0700] retrocl-plugin - retrocl_postob: 
operation failure [1]


After this both kdc and ldap stop responding. In the krb5kdc.log I see 
server errors after the user-del command is run. The only way to 
resume normal operations is to restart the whole machine. ipactl 
restart doesn't work.


Any help would be highly appreciated!




-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] How to turn off RC4 in 389ds???

2015-09-23 Thread Ludwig Krispenz


On 09/23/2015 05:05 PM, Michael Lasevich wrote:
Yes, I am talking about 389ds as it is integrated in FreeIPA (it would be 
silly to post completely non-IPA questions to this list...).
I am running FreeIPA 4.1.4 on CentOS 7.1, and RC4 is enabled on port 
636 no matter what I do.


I am running "CentOS Linux release 7.1.1503 (Core)"

Relevant Packages:

freeipa-server-4.1.4-1.el7.centos.x86_64
389-ds-base-1.3.3.8-1.el7.centos.x86_64
nss-3.19.1-5.el7_1.x86_64
openssl-1.0.1e-42.el7.9.x86_64

LDAP setting (confirmed that in error.log there is no mention of RC4 
in the list of ciphers):


nsSSL3Ciphers: 
-rc4,-rc4export,-rc2,-rc2export,-des,-desede3,-rsa_rc4_128_md5,-rsa_rc4_128_sha,+rsa_3des_sha,-rsa_des_sha,+rsa_fips_3des_sha,+fips_3des_sha,-rsa_fips_des_sha,-fips_des_sha,-rsa_rc4_40_md5,-rsa_rc2_40_md5,-rsa_null_md5,-rsa_null_sha,-tls_rsa_export1024_with_rc4_56_sha,-rsa_rc4_56_sha,-tls_rsa_export1024_with_des_cbc_sha,-rsa_des_56_sha,-fortezza,-fortezza_rc4_128_sha,-fortezza_null,-dhe_dss_des_sha,+dhe_dss_3des_sha,-dhe_rsa_des_sha,+dhe_rsa_3des_sha,+tls_rsa_aes_128_sha,+rsa_aes_128_sha,+tls_dhe_dss_aes_128_sha,+tls_dhe_rsa_aes_128_sha,+tls_rsa_aes_256_sha,+rsa_aes_256_sha,+tls_dhe_dss_aes_256_sha,+tls_dhe_rsa_aes_256_sha,-tls_dhe_dss_1024_rc4_sha,-tls_dhe_dss_rc4_128_sha



With IPA, the config entry should contain:

dn: cn=encryption,cn=config
allowWeakCipher: off
nsSSL3Ciphers: +all

Could you try this setting?
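A sketch of applying it with ldapmodify (restart dirsrv afterwards so the
cipher change takes effect):

ldapmodify -x -D "cn=directory manager" -W <<EOF
dn: cn=encryption,cn=config
changetype: modify
replace: allowWeakCipher
allowWeakCipher: off
-
replace: nsSSL3Ciphers
nsSSL3Ciphers: +all
EOF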


Slapd "error" log showing no ciphersuites supporting RC4:

[23/Sep/2015:08:51:04 -0600] SSL Initialization - Configured SSL 
version range: min: TLS1.0, max: TLS1.2
[23/Sep/2015:08:51:04 -0600] - SSL alert: Cipher suite fortezza is not 
available in NSS 3.16.  Ignoring fortezza
[23/Sep/2015:08:51:04 -0600] - SSL alert: Cipher suite 
fortezza_rc4_128_sha is not available in NSS 3.16. Ignoring 
fortezza_rc4_128_sha
[23/Sep/2015:08:51:04 -0600] - SSL alert: Cipher suite fortezza_null 
is not available in NSS 3.16.  Ignoring fortezza_null

[23/Sep/2015:08:51:04 -0600] - SSL alert: Configured NSS Ciphers
[23/Sep/2015:08:51:04 -0600] - SSL alert: 
TLS_DHE_RSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2015:08:51:04 -0600] - SSL alert: 
TLS_DHE_DSS_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2015:08:51:04 -0600] - SSL alert: 
TLS_DHE_RSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2015:08:51:04 -0600] - SSL alert: 
TLS_DHE_DSS_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2015:08:51:04 -0600] - SSL alert: 
TLS_RSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2015:08:51:04 -0600] - SSL alert: 
TLS_RSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2015:08:51:04 -0600] - 389-Directory/1.3.3.8  
B2015.040.128 starting up



But sslscan returns:

$ sslscan --no-failed localhost:636
...

Supported Server Cipher(s):

Accepted  TLSv1  256 bits AES256-SHA
Accepted  TLSv1  128 bits  AES128-SHA
Accepted  TLSv1  128 bits  DES-CBC3-SHA
Accepted  TLSv1  128 bits  RC4-SHA
Accepted  TLSv1  128 bits  RC4-MD5
Accepted  TLS11  256 bits  AES256-SHA
Accepted  TLS11  128 bits  AES128-SHA
Accepted  TLS11  128 bits  DES-CBC3-SHA
Accepted  TLS11  128 bits  RC4-SHA
Accepted  TLS11  128 bits  RC4-MD5
Accepted  TLS12  256 bits  AES256-SHA256
Accepted  TLS12  256 bits  AES256-SHA
Accepted  TLS12  128 bits  AES128-GCM-SHA256
Accepted  TLS12  128 bits  AES128-SHA256
Accepted  TLS12  128 bits  AES128-SHA
Accepted  TLS12  128 bits  DES-CBC3-SHA
Accepted  TLS12  128 bits  RC4-SHA
Accepted  TLS12  128 bits  RC4-MD5

...


I would assume the sslscan is broken, but nmap and other scanners all 
confirm that RC4 is still on.


-M


On Wed, Sep 23, 2015 at 3:35 AM, Martin Kosek wrote:


On 09/23/2015 11:00 AM, Michael Lasevich wrote:
> OK, this is most bizarre issue,
>
> I am trying to disable RC4 based TLS Cipher Suites in LDAPs(port
636) and
> for the life of me cannot get it to work
>
> I have followed many nearly identical instructions to create
ldif file and
> change "nsSSL3Ciphers" in "cn=encryption,cn=config". Seems
simple enough -
> and I get it to take, and during the startup I can see the right
SSL Cipher
> Suites listed in errors.log - but when it starts and I probe it, RC4
> ciphers are still there. I am completely confused.
>
> I tried setting "nsSSL3Ciphers" to "default" (which does not
have "RC4")
> and to old style cyphers lists(lowercase), and new style cypher
> lists(uppercase), and nothing seems to make any difference.
>
> Any ideas?
>
> -M

Are you asking about standalone 389-DS or the one integrated in
FreeIPA? As
with currently supported versions of FreeIPA, RC4 ciphers should
be already
gone, AFAIK.

In RHEL/CentOS world, it should be fixed in 6.7/7.1 or later:

https://bugzilla.redhat.com/show_bug.cgi?id=1154687
https://fedorahosted.org/freeipa/ticket/4653






-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] Problem with replica

2015-09-24 Thread Ludwig Krispenz

Hi,

can you try to get a core dump:

http://directory.fedoraproject.org/docs/389ds/FAQ/faq.html#debug_crashes

and open a ticket for 389 DS: https://fedorahosted.org/389/newticket
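In short, on RHEL/CentOS that FAQ boils down to something like the following
sketch; the core location is an arbitrary choice here, and on systemd-only
setups you may instead need LimitCORE=infinity in the dirsrv unit (see the
link for the authoritative steps):

# as root, then restart dirsrv
echo 'ulimit -c unlimited' >> /etc/sysconfig/dirsrv
echo '/var/tmp/core.%e.%p' > /proc/sys/kernel/core_pattern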

Ludwig

On 09/24/2015 09:08 AM, Nicola Canepa wrote:
Hello, I'm trying to set up a partial replica of the LDAP tree stored 
in 389-ds by FreeIPA 4.1 (under CentOS 7), so that legacy systems have 
a local copy of the data needed to authenticate.
Those systems already have OpenLDAP installed, so I'm trying to 
enable syncrepl from DS to OL.
I followed this ticket: https://fedorahosted.org/freeipa/ticket/3967 
and I enabled the 2 plugins as indicated.
When the slave starts and tries to sync, the ns-slapd process on 
FreeIPA server dies, with this in syslog:
kernel: ns-slapd[4801]: segfault at 0 ip 7f0f041f2db6 sp 
7f0ecc7f0f38 error 4 in libc-2.17.so[7f0f0416e000+1b6000]

immediately (same second) followed by:
named[1974]: LDAP error: Can't contact LDAP server: ldap_sync_poll() 
failed

named[1974]: ldap_syncrepl will reconnect in 60 seconds
systemd: dirsrv@XXX.service: main process exited, code=killed, 
status=11/SEGV


There is nothing in access or error log (found in 
/var/log/dirsrv/INSTANCE) at that second (last log is 30 seconds 
before the problem).


Even if the replica doesn't work, I think it shouldn't kill the daemon.


The ldif used on the slave:

dn: olcDatabase={1}bdb,cn=config
changetype: modify
replace: olcSyncrepl
olcSyncrepl: rid=0001
  provider=ldap://AAA.TLD
  type=refreshOnly
  interval=00:1:00:00
  retry="5 5 300 +"
  searchbase="YYY"
  attrs="*,+"
  bindmethod=simple
  binddn="uid=XXX,cn=users,cn=accounts,dc=YYY"
  credentials=ZZZ



Nicola



--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] multiple ds instances (maybe off-topic)

2016-06-28 Thread Ludwig Krispenz


On 06/28/2016 09:50 AM, Natxo Asenjo wrote:



On Tue, Jun 28, 2016 at 9:07 AM, Alexander Bokovoy wrote:


On Tue, 28 Jun 2016, Natxo Asenjo wrote:

hi,

according to the RHDS documentation (

https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/8.1/html-single/Using_the_Admin_Server/index.html)
one can have multiple directory server instances on the same hosts

Would it be interesting to offer this functionality in
freeipa.org ? The
business case would be to allow different kinds of
authentication per
instance/port. So one could block standard ldap connections on
port 389 to
the internet, for instance, but allow them on another port
only if using
external/GSSAPI auth, so no passswords would be involved.

This is not how instances work in 389-ds. Each instance is fully
independent of another one, including database content and structure.
You cannot have instance that shares the same content with another one
unless you enable database chaining (and then there are some
limitations).


ok, thanks for the info.

We used to have CA instance separate from the main IPA instance, for
example, but then merged them together in the same instance using two
different backends.

Standard IPA 389-ds instance already allows its access on the unix
domain
socket with EXTERNAL/GSSAPI authentication. It is visible only within
the scope of the IPA master host, of course.

I'm still not sure what exactly you would like to achieve. All ports
that 389-ds listens to do support the same authentication methods
except
LDAPI protocol (unix domain sockets) which supports automapping
between
POSIX ID and a user object that it maps to.


I'd like to have internally all sorts of LDAP access, but externally 
only certificate-based, for example.


If there is a way to do that that I am not aware of, I'd be very 
interested to know it as well ;-). Right now we solve this problem 
using vpn connections with third parties, but ideally one could just 
open the port to the internet if only that kind of access was allowed.
maybe you can achieve this with access control; there are all kinds of 
rules to allow access based on the client's ip address, domain, security 
strength, authentication method - and combinations of them.
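As an untested sketch of such a rule - the IP pattern and the attribute
list are placeholders - an ACI combining bind rules could look like:

aci: (targetattr = "*")(version 3.0; acl "deny simple binds from outside"; deny (all) (userdn = "ldap:///anyone") and (authmethod = "simple") and (ip != "10.1.2.*");)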



Thanks for your time.

--
regards,
Natxo





--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] multiple ds instances (maybe off-topic)

2016-06-28 Thread Ludwig Krispenz


On 06/28/2016 10:33 AM, Natxo Asenjo wrote:


hi Ludwig,

On Tue, Jun 28, 2016 at 10:03 AM, Ludwig Krispenz wrote:



On 06/28/2016 09:50 AM, Natxo Asenjo wrote:


I'd like to have internally all sorts of LDAP access, but
externally only certificate-based, for example.

If there is a way to do that that I am not aware of, I'd be
very interested to know it as well ;-). Right now we solve this
problem using vpn connections with third parties, but ideally
one could just open the port to the internet if only that kind of
access was allowed.

maybe you can achieve this with access control, there are all kind
of rules to allow access based on client's ip address, domain,
security strength, authentication method - and combinations of them.


Do you mean something like explained here: 
http://directory.fedoraproject.org/docs/389ds/design/rootdn-access-control.html 
?

I was thinking of something like this (and the other bind rules):

https://access.redhat.com/documentation/en-US/Red_Hat_Directory_Server/10/html/Administration_Guide/Managing_Access_Control-Bind_Rules.html#Bind_Rules-Defining_Access_Based_on_Authentication_Method

the link you sent is about restraining access of the directory manager, which 
is not subject to normal acis


Thanks!
--
Groeten,
natxo




--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-06-30 Thread Ludwig Krispenz

can you get a core file ?
http://www.port389.org/docs/389ds/FAQ/faq.html#debug_crashes


On 06/30/2016 11:28 AM, d...@mdfive.dz wrote:

Hi,

The Directory Service crashes several times a day. It's installed on 
a CentOS 7 VM:


Installed Packages
Name: ipa-server
Arch: x86_64
Version : 4.2.0

# ipactl status
Directory Service: STOPPED
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful


Before each crash, I have these messages in 
/var/log/dirsrv/slapd-X/errors :


[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file 
encoding.c, line 171]: generating kerberos keys failed [Invalid argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file encoding.c, 
line 225]: key encryption/encoding failed



Any help?
Best regards



--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-06-30 Thread Ludwig Krispenz


On 06/30/2016 02:27 PM, d...@mdfive.dz wrote:

Hi,

Please find a stack trace from a core file: http://pastebin.com/v9cUzau4

the crash is in an IPA plugin, ipa_pwd_extop;
to get a better stack you would also have to install the debuginfo for 
ipa-server,

and then someone familiar with this plugin should look into it


Regards


On 2016-06-30 12:13, Ludwig Krispenz wrote:

can you get a core file ?
http://www.port389.org/docs/389ds/FAQ/faq.html#debug_crashes


On 06/30/2016 11:28 AM, d...@mdfive.dz wrote:

Hi,

The Directory Services crashes several times a day. It's installed 
on CentOS 7 VM :


Installed Packages
Name: ipa-server
Arch: x86_64
Version : 4.2.0

# ipactl status
Directory Service: STOPPED
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful


Before each crash, I have these messages in 
/var/log/dirsrv/slapd-X/errors :


[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file 
encoding.c, line 171]: generating kerberos keys failed [Invalid 
argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c, line 225]: key encryption/encoding failed



Any help?
Best regards



--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael
O'Neill, Eric Shander


--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-06-30 Thread Ludwig Krispenz


On 06/30/2016 02:45 PM, Ludwig Krispenz wrote:


On 06/30/2016 02:27 PM, d...@mdfive.dz wrote:

Hi,

Please find strace on a core file : http://pastebin.com/v9cUzau4

the crash is in an IPA plugin, ipa_pwd_extop,
to get a better stack you would have to install also the debuginfo for 
ipa-server.

but the stack matches the error messages you have seen:
[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file 
encoding.c, line 171]: generating kerberos keys failed [Invalid argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file encoding.c, 
line 225]: key encryption/encoding failed

they are from the functions in the call stack.

Looks like the user has a password with a \351 char:
cred = {bv_len = 15, bv_val = 0x7fc7880013a0 "d\351sertification"}

does the crash always happen with a bind from this user?
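If you want to test that theory against a non-production instance, a
hypothetical reproduction would be a simple bind whose password carries the
raw 0xE9 byte (\351 octal, a latin-1 'é', not valid UTF-8 on its own); the
bind DN below is a placeholder:

ldapsearch -x -H ldap://localhost -b "" -s base \
  -D "uid=someuser,cn=users,cn=accounts,dc=example,dc=com" \
  -w "$(printf 'd\351sertification')"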


and then someone familiar with this plugin should look into it


Regards


On 2016-06-30 12:13, Ludwig Krispenz wrote:

can you get a core file ?
http://www.port389.org/docs/389ds/FAQ/faq.html#debug_crashes


On 06/30/2016 11:28 AM, d...@mdfive.dz wrote:

Hi,

The Directory Services crashes several times a day. It's installed 
on CentOS 7 VM :


Installed Packages
Name: ipa-server
Arch: x86_64
Version : 4.2.0

# ipactl status
Directory Service: STOPPED
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful


Before each crash, I have these messages in 
/var/log/dirsrv/slapd-X/errors :


[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file 
encoding.c, line 171]: generating kerberos keys failed [Invalid 
argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c, line 225]: key encryption/encoding failed



Any help?
Best regards



--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael
O'Neill, Eric Shander




--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-07-01 Thread Ludwig Krispenz

please keep the discussion on the mailing list
On 07/01/2016 01:17 PM, Omar AKHAM wrote:

Which package to install ? ipa-debuginfo?

yes


2 other crashes last night, with a different user bind this time :

rawdn = 0x7f620003a200 
"uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"

dn = 0x7f62000238b0 "uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"
saslmech = 0x0
cred = {bv_len = 9, bv_val = 0x7f6200034af0 
"nw_PA\250\063\065\067"}

be = 0x7f6254941c20
ber_rc = 
rc = 0
sdn = 0x7f62000313f0
bind_sdn_in_pb = 1
referral = 0x0
errorbuf = '\000' ...
supported = 
pmech = 
authtypebuf = 
"\000\000\000\000\000\000\000\000\370\030\002\000b\177\000\000\360\030\002\000b\177\000\000\320\030\002\000b\177\000\000\001\000
\000\000\000\000\000\000\250\311\377+b\177\000\000\320\352\377+b\177\000\000\200\376\002\000b\177\000\000\262\202\211Rb\177\000\000\260\311\377+b\177\ 

000\000\000\000\000\000\000\000\000\000&\272\200Rb\177\000\000\000\000\000\000\000\000\000\000<\224\204Rb\177\000\000\260\311\377+b\177\000\000\000\00 

0\000\000\000\000\000\000\210\311\377+b\177\000\000\250\311\377+b\177", '\000' 
, "\002\000\000\000 \305\363Tb\177\000\000\377\377\37
7\377\377\377\377\377\320\030\002\000b\177\000\000\000\000\000\000\000\000\000\000~a\003\000b\177", 
'\000' 

bind_target_entry = 0x0



On 2016-06-30 18:16, Ludwig Krispenz wrote:

On 06/30/2016 05:54 PM, d...@mdfive.dz wrote:
The crash is random; sometimes the user binds without problem, 
sometimes it binds and there is the error message of the ipa plugin 
without a dirsrv crash. But when it crashes, this user's bind is found 
in the newly generated core file!

ok, so the user might try or use different passwords. it could be
helpful if you can install the debuginfo for the ipa-server package
and get a new stack. Please post it to the list; you can X out the
credentials in the core, although I think they will not be proper
credentials.

Ludwig


On 2016-06-30 14:50, Ludwig Krispenz wrote:

On 06/30/2016 02:45 PM, Ludwig Krispenz wrote:


On 06/30/2016 02:27 PM, d...@mdfive.dz wrote:

Hi,

Please find strace on a core file : http://pastebin.com/v9cUzau4

the crash is in an IPA plugin, ipa_pwd_extop,
to get a better stack you would have to install also the debuginfo 
for ipa-server.

but tje stack matches the error messages you have seen
[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file
encoding.c, line 171]: generating kerberos keys failed [Invalid
argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file encoding.c,
line 225]: key encryption/encoding failed
they are from the function sin the call stack.

Looks like the user has a password with a \351 char:
cred = {bv_len = 15, bv_val = 0x7fc7880013a0 "d\351sertification"}

does the crash always happen with a bind from this user ?


and then someone familiar with this plugin should look into it


Regards


On 2016-06-30 12:13, Ludwig Krispenz wrote:

can you get a core file ?
http://www.port389.org/docs/389ds/FAQ/faq.html#debug_crashes


On 06/30/2016 11:28 AM, d...@mdfive.dz wrote:

Hi,

The Directory Services crashes several times a day. It's 
installed on CentOS 7 VM :


Installed Packages
Name: ipa-server
Arch: x86_64
Version : 4.2.0

# ipactl status
Directory Service: STOPPED
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful


Before each crash, I have these messages in 
/var/log/dirsrv/slapd-X/errors :


[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - 
[file encoding.c, line 171]: generating kerberos keys failed 
[Invalid argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c, line 225]: key encryption/encoding failed



Any help?
Best regards



-- Red Hat GmbH, http://www.de.redhat.com/, Registered seat: 
Grasbrunn,

Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael
O'Neill, Eric Shander




-- Red Hat GmbH, http://www.de.redhat.com/, Registered seat: 
Grasbrunn,

Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael
O'Neill, Eric Shander


--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

--
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project


Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-07-04 Thread Ludwig Krispenz


On 07/03/2016 03:04 PM, Omar AKHAM wrote:

Where can I find the core file of ipa-server?
you still need to look for the core file of slapd, but IPA deploys 
plugins for slapd, and that is why you need the debuginfo for ipa-server 
for a better analysis of the slapd core.


On 2016-07-01 13:29, Ludwig Krispenz wrote:

please keep the discussion on the mailing list
On 07/01/2016 01:17 PM, Omar AKHAM wrote:

Which package to install ? ipa-debuginfo?

yes


2 other crashes last night, with a different user bind this time :

rawdn = 0x7f620003a200 
"uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"

dn = 0x7f62000238b0 "uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"
saslmech = 0x0
cred = {bv_len = 9, bv_val = 0x7f6200034af0 
"nw_PA\250\063\065\067"}

be = 0x7f6254941c20
ber_rc = 
rc = 0
sdn = 0x7f62000313f0
bind_sdn_in_pb = 1
referral = 0x0
errorbuf = '\000' ...
supported = 
pmech = 
authtypebuf = 
"\000\000\000\000\000\000\000\000\370\030\002\000b\177\000\000\360\030\002\000b\177\000\000\320\030\002\000b\177\000\000\001\000
\000\000\000\000\000\000\250\311\377+b\177\000\000\320\352\377+b\177\000\000\200\376\002\000b\177\000\000\262\202\211Rb\177\000\000\260\311\377+b\177\ 
000\000\000\000\000\000\000\000\000\000&\272\200Rb\177\000\000\000\000\000\000\000\000\000\000<\224\204Rb\177\000\000\260\311\377+b\177\000\000\000\00 
0\000\000\000\000\000\000\210\311\377+b\177\000\000\250\311\377+b\177", 
'\000' , "\002\000\000\000 
\305\363Tb\177\000\000\377\377\37
7\377\377\377\377\377\320\030\002\000b\177\000\000\000\000\000\000\000\000\000\000~a\003\000b\177", 
'\000' 

bind_target_entry = 0x0



On 2016-06-30 18:16, Ludwig Krispenz wrote:

On 06/30/2016 05:54 PM, d...@mdfive.dz wrote:
The crash is random, sometimes the user binds without probleme, 
sometimes it bind and there is the error message of ipa plugin 
without dirsrv crash. But when it crashes, this user's bind is 
found in the new generated core file!

ok, so the user might try or use different passwords. it could be
helpful if you can install the debuginfo for the ipa-server package
and get a new stack. Please post it to teh list, you can X the
credentials in the core, although I think they will not be proper
credentials.

Ludwig


On 2016-06-30 14:50, Ludwig Krispenz wrote:

On 06/30/2016 02:45 PM, Ludwig Krispenz wrote:


On 06/30/2016 02:27 PM, d...@mdfive.dz wrote:

Hi,

Please find strace on a core file : http://pastebin.com/v9cUzau4

the crash is in an IPA plugin, ipa_pwd_extop,
to get a better stack you would have to install also the 
debuginfo for ipa-server.

but tje stack matches the error messages you have seen
[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file
encoding.c, line 171]: generating kerberos keys failed [Invalid
argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c,

line 225]: key encryption/encoding failed
they are from the function sin the call stack.

Looks like the user has a password with a \351 char:
cred = {bv_len = 15, bv_val = 0x7fc7880013a0 "d\351sertification"}

does the crash always happen with a bind from this user ?


and then someone familiar with this plugin should look into it


Regards


On 2016-06-30 12:13, Ludwig Krispenz wrote:

can you get a core file ?
http://www.port389.org/docs/389ds/FAQ/faq.html#debug_crashes


On 06/30/2016 11:28 AM, d...@mdfive.dz wrote:

Hi,

The Directory Services crashes several times a day. It's 
installed on CentOS 7 VM :


Installed Packages
Name: ipa-server
Arch: x86_64
Version : 4.2.0

# ipactl status
Directory Service: STOPPED
krb5kdc Service: RUNNING
kadmin Service: RUNNING
ipa_memcached Service: RUNNING
httpd Service: RUNNING
pki-tomcatd Service: RUNNING
ipa-otpd Service: RUNNING
ipa: INFO: The ipactl command was successful


Before each crash, I have these messages in 
/var/log/dirsrv/slapd-X/errors :


[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - 
[file encoding.c, line 171]: generating kerberos keys failed 
[Invalid argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c, line 225]: key encryption/encoding failed



Any help?
Best regards



-- Red Hat GmbH, http://www.de.redhat.com/, Registered seat: 
Grasbrunn,

Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael
O'Neill, Eric Shander




-- Red Hat GmbH, http://www.de.redhat.com/, Registered seat: 
Grasbrunn,

Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael
O'Neill, Eric Shander


--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander

Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-07-05 Thread Ludwig Krispenz

well, this does not have more information:
#0  0x7efe7167c4c0 in ipapwd_keyset_free () from 
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so

No symbol table info available.
#1  0x7efe7167c742 in ipapwd_encrypt_encode_key () from 
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so

No symbol table info available.
#2  0x7efe7167c9c8 in ipapwd_gen_hashes () from 
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so

No symbol table info available.
#3  0x7efe7167c0a7 in ipapwd_SetPassword () from 
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so

No symbol table info available.
#4  0x7efe7167e458 in ipapwd_pre_bind () from 
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so

No symbol table info available.

and it looks like a bug in the ipapwd plugin; we would have to reproduce 
it and work on a fix. I don't see any immediate relief unless you can 
prevent clients from using passwords containing arbitrary octets.
Please open a ticket to get this worked on: 
https://fedorahosted.org/freeipa/newticket


Ludwig

On 07/05/2016 12:07 AM, Omar AKHAM wrote:

Ok, here is a new core file : http://pastebin.com/2cJQymHd

Best regards

On 2016-07-04 09:39, Ludwig Krispenz wrote:

On 07/03/2016 03:04 PM, Omar AKHAM wrote:

Where can i find core file of ipa-server?

you still need to look for the core file of slapd, but IPA deploys
plugins for slapd and that  is why you need the debuginfo for
ipa-server for a better analysis of the slapd core.


On 2016-07-01 13:29, Ludwig Krispenz wrote:

please keep the discussion on the mailing list
On 07/01/2016 01:17 PM, Omar AKHAM wrote:

Which package to install ? ipa-debuginfo?

yes


2 other crashes last night, with a different user bind this time :

rawdn = 0x7f620003a200 
"uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"
dn = 0x7f62000238b0 
"uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"

saslmech = 0x0
cred = {bv_len = 9, bv_val = 0x7f6200034af0 
"nw_PA\250\063\065\067"}

be = 0x7f6254941c20
ber_rc = 
rc = 0
sdn = 0x7f62000313f0
bind_sdn_in_pb = 1
referral = 0x0
errorbuf = '\000' ...
supported = 
pmech = 
authtypebuf = 
"\000\000\000\000\000\000\000\000\370\030\002\000b\177\000\000\360\030\002\000b\177\000\000\320\030\002\000b\177\000\000\001\000
\000\000\000\000\000\000\250\311\377+b\177\000\000\320\352\377+b\177\000\000\200\376\002\000b\177\000\000\262\202\211Rb\177\000\000\260\311\377+b\177\ 
000\000\000\000\000\000\000\000\000\000&\272\200Rb\177\000\000\000\000\000\000\000\000\000\000<\224\204Rb\177\000\000\260\311\377+b\177\000\000\000\00 
0\000\000\000\000\000\000\210\311\377+b\177\000\000\250\311\377+b\177", 
'\000' , "\002\000\000\000 
\305\363Tb\177\000\000\377\377\37
7\377\377\377\377\377\320\030\002\000b\177\000\000\000\000\000\000\000\000\000\000~a\003\000b\177", 
'\000' 

bind_target_entry = 0x0



On 2016-06-30 18:16, Ludwig Krispenz wrote:

On 06/30/2016 05:54 PM, d...@mdfive.dz wrote:
The crash is random, sometimes the user binds without probleme, 
sometimes it bind and there is the error message of ipa plugin 
without dirsrv crash. But when it crashes, this user's bind is 
found in the new generated core file!

ok, so the user might try or use different passwords. it could be
helpful if you can install the debuginfo for the ipa-server package
and get a new stack. Please post it to teh list, you can X the
credentials in the core, although I think they will not be proper
credentials.

Ludwig


On 2016-06-30 14:50, Ludwig Krispenz wrote:

On 06/30/2016 02:45 PM, Ludwig Krispenz wrote:


On 06/30/2016 02:27 PM, d...@mdfive.dz wrote:

Hi,

Please find strace on a core file : http://pastebin.com/v9cUzau4

the crash is in an IPA plugin, ipa_pwd_extop,
to get a better stack you would have to install also the 
debuginfo for ipa-server.

but tje stack matches the error messages you have seen
[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file
encoding.c, line 171]: generating kerberos keys failed [Invalid
argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c,

line 225]: key encryption/encoding failed
they are from the function sin the call stack.

Looks like the user has a password with a \351 char:
cred = {bv_len = 15, bv_val = 0x7fc7880013a0 "d\351sertification"}

does the crash always happen with a bind from this user ?


and then someone familiar with this plugin should look into it


Regards


On 2016-06-30 12:13, Ludwig Krispenz wrote:

can you get a core file ?
http://www.port389.org/docs/389ds/FAQ/faq.html#debug_crashes


On 06/30/2016 11:28 AM, d...@mdfive.dz wrote:

Hi,

The Directory Services crashes several times a day. It's 
installed on CentOS 7 VM :


Installed Packages
Name: ipa-server
Arch: x86_64
Version : 4.2.0

# ipactl status
Directory Service: STOPPED
krb5kdc Service: RUNNING

Re: [Freeipa-users] FreeIPA (directory service) Crash several times a day

2016-07-05 Thread Ludwig Krispenz


On 07/05/2016 12:08 PM, Omar AKHAM wrote:

OK thanks. Ticket URL : https://fedorahosted.org/freeipa/ticket/6030
thanks, I tried to reproduce it and failed so far. Could you add some 
information to the ticket on:

- how the entry was created
- a full entry which was seen to crash the server; you don't need to 
reveal any real data, just which objectclasses and attributes the entry has


On 2016-07-05 10:51, Ludwig Krispenz wrote:

well, this does not have more information:
#0  0x7efe7167c4c0 in ipapwd_keyset_free () from
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so
No symbol table info available.
#1  0x7efe7167c742 in ipapwd_encrypt_encode_key () from
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so
No symbol table info available.
#2  0x7efe7167c9c8 in ipapwd_gen_hashes () from
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so
No symbol table info available.
#3  0x7efe7167c0a7 in ipapwd_SetPassword () from
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so
No symbol table info available.
#4  0x7efe7167e458 in ipapwd_pre_bind () from
/usr/lib64/dirsrv/plugins/libipa_pwd_extop.so
No symbol table info available.

and it looks like a bug in the ipapwd plugin, we would have to
reproduce and work on a fix. I don't see any immediate relief unless
you cannot prevent clients from using password containing arbitrar
octets.
Please open a ticket to get this worked on:
https://fedorahosted.org/freeipa/newticket

Ludwig

On 07/05/2016 12:07 AM, Omar AKHAM wrote:

Ok, here is a new core file : http://pastebin.com/2cJQymHd

Best regards

On 2016-07-04 09:39, Ludwig Krispenz wrote:

On 07/03/2016 03:04 PM, Omar AKHAM wrote:

Where can i find core file of ipa-server?

you still need to look for the core file of slapd, but IPA deploys
plugins for slapd and that  is why you need the debuginfo for
ipa-server for a better analysis of the slapd core.


On 2016-07-01 13:29, Ludwig Krispenz wrote:

please keep the discussion on the mailing list
On 07/01/2016 01:17 PM, Omar AKHAM wrote:

Which package to install ? ipa-debuginfo?

yes


2 other crashes last night, with a different user bind this time :

rawdn = 0x7f620003a200 
"uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"
dn = 0x7f62000238b0 
"uid=XXX,cn=users,cn=accounts,dc=XXX,dc=XX"

saslmech = 0x0
cred = {bv_len = 9, bv_val = 0x7f6200034af0 
"nw_PA\250\063\065\067"}

be = 0x7f6254941c20
ber_rc = 
rc = 0
sdn = 0x7f62000313f0
bind_sdn_in_pb = 1
referral = 0x0
errorbuf = '\000' ...
supported = 
pmech = 
authtypebuf = 
"\000\000\000\000\000\000\000\000\370\030\002\000b\177\000\000\360\030\002\000b\177\000\000\320\030\002\000b\177\000\000\001\000
\000\000\000\000\000\000\250\311\377+b\177\000\000\320\352\377+b\177\000\000\200\376\002\000b\177\000\000\262\202\211Rb\177\000\000\260\311\377+b\177\ 
000\000\000\000\000\000\000\000\000\000&\272\200Rb\177\000\000\000\000\000\000\000\000\000\000<\224\204Rb\177\000\000\260\311\377+b\177\000\000\000\00 
0\000\000\000\000\000\000\210\311\377+b\177\000\000\250\311\377+b\177", 
'\000' , "\002\000\000\000 
\305\363Tb\177\000\000\377\377\37
7\377\377\377\377\377\320\030\002\000b\177\000\000\000\000\000\000\000\000\000\000~a\003\000b\177", 
'\000' 

bind_target_entry = 0x0



On 2016-06-30 18:16, Ludwig Krispenz wrote:

On 06/30/2016 05:54 PM, d...@mdfive.dz wrote:
The crash is random, sometimes the user binds without 
probleme, sometimes it bind and there is the error message of 
ipa plugin without dirsrv crash. But when it crashes, this 
user's bind is found in the new generated core file!

ok, so the user might try or use different passwords. it could be
helpful if you can install the debuginfo for the ipa-server 
package

and get a new stack. Please post it to teh list, you can X the
credentials in the core, although I think they will not be proper
credentials.

Ludwig


On 2016-06-30 14:50, Ludwig Krispenz wrote:

On 06/30/2016 02:45 PM, Ludwig Krispenz wrote:


On 06/30/2016 02:27 PM, d...@mdfive.dz wrote:

Hi,

Please find strace on a core file : 
http://pastebin.com/v9cUzau4

the crash is in an IPA plugin, ipa_pwd_extop,
to get a better stack you would have to install also the 
debuginfo for ipa-server.

but tje stack matches the error messages you have seen
[30/Jun/2016:09:35:19 +0100] ipapwd_encrypt_encode_key - [file
encoding.c, line 171]: generating kerberos keys failed [Invalid
argument]
[30/Jun/2016:09:35:19 +0100] ipapwd_gen_hashes - [file 
encoding.c,

line 225]: key encryption/encoding failed
they are from the function sin the call stack.

Looks like the user has a password with a \351 char:
cred = {bv_len = 15, bv_val = 0x7fc7880013a0 
"d\351sertification"}


does the crash always happen with a bind from this user ?


and then someone familiar with this plugin should look into it


Regards

Re: [Freeipa-users] Could not delete change record

2016-07-12 Thread Ludwig Krispenz


On 07/12/2016 11:25 AM, Christophe TREFOIS wrote:

Hi,

I have 3 replicas running 4.1 and 3 replicas running 4.2.

One of the 4.2 replicas is the new master (CRL) and is at the moment 
replicating against the old 4.1 cluster (we are in the process of 
migrating).


Upon restart of the 4.2 master, I receive many messages in slapd error 
log about delete_changerecord as seen below.


Is this something to worry about, or will it go away by itself?
it should go away; it is a problem of an incorrect starting point for retro 
changelog trimming, so it tries to remove already-deleted records.
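You can watch the trimming progress via the first and last change numbers
the retro changelog publishes in the root DSE - a quick sketch:

ldapsearch -x -D "cn=directory manager" -W -b "" -s base firstchangenumber lastchangenumber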


Thank you for your help,

[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15892 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15893 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15894 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15895 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15896 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15897 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15898 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15899 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15900 (rc: 32)
[12/Jul/2016:11:16:43 +0200] DSRetroclPlugin - delete_changerecord: 
could not delete change record 15901 (rc: 32)


*Christophe*





--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] unable to delete a replica server

2016-08-12 Thread Ludwig Krispenz

Hi Torsten,

I haven't seen which version you are using. There was a bug in ipa where 
it attempted to delete a master before all services were deleted: 
https://fedorahosted.org/freeipa/ticket/5019


You can delete the services below the master by using ldapmodify, but I 
am not sure if this will be sufficient.
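A sketch of doing that by hand - MASTER_FQDN and SUFFIX below are
placeholders for the host being removed and your base DN:

# list what is still stored below the master entry
ldapsearch -x -D "cn=directory manager" -W \
  -b "cn=MASTER_FQDN,cn=masters,cn=ipa,cn=etc,SUFFIX" dn
# then delete the child service entries first, e.g.:
ldapdelete -x -D "cn=directory manager" -W \
  "cn=CA,cn=MASTER_FQDN,cn=masters,cn=ipa,cn=etc,SUFFIX"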


Ludwig

On 08/12/2016 08:06 AM, Torsten Harenberg wrote:

Am 11.08.16 um 17:58 schrieb Rob Crittenden:

Torsten Harenberg wrote:

Hi,

we have three ipa servers

- ipa
- ipa2
- ipacentos7

We wanted to re-install ipa2 from scratch as this server gave us strange
issues in the past (for example, you have to do a "ipactl stop && ipactl
start" after boot to have everything running - a step which is not
needed on the other two).

However, the ipa-replica-manage del ipa2.pleiades.uni-wuppertal.de gave
an error at the end (it scrolled out of the terminal, but ended with
"unexpected error: Not allowed on non-leaf entry").

It seems to be impossible to get rid of this replica now:

[root@ipa ~]#  ipa-replica-manage -v -f -c  del
ipa2.pleiades.uni-wuppertal.de
Directory Manager password:

Cleaning a master is irreversible.
This should not normally be require, so use cautiously.
Continue to clean master? [no]: yes
unexpected error: Not allowed on non-leaf entry
[root@ipa ~]# ipa-replica-manage list
Directory Manager password:

ipacentos7.pleiades.uni-wuppertal.de: master
ipa.pleiades.uni-wuppertal.de: master
ipa2.pleiades.uni-wuppertal.de: master
[root@ipa ~]#

[root@ipa ~]#  ipa-csreplica-manage -v del ipa2.pleiades.uni-wuppertal.de
Directory Manager password:

Deleted replication agreement from 'ipa.pleiades.uni-wuppertal.de' to
'ipa2.pleiades.uni-wuppertal.de'
[root@ipa ~]# ipa-replica-manage list
Directory Manager password:

ipacentos7.pleiades.uni-wuppertal.de: master
ipa.pleiades.uni-wuppertal.de: master
ipa2.pleiades.uni-wuppertal.de: master
[root@ipa ~]#

Any ideas how to proceed from here?

Seems like an error that LDAP is throwing. There might be details in
/var/log/dirsrv/slapd-REALM/{access|errors}

It sounds like when IPA tried to delete some entry it failed because
that entry has children. The logs should help pinpoint which entry it is
failing on.

rob


Hmm... unfortunately, there is nothing here that tells us anything. The
last entries in the error log containing "ipa2" are

[11/Aug/2016:12:54:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:12:54:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:12:54:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:12:59:59 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:12:59:59 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:12:59:59 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:09:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:09:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:09:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:24:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:24:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:24:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:39:46 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:39:46 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[11/Aug/2016:13:39:46 +0200] attrlist_replace - attr_replace
(nsslapd-referral, ldap://ipa2.pleiades.uni-wuppertal.de:389/o%3Dipaca)
failed.
[root@ipa slapd-PLEIADES-UNI-WUPPERTAL-DE]#

And those stopped after issuing the ipa-replica-manage del command for
the first time.

Surprisingly, these messages are in the log even for the freshly
installed "ipacentos7" replica:

[root@ipa slapd-PLEIADES-UNI-WUPPERTAL-DE]# tail -3 errors
[12/Aug/2016:07:24:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral,
ldap://ipacentos7.pleiades.uni-wuppertal.de:389/o%3Dipaca) failed.
[12/Aug/2016:07:24:43 +0200] attrlist_replace - attr_replace
(nsslapd-referral,
ldap://ipacentos7.pleiades.uni-wuppertal.

Re: [Freeipa-users] Problem with replication

2016-08-12 Thread Ludwig Krispenz


On 08/12/2016 04:10 PM, Louis Francoeur wrote:


Since the rpm update to 
ipa-server-dns-4.2.0-15.0.1.el7.centos.18.x86_64 (running on Centos 7),



most of my replication started to fail with:

what do you mean by "most of"? If some servers still work and others 
don't, is there something different about them?



last update status: -1 Incremental update has failed and requires 
administrator action: LDAP error: Can't contact LDAP server


what is in the error log of the directory server? Identify one broken 
replication connection and check both the supplier and the consumer side.



The setup contains about 10 ipa servers in 5 different locations.


But when I went and ran an ipa-replica-conncheck I got this:


# ipa-replica-conncheck --replica server.domain.local
Check connection from master to remote replica 'server.domain.local':
   Directory Service: Unsecure port (389): OK
   Directory Service: Secure port (636): OK
   Kerberos KDC: TCP (88): OK
   Kerberos KDC: UDP (88): WARNING
   Kerberos Kpasswd: TCP (464): OK
   Kerberos Kpasswd: UDP (464): WARNING
   HTTP Server: Unsecure port (80): OK
   HTTP Server: Secure port (443): OK
The following UDP ports could not be verified as open: 88, 464
This can happen if they are already bound to an application
and ipa-replica-conncheck cannot attach own UDP responder.

Connection from master to replica is OK.



I even ran the following without issue:

# kinit -kt /etc/dirsrv/ds.keytab ldap/`hostname`
# klist
# ldapsearch -Y GSSAPI -h `hostname` -b "" -s base
# ldapsearch -Y GSSAPI -h the.other.master.fqdn -b "" -s base

Not really sure what to check for next?

Any hint?


Thanks

Louis Francoeur





--
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric 
Shander

-- 
Manage your subscription for the Freeipa-users mailing list:
https://www.redhat.com/mailman/listinfo/freeipa-users
Go to http://freeipa.org for more info on the project

Re: [Freeipa-users] replica_generate_next_csn messages in dirsrv error logs

2016-08-18 Thread Ludwig Krispenz


On 08/17/2016 08:54 PM, John Desantis wrote:

Hello all,

We've been re-using old host names and IP addresses for a new
deployment of nodes, and recently I've been seeing the messages pasted
below in the slapd-DC.DC.DC "error" log on our nodes.

[17/Aug/2016:10:30:30 -0400] - replica_generate_next_csn:
opcsn=57b475cd00120004 <= basecsn=57b475cf0016, adjusted
opcsn=57b475cf00010004
[17/Aug/2016:11:09:44 -0400] - replica_generate_next_csn:
opcsn=57b47f020004 <= basecsn=57b47f020016, adjusted
opcsn=57b47f030004
[17/Aug/2016:11:09:44 -0400] - replica_generate_next_csn:
opcsn=57b47f040004 <= basecsn=57b47f040016, adjusted
opcsn=57b47f050004
[17/Aug/2016:11:10:33 -0400] - replica_generate_next_csn:
opcsn=57b47f2f0014 <= basecsn=57b47f320016, adjusted
opcsn=57b47f330004
[17/Aug/2016:13:50:45 -0400] - replica_generate_next_csn:
opcsn=57b4a4bb00090004 <= basecsn=57b4a4bc0016, adjusted
opcsn=57b4a4bc00010004
[17/Aug/2016:13:52:54 -0400] - replica_generate_next_csn:
opcsn=57b4a53e000a0004 <= basecsn=57b4a53f0016, adjusted
opcsn=57b4a53f00010004
[17/Aug/2016:13:53:15 -0400] - replica_generate_next_csn:
opcsn=57b4a55200070004 <= basecsn=57b4a5530016, adjusted
opcsn=57b4a55300010004
[17/Aug/2016:13:53:32 -0400] - replica_generate_next_csn:
opcsn=57b4a56200090004 <= basecsn=57b4a5640016, adjusted
opcsn=57b4a56400010004
Each modification (add/del/mod) gets a csn assigned, used in replication 
update resolution. And each assigned csn has to be newer than any existing 
one. The messages you see are from code that double-checks that the entry 
doesn't already have a larger csn - and adjusts it.
The logs indicate that entries are more or less concurrently updated on 
replicas 4 and 16, and the updates from 16 are received while processing 
the updates on 4.
This is a normal scenario, but you could check if the simultaneous 
updates on 4 and 16 are intentional.


They seem to only occur when updating DNS entries, whether on the
console or via the GUI (tail -f'ing the log).

A search in this mailing-list returns nothing, but a message is found
on the 389-ds list [1];  it seems to suggest that the messages aren't
fatal and are purely informational, yet if they are occurring
constantly that there could be a problem with the replication
algorithm and/or deployment.

We're using ipa-server 3.0.0-47 and 389-ds 1.2.11.15-60.  Nothing has
changed on the deployment side of things, and I don't recall seeing
this message before.

I'm wondering if it's safe to disregard these messages due to the
re-use of the entries, or if something else should be looked into.

Thank you,
John DeSantis

[1] https://fedorahosted.org/389/ticket/47959





Re: [Freeipa-users] replica_generate_next_csn messages in dirsrv error logs

2016-08-18 Thread Ludwig Krispenz


On 08/18/2016 03:15 PM, John Desantis wrote:

Ludwig,

Thank you for your response!


This is a normal scenario, but you could check if the simultaneous updates
on 4 and 16 are intentional.

In regards to the simultaneous updates, the only items I have noted so far are:

*  The time sync between the master (4) and replica (16) was off by
about 1-2 seconds, with the latter being ahead;
yes, this happens, but the replication protocol tries to handle it: in 
a replication session the supplier and consumer exchange their ruvs, and 
if the time differs the csn state generator is updated with a local or 
remote offset so that the generated time is always based on the most 
advanced clock - on all servers. And even if you adjust the system time, 
the csn time will never go back.

*  There are continual log entries referencing
"replication-multimaster-extop" and "Netscape Replication End Session"
in the dirsrv "access" logs, and during one of the manifestations of
"replica_generate_next_csn", I found this:

PROD:08:46:08-root@REPLICA:/var/log/dirsrv/slapd-DOM-DOM-DOM
# grep -E '17/Aug/2016:13:50:4.*conn=602.*ADD' access.2016081*
access.20160817-124811:[17/Aug/2016:13:50:42 -0400] conn=602 op=4143
ADD dn="idnsname=server-6-3-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom"
access.20160817-124811:[17/Aug/2016:13:50:47 -0400] conn=602 op=4148
ADD dn="idnsname=server-6-4-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom"
access.20160817-124811:[17/Aug/2016:13:50:49 -0400] conn=602 op=4151
ADD dn="idnsname=server-6-5-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom"

PROD:08:47:44-root@MASTER:/var/log/dirsrv/slapd-DOM-DOM-DOM
# grep -E '17/Aug/2016:13:50:4.*conn=1395.*ADD' access.2016081*
access.20160817-111940:[17/Aug/2016:13:50:43 -0400] conn=1395 op=4151
ADD dn="idnsname=server-6-3-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom"
access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4158
ADD dn="idnsname=server-6-5-sp,idnsname=dom.dom.dom,cn=dns,dc=dom,dc=dom,dc=dom"

It looks like the entries for server-6-3-sp and 6-5-sp were referenced
twice.  Do you think that the time being off by 1-2 seconds between
the master and replica could be the issue?  The connection 602 is the
replication between the replica and master, and the connection 1395 is
the replication between the master and replica.
unfortunately this is not enough to determine what is going on. The 
interesting generated/used csn is only logged in the corresponding 
RESULT message, and these are only the replication connections; it would 
be necessary to see the original ADD operation - was it added once or 
twice by a client?
You could pick one entry, eg server-6-3-sp, and grep for all references 
in the access logs of both servers (maybe there are mods as well), and 
then also get the RESULT line for the ops found.
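A sketch of that kind of search, using the entry and instance names from
this thread (adjust paths to your installation):

# all operations touching the entry, on both servers
grep 'server-6-3-sp' /var/log/dirsrv/slapd-DOM-DOM-DOM/access*
# then, for each conn=N op=M found, pull the matching RESULT line,
# which carries the generated csn= value
grep 'conn=602 op=4143 ' /var/log/dirsrv/slapd-DOM-DOM-DOM/access*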


Since I know these operations were performed using the console via a
for loop 'ipa dnsrecord-add dom.dom.dom server-6-$i-sp
--a-rec=10.250.12.$i' on one of our login nodes, do you think that
specifying an _srv_ record in the DOMAIN configuration with the
address of the master server, e.g.: ipa_server = _srv_,
MASTER.dom.dom.dom could be the issue (coupled with the time syncing)?

I know that these questions are probably leaning more towards the
389ds team, so feel free to pass me over to them if need be.
I think I can address the DS-related questions, but I don't know enough 
about the console and DNS to assess if the behaviour is normal.


Again, thank you very much for responding!

John DeSantis


Re: [Freeipa-users] replica_generate_next_csn messages in dirsrv error logs

2016-08-19 Thread Ludwig Krispenz
access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4159
RESULT err=0 tag=103 nentries=0 etime=0
access.20160817-111940:[17/Aug/2016:13:50:49 -0400] conn=1395 op=4160
RESULT err=0 tag=103 nentries=0 etime=0 csn=57b4a4c30016

I'm positive that I was the only one performing DNS updates during
this time, and I was only using 1 console.

Thanks,
John DeSantis



Re: [Freeipa-users] replica_generate_next_csn messages in dirsrv error logs

2016-08-22 Thread Ludwig Krispenz

Thanks,

I looked into the logs; I think the messages are harmless, just an 
effect of csn adjustment due to the time difference on the two machines. 
I had said that the replication protocol will try to adjust the csn 
generator, but it looks like you have long-lasting replication connections 
and the adjustment is done only at the beginning. Maybe we can look into 
this and improve it.

Here is the tracking of one of these error messages:

the entry is modified on adm3
adm3 :[19/Aug/2016:15:47:05 -0400] conn=13 op=126637 MOD 
dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
adm3 :[19/Aug/2016:15:47:05 -0400] conn=13 op=126637 RESULT err=0 
tag=103 nentries=0 etime=0 csn=57b763030016

this mod is replicated to adm0
adm0 :[19/Aug/2016:15:47:06 -0400] conn=1395 op=86121 MOD 
dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
adm0 :[19/Aug/2016:15:47:06 -0400] conn=1395 op=86121 RESULT err=0 
tag=103 nentries=0 etime=0 csn=57b763030016

the entry is modified again on adm0
adm0 :[19/Aug/2016:15:47:07 -0400] conn=27 op=1108697 MOD 
dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
but it looks like the csn generated is smaller than the one already in 
the entry, and it is adjusted
adm0 :[19/Aug/2016:15:47:07 -0400] - replica_generate_next_csn: 
opcsn=57b76301000a0004 <= basecsn=57b763030016, adjusted 
opcsn=57b7630300010004

then the result is logged with the adjusted csn
adm0 :[19/Aug/2016:15:47:07 -0400] conn=27 op=1108697 RESULT err=0 
tag=103 nentries=0 etime=0 csn=57b7630300010004


so the mechanism works, but the messages may be confusing and 
improvement of the protocol could be investigated.
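To follow one of these adjustments yourself, grepping for the csn in
question on both servers should be enough (default log paths, adjust
as needed):

grep 'csn=57b763030016' /var/log/dirsrv/slapd-*/access*   # who generated and who received it
grep 'replica_generate_next_csn' /var/log/dirsrv/slapd-*/errors*   # the adjustments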


One question I have, but someone more familiar with dns should answer: 
we have regular updates of the same entry on both replicas, about every 
2 seconds - what is the reason for this?



/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:03 -0400] conn=13 
op=126630 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:05 -0400] conn=13 
op=126637 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:07 -0400] conn=13 
op=126646 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:09 -0400] conn=13 
op=126653 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:13 -0400] conn=13 
op=12 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:16 -0400] conn=13 
op=126673 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:18 -0400] conn=13 
op=126689 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:20 -0400] conn=13 
op=126696 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:21 -0400] conn=13 
op=126702 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:23 -0400] conn=13 
op=126737 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:26 -0400] conn=13 
op=126758 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"
/tmp/adm3-logs-del39-errors.txt:[19/Aug/2016:15:47:29 -0400] conn=13 
op=126801 MOD dn="idnsname=rc.usf.edu,cn=dns,dc=rc,dc=usf,dc=edu"




On 08/19/2016 10:00 PM, John Desantis wrote:

Ludwig,


you still only grep the replication connection, but before being replicated
the entry has to be added by some client connection - can you get all
references to the entry?
The log snippet you provide also shows csns with tag=103, which indicates a
MOD; are these MODs for the added entries, or other mods?

I can't believe I did that!

Ok, so the logs have been rotated (I didn't think to adjust
logrotate..), so there aren't any logs to peruse for the case I've
presented so far.  However, I was able to reproduce the errors by
"bulk" deleting 39 DNS entries, and only the MASTER reported
"replica_generate_next_csn" entries.

Given the size of the logs, I think it would be pointless to do any
kind of sanitization.  I'll go ahead and gzip them for you and email
you off-list.

I've labeled them as MASTER and REPLICA.

John DeSantis



Re: [Freeipa-users] clean-ruv

2016-08-23 Thread Ludwig Krispenz
looks like you are searching the nstombstone below "o=ipaca", but you 
are cleaning ruvs in "dc=bpt,dc=rocks".


Your attrlist_replace error refers to the bpt,rocks backend, so you 
should search the tombstone entry there, then determine which replicaIDs 
to remove.


Ludwig

On 08/23/2016 09:20 AM, Ian Harding wrote:

I've followed the procedure in this thread:

https://www.redhat.com/archives/freeipa-users/2016-May/msg00043.html

and found my list of RUV that don't have an existing replica id.

I've tried to remove them like so:

[root@seattlenfs ianh]# ldapmodify -D "cn=directory manager" -W -a
Enter LDAP Password:
dn: cn=clean 97, cn=cleanallruv, cn=tasks, cn=config
objectclass: top
objectclass: extensibleObject
replica-base-dn: dc=bpt,dc=rocks
replica-id: 97
replica-force-cleaning: yes
cn: clean 97

adding new entry "cn=clean 97, cn=cleanallruv, cn=tasks, cn=config"

[root@seattlenfs ianh]# ipa-replica-manage list-clean-ruv
CLEANALLRUV tasks
RID 9: Waiting to process all the updates from the deleted replica...
RID 96: Successfully cleaned rid(96).
RID 97: Successfully cleaned rid(97).

No abort CLEANALLRUV tasks running


and yet, they are still there...

[root@seattlenfs ianh]# ldapsearch -ZZ -h seattlenfs.bpt.rocks -D
"cn=Directory Manager" -W -b "o=ipaca"
"(&(objectclass=nstombstone)(nsUniqueId=---))"
| grep "nsds50ruv\|nsDS5ReplicaId"
Enter LDAP Password:
nsDS5ReplicaId: 81
nsds50ruv: {replicageneration} 55c8f3ae0060
nsds50ruv: {replica 81 ldap://seattlenfs.bpt.rocks:389}
568ac4310051 5
nsds50ruv: {replica 1065 ldap://freeipa-sea.bpt.rocks:389}
57b103d40429000
nsds50ruv: {replica 1070 ldap://bellevuenfs.bpt.rocks:389}
57a4f270042e000
nsds50ruv: {replica 1075 ldap://bpt-nyc1-nfs.bpt.rocks:389}
57a47865043300
nsds50ruv: {replica 1080 ldap://bellevuenfs.bpt.rocks:389}
57a417670438000
nsds50ruv: {replica 1085 ldap://fremontnis.bpt.rocks:389}
57a403e6043d
nsds50ruv: {replica 1090 ldap://freeipa-dal.bpt.rocks:389}
57a2dd350442000
nsds50ruv: {replica 1095 ldap://freeipa-sea.bpt.rocks:389}
579a963c0447000
nsds50ruv: {replica 96 ldap://freeipa-sea.bpt.rocks:389}
55c8f3bd0060
nsds50ruv: {replica 86 ldap://fremontnis.bpt.rocks:389}
5685b24e0056 5
nsds50ruv: {replica 91 ldap://seattlenis.bpt.rocks:389}
567ad6180001005b 5
nsds50ruv: {replica 97 ldap://freeipa-dal.bpt.rocks:389}
55c8f3ce0061
nsds50ruv: {replica 76 ldap://bellevuenis.bpt.rocks:389}
56f385eb0007004c
nsds50ruv: {replica 71 ldap://bellevuenfs.bpt.rocks:389}
570485690047
nsds50ruv: {replica 66 ldap://bpt-nyc1-nfs.bpt.rocks:389}
5733e594000a0042
nsds50ruv: {replica 61 ldap://edinburghnfs.bpt.rocks:389}
57442125003d
nsds50ruv: {replica 1195 ldap://edinburghnfs.bpt.rocks:389}
57a4239004ab00

What have I done wrong?

The problem I am trying to solve is that seattlenfs.bpt.rocks sends
updates to all its children, but their changes don't come back because
of these errors:

[23/Aug/2016:00:02:16 -0700] attrlist_replace - attr_replace
(nsslapd-referral,
ldap://seattlenfs.bpt.rocks:389/dc%3Dbpt%2Cdc%3Drocks) failed.

in effect, the replication agreements are one-way.

Any ideas?

- Ian





Re: [Freeipa-users] clean-ruv

2016-08-23 Thread Ludwig Krispenz


On 08/23/2016 11:52 AM, Ian Harding wrote:

Ah.  I see.  I mixed those up but I see that those would have to be
consistent.

However, I have been trying to beat some invalid RUV to death for a long
time and I can't seem to kill them.

For example, bellevuenfs has 9 and 16 which are invalid:

[ianh@seattlenfs ~]$ ldapsearch -ZZ -h seattlenfs.bpt.rocks -D
"cn=Directory Manager" -W -b "dc=bpt,dc=rocks"
"(&(objectclass=nstombstone)(nsUniqueId=---))"
| grep "nsds50ruv\|nsDS5ReplicaId"
Enter LDAP Password:
nsDS5ReplicaId: 7
nsds50ruv: {replicageneration} 55c8f3640004
nsds50ruv: {replica 7 ldap://seattlenfs.bpt.rocks:389}
568ac3cc0007 57
nsds50ruv: {replica 20 ldap://freeipa-sea.bpt.rocks:389}
57b1037700020014
nsds50ruv: {replica 18 ldap://bpt-nyc1-nfs.bpt.rocks:389}
57a4780100010012
nsds50ruv: {replica 15 ldap://fremontnis.bpt.rocks:389}
57a40386000f 5
nsds50ruv: {replica 14 ldap://freeipa-dal.bpt.rocks:389}
57a2dccd000e
nsds50ruv: {replica 17 ldap://edinburghnfs.bpt.rocks:389}
57a422f90011
nsds50ruv: {replica 19 ldap://bellevuenfs.bpt.rocks:389}
57a4f20d00060013
nsds50ruv: {replica 16 ldap://bellevuenfs.bpt.rocks:389}
57a417060010
nsds50ruv: {replica 9 ldap://bellevuenfs.bpt.rocks:389}
570484ee0009 5


So I try to kill them like so:
[ianh@seattlenfs ~]$ ipa-replica-manage clean-ruv 9 --force --cleanup
ipa: WARNING: session memcached servers not running
Clean the Replication Update Vector for bellevuenfs.bpt.rocks:389

Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Background task created to clean replication data. This may take a while.
This may be safely interrupted with Ctrl+C
^C[ianh@seattlenfs ~]$ ipa-replica-manage clean-ruv 16 --force --cleanup
ipa: WARNING: session memcached servers not running
Clean the Replication Update Vector for bellevuenfs.bpt.rocks:389

Cleaning the wrong replica ID will cause that server to no
longer replicate so it may miss updates while the process
is running. It would need to be re-initialized to maintain
consistency. Be very careful.
Background task created to clean replication data. This may take a while.
This may be safely interrupted with Ctrl+C
^C[ianh@seattlenfs ~]$ ipa-replica-manage list-clean-ruv
ipa: WARNING: session memcached servers not running
CLEANALLRUV tasks
RID 16: Waiting to process all the updates from the deleted replica...
RID 9: Waiting to process all the updates from the deleted replica...

No abort CLEANALLRUV tasks running
[ianh@seattlenfs ~]$ ipa-replica-manage list-clean-ruv
ipa: WARNING: session memcached servers not running
CLEANALLRUV tasks
RID 16: Waiting to process all the updates from the deleted replica...
RID 9: Waiting to process all the updates from the deleted replica...

and it never finishes.

seattlenfs is the first master, that's the only place I should have to
run this command, right?
right, you need to run it only on one master, but this ease of use can 
become the problem.
The cleanallruv task is propagated to all servers in the topology, and it 
does this based on the replication agreements it finds.
A frequent cause of failure is that replication agreements still exist 
pointing to no longer existing servers. It is a bit tedious, but could 
you run the following search on ALL of your current replicas (as 
directory manager):

ldapsearch .. -b "cn=config" "objectclass=nsds5replicationagreement" 
nsds5replicahost

if you find any agreement where nsds5replicahost is a host no longer 
existing or working, delete these agreements.
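A filled-in sketch of that check; the bind options and host are
illustrative, and the agreement name in the delete is a made-up example
following the usual "meTo<host>" naming convention:

ldapsearch -xLLL -ZZ -h seattlenfs.bpt.rocks -D "cn=directory manager" -W \
  -b "cn=config" "objectclass=nsds5replicationagreement" nsds5replicahost

# if an agreement points to a host that no longer exists, delete it, eg:
ldapdelete -x -D "cn=directory manager" -W \
  "cn=meTodeadhost.bpt.rocks,cn=replica,cn=dc\3Dbpt\2Cdc\3Drocks,cn=mapping tree,cn=config"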


I'm about to burn everything down and ipa-server-install --uninstall but
I've done that before a couple times and that seems to be what got me
into this mess...

Thank you for your help.





Re: [Freeipa-users] clean-ruv

2016-08-24 Thread Ludwig Krispenz


On 08/24/2016 01:08 AM, Ian Harding wrote:



I have 7 FreeIPA servers, all of which have been in existence in some
form or another since I started.  It used to work great.  I've broken it
now, but the hostnames and IP addresses all still exist.  I've
uninstalled and reinstalled them a few times, which I think is the source
of my troubles, so I tried to straighten out the RUVs and probably messed
that up pretty good.

Anyway, now what I THINK I have is

seattlenfs
|-freeipa-sea
   |- freeipa-dal
   |- bellevuenfs
   |- fremontnis
   |- bpt-nyc1-nfs
   |- edinburghnfs

Until I get this squared away I've turned off ipa services on all but
seattlenfs, freeipa-sea and freeipa-dal and am hoping that any password
changes etc. happen on seattlenfs.  I need the other two because they
are my DNS.  The rest I can kind of live without since they are just
local instances living on nfs servers.

Here's the output from that ldap query on all the hosts:

yes, looks like the replication agreements are fine, but the RUVs are not.

In the o=ipaca suffix, there is a reference to bellevuenis:

 [{replica 76 ldap://bellevuenis.bpt.rocks:389

Re: [Freeipa-users] Two masters and one of them is desynchronized

2016-08-25 Thread Ludwig Krispenz
The replication agreement to the "unsync" master says that the update has 
started, so it looks like the replication connection is active.
You need to check the access and error logs of both sides and check if 
there is replication traffic.
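A sketch of what to grep for; the strings are the ones replication
sessions leave in the 389-ds access log (as quoted earlier in this
archive), and the paths are the defaults:

grep 'replication-multimaster-extop\|Netscape Replication End Session' \
  /var/log/dirsrv/slapd-*/access*
grep -i 'replica' /var/log/dirsrv/slapd-*/errors*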


On 08/24/2016 06:33 PM, bahan w wrote:

Hey guys.

I performed it :
###
# /usr/bin/repl-monitor.pl  -f /tmp/checkconf -s
Directory Server Replication Status (Version 1.1)

Time: Wed Aug 24 2016 18:16:50

Master: :389 ldap://:389/
Replica ID: 4
Replica Root: dc=
Max CSN: 57bdc89700030004 (08/24/2016 18:17:27 3 0)
Receiver: :389 ldap://:389/
Type: master
Time Lag: 0:00:00
Max CSN: 57bdc89700030004 (08/24/2016 18:17:27 3 0)
Last Modify Time: 8/24/2016 18:16:50
Supplier: :389
Sent/Skipped: 179031 / 1037
Update Status: 0 Replica acquired successfully: Incremental update started
Update Started: 08/24/2016 18:16:50
Update Ended: n/a
Schedule: always in sync
SSL: SASL/GSSAPI

Master: :389 ldap://:389/
Replica ID: 3
Replica Root: dc=
Max CSN: 57bdbda10003 (08/24/2016 17:30:41)
Receiver: :389 ldap://:389/
Type: master
Time Lag: - 0:22:29
Max CSN: 57bdb85c0003 (08/24/2016 17:08:12)
Last Modify Time: 8/24/2016 17:07:34
Supplier: :389
Sent/Skipped: 3 / 9045345
Update Status: 0 Replica acquired successfully: Incremental update started
Update Started: 08/24/2016 18:16:50
Update Ended: n/a
Schedule: always in sync
SSL: SASL/GSSAPI
###

Do you see something strange in there ?
I have another environment where I have two replicated master and they 
are OK.

And when I check the same command, the result is a little bit different :
###
Master: :389 ldap://:389/
Replica ID: 4
Replica Root: dc=
Max CSN: 57bdc88d00030004 (08/24/2016 18:17:17 3 0)
Receiver: :389 ldap://:389/
Type: master
Time Lag: 0:00:00
Max CSN: 57bdc88d00030004 (08/24/2016 18:17:17 3 0)
Last Modify Time: 8/24/2016 18:16:00
Supplier: :389
Sent/Skipped: 343515 / 0
Update Status: 0 Replica acquired successfully: Incremental update 
succeeded

Update Started: 08/24/2016 18:15:59
Update Ended: 08/24/2016 18:16:08
Schedule: always in sync
SSL: SASL/GSSAPI

Master: :389 ldap://:389/
Replica ID: 3
Replica Root: dc=
Max CSN: 57bdc88700080003 (08/24/2016 18:17:11 8 0)
Receiver: :389 ldap://:389/
Type: master
Time Lag: - 390:51:38
Max CSN: 57a8500d00040003 (08/08/2016 11:25:33 4 0)
Last Modify Time: 8/8/2016 11:24:28
Supplier: :389
Sent/Skipped: 5 / 2596073
Update Status: 0 Replica acquired successfully: Incremental update 
succeeded

Update Started: 08/24/2016 18:16:00
Update Ended: 08/24/2016 18:16:12
Schedule: always in sync
SSL: SASL/GSSAPI
###

Best regards.

Bahan



Re: [Freeipa-users] Two masters and one of them is desynchronized

2016-08-25 Thread Ludwig Krispenz

I just noticed that you have many skipped entries, Sent/Skipped: 3 / 9045345;

that could be an effect of fractional replication, which reiterates the 
same sequence of changes. This is fixed in recent releases, but it looks 
like you're on RHEL 6.6.


Ludwig




Re: [Freeipa-users] Two masters and one of them is desynchronized

2016-08-25 Thread Ludwig Krispenz


On 08/25/2016 04:41 PM, bahan w wrote:


Hello everyone.

Could you explain to me about this field Sent/Skipped please ?
if replication is enabled, all changes on a server are logged into the 
changelog - changes coming from clients and internal changes (eg memberof 
updates, password policy updates, last login time ...).
In the replication agreement you can configure attributes for which 
changes are not replicated (kept local) - and IPA uses this feature 
eg for krblastlogintime.
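The fractional list itself can be read from the agreement entries; a
sketch, assuming a Directory Manager bind on the server in question:

ldapsearch -xLLL -D "cn=directory manager" -W -b "cn=config" \
  "objectclass=nsds5replicationagreement" nsDS5ReplicatedAttributeList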


Looking at the replication traffic your monitoring shows, I think most 
of the "real" updates are going to one server and most of the clients 
triggering internal updates are going to the other. This makes 
replication in one direction "normal" and in the other fractional. The 
problem with fractional is that the determined starting point for a 
replication session can be very far behind, and it again and again 
iterates over the same changes until finally an update which is not 
skipped is found.


There are some options to improve this:
- upgrade to a newer version: the DS will automatically generate updates 
to a "keep alive" entry, so that the sequences of skipped changes get 
much smaller
- do it yourself by regularly applying a dummy update on the 
problematic server, which will be replicated (see the sketch after this list)
- check the configuration to see if writing the internal mods can be avoided; I 
think there is an option not to log krblastlogin
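A minimal sketch of such a dummy update; the entry DN and attribute are
illustrative assumptions - any replicated entry and attribute you can
safely touch will do:

ldapmodify -x -D "cn=directory manager" -W <<EOF
# illustrative entry; pick one that is replicated and harmless to modify
dn: cn=keep-alive-dummy,dc=example,dc=com
changetype: modify
replace: description
description: keep-alive $(date +%s)
EOF

(For the krblastlogin point: recent IPA versions can stop writing the
last-successful-auth timestamp via the "KDC:Disable Last Success"
ipaConfigString, if your version supports it.)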




I checked the doc and found this :
###

Sent/Skipped :

The number of changes that were sent from the supplier and the number 
skipped in the replication update. The numbers are kept in suppliers' 
memory only and are cleared if the supplier is restarted.


###

If I check the first part :
###
Master: :389 ldap://:389/
Replica ID: 4
Replica Root: dc=
Max CSN: 57bdcd3600010004 (08/24/2016 18:37:10 1 0)
Receiver: :389 ldap://:389/
Type: master
Time Lag: 0:00:00
Max CSN: 57bdcd3600010004 (08/24/2016 18:37:10 1 0)
Last Modify Time: 8/24/2016 18:36:32
Supplier: :389
Sent/Skipped: 182110 / 1054
Update Status: 0 Replica acquired successfully: Incremental update 
succeeded

Update Started: 08/24/2016 18:36:32
Update Ended: 08/24/2016 18:36:34
Schedule: always in sync
SSL: SASL/GSSAPI
###

This is the replication from the MASTER OK (the supplier) to the 
MASTER UNSYNC (the receiver), right ?

So, the MASTER OK sent 182110 changes.
And in addition to these 182110 changes, 1054 changes were sent to the 
MASTER UNSYNC but skipped by the MASTER UNSYNC, right ?

Why are they skipped ?

In the other side, if I take the second part :
###
Master: :389 ldap://:389/
Replica ID: 3
Replica Root: dc=
Max CSN: 57bdbda10003 (08/24/2016 17:30:41)
Receiver: :389 ldap://:389/
Type: master
Time Lag: - 0:22:29
Max CSN: 57bdb85c0003 (08/24/2016 17:08:12)
Last Modify Time: 8/24/2016 17:07:34
Supplier: :389
Sent/Skipped: 3 / 9048655
Update Status: 0 Replica acquired successfully: Incremental update 
succeeded

Update Started: 08/24/2016 18:36:33
Update Ended: 08/24/2016 18:36:34
Schedule: always in sync
SSL: SASL/GSSAPI
###

The supplier is the MASTER UNSYNC and the receiver is the MASTER OK.
In this case I have only 3 changes sent.
And in addition to these 3 changes, 9 048 655 changes were sent but 
skipped on the MASTER OK, right ?


I ask these questions just to be sure I understand the output of 
the .pl script correctly.



Best regards.

Bahan



Re: [Freeipa-users] Freeipa 4.2.0 hangs intermittently

2016-09-14 Thread Ludwig Krispenz

Hi,
On 09/13/2016 07:37 PM, Rakesh Rajasekharan wrote:

Hi All,

Have finally made some progress with this: after changing the 
checkpoint interval to 180, my hangs have gone down now.


However, I faced a similar hang yesterday - users were not able to 
log in - though this time ns-slapd did not have any issues and 
ldapsearch worked fine, possibly due to the change in checkpoint. So I 
think I hit some other issue this time
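For reference, a sketch of that checkpoint change; the interval is the
nsslapd-db-checkpoint-interval attribute (in seconds) of the ldbm
database config entry:

ldapmodify -x -D "cn=directory manager" -W <<EOF
dn: cn=config,cn=ldbm database,cn=plugins,cn=config
changetype: modify
replace: nsslapd-db-checkpoint-interval
nsslapd-db-checkpoint-interval: 180
EOF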


this is a bit confusing: if your server crashes with the attached 
stacktrace, ldapsearch cannot work.


About the core, it looks like you are hitting this issue: 
https://fedorahosted.org/389/ticket/48388


I had a core genrated and this is the stacktrace of it.. can you 
please go through this and help me identify what could be causing the 
issue this time.. I have put in lot of efforts to debug and really 
would love to have this working in my prod env.. as it does in my 
other envs...


GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-80.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 


This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /usr/sbin/ns-slapd...
warning: the debug information found in 
"/usr/lib/debug//usr/sbin/ns-slapd.debug" does not match 
"/usr/sbin/ns-slapd" (CRC mismatch).



warning: the debug information found in 
"/usr/lib/debug/usr/sbin/ns-slapd.debug" does not match 
"/usr/sbin/ns-slapd" (CRC mismatch).


Reading symbols from /usr/sbin/ns-slapd...(no debugging symbols 
found)...done.

(no debugging symbols found)...done.
[New LWP 15255]
[New LWP 15286]
[New LWP 15245]
[New LWP 15246]
[New LWP 15247]
[New LWP 15248]
[New LWP 15243]

warning: the debug information found in 
"/usr/lib/debug//usr/lib64/dirsrv/libslapd.so.0.0.0.debug" does not 
match "/usr/lib64/dirsrv/libslapd.so.0" (CRC mismatch).



warning: the debug information found in 
"/usr/lib/debug/usr/lib64/dirsrv/libslapd.so.0.0.0.debug" does not 
match "/usr/lib64/dirsrv/libslapd.so.0" (CRC mismatch).


[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".

warning: the debug information found in 
"/usr/lib/debug//usr/lib64/dirsrv/plugins/libsyntax-plugin.so.debug" 
does not match "/usr/lib64/dirsrv/plugins/libsyntax-plugin.so" (CRC 
mismatch).



warning: the debug information found in 
"/usr/lib/debug/usr/lib64/dirsrv/plugins/libsyntax-plugin.so.debug" 
does not match "/usr/lib64/dirsrv/plugins/libsyntax-plugin.so" (CRC 
mismatch).



warning: the debug information found in 
"/usr/lib/debug//usr/lib64/dirsrv/plugins/libbitwise-plugin.so.debug" 
does not match "/usr/lib64/dirsrv/plugins/libbitwise-plugin.so" (CRC 
mismatch).



warning: the debug information found in 
"/usr/lib/debug/usr/lib64/dirsrv/plugins/libbitwise-plugin.so.debug" 
does not match "/usr/lib64/dirsrv/plugins/libbitwise-plugin.so" (CRC 
mismatch).


...skipping...
-rw---. 1 dirsrv dirsrv  0 Sep  8 02:55 audit
-rw---. 1 dirsrv dirsrv 2551824384 Sep 12 17:32 core.10450
-rw---. 1 dirsrv dirsrv 1464463360 Sep 12 19:35 core.14709
-rw---. 1 dirsrv dirsrv 4483862528 Sep 13 01:05 core.15243
-rw---. 1 dirsrv dirsrv   66288165 Sep 13 02:10 errors
-rw---. 1 dirsrv dirsrv  104964391 Sep 13 08:30 access.20160913-074214
-rw---. 1 dirsrv dirsrv  105021859 Sep 13 09:26 access.20160913-083046
-rw---. 1 dirsrv dirsrv  104861746 Sep 13 10:31 access.20160913-092646
-rw---. 1 dirsrv dirsrv  105069140 Sep 13 11:36 access.20160913-103137
-rw---. 1 dirsrv dirsrv  104913480 Sep 13 12:41 access.20160913-113638
-rw---. 1 dirsrv dirsrv  105186788 Sep 13 13:46 access.20160913-124118
-rw---. 1 dirsrv dirsrv  105162159 Sep 13 14:51 access.20160913-134619
-rw---. 1 dirsrv dirsrv  105256624 Sep 13 15:56 access.20160913-145120
-rw---. 1 dirsrv dirsrv  105231158 Sep 13 17:01 access.20160913-155620
-rw---. 1 dirsrv dirsrv   1044 Sep 13 17:01 access.rotationinfo
-rw-r--r--. 1 root   root19287 Sep 13 17:28 
stacktrace.1473787719.txt

-rw---. 1 dirsrv dirsrv   45608914 Sep 13 17:29 access
[root@prod-ipa-master-int slapd-SPRINKLR-COM]# gdb -ex 'set confirm 
off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 
'quit' /usr/sbin/ns-slapd 
/var/log/dirsrv/slapd-SPRINKLR-COM/core.15243 stacktrace.`date 
+%s`.txt 2>&1^C
[root@prod-ipa-master-int slapd-SPRINKLR-COM]# gdb -ex 'set confirm 
off' -ex 'set pagination off' -ex 'thread apply all bt full' -ex 
'quit' /usr/sbin/ns-slapd 
/var/log/dirsrv/slapd-SPRINKLR-COM/core.15243 > stacktrace.`date 
+%s`.txt 2>&1

[root@prod-ipa-master-int slapd-SPRINKLR-COM]# ls -ltr
total 6404952
-rw---. 1 dirsrv dirsrv   

Re: [Freeipa-users] FreeIPA upgrade from ipa-server-4.2.0-15.0.1.el7.centos.18 to ipa-server-4.2.0-15.0.1.el7.centos.19 (went sideways)

2016-09-23 Thread Ludwig Krispenz

can you check if you have /var/lock/dirsrv/slapd-RSINC-LOCAL,

if the server user has permissions to write into this directory and its 
subdirs, and if any pid file still exists in 
/var/lock/dirsrv/slapd-RSINC-LOCAL/server
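A quick sketch of those checks, assuming the default dirsrv service user:

ls -ld /var/lock/dirsrv/slapd-RSINC-LOCAL /var/lock/dirsrv/slapd-RSINC-LOCAL/server
# any stale *.pid files left behind?
ls -l /var/lock/dirsrv/slapd-RSINC-LOCAL/server/
# can the service user actually write there?
sudo -u dirsrv touch /var/lock/dirsrv/slapd-RSINC-LOCAL/server/.writetest && \
  rm /var/lock/dirsrv/slapd-RSINC-LOCAL/server/.writetest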


On 09/23/2016 07:29 AM, Devin Acosta wrote:


Tonight,

I noticed there was like 30 packages to be applied on my IPA server. I 
did the normal 'yum update' process and it completed. I then rebooted 
the box for the new kernel to take affect and then that is when IPA 
stopped working completely.


When I try to start the dirsrv@RSINC-LOCAL.service, it throws up with:

[23/Sep/2016:05:19:38 +] - SSL alert: Configured NSS Ciphers
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_AES_256_GCM_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_AES_256_GCM_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_AES_256_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_AES_256_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_CAMELLIA_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_CAMELLIA_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_AES_128_GCM_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_AES_128_GCM_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_AES_128_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_AES_128_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_RSA_WITH_CAMELLIA_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_DHE_DSS_WITH_CAMELLIA_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDH_ECDSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDH_RSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDH_ECDSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_ECDH_RSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_AES_256_GCM_SHA384: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_AES_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_AES_256_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_CAMELLIA_256_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_AES_128_GCM_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_AES_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_AES_128_CBC_SHA256: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: 
TLS_RSA_WITH_CAMELLIA_128_CBC_SHA: enabled
[23/Sep/2016:05:19:38 +] - SSL alert: TLS_RSA_WITH_SEED_CBC_SHA: 
enabled
[23/Sep/2016:05:19:38 +] SSL Initialization - Configured SSL 
version range: min: TLS1.0, max: TLS1.2
[23/Sep/2016:05:19:38 +] - Shutting down due to possible conflicts 
with other slapd processes


*I am not sure what to do about the error "Shutting down due to 
possible conflicts with other slapd processes"??*
The dirserv won't start, and therefore IPA won't start either. Is 
there some way to do some cleanup or to have it repair the issue?


Any help is greatly appreciated!!!

Devin.






