On 04/03/2014 03:46 PM, Nevada Sanchez wrote:
Okay, I updated the gist and extended some of the logs (ipa2-errors does stop at 20:50:21). I'll follow up when I have the debug stuff in place.


https://gist.github.com/nevsan/8b6f78d7396963dc5f70

Another strange thing - it looks as if the initial replica init completes successfully.

[02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin - Beginning total update of replica "agmt="cn=meToipa2.example.com" (ipa2:389)".

On the replica:

[02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=example,dc=com is going offline; disabling replication
[02/Apr/2014:20:50:18 +0000] - WARNING: Import is running with nsslapd-db-private-import-mem on; No other process is allowed to access the database
[02/Apr/2014:20:50:21 +0000] - import userRoot: Workers finished; cleaning up...
[02/Apr/2014:20:50:21 +0000] - import userRoot: Workers cleaned up.
[02/Apr/2014:20:50:21 +0000] - import userRoot: Indexing complete. Post-processing...
[02/Apr/2014:20:50:21 +0000] - import userRoot: Generating numSubordinates complete.
[02/Apr/2014:20:50:21 +0000] - import userRoot: Flushing caches...
[02/Apr/2014:20:50:21 +0000] - import userRoot: Closing files...
[02/Apr/2014:20:50:21 +0000] - import userRoot: Import complete. Processed 453 entries in 3 seconds. (151.00 entries/sec)
[02/Apr/2014:20:50:21 +0000] NSMMReplicationPlugin - multimaster_be_state_change: replica dc=example,dc=com is coming online; enabling replication
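
To line up entries like these against another server's log, it helps to parse the bracketed timestamps rather than compare them by eye. The following is only a minimal sketch: the helper name parse_ts is made up for illustration, and it assumes every line starts with a timestamp in the format seen in the excerpt above (English month abbreviation, numeric UTC offset).

```python
import re
from datetime import datetime

# Matches the leading "[02/Apr/2014:20:50:18 +0000]" part of a dirsrv log line.
TS_RE = re.compile(r"^\[([^\]]+)\]")


def parse_ts(line):
    """Return the timestamp at the start of a log line as an aware datetime.

    Hypothetical helper; assumes the bracketed format shown in the excerpt.
    """
    match = TS_RE.match(line)
    if match is None:
        raise ValueError("line does not start with a bracketed timestamp")
    return datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")


# The two state-change lines from the replica error log above.
offline = ("[02/Apr/2014:20:50:18 +0000] NSMMReplicationPlugin - "
           "multimaster_be_state_change: replica dc=example,dc=com is going "
           "offline; disabling replication")
online = ("[02/Apr/2014:20:50:21 +0000] NSMMReplicationPlugin - "
          "multimaster_be_state_change: replica dc=example,dc=com is coming "
          "online; enabling replication")

# Seconds the backend spent offline for the import.
elapsed = (parse_ts(online) - parse_ts(offline)).total_seconds()
print(elapsed)
```

The offline window comes out at 3 seconds, which agrees with the "Processed 453 entries in 3 seconds" line in the import summary.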

On the master, access log:

[02/Apr/2014:20:50:17 +0000] conn=1365 op=15 MOD dn="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config"

This is the operation that triggers the replica init. Then ipa-replica-install polls for agreement status:

[02/Apr/2014:20:50:19 +0000] conn=1365 op=16 SRCH base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config" scope=0 filter="(objectClass=*)" attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh nsds5replicaLastInitEnd"
[02/Apr/2014:20:50:19 +0000] conn=1365 op=16 RESULT err=0 tag=101 nentries=1 etime=0
[02/Apr/2014:20:50:20 +0000] conn=1365 op=17 SRCH base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config" scope=0 filter="(objectClass=*)" attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh nsds5replicaLastInitEnd"
[02/Apr/2014:20:50:20 +0000] conn=1365 op=17 RESULT err=0 tag=101 nentries=1 etime=0
[02/Apr/2014:20:50:21 +0000] conn=1365 op=18 SRCH base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config" scope=0 filter="(objectClass=*)" attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh nsds5replicaLastInitEnd"
[02/Apr/2014:20:50:21 +0000] conn=1365 op=18 RESULT err=0 tag=101 nentries=1 etime=0
[02/Apr/2014:20:50:22 +0000] conn=1365 op=19 SRCH base="cn=meToipa2.example.com,cn=replica,cn=dc\3Dexample\2Cdc\3Dcom,cn=mapping tree,cn=config" scope=0 filter="(objectClass=*)" attrs="nsds5replicaLastInitStart nsds5replicaUpdateInProgress nsds5replicaLastInitStatus cn nsds5BeginReplicaRefresh nsds5replicaLastInitEnd"
[02/Apr/2014:20:50:22 +0000] conn=1365 op=19 RESULT err=0 tag=101 nentries=1 etime=1

Something happens here. The replica init is done, according to the replica error log. We don't have the replica access log from around this time to see exactly when the connection was closed, but looking at the ipa code, it would appear that ipa did not see a status of "Total update succeeded". Not sure why the master would not have reported that, unless there was some problem getting back the status from the replica.

[02/Apr/2014:20:50:22 +0000] conn=1365 op=20 UNBIND
[02/Apr/2014:20:50:22 +0000] conn=1365 op=20 fd=114 closed - U1

Then ipa-replica-install closes the connection and reports the error.
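
To make the polling sequence above concrete, here is a rough sketch of the kind of loop involved. It is not the actual ipa code. fetch_agreement is a hypothetical callable standing in for the base-scope search on the agreement entry, and the decision logic is an assumption based only on the attribute names in the SRCH lines and on the "Total update succeeded" status string.

```python
import time

# Attribute names taken from the SRCH lines in the master access log.
IN_PROGRESS = "nsds5replicaUpdateInProgress"
REFRESH = "nsds5BeginReplicaRefresh"
LAST_STATUS = "nsds5replicaLastInitStatus"


def wait_for_total_update(fetch_agreement, interval=1.0, max_polls=60,
                          sleep=time.sleep):
    """Poll the agreement entry until the total update appears finished.

    fetch_agreement is a hypothetical callable that returns a dict mapping
    attribute names to string values for the replication agreement entry.
    Returns a (succeeded, last_status) tuple.
    """
    for _ in range(max_polls):
        entry = fetch_agreement()
        # Assumption: the refresh attribute is present while init is pending.
        refresh_pending = REFRESH in entry
        in_progress = entry.get(IN_PROGRESS, "FALSE").upper() == "TRUE"
        if not refresh_pending and not in_progress:
            status = entry.get(LAST_STATUS, "")
            # Anything other than the success string counts as a failure.
            return ("Total update succeeded" in status, status)
        sleep(interval)
    return (False, "timed out waiting for total update")
```

With a loop of this shape, a status that is empty or never updated on the master is reported as a failure even if the replica finished its import, which would match the behaviour seen here.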



On Thu, Apr 3, 2014 at 10:38 AM, Rich Megginson <rmegg...@redhat.com> wrote:

    On 04/02/2014 09:22 PM, Nevada Sanchez wrote:
    Okay. Updated the gist with the additional logs:
    https://gist.github.com/nevsan/8b6f78d7396963dc5f70



    1) Dirsrv is crashing:
    [02/Apr/2014:20:49:53 +0000] - 389-Directory/1.3.1.22.a1 B2014.073.1751 starting up
    [02/Apr/2014:20:49:54 +0000] - Db home directory is not set. Possibly nsslapd-directory (optionally nsslapd-db-home-directory) is missing in the config file.
    [02/Apr/2014:20:49:54 +0000] - I'm resizing my cache now...cache was 710029312 and is now 8000000
    [02/Apr/2014:20:49:54 +0000] - 389-Directory/1.3.1.22.a1 B2014.073.1751 starting up
    [02/Apr/2014:20:49:54 +0000] - Detected Disorderly Shutdown last time Directory Server was running, recovering database.
    [02/Apr/2014:20:49:55 +0000] - slapd started. Listening on All Interfaces port 389 for LDAP requests

    Please use the instructions at
    http://port389.org/wiki/FAQ#Debugging_Crashes to get a core dump
    and stack trace.

    2) The first occurrence of the connection error is at
    [02/Apr/2014:20:52:38 +0000] but there isn't anything in the
    consumer error log after [02/Apr/2014:20:50:21 +0000] and in the
    consumer access log after [02/Apr/2014:20:50:22 +0000]


    On Wed, Apr 2, 2014 at 9:38 PM, Rich Megginson <rmegg...@redhat.com> wrote:

        On 04/02/2014 03:01 PM, Nevada Sanchez wrote:
        Okay, I ran it with debug on. The output is quite large. I'm
        not sure what the etiquette is for posting large logs, so I
        threw it on gist here:
        
        https://gist.githubusercontent.com/nevsan/8b6f78d7396963dc5f70/raw/b76b3c3acce4f12d292d680f4c1dab39c05888d5/gistfile1.txt


        Let me know if I should copy it into the thread instead.

        Ok.  Now can you post excerpts from the dirsrv errors log
        from both the master replica and the replica from around the
        time of the failure?




        On Wed, Apr 2, 2014 at 1:49 PM, Rich Megginson <rmegg...@redhat.com> wrote:

            On 04/02/2014 11:45 AM, Nevada Sanchez wrote:
            My apologies. I mistakenly ran the failing ldapsearch
            as an unprivileged user (which couldn't read the
            slapd-EXAMPLE-COM directory). Running as root, it now
            works just fine (same result as the one that worked).
            SSL seems not to be the issue. Also, I haven't changed
            the SSL certs since I first set up the master.

            I have been doing the replica side things from scratch
            (even so far as starting with a new machine). For the
            master side, I have just been re-preparing the replica.
            I hope I don't have to start from scratch with the
            master replica.

            I guess the next step would be to do the
            ipa-replica-install using -ddd and review the extra
            debug information that comes out.




            On Wed, Apr 2, 2014 at 11:45 AM, Rob Crittenden <rcrit...@redhat.com> wrote:

                Rich Megginson wrote:

                    On 04/02/2014 09:20 AM, Nevada Sanchez wrote:

                        Okay, we might be on to something:

                        ipa -> ipa2
                        ================================
                        $ LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM ldapsearch -xLLLZZ -h ipa2.example.com -s base -b "" 'objectclass=*' vendorVersion
                        dn:
                        vendorVersion: 389-Directory/1.3.1.22.a1 B2014.073.1751
                        ================================

                        ipa2 -> ipa
                        ================================
                        $ LDAPTLS_CACERTDIR=/etc/dirsrv/slapd-EXAMPLE-COM ldapsearch -xLLLZZ -h ipa.example.com -s base -b "" 'objectclass=*' vendorVersion
                        ldap_start_tls: Connect error (-11)
                        additional info: TLS error -8172:Peer's certificate issuer has been marked as not trusted by the user.
                        ================================

                        The original IPA trusts the replica (since it signed the cert, I
                        assume), but the replica doesn't trust the main IPA server. I guess
                        the ZZ option would have shown me the failure that I missed in my
                        initial ldapsearch tests.

                    -Z[Z]  Issue StartTLS (Transport Layer Security) extended operation. If
                           you use -ZZ, the command will require the operation to be
                           successful.

                    i.e. use SSL, and force a successful handshake


                        Anyway, what's the best way to remedy this in a way that makes IPA
                        happy? (I've found that LDAP can have different requirements on which
                        certs go where).


                    I'm not sure. ipa-server-install/ipa-replica-prepare/ipa-replica-install
                    is supposed to take care of installing the CA cert properly for you. If
                    you try to hack it and install the CA cert manually, you will probably
                    miss something else that ipa install did not do.

                    I think the only way to ensure that you have a properly configured ipa
                    server + replicas is to get all of the ipa commands completing
                    successfully.

                    Which means going back to the drawing board and starting over from
                    scratch.


                You can compare the certs that each side is using with:

                # certutil -L -d /etc/dirsrv/slapd-EXAMPLE-COM

                Did you by chance replace the SSL server certs that
                IPA uses on your working master?

                rob








