Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again
On Thu, May 03, 2007 at 08:57:10PM +0200, Gyuris Szabolcs wrote:
> I stopped the slapd, then started and tried to run slapcat:
>
> bdb_db_open: unclean shutdown detected; attempting recovery.
> bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.
> bdb_db_open: alock_recover failed
> bdb_db_close: alock_close failed
> backend_startup_one: bi_db_open failed! (-1)
> slap_startup failed
> Segmentation fault.
>
> I have no db_stat :(

That would be db4.2_stat, in the db4.2-util package.

--
Steve Langasek                   Give me a lever long enough and a Free OS
Debian Developer                   to set it on, and I can move the world.
[EMAIL PROTECTED]                                   http://www.debian.org/

--
To UNSUBSCRIBE, email to [EMAIL PROTECTED] with a subject of unsubscribe. Trouble? Contact [EMAIL PROTECTED]
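As a small sketch of the point above: on Debian the Berkeley DB utilities may be installed under versioned names such as db4.2_stat rather than plain db_stat, so a search for db_stat can come up empty. The helper name find_db_tool and the list of candidate names are assumptions made for illustration; they do not come from the thread or from any package.

```shell
#!/bin/sh
# Hypothetical helper: print the first Berkeley DB utility found on PATH.
# $1 is the tool suffix, e.g. "stat" or "recover".
# The candidate prefixes are an assumption; adjust them to the installed libdb version.
find_db_tool() {
  for name in "db4.2_$1" "db_$1"; do
    if command -v "$name" >/dev/null 2>&1; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  echo "no db_$1 variant found; on Debian try the db4.2-util package" >&2
  return 1
}
```

Calling `find_db_tool stat` prints the name of whichever variant is installed, which can then be run with `-h` pointing at the database directory.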
Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again - news
Finally I found some relevant logs:

May 3 19:46:43 conn=63 op=1542 DEL dn=uid=user_domain1.com,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
conn=63 op=1542 RESULT tag=107 err=0 text=
conn=63 op=1543 DEL dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
bdb(dc=domain,dc=hu): page 3: illegal page type or format
bdb(dc=domain,dc=hu): PANIC: Invalid argument
=> bdb_idl_delete_key: c_close failed: DB_RUNRECOVERY: Fatal error, run database recovery (-30978)
bdb(dc=domain,dc=hu): PANIC: fatal region error detected; run recovery
conn=63 op=1543 RESULT tag=107 err=80 text=entry index delete failed
conn=63 op=1544 DEL dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualWeb,dc=domain,dc=hu
bdb(dc=domain,dc=hu): PANIC: fatal region error detected; run recovery
conn=63 op=1544 RESULT tag=107 err=80 text=internal error
conn=63 op=1545 DEL dn=domain=domain3.org,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
bdb(dc=domain,dc=hu): PANIC: fatal region error detected; run recovery
conn=63 op=1545 RESULT tag=107 err=80 text=internal error

I think conn=63 is the connection from slurpd, the master LDAP server's replication daemon. The objects selected for deletion existed in the slave LDAP database:

May 2 18:25:57 conn=63 op=1310 ADD dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualWeb,dc=domain,dc=hu
conn=63 op=1310 RESULT tag=105 err=0 text=
conn=63 op=1311 ADD dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
conn=63 op=1311 RESULT tag=105 err=0 text=

What was happening here? There were no shutdowns, no crashes, just:

bdb(dc=domain,dc=hu): page 3: illegal page type or format
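The log excerpt above pairs each request (DEL, ADD) with a RESULT line through its conn=/op= numbers. A minimal shell sketch of pulling out only the failed operations follows. The function name failed_ops is hypothetical, and it assumes each log record sits on its own line in the format shown above; a syslog prefix before conn= is tolerated.

```shell
#!/bin/sh
# Hypothetical helper: read a slapd log on stdin and print every operation
# whose RESULT line carries a non-zero err= code, next to its request line.
failed_ops() {
  awk '
    # Only lines that carry a conn/op pair are considered; bdb(...) lines are skipped.
    match($0, /conn=[0-9]+ op=[0-9]+/) {
      key  = substr($0, RSTART, RLENGTH)
      rest = substr($0, RSTART + RLENGTH + 1)
      if (rest ~ /^RESULT /) {
        # err=0 means success; anything else is reported with the stored request.
        if (rest !~ / err=0( |$)/) print key ": " req[key] " -> " rest
        delete req[key]
      } else {
        # Remember the request (DEL, ADD, ...) until its RESULT arrives.
        req[key] = rest
      }
    }
  '
}
```

It would be used as `failed_ops < logfile`, where the log file path depends on the local syslog configuration.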
Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again - news
--On Friday, May 04, 2007 11:33 AM +0200 Gyuris Szabolcs [EMAIL PROTECTED] wrote:

> Finally I found some relevant logs:
>
> May 3 19:46:43 conn=63 op=1542 DEL dn=uid=user_domain1.com,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
> conn=63 op=1542 RESULT tag=107 err=0 text=
> conn=63 op=1543 DEL dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
> bdb(dc=domain,dc=hu): page 3: illegal page type or format
> bdb(dc=domain,dc=hu): PANIC: Invalid argument
> => bdb_idl_delete_key: c_close failed: DB_RUNRECOVERY: Fatal error, run database recovery (-30978)
> bdb(dc=domain,dc=hu): PANIC: fatal region error detected; run recovery
> conn=63 op=1543 RESULT tag=107 err=80 text=entry index delete failed
> conn=63 op=1544 DEL dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualWeb,dc=domain,dc=hu
> bdb(dc=domain,dc=hu): PANIC: fatal region error detected; run recovery
> conn=63 op=1544 RESULT tag=107 err=80 text=internal error
> conn=63 op=1545 DEL dn=domain=domain3.org,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
> bdb(dc=domain,dc=hu): PANIC: fatal region error detected; run recovery
> conn=63 op=1545 RESULT tag=107 err=80 text=internal error
>
> I think conn=63 is the connection from slurpd, the master LDAP server's replication daemon. The objects selected for deletion existed in the slave LDAP database:
>
> May 2 18:25:57 conn=63 op=1310 ADD dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualWeb,dc=domain,dc=hu
> conn=63 op=1310 RESULT tag=105 err=0 text=
> conn=63 op=1311 ADD dn=domain=domain2.co.hu,domain=domain1.com,ou=virtualFtp,dc=domain,dc=hu
> conn=63 op=1311 RESULT tag=105 err=0 text=
>
> What was happening here? There were no shutdowns, no crashes, just:
>
> bdb(dc=domain,dc=hu): page 3: illegal page type or format

Well, that is a new one to me. :/ Google isn't helping much with it either, so I'll consult upstream and see what I get.

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
Bug#419222: it crashed again
I shouted from the rooftops. The slapd made some big mistake and the nagios plugin said:

Could not search/find objectclasses in dc=domain,dc=tld

I stopped the slapd, then started and tried to run slapcat:

bdb_db_open: unclean shutdown detected; attempting recovery.
bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.
bdb_db_open: alock_recover failed
bdb_db_close: alock_close failed
backend_startup_one: bi_db_open failed! (-1)
slap_startup failed
Segmentation fault.

I have no db_stat :(

# locate db_stat
/usr/share/mysql/mysql-test/r/have_ndb_status_ok.require

I guess that the failure occurs when the master slapd makes an update to the slave... :(
Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again
--On May 3, 2007 8:57:10 PM +0200 Gyuris Szabolcs [EMAIL PROTECTED] wrote:

> I shouted from the rooftops. The slapd made some big mistake and the nagios plugin said:
>
> Could not search/find objectclasses in dc=domain,dc=tld
>
> I stopped the slapd, then started and tried to run slapcat:
>
> bdb_db_open: unclean shutdown detected; attempting recovery.
> bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.

This says that you are running in read-only mode, so it will not try and recover the database. And it appears to believe you have had an unclean shutdown. How was slapd and/or the system last stopped?

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again
Quanah Gibson-Mount wrote:
> --On May 3, 2007 8:57:10 PM +0200 Gyuris Szabolcs [EMAIL PROTECTED] wrote:
>> I shouted from the rooftops. The slapd made some big mistake and the nagios plugin said:
>>
>> Could not search/find objectclasses in dc=domain,dc=tld
>>
>> I stopped the slapd, then started and tried to run slapcat:
>>
>> bdb_db_open: unclean shutdown detected; attempting recovery.
>> bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.
>
> This says that you are running in read-only mode, so it will not try and recover the database. And it appears to believe you have had an unclean shutdown. How was slapd and/or the system last stopped?

It wasn't stopped at all. I think that when the master slapd tries to make an update to the slave, something horrible happens and the database becomes corrupt.
Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again
--On May 3, 2007 9:53:25 PM +0200 Gyuris Szabolcs [EMAIL PROTECTED] wrote:

> Quanah Gibson-Mount wrote:
>> --On May 3, 2007 8:57:10 PM +0200 Gyuris Szabolcs [EMAIL PROTECTED] wrote:
>>> I shouted from the rooftops. The slapd made some big mistake and the nagios plugin said:
>>>
>>> Could not search/find objectclasses in dc=domain,dc=tld
>>>
>>> I stopped the slapd, then started and tried to run slapcat:
>>>
>>> bdb_db_open: unclean shutdown detected; attempting recovery.
>>> bdb_db_open: Recovery skipped in read-only mode. Run manual recovery if errors are encountered.
>>
>> This says that you are running in read-only mode, so it will not try and recover the database. And it appears to believe you have had an unclean shutdown. How was slapd and/or the system last stopped?
>
> It wasn't stopped at all. I think that when the master slapd tries to make an update to the slave, something horrible happens and the database becomes corrupt.

There's something really odd about your environment. The error about an unclean shutdown only occurs if slapd has been stopped in an unclean fashion. Now you note that you stopped slapd and then ran slapcat. It sounds then like slapd faulted on shutdown rather than shutting down cleanly, which means you would have to run db_recover prior to running slapcat.

I recall having an issue on slapd shutdown recently, let me see if I can go dig that up.

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration
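The order of operations described above (stop slapd, run recovery, only then run slapcat) can be sketched as a small shell function. This is a dry-run sketch under assumptions, not a tested procedure: the function names run and recover_then_dump are hypothetical, the db4.2_recover name follows the db4.2-util naming mentioned earlier in the thread, and the database directory and LDIF path are whatever the caller passes in.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0 (the default is a dry run).
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = 0 ]; then "$@"; fi
}

# $1 = Berkeley DB environment directory, $2 = LDIF file to write.
# Both paths are supplied by the caller; nothing here is taken from the reporter's system.
recover_then_dump() {
  db_dir=$1
  ldif=$2
  run /etc/init.d/slapd stop &&            # no slapd may be attached during recovery
  run db4.2_recover -v -h "$db_dir" &&     # manual recovery, since slapcat opens read-only
  run slapcat -l "$ldif"                   # dump only after recovery succeeded
}
```

With the default DRY_RUN=1, `recover_then_dump /var/lib/ldap /tmp/dump.ldif` only prints the three commands in order, so the sequence can be reviewed before anything touches the database. The /var/lib/ldap path is an assumed example.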
Bug#419222: [Pkg-openldap-devel] Bug#419222: it crashed again
--On May 3, 2007 1:45:42 PM -0700 Quanah Gibson-Mount [EMAIL PROTECTED] wrote:

> There's something really odd about your environment. The error about an unclean shutdown only occurs if slapd has been stopped in an unclean fashion. Now you note that you stopped slapd and then ran slapcat. It sounds then like slapd faulted on shutdown rather than shutting down cleanly, which means you would have to run db_recover prior to running slapcat.
>
> I recall having an issue on slapd shutdown recently, let me see if I can go dig that up.

Okay, I found it. ITS#4855 and ITS#4899 deal with slapd crashing on exit. The patches are to libldap_r/tpool.c:

http://www.openldap.org/devel/cvsweb.cgi/libraries/libldap_r/tpool.c?hideattic=1&only_on_branch=OPENLDAP_REL_ENG_2_3

Revisions 1.30.2.18 and 1.30.2.19 respectively deal with this issue; however, the changes made in 1.30.2.17 need to go in first. Unfortunately, I'm not sure if the changes for ITS#4805 (the 1.30.2.17 commit) are isolated from other commits made for that ITS. The diff for the 3 versions does apply cleanly:

http://www.openldap.org/devel/cvsweb.cgi/libraries/libldap_r/tpool.c.diff?hideattic=1&r1=text&tr1=1.30.2.16&r2=text&tr2=1.30.2.19&f=u

--Quanah

--
Quanah Gibson-Mount
Principal Software Engineer
Zimbra, Inc
Zimbra :: the leader in open source messaging and collaboration