[
https://issues.apache.org/jira/browse/AMBARI-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Onischuk updated AMBARI-6702:
------------------------------------
Assignee: (was: Andrew Onischuk)
> Ambari detects RPM DB corruption
> --------------------------------
>
> Key: AMBARI-6702
> URL: https://issues.apache.org/jira/browse/AMBARI-6702
> Project: Ambari
> Issue Type: Bug
> Components: client
> Affects Versions: 1.5.0
> Reporter: Alejandro Fernandez
> Fix For: 1.7.0
>
>
> Users have described scenarios in which the RPM DB becomes corrupt, usually
> after stopping all services, rebooting all hosts (including the server), and
> restarting all services.
> http://hortonworks.com/community/forums/topic/cant-restart-cluster-ambari-not-proving-useful/
> http://hortonworks.com/community/forums/topic/ambari-corrupts-rpmdb/
> * Problem: yum commands fail to run because the RPM database is corrupt.
> * Symptom: The Ambari agent log shows entries like the following, and rpm commands fail with Berkeley DB errors:
> {code}
> INFO 2014-04-24 05:30:11,051 Controller.py:186 - RegistrationCommand received - repeat agent registration
> ERROR 2014-04-24 05:33:22,669 PackagesAnalyzer.py:43 - Task timed out and will be killed
> INFO 2014-04-24 05:35:12,815 HostCheckReportFileHandler.py:43 - Host check report at /var/lib/ambari-agent/data/hostcheck.result
> INFO 2014-04-24 05:35:12,845 HostCheckReportFileHandler.py:104 - Removing old host check file at /var/lib/ambari-agent/data/hostcheck.result
> INFO 2014-04-24 05:35:12,845 HostCheckReportFileHandler.py:109 - Creating host check file at /var/lib/ambari-agent/data/hostcheck.result
> root@xhadoopm32p rpm# rpm -qa
> rpmdb: Thread/process 30282/xx failed: Thread died in Berkeley DB library
> error: db3 error(30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
> error: cannot open Packages index using db3 - (-30974)
> error: cannot open Packages database in /var/lib/rpm
> rpmdb: Thread/process 30282/xx failed: Thread died in Berkeley DB library
> error: db3 error(30974) from dbenv->failchk: DB_RUNRECOVERY: Fatal error, run database recovery
> error: cannot open Packages database in /var/lib/rpm
> {code}
> * Fix: Run the following commands:
> {code}
> rm /var/lib/rpm/__db*
> rpm --rebuilddb
> {code}
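The symptom above can be recognized programmatically from the rpm output. The following is only a minimal sketch, not the agent's actual code: the function name `looks_like_rpmdb_corruption` and the pattern list are hypothetical, and the patterns are taken solely from the error lines quoted in this issue, so they are illustrative rather than exhaustive.

```python
import re

# Hypothetical signature list, built only from the rpm errors quoted in this issue.
CORRUPTION_PATTERNS = (
    re.compile(r"DB_RUNRECOVERY"),
    re.compile(r"cannot open Packages (index|database)"),
)


def looks_like_rpmdb_corruption(output):
    """Return True if the given rpm/yum output contains a known corruption signature."""
    return any(pattern.search(output) for pattern in CORRUPTION_PATTERNS)
```

A caller could pass the captured stderr of `rpm -qa` to this function and report the host as needing the manual fix, rather than retrying the command.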
> This appears to be an underlying issue with yum (either a lock is not
> released, or multiple yum commands are run in parallel). To reduce how
> often it occurs, the agent's PackagesAnalyzer will increase the timeout
> for the "yum list available" and "yum list installed" commands from
> 10 seconds to 20 seconds.
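To illustrate the timeout change described above, here is a rough sketch of running a command with a 20-second limit. It is not the PackagesAnalyzer implementation: `run_with_timeout` is a hypothetical helper, and it relies on the standard `subprocess.run` timeout (Python 3), which kills the child process when the limit is exceeded.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_secs=20):
    """Run command and return its stdout, or None if it exceeds timeout_secs."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed the child at this point.
        return None
    return result.stdout


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; on an agent host this
    # would be something like ["yum", "list", "installed"].
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```

A longer limit only makes it less likely that a slow yum call is killed partway through; it does not address the locking or parallel-execution causes suspected in the description.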