[ 
https://issues.apache.org/jira/browse/AMBARI-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alejandro Fernandez updated AMBARI-6702:
----------------------------------------

    Summary: Ambari detects RPM DB corruption  (was: Ambari detects RPM 
corruption)

> Ambari detects RPM DB corruption
> --------------------------------
>
>                 Key: AMBARI-6702
>                 URL: https://issues.apache.org/jira/browse/AMBARI-6702
>             Project: Ambari
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Alejandro Fernandez
>            Assignee: Alejandro Fernandez
>             Fix For: 1.7.0
>
>
> Users have described scenarios in which the RPM DB becomes corrupt, usually 
> after stopping all services, rebooting all hosts (including the server), and 
> restarting all services.
> http://hortonworks.com/community/forums/topic/cant-restart-cluster-ambari-not-proving-useful/
> http://hortonworks.com/community/forums/topic/ambari-corrupts-rpmdb/
> * Problem: yum commands fail to run because the RPM database is corrupt.
> * Symptom: The Ambari agent log shows entries like the following, and "rpm -qa" fails with Berkeley DB errors:
> {code}
> INFO 2014-04-24 05:30:11,051 Controller.py:186 - RegistrationCommand received 
> - repeat agent registration
> ERROR 2014-04-24 05:33:22,669 PackagesAnalyzer.py:43 - Task timed out and 
> will be killed
> INFO 2014-04-24 05:35:12,815 HostCheckReportFileHandler.py:43 - Host check 
> report at /var/lib/ambari-agent/data/hostcheck.result
> INFO 2014-04-24 05:35:12,845 HostCheckReportFileHandler.py:104 - Removing old 
> host check file at /var/lib/ambari-agent/data/hostcheck.result
> INFO 2014-04-24 05:35:12,845 HostCheckReportFileHandler.py:109 - Creating 
> host check file at /var/lib/ambari-agent/data/hostcheck.result
> root@xhadoopm32p rpm# rpm -qa
> rpmdb: Thread/process 30282/xx failed: Thread died in Berkeley DB library
> error: db3 error(30974) from dbenv>failchk: DB_RUNRECOVERY: Fatal error, run 
> database recovery
> error: cannot open Packages index using db3 - (-30974)
> error: cannot open Packages database in /var/lib/rpm
> rpmdb: Thread/process 30282/xx failed: Thread died in Berkeley DB library
> error: db3 error(30974) from dbenv>failchk: DB_RUNRECOVERY: Fatal error, run 
> database recovery
> error: cannot open Packages database in /var/lib/rpm
> {code}
> * Fix:
> Run the following
> {code}
> rm /var/lib/rpm/__db*
> rpm --rebuilddb
> {code}
> This appears to be an underlying issue with yum (either a lock is not 
> released, or multiple yum commands are run in parallel). To reduce how 
> often it occurs, the agent's PackagesAnalyzer will increase the time it 
> waits for the "yum list available" and "yum list installed" commands from 
> 10 seconds to 20 seconds.
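
The timeout change described above can be sketched roughly as follows. This is not the actual PackagesAnalyzer code: the names run_with_timeout, looks_like_rpmdb_corruption, PACKAGE_QUERY_TIMEOUT_SECS and RPMDB_CORRUPTION_MARKER are hypothetical, and the sketch assumes Python 3's subprocess.run, which the agent of that era may not have used. It only illustrates running a package query with a 20-second limit and checking the output for the DB_RUNRECOVERY marker seen in the log above.

```python
import subprocess

# Hypothetical constant: the description says the wait is raised from 10 to 20 seconds.
PACKAGE_QUERY_TIMEOUT_SECS = 20

# Marker copied from the rpm error output quoted in the issue description.
RPMDB_CORRUPTION_MARKER = "DB_RUNRECOVERY"


def run_with_timeout(command, timeout_secs=PACKAGE_QUERY_TIMEOUT_SECS):
    """Run command (a list of arguments) and return (output, timed_out).

    stdout and stderr are merged so that rpm/yum error text is captured too.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout_secs,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return "", True
    return result.stdout, False


def looks_like_rpmdb_corruption(output):
    """Return True if the command output contains the Berkeley DB recovery marker."""
    return RPMDB_CORRUPTION_MARKER in output
```

A caller might use it as `output, timed_out = run_with_timeout(["yum", "list", "installed"])` and then log a clear warning when `timed_out` is true or `looks_like_rpmdb_corruption(output)` returns true, instead of only reporting that the task was killed.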



--
This message was sent by Atlassian JIRA
(v6.2#6252)
