Mike Woodcock created AMBARI-25509:
--------------------------------------
Summary: Ambari agent repository checking causing rpmdb corruption
Key: AMBARI-25509
URL: https://issues.apache.org/jira/browse/AMBARI-25509
Project: Ambari
Issue Type: Bug
Components: ambari-agent
Affects Versions: 2.7.3
Environment: RHEL 7, with Ambari and the agent running as root in a
Centrify-managed environment. Ambari is part of Hortonworks (now Cloudera) HDF
(NiFi) 3.4.1.1; the Ambari version is 2.7.3. Outside of Ambari, when run as
root, the call to check yum repos takes 20 minutes or more. In our environment,
running the repo check as root locks the rpmdb.
Ambari-agent runs its repo check and eventually sends kill -9 to the task,
corrupting our rpmdb. As a side effect, yum-cron now hangs because it can't
access the repo either, and many yum repo-check cron processes are left
hanging in the OS. Our nightly NetBackup file backups are also
failing.
This has been happening in four different clusters, on almost all 39 servers,
since late January. These environments have lots of service restarts.
Reporter: Mike Woodcock
This is the same issue as https://issues.apache.org/jira/browse/AMBARI-6702,
under slightly different circumstances. Using Ambari 2.7.3. Outside of Ambari,
when run as root, the call to check yum repos takes 20 minutes or more. In our
environment, running the repo check as root locks the rpmdb. Because
Ambari-agent cannot wait 20 minutes, it sends kill -9 to its repo check, and
our rpmdb becomes corrupted.
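
To illustrate the kind of timeout handling that would avoid an immediate kill -9, here is a minimal Python sketch. It is not Ambari-agent's actual code: the function name run_with_graceful_timeout and its parameters are hypothetical. It also assumes the child process cleans up after itself when it receives SIGTERM, which I have not verified for the repo check in this environment. The idea is to send SIGTERM when the timeout expires, wait for a grace period, and fall back to SIGKILL only if the process is still alive.

```python
import signal
import subprocess
import sys


def run_with_graceful_timeout(cmd, timeout_s, grace_s=30.0):
    """Run cmd; on timeout send SIGTERM, wait grace_s, then SIGKILL as a last resort.

    Hypothetical helper for illustration only (not part of Ambari).
    Returns (returncode, combined_output, timed_out).
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return proc.returncode, out, False
    except subprocess.TimeoutExpired:
        # Ask politely first so the child can release its locks.
        proc.terminate()  # SIGTERM on POSIX
        try:
            out, _ = proc.communicate(timeout=grace_s)
        except subprocess.TimeoutExpired:
            # Still running after the grace period: force it.
            proc.kill()  # SIGKILL on POSIX
            out, _ = proc.communicate()
        return proc.returncode, out, True


if __name__ == "__main__":
    # Stand-in for a slow repo check: a child that sleeps far past the timeout.
    slow = [sys.executable, "-c", "import time; time.sleep(30)"]
    code, _, timed_out = run_with_graceful_timeout(slow, timeout_s=0.5, grace_s=5.0)
    print(code == -signal.SIGTERM, timed_out)
```

On POSIX, a negative return code is the number of the signal that ended the child, so the caller can tell whether SIGTERM was enough or SIGKILL was needed.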