KyrieG created RANGER-3987:
------------------------------
Summary: Potential risk of OOM
Key: RANGER-3987
URL: https://issues.apache.org/jira/browse/RANGER-3987
Project: Ranger
Issue Type: Bug
Components: admin
Affects Versions: 2.2.0
Reporter: KyrieG
On every policy load, the attribute "LastActivationTimeInMillis" is set to
System.currentTimeMillis(). See loadPolicy():
{code:java}
// from PolicyRefresher.java loadPolicy()
// load policy from PolicyAdmin
ServicePolicies svcPolicies = loadPolicyfromPolicyAdmin();
if (svcPolicies == null) {
    // if Policy fetch from Policy Admin fails, load from cache
    if (!policiesSetInPlugin) {
        svcPolicies = loadFromCache();
    }
}
if (PERF_POLICYENGINE_INIT_LOG.isDebugEnabled()) {
    long freeMemory = Runtime.getRuntime().freeMemory();
    long totalMemory = Runtime.getRuntime().totalMemory();
    PERF_POLICYENGINE_INIT_LOG.debug("In-Use memory: " + (totalMemory - freeMemory)
            + ", Free memory:" + freeMemory);
}
if (svcPolicies != null) {
    plugIn.setPolicies(svcPolicies);
    policiesSetInPlugin = true;
    serviceDefSetInPlugin = false;
    // always updated during each policy loading
    setLastActivationTimeInMillis(System.currentTimeMillis());
    lastKnownVersion = svcPolicies.getPolicyVersion() != null
            ? svcPolicies.getPolicyVersion() : -1L;
} else {
    if (!policiesSetInPlugin && !serviceDefSetInPlugin) {
        plugIn.setPolicies(null);
        serviceDefSetInPlugin = true;
    }
}
{code}
As a result, the "info" column of table "x_plugin_info" has to be updated every
time, since it is a JSON string containing activationTime. See
doCreateOrUpdateXXPluginInfo():
{code:java}
// from AssetMgr, doCreateOrUpdateXXPluginInfo().
if (lastPolicyActivationTime != null && lastPolicyActivationTime > 0
        && (dbObj.getPolicyActivationTime() == null
            || !dbObj.getPolicyActivationTime().equals(lastPolicyActivationTime))) {
    dbObj.setPolicyActivationTime(lastPolicyActivationTime);
    needsUpdating = true;
}
{code}
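To make the effect of that check concrete, here is a minimal standalone sketch of the same condition. The class and method names are hypothetical and are not part of Ranger; only the boolean expression mirrors the snippet above. Because the plugin reports a fresh timestamp after each load, the reported value differs from the stored one and the row is marked for update.

```java
class ActivationTimeCheck {
    // Same condition as in the snippet above, extracted for illustration only.
    static boolean needsUpdating(Long storedActivationTime, Long reportedActivationTime) {
        return reportedActivationTime != null && reportedActivationTime > 0
                && (storedActivationTime == null
                    || !storedActivationTime.equals(reportedActivationTime));
    }

    public static void main(String[] args) {
        Long stored = 1_000L;
        // A plugin that reloaded policies reports a newer timestamp: update needed.
        System.out.println(needsUpdating(stored, 1_001L));
        // Only an identical timestamp avoids the DB write.
        System.out.println(needsUpdating(stored, 1_000L));
    }
}
```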
doCreateOrUpdateXXPluginInfo() is run as a Runnable submitted to
RangerTransactionService (RangerTransactionSynchronizationAdapter in Ranger
2.3.0, though the risk might still be there). See where
doCreateOrUpdateXXPluginInfo() is wrapped:
{code:java}
commitWork = new Runnable() {
    @Override
    public void run() {
        doCreateOrUpdateXXPluginInfo(pluginInfo, entityType,
                isTagVersionResetNeeded, clusterName);
    }
};
...
activityLogger.commitAfterTransactionComplete(commitWork);
{code}
RangerTransactionService uses a ScheduledExecutorService, a thread pool with an
unbounded work queue, to hold the Runnables that no thread is free to run yet.
In our case there are 1000+ hive and hbase instances, and the ranger admin seems
to be under tremendous pressure because every instance periodically calls the
policy-download API and triggers an update of the table "x_plugin_info". Since
the core pool size seems to be too small and the DB is also likely under
pressure, the work queue keeps growing, eats up the JVM heap and finally causes
OOM.
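The backlog behaviour can be reproduced outside Ranger with a small sketch. It assumes the pool is a java.util.concurrent.ScheduledThreadPoolExecutor, whose work queue is unbounded by design. The class name, method name and the latch that stands in for a slow DB write are all made up for this illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class UnboundedQueueDemo {
    // Submits `tasks` runnables while every worker is blocked and returns
    // how many of them are left waiting in the executor's queue.
    static int backlogWhileWorkersBlocked(int coreThreads, int tasks) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(coreThreads);
        CountDownLatch release = new CountDownLatch(1);           // stands in for a slow DB write
        CountDownLatch started = new CountDownLatch(coreThreads); // all workers are busy
        try {
            for (int i = 0; i < tasks; i++) {
                pool.execute(() -> {
                    started.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            started.await(10, TimeUnit.SECONDS);
            // Nothing was rejected: everything beyond coreThreads sits on the heap.
            return pool.getQueue().size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(backlogWhileWorkersBlocked(2, 1000));
    }
}
```

With 2 core threads and 1000 submissions, all submissions are accepted and the ones that no worker has picked up stay in the queue, which is the pattern that grows without limit when requests arrive faster than the DB can absorb them.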
I think adding more core threads would help, but as the system grows this part
of the code would still bring a lot of overhead. Is there any better solution?
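One direction that might be worth discussing is coalescing pending updates per plugin, so the backlog is bounded by the number of plugins rather than the number of requests. The sketch below is only an assumption about how that could look. CoalescingUpdater, its constructor arguments and the key/value types are hypothetical and do not exist in Ranger.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.BiConsumer;

class CoalescingUpdater<K, V> {
    private final ConcurrentHashMap<K, V> pending = new ConcurrentHashMap<>();
    private final Executor executor;        // e.g. the existing thread pool
    private final BiConsumer<K, V> writer;  // e.g. the DB update for one plugin

    CoalescingUpdater(Executor executor, BiConsumer<K, V> writer) {
        this.executor = executor;
        this.writer = writer;
    }

    // Keeps only the latest value per key; a task is queued only when no
    // update for that key is already waiting.
    void submit(K key, V value) {
        if (pending.put(key, value) == null) {
            executor.execute(() -> {
                V latest = pending.remove(key);
                if (latest != null) {
                    writer.accept(key, latest);
                }
            });
        }
    }
}
```

With this shape, a plugin that reports 100 times before its update runs still occupies a single queue slot, and only the most recent value is written.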