KyrieG created RANGER-3987:
------------------------------

             Summary: Potential risk of OOM
                 Key: RANGER-3987
                 URL: https://issues.apache.org/jira/browse/RANGER-3987
             Project: Ranger
          Issue Type: Bug
          Components: admin
    Affects Versions: 2.2.0
            Reporter: KyrieG


 

On every policy load, the attribute "LastActivationTimeInMillis" is always 
set to System.currentTimeMillis(). See loadPolicy(): 
{code:java}
// from PolicyRefresher.java loadPolicy()

//load policy from PolicyAdmin
ServicePolicies svcPolicies = loadPolicyfromPolicyAdmin();

if (svcPolicies == null) {
   //if Policy fetch from Policy Admin Fails, load from cache
   if (!policiesSetInPlugin) {
      svcPolicies = loadFromCache();
   }
}

if (PERF_POLICYENGINE_INIT_LOG.isDebugEnabled()) {
   long freeMemory = Runtime.getRuntime().freeMemory();
   long totalMemory = Runtime.getRuntime().totalMemory();
   PERF_POLICYENGINE_INIT_LOG.debug("In-Use memory: " + (totalMemory - 
freeMemory) + ", Free memory:" + freeMemory);
}

if (svcPolicies != null) {
   plugIn.setPolicies(svcPolicies);
   policiesSetInPlugin = true;
   serviceDefSetInPlugin = false;
   // always updated during each policy load
   setLastActivationTimeInMillis(System.currentTimeMillis());
   lastKnownVersion = svcPolicies.getPolicyVersion() != null ? svcPolicies.getPolicyVersion() : -1L;
} else {
   if (!policiesSetInPlugin && !serviceDefSetInPlugin) {
      plugIn.setPolicies(null);
      serviceDefSetInPlugin = true;
   }
} {code}
As a result, the column "info" of table "x_plugin_info" always needs to be 
updated, because it is a JSON string that contains the activation time. See 
doCreateOrUpdateXXPluginInfo(): 

 
{code:java}
// from AssetMgr, doCreateOrUpdateXXPluginInfo().
if (lastPolicyActivationTime != null && lastPolicyActivationTime > 0 && 
(dbObj.getPolicyActivationTime() == null || 
!dbObj.getPolicyActivationTime().equals(lastPolicyActivationTime))) {
   dbObj.setPolicyActivationTime(lastPolicyActivationTime);
   needsUpdating = true;
} {code}
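
To make the effect of that condition concrete, here is a minimal standalone 
sketch. The class ActivationTimeCheck and the helper needsUpdate() are 
hypothetical names that only mirror the comparison quoted above; they are not 
part of Ranger.

```java
public class ActivationTimeCheck {
    // Hypothetical helper: same comparison as the quoted condition,
    // with the stored and reported activation times passed in directly.
    static boolean needsUpdate(Long storedTime, Long reportedTime) {
        return reportedTime != null && reportedTime > 0
                && (storedTime == null || !storedTime.equals(reportedTime));
    }

    public static void main(String[] args) {
        Long stored = 1_000L;
        // The plugin re-stamped the activation time, so the values differ.
        System.out.println(needsUpdate(stored, 2_000L)); // true
        // Only an unchanged timestamp avoids the update.
        System.out.println(needsUpdate(stored, 1_000L)); // false
    }
}
```

Because the plugin reports a fresh System.currentTimeMillis() value after each 
load, the reported time practically never equals the stored one, so the check 
requests an update every time.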
doCreateOrUpdateXXPluginInfo() is wrapped in a Runnable and submitted to 
RangerTransactionService (RangerTransactionSynchronizationAdapter in Ranger 
2.3.0, where the risk might still exist). See the code that submits 
doCreateOrUpdateXXPluginInfo(): 

 
{code:java}
// code placeholder
commitWork = new Runnable() {
   @Override
   public void run() {
      doCreateOrUpdateXXPluginInfo(pluginInfo, entityType, 
isTagVersionResetNeeded, clusterName);
   }
}; 
...
activityLogger.commitAfterTransactionComplete(commitWork);{code}
RangerTransactionService uses a thread pool, a ScheduledExecutorService, whose 
work queue is unbounded, so Runnables that cannot run immediately accumulate 
there.
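
The queue growth can be reproduced outside Ranger. The sketch below assumes a 
ScheduledThreadPoolExecutor (the JDK's standard ScheduledExecutorService 
implementation) with a single core thread; the class UnboundedQueueDemo and 
the method backlogAfter() are hypothetical, and the blocked first task stands 
in for a slow DB write.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ScheduledThreadPoolExecutor;

public class UnboundedQueueDemo {
    // Submits `tasks` jobs to a one-thread pool whose worker is stalled
    // and returns how many of them are left waiting in the work queue.
    static int backlogAfter(int tasks) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);
        try {
            // The first task occupies the only worker thread.
            pool.execute(() -> {
                started.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            started.await();
            // Every further task is accepted and queued; nothing is rejected.
            for (int i = 1; i < tasks; i++) {
                pool.execute(() -> { });
            }
            return pool.getQueue().size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(backlogAfter(1000)); // 999
    }
}
```

The pool never pushes back on the caller, so the backlog is limited only by 
the heap.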

In our case there are 1000+ Hive and HBase instances, and Ranger Admin seems 
to be under tremendous pressure because every instance periodically calls the 
policy-download API and triggers an update of the table "x_plugin_info". Since 
the thread pool seems to have too few core threads and the DB is also likely 
under pressure, the work queue keeps growing, consumes the JVM heap, and 
finally causes an OOM.

I think adding more core threads would help, but as the system grows this 
part of the code will add a lot of overhead. Is there any solution?
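
One direction that might bound the backlog, sketched here only as an idea and 
not as existing Ranger code: keep at most one pending update per plugin 
instance and let newer values overwrite older ones. The class 
CoalescingPluginInfoUpdater, the pluginKey string, and the writer callback are 
all hypothetical stand-ins for the real DB update.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.BiConsumer;

public class CoalescingPluginInfoUpdater {
    // Latest not-yet-written activation time per plugin instance.
    private final ConcurrentHashMap<String, Long> pending = new ConcurrentHashMap<>();
    private final Executor executor;
    // Hypothetical stand-in for the actual x_plugin_info write.
    private final BiConsumer<String, Long> writer;

    public CoalescingPluginInfoUpdater(Executor executor, BiConsumer<String, Long> writer) {
        this.executor = executor;
        this.writer = writer;
    }

    public void submit(String pluginKey, long activationTime) {
        // A task is queued only when no update is pending for this key;
        // otherwise the pending value is simply overwritten.
        if (pending.put(pluginKey, activationTime) == null) {
            executor.execute(() -> {
                Long latest = pending.remove(pluginKey);
                if (latest != null) {
                    writer.accept(pluginKey, latest);
                }
            });
        }
    }

    public int pendingCount() {
        return pending.size();
    }
}
```

With this shape the number of queued tasks cannot exceed the number of plugin 
instances, no matter how slowly the DB drains them. Whether dropping the 
intermediate activation times is acceptable for Ranger is an open question.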

 

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
