[ https://issues.apache.org/jira/browse/SENTRY-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146711#comment-16146711 ]

Hadoop QA commented on SENTRY-1907:
-----------------------------------

Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12884398/SENTRY-1907.01.patch 
against master.

{color:green}Overall:{color} +1 all checks pass

{color:green}SUCCESS:{color} all tests passed

Console output: 
https://builds.apache.org/job/PreCommit-SENTRY-Build/3193/console

This message is automatically generated.

> Potential memory optimization when handling big full snapshots.
> ---------------------------------------------------------------
>
>                 Key: SENTRY-1907
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1907
>             Project: Sentry
>          Issue Type: Improvement
>          Components: Sentry
>    Affects Versions: 2.0.0
>            Reporter: Alexander Kolbasov
>            Assignee: Alexander Kolbasov
>             Fix For: 2.0.0
>
>         Attachments: SENTRY-1907.01.patch
>
>
> PathImageRetriever.retrieveFullImage() has the following code:
> {code}
> for (Map.Entry<String, Set<String>> pathEnt : pathImage.entrySet()) {
>   TPathChanges pathChange = pathsUpdate.newPathChange(pathEnt.getKey());
>   for (String path : pathEnt.getValue()) {
>     pathChange.addToAddPaths(Lists.newArrayList(Splitter.on("/").split(path))); // here
>   }
> }
> {code}
> We convert each path into a list of its component strings, so /a/b/c becomes {a, b, c}. These components contain tons of duplicates, so after splitting we should intern each component before adding it.
> This was observed by code inspection and confirmed by jxray analysis (thanks [[email protected]]), which shows that 61% of memory is used by duplicate strings, with the following reference chains and stack trace:
> {code}
> 4. REFERENCE CHAINS WITH HIGH RETAINED MEMORY (MAY SIGNAL MEMORY LEAK)
>  ---- Object tree for GC root(s) Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate) ----
>   4,159,037K (33.4%) (1 of org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
>      <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
>   4,135,376K (33.3%) (4897951 of j.u.ArrayList)
>      <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathChanges.addPaths <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathsUpdate.pathChanges <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
>   3,652,177K (29.4%) (52086231 objects)
>      <-- {j.u.ArrayList} <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathChanges.addPaths <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathsUpdate.pathChanges <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
>   GC root stack trace:
>     org.apache.sentry.hdfs.service.thrift.TPathsUpdate$TPathsUpdateStandardScheme.write(TPathsUpdate.java:754)
>     org.apache.sentry.hdfs.service.thrift.TPathsUpdate$TPathsUpdateStandardScheme.write(TPathsUpdate.java:671)
>     org.apache.sentry.hdfs.service.thrift.TPathsUpdate.write(TPathsUpdate.java:584)
>     org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse$TAuthzUpdateResponseStandardScheme.write(TAuthzUpdateResponse.java:505)
>     org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse$TAuthzUpdateResponseStandardScheme.write(TAuthzUpdateResponse.java:435)
>     org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse.write(TAuthzUpdateResponse.java:377)
>     org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result$get_authz_updates_resultStandardScheme.write(SentryHDFSService.java:3608)
>     org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result$get_authz_updates_resultStandardScheme.write(SentryHDFSService.java:3572)
>     org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result.write(SentryHDFSService.java:3523)
>     org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
>     org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     org.apache.sentry.hdfs.SentryHDFSServiceProcessorFactory$ProcessorWrapper.process(SentryHDFSServiceProcessorFactory.java:47)
>     org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
>     org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>     java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     java.lang.Thread.run(Thread.java:745)
> {code}
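The interning optimization described above can be sketched as follows. This is a minimal illustration, not the patch itself: it uses the JDK's String.split and String.intern rather than Guava's Splitter, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class InternDemo {
    // Split an HDFS path into its components and intern each one, so that
    // components repeated across millions of paths ("user", "hive", ...)
    // all point at a single canonical String instance instead of each
    // holding its own copy.
    static List<String> splitAndIntern(String path) {
        List<String> parts = new ArrayList<>();
        for (String component : path.split("/")) {
            parts.add(component.intern());
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> a = splitAndIntern("/user/hive/warehouse/t1");
        List<String> b = splitAndIntern("/user/hive/warehouse/t2");
        // Interned components are reference-identical across paths,
        // so only one "user" String is retained on the heap.
        System.out.println(a.get(1) == b.get(1)); // prints "true"
    }
}
```

Without intern(), each call to split() materializes fresh String objects for every component, which is exactly the duplication the jxray report attributes 61% of the heap to.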



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
