[
https://issues.apache.org/jira/browse/SENTRY-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16146711#comment-16146711
]
Hadoop QA commented on SENTRY-1907:
-----------------------------------
Here are the results of testing the latest attachment
https://issues.apache.org/jira/secure/attachment/12884398/SENTRY-1907.01.patch
against master.
{color:green}Overall:{color} +1 all checks pass
{color:green}SUCCESS:{color} all tests passed
Console output:
https://builds.apache.org/job/PreCommit-SENTRY-Build/3193/console
This message is automatically generated.
> Potential memory optimization when handling big full snapshots.
> ---------------------------------------------------------------
>
> Key: SENTRY-1907
> URL: https://issues.apache.org/jira/browse/SENTRY-1907
> Project: Sentry
> Issue Type: Improvement
> Components: Sentry
> Affects Versions: 2.0.0
> Reporter: Alexander Kolbasov
> Assignee: Alexander Kolbasov
> Fix For: 2.0.0
>
> Attachments: SENTRY-1907.01.patch
>
>
> PathImageRetriever.retrieveFullImage() has the following code:
> {code}
> for (Map.Entry<String, Set<String>> pathEnt : pathImage.entrySet()) {
>   TPathChanges pathChange = pathsUpdate.newPathChange(pathEnt.getKey());
>   for (String path : pathEnt.getValue()) {
>     pathChange.addToAddPaths(Lists.newArrayList(Splitter.on("/").split(path)));
>     // here
>   }
> }
> {code}
> We convert each path into a list of its string components, so /a/b/c
> becomes {a, b, c}. There are tons of duplicate components, so after
> splitting we should intern each component before adding it.
> This was observed by code inspection and confirmed by jxray analysis (thanks
> [[email protected]]), which shows that 61% of memory is used by duplicate
> strings and produces the following stack trace:
> {code}
> 4. REFERENCE CHAINS WITH HIGH RETAINED MEMORY (MAY SIGNAL MEMORY LEAK)
> ---- Object tree for GC root(s) Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate) ----
> 4,159,037K (33.4%) (1 of org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
>   <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
> 4,135,376K (33.3%) (4897951 of j.u.ArrayList)
>   <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathChanges.addPaths <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathsUpdate.pathChanges <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
> 3,652,177K (29.4%) (52086231 objects)
>   <-- {j.u.ArrayList} <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathChanges.addPaths <-- {j.u.ArrayList} <-- org.apache.sentry.hdfs.service.thrift.TPathsUpdate.pathChanges <-- Java Local@3c8e00c80 (org.apache.sentry.hdfs.service.thrift.TPathsUpdate)
> GC root stack trace:
> org.apache.sentry.hdfs.service.thrift.TPathsUpdate$TPathsUpdateStandardScheme.write(TPathsUpdate.java:754)
> org.apache.sentry.hdfs.service.thrift.TPathsUpdate$TPathsUpdateStandardScheme.write(TPathsUpdate.java:671)
> org.apache.sentry.hdfs.service.thrift.TPathsUpdate.write(TPathsUpdate.java:584)
> org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse$TAuthzUpdateResponseStandardScheme.write(TAuthzUpdateResponse.java:505)
> org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse$TAuthzUpdateResponseStandardScheme.write(TAuthzUpdateResponse.java:435)
> org.apache.sentry.hdfs.service.thrift.TAuthzUpdateResponse.write(TAuthzUpdateResponse.java:377)
> org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result$get_authz_updates_resultStandardScheme.write(SentryHDFSService.java:3608)
> org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result$get_authz_updates_resultStandardScheme.write(SentryHDFSService.java:3572)
> org.apache.sentry.hdfs.service.thrift.SentryHDFSService$get_authz_updates_result.write(SentryHDFSService.java:3523)
> org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53)
> org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> org.apache.sentry.hdfs.SentryHDFSServiceProcessorFactory$ProcessorWrapper.process(SentryHDFSServiceProcessorFactory.java:47)
> org.apache.thrift.TMultiplexedProcessor.process(TMultiplexedProcessor.java:123)
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> java.lang.Thread.run(Thread.java:745)
> {code}
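For illustration, here is a minimal standalone sketch of the interning the description proposes. The class name and helper method are hypothetical, and plain String.split stands in for Guava's Splitter.on("/") used in PathImageRetriever; only the intern() call reflects the proposed change.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the actual patch): after splitting a path into
// components, intern each one so repeated directory names like "warehouse"
// or "db1" share a single String instance instead of duplicating memory.
public class PathInternSketch {
    static List<String> splitAndIntern(String path) {
        List<String> components = new ArrayList<>();
        for (String component : path.split("/")) {
            components.add(component.intern());
        }
        return components;
    }

    public static void main(String[] args) {
        List<String> a = splitAndIntern("warehouse/db1/table1");
        List<String> b = splitAndIntern("warehouse/db1/table2");
        // Shared components are now the same object, not merely equal strings.
        System.out.println(a.get(0) == b.get(0)); // true
        System.out.println(a.get(1) == b.get(1)); // true
    }
}
```

Since every interned copy of an equal string resolves to the same canonical instance, the millions of duplicate component lists seen in the jxray chains above would collapse to references onto a small shared set.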
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)