[ 
https://issues.apache.org/jira/browse/HDDS-6102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458541#comment-17458541
 ] 

Attila Doroszlai commented on HDDS-6102:
----------------------------------------

Thanks [~djcelis] for the detailed report.  I think this is the same as 
HDDS-5087, which has been fixed in 1.2.0.  Is it possible for you to upgrade to 
that version?
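
For reference, the fix pattern in this family of issues (HDFS-14037, and presumably HDDS-5087) is to destroy the SSLFactory when the owning client is closed, which interrupts its truststore reloader thread. A minimal JDK-only sketch of that close-side pattern, with all class and thread names hypothetical stand-ins for the Hadoop internals:

```java
public class ClosableReloader implements AutoCloseable {
    private final Thread reloader;

    public ClosableReloader() {
        // Stand-in for the reloader thread started by SSLFactory.init():
        // it sleeps in a loop and exits only when interrupted.
        reloader = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10_000);
                } catch (InterruptedException e) {
                    return; // exit on close()
                }
            }
        }, "Truststore reloader thread");
        reloader.setDaemon(true);
        reloader.start();
    }

    public boolean isRunning() {
        return reloader.isAlive();
    }

    @Override
    public void close() throws InterruptedException {
        reloader.interrupt();  // analogous to SSLFactory.destroy()
        reloader.join(5_000);  // wait for the thread to exit
    }
}
```

Without the close() call, every instance leaves its reloader thread sleeping forever, which is exactly the growth pattern reported below.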

> Integrating Ozone with Hive produces a thread leak in HS2 server
> ----------------------------------------------------------------
>
>                 Key: HDDS-6102
>                 URL: https://issues.apache.org/jira/browse/HDDS-6102
>             Project: Apache Ozone
>          Issue Type: Bug
>          Components: OFS
>    Affects Versions: 1.1.0
>         Environment: ozone: 1.1.0
> hadoop: 3.1.1
> hive: 3.1.0
> tez: 0.10.0 (this version is needed because of 
> [TEZ-4032|https://issues.apache.org/jira/browse/TEZ-4032])
> Both the Hadoop cluster and Ozone are secured using Kerberos.
>            Reporter: Diego Jaramillo
>            Priority: Major
>
> Integrating Ozone with Hive produces a thread leak in HS2. In this sample, 
> 12 open connections to Hive produced 149 threads, and the count kept 
> increasing until HS2 had to be restarted.
> SETTINGS:
> HDFS integration using the following settings:
> - viewfs-mount-table:
>   fs.viewfs.mounttable.clusters.link./cluster1=hdfs://cluster1
>   fs.viewfs.mounttable.clusters.link./ozfs1=ofs://ozfs1
> - core-site.xml:
>   fs.ofs.impl=org.apache.hadoop.fs.ozone.RootedOzoneFileSystem
>   fs.AbstractFileSystem.o3fs.impl=org.apache.hadoop.fs.ozone.OzFs
>  
> - hdfs-site.xml:
>   ozone.om.service.ids=ozfs1
>   ozone.om.nodes.ozfs1=om1,om2
>   ozone.om.address.ozfs1.om1=ozone1.domain.com:9862
>   ozone.om.address.ozfs1.om2=ozone2.domain.com:9862
>   dfs.nameservices=cluster1,ozfs1
>   ozone.om.kerberos.keytab.file=/etc/security/keytabs/om.service.keytab
>   ozone.om.kerberos.principal=om/[email protected]
>  
> Hive integration using the following settings:
> - hive-site.xml:
>   tez.job.fs-servers=hdfs://cluster1,ofs://ozfs1
>   mapreduce.job.hdfs-servers=hdfs://cluster1,ofs://ozfs1
>  
> From Hive's stack trace we see many threads like these:
> {{Thread 4958 (Truststore reloader thread):}}
> {{  State: TIMED_WAITING}}
> {{  Blocked count: 0}}
> {{  Waited count: 221}}
> {{  Stack:}}
> {{    java.lang.Thread.$$YJP$$sleep(Native Method)}}
> {{    java.lang.Thread.sleep(Thread.java)}}
> {{    
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
> {{    java.lang.Thread.run(Thread.java:748)}}
> {{Thread 4948 (Truststore reloader thread):}}
> {{  State: TIMED_WAITING}}
> {{  Blocked count: 79}}
> {{  Waited count: 221}}
> {{  Stack:}}
> {{    java.lang.Thread.$$YJP$$sleep(Native Method)}}
> {{    java.lang.Thread.sleep(Thread.java)}}
> {{    
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
> {{    java.lang.Thread.run(Thread.java:748)}}
> {{Thread 4777 (Truststore reloader thread):}}
> {{  State: TIMED_WAITING}}
> {{  Blocked count: 0}}
> {{  Waited count: 252}}
> {{  Stack:}}
> {{    java.lang.Thread.$$YJP$$sleep(Native Method)}}
> {{    java.lang.Thread.sleep(Thread.java)}}
> {{    
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.run(ReloadingX509TrustManager.java:195)}}
> {{    java.lang.Thread.run(Thread.java:748)}}
>  
> Using YourKit we identified the following allocation traces:
> {{java.lang.Thread.<init>(Runnable, String) Thread.java
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() 
> ReloadingX509TrustManager.java:95
> org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode)
>  FileBasedKeyStoresFactory.java:223
> org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector.getSSLFactory(Configuration)
>  TimelineConnector.java:181
> org.apache.hadoop.yarn.client.api.impl.TimelineConnector.serviceInit(Configuration)
>  TimelineConnector.java:108
> org.apache.hadoop.service.AbstractService.init(Configuration) 
> AbstractService.java:164
> org.apache.hadoop.service.CompositeService.serviceInit(Configuration) 
> CompositeService.java:108
> org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(Configuration)
>  TimelineClientImpl.java:130
> org.apache.hadoop.service.AbstractService.init(Configuration) 
> AbstractService.java:164
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.getTimelineDelegationToken()
>  YarnClientImpl.java:405
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.addTimelineDelegationToken(ContainerLaunchContext)
>  YarnClientImpl.java:381
> org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.submitApplication(ApplicationSubmissionContext)
>  YarnClientImpl.java:300
> org.apache.tez.client.TezYarnClient.submitApplication(ApplicationSubmissionContext)
>  TezYarnClient.java:77
> org.apache.tez.client.TezClient.start() TezClient.java:402
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient,
>  HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], 
> boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
> TezSessionState.java:451
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[],
>  boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
> TezSessionPoolSession.java:124
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) 
> TezSessionState.java:373
> org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState,
>  String[]) TezTask.java:373
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) 
> TezTask.java:200
> org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
> org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, 
> int, DriverContext) Driver.java:2712
> org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
> org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
> org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
> org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
> org.apache.hive.service.cli.operation.SQLOperation.runQuery() 
> SQLOperation.java:226
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) 
> SQLOperation.java:87
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() 
> SQLOperation.java:324
> javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) 
> Subject.java
> org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction)
>  UserGroupInformation.java:1729
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() 
> SQLOperation.java:342
> java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
> java.util.concurrent.FutureTask.run() FutureTask.java:266
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
> ThreadPoolExecutor.java:1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run() 
> ThreadPoolExecutor.java:624
> java.lang.Thread.run() Thread.java:748
> ----
> java.lang.Thread.<init>(Runnable, String) Thread.java
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.init() 
> ReloadingX509TrustManager.java:95
> org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(SSLFactory$Mode)
>  FileBasedKeyStoresFactory.java:223
> org.apache.hadoop.security.ssl.SSLFactory.init() SSLFactory.java:180
> org.apache.hadoop.crypto.key.kms.KMSClientProvider.<init>(URI, Configuration) 
> KMSClientProvider.java:390
> org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProviders(Configuration,
>  URL, int, String) KMSClientProvider.java:318
> org.apache.hadoop.crypto.key.kms.KMSClientProvider$Factory.createProvider(URI,
>  Configuration) KMSClientProvider.java:303
> org.apache.hadoop.crypto.key.KeyProviderFactory.get(URI, Configuration) 
> KeyProviderFactory.java:96
> org.apache.hadoop.util.KMSUtil.createKeyProviderFromUri(Configuration, URI) 
> KMSUtil.java:83
> org.apache.hadoop.ozone.client.rpc.OzoneKMSUtil.getKeyProvider(ConfigurationSource,
>  URI) OzoneKMSUtil.java:138
> org.apache.hadoop.ozone.client.rpc.RpcClient.getKeyProvider() 
> RpcClient.java:1310
> org.apache.hadoop.ozone.client.ObjectStore.getKeyProvider() 
> ObjectStore.java:222
> org.apache.hadoop.fs.ozone.BasicRootedOzoneClientAdapterImpl.getKeyProvider() 
> BasicRootedOzoneClientAdapterImpl.java:785
> org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getKeyProvider() 
> RootedOzoneFileSystem.java:54
> org.apache.hadoop.fs.ozone.RootedOzoneFileSystem.getAdditionalTokenIssuers() 
> RootedOzoneFileSystem.java:67
> org.apache.hadoop.security.token.DelegationTokenIssuer.collectDelegationTokens(DelegationTokenIssuer,
>  String, Credentials, List) DelegationTokenIssuer.java:104
> org.apache.hadoop.security.token.DelegationTokenIssuer.addDelegationTokens(String,
>  Credentials) DelegationTokenIssuer.java:76
> org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(FileSystem,
>  Credentials, Configuration) TokenCache.java:140
> org.apache.tez.common.security.TokenCache.obtainTokensForFileSystemsInternal(Credentials,
>  Path[], Configuration) TokenCache.java:101
> org.apache.tez.common.security.TokenCache.obtainTokensForFileSystems(Credentials,
>  Path[], Configuration) TokenCache.java:77
> org.apache.tez.client.TezClientUtils.populateTokenCache(TezConfiguration, 
> Credentials) TezClientUtils.java:746
> org.apache.tez.client.TezClientUtils.prepareAmLaunchCredentials(AMConfiguration,
>  Credentials, TezConfiguration, Path) TezClientUtils.java:722
> org.apache.tez.client.TezClientUtils.createApplicationSubmissionContext(ApplicationId,
>  DAG, String, AMConfiguration, Map, Credentials, boolean, TezApiVersionInfo, 
> ServicePluginsDescriptor, JavaOptsChecker) TezClientUtils.java:487
> org.apache.tez.client.TezClient.setupApplicationContext() TezClient.java:501
> org.apache.tez.client.TezClient.start() TezClient.java:401
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.startSessionAndContainers(TezClient,
>  HiveConf, Map, TezConfiguration, boolean) TezSessionState.java:516
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(String[], 
> boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
> TezSessionState.java:451
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(String[],
>  boolean, SessionState$LogHelper, TezSessionState$HiveResources) 
> TezSessionPoolSession.java:124
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(String[]) 
> TezSessionState.java:373
> org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezSessionState,
>  String[]) TezTask.java:373
> org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(DriverContext) 
> TezTask.java:200
> org.apache.hadoop.hive.ql.exec.Task.executeTask(HiveHistory) Task.java:212
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential() TaskRunner.java:103
> org.apache.hadoop.hive.ql.Driver.launchTask(Task, String, boolean, String, 
> int, DriverContext) Driver.java:2712
> org.apache.hadoop.hive.ql.Driver.execute() Driver.java:2383
> org.apache.hadoop.hive.ql.Driver.runInternal(String, boolean) Driver.java:2055
> org.apache.hadoop.hive.ql.Driver.run(String, boolean) Driver.java:1753
> org.apache.hadoop.hive.ql.Driver.run() Driver.java:1747
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run() ReExecDriver.java:157
> org.apache.hive.service.cli.operation.SQLOperation.runQuery() 
> SQLOperation.java:226
> org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation) 
> SQLOperation.java:87
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run() 
> SQLOperation.java:324
> javax.security.auth.Subject.doAs(Subject, PrivilegedExceptionAction) 
> Subject.java
> org.apache.hadoop.security.UserGroupInformation.doAs(PrivilegedExceptionAction)
>  UserGroupInformation.java:1729
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run() 
> SQLOperation.java:342
> java.util.concurrent.Executors$RunnableAdapter.call() Executors.java:511
> java.util.concurrent.FutureTask.run() FutureTask.java:266
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 
> ThreadPoolExecutor.java:1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run() 
> ThreadPoolExecutor.java:624
> java.lang.Thread.run() Thread.java:748}}
>  
> This looks similar to 
> [HDFS-14037|https://issues.apache.org/jira/browse/HDFS-14037]
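
The leak reported above can be reduced to a JDK-only sketch: each query creates an SSLFactory-like object whose reloader thread is never stopped, so the thread count grows with every connection. The class names and the per-query structure here are illustrative, not the actual Hadoop code:

```java
import java.util.ArrayList;
import java.util.List;

public class ReloaderLeakDemo {
    // Count live threads whose name starts with the given prefix,
    // mirroring the "Truststore reloader thread" entries in the HS2 dump.
    static long countByPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }

    // Stand-in for ReloadingX509TrustManager's reloader: a daemon thread
    // sleeping in a loop, stopped only by an interrupt that never comes.
    static Thread startReloader(int id) {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    Thread.sleep(10_000);
                } catch (InterruptedException e) {
                    return;
                }
            }
        }, "Truststore reloader thread #" + id);
        t.setDaemon(true);
        t.start();
        return t;
    }

    // Each "query" builds a fresh SSLFactory-like object but never
    // destroys it, so its reloader thread outlives the query.
    static List<Thread> runQueries(int n) {
        List<Thread> leaked = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            leaked.add(startReloader(i));
        }
        return leaked;
    }

    public static void main(String[] args) {
        runQueries(12);
        System.out.println("reloader threads: "
                + countByPrefix("Truststore reloader thread"));
    }
}
```

The countByPrefix probe is also a quick way to confirm the leak on a live HS2: the count should stay flat across queries once the fix is in place.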



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
