[
https://issues.apache.org/jira/browse/HDFS-16967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707448#comment-17707448
]
ASF GitHub Bot commented on HDFS-16967:
---------------------------------------
virajjasani commented on code in PR #5523:
URL: https://github.com/apache/hadoop/pull/5523#discussion_r1155002213
##########
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/store/driver/impl/StateStoreFileBaseImpl.java:
##########
@@ -168,9 +182,30 @@ public boolean initDriver() {
return false;
}
setInitialized(true);
+ int threads = getConcurrentFilesAccessNumThreads();
+ if (threads > 0) {
+ this.concurrentStoreAccessPool =
+ new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
+ new LinkedBlockingQueue<>(),
+ new ThreadFactoryBuilder()
+ .setNameFormat("state-store-file-based-concurrent-%d")
+ .setDaemon(true).build());
+ LOG.info("File based state store will be accessed concurrently with {} max threads", threads);
+ } else {
+ LOG.info("File based state store will be accessed serially");
+ }
return true;
}
+ @Override
+ public void close() throws Exception {
+ if (this.concurrentStoreAccessPool != null) {
+ this.concurrentStoreAccessPool.shutdown();
+ boolean isTerminated = this.concurrentStoreAccessPool.awaitTermination(5, TimeUnit.SECONDS);
Review Comment:
Done, thanks
> RBF: File based state stores should allow concurrent access to the records
> --------------------------------------------------------------------------
>
> Key: HDFS-16967
> URL: https://issues.apache.org/jira/browse/HDFS-16967
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
>
> File based state store implementations (StateStoreFileImpl and
> StateStoreFileSystemImpl) should allow state store records to be updated and
> read concurrently rather than serially. Concurrent access to the record
> files on the HDFS based store appears to improve state store cache loading
> performance by more than 10x.
> For instance, to maintain data integrity, the cache is reloaded whenever any
> mount table record is updated. This reload appears to gain a significant
> performance improvement from concurrent access to the mount table records.
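
For readers following the thread, the idea in the description and the diff above can be sketched roughly as below. This is a minimal, JDK-only illustration and not the code from the PR: the class name ConcurrentRecordReadSketch, the readRecord helper, and the "record-reader-%d" thread names are all hypothetical, and the sketch creates and shuts down a pool per call, whereas the patch keeps one pool for the lifetime of the driver (created in initDriver(), released in close()). A plain ThreadFactory lambda stands in for ThreadFactoryBuilder so that no extra library is needed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrentRecordReadSketch {

  /** Hypothetical stand-in for reading one record file from the store. */
  static String readRecord(String fileName) {
    return "record:" + fileName;
  }

  /** Fixed-size pool of named daemon threads, built with the JDK only. */
  static ExecutorService newPool(int threads) {
    AtomicInteger counter = new AtomicInteger();
    ThreadFactory factory = r -> {
      Thread t = new Thread(r, "record-reader-" + counter.getAndIncrement());
      t.setDaemon(true);
      return t;
    };
    return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
        new LinkedBlockingQueue<>(), factory);
  }

  /** Reads all files: concurrently if threads > 0, serially otherwise. */
  static List<String> readAll(List<String> fileNames, int threads) {
    List<String> results = new ArrayList<>();
    if (threads <= 0) {
      for (String name : fileNames) {
        results.add(readRecord(name));
      }
      return results;
    }
    ExecutorService pool = newPool(threads);
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (String name : fileNames) {
        tasks.add(() -> readRecord(name));
      }
      // invokeAll waits for every task; futures come back in task order.
      for (Future<String> future : pool.invokeAll(tasks)) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while reading records", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Record read failed", e.getCause());
    } finally {
      shutdown(pool);
    }
  }

  /** Graceful shutdown with a bounded wait, then a forced stop. */
  static boolean shutdown(ExecutorService pool) {
    pool.shutdown();
    try {
      if (pool.awaitTermination(5, TimeUnit.SECONDS)) {
        return true;
      }
      pool.shutdownNow();
      return false;
    } catch (InterruptedException e) {
      pool.shutdownNow();
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    // Prints: [record:a, record:b, record:c]
    System.out.println(readAll(Arrays.asList("a", "b", "c"), 2));
  }
}
```

Passing 0 (or a negative value) for threads takes the serial path, mirroring the "threads > 0" check in the diff. The 5 second wait in shutdown() matches the awaitTermination call under review; falling back to shutdownNow() when the wait expires is an assumption of this sketch, not something shown in the quoted hunk.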