[GitHub] [hadoop] sunchao commented on a change in pull request #2080: HDFS-15417. RBF: Get the datanode report from cache for federation WebHDFS operations

2020-07-05 Thread GitBox


sunchao commented on a change in pull request #2080:
URL: https://github.com/apache/hadoop/pull/2080#discussion_r449907165



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/router/TestRouterRpc.java
##
@@ -1777,6 +1777,43 @@ public void testgetGroupsForUser() throws IOException {
     assertArrayEquals(group, result);
   }
 
+  @Test
+  public void testGetCachedDatanodeReport() throws Exception {
+    final DatanodeInfo[] datanodeReport =
+        routerProtocol.getDatanodeReport(DatanodeReportType.ALL);
+
+    // We should have 12 nodes in total
+    assertEquals(12, datanodeReport.length);
+
+    // We should be caching this information
+    DatanodeInfo[] datanodeReport1 =
+        routerProtocol.getDatanodeReport(DatanodeReportType.ALL);
+    assertArrayEquals(datanodeReport1, datanodeReport);
+
+    // Add one datanode
+    getCluster().getCluster().startDataNodes(getCluster().getCluster().getConfiguration(0),

Review comment:
   I think we need to clear state after the test case so that it won't 
affect others like `testNamenodeMetrics`.
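   
   For example, a minimal sketch of the cleanup I have in mind (the `@After` hook, the hard-coded index 12, and the use of `stopDataNode`/`waitActive` here are just an illustration, not taken from this patch):
   ```java
   @After
   public void cleanupExtraDatanode() throws Exception {
     // Hypothetical cleanup: stop the datanode added by
     // testGetCachedDatanodeReport so later tests still see 12 nodes.
     MiniDFSCluster dfsCluster = getCluster().getCluster();
     if (dfsCluster.getDataNodes().size() > 12) {
       dfsCluster.stopDataNode(12);
       dfsCluster.waitActive();
     }
   }
   ```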
   
   Also long line.








[GitHub] [hadoop] sunchao commented on a change in pull request #2080: HDFS-15417. RBF: Get the datanode report from cache for federation WebHDFS operations

2020-06-24 Thread GitBox


sunchao commented on a change in pull request #2080:
URL: https://github.com/apache/hadoop/pull/2080#discussion_r444673117



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
##
@@ -361,6 +380,23 @@ public RouterRpcServer(Configuration configuration, Router router,
     this.nnProto = new RouterNamenodeProtocol(this);
     this.clientProto = new RouterClientProtocol(conf, this);
     this.routerProto = new RouterUserProtocol(this);
+
+    long dnCacheExpire = conf.getTimeDuration(
+        DN_REPORT_CACHE_EXPIRE,
+        DN_REPORT_CACHE_EXPIRE_MS_DEFAULT, TimeUnit.MILLISECONDS);
+    this.dnCache = CacheBuilder.newBuilder()
+        .build(new DatanodeReportCacheLoader());
+
+    // Actively refresh the dn cache in a configured interval
+    Executors

Review comment:
   I see. Makes sense.








[GitHub] [hadoop] sunchao commented on a change in pull request #2080: HDFS-15417. RBF: Get the datanode report from cache for federation WebHDFS operations

2020-06-24 Thread GitBox


sunchao commented on a change in pull request #2080:
URL: https://github.com/apache/hadoop/pull/2080#discussion_r444664594



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
##
@@ -868,6 +904,50 @@ public HdfsLocatedFileStatus getLocatedFileInfo(String src,
     return clientProto.getDatanodeReport(type);
   }
 
+  /**
+   * Get the datanode report from cache.
+   *
+   * @param type Type of the datanode.
+   * @return List of datanodes.
+   * @throws IOException If it cannot get the report.
+   */
+  public DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType type)

Review comment:
   nit: this can be package-private?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
##
@@ -1748,4 +1828,58 @@ public void refreshSuperUserGroupsConfiguration() throws IOException {
   public String[] getGroupsForUser(String user) throws IOException {
     return routerProto.getGroupsForUser(user);
   }
-}
\ No newline at end of file
+
+  /**
+   * Deals with loading datanode report into the cache and refresh.
+   */
+  private class DatanodeReportCacheLoader
+      extends CacheLoader<DatanodeReportType, DatanodeInfo[]> {
+
+    private ListeningExecutorService executorService;
+
+    DatanodeReportCacheLoader() {
+      ThreadFactory threadFactory = new ThreadFactoryBuilder()

Review comment:
   hmm can we just use:
   ```java
   executorService = MoreExecutors.listeningDecorator(
       Executors.newSingleThreadExecutor());
   ```
   ?
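   
   If the named daemon thread is still wanted, a sketch that keeps the factory but drops the hand-rolled pool (just an illustration, not the patch's code):
   ```java
   // Single, named daemon thread for the background cache refreshes.
   executorService = MoreExecutors.listeningDecorator(
       Executors.newSingleThreadExecutor(new ThreadFactoryBuilder()
           .setNameFormat("DatanodeReport-Cache-Reload")
           .setDaemon(true)
           .build()));
   ```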

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
##
@@ -361,6 +380,23 @@ public RouterRpcServer(Configuration configuration, Router router,
     this.nnProto = new RouterNamenodeProtocol(this);
     this.clientProto = new RouterClientProtocol(conf, this);
     this.routerProto = new RouterUserProtocol(this);
+
+    long dnCacheExpire = conf.getTimeDuration(
+        DN_REPORT_CACHE_EXPIRE,
+        DN_REPORT_CACHE_EXPIRE_MS_DEFAULT, TimeUnit.MILLISECONDS);
+    this.dnCache = CacheBuilder.newBuilder()
+        .build(new DatanodeReportCacheLoader());
+
+    // Actively refresh the dn cache in a configured interval
+    Executors

Review comment:
   Hmm, have you considered using
   ```java
   this.dnCache = CacheBuilder.newBuilder()
       .refreshAfterWrite(dnCacheExpire, TimeUnit.MILLISECONDS)
       .build(new DatanodeReportCacheLoader());
   ```
   
   This will also automatically refresh the caches. Also it only refreshes a 
key iff 1) it becomes stale, and 2) there is a request on it. So this will save 
some calls for those infrequent DN report types.
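   
   A sketch of how the constructor could then look, assuming the loader's asynchronous `reload` override stays as in the patch (illustration only):
   ```java
   long dnCacheExpire = conf.getTimeDuration(
       DN_REPORT_CACHE_EXPIRE,
       DN_REPORT_CACHE_EXPIRE_MS_DEFAULT, TimeUnit.MILLISECONDS);
   // refreshAfterWrite() marks an entry stale after the interval; the next
   // read of that key still returns the old value and kicks off reload()
   // in the background, so keys that are never requested are never refreshed.
   this.dnCache = CacheBuilder.newBuilder()
       .refreshAfterWrite(dnCacheExpire, TimeUnit.MILLISECONDS)
       .build(new DatanodeReportCacheLoader());
   ```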
   

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
##
@@ -1748,4 +1828,58 @@ public void refreshSuperUserGroupsConfiguration() throws IOException {
   public String[] getGroupsForUser(String user) throws IOException {
     return routerProto.getGroupsForUser(user);
   }
-}
\ No newline at end of file
+
+  /**
+   * Deals with loading datanode report into the cache and refresh.
+   */
+  private class DatanodeReportCacheLoader
+      extends CacheLoader<DatanodeReportType, DatanodeInfo[]> {
+
+    private ListeningExecutorService executorService;
+
+    DatanodeReportCacheLoader() {
+      ThreadFactory threadFactory = new ThreadFactoryBuilder()
+          .setNameFormat("DatanodeReport-Cache-Reload")
+          .setDaemon(true)
+          .build();
+
+      // Only use 1 thread to refresh cache.
+      // With coreThreadCount == maxThreadCount we effectively
+      // create a fixed size thread pool. As allowCoreThreadTimeOut
+      // has been set, all threads will die after 60 seconds of non use.
+      ThreadPoolExecutor parentExecutor = new ThreadPoolExecutor(
+          1,
+          1,
+          60,
+          TimeUnit.SECONDS,
+          new LinkedBlockingQueue(),
+          threadFactory);
+      parentExecutor.allowCoreThreadTimeOut(true);
+      executorService = MoreExecutors.listeningDecorator(parentExecutor);
+    }
+
+    @Override
+    public DatanodeInfo[] load(DatanodeReportType type) throws Exception {
+      return getCachedDatanodeReportImpl(type);
+    }
+
+    /**
+     * Override the reload method to provide an asynchronous implementation,
+     * so that the query will not be slowed down by the cache refresh. It
+     * will return the old cache value and schedule a background refresh.
+     */
+    @Override
+    public ListenableFuture<DatanodeInfo[]> reload(
+        final DatanodeReportType type, DatanodeInfo[] oldValue)
+        throws Exception {
+      ListenableFuture<DatanodeInfo[]> listenableFuture =

Review comment:
   nit: the variable `listenableFuture` is redundant - you can just return the result of the `submit` call directly.
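   
   i.e. something along these lines (a sketch):
   ```java
   @Override
   public ListenableFuture<DatanodeInfo[]> reload(
       final DatanodeReportType type, DatanodeInfo[] oldValue)
       throws Exception {
     // Run the refresh on the background thread and hand the future straight
     // back to the cache; readers keep getting oldValue until it completes.
     return executorService.submit(() -> getCachedDatanodeReportImpl(type));
   }
   ```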


[GitHub] [hadoop] sunchao commented on a change in pull request #2080: HDFS-15417. RBF: Get the datanode report from cache for federation WebHDFS operations

2020-06-19 Thread GitBox


sunchao commented on a change in pull request #2080:
URL: https://github.com/apache/hadoop/pull/2080#discussion_r443091398



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterWebHdfsMethods.java
##
@@ -454,19 +454,12 @@ private URI redirectURI(final Router router, final UserGroupInformation ugi,
   private DatanodeInfo chooseDatanode(final Router router,
       final String path, final HttpOpParam.Op op, final long openOffset,
       final String excludeDatanodes) throws IOException {
-    // We need to get the DNs as a privileged user
     final RouterRpcServer rpcServer = getRPCServer(router);
-    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
-    RouterRpcServer.setCurrentUser(loginUser);
-
     DatanodeInfo[] dns = null;
     try {
-      dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
+      dns = rpcServer.getCachedDatanodeReport(DatanodeReportType.LIVE);

Review comment:
   Thanks. I filed https://issues.apache.org/jira/browse/HDFS-15423 to 
track this.








[GitHub] [hadoop] sunchao commented on a change in pull request #2080: HDFS-15417. RBF: Get the datanode report from cache for federation WebHDFS operations

2020-06-19 Thread GitBox


sunchao commented on a change in pull request #2080:
URL: https://github.com/apache/hadoop/pull/2080#discussion_r443079600



##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterWebHdfsMethods.java
##
@@ -454,19 +454,12 @@ private URI redirectURI(final Router router, final UserGroupInformation ugi,
   private DatanodeInfo chooseDatanode(final Router router,
       final String path, final HttpOpParam.Op op, final long openOffset,
       final String excludeDatanodes) throws IOException {
-    // We need to get the DNs as a privileged user
     final RouterRpcServer rpcServer = getRPCServer(router);
-    UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
-    RouterRpcServer.setCurrentUser(loginUser);
-
     DatanodeInfo[] dns = null;
     try {
-      dns = rpcServer.getDatanodeReport(DatanodeReportType.LIVE);
+      dns = rpcServer.getCachedDatanodeReport(DatanodeReportType.LIVE);

Review comment:
   A question from me, not quite related to this JIRA: it seems this is getting DNs for all the clusters, while in the `CREATE` case we should only choose a DN from a particular sub-cluster. Where is that logic implemented?

##
File path: 
hadoop-hdfs-project/hadoop-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
##
@@ -18,6 +18,8 @@
 package org.apache.hadoop.hdfs.server.federation.router;
 
 import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION;
+import static org.apache.hadoop.hdfs.server.federation.metrics.NamenodeBeanMetrics.DN_REPORT_CACHE_EXPIRE;

Review comment:
   This will cause a compilation error. Also, I'm wondering whether it makes sense to move the DN cache logic from `NamenodeBeanMetrics` to here and have the former depend on this. That way we don't have to keep two copies of the cache.
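   
   For illustration, the metrics side could then simply delegate to the RPC server, roughly like this (a sketch; the `router` field and `getRpcServer()` accessor are assumptions about how `NamenodeBeanMetrics` would reach `RouterRpcServer`):
   ```java
   // Hypothetical delegation inside NamenodeBeanMetrics once the cache
   // lives in RouterRpcServer, so only one copy of the cache is kept.
   private DatanodeInfo[] getDatanodeReport(DatanodeReportType type)
       throws IOException {
     return router.getRpcServer().getCachedDatanodeReport(type);
   }
   ```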




