avijayanhwx commented on a change in pull request #988:
URL: https://github.com/apache/hadoop-ozone/pull/988#discussion_r432610066



##########
File path: 
hadoop-ozone/recon-codegen/src/main/java/org/hadoop/ozone/recon/schema/UtilizationSchemaDefinition.java
##########
@@ -83,11 +86,17 @@ private void createClusterGrowthTable(Connection conn) {
   }
 
   private void createFileSizeCountTable(Connection conn) {
-    DSL.using(conn).createTableIfNotExists(FILE_COUNT_BY_SIZE_TABLE_NAME)
+    dslContext.createTableIfNotExists(FILE_COUNT_BY_SIZE_TABLE_NAME)
+        .column("volume", SQLDataType.VARCHAR(64))

Review comment:
       Do we know if 64 is the actual volume & bucket name length limit as 
enforced by OM? If not, we should change this to handle longer names.
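
One way to keep the column width in sync with the OM-side limit is to derive it from a single shared constant instead of hard-coding 64. A minimal sketch, assuming a hypothetical constants class; `MAX_RESOURCE_NAME_LENGTH` and the value 63 (DNS-style naming) are illustrative assumptions, not actual Ozone identifiers, and the real limit should be read from the OM validation code:

```java
// Hypothetical sketch: tie the schema's VARCHAR width to one constant so
// the table definition tracks whatever limit OM actually enforces.
final class SchemaConstants {
  // Assumption: DNS-style resource names capped at 63 chars. Verify
  // against the OM name-validation code before relying on this value.
  static final int MAX_RESOURCE_NAME_LENGTH = 63;

  // Column width leaves one spare character over the enforced limit,
  // so VARCHAR(NAME_COLUMN_WIDTH) replaces the literal VARCHAR(64).
  static final int NAME_COLUMN_WIDTH = MAX_RESOURCE_NAME_LENGTH + 1;

  private SchemaConstants() { }
}
```

The schema definition would then use `SQLDataType.VARCHAR(SchemaConstants.NAME_COLUMN_WIDTH)`, so a future change to the OM limit needs only one edit.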

##########
File path: 
hadoop-ozone/recon/src/test/java/org/apache/hadoop/ozone/recon/tasks/TestFileSizeCountTask.java
##########
@@ -206,10 +216,189 @@ public void testProcess() {
         Arrays.asList(updateEvent, putEvent, deleteEvent));
     fileSizeCountTask.process(omUpdateEventBatch);
 
-    upperBoundCount = fileSizeCountTask.getUpperBoundCount();
-    assertEquals(1, upperBoundCount[0]); // newKey
-    assertEquals(0, upperBoundCount[1]); // deletedKey
-    assertEquals(0, upperBoundCount[4]); // updatedKey old
-    assertEquals(1, upperBoundCount[6]); // updatedKey new
+    assertEquals(4, fileCountBySizeDao.count());
+    recordToFind.value3(1024L);
+    assertEquals(1, fileCountBySizeDao.findById(recordToFind)
+        .getCount().longValue());
+    recordToFind.value3(2048L);
+    assertEquals(0, fileCountBySizeDao.findById(recordToFind)
+        .getCount().longValue());
+    recordToFind.value3(16384L);
+    assertEquals(0, fileCountBySizeDao.findById(recordToFind)
+        .getCount().longValue());
+    recordToFind.value3(65536L);
+    assertEquals(1, fileCountBySizeDao.findById(recordToFind)
+        .getCount().longValue());
+  }
+
+  @Test
+  public void testReprocessAtScale() throws IOException {

Review comment:
       Good test to see how much we can handle!

##########
File path: 
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/api/UtilizationEndpoint.java
##########
@@ -34,7 +34,7 @@
  */
 @Path("/utilization")
 @Produces(MediaType.APPLICATION_JSON)
-public class UtilizationService {
+public class UtilizationEndpoint {

Review comment:
       Can we add a getFileCounts(volume, bucket, fileSize) method here, with 
fileSize defaulting to null (i.e. all size bins)?
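
The core of the suggested endpoint is a filter over the per-(volume, bucket, fileSize) count records, where a null fileSize returns every size bin. A self-contained sketch; `FileCountRecord` and `getFileCounts` are illustrative stand-ins for the jOOQ-generated POJO and the eventual JAX-RS handler, not actual Recon code:

```java
import java.util.List;
import java.util.stream.Collectors;

class FileCountFilter {

  // Stand-in for the jOOQ-generated FileCountBySize POJO.
  static final class FileCountRecord {
    final String volume;
    final String bucket;
    final long fileSize;
    final long count;

    FileCountRecord(String volume, String bucket, long fileSize, long count) {
      this.volume = volume;
      this.bucket = bucket;
      this.fileSize = fileSize;
      this.count = count;
    }
  }

  // fileSize == null means "all size bins", matching the suggested default.
  static List<FileCountRecord> getFileCounts(List<FileCountRecord> all,
      String volume, String bucket, Long fileSize) {
    return all.stream()
        .filter(r -> r.volume.equals(volume) && r.bucket.equals(bucket))
        .filter(r -> fileSize == null || r.fileSize == fileSize)
        .collect(Collectors.toList());
  }
}
```

In the actual endpoint the same three parameters would arrive as `@QueryParam` values and the filtering would be pushed down into the DAO query rather than done in memory.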

##########
File path: 
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/FileSizeCountTask.java
##########
@@ -49,33 +52,28 @@
   private static final Logger LOG =
       LoggerFactory.getLogger(FileSizeCountTask.class);
 
-  private int maxBinSize = -1;
-  private long maxFileSizeUpperBound = 1125899906842624L; // 1 PB
-  private long[] upperBoundCount;
-  private long oneKb = 1024L;
+  // 1125899906842624L = 1PB
+  private static final long MAX_FILE_SIZE_UPPER_BOUND = 1125899906842624L;
   private FileCountBySizeDao fileCountBySizeDao;
+  // Map to store file counts in each <volume,bucket,fileSizeUpperBound>
+  private Map<FileSizeCountKey, Long> fileSizeCountMap;

Review comment:
       Since we don't read everything from the DB at init time, 
'fileSizeCountMap' can be a local variable created inside the 'process' and 
'reprocess' methods.
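
The suggested refactor amounts to building the map fresh on each call, so no state survives between batches. A minimal runnable sketch, assuming a hypothetical `FileSizeCountKey` stand-in for the key class in the PR (volume, bucket, file-size upper bound); the DB flush is omitted:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class FileSizeCountSketch {

  // Illustrative stand-in for the PR's FileSizeCountKey.
  static final class FileSizeCountKey {
    final String volume;
    final String bucket;
    final long upperBound;

    FileSizeCountKey(String volume, String bucket, long upperBound) {
      this.volume = volume;
      this.bucket = bucket;
      this.upperBound = upperBound;
    }

    @Override public boolean equals(Object o) {
      if (!(o instanceof FileSizeCountKey)) {
        return false;
      }
      FileSizeCountKey k = (FileSizeCountKey) o;
      return upperBound == k.upperBound
          && volume.equals(k.volume) && bucket.equals(k.bucket);
    }

    @Override public int hashCode() {
      return Objects.hash(volume, bucket, upperBound);
    }
  }

  // The map is a local variable created per call, as the review suggests:
  // counts are accumulated for this batch only, then flushed to the DB
  // (flush omitted here) and discarded.
  static Map<FileSizeCountKey, Long> countBatch(Iterable<FileSizeCountKey> keys) {
    Map<FileSizeCountKey, Long> fileSizeCountMap = new HashMap<>();
    for (FileSizeCountKey k : keys) {
      fileSizeCountMap.merge(k, 1L, Long::sum);
    }
    return fileSizeCountMap;
  }
}
```

Keeping the map local also removes the need to clear a shared field between `process` and `reprocess` invocations.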




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
