[GitHub] [hadoop-ozone] captainzmc edited a comment on pull request #880: HDDS-3327. Fix s3api create bucket BUCKET_NOT_FOUND when enable acl.

2020-05-07 Thread GitBox


captainzmc edited a comment on pull request #880:
URL: https://github.com/apache/hadoop-ozone/pull/880#issuecomment-625592747


   The review issues have been fixed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] captainzmc edited a comment on pull request #880: HDDS-3327. Fix s3api create bucket BUCKET_NOT_FOUND when enable acl.

2020-05-07 Thread GitBox


captainzmc edited a comment on pull request #880:
URL: https://github.com/apache/hadoop-ozone/pull/880#issuecomment-625592747


   The issue has been fixed.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] captainzmc commented on pull request #880: HDDS-3327. Fix s3api create bucket BUCKET_NOT_FOUND when enable acl.

2020-05-07 Thread GitBox


captainzmc commented on pull request #880:
URL: https://github.com/apache/hadoop-ozone/pull/880#issuecomment-625592747


   Hi @xiaoyuyao, the issue has been fixed. Could you help review it again?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3217) Datanode startup is slow due to iterating container DB 2-3 times

2020-05-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3217:
-
Affects Version/s: 0.5.0

> Datanode startup is slow due to iterating container DB 2-3 times
> 
>
> Key: HDDS-3217
> URL: https://issues.apache.org/jira/browse/HDDS-3217
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Blocker
>  Labels: billiontest, pull-request-available
> Fix For: 0.6.0
>
> Attachments: Datanode restart problem.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Datanode startup, for each container we iterate over the entire container DB twice:
> 1. To set the block length (bytes used).
> 2. To find the pending delete key count.
> And for open containers, we do step 1 again.
> *Code Snippet:*
> *ContainerReader.java:*
> *For setting Bytes Used:*
> {code:java}
>   List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
>   .getRangeKVs(null, Integer.MAX_VALUE,
>   MetadataKeyFilters.getNormalKeyFilter());
>   bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
> BlockData blockData;
> try {
>   blockData = BlockUtils.getBlockData(e.getValue());
>   return blockData.getSize();
> } catch (IOException ex) {
>   return 0L;
> }
>   }).sum();
>   kvContainerData.setBytesUsed(bytesUsed);
> {code}
> *For setting pending deleted Key count*
> {code:java}
>   MetadataKeyFilters.KeyPrefixFilter filter =
>   new MetadataKeyFilters.KeyPrefixFilter()
>   .addFilter(OzoneConsts.DELETING_KEY_PREFIX);
>   int numPendingDeletionBlocks =
>   containerDB.getStore().getSequentialRangeKVs(null,
>   Integer.MAX_VALUE, filter)
>   .size();
>   kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
> {code}
> *For open Containers*
> {code:java}
>   if (kvContainer.getContainerState()
>   == ContainerProtos.ContainerDataProto.State.OPEN) {
> // commitSpace for Open Containers relies on usedBytes
> initializeUsedBytes(kvContainer);
>   }
> {code}
> *Jstack of DN during startup*
> {code:java}
> "Thread-8" #34 prio=5 os_prio=0 tid=0x7f5df507 nid=0x8ee runnable 
> [0x7f4d840f3000]
>java.lang.Thread.State: RUNNABLE
> at org.rocksdb.RocksIterator.next0(Native Method)
> at 
> org.rocksdb.AbstractRocksIterator.next(AbstractRocksIterator.java:70)
> at 
> org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:195)
> at 
> org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:155)
> at 
> org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:158)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyAndFixupContainerData(ContainerReader.java:191)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:168)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:146)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.run(ContainerReader.java:101)
> at java.lang.Thread.run(Thread.java:748)
> {code}
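
For context, the fix in PR #742 (reviewed below) avoids these repeated scans by persisting the aggregates (bytes used, block count, pending delete count) in the container DB and reading them back at startup, falling back to a single scan only when they are absent. A minimal, self-contained sketch of that pattern, using a hypothetical ContainerStore interface and illustrative key names rather than the real Ozone classes:

{code:java}
import java.util.Map;
import java.util.Optional;

/** Simplified stand-in for a container's metadata DB (hypothetical API, not the real Ozone store). */
interface ContainerStore {
  Optional<Long> getCounter(String key);   // persisted aggregate, if present
  Map<String, Long> scanAllBlockSizes();   // expensive full-DB scan (block key -> size)
}

final class ContainerStartupSketch {
  // Illustrative key names; the real patch uses constants such as DB_CONTAINER_BYTES_USED_KEY.
  static final String BYTES_USED_KEY = "#BYTESUSED";
  static final String BLOCK_COUNT_KEY = "#BLOCKCOUNT";

  /** Returns {bytesUsed, blockCount}, preferring persisted counters over a full scan. */
  static long[] loadUsage(ContainerStore store) {
    Optional<Long> bytesUsed = store.getCounter(BYTES_USED_KEY);
    Optional<Long> blockCount = store.getCounter(BLOCK_COUNT_KEY);
    if (bytesUsed.isPresent() && blockCount.isPresent()) {
      return new long[] {bytesUsed.get(), blockCount.get()};
    }
    // Fallback: a single pass over the DB instead of the 2-3 iterations described above.
    long bytes = 0;
    long count = 0;
    for (long size : store.scanAllBlockSizes().values()) {
      bytes += size;
      count++;
    }
    return new long[] {bytes, count};
  }
}
{code}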



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3217) Datanode startup is slow due to iterating container DB 2-3 times

2020-05-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3217:
-
Component/s: Ozone Datanode

> Datanode startup is slow due to iterating container DB 2-3 times
> 
>
> Key: HDDS-3217
> URL: https://issues.apache.org/jira/browse/HDDS-3217
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Blocker
>  Labels: billiontest, pull-request-available
> Fix For: 0.6.0
>
> Attachments: Datanode restart problem.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Datanode startup, for each container we iterate over the entire container DB twice:
> 1. To set the block length (bytes used).
> 2. To find the pending delete key count.
> And for open containers, we do step 1 again.
> *Code Snippet:*
> *ContainerReader.java:*
> *For setting Bytes Used:*
> {code:java}
>   List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
>   .getRangeKVs(null, Integer.MAX_VALUE,
>   MetadataKeyFilters.getNormalKeyFilter());
>   bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
> BlockData blockData;
> try {
>   blockData = BlockUtils.getBlockData(e.getValue());
>   return blockData.getSize();
> } catch (IOException ex) {
>   return 0L;
> }
>   }).sum();
>   kvContainerData.setBytesUsed(bytesUsed);
> {code}
> *For setting pending deleted Key count*
> {code:java}
>   MetadataKeyFilters.KeyPrefixFilter filter =
>   new MetadataKeyFilters.KeyPrefixFilter()
>   .addFilter(OzoneConsts.DELETING_KEY_PREFIX);
>   int numPendingDeletionBlocks =
>   containerDB.getStore().getSequentialRangeKVs(null,
>   Integer.MAX_VALUE, filter)
>   .size();
>   kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
> {code}
> *For open Containers*
> {code:java}
>   if (kvContainer.getContainerState()
>   == ContainerProtos.ContainerDataProto.State.OPEN) {
> // commitSpace for Open Containers relies on usedBytes
> initializeUsedBytes(kvContainer);
>   }
> {code}
> *Jstack of DN during startup*
> {code:java}
> "Thread-8" #34 prio=5 os_prio=0 tid=0x7f5df507 nid=0x8ee runnable 
> [0x7f4d840f3000]
>java.lang.Thread.State: RUNNABLE
> at org.rocksdb.RocksIterator.next0(Native Method)
> at 
> org.rocksdb.AbstractRocksIterator.next(AbstractRocksIterator.java:70)
> at 
> org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:195)
> at 
> org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:155)
> at 
> org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:158)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyAndFixupContainerData(ContainerReader.java:191)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:168)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:146)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.run(ContainerReader.java:101)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3217) Datanode startup is slow due to iterating container DB 2-3 times

2020-05-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3217:
-
Fix Version/s: 0.6.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Datanode startup is slow due to iterating container DB 2-3 times
> 
>
> Key: HDDS-3217
> URL: https://issues.apache.org/jira/browse/HDDS-3217
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Blocker
>  Labels: billiontest, pull-request-available
> Fix For: 0.6.0
>
> Attachments: Datanode restart problem.pdf
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> During Datanode startup, for each container we iterate over the entire container DB twice:
> 1. To set the block length (bytes used).
> 2. To find the pending delete key count.
> And for open containers, we do step 1 again.
> *Code Snippet:*
> *ContainerReader.java:*
> *For setting Bytes Used:*
> {code:java}
>   List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
>   .getRangeKVs(null, Integer.MAX_VALUE,
>   MetadataKeyFilters.getNormalKeyFilter());
>   bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
> BlockData blockData;
> try {
>   blockData = BlockUtils.getBlockData(e.getValue());
>   return blockData.getSize();
> } catch (IOException ex) {
>   return 0L;
> }
>   }).sum();
>   kvContainerData.setBytesUsed(bytesUsed);
> {code}
> *For setting pending deleted Key count*
> {code:java}
>   MetadataKeyFilters.KeyPrefixFilter filter =
>   new MetadataKeyFilters.KeyPrefixFilter()
>   .addFilter(OzoneConsts.DELETING_KEY_PREFIX);
>   int numPendingDeletionBlocks =
>   containerDB.getStore().getSequentialRangeKVs(null,
>   Integer.MAX_VALUE, filter)
>   .size();
>   kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
> {code}
> *For open Containers*
> {code:java}
>   if (kvContainer.getContainerState()
>   == ContainerProtos.ContainerDataProto.State.OPEN) {
> // commitSpace for Open Containers relies on usedBytes
> initializeUsedBytes(kvContainer);
>   }
> {code}
> *Jstack of DN during startup*
> {code:java}
> "Thread-8" #34 prio=5 os_prio=0 tid=0x7f5df507 nid=0x8ee runnable 
> [0x7f4d840f3000]
>java.lang.Thread.State: RUNNABLE
> at org.rocksdb.RocksIterator.next0(Native Method)
> at 
> org.rocksdb.AbstractRocksIterator.next(AbstractRocksIterator.java:70)
> at 
> org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:195)
> at 
> org.apache.hadoop.hdds.utils.RocksDBStore.getRangeKVs(RocksDBStore.java:155)
> at 
> org.apache.hadoop.ozone.container.keyvalue.helpers.KeyValueContainerUtil.parseKVContainerData(KeyValueContainerUtil.java:158)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyAndFixupContainerData(ContainerReader.java:191)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.verifyContainerFile(ContainerReader.java:168)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.readVolume(ContainerReader.java:146)
> at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerReader.run(ContainerReader.java:101)
> at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#issuecomment-625549693


   Thank you @hanishakoneru for the review, and thanks @adoroszlai for the
initial review. If you have any further comments, I am happy to address them in
a separate Jira. (@adoroszlai is on vacation for 2 weeks, so I am committing for
now as we have one +1 from @hanishakoneru.)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru commented on pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


hanishakoneru commented on pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#issuecomment-625545076


   +1 pending CI. 
   Thanks @bharatviswa504 for working on this diligently.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


hanishakoneru commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421848060



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   Thanks for the explanation Bharat. Makes sense. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #903: HDDS-3453. Use UrlConnectionFactory to handle HTTP Client SPNEGO for …

2020-05-07 Thread GitBox


avijayanhwx commented on a change in pull request #903:
URL: https://github.com/apache/hadoop-ozone/pull/903#discussion_r421803351



##
File path: 
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/spi/impl/OzoneManagerServiceProviderImpl.java
##
@@ -276,6 +278,14 @@ public String getOzoneManagerSnapshotUrl() throws 
IOException {
 return omLeaderUrl;
   }
 
+  private boolean isOmSpengoEnabled() {

Review comment:
   nit: this can be simplified to a one-line method.
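
For illustration only (the actual body of isOmSpengoEnabled() is not shown in this thread), a hypothetical sketch of collapsing such a boolean helper into a single return statement:

{code:java}
// Hypothetical before/after; the real method in this PR may differ.
class SpnegoFlagExample {
  private String omHttpAuthType = "kerberos";   // assumed field, for illustration only

  // A verbose boolean helper...
  private boolean isOmSpnegoEnabledVerbose() {
    if (omHttpAuthType.equals("kerberos")) {
      return true;
    }
    return false;
  }

  // ...collapsed into a one-line method, as the nit suggests.
  private boolean isOmSpnegoEnabled() {
    return "kerberos".equals(omHttpAuthType);
  }
}
{code}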





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421842493



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   Also, at the end, even the new logic that uses the iterator ultimately calls
BlockUtils.getBlockData to get the next block. I feel the current way of
iterating and computing the size is cleaner, instead of loading all blocks
into memory at once.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421842493



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   Also, at the end, even the new logic that uses the iterator ultimately calls
BlockUtils.getBlockData to get the next block. I feel the current way (as
proposed in the PR) of iterating and computing the size is cleaner, instead of
loading all blocks into memory at once.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421840543



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   Previously we computed this twice: once using the logic above, and again for
open containers using different logic, initializeUsedBytes. Now we do it only
once, using the initializeUsedBytesAndBlockCount logic.
   
   So previously, if a container was open, we computed bytesUsed twice; now it is
computed only once, and for all containers it is computed with
initializeUsedBytesAndBlockCount. The reason for not reusing the logic above is
that it loads all blockData into memory at once before computing. The current
logic instead uses iterators and computes the size block by block.
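
A minimal, self-contained sketch of the trade-off described above, using a plain Iterator and a List of sizes instead of the real KeyValueBlockIterator and BlockData:

{code:java}
import java.util.Iterator;
import java.util.List;

final class UsedBytesSketch {

  // Old approach (simplified): all block sizes are materialized in memory, then summed.
  static long sumByMaterializing(List<Long> allBlockSizes) {
    return allBlockSizes.stream().mapToLong(Long::longValue).sum();
  }

  // New approach (simplified): stream block by block, keeping only running totals.
  static long[] sumByIterating(Iterator<Long> blockSizes) {
    long usedBytes = 0;
    long blockCount = 0;
    while (blockSizes.hasNext()) {
      usedBytes += blockSizes.next();
      blockCount++;
    }
    return new long[] {usedBytes, blockCount};
  }
}
{code}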





[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421840543



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   Previously we computed this twice: once using the logic above, and again for
open containers using different logic, initializeUsedBytes. Now we do it only
once, using the initializeUsedBytesAndBlockCount logic.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To 

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421840543



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   Previously we computed this twice, using two different code paths. Now we do
it only once, using the initializeUsedBytesAndBlockCount logic.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: 

[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


hanishakoneru commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421837672



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   I see that previously initializeUsedBytes() was done only for OPEN
containers, and for the others we would just get the BlockData and update
usedBytes.
   What I mean is that the behavior has changed with this patch. I might have
missed some background here. Can you please explain the reason for the change?





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:

[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#issuecomment-625530032


   > @bharat, the commits seem to have gone back to previous version. I see 
FILE_PER_BLOCK_AND_CONTAINER_DB_HAS_METADATA and versioning in the latest 
commits.
   
   Thank you for the review and for finding this issue; I accidentally used my
old branch and pushed it. Thanks for sharing my old patch; I have addressed the
review comment.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421831584



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/DeleteBlocksCommandHandler.java
##
@@ -256,9 +256,14 @@ private void deleteKeyValueContainerBlocks(
   // Finally commit the DB counters.
   BatchOperation batchOperation = new BatchOperation();
 
-  // Update in DB pending delete key count and delete transaction ID.
-  batchOperation.put(DB_CONTAINER_DELETE_TRANSACTION_KEY,
-  Longs.toByteArray(delTX.getTxID()));
+
+  // In memory is updated only when existing delete transactionID is
+  // greater.
+  if (containerData.getDeleteTransactionId() > delTX.getTxID()) {
+// Update in DB pending delete key count and delete transaction ID.

Review comment:
   Done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


hanishakoneru commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421819523



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/DeleteBlocksCommandHandler.java
##
@@ -256,9 +256,14 @@ private void deleteKeyValueContainerBlocks(
   // Finally commit the DB counters.
   BatchOperation batchOperation = new BatchOperation();
 
-  // Update in DB pending delete key count and delete transaction ID.
-  batchOperation.put(DB_CONTAINER_DELETE_TRANSACTION_KEY,
-  Longs.toByteArray(delTX.getTxID()));
+
+  // In memory is updated only when existing delete transactionID is
+  // greater.
+  if (containerData.getDeleteTransactionId() > delTX.getTxID()) {
+// Update in DB pending delete key count and delete transaction ID.

Review comment:
   Should be if (delTX.getTxID() > containerData.getDeleteTransactionId())





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3453) Use UrlConnectionFactory to handle HTTP Client SPNEGO for ozone components

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3453:
-
Labels: pull-request-available  (was: )

> Use UrlConnectionFactory to handle HTTP Client SPNEGO for ozone components
> --
>
> Key: HDDS-3453
> URL: https://issues.apache.org/jira/browse/HDDS-3453
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Xiaoyu Yao
>Assignee: Xiaoyu Yao
>Priority: Major
>  Labels: pull-request-available
>
> Some of the places need to be fixed, otherwise those HTTP clients won't be
> able to access the endpoints when SPNEGO is enabled on the server side:
> * ReconUtils#makeHttpCall
> * OzoneManagerSnapshotProvider#getOzoneManagerDBSnapshot
> The right API to use is URLConnectionFactory:
> public URLConnection openConnection(URL url, boolean isSpnego)
> The isSpnego flag should be based on OzoneSecurityUtil.isHttpSecurityEnabled().
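
For context, a minimal sketch of the suggested call pattern, assuming the URLConnectionFactory and OzoneSecurityUtil APIs named above (package names and error handling are simplified assumptions, not taken from the patch):

{code:java}
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;

import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.hdfs.web.URLConnectionFactory;
import org.apache.hadoop.ozone.security.OzoneSecurityUtil;

public final class SpnegoHttpCallSketch {

  /** Opens an HTTP endpoint, requesting SPNEGO only when HTTP security is enabled. */
  public static InputStream open(OzoneConfiguration conf, String endpoint)
      throws Exception {
    // Build a connection factory from the configuration (also picks up SSL settings).
    URLConnectionFactory factory =
        URLConnectionFactory.newDefaultURLConnectionFactory(conf);
    // Decide the isSpnego flag from the cluster's HTTP security setting.
    boolean isSpnego = OzoneSecurityUtil.isHttpSecurityEnabled(conf);
    URLConnection connection = factory.openConnection(new URL(endpoint), isSpnego);
    connection.connect();
    return connection.getInputStream();
  }
}
{code}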



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao opened a new pull request #903: HDDS-3453. Use UrlConnectionFactory to handle HTTP Client SPNEGO for …

2020-05-07 Thread GitBox


xiaoyuyao opened a new pull request #903:
URL: https://github.com/apache/hadoop-ozone/pull/903


   …ozone components.
   
   ## What changes were proposed in this pull request?
   
   Allow Recon to connect to the OM dbcheckpoint endpoint with SPNEGO support.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3453
   
   ## How was this patch tested?
   Acceptance tests (positive and negative) added for spnego enabled cluster.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3558) Recon UI: Add strict linter to improve code quality

2020-05-07 Thread Vivek Ratnavel Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Ratnavel Subramanian updated HDDS-3558:
-
Description: Add a linter like ([xojs|https://github.com/xojs/xo]) and fix 
all linter errors and warnings.  (was: Add a linter like 
([xojs|[https://github.com/xojs/xo]]) and fix all linter errors and warnings.)

> Recon UI: Add strict linter to improve code quality
> ---
>
> Key: HDDS-3558
> URL: https://issues.apache.org/jira/browse/HDDS-3558
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Affects Versions: 0.5.0
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>
> Add a linter like ([xojs|https://github.com/xojs/xo]) and fix all linter 
> errors and warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3558) Recon UI: Add strict linter to improve code quality

2020-05-07 Thread Vivek Ratnavel Subramanian (Jira)
Vivek Ratnavel Subramanian created HDDS-3558:


 Summary: Recon UI: Add strict linter to improve code quality
 Key: HDDS-3558
 URL: https://issues.apache.org/jira/browse/HDDS-3558
 Project: Hadoop Distributed Data Store
  Issue Type: Task
  Components: Ozone Recon
Affects Versions: 0.5.0
Reporter: Vivek Ratnavel Subramanian
Assignee: Vivek Ratnavel Subramanian


Add a linter like ([xojs|[https://github.com/xojs/xo]]) and fix all linter 
errors and warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421690519



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {

Review comment:
   This is done this way because hasNext will throw a StorageContainerException 
when it is unable to parse the block data, so we do not bail out and instead 
continue the iteration.
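   As a rough sketch (simplified from the diff above; `blockIter`, `LOG` and the 
counters are the ones introduced in this PR), the pattern is:
   
   ```java
   boolean more = true;
   while (more) {
     try {
       if (blockIter.hasNext()) {
         // hasNext()/nextBlock() can throw for a corrupt entry; log and keep
         // iterating instead of aborting the whole container load.
         BlockData block = blockIter.nextBlock();
         // ... accumulate usedBytes and blockCount from 'block' here ...
       } else {
         more = false;
       }
     } catch (IOException ex) {
       LOG.error("Failed to parse block data for Container {}",
           kvContainerData.getContainerID());
     }
   }
   ```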





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#issuecomment-625414687


   Thank You @hanishakoneru for the review. 
   Addressed review comments in the latest commit.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421697254



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/statemachine/commandhandler/DeleteBlocksCommandHandler.java
##
@@ -251,12 +253,23 @@ private void deleteKeyValueContainerBlocks(
 }
   }
 
-  containerDB.getStore()
-  .put(DFSUtil.string2Bytes(OzoneConsts.DELETE_TRANSACTION_KEY_PREFIX),
-  Longs.toByteArray(delTX.getTxID()));
-  containerData
-  .updateDeleteTransactionId(delTX.getTxID());
-  // update pending deletion blocks count in in-memory container status
+  // Finally commit the DB counters.
+  BatchOperation batchOperation = new BatchOperation();
+
+  // Update in DB pending delete key count and delete transaction ID.
+  batchOperation.put(DB_CONTAINER_DELETE_TRANSACTION_KEY,
+  Longs.toByteArray(delTX.getTxID()));
+
+  batchOperation.put(DB_PENDING_DELETE_BLOCK_COUNT_KEY, Longs.toByteArray(
+  containerData.getNumPendingDeletionBlocks() + newDeletionBlocks));
+
+  containerDB.getStore().writeBatch(batchOperation);
+
+
+  // update pending deletion blocks count and delete transaction ID in
+  // in-memory container status
+  containerData.updateDeleteTransactionId(delTX.getTxID());
+

Review comment:
   Good catch. Added code to update the DB only when the delete TxID is greater 
than the current TxID.
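   Roughly (reusing the names from the diff above; the getter on the container 
data is assumed for illustration), the guard looks like:
   
   ```java
   long latestTxId = delTX.getTxID();
   // Only persist and apply the new delete transaction id when it is newer
   // than the id the container already knows about.
   if (latestTxId > containerData.getDeleteTransactionId()) {
     batchOperation.put(DB_CONTAINER_DELETE_TRANSACTION_KEY,
         Longs.toByteArray(latestTxId));
     containerData.updateDeleteTransactionId(latestTxId);
   }
   ```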





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421696155



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/BlockManagerImpl.java
##
@@ -224,11 +239,20 @@ public void deleteBlock(Container container, BlockID 
blockID) throws
   // are not atomic. Leaving it here since the impact is refusing
   // to delete a Block which might have just gotten inserted after
   // the get check.
-  byte[] kKey = Longs.toByteArray(blockID.getLocalID());
+  byte[] blockKey = Longs.toByteArray(blockID.getLocalID());
 
   getBlockByID(db, blockID);
-  db.getStore().delete(kKey);
-  // Decrement blockcount here
+
+  // Update DB to delete block and set block count and bytes used.
+  BatchOperation batch = new BatchOperation();
+  batch.delete(blockKey);
+  batch.put(DB_CONTAINER_BYTES_USED_KEY,
+  Longs.toByteArray(container.getContainerData().getBytesUsed()));

Review comment:
   Right now deleteBlock in BlockManagerImpl is not used; BlockDeletingService 
deletes the chunks and then deletes the blocks that are marked for deletion 
from the container DB.
   
   Also, updating bytes used is not correct here, so only the block count is 
updated here. Bytes used is already taken care of during chunk deletion.
   ```
   
   // Once files are deleted... replace deleting entries with deleted
   // entries
   BatchOperation batch = new BatchOperation();
   succeedBlocks.forEach(entry -> {
 String blockId =
 entry.substring(OzoneConsts.DELETING_KEY_PREFIX.length());
 String deletedEntry = OzoneConsts.DELETED_KEY_PREFIX + blockId;
 batch.put(DFSUtil.string2Bytes(deletedEntry),
 DFSUtil.string2Bytes(blockId));
 batch.delete(DFSUtil.string2Bytes(entry));
   });
   ```





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421693311



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {
 try {
-  blockData = BlockUtils.getBlockData(e.getValue());
-  return blockData.getSize();
+  if (blockIter.hasNext()) {
+BlockData block = blockIter.nextBlock();
+long blockLen = 0;
+
+List< ContainerProtos.ChunkInfo > chunkInfoList = 
block.getChunks();
+for (ContainerProtos.ChunkInfo chunk : chunkInfoList) {
+  ChunkInfo info = ChunkInfo.getFromProtoBuf(chunk);
+  blockLen += info.getLen();
+}
+
+usedBytes += blockLen;
+blockCount++;
+  } else {
+success = false;
+  }
 } catch (IOException ex) {
-  return 0L;
+  LOG.error("Failed to parse block data for Container {}",
+  kvContainerData.getContainerID());

Review comment:
   We used to do this previously as well; I have slightly modified the code. I 
think that if a single block is corrupt and we do not add the container to the 
container set, the background Container Scanner will never know about this 
container and it will never be fixed. So I think adding it to the container 
set is the right thing to do here. Let me know your thoughts.
   
   ```
bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
   BlockData blockData;
   try {
 blockData = BlockUtils.getBlockData(e.getValue());
 return blockData.getSize();
   } catch (IOException ex) {
 return 0L;
   

[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #742: HDDS-3217. Datanode startup is slow due to iterating container DB 2-3 times.

2020-05-07 Thread GitBox


bharatviswa504 commented on a change in pull request #742:
URL: https://github.com/apache/hadoop-ozone/pull/742#discussion_r421690519



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/helpers/KeyValueContainerUtil.java
##
@@ -150,29 +160,109 @@ public static void 
parseKVContainerData(KeyValueContainerData kvContainerData,
 }
 kvContainerData.setDbFile(dbFile);
 
-try(ReferenceCountedDB metadata =
-BlockUtils.getDB(kvContainerData, config)) {
-  long bytesUsed = 0;
-  List<Map.Entry<byte[], byte[]>> liveKeys = metadata.getStore()
-  .getRangeKVs(null, Integer.MAX_VALUE,
-  MetadataKeyFilters.getNormalKeyFilter());
 
-  bytesUsed = liveKeys.parallelStream().mapToLong(e-> {
-BlockData blockData;
+boolean isBlockMetadataSet = false;
+
+try(ReferenceCountedDB containerDB = BlockUtils.getDB(kvContainerData,
+config)) {
+
+  // Set pending deleted block count.
+  byte[] pendingDeleteBlockCount =
+  containerDB.getStore().get(DB_PENDING_DELETE_BLOCK_COUNT_KEY);
+  if (pendingDeleteBlockCount != null) {
+kvContainerData.incrPendingDeletionBlocks(
+Ints.fromByteArray(pendingDeleteBlockCount));
+  } else {
+// Set pending deleted block count.
+MetadataKeyFilters.KeyPrefixFilter filter =
+new MetadataKeyFilters.KeyPrefixFilter()
+.addFilter(OzoneConsts.DELETING_KEY_PREFIX);
+int numPendingDeletionBlocks =
+containerDB.getStore().getSequentialRangeKVs(null,
+Integer.MAX_VALUE, filter)
+.size();
+kvContainerData.incrPendingDeletionBlocks(numPendingDeletionBlocks);
+  }
+
+  // Set delete transaction id.
+  byte[] delTxnId =
+  containerDB.getStore().get(DB_CONTAINER_DELETE_TRANSACTION_KEY);
+  if (delTxnId != null) {
+kvContainerData
+.updateDeleteTransactionId(Longs.fromByteArray(delTxnId));
+  }
+
+  // Set BlockCommitSequenceId.
+  byte[] bcsId = containerDB.getStore().get(
+  DB_BLOCK_COMMIT_SEQUENCE_ID_KEY);
+  if (bcsId != null) {
+kvContainerData
+.updateBlockCommitSequenceId(Longs.fromByteArray(bcsId));
+  }
+
+  // Set bytes used.
+  // commitSpace for Open Containers relies on usedBytes
+  byte[] bytesUsed =
+  containerDB.getStore().get(DB_CONTAINER_BYTES_USED_KEY);
+  if (bytesUsed != null) {
+isBlockMetadataSet = true;
+kvContainerData.setBytesUsed(Longs.fromByteArray(bytesUsed));
+  }
+
+  // Set block count.
+  byte[] blockCount = containerDB.getStore().get(DB_BLOCK_COUNT_KEY);
+  if (blockCount != null) {
+isBlockMetadataSet = true;
+kvContainerData.setKeyCount(Longs.fromByteArray(blockCount));
+  }
+}
+
+if (!isBlockMetadataSet) {
+  initializeUsedBytesAndBlockCount(kvContainerData);
+}
+  }
+
+
+  /**
+   * Initialize bytes used and block count.
+   * @param kvContainerData
+   * @throws IOException
+   */
+  private static void initializeUsedBytesAndBlockCount(
+  KeyValueContainerData kvContainerData) throws IOException {
+
+long blockCount = 0;
+try (KeyValueBlockIterator blockIter = new KeyValueBlockIterator(
+kvContainerData.getContainerID(),
+new File(kvContainerData.getContainerPath()))) {
+  long usedBytes = 0;
+
+
+  boolean success = true;
+  while (success) {

Review comment:
   This is done this way because hasNext will throw a StorageContainerException 
when it is unable to parse the block data.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] vivekratnavel commented on pull request #896: HDDS-3333. Recon UI: All the pages should auto reload

2020-05-07 Thread GitBox


vivekratnavel commented on pull request #896:
URL: https://github.com/apache/hadoop-ozone/pull/896#issuecomment-625408001


   @avijayanhwx Thanks for the review!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3333) Recon UI: All the pages should auto reload

2020-05-07 Thread Vivek Ratnavel Subramanian (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vivek Ratnavel Subramanian updated HDDS-3333:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Recon UI: All the pages should auto reload
> --
>
> Key: HDDS-3333
> URL: https://issues.apache.org/jira/browse/HDDS-3333
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.5.0
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
>
> Overview page should auto refresh to fetch updated data and alerts if any.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #893: HDDS-3517. Add a directory based Ozone Manager LoadGenerator.

2020-05-07 Thread GitBox


avijayanhwx commented on a change in pull request #893:
URL: https://github.com/apache/hadoop-ozone/pull/893#discussion_r421689592



##
File path: 
hadoop-ozone/fault-injection-test/mini-chaos-tests/src/test/java/org/apache/hadoop/ozone/loadgenerators/NestedDirLoadGenerator.java
##
@@ -0,0 +1,58 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.loadgenerators;
+
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.ozone.utils.LoadBucket;
+
+import java.util.Map;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * A load generator that creates nested directories and reads them back.
+ */
+public class NestedDirLoadGenerator extends LoadGenerator {
+  private final LoadBucket fsBucket;
+  private final int maxDirWidth;

Review comment:
   nit. Shouldn't this be 'maxDirDepth'? 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3513) Add OzoneConfiguration to UGI when startup S3Gateway

2020-05-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3513:
-
Component/s: (was: s3g)

> Add OzoneConfiguration to UGI when startup S3Gateway
> 
>
> Key: HDDS-3513
> URL: https://issues.apache.org/jira/browse/HDDS-3513
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: S3
>Affects Versions: 0.5.0
>Reporter: Simon Su
>Assignee: Simon Su
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
> Fix For: 0.6.0
>
> Attachments: add_ozone_conf.patch
>
>
> OzoneConfiguration does not load ozone-site.xml by default, which may cause 
> issues when setting up a secure Ozone cluster.
> When we start an S3Gateway in secure mode (Kerberos), the S3Gateway HTTP 
> server uses UserGroupInformation to check whether security is enabled, which 
> loads the Configuration to check whether "hadoop.security.authentication" is 
> set to "KERBEROS". Unfortunately, the default configuration only loads 
> "core-site.xml, core-default.xml, hdfs-site.xml, hdfs-default.xml, 
> ozone-default.xml" and misses ozone-site.xml, which means we have to 
> configure "hadoop.security.authentication" in one of those five default 
> config files if we want to start a secure S3Gateway.
> It's better to add ozone-site.xml into OzoneConfiguration by default, so we 
> don't need to repeat the same configuration in different places.
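A minimal sketch of the kind of change described above (illustrative only, not 
the attached add_ozone_conf.patch; the class name and call sites are assumed):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.security.UserGroupInformation;

// Illustrative sketch: make ozone-site.xml visible to the security checks
// that run while the S3Gateway HTTP server starts up.
public final class S3GatewaySecurityConfSketch {
  public static void main(String[] args) {
    // Register ozone-site.xml as a default resource so that every
    // Configuration instance (including the one UserGroupInformation reads)
    // picks up hadoop.security.authentication defined there.
    Configuration.addDefaultResource("ozone-site.xml");
    // Hand UGI an OzoneConfiguration explicitly before starting the server.
    UserGroupInformation.setConfiguration(new OzoneConfiguration());
    System.out.println("security enabled: "
        + UserGroupInformation.isSecurityEnabled());
  }
}
{code}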



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3513) Add OzoneConfiguration to UGI when startup S3Gateway

2020-05-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3513:
-
Component/s: s3g

> Add OzoneConfiguration to UGI when startup S3Gateway
> 
>
> Key: HDDS-3513
> URL: https://issues.apache.org/jira/browse/HDDS-3513
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: s3g
>Affects Versions: 0.5.0
>Reporter: Simon Su
>Assignee: Simon Su
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
> Fix For: 0.6.0
>
> Attachments: add_ozone_conf.patch
>
>
> OzoneConfiguration does not load ozone-site.xml by default, which may cause 
> issues when setting up a secure Ozone cluster.
> When we start an S3Gateway in secure mode (Kerberos), the S3Gateway HTTP 
> server uses UserGroupInformation to check whether security is enabled, which 
> loads the Configuration to check whether "hadoop.security.authentication" is 
> set to "KERBEROS". Unfortunately, the default configuration only loads 
> "core-site.xml, core-default.xml, hdfs-site.xml, hdfs-default.xml, 
> ozone-default.xml" and misses ozone-site.xml, which means we have to 
> configure "hadoop.security.authentication" in one of those five default 
> config files if we want to start a secure S3Gateway.
> It's better to add ozone-site.xml into OzoneConfiguration by default, so we 
> don't need to repeat the same configuration in different places.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3513) Add OzoneConfiguration to UGI when startup S3Gateway

2020-05-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3513:
-
Component/s: S3

> Add OzoneConfiguration to UGI when startup S3Gateway
> 
>
> Key: HDDS-3513
> URL: https://issues.apache.org/jira/browse/HDDS-3513
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: S3, s3g
>Affects Versions: 0.5.0
>Reporter: Simon Su
>Assignee: Simon Su
>Priority: Minor
>  Labels: pull-request-available, pull-requests-available
> Fix For: 0.6.0
>
> Attachments: add_ozone_conf.patch
>
>
> OzoneConfiguration does not load ozone-site.xml by default, which may cause 
> issues when setting up a secure Ozone cluster.
> When we start an S3Gateway in secure mode (Kerberos), the S3Gateway HTTP 
> server uses UserGroupInformation to check whether security is enabled, which 
> loads the Configuration to check whether "hadoop.security.authentication" is 
> set to "KERBEROS". Unfortunately, the default configuration only loads 
> "core-site.xml, core-default.xml, hdfs-site.xml, hdfs-default.xml, 
> ozone-default.xml" and misses ozone-site.xml, which means we have to 
> configure "hadoop.security.authentication" in one of those five default 
> config files if we want to start a secure S3Gateway.
> It's better to add ozone-site.xml into OzoneConfiguration by default, so we 
> don't need to repeat the same configuration in different places.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] aryangupta1998 commented on pull request #887: HDDS-3492. With hdfs-fuse file implementation 'cp' command is not working - fixed.

2020-05-07 Thread GitBox


aryangupta1998 commented on pull request #887:
URL: https://github.com/apache/hadoop-ozone/pull/887#issuecomment-625325133


   > Thanks to create this patch @aryangupta1998. Don't see any problem with 
this approach, but neither understand it fully. Is `getattr` supposed to be 
working according to POSIX before closing the file?
   > 
   > Do we need to fix it in our side? (Checking the open key tables when we do 
getFileStatus)
   
   @elek Yes, 'getattr' works according to POSIX before the file is closed. The 
first 'getattr' call that happens before closing checks whether a file already 
exists at the destination path. The second 'getattr' call is the one that 
fails: the system creates an empty file at the destination path and opens it 
for writing, but in the Ozone filesystem a file is not visible until it is 
committed, so 'getattr' fails at that point even though the file has not been 
closed yet. We raised this PR to address that issue.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #899: HDDS-3549. TestKeyInputStream#testSeek fails intermittently.

2020-05-07 Thread GitBox


adoroszlai commented on a change in pull request #899:
URL: https://github.com/apache/hadoop-ozone/pull/899#discussion_r421476470



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestKeyInputStream.java
##
@@ -257,7 +257,7 @@ public void testSeek() throws Exception {
 
 // Seek operation should not result in any readChunk operation.
 Assert.assertEquals(readChunkCount, metrics
-.getContainerOpsMetrics(ContainerProtos.Type.ReadChunk));
+.getContainerOpCountMetrics(ContainerProtos.Type.ReadChunk));

Review comment:
   You are right, I forgot that the other metric is not decremented, so it 
will not have this problem.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3557) Provide generic introduction / deep-dive slides as part of the documentation

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3557:
-
Labels: pull-request-available  (was: )

> Provide generic introduction / deep-dive slides as part of the documentation
> 
>
> Key: HDDS-3557
> URL: https://issues.apache.org/jira/browse/HDDS-3557
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: documentation
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>
> I think it's good to have a generic overview of Ozone in presentation format, 
> shared with the community. It can be used anytime by anybody to share details 
> of Ozone at any meetup or conference.
> I am not sure what is the best place for this but documentation seems to be a 
> natural choice:
>  * This is version dependent (can be updated with new releases)
>  * Topics and diagrams can be shared between documentations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek opened a new pull request #902: HDDS-3557. Provide generic introduction / deep-dive slides as part of the docs

2020-05-07 Thread GitBox


elek opened a new pull request #902:
URL: https://github.com/apache/hadoop-ozone/pull/902


   ## What changes were proposed in this pull request?
   
   I think it's good to have a generic overview of Ozone in presentation format, 
shared with the community. It can be used anytime by anybody to share details 
of Ozone at any meetup or conference.
   
   I am not sure what is the best place for this but documentation seems to be 
a natural choice:
   
   1. This is version dependent (can be updated with new releases)
   2. Topics and diagrams can be shared between documentations

   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3557
   
   ## How was this patch tested?
   
   I presented it to a few of my colleagues and got positive feedback. It took 
about an hour; with questions, 1.5 hours is a better estimate.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3557) Provide generic introduction / deep-dive slides as part of the documentation

2020-05-07 Thread Marton Elek (Jira)
Marton Elek created HDDS-3557:
-

 Summary: Provide generic introduction / deep-dive slides as part 
of the documentation
 Key: HDDS-3557
 URL: https://issues.apache.org/jira/browse/HDDS-3557
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: documentation
Reporter: Marton Elek
Assignee: Marton Elek


I think it's good to have a generic overview of Ozone in presentation format, 
shared with the community. It can be used anytime by anybody to share details 
of Ozone at any meetup or conference.

I am not sure what is the best place for this but documentation seems to be a 
natural choice:

 * This is version dependent (can be updated with new releases)
 * Topics and diagrams can be shared between documentations



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3552) OzoneFS is slow compared to HDFS using Spark job

2020-05-07 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101603#comment-17101603
 ] 

Marton Elek commented on HDDS-3552:
---

I synced with @Shashikant Banerjee and learned that the main reason for the 
slowness is that Ozone (by default) provides stronger guarantees on replication.

(For example, flush() on HDFS means sending the data to the network; in Ozone 
it can guarantee that the data is flushed to disk on the remote nodes.)

There is an ongoing effort to adjust these guarantees and make them 
configurable (and provide guarantees similar to HDFS).

If you have time, you can set ozone.client.stream.buffer.flush.delay to true, 
which makes the flush implementation similar to HDFS (and makes everything 
faster), and repeat your test.

I am not sure if that is possible with Cloudera Runtime (this setting is not 
yet released), but I will try to repeat your result and show how it works with 
different settings.
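As a rough illustration (assuming the property name above; it can equally be 
set in ozone-site.xml), the client-side toggle would look like this:
{code:java}
import org.apache.hadoop.hdds.conf.OzoneConfiguration;

public final class FlushDelaySketch {
  public static void main(String[] args) {
    OzoneConfiguration conf = new OzoneConfiguration();
    // Delay client-side buffer flushes so that flush() behaves closer to the
    // HDFS semantics (weaker durability guarantee, higher throughput).
    conf.setBoolean("ozone.client.stream.buffer.flush.delay", true);
    System.out.println(conf.get("ozone.client.stream.buffer.flush.delay"));
  }
}
{code}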

> OzoneFS is slow compared to HDFS using Spark job
> 
>
> Key: HDDS-3552
> URL: https://issues.apache.org/jira/browse/HDDS-3552
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Priority: Major
>
> Reported by "Andrey Mindrin" on the the-asf Slack:
> {quote}
> We have made a few tests to compare OZONE (0.4 and 0.5 on Cloudera Runtime 
> 7.0.3 with 3 nodes) performance with HDFS and OZONE is slower in most cases. 
> For example, Spark application with 18 containers that copies 6 Gb parquet 
> file is about 50% slower on OzoneFS. The only one case shows the same 
> performance - Hive queries over partitioned tables.
>  simple spark code we used:
> {code}
> val file = spark.read.format(format).load(path_input)
> file.write.format(format).save(path_output)
> {code}
> Tested on CSV file with 800 million records, 2 columns and parquet file 
> converted from CSV above. Just copied file from HDFS to HDFS and from Ozone 
> to Ozone. Application time is 1m 14s on HDFS and  1m 51s (+50%) on Ozone 
> (parquet file). Ozone has default settings. (edited) 
> {quote}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-3327) BUCKET_NOT_FOUND occurs when I create the bucket using the aws s3api

2020-05-07 Thread mingchao zhao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mingchao zhao reassigned HDDS-3327:
---

Assignee: mingchao zhao

> BUCKET_NOT_FOUND occurs when I create the bucket using the aws s3api
> 
>
> Key: HDDS-3327
> URL: https://issues.apache.org/jira/browse/HDDS-3327
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: S3
>Affects Versions: 0.5.0
>Reporter: mingchao zhao
>Assignee: mingchao zhao
>Priority: Major
>  Labels: pull-request-available
> Attachments: error.log
>
>
> When the acl is enabled (but security is not enabled), creating a bucket with 
> the following command reports an error. It succeeds when the acl is not 
> enabled.
> {code:java}
> aws s3api --endpoint-url http://localhost:9878 create-bucket --bucket=bucket1
> {code}
> The error in hadoop-root-s3g.log is as follows:
> {code:java}
> 2020-04-02 15:28:12,030 [qtp2131952342-207] ERROR 
> org.apache.hadoop.ozone.s3.endpoint.BucketEndpoint: Error in Create Bucket 
> Request for bucket: bucket12020-04-02 15:28:12,030 [qtp2131952342-207] ERROR 
> org.apache.hadoop.ozone.s3.endpoint.BucketEndpoint: Error in Create Bucket 
> Request for bucket: bucket1BUCKET_NOT_FOUND 
> org.apache.hadoop.ozone.om.exceptions.OMException: Bucket bucket1 is not 
> found at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:805)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.getOzoneBucketMapping(OzoneManagerProtocolClientSideTranslatorPB.java:1027)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:71) 
> at com.sun.proxy.$Proxy86.getOzoneBucketMapping(Unknown Source) at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.getOzoneBucketMapping(RpcClient.java:791)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.ozone.client.OzoneClientInvocationHandler.invoke(OzoneClientInvocationHandler.java:54)
>  at com.sun.proxy.$Proxy89.getOzoneBucketMapping(Unknown Source) at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:71) 
> at com.sun.proxy.$Proxy89.getOzoneBucketMapping(Unknown Source) at 
> org.apache.hadoop.ozone.client.ObjectStore.getOzoneBucketMapping(ObjectStore.java:135)
>  at 
> org.apache.hadoop.ozone.client.ObjectStore.getOzoneBucketName(ObjectStore.java:159)
>  at 
> org.apache.hadoop.ozone.s3.endpoint.EndpointBase.createS3Bucket(EndpointBase.java:125)
>  at 
> org.apache.hadoop.ozone.s3.endpoint.BucketEndpoint.put(BucketEndpoint.java:208)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498) at 
> org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
>  at 
> org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
>  at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
>  at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:415)
>  at 
> 

[jira] [Assigned] (HDDS-3556) Refactor configuration in SCMRatisServer to Java-based configuration

2020-05-07 Thread Li Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng reassigned HDDS-3556:
--

Assignee: Li Cheng

> Refactor configuration in SCMRatisServer to Java-based configuration
> 
>
> Key: HDDS-3556
> URL: https://issues.apache.org/jira/browse/HDDS-3556
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: SCM
>Affects Versions: 0.5.0
>Reporter: Li Cheng
>Assignee: Li Cheng
>Priority: Major
>
> [https://cwiki.apache.org/confluence/display/HADOOP/Java-based+configuration+API]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3556) Refactor configuration in SCMRatisServer to Java-based configuration

2020-05-07 Thread Li Cheng (Jira)
Li Cheng created HDDS-3556:
--

 Summary: Refactor configuration in SCMRatisServer to Java-based 
configuration
 Key: HDDS-3556
 URL: https://issues.apache.org/jira/browse/HDDS-3556
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: SCM
Affects Versions: 0.5.0
Reporter: Li Cheng


[https://cwiki.apache.org/confluence/display/HADOOP/Java-based+configuration+API]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on pull request #724: HDDS-3282. ozone.http.filter.initializers can't be set properly for S…

2020-05-07 Thread GitBox


elek commented on pull request #724:
URL: https://github.com/apache/hadoop-ozone/pull/724#issuecomment-625166089


   > I've merged it to master.
   
   Sorry, I merged it locally as I promised but forgot to push it. Thanks for 
finishing it.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3555) Failed to create bucket by s3g

2020-05-07 Thread runzhiwang (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17101481#comment-17101481
 ] 

runzhiwang commented on HDDS-3555:
--

[~adoroszlai] Could you have a look ?

> Failed to create bucket by s3g
> --
>
> Key: HDDS-3555
> URL: https://issues.apache.org/jira/browse/HDDS-3555
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> command is: aws s3api --endpoint http://ip:9878 create-bucket --bucket 
> idex-bucket
> code is: master branch with latest commit: 
> https://github.com/apache/hadoop-ozone/commit/92cf63c41c4ef338feafa2acb9903185e5a94d39
>  
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3555) Failed to create bucket by s3g

2020-05-07 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-3555:
-
Description: 
command is: aws s3api --endpoint http://ip:9878 create-bucket --bucket 
idex-bucket
code is: master branch with latest commit: 
https://github.com/apache/hadoop-ozone/commit/92cf63c41c4ef338feafa2acb9903185e5a94d39
 
 !screenshot-1.png! 

  was: !screenshot-1.png! 


> Failed to create bucket by s3g
> --
>
> Key: HDDS-3555
> URL: https://issues.apache.org/jira/browse/HDDS-3555
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
> command is: aws s3api --endpoint http://ip:9878 create-bucket --bucket 
> idex-bucket
> code is: master branch with latest commit: 
> https://github.com/apache/hadoop-ozone/commit/92cf63c41c4ef338feafa2acb9903185e5a94d39
>  
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3555) Failed to create bucket by s3g

2020-05-07 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-3555:
-
Description:  !screenshot-1.png! 

> Failed to create bucket by s3g
> --
>
> Key: HDDS-3555
> URL: https://issues.apache.org/jira/browse/HDDS-3555
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>
>  !screenshot-1.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3555) Failed to create bucket by s3g

2020-05-07 Thread runzhiwang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

runzhiwang updated HDDS-3555:
-
Attachment: screenshot-1.png

> Failed to create bucket by s3g
> --
>
> Key: HDDS-3555
> URL: https://issues.apache.org/jira/browse/HDDS-3555
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: runzhiwang
>Priority: Major
> Attachments: screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3555) Failed to create bucket by s3g

2020-05-07 Thread runzhiwang (Jira)
runzhiwang created HDDS-3555:


 Summary: Failed to create bucket by s3g
 Key: HDDS-3555
 URL: https://issues.apache.org/jira/browse/HDDS-3555
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: runzhiwang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] lokeshj1703 commented on a change in pull request #899: HDDS-3549. TestKeyInputStream#testSeek fails intermittently.

2020-05-07 Thread GitBox


lokeshj1703 commented on a change in pull request #899:
URL: https://github.com/apache/hadoop-ozone/pull/899#discussion_r421335507



##
File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestKeyInputStream.java
##
@@ -257,7 +257,7 @@ public void testSeek() throws Exception {
 
 // Seek operation should not result in any readChunk operation.
 Assert.assertEquals(readChunkCount, metrics
-.getContainerOpsMetrics(ContainerProtos.Type.ReadChunk));
+.getContainerOpCountMetrics(ContainerProtos.Type.ReadChunk));

Review comment:
   I agree it should not be -1. But if we look at completedCount, it does not 
have a decrement method, whereas pendingOpCount does. So completedCount can 
never drop below 0, but pendingOpCount can. There is a problem with how we 
decrement the pending metric (we decrement both when a reply comes and in the 
onError call), and that should be tracked via a separate jira. There should be 
a better mechanism for decrementing pendingOpCount.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] adoroszlai commented on a change in pull request #886: HDDS-3473. Ozone chunkinfo CLI should display block file path info

2020-05-07 Thread GitBox


adoroszlai commented on a change in pull request #886:
URL: https://github.com/apache/hadoop-ozone/pull/886#discussion_r421289937



##
File path: hadoop-ozone/dist/src/main/smoketest/debug/ozone-debug.robot
##
@@ -0,0 +1,37 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+*** Settings ***
+Documentation   Test ozone Debug CLI
+Library OperatingSystem
+Resource    ../commonlib.robot
+Test Timeout    2 minute
+
+*** Variables ***
+
+
+*** Test Cases ***
+Create Volume,Bucket and put key
+   Execute ozone sh volume create o3://om/vol1 --quota 100TB
+   Execute ozone sh bucket create o3://om/vol1/bucket1
+   Execute ozone sh key put o3://om/vol1/bucket1/debugKey 
/opt/hadoop/NOTICE.txt
+
+Test ozone debug
+${result} = Execute ozone debug chunkinfo 
o3://om/vol1/bucket1/debugKey | jq -r '.[]'
+Should contain  ${result}   files
+${result} = Execute ozone debug chunkinfo 
o3://om/vol1/bucket1/debugKey | jq -r '.[].files[0]'
+${result3} =Execute echo "exists"
+${result2} =Execute test -f ${result} && echo "exists"
+Should Be Equal ${result2}   ${result3}

Review comment:
   `Execute` checks the return code of the command, so I think this should 
be enough to verify that the file exists:
   
   ```
   Execute test -f ${result}
   ```
   
   Better yet, there is a Robot keyword for the same: [`File Should 
Exist`](http://robotframework.org/robotframework/3.0.2/libraries/OperatingSystem.html#File%20Should%20Exist)
   
   ```suggestion
   File Should Exist ${result}
   ```

##
File path: hadoop-ozone/dist/src/main/smoketest/debug/ozone-debug.robot
##
@@ -0,0 +1,37 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+*** Settings ***
+Documentation   Test ozone Debug CLI
+Library OperatingSystem
+Resource    ../commonlib.robot
+Test Timeout    2 minute
+
+*** Variables ***
+
+
+*** Test Cases ***
+Create Volume,Bucket and put key
+   Execute ozone sh volume create o3://om/vol1 --quota 100TB
+   Execute ozone sh bucket create o3://om/vol1/bucket1
+   Execute ozone sh key put o3://om/vol1/bucket1/debugKey 
/opt/hadoop/NOTICE.txt

Review comment:
   I think this should be a `Suite Setup` instead of a `Test Case`, but 
it's OK to improve in a followup Jira.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org