[jira] [Updated] (HDDS-3683) Ozone fuse support

2020-06-10 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3683:
-
Description: 
https://github.com/opendataio/hcfsfuse

The design doc will be updated here.
https://docs.google.com/document/d/1IY9xhRTeo42Sfzw6U-NngHOTLO_B7_0BiUsonvPKvh8/edit?usp=sharing

  was:https://github.com/opendataio/hcfsfuse


>  Ozone fuse support
> ---
>
> Key: HDDS-3683
> URL: https://issues.apache.org/jira/browse/HDDS-3683
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> https://github.com/opendataio/hcfsfuse
> The design doc will be updated here.
> https://docs.google.com/document/d/1IY9xhRTeo42Sfzw6U-NngHOTLO_B7_0BiUsonvPKvh8/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] codecov-commenter commented on pull request #1054: Hdds 3772. Add LOG to S3ErrorTable for easier problem locating.

2020-06-10 Thread GitBox


codecov-commenter commented on pull request #1054:
URL: https://github.com/apache/hadoop-ozone/pull/1054#issuecomment-642411148


   # 
[Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1054?src=pr=h1) 
Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@67244e5`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1054/graphs/tree.svg?width=650=150=pr=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1054?src=pr=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##             master    #1054   +/-   ##
   ==========================================
     Coverage          ?   69.43%
     Complexity        ?     9113
   ==========================================
     Files             ?      961
     Lines             ?    48150
     Branches          ?     4679
   ==========================================
     Hits              ?    33435
     Misses            ?    12499
     Partials          ?     2216
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1054?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1054?src=pr=footer).
 Last update 
[67244e5...06c521d](https://codecov.io/gh/apache/hadoop-ozone/pull/1054?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3481) SCM ask too many datanodes to replicate the same container

2020-06-10 Thread Nanda kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-3481:
--
Status: Patch Available  (was: Open)

> SCM ask too many datanodes to replicate the same container
> --
>
> Key: HDDS-3481
> URL: https://issues.apache.org/jira/browse/HDDS-3481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
>  Labels: Triaged, pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>
> *What's the problem?*
> As the images show, SCM asked 31 datanodes to replicate container 2037 every
> 10 minutes, starting from 2020-04-17 23:38:51. Then, at 2020-04-18 08:58:52,
> SCM found that the replica count of container 2037 was 12, so it asked 11
> datanodes to delete container 2037.
>  !screenshot-1.png! 
>  !screenshot-2.png! 
> *What's the reason?*
> SCM checks whether (container replica count +
> inflightReplication.get(containerId).size() -
> inflightDeletion.get(containerId).size()) is less than 3. If it is, SCM asks
> some datanode to replicate the container and adds the action to
> inflightReplication.get(containerId). The replicate action times out after
> 10 minutes; on timeout, SCM removes the action from
> inflightReplication.get(containerId), as the image shows. The sum above then
> drops below 3 again, and SCM asks yet another datanode to replicate the
> container.
> Because replicating a container takes a long time and sometimes cannot
> finish within 10 minutes, 31 datanodes ended up replicating the container
> every 10 minutes. 19 of the 31 datanodes replicated the container from the
> same source datanode, which also put heavy pressure on that source datanode
> and made replication even slower. In fact, the first replication took 4
> hours to finish.
>  !screenshot-4.png! 
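
A minimal sketch of the feedback loop described above; the field names and
timeout handling are illustrative, not the actual SCM ReplicationManager code:

{code:java}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the buggy feedback loop; not the actual SCM code.
public class ReplicationLoopSketch {
  static final int REPLICATION_FACTOR = 3;

  // Start times of replicate/delete actions believed to be in flight.
  private final Map<Long, List<Long>> inflightReplication = new HashMap<>();
  private final Map<Long, List<Long>> inflightDeletion = new HashMap<>();

  void checkContainer(long containerId, int replicaCount, long now,
      long timeoutMillis) {
    // Timed-out replicate actions are dropped, even though the copy may
    // still be running on the datanode...
    inflightReplication.getOrDefault(containerId, new ArrayList<>())
        .removeIf(startTime -> now - startTime > timeoutMillis);

    int effective = replicaCount
        + inflightReplication.getOrDefault(containerId, new ArrayList<>())
            .size()
        - inflightDeletion.getOrDefault(containerId, new ArrayList<>()).size();

    // ...so this sum falls below 3 again every timeout period, and yet
    // another datanode is asked to replicate the same container.
    if (effective < REPLICATION_FACTOR) {
      inflightReplication.computeIfAbsent(containerId, k -> new ArrayList<>())
          .add(now);
    }
  }
}
{code}

When the forgotten in-flight copies eventually finish, the replica count
overshoots (12 replicas here), and SCM then has to ask datanodes to delete
the surplus.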



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] leosunli commented on a change in pull request #1033: HDDS-3667. If we gracefully stop datanode it would be better to notify scm and r…

2020-06-10 Thread GitBox


leosunli commented on a change in pull request #1033:
URL: https://github.com/apache/hadoop-ozone/pull/1033#discussion_r438540923



##
File path: 
hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/states/endpoint/UnRegisterEndpointTask.java
##
@@ -0,0 +1,262 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership.  The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations 
under
+ * the License.
+ */
+package org.apache.hadoop.ozone.container.common.states.endpoint;
+
+import java.io.IOException;
+import java.util.UUID;
+import java.util.concurrent.Callable;
+import java.util.concurrent.Future;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.hadoop.hdds.conf.ConfigurationSource;
+import org.apache.hadoop.hdds.protocol.DatanodeDetails;
+import 
org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.ContainerReportsProto;
+import 
org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.NodeReportProto;
+import 
org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.PipelineReportsProto;
+import 
org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.SCMUNRegisteredResponseProto;
+import 
org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine;
+import 
org.apache.hadoop.ozone.container.common.statemachine.EndpointStateMachine.EndPointStates;
+import org.apache.hadoop.ozone.container.common.statemachine.StateContext;
+import org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.google.common.annotations.VisibleForTesting;
+import com.google.common.base.Preconditions;
+
+/**
+ * UnRegister a datanode with SCM.
+ */
+public final class UnRegisterEndpointTask implements
+    Callable<EndpointStateMachine.EndPointStates> {
+  static final Logger LOG =
+      LoggerFactory.getLogger(UnRegisterEndpointTask.class);
+
+  private final EndpointStateMachine rpcEndPoint;
+  private final ConfigurationSource conf;
+  private Future<EndpointStateMachine.EndPointStates> result;
+  private DatanodeDetails datanodeDetails;
+  private final OzoneContainer datanodeContainerManager;
+  private StateContext stateContext;
+
+  /**
+   * Creates an unregister endpoint task.
+   *
+   * @param rpcEndPoint - endpoint
+   * @param conf - conf
+   * @param ozoneContainer - container
+   */
+  @VisibleForTesting
+  public UnRegisterEndpointTask(EndpointStateMachine rpcEndPoint,
+      ConfigurationSource conf, OzoneContainer ozoneContainer,
+      StateContext context) {
+    this.rpcEndPoint = rpcEndPoint;
+    this.conf = conf;
+    this.datanodeContainerManager = ozoneContainer;
+    this.stateContext = context;
+  }
+
+  /**
+   * Get the DatanodeDetails.
+   *
+   * @return DatanodeDetailsProto
+   */
+  public DatanodeDetails getDatanodeDetails() {
+    return datanodeDetails;
+  }
+
+  /**
+   * Set the containerNodeID Proto.
+   *
+   * @param datanodeDetails - Container Node ID.
+   */
+  public void setDatanodeDetails(
+      DatanodeDetails datanodeDetails) {
+    this.datanodeDetails = datanodeDetails;
+  }
+
+  /**
+   * Computes a result, or throws an exception if unable to do so.
+   *
+   * @return computed result
+   * @throws Exception if unable to compute a result
+   */
+  @Override
+  public EndpointStateMachine.EndPointStates call() throws Exception {
+
+    if (getDatanodeDetails() == null) {
+      LOG.error("DatanodeDetails cannot be null in UnRegisterEndpoint task, "
+          + "shutting down the endpoint.");
+      return rpcEndPoint.setState(
+          EndpointStateMachine.EndPointStates.SHUTDOWN);
+    }
+
+    rpcEndPoint.lock();
+    try {
+
+      if (rpcEndPoint.getState()
+          .equals(EndPointStates.SHUTDOWN)) {
+        ContainerReportsProto containerReport =
+            datanodeContainerManager.getController().getContainerReport();
+        NodeReportProto nodeReport = datanodeContainerManager.getNodeReport();
+        PipelineReportsProto pipelineReportsProto =
+            datanodeContainerManager.getPipelineReport();
+        // TODO : Add responses to the command Queue.
+        SCMUNRegisteredResponseProto response = rpcEndPoint.getEndPoint()
+            .unregister(datanodeDetails.getProtoBufMessage(), nodeReport,
+                containerReport, 

[jira] [Updated] (HDDS-3481) SCM ask too many datanodes to replicate the same container

2020-06-10 Thread Nanda kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-3481:
--
Labels: Triaged pull-request-available  (was: TriagePending 
pull-request-available)

> SCM ask too many datanodes to replicate the same container
> --
>
> Key: HDDS-3481
> URL: https://issues.apache.org/jira/browse/HDDS-3481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
>  Labels: Triaged, pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>
> *What's the problem?*
> As the images show, SCM asked 31 datanodes to replicate container 2037 every
> 10 minutes, starting from 2020-04-17 23:38:51. Then, at 2020-04-18 08:58:52,
> SCM found that the replica count of container 2037 was 12, so it asked 11
> datanodes to delete container 2037.
>  !screenshot-1.png! 
>  !screenshot-2.png! 
> *What's the reason?*
> SCM checks whether (container replica count +
> inflightReplication.get(containerId).size() -
> inflightDeletion.get(containerId).size()) is less than 3. If it is, SCM asks
> some datanode to replicate the container and adds the action to
> inflightReplication.get(containerId). The replicate action times out after
> 10 minutes; on timeout, SCM removes the action from
> inflightReplication.get(containerId), as the image shows. The sum above then
> drops below 3 again, and SCM asks yet another datanode to replicate the
> container.
> Because replicating a container takes a long time and sometimes cannot
> finish within 10 minutes, 31 datanodes ended up replicating the container
> every 10 minutes. 19 of the 31 datanodes replicated the container from the
> same source datanode, which also put heavy pressure on that source datanode
> and made replication even slower. In fact, the first replication took 4
> hours to finish.
>  !screenshot-4.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] ChenSammi opened a new pull request #1054: Hdds 3772. Add LOG to S3ErrorTable for easier problem locating.

2020-06-10 Thread GitBox


ChenSammi opened a new pull request #1054:
URL: https://github.com/apache/hadoop-ozone/pull/1054


   https://issues.apache.org/jira/browse/HDDS-3772



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3772) Add LOG to S3ErrorTable for easier problem locating

2020-06-10 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3772:
-
Description: 
Currently it's hard to directly tell the failure reason when something 
unexpected happens. Here is an example from downloading a file through the 
AWS Java SDK.

com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon 
S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 
0a5f2404-71bb-4edf-b488-2a5de9f6b753), S3 Extended Request ID: rv8deQRJyX3zCEk
at 
com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1389)
at 
com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:902)
at 
com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:607)
at 
com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
at 
com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3826)
at 
com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1015)
at 
com.amazonaws.services.s3.transfer.TransferManager.doDownload(TransferManager.java:939)
at 
com.amazonaws.services.s3.transfer.TransferManager.download(TransferManager.java:795)
at 
com.amazonaws.services.s3.transfer.TransferManager.download(TransferManager.java:713)
at 
com.amazonaws.services.s3.transfer.TransferManager.download(TransferManager.java:667)
at AWSS3UtilTest$AWSS3Util.download(AWSS3UtilTest.java:213)
at AWSS3UtilTest.test08_downloadAsyn(AWSS3UtilTest.java:107)
at AWSS3UtilTest.main(AWSS3UtilTest.java:47)
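
A hedged sketch of the kind of server-side logging this proposes; the class
and method names below are assumptions for illustration, not the actual patch:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only: the real S3ErrorTable API may differ.
public final class S3ErrorTableLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(S3ErrorTableLoggingSketch.class);

  private S3ErrorTableLoggingSketch() {
  }

  static RuntimeException newError(String errorCode, String resource,
      Exception cause) {
    // Logging the underlying cause on the server side lets operators see
    // much more than the bare "404 Not Found" the S3 client receives.
    LOG.error("Returning S3 error {} for resource {}", errorCode, resource,
        cause);
    return new RuntimeException(errorCode + ": " + resource, cause);
  }
}
{code}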

> Add LOG to S3ErrorTable for easier problem locating
> ---
>
> Key: HDDS-3772
> URL: https://issues.apache.org/jira/browse/HDDS-3772
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>
> Currently it's hard to directly tell the failure reason when something 
> unexpected happens. Here is an example from downloading a file through the 
> AWS Java SDK.
> com.amazonaws.services.s3.model.AmazonS3Exception: Not Found (Service: Amazon 
> S3; Status Code: 404; Error Code: 404 Not Found; Request ID: 
> 0a5f2404-71bb-4edf-b488-2a5de9f6b753), S3 Extended Request ID: rv8deQRJyX3zCEk
>   at 
> com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1389)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:902)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:607)
>   at 
> com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
>   at 
> com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
>   at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3826)
>   at 
> com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1015)
>   at 
> com.amazonaws.services.s3.transfer.TransferManager.doDownload(TransferManager.java:939)
>   at 
> com.amazonaws.services.s3.transfer.TransferManager.download(TransferManager.java:795)
>   at 
> com.amazonaws.services.s3.transfer.TransferManager.download(TransferManager.java:713)
>   at 
> com.amazonaws.services.s3.transfer.TransferManager.download(TransferManager.java:667)
>   at AWSS3UtilTest$AWSS3Util.download(AWSS3UtilTest.java:213)
>   at AWSS3UtilTest.test08_downloadAsyn(AWSS3UtilTest.java:107)
>   at AWSS3UtilTest.main(AWSS3UtilTest.java:47)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3776) Upgrading RocksDB version to avoid java heap issue

2020-06-10 Thread Li Cheng (Jira)
Li Cheng created HDDS-3776:
--

 Summary: Upgrading RocksDB version to avoid java heap issue
 Key: HDDS-3776
 URL: https://issues.apache.org/jira/browse/HDDS-3776
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: upgrade
Affects Versions: 0.5.0
Reporter: Li Cheng


Currently we are on RocksDB 6.6.4, and there are some JVM crashes in tests 
(seen in [https://github.com/apache/hadoop-ozone/pull/1019]) related to a 
RocksDB core dump. We may upgrade to 6.8.1 to avoid this issue.

{{# JRE version: Java(TM) SE Runtime Environment (8.0_211-b12) (build 1.8.0_211-b12)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [librocksdbjni2954960755376440018.jnilib+0x602b8]  rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8

See full dump at 
[https://the-asf.slack.com/files/U0159PV5Z6U/F0152UAJF0S/hs_err_pid90655.log?origin_team=T4S1WH2J3_channel=D014L2URB6E](url)}}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1019: HDDS-3679. Add unit tests for PipelineManagerV2.

2020-06-10 Thread GitBox


timmylicheng commented on pull request #1019:
URL: https://github.com/apache/hadoop-ozone/pull/1019#issuecomment-642370476


   @elek Tests seem to have passed here. I created 
https://issues.apache.org/jira/browse/HDDS-3776 to track the RocksDB upgrade.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3499) Address compatibility issue by SCM DB instances change

2020-06-10 Thread Li Cheng (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132853#comment-17132853
 ] 

Li Cheng commented on HDDS-3499:


[~arp] Our internal production deployment is still on schedule, but we have 
done internal tests to verify that the steps work. Resolving this now...

> Address compatibility issue by SCM DB instances change
> --
>
> Key: HDDS-3499
> URL: https://issues.apache.org/jira/browse/HDDS-3499
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Li Cheng
>Assignee: Marton Elek
>Priority: Blocker
>  Labels: Triaged
>
> After https://issues.apache.org/jira/browse/HDDS-3172, SCM now has one single 
> rocksdb instance instead of multiple db instances. 
> For running Ozone clusters, we need to address compatibility issues. One 
> possible way is to have a side-way tool that migrates the old metadata from 
> the multiple DBs into the current single DB.
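
A minimal sketch of what such a migration tool could look like, assuming the
plain RocksDB JNI API (the real tool would likely go through Ozone's DBStore
abstraction instead):

{code:java}
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;

// Hedged sketch only; paths, handles and error handling are illustrative.
public class ScmDbMigrationSketch {

  /** Copy every key/value of one old DB into a column family of the new DB. */
  static void migrate(String oldDbPath, RocksDB newDb,
      ColumnFamilyHandle targetCf) throws RocksDBException {
    try (Options opts = new Options();
         RocksDB oldDb = RocksDB.openReadOnly(opts, oldDbPath);
         RocksIterator it = oldDb.newIterator()) {
      for (it.seekToFirst(); it.isValid(); it.next()) {
        newDb.put(targetCf, it.key(), it.value());
      }
    }
  }
}
{code}

Repeated once per old DB instance, this would produce the consolidated
single-DB layout.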



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3499) Address compatibility issue by SCM DB instances change

2020-06-10 Thread Li Cheng (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Cheng resolved HDDS-3499.

Fix Version/s: 0.6.0
   Resolution: Fixed

> Address compatibility issue by SCM DB instances change
> --
>
> Key: HDDS-3499
> URL: https://issues.apache.org/jira/browse/HDDS-3499
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Li Cheng
>Assignee: Marton Elek
>Priority: Blocker
>  Labels: Triaged
> Fix For: 0.6.0
>
>
> After https://issues.apache.org/jira/browse/HDDS-3172, SCM now has one single 
> rocksdb instance instead of multiple db instances. 
> For running Ozone clusters, we need to address compatibility issues. One 
> possible way is to have a side-way tool that migrates the old metadata from 
> the multiple DBs into the current single DB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] codecov-commenter edited a comment on pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


codecov-commenter edited a comment on pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#issuecomment-642271726


   # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=h1) 
Report
   > Merging 
[#986](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=desc) into 
[master](https://codecov.io/gh/apache/hadoop-ozone/commit/f7e95d9b015e764ca93cfe2ccfc96d95160931bc=desc)
 will **decrease** coverage by `0.10%`.
   > The diff coverage is `65.28%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hadoop-ozone/pull/986/graphs/tree.svg?width=650=150=pr=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #986      +/-   ##
   ============================================
   - Coverage     69.48%    69.38%   -0.11%
   + Complexity     9110      9102       -8
   ============================================
     Files           961       961
     Lines         48132     48123       -9
     Branches       4672      4676       +4
   ============================================
   - Hits          33446     33388      -58
   - Misses        12468     12519      +51
   + Partials       2218      2216       -2
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...main/java/org/apache/hadoop/ozone/OzoneConsts.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3Avb3pvbmUvT3pvbmVDb25zdHMuamF2YQ==)
 | `84.21% <ø> (ø)` | `1.00 <0.00> (ø)` | |
   | 
[...apache/hadoop/hdds/utils/db/RocksDBCheckpoint.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvZnJhbWV3b3JrL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy91dGlscy9kYi9Sb2Nrc0RCQ2hlY2twb2ludC5qYXZh)
 | `90.90% <ø> (+0.90%)` | `5.00 <0.00> (-3.00)` | :arrow_up: |
   | 
[.../java/org/apache/hadoop/ozone/om/OMConfigKeys.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL2NvbW1vbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL29tL09NQ29uZmlnS2V5cy5qYXZh)
 | `100.00% <ø> (ø)` | `1.00 <0.00> (ø)` | |
   | 
[.../apache/hadoop/ozone/om/OMDBCheckpointServlet.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9PTURCQ2hlY2twb2ludFNlcnZsZXQuamF2YQ==)
 | `66.26% <ø> (-4.27%)` | `8.00 <0.00> (-2.00)` | |
   | 
[...a/org/apache/hadoop/ozone/om/ha/OMNodeDetails.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9oYS9PTU5vZGVEZXRhaWxzLmphdmE=)
 | `86.66% <ø> (ø)` | `12.00 <0.00> (ø)` | |
   | 
[...p/ozone/om/ratis/utils/OzoneManagerRatisUtils.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9yYXRpcy91dGlscy9Pem9uZU1hbmFnZXJSYXRpc1V0aWxzLmphdmE=)
 | `67.44% <0.00%> (-19.13%)` | `39.00 <0.00> (ø)` | |
   | 
[.../java/org/apache/hadoop/ozone/om/OzoneManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9Pem9uZU1hbmFnZXIuamF2YQ==)
 | `64.22% <12.50%> (-0.37%)` | `185.00 <1.00> (-1.00)` | |
   | 
[...adoop/ozone/om/ratis/OzoneManagerStateMachine.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9yYXRpcy9Pem9uZU1hbmFnZXJTdGF0ZU1hY2hpbmUuamF2YQ==)
 | `58.03% <90.00%> (+2.29%)` | `27.00 <4.00> (+1.00)` | |
   | 
[.../org/apache/hadoop/hdds/scm/pipeline/Pipeline.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zY20vcGlwZWxpbmUvUGlwZWxpbmUuamF2YQ==)
 | `85.71% <100.00%> (+0.20%)` | `44.00 <0.00> (+1.00)` | |
   | 
[.../org/apache/hadoop/ozone/om/helpers/OmKeyInfo.java](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree#diff-aGFkb29wLW96b25lL2NvbW1vbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvaGFkb29wL296b25lL29tL2hlbHBlcnMvT21LZXlJbmZvLmphdmE=)
 | `86.25% <100.00%> (+0.33%)` | `42.00 <0.00> (+2.00)` | |
   | ... and [28 
more](https://codecov.io/gh/apache/hadoop-ozone/pull/986/diff?src=pr=tree-more)
 | |
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   

[jira] [Updated] (HDDS-3737) Improve OM performance

2020-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3737:
-
Labels: pull-request-available  (was: )

> Improve OM performance
> --
>
> Key: HDDS-3737
> URL: https://issues.apache.org/jira/browse/HDDS-3737
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] runzhiwang opened a new pull request #1053: HDDS-3737. Avoid serialization between UUID and String

2020-06-10 Thread GitBox


runzhiwang opened a new pull request #1053:
URL: https://github.com/apache/hadoop-ozone/pull/1053


   ## What changes were proposed in this pull request?
   **What's the problem?**
   
   Serialization between UUID and String, i.e. UUID.toString (I have improved 
this) and UUID.fromString, not only costs CPU, because encoding and decoding 
the String and UUID.fromString are both CPU-intensive, but also makes the 
proto bigger: a UUID is just a 16-byte number, while converting it to a 
String needs 32 bytes.
   
   **How to fix?**
   Actually, the JDK implements UUID with two long numbers: `mostSigBits` and 
`leastSigBits`. In `UUID.fromString`, the JDK parses `mostSigBits` and 
`leastSigBits` from the String and constructs a new UUID object. So we can 
carry a UUID as 2 long numbers in the proto, which makes serializing and 
deserializing a UUID much faster and makes the proto smaller.
   
   
![image](https://user-images.githubusercontent.com/51938049/84329780-37fed080-abb8-11ea-8b49-a981334fcb8c.png)
   
![image](https://user-images.githubusercontent.com/51938049/84329867-6f6d7d00-abb8-11ea-8815-71b7ae57d4c1.png)
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3763
   
   ## How was this patch tested?
   
   Existing tests.
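   
   A minimal sketch of the conversion (class and method names here are
   illustrative; the actual proto layout may differ):
   
   ```java
   import java.util.UUID;
   
   // Sketch: serialize a UUID as two longs instead of a 36-char string.
   public final class UuidLongsSketch {
   
     // Write path: no String is built at all.
     static long[] toLongs(UUID uuid) {
       return new long[] {
           uuid.getMostSignificantBits(), uuid.getLeastSignificantBits()};
     }
   
     // Read path: no String parsing, just the two-long constructor.
     static UUID fromLongs(long mostSigBits, long leastSigBits) {
       return new UUID(mostSigBits, leastSigBits);
     }
   }
   ```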



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] codecov-commenter commented on pull request #1002: HDDS-3642. Stop/Pause Background services while replacing OM DB with checkpoint from Leader

2020-06-10 Thread GitBox


codecov-commenter commented on pull request #1002:
URL: https://github.com/apache/hadoop-ozone/pull/1002#issuecomment-642325738


   # 
[Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/1002?src=pr=h1) 
Report
   > Merging 
[#1002](https://codecov.io/gh/apache/hadoop-ozone/pull/1002?src=pr=desc) 
into 
[master](https://codecov.io/gh/apache/hadoop-ozone/commit/f7e95d9b015e764ca93cfe2ccfc96d95160931bc=desc)
 will **decrease** coverage by `0.03%`.
   > The diff coverage is `0.00%`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/graphs/tree.svg?width=650=150=pr=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/1002?src=pr=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #1002      +/-   ##
   ============================================
   - Coverage     69.48%    69.45%   -0.04%
   - Complexity     9110      9114       +4
   ============================================
     Files           961       961
     Lines         48132     48155      +23
     Branches       4672      4679       +7
   ============================================
     Hits          33446     33446
   - Misses        12468     12494      +26
   + Partials       2218      2215       -3
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/hadoop-ozone/pull/1002?src=pr=tree) | 
Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[.../java/org/apache/hadoop/ozone/om/OzoneManager.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9Pem9uZU1hbmFnZXIuamF2YQ==)
 | `64.27% <0.00%> (-0.32%)` | `186.00 <0.00> (ø)` | |
   | 
[...er/common/transport/server/GrpcXceiverService.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3RyYW5zcG9ydC9zZXJ2ZXIvR3JwY1hjZWl2ZXJTZXJ2aWNlLmphdmE=)
 | `70.00% <0.00%> (-10.00%)` | `3.00% <0.00%> (ø%)` | |
   | 
[...ache/hadoop/ozone/om/codec/S3SecretValueCodec.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9jb2RlYy9TM1NlY3JldFZhbHVlQ29kZWMuamF2YQ==)
 | `90.90% <0.00%> (-9.10%)` | `3.00% <0.00%> (-1.00%)` | |
   | 
[.../transport/server/ratis/ContainerStateMachine.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3RyYW5zcG9ydC9zZXJ2ZXIvcmF0aXMvQ29udGFpbmVyU3RhdGVNYWNoaW5lLmphdmE=)
 | `69.36% <0.00%> (-6.76%)` | `59.00% <0.00%> (-5.00%)` | |
   | 
[...ozone/container/ozoneimpl/ContainerController.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvb3pvbmVpbXBsL0NvbnRhaW5lckNvbnRyb2xsZXIuamF2YQ==)
 | `63.15% <0.00%> (-5.27%)` | `11.00% <0.00%> (-1.00%)` | |
   | 
[...iner/common/transport/server/ratis/CSMMetrics.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3RyYW5zcG9ydC9zZXJ2ZXIvcmF0aXMvQ1NNTWV0cmljcy5qYXZh)
 | `67.69% <0.00%> (-3.08%)` | `19.00% <0.00%> (-1.00%)` | |
   | 
[.../ozone/container/common/volume/AbstractFuture.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIvY29tbW9uL3ZvbHVtZS9BYnN0cmFjdEZ1dHVyZS5qYXZh)
 | `29.87% <0.00%> (-0.52%)` | `19.00% <0.00%> (-1.00%)` | |
   | 
[...doop/ozone/container/keyvalue/KeyValueHandler.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29udGFpbmVyLXNlcnZpY2Uvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9jb250YWluZXIva2V5dmFsdWUvS2V5VmFsdWVIYW5kbGVyLmphdmE=)
 | `61.55% <0.00%> (-0.45%)` | `63.00% <0.00%> (-1.00%)` | |
   | 
[...adoop/ozone/om/request/key/OMKeyCommitRequest.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLW96b25lL296b25lLW1hbmFnZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2hhZG9vcC9vem9uZS9vbS9yZXF1ZXN0L2tleS9PTUtleUNvbW1pdFJlcXVlc3QuamF2YQ==)
 | `97.00% <0.00%> (ø)` | `18.00% <0.00%> (+1.00%)` | |
   | 
[.../org/apache/hadoop/hdds/scm/pipeline/Pipeline.java](https://codecov.io/gh/apache/hadoop-ozone/pull/1002/diff?src=pr=tree#diff-aGFkb29wLWhkZHMvY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9oYWRvb3AvaGRkcy9zY20vcGlwZWxpbmUvUGlwZWxpbmUuamF2YQ==)
 | `85.71% <0.00%> (+0.20%)` | `44.00% <0.00%> (+1.00%)` | |
   | ... and [15 

[GitHub] [hadoop-ozone] codecov-commenter edited a comment on pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


codecov-commenter edited a comment on pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#issuecomment-642271726


   # [Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=h1) 
Report
   > :exclamation: No coverage uploaded for pull request base 
(`master@3328d7d`). [Click here to learn what that 
means](https://docs.codecov.io/docs/error-reference#section-missing-base-commit).
   > The diff coverage is `n/a`.
   
   [![Impacted file tree 
graph](https://codecov.io/gh/apache/hadoop-ozone/pull/986/graphs/tree.svg?width=650=150=pr=5YeeptJMby)](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=tree)
   
   ```diff
   @@            Coverage Diff            @@
   ##             master     #986   +/-   ##
   =========================================
     Coverage          ?   69.48%
     Complexity        ?     9112
   =========================================
     Files             ?      961
     Lines             ?    48107
     Branches          ?     4669
   =========================================
     Hits              ?    33428
     Misses            ?    12468
     Partials          ?     2211
   ```
   
   
   
   --
   
   [Continue to review full report at 
Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=continue).
   > **Legend** - [Click here to learn 
more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by 
[Codecov](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=footer). 
Last update 
[3328d7d...b2bda39](https://codecov.io/gh/apache/hadoop-ozone/pull/986?src=pr=lastupdated).
 Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 merged pull request #1052: HDDS-3749. Addendum: Fix checkstyle issue.

2020-06-10 Thread GitBox


bharatviswa504 merged pull request #1052:
URL: https://github.com/apache/hadoop-ozone/pull/1052


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1034: HDDS-3749. Improve OM performance with 3.7% by avoid stream.collect

2020-06-10 Thread GitBox


bharatviswa504 commented on pull request #1034:
URL: https://github.com/apache/hadoop-ozone/pull/1034#issuecomment-642305014


   Hi @xiaoyuyao 
   This has caused checkstyle issues, and PRs are failing in the CI run.
   Posted a PR to fix this
   https://github.com/apache/hadoop-ozone/pull/1052
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 opened a new pull request #1052: HDDS-3479. Addendum: Fix checkstyle issue.

2020-06-10 Thread GitBox


bharatviswa504 opened a new pull request #1052:
URL: https://github.com/apache/hadoop-ozone/pull/1052


   ## What changes were proposed in this pull request?
   
   (Please fill in changes proposed in this fix)
   
   ## What is the link to the Apache JIRA
   
   (Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HDDS-. Fix a typo in YYY.)
   
   Please replace this section with the link to the Apache JIRA)
   
   ## How was this patch tested?
   
   (Please explain how this patch was tested. Ex: unit tests, manual tests)
   (If this patch involves UI changes, please attach a screen-shot; otherwise, 
remove this)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3749) Improve OM performance with 3.7% by avoid stream.collect

2020-06-10 Thread Xiaoyu Yao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao resolved HDDS-3749.
--
Fix Version/s: 0.6.0
   Resolution: Fixed

> Improve OM performance with 3.7% by avoid stream.collect
> 
>
> Key: HDDS-3749
> URL: https://issues.apache.org/jira/browse/HDDS-3749
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png, screenshot-5.png
>
>
> I started an Ozone cluster with 1000 datanodes and 10 S3 gateways, ran it for 
> two weeks with a heavy workload, and profiled OM.
>  !screenshot-1.png! 
>  !screenshot-2.png! 
>  !screenshot-3.png! 
>  !screenshot-4.png! 
>  !screenshot-5.png! 
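
A minimal sketch of the pattern the title refers to, illustrative rather than
the actual OM code:

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Sketch: replacing stream().collect() with a pre-sized loop on hot paths.
public class AvoidCollectSketch {

  // Before: allocates the stream pipeline and collector on every call.
  static List<String> withStream(List<String> parts) {
    return parts.stream().map(String::trim).collect(Collectors.toList());
  }

  // After: a plain loop over a pre-sized ArrayList does the same work with
  // less per-call overhead, which adds up on OM's hot request paths.
  static List<String> withLoop(List<String> parts) {
    List<String> out = new ArrayList<>(parts.size());
    for (String p : parts) {
      out.add(p.trim());
    }
    return out;
  }
}
{code}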



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] xiaoyuyao merged pull request #1034: HDDS-3749. Improve OM performance with 3.7% by avoid stream.collect

2020-06-10 Thread GitBox


xiaoyuyao merged pull request #1034:
URL: https://github.com/apache/hadoop-ozone/pull/1034


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438435882



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
   }
 
   /**
-   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
-   * is the log index corresponding to the last applied transaction on the OM
-   * State Machine.
+   * Take OM Ratis snapshot is a dummy operation as when double buffer
+   * flushes the lastAppliedIndex is flushed to DB and that is used as
+   * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
   @Override
   public long takeSnapshot() throws IOException {
-LOG.info("Saving Ratis snapshot on the OM.");
-if (ozoneManager != null) {
-  return ozoneManager.saveRatisSnapshot().getIndex();
-}
-return 0;
+LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ozoneManager.getMetadataManager().getStore().flush();
+return lastAppliedIndex;

Review comment:
   So, that is the reason to get lastAppliedIndex first, then flush. If we 
change the order, it will lead to data loss.
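   
   Schematically, the ordering argument is (a sketch, not the actual OM code):
   
   ```java
   import java.util.concurrent.atomic.AtomicLong;
   
   // Sketch of why the applied index must be read before the flush.
   class SnapshotOrderSketch {
     interface Store {
       void flush();
     }
   
     // Safe: the flush covers at least everything up to the returned index,
     // so Ratis may purge logs up to it without losing state.
     static long readThenFlush(Store store, AtomicLong appliedIndex) {
       long lastAppliedIndex = appliedIndex.get();
       store.flush();
       return lastAppliedIndex;
     }
   
     // Unsafe: transactions applied between the flush and the read are
     // claimed by the returned index but were never flushed; if Ratis purges
     // logs up to that index, they are lost after a restart.
     static long flushThenRead(Store store, AtomicLong appliedIndex) {
       store.flush();
       return appliedIndex.get();
     }
   }
   ```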





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438435468



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
   }
 
   /**
-   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
-   * is the log index corresponding to the last applied transaction on the OM
-   * State Machine.
+   * Take OM Ratis snapshot is a dummy operation as when double buffer
+   * flushes the lastAppliedIndex is flushed to DB and that is used as
+   * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
   @Override
   public long takeSnapshot() throws IOException {
-LOG.info("Saving Ratis snapshot on the OM.");
-if (ozoneManager != null) {
-  return ozoneManager.saveRatisSnapshot().getIndex();
-}
-return 0;
+LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ozoneManager.getMetadataManager().getStore().flush();
+return lastAppliedIndex;

Review comment:
   Why would it lead to data loss? I am returning an already flushed index, 
and Ratis only purges logs which have been flushed to the DB.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438435468



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
   }
 
   /**
-   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
-   * is the log index corresponding to the last applied transaction on the OM
-   * State Machine.
+   * Take OM Ratis snapshot is a dummy operation as when double buffer
+   * flushes the lastAppliedIndex is flushed to DB and that is used as
+   * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
   @Override
   public long takeSnapshot() throws IOException {
-LOG.info("Saving Ratis snapshot on the OM.");
-if (ozoneManager != null) {
-  return ozoneManager.saveRatisSnapshot().getIndex();
-}
-return 0;
+LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ozoneManager.getMetadataManager().getStore().flush();
+return lastAppliedIndex;

Review comment:
   Why would it lead to data loss? We are returning an already flushed index, 
and Ratis only purges logs which have been flushed to the DB.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438434496



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
   }
 
   /**
-   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
-   * is the log index corresponding to the last applied transaction on the OM
-   * State Machine.
+   * Take OM Ratis snapshot is a dummy operation as when double buffer
+   * flushes the lastAppliedIndex is flushed to DB and that is used as
+   * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
   @Override
   public long takeSnapshot() throws IOException {
-LOG.info("Saving Ratis snapshot on the OM.");
-if (ozoneManager != null) {
-  return ozoneManager.saveRatisSnapshot().getIndex();
-}
-return 0;
+LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ozoneManager.getMetadataManager().getStore().flush();

Review comment:
   Done

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -515,13 +528,12 @@ private synchronized void 
computeAndUpdateLastAppliedIndex(
 }
   }
 
-  public void updateLastAppliedIndexWithSnaphsotIndex() {
+  public void updateLastAppliedIndexWithSnaphsotIndex() throws IOException {

Review comment:
   Done

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3168,8 +3172,8 @@ File replaceOMDBWithCheckpoint(long lastAppliedIndex, 
Path checkpointPath)
* All the classes which use/ store MetadataManager should also be updated
* with the new MetadataManager instance.
*/
-  void reloadOMState(long newSnapshotIndex,
-  long newSnapShotTermIndex) throws IOException {
+  void reloadOMState(long newSnapshotIndex, long newSnapShotTermIndex)
+  throws IOException {

Review comment:
   Done

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3033,32 +3024,47 @@ public TermIndex installSnapshot(String leaderId) {
 DBCheckpoint omDBcheckpoint = getDBCheckpointFromLeader(leaderId);
 Path newDBlocation = omDBcheckpoint.getCheckpointLocation();
 
-// Check if current ratis log index is smaller than the downloaded
-// snapshot index. If yes, proceed by stopping the ratis server so that
-// the OM state can be re-initialized. If no, then do not proceed with
-// installSnapshot.
+LOG.info("Downloaded checkpoint from Leader {}, in to the location {}",
+leaderId, newDBlocation);
+
 long lastAppliedIndex = omRatisServer.getLastAppliedTermIndex().getIndex();
-long checkpointSnapshotIndex = omDBcheckpoint.getRatisSnapshotIndex();
-long checkpointSnapshotTermIndex =
-omDBcheckpoint.getRatisSnapshotTerm();
-if (checkpointSnapshotIndex <= lastAppliedIndex) {
-  LOG.error("Failed to install checkpoint from OM leader: {}. The last " +
-  "applied index: {} is greater than or equal to the checkpoint's"
-  + " " +
-  "snapshot index: {}. Deleting the downloaded checkpoint {}",
-  leaderId,
-  lastAppliedIndex, checkpointSnapshotIndex,
+
+// Check if current ratis log index is smaller than the downloaded
+// checkpoint transaction index. If yes, proceed by stopping the ratis
+// server so that the OM state can be re-initialized. If no, then do not
+// proceed with installSnapshot.
+
+OMTransactionInfo omTransactionInfo = null;
+
+Path dbDir = newDBlocation.getParent();
+if (dbDir == null) {
+  LOG.error("Incorrect DB location path {} received from checkpoint.",
   newDBlocation);
-  try {
-FileUtils.deleteFully(newDBlocation);
-  } catch (IOException e) {
-LOG.error("Failed to fully delete the downloaded DB checkpoint {} " +
-"from OM leader {}.", newDBlocation,
-leaderId, e);
-  }
   return null;
 }
 
+try {
+  omTransactionInfo =
+  OzoneManagerRatisUtils.getTransactionInfoFromDownloadedSnapshot(
+  configuration, dbDir);
+} catch (Exception ex) {
+  LOG.error("Failed during opening downloaded snapshot from " +
+  "{} to obtain transaction index", newDBlocation, ex);
+  return null;
+}
+
+boolean canProceed =
+OzoneManagerRatisUtils.verifyTransactionInfo(omTransactionInfo,
+lastAppliedIndex, leaderId, newDBlocation);
+

Review comment:
   Done.

##
File path: 

[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#issuecomment-642292921


   Thank You @hanishakoneru for the review.
   I have addressed review comments.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


hanishakoneru commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438432283



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
   }
 
   /**
-   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
-   * is the log index corresponding to the last applied transaction on the OM
-   * State Machine.
+   * Take OM Ratis snapshot is a dummy operation as when double buffer
+   * flushes the lastAppliedIndex is flushed to DB and that is used as
+   * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
   @Override
   public long takeSnapshot() throws IOException {
-LOG.info("Saving Ratis snapshot on the OM.");
-if (ozoneManager != null) {
-  return ozoneManager.saveRatisSnapshot().getIndex();
-}
-return 0;
+LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ozoneManager.getMetadataManager().getStore().flush();
+return lastAppliedIndex;

Review comment:
   So this needs to be fixed then, or it could lead to data loss.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438431531



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerStateMachine.java
##
@@ -338,20 +352,19 @@ public void unpause(long newLastAppliedSnaphsotIndex,
   }
 
   /**
-   * Take OM Ratis snapshot. Write the snapshot index to file. Snapshot index
-   * is the log index corresponding to the last applied transaction on the OM
-   * State Machine.
+   * Take OM Ratis snapshot is a dummy operation as when double buffer
+   * flushes the lastAppliedIndex is flushed to DB and that is used as
+   * snapshot index.
*
* @return the last applied index on the state machine which has been
* stored in the snapshot file.
*/
   @Override
   public long takeSnapshot() throws IOException {
-LOG.info("Saving Ratis snapshot on the OM.");
-if (ozoneManager != null) {
-  return ozoneManager.saveRatisSnapshot().getIndex();
-}
-return 0;
+LOG.info("Current Snapshot Index {}", getLastAppliedTermIndex());
+long lastAppliedIndex = getLastAppliedTermIndex().getIndex();
+ozoneManager.getMetadataManager().getStore().flush();
+return lastAppliedIndex;

Review comment:
   Yes. It would not. 





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


bharatviswa504 commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438407648



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OzoneManagerSnapshotProvider.java
##
@@ -112,16 +112,16 @@ public OzoneManagerSnapshotProvider(ConfigurationSource 
conf,
*/
   public DBCheckpoint getOzoneManagerDBSnapshot(String leaderOMNodeID)
   throws IOException {
-String snapshotFileName = OM_SNAPSHOT_DB + "_" + 
System.currentTimeMillis();
-File targetFile = new File(omSnapshotDir, snapshotFileName + ".tar.gz");
+String snapshotTime = Long.toString(System.currentTimeMillis());
+String snapshotFileName = Paths.get(omSnapshotDir.getAbsolutePath(),
+snapshotTime, OM_DB_NAME).toFile().getAbsolutePath();
+File targetFile = new File(snapshotFileName + ".tar.gz");

Review comment:
   We still need this, as we use DBStore.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru commented on a change in pull request #986: HDDS-3476. Use persisted transaction info during OM startup in OM StateMachine.

2020-06-10 Thread GitBox


hanishakoneru commented on a change in pull request #986:
URL: https://github.com/apache/hadoop-ozone/pull/986#discussion_r438353082



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OmMetadataManagerImpl.java
##
@@ -259,16 +261,25 @@ public void start(OzoneConfiguration configuration) 
throws IOException {
 rocksDBConfiguration.setSyncOption(true);
   }
 
-  DBStoreBuilder dbStoreBuilder = DBStoreBuilder.newBuilder(configuration,
-  rocksDBConfiguration).setName(OM_DB_NAME)
-  .setPath(Paths.get(metaDir.getPath()));
+  this.store = loadDB(configuration, metaDir);
 
-  this.store = addOMTablesAndCodecs(dbStoreBuilder).build();
+  // This value will be used internally, not to be exposed to end users.

Review comment:
   We can remove this comment now.

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3033,32 +3024,47 @@ public TermIndex installSnapshot(String leaderId) {
 DBCheckpoint omDBcheckpoint = getDBCheckpointFromLeader(leaderId);
 Path newDBlocation = omDBcheckpoint.getCheckpointLocation();
 
-// Check if current ratis log index is smaller than the downloaded
-// snapshot index. If yes, proceed by stopping the ratis server so that
-// the OM state can be re-initialized. If no, then do not proceed with
-// installSnapshot.
+LOG.info("Downloaded checkpoint from Leader {}, in to the location {}",
+leaderId, newDBlocation);
+
 long lastAppliedIndex = omRatisServer.getLastAppliedTermIndex().getIndex();
-long checkpointSnapshotIndex = omDBcheckpoint.getRatisSnapshotIndex();
-long checkpointSnapshotTermIndex =
-omDBcheckpoint.getRatisSnapshotTerm();
-if (checkpointSnapshotIndex <= lastAppliedIndex) {
-  LOG.error("Failed to install checkpoint from OM leader: {}. The last " +
-  "applied index: {} is greater than or equal to the checkpoint's"
-  + " " +
-  "snapshot index: {}. Deleting the downloaded checkpoint {}",
-  leaderId,
-  lastAppliedIndex, checkpointSnapshotIndex,
+
+// Check if current ratis log index is smaller than the downloaded
+// checkpoint transaction index. If yes, proceed by stopping the ratis
+// server so that the OM state can be re-initialized. If no, then do not
+// proceed with installSnapshot.
+
+OMTransactionInfo omTransactionInfo = null;
+
+Path dbDir = newDBlocation.getParent();
+if (dbDir == null) {
+  LOG.error("Incorrect DB location path {} received from checkpoint.",
   newDBlocation);
-  try {
-FileUtils.deleteFully(newDBlocation);
-  } catch (IOException e) {
-LOG.error("Failed to fully delete the downloaded DB checkpoint {} " +
-"from OM leader {}.", newDBlocation,
-leaderId, e);
-  }
   return null;
 }
 
+try {
+  omTransactionInfo =
+  OzoneManagerRatisUtils.getTransactionInfoFromDownloadedSnapshot(
+  configuration, dbDir);
+} catch (Exception ex) {
+  LOG.error("Failed during opening downloaded snapshot from " +
+  "{} to obtain transaction index", newDBlocation, ex);
+  return null;
+}
+
+boolean canProceed =
+OzoneManagerRatisUtils.verifyTransactionInfo(omTransactionInfo,
+lastAppliedIndex, leaderId, newDBlocation);
+

Review comment:
   The lastAppliedIndex could have been updated between its assignment and 
the canProceed check. This check should be synchronous. Or at least the 
assignment should happen after reading the transactionInfo from DB.
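
   For illustration, the reordering suggested here could look like this (a 
sketch assembled only from names visible in the diff above, with error 
handling elided; not the final patch):

   ```java
   // Read the persisted transaction info first, then capture the local
   // lastAppliedIndex. New applies between the two reads can only raise
   // lastAppliedIndex, which makes the check more conservative.
   OMTransactionInfo omTransactionInfo =
       OzoneManagerRatisUtils.getTransactionInfoFromDownloadedSnapshot(
           configuration, dbDir);
   long lastAppliedIndex = omRatisServer.getLastAppliedTermIndex().getIndex();
   boolean canProceed =
       OzoneManagerRatisUtils.verifyTransactionInfo(omTransactionInfo,
           lastAppliedIndex, leaderId, newDBlocation);
   ```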

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/OzoneManager.java
##
@@ -3168,8 +3172,8 @@ File replaceOMDBWithCheckpoint(long lastAppliedIndex, 
Path checkpointPath)
* All the classes which use/ store MetadataManager should also be updated
* with the new MetadataManager instance.
*/
-  void reloadOMState(long newSnapshotIndex,
-  long newSnapShotTermIndex) throws IOException {
+  void reloadOMState(long newSnapshotIndex, long newSnapShotTermIndex)
+  throws IOException {

Review comment:
   NIT: SnapShot -> Snapshot

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/snapshot/OzoneManagerSnapshotProvider.java
##
@@ -112,16 +112,16 @@ public OzoneManagerSnapshotProvider(ConfigurationSource 
conf,
*/
   public DBCheckpoint getOzoneManagerDBSnapshot(String leaderOMNodeID)
   throws IOException {
-String snapshotFileName = OM_SNAPSHOT_DB + "_" + 
System.currentTimeMillis();
-File targetFile = new File(omSnapshotDir, snapshotFileName + ".tar.gz");
+String snapshotTime = Long.toString(System.currentTimeMillis());
+String snapshotFileName = Paths.get(omSnapshotDir.getAbsolutePath(),
+snapshotTime, OM_DB_NAME).toFile().getAbsolutePath();
+File targetFile = new File(snapshotFileName + ".tar.gz");

[jira] [Commented] (HDDS-1134) OzoneFileSystem#create should allocate alteast one block for future writes.

2020-06-10 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132701#comment-17132701
 ] 

Bharat Viswanadham commented on HDDS-1134:
--

Hi [~msingh]
I see this is being handled in OzoneManager: if the length passed is zero, we 
allocate at least one block.

Code link. 
[#codelink|https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyCreateRequest.java#L104]


> OzoneFileSystem#create should allocate alteast one block for future writes.
> ---
>
> Key: HDDS-1134
> URL: https://issues.apache.org/jira/browse/HDDS-1134
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: TriagePending
> Fix For: 0.6.0
>
> Attachments: HDDS-1134.001.patch
>
>
> While opening a new key, OM should allocate at least one block for the key. 
> This should be done in case the client is not sure about the number of 
> blocks. However, for users of OzoneFS, if the key is being created for a 
> directory, then no blocks should be allocated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1134) OzoneFileSystem#create should allocate alteast one block for future writes.

2020-06-10 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-1134.
--
Fix Version/s: 0.6.0
   Resolution: Fixed

This has already been fixed. Right now, we allocate at least one block in the 
createKey call.
This was taken care of during the OM HA refactor.


> OzoneFileSystem#create should allocate alteast one block for future writes.
> ---
>
> Key: HDDS-1134
> URL: https://issues.apache.org/jira/browse/HDDS-1134
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.0
>Reporter: Mukul Kumar Singh
>Assignee: Mukul Kumar Singh
>Priority: Major
>  Labels: TriagePending
> Fix For: 0.6.0
>
> Attachments: HDDS-1134.001.patch
>
>
> While opening a new key, OM should allocate at least one block for the key. 
> This should be done in case the client is not sure about the number of 
> blocks. However, for users of OzoneFS, if the key is being created for a 
> directory, then no blocks should be allocated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] smengcl edited a comment on pull request #1046: HDDS-3767. [OFS] Address merge conflicts after HDDS-3627

2020-06-10 Thread GitBox


smengcl edited a comment on pull request #1046:
URL: https://github.com/apache/hadoop-ozone/pull/1046#issuecomment-642120503


   > Thanks for this patch @smengcl (And sorry for the master changes, we 
worked in parallel.)
   > 
   > Understanding this PR is really challenging. Finally, I fetched the PR 
branch, compared it with master, and everything seems to be in the right place.
   
   Thanks for the review @elek. Actually I put a [compare 
link](https://github.com/smengcl/hadoop-ozone/compare/HDDS-2665-ofs...HDDS-3767)
 in a previous comment which should have made the review easier in theory.
   
   > 
   > One question: Why did you delete 
`TestRootedOzoneFileSystemWithMocks.java`?
   
   I removed `TestRootedOzoneFileSystemWithMocks.java` because HDDS-3627 
removed `TestOzoneFileSystemWithMocks.java`.
   
   I have just restored `TestRootedOzoneFileSystemWithMocks.java` under 
`ozonefs`.
   
   > 
   > And one comment: `META-INF/services/...FileSystem` entries can be created 
for ofs, too. (In the future)
   
   I believe we could only put one implementation 
[here](https://github.com/apache/hadoop-ozone/blob/072370b947416d89fae11d00a84a1d9a6b31beaa/hadoop-ozone/ozonefs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem#L16)?
 Maybe later we can replace `org.apache.hadoop.fs.ozone.OzoneFileSystem` with 
`org.apache.hadoop.fs.ozone.RootedOzoneFileSystem`.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3685) Remove replay logic from actual request logic

2020-06-10 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-3685:
-
Priority: Critical  (was: Major)

> Remove replay logic from actual request logic
> -
>
> Key: HDDS-3685
> URL: https://issues.apache.org/jira/browse/HDDS-3685
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Critical
>
> HDDS-3476 used the transaction info persisted in the OM DB during double 
> buffer flush when OM is restarted. The log index and term from this 
> transaction info are used as the snapshot index. So we can remove the replay 
> logic from the actual request logic. (A transaction that has been applied to 
> the OM DB will never be replayed to the DB again.)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-3707) UUID can be non unique for a huge samples

2020-06-10 Thread Arpit Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132675#comment-17132675
 ] 

Arpit Agarwal edited comment on HDDS-3707 at 6/10/20, 7:52 PM:
---

Hi [~maobaolong] the probability is so infinitesimal I don't think it is worth 
trying to change it. :)


was (Author: arpitagarwal):
Hi [~maobaolong] the probability is so infinitesimal I don't think it is worth 
trying to fix it. :)

> UUID can be non unique for a huge samples
> -
>
> Key: HDDS-3707
> URL: https://issues.apache.org/jira/browse/HDDS-3707
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, Ozone Manager, SCM
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Minor
>  Labels: Triaged
>
> Now, we use UUIDs as IDs in many places, for example DataNodeId and 
> pipelineId. I believe the chance of a collision is quite small, but if we 
> ever hit a collision, we are in trouble.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3707) UUID can be non unique for a huge samples

2020-06-10 Thread Arpit Agarwal (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17132675#comment-17132675
 ] 

Arpit Agarwal commented on HDDS-3707:
-

Hi [~maobaolong] the probability is so infinitesimal I don't think it is worth 
trying to fix it. :)

> UUID can be non unique for a huge samples
> -
>
> Key: HDDS-3707
> URL: https://issues.apache.org/jira/browse/HDDS-3707
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode, Ozone Manager, SCM
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Minor
>  Labels: Triaged
>
> Now, we use UUIDs as IDs in many places, for example DataNodeId and 
> pipelineId. I believe the chance of a collision is quite small, but if we 
> ever hit a collision, we are in trouble.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3639) Maintain FileHandle Information in OMMetadataManager

2020-06-10 Thread Hanisha Koneru (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru resolved HDDS-3639.
--
Resolution: Fixed

> Maintain FileHandle Information in OMMetadataManager
> 
>
> Key: HDDS-3639
> URL: https://issues.apache.org/jira/browse/HDDS-3639
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Filesystem
>Reporter: Prashant Pogde
>Assignee: Prashant Pogde
>Priority: Major
>  Labels: pull-request-available
>
> Maintain FileHandle Information in OMMetadataManager.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3775) Add documentation for flame graph

2020-06-10 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDDS-3775:
-

 Summary: Add documentation for flame graph 
 Key: HDDS-3775
 URL: https://issues.apache.org/jira/browse/HDDS-3775
 Project: Hadoop Distributed Data Store
  Issue Type: Task
Reporter: Wei-Chiu Chuang


HDDS-1116 added the flame graph, but it looks like there is no documentation on 
how to enable it.

To enable it:
* add the configuration hdds.profiler.endpoint.enabled = true to ozone-site.xml
* download the profiler from 
https://github.com/jvm-profiling-tools/async-profiler to a local directory, say 
/tmp, and start the DataNode with the java system property 
-Dasync.profiler.home=/tmp or the environment variable $ASYNC_PROFILER_HOME

Then go to the datanode servlet, say dn1:9883/prof, to see the graph.
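
The ozone-site.xml entry is the standard Hadoop property block; a minimal 
sketch of the setting named above:

{code:xml}
<!-- ozone-site.xml: enable the profiler servlet described above -->
<property>
  <name>hdds.profiler.endpoint.enabled</name>
  <value>true</value>
</property>
{code}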



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] vivekratnavel commented on pull request #1047: HDDS-3726. Upload code coverage data to Codecov and enable checks in …

2020-06-10 Thread GitBox


vivekratnavel commented on pull request #1047:
URL: https://github.com/apache/hadoop-ozone/pull/1047#issuecomment-642162869


   @elek Thanks for the review and merge!



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] maobaolong commented on pull request #1051: Redundancy if condition code in ListPipelinesSubcommand

2020-06-10 Thread GitBox


maobaolong commented on pull request #1051:
URL: https://github.com/apache/hadoop-ozone/pull/1051#issuecomment-642148224


   @bhemanthkumar Thanks for working on this. Please fix the style problem. 
Also, please update the description using the given template. 
   
   Reference this PR. https://github.com/apache/hadoop-ozone/pull/920



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] maobaolong commented on a change in pull request #1051: Redundancy if condition code in ListPipelinesSubcommand

2020-06-10 Thread GitBox


maobaolong commented on a change in pull request #1051:
URL: https://github.com/apache/hadoop-ozone/pull/1051#discussion_r438285600



##
File path: 
hadoop-hdds/tools/src/main/java/org/apache/hadoop/hdds/scm/cli/pipeline/ListPipelinesSubcommand.java
##
@@ -54,17 +54,13 @@
   @Override
   public Void call() throws Exception {
 try (ScmClient scmClient = parent.getParent().createScmClient()) {
-  if (Strings.isNullOrEmpty(factor) && Strings.isNullOrEmpty(state)) {
-scmClient.listPipelines().forEach(System.out::println);
-  } else {
-scmClient.listPipelines().stream()
-.filter(p -> ((Strings.isNullOrEmpty(factor) ||
-(p.getFactor().toString().compareToIgnoreCase(factor) == 0))
-&& (Strings.isNullOrEmpty(state) ||
-(p.getPipelineState().toString().compareToIgnoreCase(state)
-== 0
+   scmClient.listPipelines().stream()

Review comment:
   Please reduce the indent to fix the checkstyle failure.
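
   For illustration, the consolidated filter could be laid out like this (a 
sketch using the fields from the diff above; equalsIgnoreCase is assumed as an 
equivalent of the original compareToIgnoreCase(...) == 0):

   ```java
   // One stream pipeline covers both the filtered and unfiltered cases,
   // because an empty factor/state argument makes its predicate pass.
   scmClient.listPipelines().stream()
       .filter(p -> (Strings.isNullOrEmpty(factor)
           || p.getFactor().toString().equalsIgnoreCase(factor))
           && (Strings.isNullOrEmpty(state)
           || p.getPipelineState().toString().equalsIgnoreCase(state)))
       .forEach(System.out::println);
   ```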





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3726) Upload code coverage to Codecov and enable checks in PR workflow of Github Actions

2020-06-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek updated HDDS-3726:
--
Fix Version/s: 0.6.0

> Upload code coverage to Codecov and enable checks in PR workflow of Github 
> Actions
> --
>
> Key: HDDS-3726
> URL: https://issues.apache.org/jira/browse/HDDS-3726
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: build
>Affects Versions: 0.6.0
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>
> HDDS-3170 aggregates code coverage across all components. We need to upload 
> the reports to codecov to be able to keep track of coverage and coverage 
> diffs to be able to tell if a PR does not do a good job on writing unit tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] smengcl commented on pull request #1046: HDDS-3767. [OFS] Address merge conflicts after HDDS-3627

2020-06-10 Thread GitBox


smengcl commented on pull request #1046:
URL: https://github.com/apache/hadoop-ozone/pull/1046#issuecomment-642120503


   > Thanks for this patch @smengcl (And sorry for the master changes, we 
worked in parallel.)
   > 
   > Understanding this PR is really challenging. Finally, I fetched the PR 
branch, compared it with master, and everything seems to be in the right place.
   
   Thanks for the review @elek. Actually I put a [compare 
link](https://github.com/smengcl/hadoop-ozone/compare/HDDS-2665-ofs...HDDS-3767)
 in a previous comment which should have made the review easier in theory.
   
   > 
   > One question: Why did you delete 
`TestRootedOzoneFileSystemWithMocks.java`?
   
   I removed `TestRootedOzoneFileSystemWithMocks.java` because HDDS-3627 
removed `TestOzoneFileSystemWithMocks.java`. Shall we put it back?
   
   > 
   > And one comment: `META-INF/services/...FileSystem` entries can be created 
for ofs, too. (In the future)
   
   I believe we could only put one implementation 
[here](https://github.com/apache/hadoop-ozone/blob/072370b947416d89fae11d00a84a1d9a6b31beaa/hadoop-ozone/ozonefs/src/main/resources/META-INF/services/org.apache.hadoop.fs.FileSystem#L16)?
 Maybe later we can replace `org.apache.hadoop.fs.ozone.OzoneFileSystem` with 
`org.apache.hadoop.fs.ozone.RootedOzoneFileSystem`.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-3747) Remove the redundancy if condition code in ListPipelinesSubcommand

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130830#comment-17130830
 ] 

hemanthboyina edited comment on HDDS-3747 at 6/10/20, 4:02 PM:
---

raised a PR : [https://github.com/apache/hadoop-ozone/pull/1051]

please review


was (Author: hemanthboyina):
raised a PR : [https://github.com/apache/hadoop-ozone/pull/1051]

> Remove the redundancy if condition code in ListPipelinesSubcommand
> --
>
> Key: HDDS-3747
> URL: https://issues.apache.org/jira/browse/HDDS-3747
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone CLI
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3747) Remove the redundancy if condition code in ListPipelinesSubcommand

2020-06-10 Thread hemanthboyina (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130830#comment-17130830
 ] 

hemanthboyina commented on HDDS-3747:
-

raised a PR : [https://github.com/apache/hadoop-ozone/pull/1051]

> Remove the redundancy if condition code in ListPipelinesSubcommand
> --
>
> Key: HDDS-3747
> URL: https://issues.apache.org/jira/browse/HDDS-3747
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone CLI
>Affects Versions: 0.7.0
>Reporter: maobaolong
>Priority: Minor
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bhemanthkumar opened a new pull request #1051: Redundancy if condition code in ListPipelinesSubcommand

2020-06-10 Thread GitBox


bhemanthkumar opened a new pull request #1051:
URL: https://github.com/apache/hadoop-ozone/pull/1051


   Remove the redundancy if condition code in ListPipelinesSubcommand
   
   ## What changes were proposed in this pull request?
   
   (Please fill in changes proposed in this fix)
   
   ## What is the link to the Apache JIRA
   
   (Please create an issue in ASF JIRA before opening a pull request,
   and you need to set the title of the pull request which starts with
   the corresponding JIRA issue number. (e.g. HDDS-. Fix a typo in YYY.)
   
   Please replace this section with the link to the Apache JIRA)
   
   ## How was this patch tested?
   
   (Please explain how this patch was tested. Ex: unit tests, manual tests)
   (If this patch involves UI changes, please attach a screen-shot; otherwise, 
remove this)
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3512) s3g multi-upload saved content incorrect when client uses aws java sdk 1.11.* jar

2020-06-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130792#comment-17130792
 ] 

Marton Elek commented on HDDS-3512:
---

Is it still a problem?

I tried to reproduce it with freon:


{code:java}
ozone freon s3kg -e http://s3g:9878 -n 10 -s 5242880 {code}
But the chunk files were created with the same size:
{code:java}
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385031656.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385031657.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385031658.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385162731.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385424876.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385424877.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385424878.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385424879.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/104320375385490416.block
-rw-r--r-- 1 hadoop users 5242880 Jun 10 15:23 
./hdds/hdds/15292b6c-34d6-48d0-bb97-f58043767ade/current/containerDir0/1/chunks/10432037538953.block
 {code}

> s3g multi-upload saved content incorrect when client uses aws java sdk 1.11.* 
> jar
> -
>
> Key: HDDS-3512
> URL: https://issues.apache.org/jira/browse/HDDS-3512
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: S3
>Reporter: Sammi Chen
>Assignee: Marton Elek
>Priority: Blocker
>  Labels: TriagePending
>
> The default multi-part size is 5MB, which is 5242880 bytes, while all the 
> chunks saved by s3g are 5246566 bytes, which is greater than 5MB.
> By looking into ObjectEndpoint.java, it seems the chunk size is retrieved 
> from the "Content-Length" header. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] nandakumar131 commented on pull request #1048: HDDS-3481. SCM ask too many datanodes to replicate the same container

2020-06-10 Thread GitBox


nandakumar131 commented on pull request #1048:
URL: https://github.com/apache/hadoop-ozone/pull/1048#issuecomment-642077756


   @runzhiwang was this PR closed by accident?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3481) SCM ask too many datanodes to replicate the same container

2020-06-10 Thread Nanda kumar (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130772#comment-17130772
 ] 

Nanda kumar commented on HDDS-3481:
---

[~yjxxtd], [~xyao], [~elek], [~sodonnell]
 I completely agree. We need some kind of balancing and throttling in SCM.

Created HDDS-3774 for the same.

> SCM ask too many datanodes to replicate the same container
> --
>
> Key: HDDS-3481
> URL: https://issues.apache.org/jira/browse/HDDS-3481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
>  Labels: TriagePending, pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>
> *What's the problem?*
> As the image shows, SCM asked 31 datanodes to replicate container 2037 every 
> 10 minutes starting from 2020-04-17 23:38:51. And at 2020-04-18 08:58:52 SCM 
> found that the replica count of container 2037 was 12, so it asked 11 
> datanodes to delete container 2037. 
>  !screenshot-1.png! 
>  !screenshot-2.png! 
> *What's the reason?*
> SCM checks whether (container replica count + 
> inflightReplication.get(containerId).size() - 
> inflightDeletion.get(containerId).size()) is less than 3. If it is less than 
> 3, SCM asks some datanode to replicate the container and adds the action to 
> inflightReplication.get(containerId). The replication action timeout is 10 
> minutes; if the action times out, SCM deletes the action from 
> inflightReplication.get(containerId), as the image shows. Then (container 
> replica count + inflightReplication.get(containerId).size() - 
> inflightDeletion.get(containerId).size()) is less than 3 again, and SCM asks 
> another datanode to replicate the container.
> Because replicating a container takes a long time, it sometimes cannot 
> finish in 10 minutes, so 31 datanodes ended up replicating the container 
> every 10 minutes. 19 of the 31 datanodes replicated the container from the 
> same source datanode, which also puts heavy pressure on the source datanode 
> and makes replication even slower. It actually took 4 hours to finish the 
> first replication. 
>  !screenshot-4.png! 
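
A minimal sketch of the check described above (method and map names are 
illustrative, not the actual ReplicationManager code):

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Illustrative sketch of the replica-count check described in this issue.
private boolean needsReplication(long containerId, int replicaCount,
    Map<Long, List<Object>> inflightReplication,
    Map<Long, List<Object>> inflightDeletion, int replicationFactor) {
  int inflightAdds = inflightReplication
      .getOrDefault(containerId, Collections.emptyList()).size();
  int inflightDeletes = inflightDeletion
      .getOrDefault(containerId, Collections.emptyList()).size();
  // When a timed-out replication is removed from inflightReplication, this
  // expression drops below the factor again and SCM schedules yet another
  // replication: the loop this issue describes.
  return replicaCount + inflightAdds - inflightDeletes < replicationFactor;
}
{code}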



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3774) Throttle replication commands sent to datanode

2020-06-10 Thread Nanda kumar (Jira)
Nanda kumar created HDDS-3774:
-

 Summary: Throttle replication commands sent to datanode
 Key: HDDS-3774
 URL: https://issues.apache.org/jira/browse/HDDS-3774
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: SCM
Reporter: Nanda kumar
Assignee: Nanda kumar


The replication/deletion commands sent by SCM to datanodes should be throttled 
and controlled by SCM.

* SCM should consider the load on a datanode before sending any command.
* If network topology is configured, SCM should use it to sort the source 
datanodes for replication.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-3512) s3g multi-upload saved content incorrect when client uses aws java sdk 1.11.* jar

2020-06-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek reassigned HDDS-3512:
-

Assignee: Marton Elek

> s3g multi-upload saved content incorrect when client uses aws java sdk 1.11.* 
> jar
> -
>
> Key: HDDS-3512
> URL: https://issues.apache.org/jira/browse/HDDS-3512
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: S3
>Reporter: Sammi Chen
>Assignee: Marton Elek
>Priority: Blocker
>  Labels: TriagePending
>
> The default multi-part size is 5MB, which is 5242880 bytes, while all the 
> chunks saved by s3g are 5246566 bytes, which is greater than 5MB.
> By looking into ObjectEndpoint.java, it seems the chunk size is retrieved 
> from the "Content-Length" header. 
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2447) Allow datanodes to operate with simulated containers

2020-06-10 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-2447:
--
Target Version/s: 0.7.0  (was: 0.6.0)

> Allow datanodes to operate with simulated containers
> 
>
> Key: HDDS-2447
> URL: https://issues.apache.org/jira/browse/HDDS-2447
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Affects Versions: 0.5.0
>Reporter: Stephen O'Donnell
>Priority: Major
>  Labels: TriagePending
>
> The Storage Container Manager (SCM) generally deals with datanodes and 
> containers. Datanodes report their containers via container reports and the 
> SCM keeps track of them, schedules new replicas to be created when needed 
> etc. SCM does not care about individual blocks within the containers (aside 
> from deleting them) or keys. Therefore it should be possible to scale test 
> much of SCM without OM or worrying about writing keys.
> In order to scale test SCM and some of its internal features like 
> decommission, maintenance mode and the replication manager, it would be 
> helpful to quickly create clusters with many containers, without needing to 
> go through a data loading exercise.
> What I imagine happening is:
> * We generate a list of container IDs and container sizes - this could be a 
> fixed size or configured size for all containers. We could also fix the 
> number of blocks / chunks inside a 'generated simulated container' so they 
> are all the same.
> * When the Datanode starts, if it has simulated containers enabled, it would 
> optionally look for this list of containers and load the meta data into 
> memory. Then it would report the containers to SCM as normal, and the SCM 
> would believe the containers actually exist.
> * If SCM creates a new container, then the datanode should create the 
> meta-data in memory, but not write anything to disk.
> * If SCM instructs a DN to replicate a container, then we should stream 
> simulated data over the wire equivalent to the container size, but again 
> throw away the data at the receiving side and store only the metadata in 
> datanode memory.
> * It would be acceptable for a DN restart to forget all containers and 
> re-load them from the generated list. A nice-to-have feature would persist 
> any changes to disk somehow so a DN restart would return to its pre-restart 
> state.
> At this stage, I am not too concerned about OM, or clients trying to read 
> chunks out of these simulated containers (my focus is on SCM at the moment), 
> but it would be great if that were possible too.
> I believe this feature would let us do scale testing of SCM and benchmark 
> some dead node / replication / decommission scenarios on clusters with much 
> reduced hardware requirements.
> It would also allow clusters with a large number of containers to be created 
> quickly, rather than going through a dataload exercise.
> This would open the door to a tool similar to 
> https://github.com/linkedin/dynamometer which uses simulated storage on HDFS 
> to perform scale tests against the namenode with reduced hardware 
> requirements.
> HDDS-1094 added the ability to have a level of simulated storage on a 
> datanode. In that Jira, when a client writes data to a chunk the data is 
> thrown away and nothing is written to disk. If a client later tries to read 
> the data back, it just gets zeroed byte buffers. Hopefully this Jira could 
> build on that feature to fully simulate the containers from the SCM point of 
> view and later we can extend to allowing clients to create keys etc too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2449) Delete block command should use a thread pool

2020-06-10 Thread Lokesh Jain (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lokesh Jain updated HDDS-2449:
--
Target Version/s: 0.7.0  (was: 0.6.0)

> Delete block command should use a thread pool
> -
>
> Key: HDDS-2449
> URL: https://issues.apache.org/jira/browse/HDDS-2449
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Affects Versions: 0.6.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: TriagePending
>
> The datanode receives commands over the heartbeat and queues all commands on 
> a single queue in StateContext.commandQueue. Inside DatanodeStateMachine a 
> single thread is used to process this queue (started by initCommandHander 
> thread) and it passes each command to a ‘handler’. Each command type has its 
> own handler.
> The delete block command immediately executes the command on the thread used 
> to process the command queue. Therefore, if the delete is slow for some 
> reason (it must access disk, so this is possible) it could cause other 
> commands to back up.
> This should be changed to use a thread pool to queue the deleteBlock command, 
> in a similar way to ReplicateContainerCommand.
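
A minimal sketch of such a hand-off, with hypothetical names (this is not the 
actual DeleteBlocksCommandHandler code):

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch: delete-block work is queued on a pool instead of
// running on the command-dispatch thread, mirroring how
// ReplicateContainerCommand is handled.
class DeleteBlocksHandler {
  private final ExecutorService executor = Executors.newFixedThreadPool(4);

  void handle(Runnable deleteBlocksWork) {
    // Returns immediately; the dispatch thread stays free even when the
    // (possibly disk-bound) delete is slow.
    executor.submit(deleteBlocksWork);
  }
}
{code}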



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on pull request #1019: HDDS-3679. Add unit tests for PipelineManagerV2.

2020-06-10 Thread GitBox


elek commented on pull request #1019:
URL: https://github.com/apache/hadoop-ozone/pull/1019#issuecomment-642035752


   > @elek Shall we make a separate commit to upgrade rocksdb version?
   
   I am open to both approaches, but it seems to be a good idea to do it on 
`master`, too.
   
   I am +1, in advance, if the build is green ;-)
   
   (But we can also add it here temporarily, to check if it helps...)



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-3773) Add OMDBDefinition to define structure of om.db

2020-06-10 Thread Sadanand Shenoy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sadanand Shenoy reassigned HDDS-3773:
-

Assignee: Sadanand Shenoy

> Add OMDBDefinition to define structure of om.db
> ---
>
> Key: HDDS-3773
> URL: https://issues.apache.org/jira/browse/HDDS-3773
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Sadanand Shenoy
>Assignee: Sadanand Shenoy
>Priority: Minor
>
> The RocksDB tool that displays data from a DB file uses implementations of 
> the DBDefinition class, which describe the structure and types of that DB 
> file. This class is defined to support the tool for om.db.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3773) Add OMDBDefinition to define structure of om.db

2020-06-10 Thread Sadanand Shenoy (Jira)
Sadanand Shenoy created HDDS-3773:
-

 Summary: Add OMDBDefinition to define structure of om.db
 Key: HDDS-3773
 URL: https://issues.apache.org/jira/browse/HDDS-3773
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Tools
Reporter: Sadanand Shenoy


The RocksDB tool that displays data from a DB file uses implementations of 
the DBDefinition class, which describe the structure and types of that DB 
file. This class is defined to support the tool for om.db.
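
As an illustration only, such a definition could look roughly like the sketch 
below. It assumes the DBDefinition interface exposes 
getName/getLocationConfigKey/getColumnFamilies the same way the scm.db variant 
from HDDS-3622 does; the codecs and the single example table are placeholders, 
not the full om.db layout.

{code:java}
public class OMDBDefinition implements DBDefinition {

  // Example column family; om.db has many more tables than this.
  public static final DBColumnFamilyDefinition<String, OmKeyInfo> KEY_TABLE =
      new DBColumnFamilyDefinition<>(
          "keyTable",
          String.class, new StringCodec(),
          OmKeyInfo.class, new OmKeyInfoCodec());

  @Override
  public String getName() {
    return "om.db";
  }

  @Override
  public String getLocationConfigKey() {
    return OMConfigKeys.OZONE_OM_DB_DIRS;
  }

  @Override
  public DBColumnFamilyDefinition[] getColumnFamilies() {
    return new DBColumnFamilyDefinition[] {KEY_TABLE};
  }
}
{code}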



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek merged pull request #1043: HDDS-3760. Avoid UUID#toString call in Pipeline#getProtobufMessage

2020-06-10 Thread GitBox


elek merged pull request #1043:
URL: https://github.com/apache/hadoop-ozone/pull/1043


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] lokeshj1703 commented on a change in pull request #1005: HDDS-3350. Ozone Retry Policy Improvements.

2020-06-10 Thread GitBox


lokeshj1703 commented on a change in pull request #1005:
URL: https://github.com/apache/hadoop-ozone/pull/1005#discussion_r438144376



##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/OzoneConfigKeys.java
##
@@ -281,15 +281,33 @@
 
   public static final String DFS_CONTAINER_RATIS_DATANODE_STORAGE_DIR =
   "dfs.container.ratis.datanode.storage.dir";
-  public static final String DFS_RATIS_CLIENT_REQUEST_MAX_RETRIES_KEY =
-  ScmConfigKeys.DFS_RATIS_CLIENT_REQUEST_MAX_RETRIES_KEY;
-  public static final int DFS_RATIS_CLIENT_REQUEST_MAX_RETRIES_DEFAULT =
-  ScmConfigKeys.DFS_RATIS_CLIENT_REQUEST_MAX_RETRIES_DEFAULT;
-  public static final String DFS_RATIS_CLIENT_REQUEST_RETRY_INTERVAL_KEY =
-  ScmConfigKeys.DFS_RATIS_CLIENT_REQUEST_RETRY_INTERVAL_KEY;
+

Review comment:
   Done.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] lokeshj1703 commented on a change in pull request #1005: HDDS-3350. Ozone Retry Policy Improvements.

2020-06-10 Thread GitBox


lokeshj1703 commented on a change in pull request #1005:
URL: https://github.com/apache/hadoop-ozone/pull/1005#discussion_r438144281



##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ratis/RatisHelper.java
##
@@ -269,23 +282,76 @@ static GrpcTlsConfig createTlsClientConfig(SecurityConfig 
conf,
 return tlsConfig;
   }
 

Review comment:
   Done.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] lokeshj1703 commented on a change in pull request #1005: HDDS-3350. Ozone Retry Policy Improvements.

2020-06-10 Thread GitBox


lokeshj1703 commented on a change in pull request #1005:
URL: https://github.com/apache/hadoop-ozone/pull/1005#discussion_r438143919



##
File path: 
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ratis/RatisHelper.java
##
@@ -269,23 +282,76 @@ static GrpcTlsConfig createTlsClientConfig(SecurityConfig 
conf,
 return tlsConfig;
   }
 
-  static RetryPolicy createRetryPolicy(ConfigurationSource conf) {
-int maxRetryCount =
-conf.getInt(OzoneConfigKeys.DFS_RATIS_CLIENT_REQUEST_MAX_RETRIES_KEY,
+  public static RetryPolicy createRetryPolicy(ConfigurationSource conf) {
+ExponentialBackoffRetry exponentialBackoffRetry =
+createExponentialBackoffPolicy(conf);
+MultipleLinearRandomRetry multipleLinearRandomRetry =
+MultipleLinearRandomRetry.parseCommaSeparated(conf.get(
+OzoneConfigKeys.DFS_RATIS_CLIENT_MULTILINEAR_RANDOM_RETRY_POLICY,
 OzoneConfigKeys.
-DFS_RATIS_CLIENT_REQUEST_MAX_RETRIES_DEFAULT);
-long retryInterval = conf.getTimeDuration(OzoneConfigKeys.
-DFS_RATIS_CLIENT_REQUEST_RETRY_INTERVAL_KEY, OzoneConfigKeys.
-DFS_RATIS_CLIENT_REQUEST_RETRY_INTERVAL_DEFAULT
-.toIntExact(TimeUnit.MILLISECONDS), TimeUnit.MILLISECONDS);
-TimeDuration sleepDuration =
-TimeDuration.valueOf(retryInterval, TimeUnit.MILLISECONDS);
-RetryPolicy retryPolicy = RetryPolicies
-.retryUpToMaximumCountWithFixedSleep(maxRetryCount, sleepDuration);
-return retryPolicy;
+DFS_RATIS_CLIENT_MULTILINEAR_RANDOM_RETRY_POLICY_DEFAULT));
+
+long writeTimeout = conf.getTimeDuration(
+OzoneConfigKeys.DFS_RATIS_CLIENT_REQUEST_WRITE_TIMEOUT, 
OzoneConfigKeys.
+DFS_RATIS_CLIENT_REQUEST_WRITE_TIMEOUT_DEFAULT
+.toIntExact(TimeUnit.MILLISECONDS), TimeUnit.MILLISECONDS);
+long watchTimeout = conf.getTimeDuration(
+OzoneConfigKeys.DFS_RATIS_CLIENT_REQUEST_WATCH_TIMEOUT, 
OzoneConfigKeys.
+DFS_RATIS_CLIENT_REQUEST_WATCH_TIMEOUT_DEFAULT
+.toIntExact(TimeUnit.MILLISECONDS), TimeUnit.MILLISECONDS);
+
+return RequestTypeDependentRetryPolicy.newBuilder()
+.setRetryPolicy(RaftProtos.RaftClientRequestProto.TypeCase.WRITE,
+createExceptionDependentPolicy(exponentialBackoffRetry,
+multipleLinearRandomRetry, exponentialBackoffRetry))
+.setRetryPolicy(RaftProtos.RaftClientRequestProto.TypeCase.WATCH,
+createExceptionDependentPolicy(exponentialBackoffRetry,
+multipleLinearRandomRetry, RetryPolicies.noRetry()))
+.setTimeout(RaftProtos.RaftClientRequestProto.TypeCase.WRITE,
+TimeDuration.valueOf(writeTimeout, TimeUnit.MILLISECONDS))
+.setTimeout(RaftProtos.RaftClientRequestProto.TypeCase.WATCH,
+TimeDuration.valueOf(watchTimeout, TimeUnit.MILLISECONDS))
+.build();
+  }
+
+  private static ExponentialBackoffRetry createExponentialBackoffPolicy(
+  ConfigurationSource conf) {
+long exponentialBaseSleep = conf.getTimeDuration(
+OzoneConfigKeys.DFS_RATIS_CLIENT_EXPONENTIAL_BACKOFF_BASE_SLEEP,
+OzoneConfigKeys.DFS_RATIS_CLIENT_EXPONENTIAL_BACKOFF_BASE_SLEEP_DEFAULT
+.toIntExact(TimeUnit.MILLISECONDS), TimeUnit.MILLISECONDS);
+long exponentialMaxSleep = conf.getTimeDuration(
+OzoneConfigKeys.DFS_RATIS_CLIENT_EXPONENTIAL_BACKOFF_MAX_SLEEP,
+OzoneConfigKeys.
+DFS_RATIS_CLIENT_EXPONENTIAL_BACKOFF_MAX_SLEEP_DEFAULT
+.toIntExact(TimeUnit.MILLISECONDS), TimeUnit.MILLISECONDS);
+return ExponentialBackoffRetry.newBuilder()
+.setBaseSleepTime(
+TimeDuration.valueOf(exponentialBaseSleep, TimeUnit.MILLISECONDS))
+.setMaxSleepTime(
+TimeDuration.valueOf(exponentialMaxSleep, TimeUnit.MILLISECONDS))
+.build();
+  }
+
+  private static ExceptionDependentRetry createExceptionDependentPolicy(
+  ExponentialBackoffRetry exponentialBackoffRetry,
+  MultipleLinearRandomRetry multipleLinearRandomRetry,

Review comment:
   RaftLogIOException is never received at the Raft client. I have added 
AlreadyClosedException.
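
   For context, the exception-dependent composition discussed here can be 
sketched as follows (the builder method names are assumptions based on 
org.apache.ratis.retry.ExceptionDependentRetry; this mirrors, but is not, the 
PR's createExceptionDependentPolicy helper):

   ```java
   // Route specific Ratis client exceptions to dedicated policies and
   // fall back to a default policy for everything else.
   private static RetryPolicy exceptionDependentPolicy(
       RetryPolicy resourceUnavailablePolicy,
       RetryPolicy defaultPolicy,
       RetryPolicy alreadyClosedPolicy) {
     return ExceptionDependentRetry.newBuilder()
         .setExceptionToPolicy(ResourceUnavailableException.class,
             resourceUnavailablePolicy)
         .setExceptionToPolicy(AlreadyClosedException.class,
             alreadyClosedPolicy)
         .setDefaultPolicy(defaultPolicy)
         .build();
   }
   ```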





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3622) Implement rocksdb tool to parse scm db

2020-06-10 Thread Sadanand Shenoy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sadanand Shenoy resolved HDDS-3622.
---
Target Version/s: 0.6.0
  Resolution: Resolved

> Implement rocksdb tool to parse scm db 
> ---
>
> Key: HDDS-3622
> URL: https://issues.apache.org/jira/browse/HDDS-3622
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Sadanand Shenoy
>Assignee: Sadanand Shenoy
>Priority: Major
>  Labels: pull-request-available
>
> This tool parses content from the scm.db file and displays the specified 
> table contents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3622) Implement rocksdb tool to parse scm db

2020-06-10 Thread Sadanand Shenoy (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sadanand Shenoy updated HDDS-3622:
--
Component/s: Tools

> Implement rocksdb tool to parse scm db 
> ---
>
> Key: HDDS-3622
> URL: https://issues.apache.org/jira/browse/HDDS-3622
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Tools
>Reporter: Sadanand Shenoy
>Assignee: Sadanand Shenoy
>Priority: Major
>  Labels: pull-request-available
>
> This tool parses content from the scm.db file and displays the specified 
> table contents.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] sadanand48 closed pull request #864: HDDS-3405. Tool for Listing keys from the OpenKeyTable

2020-06-10 Thread GitBox


sadanand48 closed pull request #864:
URL: https://github.com/apache/hadoop-ozone/pull/864


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] sadanand48 edited a comment on pull request #864: HDDS-3405. Tool for Listing keys from the OpenKeyTable

2020-06-10 Thread GitBox


sadanand48 edited a comment on pull request #864:
URL: https://github.com/apache/hadoop-ozone/pull/864#issuecomment-642019017


   > Do we need this patch? It seems easier to extend #945 / 
[HDDS-3622](https://issues.apache.org/jira/browse/HDDS-3622) to support 
OM...
   > 
   > What do you think?
   
   Yes. I will close the PR.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] sadanand48 commented on pull request #864: HDDS-3405. Tool for Listing keys from the OpenKeyTable

2020-06-10 Thread GitBox


sadanand48 commented on pull request #864:
URL: https://github.com/apache/hadoop-ozone/pull/864#issuecomment-642019017


   > Do we need this patch? It seems easier to extend #945 / 
[HDDS-3622](https://issues.apache.org/jira/browse/HDDS-3622) to support 
OM...
   > 
   > What do you think?
   Yes. I will close the PR.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3757) Add test coverage of the acceptance tests to overall test coverage

2020-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3757:
-
Labels: pull-request-available  (was: )

> Add test coverage of the acceptance tests to overall test coverage 
> ---
>
> Key: HDDS-3757
> URL: https://issues.apache.org/jira/browse/HDDS-3757
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
>
> Acceptance test coverage should be added to the generic coverage numbers. We 
> have a lot of important tests there...



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek opened a new pull request #1050: HDDS-3757. Add test coverage of the acceptance tests to overall test coverage

2020-06-10 Thread GitBox


elek opened a new pull request #1050:
URL: https://github.com/apache/hadoop-ozone/pull/1050


   ## What changes were proposed in this pull request?
   
   This patch adds the coverage data from the acceptance test to the generic 
coverage measurement.
   
   There was one question during the implementation: I decided to add the 
required HADOOP_OPTS to all the docker-compose files without using a tricky 
docker-compose extension. I found that I needed to add a few lines anyway, and 
I preferred to keep it simple, even if a possible change would require slightly 
more work (but it can be done with an easy search and replace).
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-3757
   
   ## How was this patch tested?
   
   Pushed the branch to apache repo and checked sonar cloud.
   
   https://sonarcloud.io/dashboard?branch=HDDS-3757=hadoop-ozone



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on pull request #864: HDDS-3405. Tool for Listing keys from the OpenKeyTable

2020-06-10 Thread GitBox


elek commented on pull request #864:
URL: https://github.com/apache/hadoop-ozone/pull/864#issuecomment-642015500


   Do we need this patch? It seems easier to extend #945 / HDDS-3622 
to support OM...
   
   What do you think?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-3772) Add LOG to S3ErrorTable for easier problem locating

2020-06-10 Thread Sammi Chen (Jira)
Sammi Chen created HDDS-3772:


 Summary: Add LOG to S3ErrorTable for easier problem locating
 Key: HDDS-3772
 URL: https://issues.apache.org/jira/browse/HDDS-3772
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
Reporter: Sammi Chen
Assignee: Sammi Chen
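
As a rough illustration of the requested change (a hedged sketch; S3ErrorTable 
is the real class in the s3gateway module, but the exact logging shape here is 
an assumption, not the actual patch):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Sketch only: one log line per generated S3 error would make
// problems easier to locate on the server side.
final class S3ErrorLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(S3ErrorLoggingSketch.class);

  static void logError(String errorCode, String resource, Exception cause) {
    LOG.error("S3 error {} for resource {}", errorCode, resource, cause);
  }
}
{code}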






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on pull request #1031: HDDS-3745. Improve OM and SCM performance by 64% by avoiding getServiceInfo calls in s3g

2020-06-10 Thread GitBox


elek commented on pull request #1031:
URL: https://github.com/apache/hadoop-ozone/pull/1031#issuecomment-642010606


   Thanks for the patch @runzhiwang 
   
   It looks correct to me, but it also raises a question about the long-term 
usage of `getServiceInfo`. Originally it was introduced (as far as I remember) 
to get the address of the SCM, but over time the client was improved to avoid 
all direct calls to the SCM.
   
   I agree that long-term we should use proxy users for S3 and pool the 
connections.
   
   Short-term this patch looks good to me, but why don't we use 
`getServiceInfo` in the case of secure clusters? Do we need to replace it with 
something simpler?
   
   I would be interested in the opinion of @nandakumar131. As far as I 
remember he worked on the original implementation.
   
   Personally I think a generic `getServiceClient` can be useful. For example, 
active `storage-class`-es could be downloaded from the server at the beginning 
of the connection. But that's a long-term plan, and this patch can help 
short-term.
   
   Let's wait for more opinions. 
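   
   To make the pooling idea concrete, here is a minimal sketch (hypothetical 
names; in s3g the generic parameter would be `OzoneClient`):
   
   ```java
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.ConcurrentMap;
   import java.util.function.Function;
   
   // Hypothetical per-user client pool: one cached client (and at most one
   // getServiceInfo call) per user instead of one per S3 request.
   public final class ClientPool<C> {
     private final ConcurrentMap<String, C> clients = new ConcurrentHashMap<>();
     private final Function<String, C> factory;
   
     public ClientPool(Function<String, C> factory) {
       this.factory = factory;
     }
   
     public C get(String user) {
       return clients.computeIfAbsent(user, factory);
     }
   }
   ```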
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3750) Improve SCM performance by 3.2% by avoiding stream.collect

2020-06-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved HDDS-3750.
---
Fix Version/s: 0.6.0
   Resolution: Fixed

> Improve SCM performance by 3.2% by avoiding stream.collect
> -
>
> Key: HDDS-3750
> URL: https://issues.apache.org/jira/browse/HDDS-3750
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> I started an ozone cluster with 1000 datanodes and 10 s3gateways, ran it for 
> two weeks under heavy workload, and profiled SCM.
>  !screenshot-1.png! 
>  !screenshot-2.png! 
>  !screenshot-3.png! 
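
For context, a minimal sketch of the kind of hot-path rewrite the title refers 
to (illustrative code only, not the actual patch):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class StreamVsLoopSketch {
  static class Node {
    final String id;
    final boolean healthy;
    Node(String id, boolean healthy) {
      this.id = id;
      this.healthy = healthy;
    }
  }

  // Stream version: extra allocations (spliterator, lambdas, collector).
  static List<String> withStream(List<Node> nodes) {
    return nodes.stream()
        .filter(n -> n.healthy)
        .map(n -> n.id)
        .collect(Collectors.toList());
  }

  // Plain loop: same result, fewer allocations on a hot path.
  static List<String> withLoop(List<Node> nodes) {
    List<String> out = new ArrayList<>(nodes.size());
    for (Node n : nodes) {
      if (n.healthy) {
        out.add(n.id);
      }
    }
    return out;
  }
}
{code}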



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek merged pull request #1035: HDDS-3750. Improve SCM performance by 3.2% by avoiding stream.collect

2020-06-10 Thread GitBox


elek merged pull request #1035:
URL: https://github.com/apache/hadoop-ozone/pull/1035


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] runzhiwang closed pull request #1048: HDDS-3481. SCM ask too many datanodes to replicate the same container

2020-06-10 Thread GitBox


runzhiwang closed pull request #1048:
URL: https://github.com/apache/hadoop-ozone/pull/1048


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3481) SCM ask too many datanodes to replicate the same container

2020-06-10 Thread Stephen O'Donnell (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130640#comment-17130640
 ] 

Stephen O'Donnell commented on HDDS-3481:
-

I saw this problem in theory a long time back, from reading the code.

I don't think it is a good idea for SCM to hand out all the replication work 
immediately. After SCM passes out the commands, it loses the ability to adjust 
the work later. It is effectively flooding downstream workers who have no 
ability to provide back pressure and indicate they are overloaded.

E.g., if it needs to replicate 1000 containers, it might give 500 to node 1 and 
500 to node 2. What if node 1 completes its work more quickly (maybe it's under 
less read load, has faster disks, is on the same rack as the target ...)? Then 
we cannot just take some of the containers allocated to node 2 and give them to 
node 1 to complete replication faster, as the commands are fired with no easy 
way to see their progress or cancel them.

It is better for the supervisor (SCM) to hand out the work incrementally as the 
workers have capacity for it. Even with a longer timeout, I reckon this bad 
feedback loop will happen.

This is roughly how HDFS does it - there is a replication queue in the 
namenode, and each datanode has a limit of how many replications it can have. 
On each heartbeat, it gets given more work up to its maximum. The namenode 
holds the work back until the workers have capacity to receive it. There isn't 
a feedback loop for the commands in HDFS, but the limit of work + a relatively 
short deadline to complete that work results in it working well.
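
A minimal sketch of that incremental scheme, with hypothetical names (not the 
actual HDFS or SCM code): the supervisor queues the work and only tops each 
worker up to its in-flight limit on heartbeat.

{code:java}
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class ReplicationDispatcherSketch {
  private final Queue<Long> pendingContainers = new ArrayDeque<>();
  private final int maxInflightPerNode;

  ReplicationDispatcherSketch(int maxInflightPerNode) {
    this.maxInflightPerNode = maxInflightPerNode;
  }

  void enqueue(long containerId) {
    pendingContainers.add(containerId);
  }

  /** Called on each heartbeat: hand out work up to the node's limit. */
  List<Long> workForHeartbeat(int currentInflightOnNode) {
    List<Long> assigned = new ArrayList<>();
    while (currentInflightOnNode + assigned.size() < maxInflightPerNode
        && !pendingContainers.isEmpty()) {
      assigned.add(pendingContainers.poll());
    }
    return assigned;
  }
}
{code}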

> SCM ask too many datanodes to replicate the same container
> --
>
> Key: HDDS-3481
> URL: https://issues.apache.org/jira/browse/HDDS-3481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
>  Labels: TriagePending, pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>
> *What's the problem?*
> As the image shows, SCM asked 31 datanodes to replicate container 2037 every 
> 10 minutes starting from 2020-04-17 23:38:51. Then at 2020-04-18 08:58:52 SCM 
> found the replica count of container 2037 was 12, so it asked 11 datanodes to 
> delete container 2037. 
>  !screenshot-1.png! 
>  !screenshot-2.png! 
> *What's the reason?*
> SCM checks whether (container replica count + 
> inflightReplication.get(containerId).size() - 
> inflightDeletion.get(containerId).size()) is less than 3. If it is, SCM asks 
> some datanode to replicate the container and adds the action to 
> inflightReplication.get(containerId). The replicate action timeout is 10 
> minutes; when an action times out, SCM deletes it from 
> inflightReplication.get(containerId), as the image shows. Then (container 
> replica count + inflightReplication.get(containerId).size() - 
> inflightDeletion.get(containerId).size()) is less than 3 again, and SCM asks 
> yet another datanode to replicate the container.
> Because replicating a container takes a long time, it sometimes cannot finish 
> in 10 minutes, so 31 datanodes ended up replicating the container every 10 
> minutes. 19 of the 31 datanodes replicated the container from the same source 
> datanode, which also put heavy pressure on the source datanode and made 
> replication even slower. In fact, it took 4 hours to finish the first 
> replication. 
>  !screenshot-4.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3755) Storage-class support for Ozone

2020-06-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130629#comment-17130629
 ] 

Marton Elek commented on HDDS-3755:
---

[~maobaolong]

It would be great to provide an example configuration for the current scheme 
(Ratis/THREE -> Ratis/ONE). I think we can start a fork branch and create a POC.

Also: the abstraction level can be improved over time. We can start by defining 
the replication factors for the existing scheme and continue by defining the 
full state transitions.

> Storage-class support for Ozone
> ---
>
> Key: HDDS-3755
> URL: https://issues.apache.org/jira/browse/HDDS-3755
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> Use a storage-class as an abstraction which combines replication 
> configuration, container states and transitions. 
> See this thread for the detailed design doc:
>  
> [https://lists.apache.org/thread.html/r1e2a5d5581abe9dd09834305ca65a6807f37bd229a07b8b31bda32ad%40%3Cozone-dev.hadoop.apache.org%3E]
> which is also uploaded to here: 
> https://hackmd.io/4kxufJBOQNaKn7PKFK_6OQ?edit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3755) Storage-class support for Ozone

2020-06-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130626#comment-17130626
 ] 

Marton Elek commented on HDDS-3755:
---

*Decouple EC from this feature:*

For me, storage-class is a framework which can make it easier to implement some 
new features (TWO replication, EC). I agree with decoupling the detailed 
design, but it's important to find an abstraction level which is good enough 
for all the considered EC implementations.
{quote}Would like to see better defined use-cases, and some discussion on the 
use-cases before we get into the design.
{quote}
Can you please help me with some questions? For me, the important use cases 
are:

 1. EC (I understand that the detailed design discussion is separate, but this 
framework should be good enough for EC)

 2. Defining different Closed replication schemes (currently you cannot 
configure a different replication factor for Closed containers)

 3. Simplifying the configuration and hiding the implementation details from 
the user, while keeping the flexibility for the admins

 4. Making it easy to experiment with different replication schemes (like TWO)

> Storage-class support for Ozone
> ---
>
> Key: HDDS-3755
> URL: https://issues.apache.org/jira/browse/HDDS-3755
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> Use a storage-class as an abstraction which combines replication 
> configuration, container states and transitions. 
> See this thread for the detailed design doc:
>  
> [https://lists.apache.org/thread.html/r1e2a5d5581abe9dd09834305ca65a6807f37bd229a07b8b31bda32ad%40%3Cozone-dev.hadoop.apache.org%3E]
> which is also uploaded to here: 
> https://hackmd.io/4kxufJBOQNaKn7PKFK_6OQ?edit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3481) SCM ask too many datanodes to replicate the same container

2020-06-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130610#comment-17130610
 ] 

Marton Elek commented on HDDS-3481:
---

{quote}Should we consider balancing the replication source among datanodes 
or throttle the replication per datanode?
{quote}
Agree, sooner or later we need this. A few days ago I learned that using 
high-density datanodes (datanodes with extreme capacity) may become more and 
more common. But requesting the replication of ALL containers of a missing 
datanode holding several hundred terabytes can be a disaster.
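
A tiny sketch of the source-balancing idea (a hypothetical helper, not existing 
SCM code): always replicate from the datanode with the fewest in-flight 
outbound copies.

{code:java}
import java.util.Map;

final class SourceBalancerSketch {
  /** Pick the replica source with the fewest in-flight outbound copies. */
  static <N> N pickSource(Map<N, Integer> inflightBySource) {
    return inflightBySource.entrySet().stream()
        .min(Map.Entry.comparingByValue())
        .map(Map.Entry::getKey)
        .orElseThrow(() -> new IllegalStateException("no replica sources"));
  }
}
{code}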

> SCM ask too many datanodes to replicate the same container
> --
>
> Key: HDDS-3481
> URL: https://issues.apache.org/jira/browse/HDDS-3481
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Blocker
>  Labels: TriagePending, pull-request-available
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png, 
> screenshot-4.png
>
>
> *What's the problem?*
> As the image shows, SCM asked 31 datanodes to replicate container 2037 every 
> 10 minutes starting from 2020-04-17 23:38:51. Then at 2020-04-18 08:58:52 SCM 
> found the replica count of container 2037 was 12, so it asked 11 datanodes to 
> delete container 2037. 
>  !screenshot-1.png! 
>  !screenshot-2.png! 
> *What's the reason?*
> SCM checks whether (container replica count + 
> inflightReplication.get(containerId).size() - 
> inflightDeletion.get(containerId).size()) is less than 3. If it is, SCM asks 
> some datanode to replicate the container and adds the action to 
> inflightReplication.get(containerId). The replicate action timeout is 10 
> minutes; when an action times out, SCM deletes it from 
> inflightReplication.get(containerId), as the image shows. Then (container 
> replica count + inflightReplication.get(containerId).size() - 
> inflightDeletion.get(containerId).size()) is less than 3 again, and SCM asks 
> yet another datanode to replicate the container.
> Because replicating a container takes a long time, it sometimes cannot finish 
> in 10 minutes, so 31 datanodes ended up replicating the container every 10 
> minutes. 19 of the 31 datanodes replicated the container from the same source 
> datanode, which also put heavy pressure on the source datanode and made 
> replication even slower. In fact, it took 4 hours to finish the first 
> replication. 
>  !screenshot-4.png! 
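
An illustrative reconstruction of the check described above (a sketch, not the 
actual ReplicationManager code); the flaw is that a timed-out action leaves the 
in-flight map, so the container immediately looks under-replicated again:

{code:java}
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class UnderReplicationCheckSketch {
  static final int REPLICATION_FACTOR = 3;
  final Map<Long, List<String>> inflightReplication = new HashMap<>();
  final Map<Long, List<String>> inflightDeletion = new HashMap<>();

  boolean isUnderReplicated(long containerId, int replicaCount) {
    int pendingAdds = inflightReplication
        .getOrDefault(containerId, Collections.emptyList()).size();
    int pendingDeletes = inflightDeletion
        .getOrDefault(containerId, Collections.emptyList()).size();
    // Once a timed-out action is dropped from inflightReplication,
    // this becomes true again and another datanode gets the command.
    return replicaCount + pendingAdds - pendingDeletes < REPLICATION_FACTOR;
  }
}
{code}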



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng commented on pull request #1019: HDDS-3679. Add unit tests for PipelineManagerV2.

2020-06-10 Thread GitBox


timmylicheng commented on pull request #1019:
URL: https://github.com/apache/hadoop-ozone/pull/1019#issuecomment-641967456


   > FYI: can reproduce it on linux, locally.
   > 
   > It seems to have disappeared when I upgraded my rocksdb version in the 
   > main `pom.xml`:
   > 
   > ```
   > -6.6.4
   > +6.8.1
   > ```
   > 
   > I think it's good to upgrade, as multiple corruption issues have been 
   > fixed between 6.6.4 and 6.8.1...
   
   @elek Shall we make a separate commit to upgrade the rocksdb version?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2880) Rename legacy/current ozonefs to isolated/share

2020-06-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved HDDS-2880.
---
Release Note: They were refactored and renamed during HDDS-3458. We no longer 
have an "isolated" variant, as we no longer have a specific classloader.
  Resolution: Won't Fix

> Rename legacy/current ozonefs to isolated/share
> ---
>
> Key: HDDS-2880
> URL: https://issues.apache.org/jira/browse/HDDS-2880
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: build
>Reporter: Marton Elek
>Priority: Major
>  Labels: TriagePending, newbie
>
> When we started to provide two different packagings for ozonefs we named them 
> legacy and current.
> "Legacy" contains all the required Hadoop classes and provides classloader 
> separation with the help of a specific classloader instance.
> "Current" contains only the shaded (package-relocated) dependencies, but 
> without the Hadoop classes and the specific classloader.
>  
> The "current" one can be used with the latest Hadoop version; "legacy" can be 
> used with "all" versions of Hadoop.
>  
> As "legacy" is the more generic approach, it deserves a better name. I 
> suggest naming based on the function:
>  
>  * rename "legacy" to "isolated"
>  * rename "current" to "shared"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3519) Finalize network and storage protocol of Ozone

2020-06-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130596#comment-17130596
 ] 

Marton Elek commented on HDDS-3519:
---

I think it's very close to any intra-service protocol (as far as I understood, 
it's used inside the Ratis log entries). I would put it in the 
hadoop-hdds/interface-server project, but it sounds reasonable to keep it in a 
separate file.

> Finalize network and storage protocol of Ozone
> --
>
> Key: HDDS-3519
> URL: https://issues.apache.org/jira/browse/HDDS-3519
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: build
>Reporter: Marton Elek
>Priority: Critical
>  Labels: TriagePending
>
> One of the next releases of Ozone can be named as GA which means that 
> backward compatibility should be more important.
> Before GA I propose to cleanup the current RPC interface and stabilize the 
> storage interface.
> Goals:
>  * Clearly define the client / storage interfaces and monitor the changes
>  * Separate Client RPC from intra-service / admin interfaces (for security 
> reasons)
> Remove unused / out-of-date messages
> I propose the following steps
> 1. We should separate client / admin / server calls on the services.
>   -> Majority of existing calls are "client" calls, used by the client
>   -> Admin calls are supposed to be used by admin CLI (local only in a secure 
> environment
>   -> Server calls are intra-server calls (like db HB) 
> 2. We should use unified naming convention
> 3. protocol files can be moved to separated maven project to make it easier 
> to reuse from language binding and make it easier to monitor API change
> 4. We should use RDBStore interface everywhere instead of the old 
> Metadatastore interface
> 5. We can move all the table definition interfaces to separated project and 
> monitor the API changes
> This is my previous proposal for the naming convention, which was discussed 
> and accepted during one of the community meetings:
> {quote}My simplified naming convention suggests separating only the server 
> (like om2scm), the client (like client2om) and admin (like pipeline list, 
> safe-mode administration, etc.) protocol services.
> 1. admin services should be available only from the cluster (use 
> AdminProtocol as name)
> 2. client can be available from inside and outside (use ClientProtocol as 
> name)
> 3. server protocol can be restricted to be used only between the services. 
> (use ..ServerProtocol as name)
> Based on this convention:
> --> OMClientProtocol
> Should contain all the client calls (OzoneManagerProtocol)
> --> OMAdminProtocol
> It's a new service can contain the new omha commands
> --> SCMAdminProtocol
> Can contain all the admin commands from StorageContainerLocation protocol 
> (QueryNode, InsafeMode, )
> --> SCMClientProtocol
> It seems that we don't need it any more as client doesn't require any 
> connection to the SCM (please confirm)
> --> SCMServerProtocol (server2server calls)
>  * Remaining part of the StorageContainerLocation protocol (allocate 
> container, get container)
>  * Content of the SCMSecurityProtocol.proto
>  * Content of SCMBlockLocationProtocol
> -> SCMHeartbeatProtocol
> Well, it's so specific that we can create a custom postfix instead of Server. 
> This is the HB (StorageContainerDatanodeProtocol)
> -> DatanodeClientProtocol
>  
> Chunks, upload from the DatanodeContainerProtocol
> --> DatanodeServerProtocol
>  There is one service here which publishes the container.tar.gz for the other 
> services. As of now it's combined with the DatanodeClientProtocol.
> {quote}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1328) Add a new API getS3Bucket

2020-06-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved HDDS-1328.
---
Resolution: Won't Fix

We don't need it after the recent s3 volume mapping change.

> Add a new API getS3Bucket
> -
>
> Key: HDDS-1328
> URL: https://issues.apache.org/jira/browse/HDDS-1328
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> Currently, to get an s3bucket, we need 3 RPCs:
>  # Get OzoneVolumeName
>  # Get OzoneVolume
>  # then getBucket. 
> With the proposed approach, we can have one RPC call, getS3Bucket, which 
> saves 2 RPCs for each operation in S3.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on pull request #1019: HDDS-3679. Add unit tests for PipelineManagerV2.

2020-06-10 Thread GitBox


elek commented on pull request #1019:
URL: https://github.com/apache/hadoop-ozone/pull/1019#issuecomment-641960612


   FYI: can reproduce it on linux, locally. 
   
   It seems to have disappeared when I upgraded my rocksdb version in the main 
`pom.xml`:
   
   ```
   -6.6.4
   +6.8.1
   ```
   
   I think it's good to upgrade, as multiple corruption issues have been fixed 
between 6.6.4 and 6.8.1...
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3761) "ozone fs -get" is way slower than "ozone sh key get"

2020-06-10 Thread Sammi Chen (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130574#comment-17130574
 ] 

Sammi Chen commented on HDDS-3761:
--

Hi [~msingh] and [~rakeshr], please hold off on the investigation; the data was 
captured in our production environment, and I will verify it again with the 
latest master.

> "ozone fs -get" is way to slow than "ozone sh key get"
> --
>
> Key: HDDS-3761
> URL: https://issues.apache.org/jira/browse/HDDS-3761
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Sammi Chen
>Priority: Major
>
> Read time spent to download a 7GB+ object: 
> time ozone fs -get 
> o3fs://konajdk-profiler.s325d55ad283aa400af464c76d713c07ad/part-0 
> ./part-0-back
> 2020-06-09 11:19:47,284 [main] INFO impl.MetricsConfig: Loaded properties 
> from hadoop-metrics2.properties
> 2020-06-09 11:19:47,339 [main] INFO impl.MetricsSystemImpl: Scheduled Metric 
> snapshot period at 10 second(s).
> 2020-06-09 11:19:47,339 [main] INFO impl.MetricsSystemImpl: 
> XceiverClientMetrics metrics system started
> real  45m26.152s
> user  0m28.576s
> sys   0m13.488s
> 222
> time bin/hadoop fs -get s3a://konajdk-profiler/part-0 ./part-0-back-1
> 20/06/09 11:19:57 INFO security.UserGroupInformation: Hadoop UGI 
> authentication : SIMPLE
> real  3m3.542s
> user  0m7.644s
> sys   0m12.016s
> 222
> time bin/ozone sh key get 
> s325d55ad283aa400af464c76d713c07ad/konajdk-profiler/part-0 
> ./part-0-back
> real  1m26.900s
> user  0m19.604s
> sys   0m10.280s



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng edited a comment on pull request #1019: HDDS-3679. Add unit tests for PipelineManagerV2.

2020-06-10 Thread GitBox


timmylicheng edited a comment on pull request #1019:
URL: https://github.com/apache/hadoop-ozone/pull/1019#issuecomment-641767465


   The dump shows it's related to rocksdb. Not sure if it's related to multiple 
DBs being merged. The stack looks weird to me. Any ideas? @nandakumar131 @elek 
@xiaoyuyao 
   
   ```
   JRE version: Java(TM) SE Runtime Environment (8.0_211-b12) (build 
1.8.0_211-b12)
   # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.211-b12 mixed mode 
bsd-amd64 compressed oops)
   # Problematic frame:
   # C  [librocksdbjni2954960755376440018.jnilib+0x602b8]  
rocksdb::GetColumnFamilyID(rocksdb::ColumnFamilyHandle*)+0x8
   
   See full dump at 
[https://the-asf.slack.com/files/U0159PV5Z6U/F0152UAJF0S/hs_err_pid90655.log?origin_team=T4S1WH2J3_channel=D014L2URB6E](url)
   ```



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek commented on pull request #1038: HDDS-3754. Rename framework to common-server

2020-06-10 Thread GitBox


elek commented on pull request #1038:
URL: https://github.com/apache/hadoop-ozone/pull/1038#issuecomment-641935708


   @nandakumar131 Not sure if I understood your proposal. Do you propose 
creating separate projects for separate frameworks/services, like 
`framework-eventqueue` or `framework-dbstore`?
   
   I am fine with that. But the current project (which is used only to collect 
server-side utilities and classes) is more like a `common` for the server side.
   
   But I am also fine with leaving it as is, if you don't think it's 
confusing...



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3662) decouple finalize and destroy pipeline

2020-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3662:
-
Labels: pull-request-available  (was: )

> decouple finalize and destroy pipeline
> --
>
> Key: HDDS-3662
> URL: https://issues.apache.org/jira/browse/HDDS-3662
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Li Cheng
>Assignee: Li Cheng
>Priority: Major
>  Labels: pull-request-available
>
> We have to decouple finalize and destroy pipeline. We should have two 
> separate calls, closePipeline and destroyPipeline.
> Close pipeline should only update the pipeline state; it's the job of the 
> caller to issue close-container commands to all the containers in the 
> pipeline.
> Destroy pipeline should be called from the pipeline scrubber: once a pipeline 
> has spent enough time in the closed state, the scrubber should call destroy 
> pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] timmylicheng opened a new pull request #1049: HDDS-3662 Decouple finalizeAndDestroyPipeline.

2020-06-10 Thread GitBox


timmylicheng opened a new pull request #1049:
URL: https://github.com/apache/hadoop-ozone/pull/1049


   ## What changes were proposed in this pull request?
   Decouple finalizeAndDestroyPipeline.
   
   Close pipeline should only update the pipeline state; it's the job of the 
caller to issue close-container commands to all the containers in the pipeline.
   
   Destroy pipeline should be called from the pipeline scrubber: once a 
pipeline has spent enough time in the closed state, the scrubber should call 
destroy pipeline.
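   
   A self-contained sketch of the decoupled lifecycle (names follow the 
description above, not necessarily the final interface):
   
   ```java
   import java.time.Duration;
   import java.time.Instant;
   import java.util.ArrayList;
   import java.util.List;
   
   // Hypothetical sketch, not the actual SCM classes.
   class PipelineScrubberSketch {
     enum State { OPEN, CLOSED }
   
     static final class Pipeline {
       State state = State.OPEN;
       Instant closedAt;
     }
   
     final List<Pipeline> pipelines = new ArrayList<>();
   
     /** Close only flips the state; closing containers is the caller's job. */
     void closePipeline(Pipeline p) {
       p.state = State.CLOSED;
       p.closedAt = Instant.now();
     }
   
     /** The scrubber destroys pipelines that stayed CLOSED long enough. */
     void scrub(Duration closedTimeout) {
       for (Pipeline p : new ArrayList<>(pipelines)) {
         if (p.state == State.CLOSED
             && p.closedAt.plus(closedTimeout).isBefore(Instant.now())) {
           destroyPipeline(p);
         }
       }
     }
   
     void destroyPipeline(Pipeline p) {
       pipelines.remove(p); // release the pipeline's resources
     }
   }
   ```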
   
   
   ## What is the link to the Apache JIRA
   https://issues.apache.org/jira/browse/HDDS-3662
   
   
   ## How was this patch tested?
   UT
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3764) Spark is failing with no such method exception

2020-06-10 Thread Marton Elek (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130482#comment-17130482
 ] 

Marton Elek commented on HDDS-3764:
---

Sure, I merged it. Yes, it seems to be a duplicate (Spark uses Hadoop 2.7.4 by 
default).

Will retest Spark with your patch and close this issue.

> Spark is failing with no such method exception
> --
>
> Key: HDDS-3764
> URL: https://issues.apache.org/jira/browse/HDDS-3764
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: build
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Critical
>
> When I tested the existing documentation (Spark + Kubernetes) I found that 
> Spark (default distribution with Hadoop 2.7) is failing with NoSuchMethod 
> exception:
> {code:java}
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: 
> Lost task 0.3 in stage 0.0 (TID 3, 10.42.0.169, executor 1): 
> java.lang.NoSuchMethodError: org.apache.hadoop.util.Time.monotonicNowNanos()J
>   at 
> org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandAsync(XceiverClientGrpc.java:437)
>  {code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3715) Improvement for OzoneFS client to work with Hadoop 2.7.3

2020-06-10 Thread Marton Elek (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Elek resolved HDDS-3715.
---
   Fix Version/s: 0.6.0
Target Version/s: 0.6.0
  Resolution: Fixed

> Improvement for OzoneFS client to work with Hadoop 2.7.3
> 
>
> Key: HDDS-3715
> URL: https://issues.apache.org/jira/browse/HDDS-3715
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.6.0
>
>
> The background: the Hadoop production clusters we use internally are based 
> on Hadoop 2.7.3. Currently we maintain an internal OzoneFS client for Hadoop 
> 2.7.3. With HDDS-3627 merged, it is the right time to use the community 
> version instead of an internal version. 
> The improvements concern some functions newly added in Hadoop 2.7.7 which 
> are not available in 2.7.3, so the code is refactored to use an older 
> equivalent with the same functionality. 
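
An illustrative example of such a refactoring (my sketch, not necessarily the 
actual change; the assumption is based on Hadoop's Time class, where 
monotonicNowNanos() wraps System.nanoTime()):

{code:java}
// Sketch of a 2.7.3-compatible replacement for a helper that only exists
// in newer Hadoop releases (e.g. Time.monotonicNowNanos, which is the
// NoSuchMethodError seen in HDDS-3764 on Hadoop 2.7 classpaths).
final class CompatSketch {
  static long monotonicNowNanos() {
    return System.nanoTime();
  }
}
{code}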



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3715) Improvement for OzoneFS client to work with Hadoop 2.7.3

2020-06-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-3715:
-
Labels: pull-request-available  (was: )

> Improvement for OzoneFS client to work with Hadoop 2.7.3
> 
>
> Key: HDDS-3715
> URL: https://issues.apache.org/jira/browse/HDDS-3715
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Sammi Chen
>Assignee: Sammi Chen
>Priority: Major
>  Labels: pull-request-available
>
> The background: the Hadoop production clusters we use internally are based 
> on Hadoop 2.7.3. Currently we maintain an internal OzoneFS client for Hadoop 
> 2.7.3. With HDDS-3627 merged, it is the right time to use the community 
> version instead of an internal version. 
> The improvements concern some functions newly added in Hadoop 2.7.7 which 
> are not available in 2.7.3, so the code is refactored to use an older 
> equivalent with the same functionality. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek merged pull request #1036: HDDS-3715. Improvement for OzoneFS client to work with Hadoop 2.7.3.

2020-06-10 Thread GitBox


elek merged pull request #1036:
URL: https://github.com/apache/hadoop-ozone/pull/1036


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] elek merged pull request #945: HDDS-3622. Implement rocksdb tool to parse scm db

2020-06-10 Thread GitBox


elek merged pull request #945:
URL: https://github.com/apache/hadoop-ozone/pull/945


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3755) Storage-class support for Ozone

2020-06-10 Thread maobaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130397#comment-17130397
 ] 

maobaolong commented on HDDS-3755:
--

[~arp] Yeah, I agree with decoupling EC from this feature.

[~elek] I think a storage-class can be a suite of parameters describing how to 
write a file; for example, it can contain the replication factor and the 
replication type.

Transfer rule: we can define rules describing when (on a condition or a timer) 
to invoke a conversion action. For example, convert files from one 
storage-class to another once a file has lived for 7 days.

Phase: files can go through many phases, and we can define rules that connect 
the phases: file --> phase1(storageClassA) --rule1--> phase2(storageClassB) 
--rule2--> phase3(storageClassC) --rule3--> phase4(deleted)

Transfer chain: the whole chain from the start phase to the end phase (see the 
sketch below).

Please correct me if I am wrong.
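
A purely illustrative sketch of these concepts (not a proposed API; all names 
here are made up for the example):

{code:java}
// Storage-class as a bundle of write parameters, plus a transfer rule
// that moves data to another storage-class after a given age.
final class StorageClassSketch {
  enum ReplicationType { RATIS, STAND_ALONE }

  static final class StorageClass {
    final String name;
    final ReplicationType replicationType;
    final int replicationFactor;

    StorageClass(String name, ReplicationType type, int factor) {
      this.name = name;
      this.replicationType = type;
      this.replicationFactor = factor;
    }
  }

  // Transfer rule: after 'afterDays' days, convert to 'target'.
  static final class TransferRule {
    final StorageClass target;
    final int afterDays;

    TransferRule(StorageClass target, int afterDays) {
      this.target = target;
      this.afterDays = afterDays;
    }
  }
}
{code}

For example, a chain STANDARD (RATIS/THREE) --after 7 days--> REDUCED 
(RATIS/ONE) --after 30 days--> deleted would be expressed as two TransferRule 
instances.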

> Storage-class support for Ozone
> ---
>
> Key: HDDS-3755
> URL: https://issues.apache.org/jira/browse/HDDS-3755
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> Use a storage-class as an abstraction which combines replication 
> configuration, container states and transitions. 
> See this thread for the detailed design doc:
>  
> [https://lists.apache.org/thread.html/r1e2a5d5581abe9dd09834305ca65a6807f37bd229a07b8b31bda32ad%40%3Cozone-dev.hadoop.apache.org%3E]
> which is also uploaded to here: 
> https://hackmd.io/4kxufJBOQNaKn7PKFK_6OQ?edit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-3755) Storage-class support for Ozone

2020-06-10 Thread maobaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17128866#comment-17128866
 ] 

maobaolong edited comment on HDDS-3755 at 6/10/20, 8:06 AM:


[~elek] Thanks for bringing this idea to Ozone; after a concrete discussion 
with you, I see a great feature in it. Now I would like to do some work to get 
it started.

In my view, we should define some concepts clearly, for example (feel free to 
add or remove items):
- Storage-class
- Transfer rule
- Phase
- Transfer chain

We can also define the boundaries of each development phase, then create 
concrete sub-tasks that contributors like me can pick up. For example:
- Use storage-class to combine a suite of parameters about how to write and 
store data 
- Support for replication factor TWO

Here is my draft of the storage-as-a-framework and Ozone storage-transfer 
related docs; feel free to discuss with me or leave comments in the docs. I 
think that after several rounds of discussion we can reach consensus.

https://docs.google.com/document/d/1gfjiKEpfyEfqXI3aT12dMibc0i14YYv5o09lKm_F-ZA/edit?usp=sharing


was (Author: maobaolong):
[~elek] Thanks for bringing this idea to Ozone; after a concrete discussion 
with you, I see a great feature in it. Now I would like to do some work to get 
it started.

In my view, we should define some concepts clearly, for example (feel free to 
add or remove items):
- Storage-class
- Transfer rule
- Phase
- Transfer chain

We can also define the boundaries of each development phase, then create 
concrete sub-tasks that contributors like me can pick up. For example:
- Use storage-class to combine a suite of parameters about how to write and 
store data 
- Support for replication factor TWO

Here is my draft of the storage-as-a-framework and Ozone storage-transfer 
related docs; feel free to discuss with me or leave comments in the docs. I 
think that after several rounds of discussion we can reach consensus.

> Storage-class support for Ozone
> ---
>
> Key: HDDS-3755
> URL: https://issues.apache.org/jira/browse/HDDS-3755
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>
> Use a storage-class as an abstraction which combines replication 
> configuration, container states and transitions. 
> See this thread for the detailed design doc:
>  
> [https://lists.apache.org/thread.html/r1e2a5d5581abe9dd09834305ca65a6807f37bd229a07b8b31bda32ad%40%3Cozone-dev.hadoop.apache.org%3E]
> which is also uploaded to here: 
> https://hackmd.io/4kxufJBOQNaKn7PKFK_6OQ?edit



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-3771) Block when using ’ozone fs -cat o3fs://xxxxx.xxxx/xxx‘

2020-06-10 Thread mingchao zhao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mingchao zhao resolved HDDS-3771.
-
Resolution: Cannot Reproduce

> Block when using ’ozone fs -cat o3fs://x./xxx‘
> --
>
> Key: HDDS-3771
> URL: https://issues.apache.org/jira/browse/HDDS-3771
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.6.0
>Reporter: mingchao zhao
>Priority: Major
> Attachments: image-2020-06-10-11-48-12-299.png
>
>
> It blocks when I use 'ozone fs -cat o3fs://x./xxx', and no logs are seen 
> in the background. It works normally when I use 'ozone sh key cat 
> /x//xxx'.
> !image-2020-06-10-11-48-12-299.png|width=919,height=88!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3771) Block when using ’ozone fs -cat o3fs://xxxxx.xxxx/xxx‘

2020-06-10 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130368#comment-17130368
 ] 

mingchao zhao commented on HDDS-3771:
-

This problem is no longer present in the latest version of the code. Closing 
this.

> Block when using ’ozone fs -cat o3fs://x./xxx‘
> --
>
> Key: HDDS-3771
> URL: https://issues.apache.org/jira/browse/HDDS-3771
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.6.0
>Reporter: mingchao zhao
>Priority: Major
> Attachments: image-2020-06-10-11-48-12-299.png
>
>
> It blocks when I use 'ozone fs -cat o3fs://x./xxx', and no logs are seen 
> in the background. It works normally when I use 'ozone sh key cat 
> /x//xxx'.
> !image-2020-06-10-11-48-12-299.png|width=919,height=88!
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-3683) Ozone fuse support

2020-06-10 Thread maobaolong (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17130367#comment-17130367
 ] 

maobaolong commented on HDDS-3683:
--

[~msingh] [~aryangupta1998] [~nanda] Thanks for your help in getting dfs-fuse 
running on my test cluster. I ran a bundle of tests for `dfs-fuse` and 
`hcfsfuse` to measure the read performance; the following are some screenshots 
of the test results.

The file is 1.6 GB.

 !screenshot-1.png! 

 !screenshot-2.png! 

 !screenshot-3.png! 

From the above screenshots, I think the write and warm-read performance is 
almost the same; for the cold read, hcfsfuse is better than dfs-fuse.

>  Ozone fuse support
> ---
>
> Key: HDDS-3683
> URL: https://issues.apache.org/jira/browse/HDDS-3683
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> https://github.com/opendataio/hcfsfuse



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3725) Ozone sh volume client support quota option.

2020-06-10 Thread Sammi Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sammi Chen updated HDDS-3725:
-
Target Version/s: 0.7.0

> Ozone sh volume client support quota option.
> 
>
> Key: HDDS-3725
> URL: https://issues.apache.org/jira/browse/HDDS-3725
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Simon Su
>Assignee: Simon Su
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 49h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3683) Ozone fuse support

2020-06-10 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3683:
-
Attachment: screenshot-2.png

>  Ozone fuse support
> ---
>
> Key: HDDS-3683
> URL: https://issues.apache.org/jira/browse/HDDS-3683
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png
>
>
> https://github.com/opendataio/hcfsfuse



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-3683) Ozone fuse support

2020-06-10 Thread maobaolong (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maobaolong updated HDDS-3683:
-
Attachment: screenshot-3.png

>  Ozone fuse support
> ---
>
> Key: HDDS-3683
> URL: https://issues.apache.org/jira/browse/HDDS-3683
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Affects Versions: 0.6.0
>Reporter: maobaolong
>Assignee: maobaolong
>Priority: Major
> Attachments: screenshot-1.png, screenshot-2.png, screenshot-3.png
>
>
> https://github.com/opendataio/hcfsfuse



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org


