[jira] [Created] (HDFS-12340) Ozone: C/C++ implementation of ozone client using curl
Shashikant Banerjee created HDFS-12340: -- Summary: Ozone: C/C++ implementation of ozone client using curl Key: HDFS-12340 URL: https://issues.apache.org/jira/browse/HDFS-12340 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 This Jira introduces an implementation of the ozone client in C/C++ using the curl library. All these calls will use the HTTP protocol and will require libcurl. The libcurl APIs are documented here: https://curl.haxx.se/libcurl/ Additional details will be posted along with the patches. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-11613) Ozone: Cleanup findbugs issues
[ https://issues.apache.org/jira/browse/HDFS-11613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDFS-11613. Resolution: Not A Problem > Ozone: Cleanup findbugs issues > -- > > Key: HDFS-11613 > URL: https://issues.apache.org/jira/browse/HDFS-11613 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Anu Engineer >Assignee: Shashikant Banerjee >Priority: Blocker > Labels: ozoneMerge > > Some of the ozone checkins happened before Findbugs started running on test > files. This will cause issues when we attempt to merge with trunk. This jira > tracks cleaning up all Findbugs issues under ozone.
[jira] [Created] (HDFS-12594) SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC response limit
Shashikant Banerjee created HDFS-12594: -- Summary: SnapshotDiff - snapshotDiff fails if the snapshotDiff report exceeds the RPC response limit Key: HDFS-12594 URL: https://issues.apache.org/jira/browse/HDFS-12594 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee The snapshotDiff command fails if the snapshotDiff report size is larger than the configured value of ipc.maximum.response.length, which is 128 MB by default. In the worst case, with all rename ops in snapshots, each with source and target names of MAX_PATH_LEN (8k characters), this would allow at most 8192 renames. SnapshotDiff is currently used by distcp to optimize copy operations, and if the diff report exceeds the limit, it fails with the exception below: Test set: org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport --- Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 112.095 sec <<< FAILURE! - in org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport testDiffReportWithMillionFiles(org.apache.hadoop.hdfs.server.namenode.snapshot.TestSnapshotDiffReport) Time elapsed: 111.906 sec <<< ERROR! java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "hw15685.local/10.200.5.230"; destination host is: "localhost":59808; Attached is the proposal for the changes required.
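The 8192-rename figure above follows from simple arithmetic on the default response limit. A back-of-envelope sketch (illustrative only, not the actual HDFS code; the class and constant names are assumptions):

```java
// Back-of-envelope check: how many rename diff entries fit in the default
// 128 MB RPC response if each entry carries a source and a target path of
// MAX_PATH_LEN characters. Purely illustrative; not HDFS code.
public class SnapshotDiffEnvelope {
    // Default of ipc.maximum.response.length: 128 MB.
    static final long MAX_RPC_RESPONSE = 128L * 1024 * 1024;
    // Maximum path length assumed above: 8k characters.
    static final long MAX_PATH_LEN = 8L * 1024;

    // Each rename entry holds two paths (source + target).
    static long maxRenameEntries() {
        return MAX_RPC_RESPONSE / (2 * MAX_PATH_LEN);
    }

    public static void main(String[] args) {
        System.out.println(maxRenameEntries()); // prints 8192
    }
}
```

This matches the worst-case estimate in the description: 134217728 / (2 × 8192) = 8192 entries.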
[jira] [Created] (HDFS-12890) XceiverClient should have upper bound on async requests
Shashikant Banerjee created HDFS-12890: -- Summary: XceiverClient should have upper bound on async requests Key: HDFS-12890 URL: https://issues.apache.org/jira/browse/HDFS-12890 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS-7240 Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 XceiverClient-ratis maintains an upper bound on the number of outstanding async requests. XceiverClient should also impose an upper bound on the number of outstanding async write requests received from the client.
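One common way to impose such an upper bound is a semaphore that callers must acquire before submitting a request and that each completed request releases. A minimal sketch, assuming hypothetical class and method names (this is not the actual XceiverClient API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Hypothetical sketch: bound the number of outstanding async requests with a
// semaphore. A caller blocks once maxOutstanding requests are in flight.
public class BoundedAsyncClient {
    private final Semaphore inFlight;
    private final ExecutorService executor = Executors.newFixedThreadPool(4);

    public BoundedAsyncClient(int maxOutstanding) {
        this.inFlight = new Semaphore(maxOutstanding);
    }

    public CompletableFuture<Void> sendAsync(Runnable request) {
        inFlight.acquireUninterruptibly(); // blocks when the bound is reached
        return CompletableFuture.runAsync(request, executor)
                .whenComplete((r, t) -> inFlight.release()); // free the slot on completion
    }

    public int availablePermits() {
        return inFlight.availablePermits();
    }

    public void shutdown() {
        executor.shutdown();
    }
}
```

The same effect can also be achieved with a bounded work queue; the semaphore version has the advantage of back-pressuring the caller rather than rejecting requests.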
[jira] [Created] (HDFS-12794) Ozone: Parallelize ChunkOutputStream Writes to container
Shashikant Banerjee created HDFS-12794: -- Summary: Ozone: Parallelize ChunkOutputStream Writes to container Key: HDFS-12794 URL: https://issues.apache.org/jira/browse/HDFS-12794 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 The ChunkOutputStream writes are synchronous in nature. Once one chunk of data gets written, the next chunk write is blocked until the previous chunk is written to the container. The ChunkOutputStream writes should be made async, and close() on the OutputStream should ensure flushing of all dirty buffers to the container.
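The async-write-with-flush-on-close pattern described above can be sketched as follows. This is an illustrative toy, not the real ChunkOutputStream; the container is stood in for by an in-memory buffer:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: chunk writes are issued asynchronously, and close()
// waits for every pending chunk write before returning, so no dirty buffer
// is left unflushed. Not the actual Ozone implementation.
public class AsyncChunkWriter {
    private final List<CompletableFuture<Void>> pending = new ArrayList<>();
    private final StringBuilder container = new StringBuilder(); // stands in for the container

    public void writeChunk(String chunk) {
        // Submit the write without waiting for the previous chunk to finish.
        pending.add(CompletableFuture.runAsync(() -> {
            synchronized (container) {
                container.append(chunk);
            }
        }));
    }

    // close() acts as the flush: block until all outstanding writes complete.
    public void close() {
        CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
    }

    public int committedLength() {
        synchronized (container) {
            return container.length();
        }
    }
}
```

Note that with fully parallel chunk writes, chunk ordering is no longer implicit; a real implementation would carry chunk offsets so the container can place data correctly regardless of completion order.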
[jira] [Created] (HDFS-12796) SCM should not start if Cluster Version file does not exist
Shashikant Banerjee created HDFS-12796: -- Summary: SCM should not start if Cluster Version file does not exist Key: HDFS-12796 URL: https://issues.apache.org/jira/browse/HDFS-12796 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 We have the SCM --init command, which persists the cluster version info in the version file. If SCM gets started without SCM --init having been run even once, it should fail.
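The startup guard amounts to checking for the version file before bringing SCM up. A minimal sketch, assuming a hypothetical file path and message (not the actual SCM code):

```java
import java.io.File;

// Minimal sketch of the startup guard described above. The path and the
// error message are assumptions for illustration only.
public class ScmStartupCheck {
    // Returns true only if "scm --init" has previously persisted the version file.
    static boolean canStart(File versionFile) {
        return versionFile.exists() && versionFile.isFile();
    }

    public static void main(String[] args) {
        File versionFile = new File(args.length > 0 ? args[0] : "scm/current/VERSION");
        if (!canStart(versionFile)) {
            System.err.println("SCM is not initialized. Run 'scm --init' first.");
            System.exit(1); // refuse to start without a cluster version file
        }
    }
}
```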
[jira] [Created] (HDFS-12898) Ozone: TestSCMCli#testHelp and TestSCMCli#testListContainerCommand fail consistently
Shashikant Banerjee created HDFS-12898: -- Summary: Ozone: TestSCMCli#testHelp and TestSCMCli#testListContainerCommand fail consistently Key: HDFS-12898 URL: https://issues.apache.org/jira/browse/HDFS-12898 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee The help message for SCMCLI commands has been modified with HDFS-12588. The SCMCLI tests need to be modified accordingly.
[jira] [Created] (HDFS-12948) DiskBalancer report command top option should only take positive numeric values
Shashikant Banerjee created HDFS-12948: -- Summary: DiskBalancer report command top option should only take positive numeric values Key: HDFS-12948 URL: https://issues.apache.org/jira/browse/HDFS-12948 Project: Hadoop HDFS Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Currently, the diskBalancer report command's "top" option accepts values like "AAA" as well as negative/decimal values and still produces output. These invalid values should not be processed, and an error/warning should be given. For example: $ hdfs diskbalancer -report -top -100 17/12/19 15:07:01 INFO command.Command: Processing report command 17/12/19 15:07:02 INFO balancer.KeyManager: Block token params received from NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec 17/12/19 15:07:02 INFO block.BlockTokenSecretManager: Setting block keys 17/12/19 15:07:02 INFO balancer.KeyManager: Update block keys every 2hrs, 30mins, 0sec 17/12/19 15:07:02 INFO command.Command: Reporting top -100 DataNode(s) benefiting from running DiskBalancer. Processing report command Reporting top -100 DataNode(s) benefiting from running DiskBalancer.
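The fix boils down to validating the option as a positive integer before processing it. A hypothetical sketch of such a check (not the actual DiskBalancer code):

```java
// Hypothetical validation for the "top" argument: only positive integers
// pass; non-numeric ("AAA"), negative ("-100") and decimal ("1.5") inputs
// are rejected. Illustrative only, not the actual DiskBalancer code.
public class TopOptionValidator {
    static boolean isValidTop(String value) {
        try {
            // Integer.parseInt rejects non-numeric and decimal strings with
            // a NumberFormatException; the > 0 check rejects zero/negatives.
            return Integer.parseInt(value) > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

The report command could then print a warning and fall back to a default instead of echoing "Reporting top -100 DataNode(s)".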
[jira] [Created] (HDFS-12947) Limit the number of Snapshots allowed to be created for a Snapshottable Directory
Shashikant Banerjee created HDFS-12947: -- Summary: Limit the number of Snapshots allowed to be created for a Snapshottable Directory Key: HDFS-12947 URL: https://issues.apache.org/jira/browse/HDFS-12947 Project: Hadoop HDFS Issue Type: Improvement Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Currently, a snapshottable directory can accommodate 65,536 snapshots. In case a directory has a large number of snapshots, deletion of any of the earlier snapshots takes a lot of time, which might lead to a namenode crash (HDFS-11225). This jira is introduced to limit the number of snapshots under a snapshottable directory to a reasonable value (say 10) which can be overridden.
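A configurable per-directory cap of that kind can be sketched as below. Names and the default are assumptions taken from the description, not the actual HDFS implementation:

```java
// Sketch of a configurable per-directory snapshot cap. The class name, the
// default of 10, and the exception type are illustrative assumptions.
public class SnapshotLimiter {
    static final int DEFAULT_MAX_SNAPSHOTS = 10; // overridable default, per the proposal above
    private final int maxSnapshots;
    private int snapshotCount = 0;

    public SnapshotLimiter(int maxSnapshots) {
        this.maxSnapshots = maxSnapshots;
    }

    // Reject snapshot creation once the configured limit is reached.
    public void createSnapshot() {
        if (snapshotCount >= maxSnapshots) {
            throw new IllegalStateException(
                "Snapshot limit of " + maxSnapshots + " reached for this directory");
        }
        snapshotCount++;
    }

    public int count() {
        return snapshotCount;
    }
}
```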
[jira] [Created] (HDDS-56) TestContainerSupervisor#testAddingNewPoolWorks and TestContainerSupervisor#testDetectOverReplica fail consistently
Shashikant Banerjee created HDDS-56: --- Summary: TestContainerSupervisor#testAddingNewPoolWorks and TestContainerSupervisor#testDetectOverReplica fail consistently Key: HDDS-56 URL: https://issues.apache.org/jira/browse/HDDS-56 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-55) Fix the findBug issue SCMDatanodeProtocolServer#updateContainerReportMetrics
Shashikant Banerjee created HDDS-55: --- Summary: Fix the findBug issue SCMDatanodeProtocolServer#updateContainerReportMetrics Key: HDDS-55 URL: https://issues.apache.org/jira/browse/HDDS-55 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee The findbugs issue is reported because we are using synchronized on a ConcurrentHashMap.
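Synchronizing externally on a ConcurrentHashMap defeats its purpose, which is why findbugs flags it; the map's own atomic operations give the same safety lock-free. A sketch of the pattern (class and field names are illustrative, not the actual SCMDatanodeProtocolServer code):

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustrates the pattern behind the findbugs warning. Instead of
// synchronized (map) { get, add, put }, use the map's own atomic
// read-modify-write operation. Names are hypothetical.
public class ContainerReportMetrics {
    private final ConcurrentHashMap<String, Long> reportsPerNode = new ConcurrentHashMap<>();

    // merge() performs the read-modify-write atomically, no external lock needed.
    public long recordReport(String datanode, long count) {
        return reportsPerNode.merge(datanode, count, Long::sum);
    }

    public long get(String datanode) {
        return reportsPerNode.getOrDefault(datanode, 0L);
    }
}
```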
[jira] [Created] (HDDS-57) TestContainerCloser#testRepeatedClose and TestContainerCloser#testCleanupThreadRuns fail consistently
Shashikant Banerjee created HDDS-57: --- Summary: TestContainerCloser#testRepeatedClose and TestContainerCloser#testCleanupThreadRuns fail consistently Key: HDDS-57 URL: https://issues.apache.org/jira/browse/HDDS-57 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Resolved] (HDDS-12) Modify containerProtocol Calls for read, write and delete chunk to datanode to use a "long" blockId key
[ https://issues.apache.org/jira/browse/HDDS-12?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-12. - Resolution: Fixed This is already taken care of as part of HDDS-1. > Modify containerProtocol Calls for read, write and delete chunk to datanode > to use a "long" blockId key > --- > > Key: HDDS-12 > URL: https://issues.apache.org/jira/browse/HDDS-12 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > HDFS-13437 changes the blockId in SCM to a long value. With respect to this, > the container protocol protobuf messages and handlers need to change to use a > long blockId value rather than > the string used currently. This Jira proposes to address this.
[jira] [Resolved] (HDFS-13437) Ozone : Make BlockId in SCM a long value
[ https://issues.apache.org/jira/browse/HDFS-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDFS-13437. Resolution: Fixed This is already taken care of as part of HDDS-1. > Ozone : Make BlockId in SCM a long value > > > Key: HDFS-13437 > URL: https://issues.apache.org/jira/browse/HDFS-13437 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ozone >Affects Versions: HDFS-7240 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Minor > Fix For: HDFS-7240 > > Attachments: HDFS-13437.000.patch > > > Currently, when a block is allocated inside SCM, it is assigned a UUID > string value. This should be made a Long value.
[jira] [Created] (HDDS-38) Add SCMNodeStorage map in SCM class to store storage statistics per Datanode
Shashikant Banerjee created HDDS-38: --- Summary: Add SCMNodeStorage map in SCM class to store storage statistics per Datanode Key: HDDS-38 URL: https://issues.apache.org/jira/browse/HDDS-38 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 Currently, the storage stats per Datanode are maintained inside scmNodeManager. This will move the scmNodeStats for storage outside SCMNodeManager to simplify refactoring.
[jira] [Created] (HDDS-86) Add functionality in Node2ContainerMap to list missing open containers
Shashikant Banerjee created HDDS-86: --- Summary: Add functionality in Node2ContainerMap to list missing open containers Key: HDDS-86 URL: https://issues.apache.org/jira/browse/HDDS-86 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee In the event of datanode loss/disk loss, we need to figure out the open containers residing on the datanode and issue a close on those containers on other datanodes as part of pipeline recovery. This Jira aims to address this.
[jira] [Created] (HDDS-85) Send Container State while sending the container report from Datanode to SCM
Shashikant Banerjee created HDDS-85: --- Summary: Send Container State while sending the container report from Datanode to SCM Key: HDDS-85 URL: https://issues.apache.org/jira/browse/HDDS-85 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee While sending the container report, the container lifecycle state info is not sent. This information will be required in the event of a datanode loss/disk loss to figure out the open containers which need to be closed.
[jira] [Created] (HDFS-13581) On clicking DN UI logs link it uses http protocol for Wire encrypted cluster
Shashikant Banerjee created HDFS-13581: -- Summary: On clicking DN UI logs link it uses http protocol for Wire encrypted cluster Key: HDFS-13581 URL: https://issues.apache.org/jira/browse/HDFS-13581 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee On clicking the DN UI logs link, it uses the http protocol on a wire-encrypted cluster. When the link's address is changed to https, it throws the proper expected error message.
[jira] [Created] (HDDS-83) Rename StorageLocationReport class to VolumeInfo
Shashikant Banerjee created HDDS-83: --- Summary: Rename StorageLocationReport class to VolumeInfo Key: HDDS-83 URL: https://issues.apache.org/jira/browse/HDDS-83 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Created] (HDDS-87) Make storageLocation field in StorageReport protoBuf message optional
Shashikant Banerjee created HDDS-87: --- Summary: Make storageLocation field in StorageReport protoBuf message optional Key: HDDS-87 URL: https://issues.apache.org/jira/browse/HDDS-87 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee
[jira] [Resolved] (HDDS-83) Rename StorageLocationReport class to VolumeInfo
[ https://issues.apache.org/jira/browse/HDDS-83?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-83. - Resolution: Not A Problem Fix Version/s: 0.2.1 Resolving this as this change is not required. > Rename StorageLocationReport class to VolumeInfo > > > Key: HDDS-83 > URL: https://issues.apache.org/jira/browse/HDDS-83 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Minor > Fix For: 0.2.1 > >
[jira] [Created] (HDDS-78) Add per volume level storage stats in SCM.
Shashikant Banerjee created HDDS-78: --- Summary: Add per volume level storage stats in SCM. Key: HDDS-78 URL: https://issues.apache.org/jira/browse/HDDS-78 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee HDDS-38 adds storage statistics per Datanode in SCM. This Jira aims to add per-volume, per-Datanode storage stats in SCM. These will be useful for figuring out failed volumes, out-of-space disks, and over- and under-utilized disks, which will be used for balancing the data within a datanode across multiple disks as well as across the cluster.
[jira] [Created] (HDFS-13574) Modify SCMStorageReportProto to include the data dir paths as well as the StorageType info
Shashikant Banerjee created HDFS-13574: -- Summary: Modify SCMStorageReportProto to include the data dir paths as well as the StorageType info Key: HDFS-13574 URL: https://issues.apache.org/jira/browse/HDFS-13574 Project: Hadoop HDFS Issue Type: Bug Components: scm Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Currently, SCMStorageReport contains the storageUUID, which is sent across to SCM for maintaining storage report info. This Jira aims to include the data dir paths of the actual disks as well as the storage type info for each volume on the datanode in the report sent to SCM.
[jira] [Created] (HDDS-108) Update Node2ContainerMap while processing container reports
Shashikant Banerjee created HDDS-108: Summary: Update Node2ContainerMap while processing container reports Key: HDDS-108 URL: https://issues.apache.org/jira/browse/HDDS-108 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When the container report comes, the Node2ContainerMap should be updated via SCMContainerManager.
[jira] [Created] (HDDS-161) Add functionality to queue ContainerClose command from SCM Heartbeat Response to Ratis
Shashikant Banerjee created HDDS-161: Summary: Add functionality to queue ContainerClose command from SCM Heartbeat Response to Ratis Key: HDDS-161 URL: https://issues.apache.org/jira/browse/HDDS-161 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode, SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a container needs to be closed at the Datanode, SCM will queue a close command encoded as part of the heartbeat response to the Datanode. This command will be picked up from the response at the Datanode and then submitted to the XceiverServer to process the close command.
[jira] [Created] (HDDS-162) DataNode Container reads/Writes should be disallowed for open containers if the replication type mismatches
Shashikant Banerjee created HDDS-162: Summary: DataNode Container reads/Writes should be disallowed for open containers if the replication type mismatches Key: HDDS-162 URL: https://issues.apache.org/jira/browse/HDDS-162 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode, SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 In Ozone, a container can be created via the Ratis or Standalone protocol. However, reads/writes on the containers on datanodes can be done through either of these if the container location is known. A case may arise where data is being written into a container via Ratis, i.e. the container is in open state on the Datanodes, and read via Standalone. This should not be allowed, as a read from the follower Datanodes in Ratis via the Standalone protocol might return stale data. Once the container is closed on the datanode, data can be read via either of the protocols.
[jira] [Created] (HDFS-13679) Fix Typo in javadoc for ScanInfoPerBlockPool#addAll
Shashikant Banerjee created HDFS-13679: -- Summary: Fix Typo in javadoc for ScanInfoPerBlockPool#addAll Key: HDFS-13679 URL: https://issues.apache.org/jira/browse/HDFS-13679 Project: Hadoop HDFS Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Resolved] (HDDS-162) DataNode Container reads/Writes should be disallowed for open containers if the replication type mismatches
[ https://issues.apache.org/jira/browse/HDDS-162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-162. -- Resolution: Not A Problem > DataNode Container reads/Writes should be disallowed for open containers if > the replication type mismatches > --- > > Key: HDDS-162 > URL: https://issues.apache.org/jira/browse/HDDS-162 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode, SCM >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > In Ozone, a container can be created via the Ratis or Standalone protocol. However, > reads/writes on the containers on datanodes can be done through either > of these if the container location is known. A case may arise where data is > being written into a container via Ratis, i.e. the container is in open state on > the Datanodes, and read via Standalone. This should not be allowed, as a > read from the follower Datanodes in Ratis via the Standalone protocol might > return stale data. Once the container is closed on the datanode, > data can be read via either of the protocols.
[jira] [Created] (HDDS-132) Update SCMNodeStorageStatMap while processing Node Report
Shashikant Banerjee created HDDS-132: Summary: Update SCMNodeStorageStatMap while processing Node Report Key: HDDS-132 URL: https://issues.apache.org/jira/browse/HDDS-132 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When the node report is received at SCM, SCMNodeStorageStatMap needs to get updated. In the event of a node/Volume failure, this Map needs to be updated as well.
[jira] [Created] (HDDS-144) Fix TestEndPoint#testHeartbeat and TestEndPoint#testRegister
Shashikant Banerjee created HDDS-144: Summary: Fix TestEndPoint#testHeartbeat and TestEndPoint#testRegister Key: HDDS-144 URL: https://issues.apache.org/jira/browse/HDDS-144 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Datanode, SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 TestEndPoint#testHeartbeat fails with the following exception {code:java} 2018-06-02 01:37:18,338 WARN ipc.Server (Server.java:logException(2724)) - IPC Server handler 0 on 63450, call Call#10 Retry#0 org.apache.hadoop.ozone.protocol.StorageContainerDatanodeProtocol.sendHeartbeat from 127.0.0.1:63469 com.google.protobuf.UninitializedMessageException: Message missing required fields: datanodeUUID at com.google.protobuf.AbstractMessage$Builder.newUninitializedMessageException(AbstractMessage.java:770) at org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos$SCMHeartbeatResponseProto$Builder.build(StorageContainerDatanodeProtocolProtos.java:5412) at org.apache.hadoop.ozone.container.common.ScmTestMock.sendHeartbeat(ScmTestMock.java:182) at org.apache.hadoop.ozone.protocolPB.StorageContainerDatanodeProtocolServerSideTranslatorPB.sendHeartbeat(StorageContainerDatanodeProtocolServerSideTranslatorPB.java:90) at org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos$StorageContainerDatanodeProtocolService$2.callBlockingMethod(StorageContainerDatanodeProtocolProtos.java:18059) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524){code} TestEndPoint#testRegister fails with an assertion error.
[jira] [Created] (HDDS-141) Remove PipeLine Class from SCM and move the data field in the Pipeline to ContainerInfo
Shashikant Banerjee created HDDS-141: Summary: Remove PipeLine Class from SCM and move the data field in the Pipeline to ContainerInfo Key: HDDS-141 URL: https://issues.apache.org/jira/browse/HDDS-141 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Affects Versions: 0.2.1 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 The Pipeline class currently differs from the PipelineChannel by the data field; this field was introduced with HDFS-8 to maintain per-container local data. However, this data field can be moved to the ContainerInfo class, and then the PipelineChannel can be used interchangeably with Pipeline everywhere. This will help make the code cleaner.
[jira] [Created] (HDDS-127) Add CloseOpenPipelines and CloseContainerEventHandler in SCM
Shashikant Banerjee created HDDS-127: Summary: Add CloseOpenPipelines and CloseContainerEventHandler in SCM Key: HDDS-127 URL: https://issues.apache.org/jira/browse/HDDS-127 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a node fails or a node is out of space, all the pipelines which have this particular datanode id need to be removed from the active pipelines list. Moreover, all open containers residing on the datanode need to be closed on all the other datanodes, and the state in the SCM container state manager needs to be updated to maintain consistency. This Jira aims to add the required event handlers.
[jira] [Created] (HDDS-131) Replace pipeline info from container info with a pipeline id
Shashikant Banerjee created HDDS-131: Summary: Replace pipeline info from container info with a pipeline id Key: HDDS-131 URL: https://issues.apache.org/jira/browse/HDDS-131 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 Currently, in the containerInfo object, the complete pipeline object is stored. The idea here is to decouple the pipeline info from container info and replace it with a pipeline Id.
[jira] [Created] (HDDS-197) DataNode should return ContainerClosingException/ContainerClosedException (CCE) to client if the container is in Closing/Closed State
Shashikant Banerjee created HDDS-197: Summary: DataNode should return ContainerClosingException/ContainerClosedException (CCE) to client if the container is in Closing/Closed State Key: HDDS-197 URL: https://issues.apache.org/jira/browse/HDDS-197 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client, Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 SCM queues the CloseContainer command to the DataNode over the heartbeat response, which is handled by the Ratis server inside the Datanode. In case the container transitions to CLOSING/CLOSED state while the ozone client is writing data, it should throw ContainerClosingException/ContainerClosedException accordingly. These exceptions will be handled by the client, which will retry to get the last committed block info from the Datanode and update the OzoneMaster.
[jira] [Created] (HDFS-12742) Add support for KSM --expunge command
Shashikant Banerjee created HDFS-12742: -- Summary: Add support for KSM --expunge command Key: HDFS-12742 URL: https://issues.apache.org/jira/browse/HDFS-12742 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 KSM --expunge will delete all the data from the data nodes for all the keys in the KSM db. User will have no control over the deletion. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12740) SCM should support an RPC to share the cluster Id with KSM and DataNodes
Shashikant Banerjee created HDFS-12740: -- Summary: SCM should support an RPC to share the cluster Id with KSM and DataNodes Key: HDFS-12740 URL: https://issues.apache.org/jira/browse/HDFS-12740 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 When the ozone cluster is first created, the SCM --init command will generate a cluster Id as well as an SCM Id and persist them locally. The same cluster Id and SCM Id will be shared with KSM during the KSM initialization and with Datanodes during datanode registration. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12739) Add Support for SCM --init command
Shashikant Banerjee created HDFS-12739: -- Summary: Add Support for SCM --init command Key: HDFS-12739 URL: https://issues.apache.org/jira/browse/HDFS-12739 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS-7240 Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12741) Add support for KSM --createObjectStore command
Shashikant Banerjee created HDFS-12741: -- Summary: Add support for KSM --createObjectStore command Key: HDFS-12741 URL: https://issues.apache.org/jira/browse/HDFS-12741 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Fix For: HDFS-7240 The KSM --createObjectStore command reads the ozone configuration information, creates the KSM version file, and reads the SCM version file from the SCM specified. The SCM version file is stored in the KSM metadata directory, and before communicating with an SCM, KSM verifies that it is communicating with an SCM where the relationship has been established via the createObjectStore command. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12998) SnapshotDiff - Provide an iterator-based listing API for calculating snapshotDiff
Shashikant Banerjee created HDFS-12998: -- Summary: SnapshotDiff - Provide an iterator-based listing API for calculating snapshotDiff Key: HDFS-12998 URL: https://issues.apache.org/jira/browse/HDFS-12998 Project: Hadoop HDFS Issue Type: Improvement Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Currently, SnapshotDiff computation happens over multiple rpc calls to the namenode depending on the number of snapshotDiff entries, where each rpc call returns at most 1000 entries by default. Each "getSnapshotDiffreportListing" call to the namenode returns a partial snapshotDiffreportList, which are all combined and processed at the client side to generate the final snapshotDiffreport. There can be cases where the SnapshotDiffReport can be huge, and in such situations the rpc calls to the namenode should happen on demand at the client side. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
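The on-demand behavior argued for above can be sketched as a paging iterator that issues one listing call per page only when the consumer actually advances. PagedDiffIterator and PageFetcher below are hypothetical names standing in for the getSnapshotDiffReportListing RPC, and entries are modeled as plain strings.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of an iterator that fetches one partial diff report per call, on
// demand, instead of eagerly combining all pages at the client.
class PagedDiffIterator implements Iterator<String> {
    // Stand-in for the RPC that returns at most pageSize entries per call.
    interface PageFetcher { List<String> fetchPage(int startIndex, int pageSize); }

    private final PageFetcher fetcher;
    private final int pageSize;
    private List<String> page = new ArrayList<>();
    private int posInPage = 0;
    private int nextStart = 0;
    private boolean exhausted = false;

    PagedDiffIterator(PageFetcher fetcher, int pageSize) {
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
        if (posInPage < page.size()) return true;
        if (exhausted) return false;
        page = fetcher.fetchPage(nextStart, pageSize);  // one remote call per page
        nextStart += page.size();
        posInPage = 0;
        if (page.size() < pageSize) exhausted = true;   // short page => last page
        return posInPage < page.size();
    }

    @Override
    public String next() { return page.get(posInPage++); }
}
```

A consumer that stops early never pays for the remaining pages, which is exactly the benefit of the iterator-based API over assembling the whole report up front.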
[jira] [Created] (HDFS-13006) Ozone: TestContainerPersistence#testMultipleWriteSingleRead and TestContainerPersistence#testOverWrite fail consistently
Shashikant Banerjee created HDFS-13006: -- Summary: Ozone: TestContainerPersistence#testMultipleWriteSingleRead and TestContainerPersistence#testOverWrite fail consistently Key: HDFS-13006 URL: https://issues.apache.org/jira/browse/HDFS-13006 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS-7240 Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 testMultipleWriteSingleRead: {code} org.junit.ComparisonFailure: Expected :ba7110cc8c721d04fa60639cd065d5cb5d78ffe05b30c8ab05684f63b7ecbb81 Actual :496271a1c82a712c4716b12c96017c97c46d30d825588bc0605d54200dab5c87 at org.junit.Assert.assertEquals(Assert.java:115) at org.junit.Assert.assertEquals(Assert.java:144) at org.apache.hadoop.ozone.container.common.impl.TestContainerPersistence.testMultipleWriteSingleRead(TestContainerPersistence.java:586) {code} testOverWrite : {code} java.lang.AssertionError at org.junit.Assert.fail(Assert.java:86) at org.junit.Assert.assertTrue(Assert.java:41) at org.junit.Assert.assertTrue(Assert.java:52) at org.apache.hadoop.ozone.container.common.impl.TestContainerPersistence.testOverWrite(TestContainerPersistence.java:534) org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13102) Implement SnapshotSkipList class to store Multi level DirectoryDiffs
Shashikant Banerjee created HDFS-13102: -- Summary: Implement SnapshotSkipList class to store Multi level DirectoryDiffs Key: HDFS-13102 URL: https://issues.apache.org/jira/browse/HDFS-13102 Project: Hadoop HDFS Issue Type: Improvement Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee HDFS-11225 explains an issue where deletion of older snapshots can take a very long time in case the number of snapshot diffs is quite large for directories. For any directory under a snapshot, to construct the children list, it needs to combine all the diffs from that particular snapshot to the last snapshotDiff record and reverse-apply them to the current children list of the directory on the live fs. This can take a significant time if the number of snapshot diffs is quite large and the changes per diff are significant. This Jira proposes to store the directory diffs in a SnapshotSkipList, where we store multi-level DirectoryDiffs. At each level, the DirectoryDiff will be the cumulative diff of k snapshot diffs, where k is the level of a node in the list. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
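A toy model of the multi-level idea: if a higher-level node pre-combines k consecutive diffs, a traversal can apply one combined diff instead of k individual ones. Diffs are modeled here as plain sets of changed names, a deliberate simplification of real DirectoryDiffs (which hold created/deleted child lists); all class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: diffs are sets of changed names, and every run of k consecutive
// diffs also gets a pre-combined union, so a traversal can "skip" over k diffs
// at once instead of combining them one by one.
class MultiLevelDiffs {
    private final List<Set<String>> levelZero = new ArrayList<>();
    private final Map<Integer, Set<String>> skipNodes = new HashMap<>(); // start index -> union of k diffs
    private final int k;

    MultiLevelDiffs(int k) { this.k = k; }

    void addDiff(Set<String> diff) {
        levelZero.add(diff);
        if (levelZero.size() % k == 0) {          // a full run of k diffs ended here
            int start = levelZero.size() - k;
            Set<String> union = new HashSet<>();
            for (int i = start; i < levelZero.size(); i++) union.addAll(levelZero.get(i));
            skipNodes.put(start, union);
        }
    }

    // Combine all diffs from `from` to the newest, taking skip nodes when possible.
    Set<String> combineFrom(int from) {
        Set<String> result = new HashSet<>();
        int i = from;
        while (i < levelZero.size()) {
            Set<String> skip = skipNodes.get(i);
            if (skip != null && i + k <= levelZero.size()) { result.addAll(skip); i += k; }
            else { result.addAll(levelZero.get(i)); i++; }
        }
        return result;
    }
}
```

The trade-off is the classic skip-list one: extra space for the pre-combined nodes in exchange for fewer combine steps per traversal.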
[jira] [Created] (HDFS-13142) Define and Implement a DiffList Interface to store and manage SnapshotDiffs
Shashikant Banerjee created HDFS-13142: -- Summary: Define and Implement a DiffList Interface to store and manage SnapshotDiffs Key: HDFS-13142 URL: https://issues.apache.org/jira/browse/HDFS-13142 Project: Hadoop HDFS Issue Type: Improvement Components: snapshots Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee The InodeDiffList class contains a generic List to store snapshotDiffs. The generic List interface is bulky; to store and manage snapshotDiffs, we need only a few specific methods. This Jira proposes to define a new interface called DiffList which will be used to store and manage snapshotDiffs. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
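A minimal sketch of what such an interface might look like, together with a trivial ArrayList-backed implementation: only a handful of operations instead of the full java.util.List surface. The method names are assumptions for illustration, not the final API.

```java
import java.util.ArrayList;
import java.util.Iterator;

// Minimal sketch of a purpose-built diff list: only the operations snapshot
// management needs, rather than the full java.util.List surface.
interface DiffList<T> extends Iterable<T> {
    boolean addLast(T diff);   // diffs are appended in snapshot-creation order
    T get(int index);
    T remove(int index);       // used when a snapshot is deleted
    int size();
}

// Trivial ArrayList-backed implementation of the sketch above.
class ArrayDiffList<T> implements DiffList<T> {
    private final ArrayList<T> diffs = new ArrayList<>();
    public boolean addLast(T diff) { return diffs.add(diff); }
    public T get(int index) { return diffs.get(index); }
    public T remove(int index) { return diffs.remove(index); }
    public int size() { return diffs.size(); }
    public Iterator<T> iterator() { return diffs.iterator(); }
}
```

The narrow interface is what makes alternative backing structures (such as the skip list of HDFS-13102) drop-in replacements.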
[jira] [Created] (HDFS-13143) SnapshotDiff - snapshotDiffReport might be inconsistent if the snapshotDiff calculation happens between a snapshot and the current tree
Shashikant Banerjee created HDFS-13143: -- Summary: SnapshotDiff - snapshotDiffReport might be inconsistent if the snapshotDiff calculation happens between a snapshot and the current tree Key: HDFS-13143 URL: https://issues.apache.org/jira/browse/HDFS-13143 Project: Hadoop HDFS Issue Type: Improvement Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee HDFS-12594 introduced an iterative approach for computing snapshotDiffs over multiple rpc calls. The iterative approach depends upon the startPath (path of the directory with respect to the snapshottable root) and the size of the createdList (0 in case the startPath refers to a file) to determine exactly from where in each iteration the calculation has to start. In case of the diff computation between a snapshot and the current tree (if any of the snapshot names specified in the getSnapshotDiffReport call is null or empty), the last SnapshotDiff associated with the directory/file might change owing to changes in the current tree in between the rpc calls in the absence of a global fsn lock. This might result in inconsistencies in the snapshotDiffReport. In case the snapshotDiffReport computation needs to be done between the current tree and a snapshot, we should fall back to the non-iterative approach to compute the snapshotDiff. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13144) Ozone: DeleteContainer call in SCM does not remove the container info from the container in-memory maps.
Shashikant Banerjee created HDFS-13144: -- Summary: Ozone: DeleteContainer call in SCM does not remove the container info from the container in-memory maps. Key: HDFS-13144 URL: https://issues.apache.org/jira/browse/HDFS-13144 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee When the deleteContainer call gets executed at SCM, the container info is deleted from SCM but not from the in-memory container maps. Even after the deleteContainer call is successful, listContainer lists the deleted containers. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13172) Implement task manager to handle creation and deletion of multi-level nodes in SnapshotSkipList
Shashikant Banerjee created HDFS-13172: -- Summary: Implement task manager to handle creation and deletion of multi-level nodes in SnapshotSkipList Key: HDFS-13172 URL: https://issues.apache.org/jira/browse/HDFS-13172 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13171) Handle Deletion of nodes in SnapshotSkipList
Shashikant Banerjee created HDFS-13171: -- Summary: Handle Deletion of nodes in SnapshotSkipList Key: HDFS-13171 URL: https://issues.apache.org/jira/browse/HDFS-13171 Project: Hadoop HDFS Issue Type: Bug Components: snapshots Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee This Jira will handle deletion of skipListNodes from DirectoryDiffList. If a node has multiple levels, the list needs to be balanced. If the node is uni-level, no balancing of the list is required. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-13173) Replace ArrayList with DirectoryDiffList(SnapshotSkipList) to store DirectoryDiffs
Shashikant Banerjee created HDFS-13173: -- Summary: Replace ArrayList with DirectoryDiffList(SnapshotSkipList) to store DirectoryDiffs Key: HDFS-13173 URL: https://issues.apache.org/jira/browse/HDFS-13173 Project: Hadoop HDFS Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee This Jira will replace the existing ArrayList with DirectoryDiffList to store directory diffs for snapshots based on the config value of skipInterval. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12968) Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info.
Shashikant Banerjee created HDFS-12968: -- Summary: Ozone : TestSCMCli and TestContainerStateManager tests are failing consistently while updating the container state info. Key: HDFS-12968 URL: https://issues.apache.org/jira/browse/HDFS-12968 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS-7240 Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 TestContainerStateManager#testUpdateContainerState is failing with the following exception: org.apache.hadoop.ozone.scm.exceptions.SCMException: Failed to update container state container28655, reason: invalid state transition from state: OPEN upon event: CLOSE. at org.apache.hadoop.ozone.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:355) at org.apache.hadoop.ozone.scm.container.ContainerMapping.updateContainerState(ContainerMapping.java:336) at org.apache.hadoop.ozone.scm.container.TestContainerStateManager.testUpdateContainerState(TestContainerStateManager.java:244) org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) at org.junit.rules.RunRules.evaluate(RunRules.java:20) Similarly, TestSCMCli#testDeleteContainer and TestSCMCli#testInfoContainer are failing with the same exception. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12965) Ozone: Documentation: Add ksm -createObjectstore command documentation.
Shashikant Banerjee created HDFS-12965: -- Summary: Ozone: Documentation: Add ksm -createObjectstore command documentation. Key: HDFS-12965 URL: https://issues.apache.org/jira/browse/HDFS-12965 Project: Hadoop HDFS Issue Type: Sub-task Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 The ksm -createObjectStore command, once executed, gets the cluster id and scm id from the running scm instance and persists them locally. Once ksm starts, it verifies whether the scm instance it is connecting to has the same cluster id and scm id as present in the version file in KSM, and fails in case the info does not match. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDFS-12966) Ozone: owner name should be set properly when the container allocation happens
Shashikant Banerjee created HDFS-12966: -- Summary: Ozone: owner name should be set properly when the container allocation happens Key: HDFS-12966 URL: https://issues.apache.org/jira/browse/HDFS-12966 Project: Hadoop HDFS Issue Type: Sub-task Components: HDFS-7240 Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 Currently, when the container allocation happens, the owner name is hardcoded as "OZONE". It should be set to the KSM instance id/CBlock Manager instance id from where the container creation call happens. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-181) CloseContainer should commit all pending open Keys on a datanode
Shashikant Banerjee created HDDS-181: Summary: CloseContainer should commit all pending open Keys on a datanode Key: HDDS-181 URL: https://issues.apache.org/jira/browse/HDDS-181 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 A close container command arrives at the Datanode via the SCM heartbeat response. It will then be queued up over the Ratis pipeline. Once the command execution starts inside the Datanode, it will mark the container in CLOSING state. All the pending open keys for the container will now be committed, followed by the transition of the container state from CLOSING to CLOSED state. For achieving this, all the open keys for a container need to be tracked. This Jira aims to address this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-179) CloseContainer command should be executed only if all the prior "Write" type container requests get executed
Shashikant Banerjee created HDDS-179: Summary: CloseContainer command should be executed only if all the prior "Write" type container requests get executed Key: HDDS-179 URL: https://issues.apache.org/jira/browse/HDDS-179 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client, Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a close container command request comes to a Datanode (via the SCM heartbeat response) through the Ratis protocol, all the prior enqueued "Write" type requests like putKey, WriteChunk, DeleteKey, CompactChunk etc should be executed first before the CloseContainer request gets executed. This synchronization needs to be handled in the containerStateMachine. This Jira aims to address this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
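The ordering constraint can be sketched by chaining the close operation after the futures of all previously submitted write requests, so it cannot run before they finish. This mirrors the idea only and is not the actual ContainerStateMachine code; the class and method names are assumptions.

```java
import java.util.concurrent.CompletableFuture;

// Close is chained after the futures of all previously submitted writes, so it
// cannot execute before every prior write has been applied.
class OrderingStateMachine {
    private CompletableFuture<Void> lastWrite = CompletableFuture.completedFuture(null);
    private final StringBuilder applied = new StringBuilder();

    synchronized CompletableFuture<Void> submitWrite(String op) {
        CompletableFuture<Void> f = lastWrite.thenRun(() -> applied.append(op).append(";"));
        lastWrite = f;   // each write chains behind the previous one
        return f;
    }

    synchronized CompletableFuture<Void> submitClose() {
        // Runs only after the whole write chain has completed.
        return lastWrite.thenRun(() -> applied.append("close"));
    }

    String appliedOps() { return applied.toString(); }
}
```

Chaining on futures keeps the submitters non-blocking while still serializing close behind the writes.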
[jira] [Created] (HDDS-180) CloseContainer should commit all pending open keys for a container
Shashikant Banerjee created HDDS-180: Summary: CloseContainer should commit all pending open keys for a container Key: HDDS-180 URL: https://issues.apache.org/jira/browse/HDDS-180 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a close container command gets executed, it will first mark the container in closing state. All the open Keys for the container will now have to be committed. This requires us to track all pending open keys for a container on a DataNode. This Jira aims to address all these. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-180) CloseContainer should commit all pending open keys for a container
[ https://issues.apache.org/jira/browse/HDDS-180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-180. -- Resolution: Fixed > CloseContainer should commit all pending open keys for a container > -- > > Key: HDDS-180 > URL: https://issues.apache.org/jira/browse/HDDS-180 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > When a close container command gets executed, it will first mark the > container in closing state. All the open Keys for the container will now > have to be committed. This requires us to track all pending open keys for a > container on a DataNode. This Jira aims to address all these. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-185) TestCloseContainerByPipeline#testCloseContainerViaRatis fails intermittently
Shashikant Banerjee created HDDS-185: Summary: TestCloseContainerByPipeline#testCloseContainerViaRatis fails intermittently Key: HDDS-185 URL: https://issues.apache.org/jira/browse/HDDS-185 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: SCM Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-322) Restructure ChunkGroupOutputStream and ChunkOutputStream
Shashikant Banerjee created HDDS-322: Summary: Restructure ChunkGroupOutputStream and ChunkOutputStream Key: HDDS-322 URL: https://issues.apache.org/jira/browse/HDDS-322 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 Currently, ChunkOutputStream allocates a chunk-size buffer to cache client data. The idea here is to allocate the buffer in ChunkGroupOutputStream and pass it to the underlying ChunkOutputStream so that handling CLOSE_CONTAINER_EXCEPTION in the ozone client becomes simpler. This Jira will also modify the PutKey response to return the committed block length to the ozone client for validation checks. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-302) Fix javadoc and add implementation details in ContainerStateMachine
Shashikant Banerjee created HDDS-302: Summary: Fix javadoc and add implementation details in ContainerStateMachine Key: HDDS-302 URL: https://issues.apache.org/jira/browse/HDDS-302 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-306) Add functionality to construct OpenContainerBlockMap on datanode restart
Shashikant Banerjee created HDDS-306: Summary: Add functionality to construct OpenContainerBlockMap on datanode restart Key: HDDS-306 URL: https://issues.apache.org/jira/browse/HDDS-306 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 OpenContainerBlockMap contains the list of blocks which are not yet committed on a Datanode. In case the Datanode restarts, we may need to reconstruct this map by reading the block layout for each container and verifying it against the container DB. This is required to close the container on a single Datanode as well as for Ozone garbage collection. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
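A hedged sketch of the reconstruction step: blocks found in the on-disk layout that have no committed entry in the container DB are treated as still open and re-inserted into the map. Every name here (OpenBlockRebuilder, the map shapes) is an assumption for illustration, not the actual datanode code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Assumed-name sketch: recompute the open-block map on restart by diffing the
// block layout found on disk against the committed entries in the container DB.
class OpenBlockRebuilder {
    // containerId -> block ids present on disk / committed in the DB.
    static Map<Long, Set<Long>> rebuild(Map<Long, Set<Long>> blocksOnDisk,
                                        Map<Long, Set<Long>> committedInDb) {
        Map<Long, Set<Long>> open = new HashMap<>();
        for (Map.Entry<Long, Set<Long>> e : blocksOnDisk.entrySet()) {
            Set<Long> pending = new HashSet<>(e.getValue());
            // Anything on disk but not committed is still "open".
            pending.removeAll(committedInDb.getOrDefault(e.getKey(), Collections.emptySet()));
            if (!pending.isEmpty()) open.put(e.getKey(), pending);
        }
        return open;
    }
}
```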
[jira] [Resolved] (HDDS-86) Add functionality in Node2ContainerMap to list missing open containers
[ https://issues.apache.org/jira/browse/HDDS-86?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-86. - Resolution: Won't Do > Add functionality in Node2ContainerMap to list missing open containers > - > > Key: HDDS-86 > URL: https://issues.apache.org/jira/browse/HDDS-86 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > > In the event of datanode loss/disk loss, we need to figure out the open > containers residing on the datanode and issue a close on those containers on > other Datanodes as a part of pipeline recovery. This Jira aims to address > this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-197) DataNode should return ContainerClosingException/ContainerClosedException (CCE) to client if the container is in Closing/Closed State
[ https://issues.apache.org/jira/browse/HDDS-197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-197. -- Resolution: Fixed Fixed in the current code with HDDS-173. > DataNode should return ContainerClosingException/ContainerClosedException > (CCE) to client if the container is in Closing/Closed State > - > > Key: HDDS-197 > URL: https://issues.apache.org/jira/browse/HDDS-197 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Blocker > Fix For: 0.2.1 > > > SCM queues the CloseContainer command to the DataNode over the heartbeat response, which > is handled by the Ratis Server inside the Datanode. In case the container > transitions to CLOSING/CLOSED state while the ozone client is writing data, > it should throw ContainerClosingException/ContainerClosedException > accordingly. These exceptions will be handled by the client, which will retry > to get the last committed BlockInfo from the Datanode and update the OzoneMaster. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-339) Add block length and blockId in PutKeyResponse
Shashikant Banerjee created HDDS-339: Summary: Add block length and blockId in PutKeyResponse Key: HDDS-339 URL: https://issues.apache.org/jira/browse/HDDS-339 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client, Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 The PutKey response will include the blockId as well as the committed block length. This will be extended to include the blockCommitSequenceId as well, all of which will be updated on the Ozone Master. This will be required to add validation as well as to handle 2-node failures. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-295) TestCloseContainerByPipeline is failing because of timeout
[ https://issues.apache.org/jira/browse/HDDS-295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-295. -- Resolution: Duplicate Fix Version/s: 0.2.1 > TestCloseContainerByPipeline is failing because of timeout > -- > > Key: HDDS-295 > URL: https://issues.apache.org/jira/browse/HDDS-295 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Mukul Kumar Singh >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > The test is failing because the test is timing out waiting for the container > to be closed. > The details are logged at > https://builds.apache.org/job/PreCommit-HDDS-Build/627/testReport/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-353) Multiple delete Blocks tests are failing consistently
Shashikant Banerjee created HDDS-353: Summary: Multiple delete Blocks tests are failing consistently Key: HDDS-353 URL: https://issues.apache.org/jira/browse/HDDS-353 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Manager, SCM Reporter: Shashikant Banerjee Fix For: 0.2.1 As per the test reports here: [https://builds.apache.org/job/PreCommit-HDDS-Build/771/testReport/], the following tests are failing: 1. TestStorageContainerManager#testBlockDeletionTransactions 2. TestStorageContainerManager#testBlockDeletingThrottling 3. TestBlockDeletion#testBlockDeletion -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-244) Synchronize PutKey and WriteChunk requests in Ratis Server
[ https://issues.apache.org/jira/browse/HDDS-244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-244. -- Resolution: Fixed > Synchronize PutKey and WriteChunk requests in Ratis Server > -- > > Key: HDDS-244 > URL: https://issues.apache.org/jira/browse/HDDS-244 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Blocker > Fix For: 0.2.1 > > > In Ratis, all the WriteChunk Requests are submitted to Ratis with > Replication_Majority semantics. That means, the command execution from Ratis > completes once any 2 of 3 datanodes complete execution of the request. It might > happen that on one of the followers, PutKey might start execution while all > the WriteChunk requests processing for the same block are still in > progress. There needs to be a synchronization enforced between PutKey and > corresponding WriteChunk requests in the ContainerStateMachine. This Jira > aims to address this. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Reopened] (HDDS-179) CloseContainer command should be executed only if all the prior "Write" type container requests get executed
[ https://issues.apache.org/jira/browse/HDDS-179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee reopened HDDS-179: -- Reopening this issue, as the synchronization between closeContainer and writeChunks needs to be handled during the containerStateMachine#applyTransaction phase. > CloseContainer command should be executed only if all the prior "Write" type > container requests get executed > - > > Key: HDDS-179 > URL: https://issues.apache.org/jira/browse/HDDS-179 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > Attachments: HDDS-179.01.patch, HDDS-179.02.patch, HDDS-179.03.patch, > HDDS-179.04.patch, HDDS-179.05.patch > > > When a close container command request comes to a Datanode (via the SCM heartbeat > response) through the Ratis protocol, all the prior enqueued "Write" type > requests like putKey, WriteChunk, DeleteKey, CompactChunk etc should be > executed first before the CloseContainer request gets executed. This > synchronization needs to be handled in the containerStateMachine. This Jira > aims to address this. > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Created] (HDDS-327) CloseContainer command handler in HDDS Dispatcher should not throw exception if the container is already closed
Shashikant Banerjee created HDDS-327: Summary: CloseContainer command handler in HDDS Dispatcher should not throw exception if the container is already closed Key: HDDS-327 URL: https://issues.apache.org/jira/browse/HDDS-327 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 Currently, the closeContainer handler in the HDDS Dispatcher throws an exception if the container is not in open state. If the container is already closed, it should not throw any exception but just return. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-322) Restructure ChunkGroupOutputStream and ChunkOutputStream
[ https://issues.apache.org/jira/browse/HDDS-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-322. -- Resolution: Won't Do > Restructure ChunkGroupOutputStream and ChunkOutputStream > > > Key: HDDS-322 > URL: https://issues.apache.org/jira/browse/HDDS-322 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.2.1 > > > Currently, ChunkOutputStream allocates a chunk-size buffer to cache client > data. The idea here is to allocate the buffer in ChunkGroupOutputStream and > pass it to the underlying ChunkOutputStream, so that reclaiming the > uncommitted leftover data in the buffer and reallocating it to the next block while > handling CLOSE_CONTAINER_EXCEPTION in the ozone client becomes simpler. This > Jira will also add code to close the underlying ChunkOutputStream as soon > as the complete block data is written. > It will also modify the PutKey response to return the committed block > length to the ozone client for validation checks.
[jira] [Created] (HDDS-371) Add RetriableException class in Ozone
Shashikant Banerjee created HDDS-371: Summary: Add RetriableException class in Ozone Key: HDDS-371 URL: https://issues.apache.org/jira/browse/HDDS-371 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.3.0 Certain exceptions thrown by a server can occur because the server is temporarily in a state where the request cannot be processed. The Ozone client may retry the request. If the service is up, the server may be able to process a retried request. This Jira aims to introduce the notion of a RetriableException in Ozone.
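A minimal sketch of the idea, assuming an unchecked RetriableException and a generic retry helper (both hypothetical; the real Ozone class hierarchy and retry policy may differ):

```java
import java.util.function.Supplier;

// Hypothetical: signals a transient server-side condition worth retrying.
class RetriableException extends RuntimeException {
    RetriableException(String msg) { super(msg); }
}

class RetryingClient {
    // Retries the action a bounded number of times on RetriableException,
    // rethrowing the last failure once attempts are exhausted.
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts) {
        RetriableException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.get();
            } catch (RetriableException e) {
                last = e; // transient failure: try again
            }
        }
        throw last; // exhausted retries: surface to the application
    }
}
```

Any other exception type propagates immediately, so only conditions the server has explicitly marked as transient get retried.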
[jira] [Created] (HDDS-383) Ozone Client should use closed container info to discard preallocated blocks from closed containers
Shashikant Banerjee created HDDS-383: Summary: Ozone Client should use closed container info to discard preallocated blocks from closed containers Key: HDDS-383 URL: https://issues.apache.org/jira/browse/HDDS-383 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a key write happens in the Ozone client, blocks are preallocated based on the initial size given. While the write happens, containers can get closed, and if the remaining preallocated blocks belong to closed containers, they can be discarded right away instead of trying to write to them and failing with an exception. This Jira aims to address this.
[jira] [Created] (HDDS-365) Implement flushStateMachineData for containerStateMachine
Shashikant Banerjee created HDDS-365: Summary: Implement flushStateMachineData for containerStateMachine Key: HDDS-365 URL: https://issues.apache.org/jira/browse/HDDS-365 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 With RATIS-295, a new stateMachine API called flushStateMachineData has been introduced. This API needs to be implemented in ContainerStateMachine so that when the actual flush of the log file happens via Ratis, the corresponding stateMachineData also gets flushed.
[jira] [Created] (HDFS-13860) Space character in the filePath is encoded as "+" while creating files in WebHDFS
Shashikant Banerjee created HDFS-13860: -- Summary: Space character in the filePath is encoded as "+" while creating files in WebHDFS Key: HDFS-13860 URL: https://issues.apache.org/jira/browse/HDFS-13860 Project: Hadoop HDFS Issue Type: Bug Reporter: Shashikant Banerjee $ ./hdfs dfs -mkdir webhdfs://127.0.0.1/tmp1/"file 1" 2018-08-23 15:16:08,258 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable HW15685:bin sbanerjee$ ./hdfs dfs -ls webhdfs://127.0.0.1/tmp1 2018-08-23 15:16:21,244 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable Found 1 items drwxr-xr-x - sbanerjee hadoop 0 2018-08-23 15:16 webhdfs://127.0.0.1/tmp1/file+1
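The symptom above is consistent with form encoding being applied to a path segment: `URLEncoder` implements `application/x-www-form-urlencoded`, where a space becomes `+`, while a URI path must encode a space as `%20`. A small illustration (not the actual WebHDFS code path):

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SpaceEncoding {
    // Form encoding: correct for query strings, wrong for path segments.
    static String formEncode(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Path encoding via java.net.URI: a space becomes %20.
    static String pathEncode(String s) {
        try {
            return new URI(null, null, "/" + s, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

`formEncode("file 1")` yields `file+1` (the behavior seen in the listing above), whereas `pathEncode("file 1")` yields `/file%201`.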
[jira] [Created] (HDDS-263) Add retries in Ozone Client to handle BLOCK_NOT_COMMITTED Exception
Shashikant Banerjee created HDDS-263: Summary: Add retries in Ozone Client to handle BLOCK_NOT_COMMITTED Exception Key: HDDS-263 URL: https://issues.apache.org/jira/browse/HDDS-263 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 While Ozone client writes are going on, a container on a datanode can get closed because of node failures, disk out of space, etc. In such situations, the client write will fail with CLOSED_CONTAINER_IO. In this case, the ozone client should try to get the committed block length for the pending open blocks and update the OzoneManager. While trying to get the committed block length, it may fail with a BLOCK_NOT_COMMITTED exception because, as part of the transition from CLOSING to CLOSED state, the container commits all open blocks one by one. In such cases, the client needs to retry getting the committed block length for a fixed number of attempts and eventually throw the exception to the application if it is not able to successfully get and update the length in the OzoneManager. This Jira aims to address this.
[jira] [Created] (HDDS-244) PutKey should get executed only if all WriteChunk requests for the same block complete in Ratis
Shashikant Banerjee created HDDS-244: Summary: PutKey should get executed only if all WriteChunk requests for the same block complete in Ratis Key: HDDS-244 URL: https://issues.apache.org/jira/browse/HDDS-244 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 In Ratis, all the WriteChunk requests are submitted with Replication_Majority semantics. That means command execution completes once any 2 of the 3 datanodes finish executing the request. It might happen that on one of the followers, PutKey starts execution while the WriteChunk request processing for the same block is still in progress. A synchronization needs to be enforced between PutKey and the corresponding WriteChunk requests in the ContainerStateMachine. This Jira aims to address this.
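One way to enforce that ordering is to track the outstanding WriteChunk futures per block and have PutKey wait on all of them before executing. A hypothetical sketch (the real ContainerStateMachine bookkeeping differs):

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical per-block bookkeeping: PutKey blocks until every
// WriteChunk recorded for its block has completed.
class BlockWriteTracker {
    private final Map<Long, List<CompletableFuture<Void>>> pending =
        new ConcurrentHashMap<>();

    void recordWriteChunk(long blockId, CompletableFuture<Void> write) {
        pending.computeIfAbsent(blockId,
            k -> new CopyOnWriteArrayList<>()).add(write);
    }

    // Waits for all outstanding WriteChunks of the block; returns how
    // many futures were awaited.
    int awaitBeforePutKey(long blockId) {
        List<CompletableFuture<Void>> writes = pending.remove(blockId);
        if (writes == null) {
            return 0; // nothing outstanding for this block
        }
        CompletableFuture.allOf(
            writes.toArray(new CompletableFuture[0])).join();
        return writes.size();
    }
}
```

The PutKey path would call `awaitBeforePutKey` for its block id before applying the transaction, guaranteeing no WriteChunk for that block is still in flight on the follower.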
[jira] [Created] (HDDS-246) Ratis leader should throw BlockNotCommittedException for uncommitted blocks to Ozone Client
Shashikant Banerjee created HDDS-246: Summary: Ratis leader should throw BlockNotCommittedException for uncommitted blocks to Ozone Client Key: HDDS-246 URL: https://issues.apache.org/jira/browse/HDDS-246 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 As part of closing the container on a datanode, all the open keys (blocks) will be committed. In between, if the client calls getCommittedBlockLength for an uncommitted block on the container, the leader will throw a BlockNotCommitted exception back to the client. The client should retry fetching the committed block length and update the OzoneManager with the length.
[jira] [Created] (HDDS-247) Handle CLOSED_CONTAINER_IO exception in ozoneClient
Shashikant Banerjee created HDDS-247: Summary: Handle CLOSED_CONTAINER_IO exception in ozoneClient Key: HDDS-247 URL: https://issues.apache.org/jira/browse/HDDS-247 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 In case of ongoing writes by the Ozone client to a container, the container might get closed on the Datanodes because of node loss, out-of-space issues, etc. In such cases, the operation will fail with a CLOSED_CONTAINER_IO exception. The ozone client should then try to get the committed length of the block from the Datanodes and update the KSM. This Jira aims to address this issue.
[jira] [Created] (HDFS-13211) Refactor Unit Tests for SnapshotSkipList
Shashikant Banerjee created HDFS-13211: -- Summary: Refactor Unit Tests for SnapshotSkipList Key: HDFS-13211 URL: https://issues.apache.org/jira/browse/HDFS-13211 Project: Hadoop HDFS Issue Type: Improvement Components: snapshots Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee HDFS-13102 implements the DiffList interface for storing Directory Diffs using a SkipList. This Jira proposes to refactor the unit tests for HDFS-13102.
[jira] [Created] (HDFS-13426) Fixed javadoc in FsDatasetAsyncDiskService#removeVolume
Shashikant Banerjee created HDFS-13426: -- Summary: Fixed javadoc in FsDatasetAsyncDiskService#removeVolume Key: HDFS-13426 URL: https://issues.apache.org/jira/browse/HDFS-13426 Project: Hadoop HDFS Issue Type: Bug Components: hdfs Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Created] (HDFS-13437) Ozone : Make BlockId in SCM a long value
Shashikant Banerjee created HDFS-13437: -- Summary: Ozone : Make BlockId in SCM a long value Key: HDFS-13437 URL: https://issues.apache.org/jira/browse/HDFS-13437 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 Currently, when a block is allocated inside SCM, it is assigned a UUID string value. This should be made a long value.
[jira] [Created] (HDFS-13438) Fix javadoc in FsVolumeList#removeVolume
Shashikant Banerjee created HDFS-13438: -- Summary: Fix javadoc in FsVolumeList#removeVolume Key: HDFS-13438 URL: https://issues.apache.org/jira/browse/HDFS-13438 Project: Hadoop HDFS Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Resolved] (HDFS-13006) Ozone: TestContainerPersistence#testMultipleWriteSingleRead and TestContainerPersistence#testOverWrite fail consistently
[ https://issues.apache.org/jira/browse/HDFS-13006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDFS-13006. Resolution: Fixed Seems to be fixed in the latest build. > Ozone: TestContainerPersistence#testMultipleWriteSingleRead and > TestContainerPersistence#testOverWrite fail consistently > > > Key: HDFS-13006 > URL: https://issues.apache.org/jira/browse/HDFS-13006 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: HDFS-7240 >Affects Versions: HDFS-7240 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: HDFS-7240 > > > testMultipleWriteSingleRead: > {code} > org.junit.ComparisonFailure: > Expected :ba7110cc8c721d04fa60639cd065d5cb5d78ffe05b30c8ab05684f63b7ecbb81 > Actual :496271a1c82a712c4716b12c96017c97c46d30d825588bc0605d54200dab5c87 > > at org.junit.Assert.assertEquals(Assert.java:115) > at org.junit.Assert.assertEquals(Assert.java:144) > at > org.apache.hadoop.ozone.container.common.impl.TestContainerPersistence.testMultipleWriteSingleRead(TestContainerPersistence.java:586) > {code} > > testOverWrite : > {code} > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.ozone.container.common.impl.TestContainerPersistence.testOverWrite(TestContainerPersistence.java:534) > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:168) > at > org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code}
[jira] [Created] (HDFS-13413) ClusterId and DatanodeUuid should be marked mandatory fields in SCMRegisteredCmdResponseProto
Shashikant Banerjee created HDFS-13413: -- Summary: ClusterId and DatanodeUuid should be marked mandatory fields in SCMRegisteredCmdResponseProto Key: HDFS-13413 URL: https://issues.apache.org/jira/browse/HDFS-13413 Project: Hadoop HDFS Issue Type: Bug Components: ozone Environment: ClusterId as well as the DatanodeUuid are currently optional fields in SCMRegisteredCmdResponseProto. We have to make both clusterId and DatanodeUuid required fields and handle them properly. As of now, we don't do anything with the response of datanode registration. We should validate the clusterId and also the datanodeUuid Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240
[jira] [Created] (HDFS-13464) Fix javadoc in FsVolumeList#handleVolumeFailures
Shashikant Banerjee created HDFS-13464: -- Summary: Fix javadoc in FsVolumeList#handleVolumeFailures Key: HDFS-13464 URL: https://issues.apache.org/jira/browse/HDFS-13464 Project: Hadoop HDFS Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Resolved] (HDFS-13172) Implement task manager to handle create and delete of multilevel nodes in SnapshotSkipList
[ https://issues.apache.org/jira/browse/HDFS-13172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDFS-13172. Resolution: Won't Do Inline deletion of multilevel nodes in snapshotSkipList is quite fast. Moreover, async deletion of multilevel nodes from snapshotSkipList containing snapshotDiffs leads to complicated scenarios of maintaining quota consistency semantics in HDFS. It won't be done for now. > Implement task manager to handle create and delete of multilevel nodes in > SnapshotSkipList > > > Key: HDFS-13172 > URL: https://issues.apache.org/jira/browse/HDFS-13172 > Project: Hadoop HDFS > Issue Type: Improvement > Components: snapshots >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > > This Jira proposes to add a TaskManager in SnapshotManager. The TaskManager > will maintain a TaskQueue. Each task in the task queue will handle creation > of multiple levels of a node or balancing of the list in case of deletion > of a node in the SnapshotSkipList.
[jira] [Created] (HDFS-13463) Fix javadoc in FsDatasetImpl#checkAndUpdate
Shashikant Banerjee created HDFS-13463: -- Summary: Fix javadoc in FsDatasetImpl#checkAndUpdate Key: HDFS-13463 URL: https://issues.apache.org/jira/browse/HDFS-13463 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Created] (HDFS-13454) Ozone : Fix the test logic in TestKeySpaceManager#testDeleteKey
Shashikant Banerjee created HDFS-13454: -- Summary: Ozone : Fix the test logic in TestKeySpaceManager#testDeleteKey Key: HDFS-13454 URL: https://issues.apache.org/jira/browse/HDFS-13454 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 The test logic in TestKeySpaceManager#testDeleteKey seems to be wrong. The test validates the keyArgs instead of the blockId to make sure the key gets deleted from SCM. Also, after the first exception validation, the subsequent statements in the JUnit test never get executed: {code:java} keys.add(keyArgs.getResourceName()); exception.expect(IOException.class); exception.expectMessage("Specified block key does not exist"); cluster.getStorageContainerManager().getBlockLocations(keys); // Delete the key again to test deleting non-existing key. // These will never get executed. exception.expect(IOException.class); exception.expectMessage("KEY_NOT_FOUND"); storageHandler.deleteKey(keyArgs); Assert.assertEquals(1 + numKeyDeleteFails, ksmMetrics.getNumKeyDeletesFails());{code} The test needs to be modified to address all these.
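The underlying JUnit 4 behavior: an ExpectedException rule passes the test the moment the expected exception is thrown, so everything after the throwing call is dead code. A minimal illustration with a stand-in delete method (hypothetical; the real test calls the SCM/KSM handlers):

```java
import java.util.Set;

class KeyDeleteSketch {
    // Stand-in for storageHandler.deleteKey: throws on a missing key.
    static void deleteKey(Set<String> store, String key) {
        if (!store.remove(key)) {
            throw new IllegalStateException("KEY_NOT_FOUND");
        }
    }

    // To exercise a second failure case in the same test, catch the
    // exception explicitly instead of relying on ExpectedException,
    // so the statements after the first throw still execute.
    static int deleteTwiceCountingFailures(Set<String> store, String key) {
        int failures = 0;
        deleteKey(store, key); // first delete succeeds
        try {
            deleteKey(store, key); // second delete: key no longer exists
        } catch (IllegalStateException e) {
            failures++;
        }
        return failures;
    }
}
```

The same try/catch-per-call pattern lets the original test verify both the getBlockLocations failure and the subsequent KEY_NOT_FOUND path in one run.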
[jira] [Created] (HDFS-13361) Ozone: Remove commands from command queue when the datanode is declared dead
Shashikant Banerjee created HDFS-13361: -- Summary: Ozone: Remove commands from command queue when the datanode is declared dead Key: HDFS-13361 URL: https://issues.apache.org/jira/browse/HDFS-13361 Project: Hadoop HDFS Issue Type: Sub-task Components: ozone Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 SCM can queue commands for Datanodes to pick up. However, a dead datanode may never pick up the commands. The command queue needs to be cleaned for the datanode once it is declared dead.
[jira] [Created] (HDFS-13368) Ozone:TestEndPoint tests are failing consistently
Shashikant Banerjee created HDFS-13368: -- Summary: Ozone: TestEndPoint tests are failing consistently Key: HDFS-13368 URL: https://issues.apache.org/jira/browse/HDFS-13368 Project: Hadoop HDFS Issue Type: Bug Components: ozone Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: HDFS-7240 With HDFS-13300, the hostName and ipAddress fields in the DatanodeDetails.proto file were made required fields. These parameters are not set in TestEndPoint, which leads these tests to fail consistently.
[jira] [Created] (HDFS-13320) Ozone: Support for Microbenchmarking Tool
Shashikant Banerjee created HDFS-13320: -- Summary: Ozone: Support for Microbenchmarking Tool Key: HDFS-13320 URL: https://issues.apache.org/jira/browse/HDFS-13320 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee This Jira proposes to add a micro benchmarking tool called Genesis which executes a set of HDSL/Ozone benchmarks.
[jira] [Created] (HDFS-13342) Ozone: Fix the class names in Ozone Script
Shashikant Banerjee created HDFS-13342: -- Summary: Ozone: Fix the class names in Ozone Script Key: HDFS-13342 URL: https://issues.apache.org/jira/browse/HDFS-13342 Project: Hadoop HDFS Issue Type: Bug Components: HDFS-7240 Affects Versions: HDFS-7240 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee The Ozone (oz) script has wrong class names for Freon etc., as a result of which Freon cannot be started from the command line. This Jira proposes to fix all these. The oz script will be renamed to ozone as well.
[jira] [Created] (HDFS-13227) Add a method to calculate cumulative diff over multiple snapshots in DirectoryDiffList
Shashikant Banerjee created HDFS-13227: -- Summary: Add a method to calculate cumulative diff over multiple snapshots in DirectoryDiffList Key: HDFS-13227 URL: https://issues.apache.org/jira/browse/HDFS-13227 Project: Hadoop HDFS Issue Type: Improvement Components: snapshots Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee This Jira proposes to add an API in DirectoryDiffList which will return the minimal list of diffs that need to be combined to get the cumulative diff between two given snapshots. The same method will also be used while constructing the childrenList for a directory: DirectoryWithSnapshotFeature#getChildrenList and DirectoryWithSnapshotFeature#computeDiffBetweenSnapshots will both make use of it. With snapshotSkipList providing a minimal set of diffs to combine, the overall computation will be faster.
[jira] [Created] (HDFS-13705) The native ISA-L library loading failure should be made warning rather than an error message
Shashikant Banerjee created HDFS-13705: -- Summary: The native ISA-L library loading failure should be made a warning rather than an error message Key: HDFS-13705 URL: https://issues.apache.org/jira/browse/HDFS-13705 Project: Hadoop HDFS Issue Type: Bug Components: erasure-coding Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee If loading of the native ISA-L library fails, the built-in Java library is used for erasure coding. The loading failure should be logged as a warning, and the stack trace below should be suppressed. {code:java} 18/06/26 10:22:34 ERROR erasurecode.ErasureCodeNative: Loading ISA-L failed java.lang.UnsatisfiedLinkError: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory) at org.apache.hadoop.io.erasurecode.ErasureCodeNative.loadLibrary(Native Method) at org.apache.hadoop.io.erasurecode.ErasureCodeNative.(ErasureCodeNative.java:46) at org.apache.hadoop.io.erasurecode.rawcoder.NativeRSRawEncoder.(NativeRSRawEncoder.java:34) at org.apache.hadoop.io.erasurecode.rawcoder.NativeRSRawErasureCoderFactory.createEncoder(NativeRSRawErasureCoderFactory.java:35) at org.apache.hadoop.io.erasurecode.CodecUtil.createRawEncoderWithFallback(CodecUtil.java:177) at org.apache.hadoop.io.erasurecode.CodecUtil.createRawEncoder(CodecUtil.java:129) at org.apache.hadoop.hdfs.DFSStripedOutputStream.(DFSStripedOutputStream.java:309) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:307){code}
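A sketch of the proposed behavior, using java.util.logging for self-containment (the real code uses Hadoop's logging; class and method names here are hypothetical): log the failure message at WARN and keep the stack trace at a debug-level only.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class NativeLoadFallback {
    private static final Logger LOG =
        Logger.getLogger(NativeLoadFallback.class.getName());

    // Returns true if the native loader succeeded; otherwise logs a
    // warning (message only) and signals fallback to the Java coder,
    // keeping the stack trace out of default log output.
    static boolean tryLoadNative(Runnable loader) {
        try {
            loader.run();
            return true;
        } catch (UnsatisfiedLinkError e) {
            LOG.warning("ISA-L native library not loaded ("
                + e.getMessage() + "); falling back to the Java coder");
            LOG.log(Level.FINE, "ISA-L load failure", e); // trace at FINE only
            return false;
        }
    }
}
```

The caller then selects the pure-Java erasure coder when `tryLoadNative` returns false, without an ERROR-level stack trace alarming users for a fully supported fallback path.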
[jira] [Created] (HDDS-203) Add getCommittedBlockLength API in datanode
Shashikant Banerjee created HDDS-203: Summary: Add getCommittedBlockLength API in datanode Key: HDDS-203 URL: https://issues.apache.org/jira/browse/HDDS-203 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client, Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.2.1 When a container gets closed on the Datanode while active writes are happening from the OzoneClient, client write requests will fail with a ContainerClosedException. In such a case, the ozone client needs to query the last committed block length from the Datanodes and update the OzoneMaster with the updated length for the block. This Jira proposes to add an RPC call to get the last committed length of a block on a Datanode.
[jira] [Created] (HDDS-582) Remove ChunkOutputStreamEntry class from ChunkGroupOutputStream
Shashikant Banerjee created HDDS-582: Summary: Remove ChunkOutputStreamEntry class from ChunkGroupOutputStream Key: HDDS-582 URL: https://issues.apache.org/jira/browse/HDDS-582 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Created] (HDDS-579) ContainerStateMachine should track the last successfully applied transaction index per container and fail subsequent transactions in case one fails
Shashikant Banerjee created HDDS-579: Summary: ContainerStateMachine should track the last successfully applied transaction index per container and fail subsequent transactions in case one fails Key: HDDS-579 URL: https://issues.apache.org/jira/browse/HDDS-579 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee ContainerStateMachine will keep track of the last successfully applied transaction index and, on restart, inform Ratis of that index so that subsequent transactions can be reapplied from there. Moreover, in case one transaction fails, all subsequent transactions on the container should fail in the ContainerStateMachine.
[jira] [Resolved] (HDDS-552) Partial Block Commits should happen via Ratis while closing the container in case of no pipeline failures
[ https://issues.apache.org/jira/browse/HDDS-552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-552. -- Resolution: Won't Do > Partial Block Commits should happen via Ratis while closing the container in > case of no pipeline failures > - > > Key: HDDS-552 > URL: https://issues.apache.org/jira/browse/HDDS-552 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.3.0 > > > Containers need to be closed when the space gets full or in case the pipeline > is not healthy. In case of no pipeline failure, the partial block commits > need to happen via Ratis. Currently this is handled per datanode by > openContainerBlockMap.
[jira] [Created] (HDDS-640) Fix Failing Unit Test cases
Shashikant Banerjee created HDDS-640: Summary: Fix Failing Unit Test cases Key: HDDS-640 URL: https://issues.apache.org/jira/browse/HDDS-640 Project: Hadoop Distributed Data Store Issue Type: Bug Components: OM, Ozone Client, SCM Reporter: Shashikant Banerjee The following tests seem to be failing consistently: 1.TestKeys#testPutAndGetKey 2.TestRocksDBStoreMBean#testJmxBeans 3.TestNodeFailure#testPipelineFail 4.TestNodeFailure#testPipelineFail Test report for reference: https://builds.apache.org/job/PreCommit-HDDS-Build/1381/testReport/
[jira] [Resolved] (HDDS-306) Add functionality to construct OpenContainerBlockMap on datanode restart
[ https://issues.apache.org/jira/browse/HDDS-306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-306. -- Resolution: Won't Do As part of failure recovery in Ozone, partial flush on the datanodes won't be required. Resolving this. > Add functionality to construct OpenContainerBlockMap on datanode restart > > > Key: HDDS-306 > URL: https://issues.apache.org/jira/browse/HDDS-306 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > > OpenContainerBlockMap contains the list of blocks which are not committed on a > Datanode. In case the Datanode restarts, we may need to reconstruct this map > by reading the block layout for each container and verifying it with the > container DB. This is required to close the container on a single datanode as > well as for Ozone garbage collection.
[jira] [Created] (HDDS-697) update the BCSID for PutSmallFile command
Shashikant Banerjee created HDDS-697: Summary: update the BCSID for PutSmallFile command Key: HDDS-697 URL: https://issues.apache.org/jira/browse/HDDS-697 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee
[jira] [Created] (HDDS-708) Validate BCSID while reading blocks from containers in datanodes
Shashikant Banerjee created HDDS-708: Summary: Validate BCSID while reading blocks from containers in datanodes Key: HDDS-708 URL: https://issues.apache.org/jira/browse/HDDS-708 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee While making a getBlock call when reading data, the Ozone client should read the bcsId for the block from OzoneManager, and the same needs to be validated on the Datanode.
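A sketch of the datanode-side check implied above; names and the exact comparison rule are assumptions, not the real HDDS code:

```java
// Hypothetical validation: a read referencing a BCSID newer than what
// the container replica has committed cannot be served from it.
class BcsIdValidator {
    static void validate(long containerCommittedBcsId, long requestedBcsId) {
        if (requestedBcsId > containerCommittedBcsId) {
            throw new IllegalStateException("Unknown BCSID "
                + requestedBcsId + "; container committed up to "
                + containerCommittedBcsId);
        }
    }
}
```

The client would pass the bcsId it obtained from OzoneManager with the getBlock request, and the datanode would run this check against its replica's committed BCSID before returning block data.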
[jira] [Created] (HDDS-709) Modify Close Container handling sequence on datanodes
Shashikant Banerjee created HDDS-709: Summary: Modify Close Container handling sequence on datanodes Key: HDDS-709 URL: https://issues.apache.org/jira/browse/HDDS-709 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee With the quasi-closed container state for handling majority node failures, the close container handling sequence in Datanodes needs to change. Once the datanodes receive a close container command from SCM, the open container replicas will individually be marked as being in the CLOSING state. In the CLOSING state, only the transactions coming from the Ratis leader are allowed; all other write transactions will fail. A close container transaction will be queued via Ratis on the leader and replicated to the followers, which makes the container transition to the CLOSED/QUASI_CLOSED state.
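The sequence above can be sketched as a small state machine; the states and transition rules below are a simplified assumption of the HDDS design, not the actual implementation:

```java
// Simplified replica lifecycle sketch; real HDDS states and rules differ.
enum ReplicaState { OPEN, CLOSING, CLOSED, QUASI_CLOSED }

class ReplicaLifecycle {
    ReplicaState state = ReplicaState.OPEN;

    // SCM close command received: stop accepting client writes.
    void startClosing() {
        if (state == ReplicaState.OPEN) {
            state = ReplicaState.CLOSING;
        }
    }

    // The close transaction replicated via Ratis completes the close;
    // without a healthy quorum the replica ends up quasi-closed.
    void applyCloseTransaction(boolean quorumHealthy) {
        state = quorumHealthy ? ReplicaState.CLOSED
                              : ReplicaState.QUASI_CLOSED;
    }

    // In CLOSING, only leader-driven Ratis transactions are applied.
    boolean acceptsClientWrite() {
        return state == ReplicaState.OPEN;
    }
}
```

The key property is that client writes are rejected as soon as CLOSING is entered, while the leader-replicated close transaction is still allowed through to finish the transition.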