[
https://issues.apache.org/jira/browse/HDDS-263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Shashikant Banerjee updated HDDS-263:
-------------------------------------
Description: While Ozone client writes are in progress, a container on a
datanode can get closed because of node failures, the disk running out of
space, etc. In such situations, the client write fails with
CLOSED_CONTAINER_IO. In this case, the Ozone client should try to get the
committed block length for the pending open blocks and update the
OzoneManager. The attempt to get the committed block length may fail with a
BLOCK_NOT_COMMITTED exception because, as part of the container's transition
from the CLOSING to the CLOSED state, all open blocks are committed one by one.
In such cases, the client needs to retry getting the committed block length for
a fixed number of attempts and eventually throw the exception to the
application if it is not able to get the length and update it in the
OzoneManager. This Jira aims to address this. (was: While Ozone
client writes are going on, a container on a datanode can gets closed because
of node failures, disk out of space etc. In situations as such, client write
will fail with CLOSED_CONTAINER_IO. In this case, ozone client should try to
get the committed block length for the pending open blocks and update the
OzoneManager. While trying to get the committed block length, it may fail with
BLOCK_NOT_COMMITTED exception as the as a part of transiton from CLOSING to
CLOSED state for the container , it commits all open blocks one by one. In such
cases, client needs to retry to get the committed block length for a fixed no
of attempts and eventually throw the exception to the application if its not
able to successfully get and update the length in the OzoneManager. This Jira
aims to address this.)
> Add retries in Ozone Client to handle BLOCK_NOT_COMMITTED Exception
> -------------------------------------------------------------------
>
> Key: HDDS-263
> URL: https://issues.apache.org/jira/browse/HDDS-263
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Client
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Blocker
> Fix For: 0.2.1
>
> Attachments: HDDS-263.00.patch, HDDS-263.01.patch
>
>
> While Ozone client writes are in progress, a container on a datanode can get
> closed because of node failures, the disk running out of space, etc. In such
> situations, the client write fails with CLOSED_CONTAINER_IO. In this case,
> the Ozone client should try to get the committed block length for the pending
> open blocks and update the OzoneManager. The attempt to get the committed
> block length may fail with a BLOCK_NOT_COMMITTED exception because, as part
> of the container's transition from the CLOSING to the CLOSED state, all open
> blocks are committed one by one. In such cases, the client needs to retry
> getting the committed block length for a fixed number of attempts and
> eventually throw the exception to the application if it is not able to get
> the length and update it in the OzoneManager. This Jira aims to address this.
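
The bounded retry described above could look roughly like the following. This
is only a sketch under stated assumptions, not the code from the attached
patches: the class CommittedLengthRetry, the exception type
BlockNotCommittedException, the interface CommittedLengthSource and the
parameters maxAttempts and sleepMillis are all hypothetical names chosen for
illustration, and the fixed sleep between attempts is an assumption as well.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class CommittedLengthRetry {

  /** Hypothetical stand-in for a BLOCK_NOT_COMMITTED failure. */
  static class BlockNotCommittedException extends IOException {
    BlockNotCommittedException(String msg) {
      super(msg);
    }
  }

  /** Hypothetical lookup of a block's committed length on a datanode. */
  interface CommittedLengthSource {
    long getCommittedBlockLength(String blockId) throws IOException;
  }

  /**
   * Tries the lookup up to maxAttempts times. Only BlockNotCommittedException
   * triggers a retry; any other IOException propagates immediately. Once the
   * attempts are exhausted, the last BlockNotCommittedException is rethrown
   * so the application sees the failure.
   */
  static long getCommittedLengthWithRetry(CommittedLengthSource source,
      String blockId, int maxAttempts, long sleepMillis) throws IOException {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be at least 1");
    }
    BlockNotCommittedException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return source.getCommittedBlockLength(blockId);
      } catch (BlockNotCommittedException e) {
        last = e;
        if (attempt < maxAttempts) {
          // Assumed fixed pause so the container has time to finish
          // committing its open blocks before the next attempt.
          try {
            Thread.sleep(sleepMillis);
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted while waiting to retry", ie);
          }
        }
      }
    }
    throw last;
  }

  public static void main(String[] args) throws IOException {
    AtomicInteger calls = new AtomicInteger();
    // Simulated datanode: the block is not committed on the first two calls.
    CommittedLengthSource flaky = blockId -> {
      if (calls.incrementAndGet() < 3) {
        throw new BlockNotCommittedException("block not committed: " + blockId);
      }
      return 4096L;
    };
    System.out.println(getCommittedLengthWithRetry(flaky, "block-1", 5, 10L));
  }
}
```

In this sketch the caller would update the OzoneManager with the returned
length on success; that step is left out because its API is not described
here.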