[
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657779#comment-16657779
]
Shashikant Banerjee edited comment on HDDS-676 at 10/20/18 8:01 AM:
--------------------------------------------------------------------
{code:java}
A question rather than a comment. If we make this a required field, and if get
to a datanode and want to read a block – say a case where we say give any
version of data you have, how does that work? Should we make this optional?
This also means that if you have written a block using 0.2.1, then we cannot
read that via this protocol. It is an Alpha release, so we don't have to
strictly be backward compactable; just a thought.{code}
I agree . Made the bcsId filed optional with default value 0.
{code:java}
More a question of code style: should we have bcsId inside blockID or no? In
the blockManager Interface.
{code}
I feel blockID is used to uniquely identify a block in the system but the BCSID
of a block may be different in different non closed container replica. BCSID
should not be necessarily a part of BlockId.
{code:java}
This does not match with the earlier logic in putBlock. I am presuming the
putSmallFile call was committed via Ratis. Please correct me if I am wrong.
{code}
I agree. But as per [~jnp] comments above, this will be handled in a separate
Jira HDDS-697.
{code:java}
XceiverClientRatis#watchForCommit - if this is a test only function perhaps
have in a test helper file?
{code}
The api will be used HDDS-675. For now, its being only used in a test. I have
removed the visibleForTesting tag.
{code:java}
I am asking more to make sure that I understand this correctly. We look up the
max container bcsid, and then later read the bcsid. I am guessing we do this
because we are counting on containerBCSId will be cached and it will not cause
a real disk I/O and therefore checking it makes it easier for us to fail faster?
{code}
Yes, i have now changed the code to compare the cached container bcsid value
first and then later compare the block bcsid.
{code:java}
There is also an issue that is not taken care in this code path. What happens
if a block is deleted? If the datanode says I don't have this block, do we
still try all 3 data nodes? I am presuming we don't have to deal with it since
it is an edge case.
{code}
I think, if the datanode says i don't have this block for an open container
replica, It is difficult to figure out at client whether block got deleted or
the block commit transaction has not yet happened on this node(in case delete
is allowed on open containers). But for closed container replica we may not
need to retry. I think we need to think through this more as the retry
behaviour might be different for open and closed container replica here.
was (Author: shashikant):
{code:java}
A question rather than a comment. If we make this a required field, and if get
to a datanode and want to read a block – say a case where we say give any
version of data you have, how does that work? Should we make this optional?
This also means that if you have written a block using 0.2.1, then we cannot
read that via this protocol. It is an Alpha release, so we don't have to
strictly be backward compactable; just a thought.{code}
I agree . Made the bcsId filed optional with default value 0.
{code:java}
More a question of code style: should we have bcsId inside blockID or no? In
the blockManager Interface.
{code}
I feel blockID is used to uniquely identify a block in the system but the BCSID
of a block may be different in different open container replica. BCSID should
not be necessarily a part of BlockId.
{code:java}
This does not match with the earlier logic in putBlock. I am presuming the
putSmallFile call was committed via Ratis. Please correct me if I am wrong.
{code}
I agree. But as per [~jnp] comments above, this will be handled in a separate
Jira HDDS-697.
{code:java}
XceiverClientRatis#watchForCommit - if this is a test only function perhaps
have in a test helper file?
{code}
The api will be used HDDS-675. For now, its being only used in a test. I have
removed the visibleForTesting tag.
{code:java}
I am asking more to make sure that I understand this correctly. We look up the
max container bcsid, and then later read the bcsid. I am guessing we do this
because we are counting on containerBCSId will be cached and it will not cause
a real disk I/O and therefore checking it makes it easier for us to fail faster?
{code}
Yes, i have now changed the code to compare the cached container bcsid value
first and then later compare the block bcsid.
{code:java}
There is also an issue that is not taken care in this code path. What happens
if a block is deleted? If the datanode says I don't have this block, do we
still try all 3 data nodes? I am presuming we don't have to deal with it since
it is an edge case.
{code}
I think, if the datanode says i don't have this block for an open container
replica, It is difficult to figure out at client whether block got deleted or
the block commit transaction has not yet happened on this node(in case delete
is allowed on open containers). But for closed container replica we may not
need to retry. I think we need to think through this more as the retry
behaviour might be different for open and closed container replica here.
> Enable Read from open Containers via Standalone Protocol
> --------------------------------------------------------
>
> Key: HDDS-676
> URL: https://issues.apache.org/jira/browse/HDDS-676
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Reporter: Shashikant Banerjee
> Assignee: Shashikant Banerjee
> Priority: Major
> Attachments: HDDS-676.001.patch, HDDS-676.002.patch,
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open
> containers in OM as well datanode, Ozone Client reads can through Standalone
> protocol not necessarily requiring Ratis. Client should verify the BCSID of
> the container which has the data block , which should always be greater than
> or equal to the BCSID of the block to be read and the existing block BCSID
> should exactly match that of the block to be read. As a part of this, Client
> can try to read from a replica with a supplied BCSID and failover to the next
> one in case the block does ont exist on one replica.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]