[ 
https://issues.apache.org/jira/browse/HDDS-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16657779#comment-16657779
 ] 

Shashikant Banerjee edited comment on HDDS-676 at 10/20/18 8:01 AM:
--------------------------------------------------------------------

 
{code:java}
A question rather than a comment. If we make this a required field, and if get 
to a datanode and want to read a block – say a case where we say give any 
version of data you have, how does that work? Should we make this optional? 
This also means that if you have written a block using 0.2.1, then we cannot 
read that via this protocol. It is an Alpha release, so we don't have to 
strictly be backward compactable; just a thought.{code}
I agree . Made the bcsId filed optional with default value 0. 
{code:java}
More a question of code style: should we have bcsId inside blockID or no? In 
the blockManager Interface.
{code}
I feel blockID is used to uniquely identify a block in the system but the BCSID 
of a block may be different in different non closed container replica. BCSID 
should not be necessarily a part of BlockId.
{code:java}
This does not match with the earlier logic in putBlock. I am presuming the 
putSmallFile call was committed via Ratis. Please correct me if I am wrong.
{code}
I agree. But as per [~jnp] comments above, this will be handled in a separate 
Jira HDDS-697.
{code:java}
 XceiverClientRatis#watchForCommit - if this is a test only function perhaps 
have in a test helper file?
{code}
The api will be used HDDS-675. For now, its being only used in a test. I have 
removed the visibleForTesting tag.
{code:java}
I am asking more to make sure that I understand this correctly. We look up the 
max container bcsid, and then later read the bcsid. I am guessing we do this 
because we are counting on containerBCSId will be cached and it will not cause 
a real disk I/O and therefore checking it makes it easier for us to fail faster?
{code}
Yes, i have now changed the code to compare the cached container bcsid value 
first and then later compare the block bcsid.
{code:java}
There is also an issue that is not taken care in this code path. What happens 
if a block is deleted? If the datanode says I don't have this block, do we 
still try all 3 data nodes? I am presuming we don't have to deal with it since 
it is an edge case.
{code}
I think, if the datanode says i don't have this block for an open container 
replica, It is difficult to figure out at client whether block got deleted or 
the block commit transaction has not yet happened on this node(in case delete 
is allowed on open containers). But for closed container replica we may not 
need to retry. I think we  need to think through this more as the retry 
behaviour might be different for open and closed container replica here. 

 

 


was (Author: shashikant):
 
{code:java}
A question rather than a comment. If we make this a required field, and if get 
to a datanode and want to read a block – say a case where we say give any 
version of data you have, how does that work? Should we make this optional? 
This also means that if you have written a block using 0.2.1, then we cannot 
read that via this protocol. It is an Alpha release, so we don't have to 
strictly be backward compactable; just a thought.{code}
I agree . Made the bcsId filed optional with default value 0. 
{code:java}
More a question of code style: should we have bcsId inside blockID or no? In 
the blockManager Interface.
{code}
I feel blockID is used to uniquely identify a block in the system but the BCSID 
of a block may be different in different open container replica. BCSID should 
not be necessarily a part of BlockId.
{code:java}
This does not match with the earlier logic in putBlock. I am presuming the 
putSmallFile call was committed via Ratis. Please correct me if I am wrong.
{code}
I agree. But as per [~jnp] comments above, this will be handled in a separate 
Jira HDDS-697.
{code:java}
 XceiverClientRatis#watchForCommit - if this is a test only function perhaps 
have in a test helper file?
{code}
The api will be used HDDS-675. For now, its being only used in a test. I have 
removed the visibleForTesting tag.
{code:java}
I am asking more to make sure that I understand this correctly. We look up the 
max container bcsid, and then later read the bcsid. I am guessing we do this 
because we are counting on containerBCSId will be cached and it will not cause 
a real disk I/O and therefore checking it makes it easier for us to fail faster?
{code}
Yes, i have now changed the code to compare the cached container bcsid value 
first and then later compare the block bcsid.
{code:java}
There is also an issue that is not taken care in this code path. What happens 
if a block is deleted? If the datanode says I don't have this block, do we 
still try all 3 data nodes? I am presuming we don't have to deal with it since 
it is an edge case.
{code}
I think, if the datanode says i don't have this block for an open container 
replica, It is difficult to figure out at client whether block got deleted or 
the block commit transaction has not yet happened on this node(in case delete 
is allowed on open containers). But for closed container replica we may not 
need to retry. I think we  need to think through this more as the retry 
behaviour might be different for open and closed container replica here. 

 

 

> Enable Read from open Containers via Standalone Protocol
> --------------------------------------------------------
>
>                 Key: HDDS-676
>                 URL: https://issues.apache.org/jira/browse/HDDS-676
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>            Priority: Major
>         Attachments: HDDS-676.001.patch, HDDS-676.002.patch, 
> HDDS-676.003.patch, HDDS-676.004.patch, HDDS-676.005.patch
>
>
> With BlockCommitSequenceId getting updated per block commit on open 
> containers in OM as well datanode, Ozone Client reads can through Standalone 
> protocol not necessarily requiring Ratis. Client should verify the BCSID of 
> the container which has the data block , which should always be greater than 
> or equal to the BCSID of the block to be read and the existing block BCSID 
> should exactly match that of the block to be read. As a part of this, Client 
> can try to read from a replica with a supplied BCSID and failover to the next 
> one in case the block does ont exist on one replica.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to