[
https://issues.apache.org/jira/browse/HDFS-13421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16537678#comment-16537678
]
Ewan Higgs edited comment on HDFS-13421 at 7/9/18 10:09 PM:
------------------------------------------------------------
{quote} * What's the purpose of {{DFS_PROVIDED_HEARTBEAT_BACKUP_NUM}}?{quote}
Looks like I had an error in constructing the patch. There is supposed to be a
line in DatanodeManager.java with the following, but it's more for when the NN
starts issuing the commands (so there's no need for the DFSConfigKeys change in
this patch):
{code}
this.maxBackupCommandsPerHeartbeat =
    conf.getInt(DFSConfigKeys.DFS_PROVIDED_HEARTBEAT_BACKUP_NUM,
        DFSConfigKeys.DFS_PROVIDED_HEARTBEAT_BACKUP_NUM_DEFAULT);
{code}
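For reference, the corresponding entries in DFSConfigKeys would look roughly
like the sketch below; the key string and default value are assumptions for
illustration, not taken from the patch:
{code}
// Hypothetical sketch only: the actual key string and default value belong
// to the NN-side patch and may differ.
public static final String DFS_PROVIDED_HEARTBEAT_BACKUP_NUM =
    "dfs.provided.heartbeat.backup.num"; // assumed key name
public static final int DFS_PROVIDED_HEARTBEAT_BACKUP_NUM_DEFAULT = 10; // assumed default
{code}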
{quote} * You don't need to initialize {{executorService}} in both the
constructor of {{SyncServiceSatisfierDatanodeWorker}} and
{{SyncServiceSatisfierDatanodeWorker#start()}}, right?{quote}
Good catch. Correct!
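For clarity, a minimal sketch of the corrected shape, assuming the worker owns
a single executor (the names and details here are illustrative, not from the
patch):
{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class SyncServiceSatisfierDatanodeWorker {
  // Create the executor once, in start(), rather than in both the
  // constructor and start().
  private ExecutorService executorService;

  void start() {
    executorService = Executors.newSingleThreadExecutor();
  }

  void stop() {
    if (executorService != null) {
      executorService.shutdown();
    }
  }
}
{code}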
{quote}shouldn't each LocatedBlock be associated with one putPart? The current
implementation concats all the blocks in a BlockSyncTask as a single part in a
multi-part upload.
{quote}
An issue here is that each putPart needs a number. In the simple case, if DN1
receives putPart 1 and DN2 receives putPart 2, and DN1 then decides to split
its putPart into two operations, what integer does it choose? So we chose a
design where the NN is quite explicit about what needs to happen and the DN is
a dumb worker.
Another issue is that, e.g. in S3, every part except the last needs to be at
least 5MB. This means that if you've used concat on two files and a block
< 5MB in size is internal to the file, then this would not work (for S3).
Using the current approach, if there is a < 5MB block in the middle of the
file, then we can upload it along with the block before or after it.
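To make that concrete, here is a rough sketch of the packing the NN would do
up front (class and method names are illustrative, not from the patch). Since
the NN packs blocks into parts before handing them out, every part has its
fixed number before any DN starts uploading:
{code}
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pack consecutive block lengths into multipart-upload
// parts so that every part except the last meets S3's 5MB minimum. A short
// block in the middle of a file is simply carried along with its neighbours.
class PartPacker {
  private static final long MIN_PART_SIZE = 5L * 1024 * 1024;

  static List<List<Long>> packIntoParts(List<Long> blockLengths) {
    List<List<Long>> parts = new ArrayList<>();
    List<Long> current = new ArrayList<>();
    long currentSize = 0;
    for (long len : blockLengths) {
      current.add(len);
      currentSize += len;
      if (currentSize >= MIN_PART_SIZE) {
        parts.add(current);
        current = new ArrayList<>();
        currentSize = 0;
      }
    }
    if (!current.isEmpty()) {
      parts.add(current); // only the final part may be under 5MB
    }
    return parts;
  }
}
{code}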
{quote}Can we finish integrating the SyncServiceSatisfierDatanodeWorker with
the heartbeat/DN commands in this sub-task (and add appropriate test cases)?
{quote}
I avoided adding any of the NN-side work because the flow of reporting
uploaded blocks to the BlockManager and ProvidedStorageMap, and the related
retry mechanisms, have long tendrils, so there's no obvious place to partition
the patch aside from proto (HDFS-13310), DN (this ticket), and NN. If you
think it's best, I can try to add some of the NN-side things.
> [PROVIDED Phase 2] Implement DNA_BACKUP command in Datanode
> -----------------------------------------------------------
>
> Key: HDFS-13421
> URL: https://issues.apache.org/jira/browse/HDFS-13421
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Priority: Major
> Attachments: HDFS-13421-HDFS-12090.001.patch,
> HDFS-13421-HDFS-12090.002.patch, HDFS-13421-HDFS-12090.003.patch
>
>
> HDFS-13310 introduces an API for DNA_BACKUP. Here, we implement the
> DNA_BACKUP command in the Datanode.
> The patches have been broken up to make reviewing easier.