[ https://issues.apache.org/jira/browse/HDFS-11334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rakesh R updated HDFS-11334:
----------------------------
Attachment: HDFS-11334-HDFS-10285-03.patch
> [SPS]: NN switch and rescheduling movements can lead to have more than one
> coordinator for same file blocks
> -----------------------------------------------------------------------------------------------------------
>
> Key: HDFS-11334
> URL: https://issues.apache.org/jira/browse/HDFS-11334
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: HDFS-10285
> Reporter: Uma Maheswara Rao G
> Assignee: Rakesh R
> Fix For: HDFS-10285
>
> Attachments: HDFS-11334-HDFS-10285-00.patch,
> HDFS-11334-HDFS-10285-01.patch, HDFS-11334-HDFS-10285-02.patch,
> HDFS-11334-HDFS-10285-03.patch
>
>
> I am summarizing here the scenarios that Rakesh and I discussed offline.
> We need to handle a couple of cases:
> # NN switch - the new active NN will start scheduling afresh for all files.
> Meanwhile, old co-ordinators may continue their movement work and send
> results back, so the NN SPS cannot tell which result is the right one.
> *NEED TO HANDLE*
> # DN disconnected beyond heartbeat expiry - If a DN is disconnected for a
> long time (more than the heartbeat expiry), the NN will remove this node. After
> the SPS Monitor timeout, the NN may retry the files that were scheduled to that
> DN by finding a new co-ordinator. But if the DN reconnects after the NN has
> rescheduled, the NN may get different results from different co-ordinators.
> *NEED TO HANDLE*
> # NN Restart - Should be the same as point 1.
> # DN disconnect - When a DN is disconnected and reconnects
> immediately (before heartbeat expiry), there should not be any issues.
> *NEED NOT HANDLE*, but we can think of more scenarios if anything is missing.
> # DN Restart - If a DN is restarted, it cannot send any results as it will lose
> everything. After the NN SPS Monitor timeout, the NN will retry.
> *NEED NOT HANDLE*, but we can think of more scenarios if anything is missing.
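
To illustrate the two *NEED TO HANDLE* cases, here is a minimal sketch of one
way an NN-side component could tell a current co-ordinator's result from a
stale one. It is not taken from the attached patches. The class
CoordinatorTracker, its methods, the per-file track id and the "stamp" are all
hypothetical names introduced for this example. It assumes every scheduling
decision gets a fresh, monotonically increasing stamp that the co-ordinator
echoes back with its result, and that a new active NN can be seeded with a
stamp higher than any issued before.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch only; not the HDFS-11334 patch and not an HDFS API.
 * Tracks the single co-ordinator the NN currently trusts for each file.
 */
class CoordinatorTracker {

    /** The assignment currently considered valid for one tracked file. */
    private static final class Assignment {
        final String coordinatorDn;
        final long stamp;

        Assignment(String coordinatorDn, long stamp) {
            this.coordinatorDn = coordinatorDn;
            this.stamp = stamp;
        }
    }

    private final Map<Long, Assignment> current = new HashMap<>();
    private long lastStamp;

    /**
     * @param initialStamp assumed to be higher than any stamp issued by a
     *                     previous active NN (how to obtain it is out of scope)
     */
    CoordinatorTracker(long initialStamp) {
        this.lastStamp = initialStamp;
    }

    /**
     * Records a (re)scheduling decision. Any earlier assignment for the same
     * file is overwritten, so its co-ordinator becomes stale.
     */
    synchronized long schedule(long trackId, String coordinatorDn) {
        long stamp = ++lastStamp;
        current.put(trackId, new Assignment(coordinatorDn, stamp));
        return stamp;
    }

    /**
     * Accepts a result only if it comes from the co-ordinator and stamp of
     * the current assignment; everything else is treated as stale.
     */
    synchronized boolean acceptResult(long trackId, String coordinatorDn,
                                      long stamp) {
        Assignment a = current.get(trackId);
        if (a == null || a.stamp != stamp
                || !a.coordinatorDn.equals(coordinatorDn)) {
            return false; // unknown file, old stamp, or old co-ordinator
        }
        current.remove(trackId); // each assignment yields at most one result
        return true;
    }
}
```

With this sketch, if file 42 is first scheduled to "dn-1" and later rescheduled
to "dn-2" after the SPS Monitor timeout, a late result from "dn-1" carries the
old stamp and is rejected, while the result from "dn-2" is accepted once.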