[ 
https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322207#comment-16322207
 ] 

Rakesh R commented on HDFS-12090:
---------------------------------

Great work [~ehiggs]. I took a quick look at the patch and have a few thoughts.

{{SyncServiceSatisfierWorker}} - IIUC, this logic is similar to the C-DN 
approach we coded earlier in HDFS-10285. FYI, we have since moved the C-DN 
logic to the Namenode, and {{StoragePolicySatisfier}} now does the file-level 
coordination (tracking the file's blocks) through the 
{{BlockStorageMovementAttemptedItems}} monitor. Please refer to the 
{{StoragePolicySatisfier#AttemptedItemInfo}} class, where you can see the 
trackId vs. blocks mapping. For the history behind this design change, see 
this [earlier discussion 
thread|https://issues.apache.org/jira/browse/HDFS-10285?focusedCommentId=16141060&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16141060].
The main concern there was how to provide a DN throttling mechanism to control 
the load of block move tasks on the DN. One idea was to add separate 
throttling configurations, but that would burden the admin with managing 
multiple configs to control DN load, and it was not accepted. The other idea 
was to reuse the existing xmits count. Presently, HDFS block 
replication/EC-reconstruction tasks look at the number of xmits in use on the 
DN and throttle accordingly. Block move tasks are now counted as well, and the 
three task types (block replication/EC reconstruction/block move) share the 
available xmits by ratio. I'd suggest looking at the SPS logic to understand 
more. IMHO, taking this into consideration should help you avoid conflicts 
later while merging. Thanks!
Also, please refer to the configuration 
{{dfs.storage.policy.satisfier.low.max-streams.preference}}, introduced by SPS 
to respect the xmits.
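
To make the xmits-sharing idea above concrete, here is a minimal Java sketch. It is not the actual Hadoop code: the class {{XmitsShareSketch}}, the method {{shareSlots}} and its parameters are hypothetical names, and the rounding rule (ceiling of the proportional share, capped by what is pending and what is left) is an assumption made for illustration only.

```java
/** Illustrative only; not Hadoop code. All names here are hypothetical. */
final class XmitsShareSketch {
    private XmitsShareSketch() {}

    /**
     * Splits the free transfer slots of one DN among the three task types,
     * roughly in proportion to how many tasks of each type are pending.
     * Returns {replication, ecReconstruction, blockMove}.
     */
    static int[] shareSlots(int maxStreams, int xmitsInProgress,
                            int pendingReplication, int pendingEc,
                            int pendingMoves) {
        int[] pending = {pendingReplication, pendingEc, pendingMoves};
        int[] granted = new int[pending.length];

        // Slots still free on this DN after the transfers already running.
        int free = Math.max(0, maxStreams - xmitsInProgress);
        int total = pendingReplication + pendingEc + pendingMoves;
        if (free == 0 || total == 0) {
            return granted; // nothing to hand out
        }

        int remaining = free;
        for (int i = 0; i < pending.length; i++) {
            // Integer ceiling of pending[i] * free / total.
            int proportional = (pending[i] * free + total - 1) / total;
            // Never grant more than is pending or more than is left.
            granted[i] = Math.min(pending[i], Math.min(remaining, proportional));
            remaining -= granted[i];
        }
        return granted;
    }
}
```

For example, with a limit of 4 streams, 1 xmit already in progress and 2 pending tasks of each type, {{shareSlots(4, 1, 2, 2, 2)}} grants one slot to each task type. The point of the sketch is that block moves draw from the same budget as replication and EC reconstruction, so no separate throttling config is needed.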

I hope you saw the comments/discussions in the HDFS-10285 jira. As per the 
merge discussion, the SPS feature has to satisfy the use cases of running 
internally (embedded in the NN) or as an external service outside the NN. 
Presently, we are working to decouple the SPS-NN interactions via the 
{{Context}} interface idea, so that both internal and external SPS can plug in 
their own logic for block movement. We would be happy to help you incorporate 
the {{FileBackupTasks}} into the SPS service.
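
As a rough illustration of the decoupling idea, the Java sketch below shows a satisfier loop that depends only on an interface, with an internal and an external implementation behind it. Everything in it is hypothetical: the nested {{Context}} interface, its two methods, and the {{InternalContext}}/{{ExternalContext}} classes are stand-ins invented for this example and do not reflect the actual SPS interface or its signatures.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only; not the SPS API. All names here are hypothetical. */
final class SpsContextSketch {
    private SpsContextSketch() {}

    /** The satisfier logic talks only to this interface. */
    interface Context {
        boolean isRunning();
        void submitBlockMove(long blockId, String targetStorageType);
    }

    /** Stand-in for an SPS embedded in the NN: schedules moves directly. */
    static final class InternalContext implements Context {
        final List<String> scheduled = new ArrayList<>();

        @Override
        public boolean isRunning() {
            return true;
        }

        @Override
        public void submitBlockMove(long blockId, String targetStorageType) {
            scheduled.add("internal:" + blockId + "->" + targetStorageType);
        }
    }

    /** Stand-in for an SPS outside the NN: would go over RPC in practice. */
    static final class ExternalContext implements Context {
        final List<String> sent = new ArrayList<>();
        private final boolean connected;

        ExternalContext(boolean connected) {
            this.connected = connected;
        }

        @Override
        public boolean isRunning() {
            return connected;
        }

        @Override
        public void submitBlockMove(long blockId, String targetStorageType) {
            sent.add("external:" + blockId + "->" + targetStorageType);
        }
    }

    /** Shared logic; returns the number of block moves submitted. */
    static int satisfy(Context ctx, long[] blockIds, String targetStorageType) {
        if (!ctx.isRunning()) {
            return 0; // service not available, submit nothing
        }
        for (long id : blockIds) {
            ctx.submitBlockMove(id, targetStorageType);
        }
        return blockIds.length;
    }
}
```

The same {{satisfy}} call works with either implementation, which is the property that would let a new task type plug in its own movement logic without changing the shared code.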

> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-Functional-Specification.001.pdf, 
> HDFS-12090-Functional-Specification.002.pdf, 
> HDFS-12090-Functional-Specification.003.pdf, HDFS-12090-design.001.pdf, 
> HDFS-12090.0000.patch
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in 
> external storage systems accessible through HDFS. However, HDFS-9806 is 
> limited to data being read through HDFS. This JIRA will deal with how data 
> can be written to such {{PROVIDED}} storages from HDFS.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
