[ 
https://issues.apache.org/jira/browse/HDFS-15417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akira Ajisaka updated HDFS-15417:
---------------------------------
        Parent: HDFS-14603
    Issue Type: Sub-task  (was: Improvement)

> RBF: Get the datanode report from cache for federation WebHDFS operations
> -------------------------------------------------------------------------
>
>                 Key: HDFS-15417
>                 URL: https://issues.apache.org/jira/browse/HDFS-15417
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: federation, rbf, webhdfs
>            Reporter: Ye Ni
>            Assignee: Ye Ni
>            Priority: Major
>             Fix For: 3.4.0
>
>
> *Why*
>  For the WebHDFS CREATE, OPEN, APPEND and GETFILECHECKSUM operations, the router 
> or namenode needs to find the datanodes where the block is located, then 
> redirect the request to one of those datanodes.
> However, this chooseDatanode step is much slower in the router than in the 
> namenode, which directly affects the WebHDFS operations above.
> With namenode WebHDFS it normally takes tens of milliseconds, while the router 
> always takes more than 2 seconds.
> *How*
> Cache the datanode report in the router RPC server and actively refresh it at 
> a configured interval. Only get the datanode report in the router when 
> necessary: it is a very expensive operation, and it is where all the time is 
> spent.
> The report is only needed when we want to exclude some datanodes or find a 
> random datanode for CREATE.
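
The caching approach described above could be sketched roughly as follows. This is only an illustration under assumptions, not the code from the HDFS-15417 patch: the class name DatanodeReportCache, the loader callback, and the refreshIntervalMs parameter are all hypothetical, and the real router would plug in its own datanode report call and configuration key.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/**
 * Hypothetical sketch: holds the result of an expensive call (for example a
 * datanode report) and refreshes it in the background at a fixed interval.
 */
final class DatanodeReportCache<T> implements AutoCloseable {
  // Assumption: the loader wraps whatever expensive call produces the report.
  private final Supplier<T> loader;
  private final AtomicReference<T> cached = new AtomicReference<>();
  private final ScheduledExecutorService refresher;

  DatanodeReportCache(Supplier<T> loader, long refreshIntervalMs) {
    this.loader = loader;
    this.refresher = Executors.newSingleThreadScheduledExecutor(r -> {
      Thread t = new Thread(r, "datanode-report-refresher");
      t.setDaemon(true); // do not keep the JVM alive just for refreshes
      return t;
    });
    // Active refresh: reload the report every refreshIntervalMs milliseconds.
    refresher.scheduleWithFixedDelay(
        this::refresh, refreshIntervalMs, refreshIntervalMs, TimeUnit.MILLISECONDS);
  }

  private void refresh() {
    try {
      cached.set(loader.get());
    } catch (RuntimeException e) {
      // Keep serving the previous (stale) report if a refresh fails.
    }
  }

  /**
   * Returns the cached report. The first caller pays for the load; concurrent
   * first callers may each load once, which is acceptable for a sketch.
   */
  T get() {
    T value = cached.get();
    if (value == null) {
      value = loader.get();
      cached.compareAndSet(null, value);
    }
    return value;
  }

  @Override
  public void close() {
    refresher.shutdownNow();
  }
}
```

In this sketch, callers invoke get() only on the paths that actually need the report (excluding datanodes, or picking a random datanode for CREATE), so other requests never trigger the expensive call. The trade-off is that the report can be stale by up to one refresh interval.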



--
This message was sent by Atlassian Jira
(v8.3.4#803005)