[
https://issues.apache.org/jira/browse/HDFS-13248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811939#comment-16811939
]
Íñigo Goiri commented on HDFS-13248:
------------------------------------
Thanks [~hexiaoqiao] for [^RBF Data Locality Design.pdf].
My main concern with modifying {{ClientProtocol}} is that it requires the
client itself to change.
The change is backwards compatible but for it to work you need the client to be
up to date.
From our experience, this is pretty challenging.
WebHDFS is another example; clients would need to pass the new parameter for it
to work.
In addition, compatibility happens at the expense of duplicating methods for
just one parameter.
The current approach for locality is to use {{Server#getRemoteAddress()}} for
RPC and {{JspHelper#getRemoteAddr()}} for HTTP (this is mostly the case for
reads through {{getBlockLocations()}}).
For some calls, it also combines this with the {{clientName}} parameter.
I think the best approach is to extend the RPC framework and modify the
Namenode and the Router to leverage this.
Instead of {{hostname}}, I would call it {{proxyHostname}} or
{{clientHostname}}.
In any case, I'm fine with extending the protocol to add the new field; it
should be fairly easy to cover all the compatibility cases.
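To make the compatibility cases concrete, here is a minimal sketch of how a server could pick the host used for locality. It is not the patch: {{ClientHostResolver}}, {{trustedProxies}} and the nullable {{clientHostname}} argument are hypothetical names, and the rule that only a known proxy (for example a Router) may supply the field is an assumption about how the security concern might be handled.

```java
import java.util.Set;

/** Hypothetical helper: decides which host to use for locality decisions. */
public class ClientHostResolver {
  /** Addresses allowed to speak on behalf of another client (assumed config). */
  private final Set<String> trustedProxies;

  public ClientHostResolver(Set<String> trustedProxies) {
    this.trustedProxies = Set.copyOf(trustedProxies);
  }

  /**
   * @param remoteAddress  address the RPC connection came from
   * @param clientHostname optional new field; null when an old client omits it
   * @return the host to use when choosing block locations
   */
  public String resolve(String remoteAddress, String clientHostname) {
    // Old clients never send the field: fall back to the connection address.
    if (clientHostname == null || clientHostname.isEmpty()) {
      return remoteAddress;
    }
    // Only honor the field when the caller is a known proxy (assumption).
    if (trustedProxies.contains(remoteAddress)) {
      return clientHostname;
    }
    return remoteAddress;
  }
}
```

With this shape, an old client and an untrusted caller both get today's behavior, and only a request relayed by a trusted proxy uses the forwarded hostname.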
I'd like to go deeper on what the security risks are here.
BTW, one thing we could do right away is have
{{RouterRpcServer#getBlockLocations()}} reorder the destinations.
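A minimal sketch of that reordering, assuming the destinations are reduced to plain hostnames and that "local" simply means an exact hostname match. The real code would work on the located block structures and the network topology; {{LocalityReorder}} and its method are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: moves replicas on the client's host to the front. */
public class LocalityReorder {

  /**
   * Returns a copy of the destinations with hosts equal to clientHost first.
   * All other hosts keep the order chosen by the Namenode.
   */
  public static List<String> reorder(List<String> datanodeHosts,
                                     String clientHost) {
    List<String> sorted = new ArrayList<>(datanodeHosts);
    // Key is false for a local host and true otherwise; false sorts first.
    // List.sort is stable, so non-local hosts keep their relative order.
    sorted.sort(Comparator.comparing((String host) -> !host.equals(clientHost)));
    return sorted;
  }
}
```

Because the sort is stable, a client that matches none of the destinations gets the list back unchanged.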
> RBF: Namenode need to choose block location for the client
> ----------------------------------------------------------
>
> Key: HDFS-13248
> URL: https://issues.apache.org/jira/browse/HDFS-13248
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Weiwei Wu
> Assignee: Íñigo Goiri
> Priority: Major
> Attachments: HDFS-13248.000.patch, HDFS-13248.001.patch,
> HDFS-13248.002.patch, HDFS-13248.003.patch, HDFS-13248.004.patch,
> HDFS-13248.005.patch, HDFS-Router-Data-Locality.odt, RBF Data Locality
> Design.pdf, clientMachine-call-path.jpeg, debug-info-1.jpeg, debug-info-2.jpeg
>
>
> When a put operation is executed via the Router, the NameNode chooses the
> block location for the Router, not for the real client. This affects the
> file's locality.
> I think on both NameNode and Router, we should add a new addBlock method, or
> add a parameter for the current addBlock method, to pass the real client
> information.