[
https://issues.apache.org/jira/browse/HDFS-3969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13462293#comment-13462293
]
Todd Lipcon commented on HDFS-3969:
-----------------------------------
A few small improvements:
- I'm looking at a network trace of a usage of this API where the client is
asking for locations of about 10,000 blocks. This resulted in a 750KB RPC call.
Most of those bytes are repeated serializations of the same block pool ID.
We can change the wire protocol so that the block pool ID is only passed
once, and then use packed repeated fields for the block IDs. This should chop
off about 40-45 bytes per block. We can also probably drop the genstamps, since
blocks don't switch between drives when the gen stamp changes.
- For any blocks that don't come back properly, the API currently fills in an
"empty" HdfsVolumeId which returns false for isValid(). Instead, I think the
API would be clearer if it dropped those from the result rather than returning
'invalid' placeholder objects.
- We should improve the code so that the timeout parameter has a unit attached,
e.g. 'timeoutMs' instead of 'timeout', to prevent unit-mismatch bugs like the
one described in this issue.
- When a DN does not respond correctly, we should include the exception text in
the log message, even if not the full trace.
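To make the first point concrete, here is a rough Java sketch comparing the two
layouts. It uses plain DataOutputStream rather than the real protobuf encoding,
and the block pool ID string, the class name and the method names are all made
up for illustration, so the absolute sizes will not match the trace above. Only
the relative difference between the two layouts is the point.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustrative only: not the HDFS wire format and not protobuf.
final class WireSizeSketch {
    // Hypothetical block pool ID, chosen only to have a realistic length.
    static final String POOL_ID = "BP-1234567890-10.0.0.1-1234567890123";

    // Layout A: every block repeats the pool ID, block ID and genstamp.
    static int repeatedPoolSize(int numBlocks) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            for (long id = 0; id < numBlocks; id++) {
                out.writeUTF(POOL_ID); // 2-byte length prefix + string bytes
                out.writeLong(id);     // block ID
                out.writeLong(1000L);  // genstamp (arbitrary value)
            }
            out.flush();
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Layout B: pool ID sent once, then a count and the bare block IDs.
    static int sharedPoolSize(int numBlocks) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeUTF(POOL_ID);
            out.writeInt(numBlocks);
            for (long id = 0; id < numBlocks; id++) {
                out.writeLong(id); // block ID only, genstamp dropped
            }
            out.flush();
            return buf.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int n = 10_000;
        System.out.println("repeated pool ID: " + repeatedPoolSize(n) + " bytes");
        System.out.println("shared pool ID:   " + sharedPoolSize(n) + " bytes");
    }
}
```

A real protobuf message with packed varint block IDs would likely be smaller
still, since small IDs need fewer than 8 bytes each.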
> Small bug fixes and improvements for disk locations API
> -------------------------------------------------------
>
> Key: HDFS-3969
> URL: https://issues.apache.org/jira/browse/HDFS-3969
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs client
> Affects Versions: 3.0.0
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> The new disk block locations API has a configurable timeout, but it's used
> inconsistently: the invokeAll() call to the thread pool assumes the timeout
> is in seconds, but the RPC timeout is set in milliseconds.
> Also, we can improve the wire protocol for this API to be a lot more
> efficient.
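As an illustration of the timeout mismatch described above, here is a minimal
Java sketch of carrying the unit in the parameter name and passing the matching
TimeUnit to invokeAll(). The class name, method name and task list are
hypothetical and are not taken from the HDFS client code; this only shows the
general pattern.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Illustrative only: not the actual HDFS implementation.
final class TimeoutUnitSketch {
    // Runs all tasks with a deadline given in milliseconds and returns how
    // many were not cancelled by the deadline (failed tasks count as well).
    static int runWithTimeoutMs(List<Callable<String>> tasks, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, tasks.size()));
        try {
            // The unit is explicit here and matches the 'Ms' in the parameter
            // name, so the same value can also be used for an RPC timeout
            // that is expressed in milliseconds.
            List<Future<String>> futures =
                pool.invokeAll(tasks, timeoutMs, TimeUnit.MILLISECONDS);
            int notCancelled = 0;
            for (Future<String> f : futures) {
                if (!f.isCancelled()) {
                    notCancelled++;
                }
            }
            return notCancelled;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<Callable<String>> tasks = List.of(
            () -> "fast",
            () -> { Thread.sleep(5_000); return "slow"; });
        System.out.println("not cancelled: " + runWithTimeoutMs(tasks, 200));
    }
}
```

Had the same numeric value been passed with TimeUnit.SECONDS, the thread pool
would wait a thousand times longer than the millisecond RPC timeout intends.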