[
https://issues.apache.org/jira/browse/HADOOP-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15628817#comment-15628817
]
Steve Loughran commented on HADOOP-13783:
-----------------------------------------
1. Telemetry is always good; maybe make it a separate JIRA. FWIW, S3A prints
its metrics in the toString() of an FS instance or output stream, so
downstream projects can log the stats with a single call, without changing
their code to use the storage stats code, which only works on 2.8+. Worth
copying.
There's a class in the Swift FS client,
{{org.apache.hadoop.fs.swift.util.DurationStats}}, which is used to track stats
on HTTP calls, including min/max/variance; I used it to track down throttling
of DELETE requests against one Swift service. That class could be moved into
hadoop-common and used in WASB as well as other blobstores, where it can
isolate problems at the HTTPS level.
2. Interesting to hear that Azure has blob placement issues based on key name
too; so does S3. The traditional layout of Hive-friendly datasets appears to be
suboptimal for object stores. It'd be good to help work out a layout policy
that Hive can use to lay out datasets better.
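
To make point 1 concrete, here is a minimal sketch of a per-operation duration
tracker that reports its numbers through toString(). It is an illustration
only: the class name OpDurationStats and all of its methods are hypothetical,
and it is not the actual DurationStats code from the Swift client. It assumes
durations are supplied in milliseconds and uses Welford's running-variance
update.

```java
import java.util.Locale;

/**
 * Hypothetical sketch of a duration tracker for one kind of HTTP operation.
 * Not the real org.apache.hadoop.fs.swift.util.DurationStats.
 */
final class OpDurationStats {
  private final String operation;
  private long count;
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;
  private double mean;
  private double m2; // running sum of squared deviations from the mean

  OpDurationStats(String operation) {
    this.operation = operation;
  }

  /** Record one call duration, in milliseconds (assumed unit). */
  synchronized void add(long millis) {
    count++;
    min = Math.min(min, millis);
    max = Math.max(max, millis);
    // Welford's online update: no need to keep the individual samples.
    double delta = millis - mean;
    mean += delta / count;
    m2 += delta * (millis - mean);
  }

  synchronized long getCount() { return count; }

  synchronized long getMin() { return min; }

  synchronized long getMax() { return max; }

  synchronized double getMean() { return mean; }

  /** Sample variance; 0 when fewer than two samples have been recorded. */
  synchronized double getVariance() {
    return count > 1 ? m2 / (count - 1) : 0.0;
  }

  /** The stats are exposed here so a caller can log them with one call. */
  @Override
  public synchronized String toString() {
    if (count == 0) {
      return operation + ": no samples";
    }
    return String.format(Locale.ROOT,
        "%s: count=%d min=%dms max=%dms mean=%.1fms variance=%.1f",
        operation, count, min, max, mean, getVariance());
  }
}
```

A caller would time each request, pass the elapsed time to add(), and then log
the object itself; no stats-specific API is needed on the logging side, which
is the property that makes the toString() approach work for downstream
projects.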
> Improve efficiency of WASB over page blobs
> ------------------------------------------
>
> Key: HADOOP-13783
> URL: https://issues.apache.org/jira/browse/HADOOP-13783
> Project: Hadoop Common
> Issue Type: Bug
> Components: azure
> Reporter: NITIN VERMA
> Assignee: NITIN VERMA
>
> 1) Add telemetry to the WASB driver. The WASB driver lacks any logging or
> telemetry, which makes troubleshooting very difficult. For example, we don’t
> know where the high e2e latency between HBase and Azure storage came from when
> Azure storage server latency was very low. We also don’t know why WASB can
> only do 166 IOPS, which is well below the 500 IOPS of Azure storage. We have
> had several incidents related to storage latency where, because of the lack
> of logs, we couldn’t quickly establish who owned the incident.
> 2) Resolve the hot-spotting issue that occurs when the WASB driver partitions
> Azure page blobs, by changing the key. The current key design causes hot
> spotting on Azure storage.