[jira] [Commented] (HADOOP-13783) Improve efficiency of WASB over page blobs

2016-11-02 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15628817#comment-15628817
 ] 

Steve Loughran commented on HADOOP-13783:
-

1. Telemetry is always good; maybe make it a separate JIRA. FWIW, S3A prints 
its metrics on a toString() call of an FS instance or output stream, so 
downstream projects can log the stats with a single call, without changing 
their code to use the storage stats code, which only works on 2.8+. Worth 
copying.

There's a class in the Swift FS client, 
{{org.apache.hadoop.fs.swift.util.DurationStats}}, which is used to track stats 
on HTTP calls, including min/max/variance; I used this to track down throttling 
of DELETE requests against one Swift service. That class could be moved into 
hadoop-common and used in WASB as well as other blobstores, where it can 
isolate problems at the HTTPS level.
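
As a rough illustration of the kind of per-operation tracking described above, here is a minimal sketch. It is not the Swift DurationStats class; the class name OpDurationStats, its methods, and the output format are hypothetical, and the variance is computed with Welford's running algorithm as one reasonable choice. The toString() output is what a downstream project could log without any other API.

```java
import java.util.Locale;

/**
 * Hypothetical sketch: running statistics for one kind of HTTP operation.
 * Not the Swift DurationStats class; names and format are assumptions.
 */
final class OpDurationStats {
    private final String operation;
    private long count;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    OpDurationStats(String operation) {
        this.operation = operation;
    }

    /** Record one call duration, in milliseconds (Welford's update). */
    synchronized void add(long durationMillis) {
        count++;
        min = Math.min(min, durationMillis);
        max = Math.max(max, durationMillis);
        double delta = durationMillis - mean;
        mean += delta / count;
        m2 += delta * (durationMillis - mean);
    }

    synchronized long getCount() { return count; }
    synchronized long getMin() { return min; }
    synchronized long getMax() { return max; }
    synchronized double getMean() { return mean; }

    /** Sample variance; 0 until at least two samples have been recorded. */
    synchronized double getVariance() {
        return count > 1 ? m2 / (count - 1) : 0.0;
    }

    /** Summary line suitable for logging. */
    @Override
    public synchronized String toString() {
        if (count == 0) {
            return operation + ": no samples";
        }
        return String.format(Locale.ROOT,
            "%s: count=%d min=%dms max=%dms mean=%.2fms variance=%.2f",
            operation, count, min, max, mean, getVariance());
    }
}
```

A caller would wrap each HTTP request with a timer, pass the elapsed time to add(), and log the instance itself when diagnosing latency or throttling.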

2. Interesting to hear that Azure has blob placement issues on key name too; so 
does S3. The traditional layout of Hive-friendly datasets appears to be 
suboptimal for object stores. It'd be good to help work out a layout policy 
which Hive can use that lays out datasets better.
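
For illustration only: one widely used way to keep lexicographically adjacent keys from landing on the same storage partition is to prepend a short hash-derived prefix to each key. The sketch below assumes that approach; nothing in this thread says it is the key change planned for WASB or a layout Hive would adopt, and the class name SaltedKeys and the prefix format are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch of hash-prefixed ("salted") object keys. */
final class SaltedKeys {
    private SaltedKeys() {
    }

    /**
     * Prefix the key with four hex digits taken from its MD5 digest, so that
     * keys sharing a long common prefix are spread across the key space.
     * The mapping is deterministic, so a reader can recompute it from the key.
     */
    static String salted(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(key.getBytes(StandardCharsets.UTF_8));
            return String.format("%02x%02x-%s",
                digest[0] & 0xff, digest[1] & 0xff, key);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off is that range listings by the original prefix no longer work directly, which is why any such policy would need agreement from the applications that lay out and read the data.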

> Improve efficiency of WASB over page blobs
> --
>
> Key: HADOOP-13783
> URL: https://issues.apache.org/jira/browse/HADOOP-13783
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: azure
>Reporter: NITIN VERMA
>Assignee: NITIN VERMA
>
> 1) Add telemetry to the WASB driver. The WASB driver lacks any logging or 
> telemetry, which makes troubleshooting very difficult. For example, we don't 
> know where the high e2e latency between HBase and Azure storage came from 
> when the Azure storage server latency was very low. We also don't know why 
> WASB can only do 166 IOPS, which is well below the 500 IOPS of Azure storage. 
> We have had several earlier incidents related to storage latency where, for 
> lack of logs, we couldn't quickly determine who owned the incident.
> 2) Resolve the hot-spotting issue that arises when the WASB driver 
> partitions Azure page blobs, by changing the key. The current key design 
> causes hot spotting on Azure storage. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Commented] (HADOOP-13783) Improve efficiency of WASB over page blobs

2016-11-01 Thread NITIN VERMA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15627043#comment-15627043
 ] 

NITIN VERMA commented on HADOOP-13783:
--

Thanks [~liuml07]




[jira] [Commented] (HADOOP-13783) Improve efficiency of WASB over page blobs

2016-11-01 Thread Mingliang Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15627034#comment-15627034
 ] 

Mingliang Liu commented on HADOOP-13783:


Hi [~nitin.ve...@gmail.com], I added you as Hadoop Contributor, and assigned 
this JIRA to you. Looking forward to your contribution!




[jira] [Commented] (HADOOP-13783) Improve efficiency of WASB over page blobs

2016-11-01 Thread NITIN VERMA (JIRA)

[ 
https://issues.apache.org/jira/browse/HADOOP-13783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15626985#comment-15626985
 ] 

NITIN VERMA commented on HADOOP-13783:
--

I don't have permission to assign this JIRA. Could someone assign this to me? 
