[
https://issues.apache.org/jira/browse/HADOOP-14988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16222168#comment-16222168
]
Steve Loughran commented on HADOOP-14988:
-----------------------------------------
As discussed in HADOOP-14973:
* I concur with the need to collect client-side statistics from the object
store clients, especially related to failures and throttling, as that answers
the question "why are things so slow?"
* I also see that classic metric publishing isn't always the right way to do
it. Sometimes it is: if a specific node is failing the most, that's a node
problem for cluster management tools to detect and react to. But if it's a
specific job being throttled, that's not an admin problem, that's a job-config
and store-layout problem, which needs to be returned at the job level.
W.r.t. using Hadoop counters for this: it's cute, but these are not "Hadoop
counters", they are MapReduce counters; you can't have a filesystem in
hadoop-common using or publishing them, which means an alternative means of
publishing them is needed.
# Hadoop MR could collect the stats from the output filesystem & uprate them to
MR counters. Issue: do you want this per FS instance, or would it be aggregated
across all instances of an FS class?
# The stats could be collected by the committer and propagated back anyway.
This is what I'm doing in the S3A committers, where I write the stats to
_SUCCESS (see the sketch after this list). But that's across the entire set of
filesystems of a specific scheme (s3a:// here), not per query (moot in MR,
different in Spark).
# Mingliang's per-thread work here is more foundational, as you want all the
stats for a task.
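For illustration, here's a minimal sketch of the option-2 style of propagation:
snapshot the destination filesystem's storage statistics at commit time and
write them next to the output. It only relies on the stock
FileSystem#getStorageStatistics() API; the class, method names and the
key=value _SUCCESS format below are illustrative, not the actual S3A committer
code, which has its own _SUCCESS structure.
{code:java}
import java.io.IOException;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.StorageStatistics;

/** Illustrative only: not the real S3A committer classes. */
public class StatsPropagatingCommitter {

  /** Snapshot the per-instance storage statistics of a filesystem. */
  static Map<String, Long> snapshotStatistics(FileSystem fs) {
    Map<String, Long> snapshot = new TreeMap<>();
    StorageStatistics stats = fs.getStorageStatistics();
    for (Iterator<StorageStatistics.LongStatistic> it = stats.getLongStatistics();
         it.hasNext();) {
      StorageStatistics.LongStatistic s = it.next();
      snapshot.put(s.getName(), s.getValue());
    }
    return snapshot;
  }

  /** Persist the snapshot into the _SUCCESS marker as key=value lines. */
  static void writeSuccessMarker(FileSystem fs, Path jobOutput,
      Map<String, Long> stats) throws IOException {
    Path marker = new Path(jobOutput, "_SUCCESS");
    try (FSDataOutputStream out = fs.create(marker, true)) {
      for (Map.Entry<String, Long> e : stats.entrySet()) {
        out.writeBytes(e.getKey() + "=" + e.getValue() + "\n");
      }
    }
  }
}
{code}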
Overall then, yes: I want the counters, not things lost in logs. But we need
something which is (a) cross-engine and (b) works on multitenant execution
engines, and so ties stats back to specific jobs.
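To make the per-thread point and the 503 case below concrete, a rough sketch of
the kind of client-side throttle accounting being asked for: count throttled
responses per thread so a task's retries can be attributed to that task, plus a
process-wide total for classic metrics. The onResponseReceived() hook is
hypothetical; in the WASB client this would sit in SelfThrottlingIntercept's
response handling.
{code:java}
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative only: not an existing WASB class. */
public final class ThrottleStatistics {

  /** Per-thread counters, so retries can be tied back to a specific task. */
  private static final ThreadLocal<AtomicLong> THROTTLED_REQUESTS =
      ThreadLocal.withInitial(AtomicLong::new);

  /** Process-wide total, suitable for classic metrics publishing. */
  private static final AtomicLong TOTAL_THROTTLED = new AtomicLong();

  /** Hypothetical hook, called for every HTTP response the store client sees. */
  public static void onResponseReceived(int statusCode) {
    if (statusCode == 503) {
      THROTTLED_REQUESTS.get().incrementAndGet();
      TOTAL_THROTTLED.incrementAndGet();
    }
  }

  /** What a task or committer would read back and report as a counter. */
  public static long throttledRequestsThisThread() {
    return THROTTLED_REQUESTS.get().get();
  }

  public static long throttledRequestsTotal() {
    return TOTAL_THROTTLED.get();
  }

  private ThrottleStatistics() {
  }
}
{code}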
> WASB: Expose WASB status metrics as counters in Hadoop
> ------------------------------------------------------
>
> Key: HADOOP-14988
> URL: https://issues.apache.org/jira/browse/HADOOP-14988
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/azure
> Reporter: Rajesh Balamohan
> Priority: Minor
>
> It would be good to expose WASB status metrics (e.g. 503) as Hadoop counters.
> Here is an example from a Spark job, where it ends up spending a large amount
> of time in retries. Adding Hadoop counters would help in analyzing and tuning
> long-running tasks.
> {noformat}
> 2017-10-23 23:07:20,876 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:07:20,877 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=1, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:07:21,877 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:07:21,879 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=2, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:07:24,070 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:07:24,073 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=3, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:07:27,917 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:07:27,920 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=2, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:07:36,879 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:07:36,881 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=1, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:07:54,786 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:07:54,789 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=3, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:08:24,790 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> 2017-10-23 23:08:24,794 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: ResponseReceived:
> threadId=99, Status=503, Elapsed(ms)=4, ETAG=null, contentLength=198,
> requestMethod=GET
> 2017-10-23 23:08:54,794 DEBUG [Executor task launch worker for task 2463]
> azure.SelfThrottlingIntercept: SelfThrottlingIntercept:: SendingRequest:
> threadId=99, requestType=read , isFirstRequest=false, sleepDuration=0
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)