[
https://issues.apache.org/jira/browse/HBASE-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrew Purtell updated HBASE-16577:
-----------------------------------
Summary: Toward a more useful slow query log (was: On reducing the log
level of *TooSlow log lines)
> Toward a more useful slow query log
> -----------------------------------
>
> Key: HBASE-16577
> URL: https://issues.apache.org/jira/browse/HBASE-16577
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Andrew Purtell
>
> I was looking at the nature and distribution of responseTooSlow messages on
> our clusters. The majority of responseTooSlow warnings are for Scan or Multi.
> Scans may take a long time to return. How long depends entirely on how much
> data is in the table, how the data is distributed, and the range and
> selectivity of the query. We do not measure response time relative to the
> work requested, so we cannot tell whether a given latency is proportionate.
> Another problematic example is Multi. Treating a multi with 1 op and a multi
> with 100 ops as equivalent when deciding what is "too slow" does not give
> valid results.
> If we aggressively filter responseTooSlow messages to just include the ops we
> can expect to be small, constant, request-independent units of work, this
> leaves us with Get and Mutate. This gives us no more information than we get
> from the Canary with read and write checks turned on. The Canary issues Get
> and Mutate ops and, better, measures availability and latency from the client
> perspective.
>
> Where I end up is that I think my shop should ignore responseTooSlow as a
> signal because it is far too noisy. The trouble then is that it is logged at
> WARN level. This implies there is something wrong that needs to be fixed.
> That may not be the case. Deciding requires some analysis of the application
> and the request particulars extracted from the log line. WARN seems
> inappropriate for this type of indication. There's nothing (necessarily)
> wrong with HBase, or the app. It should be logged at a level more like
> INFO.
> Furthermore the response might not be too slow, so calling it
> "responseTooSlow" isn't quite right. More like "responseMaybeSlow" :-)
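To make the Multi point in the description concrete, here is a minimal sketch of a slow-call check whose time budget grows with the number of ops in the request, instead of one fixed threshold for every call. This is not HBase code. The class name, method names, and the base and per-op parameters are all hypothetical and chosen only for illustration.

```java
// Hypothetical sketch, not part of HBase: every name and parameter here is
// an assumption made for illustration.
final class ProportionalSlowCheck {
    private ProportionalSlowCheck() {}

    /**
     * Time budget for a request: a base allowance for the first op plus a
     * fixed allowance for each additional op. A non-positive op count is
     * treated as a single op.
     */
    static long budgetMillis(long baseMillis, long perOpMillis, int opCount) {
        int ops = Math.max(1, opCount);
        return baseMillis + perOpMillis * (ops - 1L);
    }

    /** True when the elapsed time exceeds the budget scaled to the op count. */
    static boolean isMaybeSlow(long elapsedMillis, long baseMillis,
                               long perOpMillis, int opCount) {
        return elapsedMillis > budgetMillis(baseMillis, perOpMillis, opCount);
    }
}
```

With a base of 1000 ms and 50 ms per additional op, a 3000 ms multi with 1 op is flagged, while a 3000 ms multi with 100 ops is not, because its budget is 5950 ms. A fixed threshold would treat the two calls identically.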