[ https://issues.apache.org/jira/browse/HBASE-18764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16158789#comment-16158789 ]

Josh Elser commented on HBASE-18764:
------------------------------------

Thanks for the reply!

This is definitely one of those tricky kinds of situations. To use your own 
argument against you, RS logging would still not give us a definitive clue as 
to where the problem lies, just a hint of where to look. When a RegionServer is 
processing 15+ RPCs at any given point in time, it takes a fair bit of human 
effort to figure out which handler thread was actually tied to the user whose 
request missed the SLA. That's the crux of why I suggested that looking at HDFS 
metrics would be a better approach.

To that point, if you want to figure out why some specific request was slow, 
and specifically what part of the request was slow, we'd instead want to use 
something like HTrace to aggregate this. However, in practice, this too is 
difficult due to the overhead of the tracing as well as the intermittent 
slowness that typically comes with these kinds of problems.

bq. When we talk about hdfs metrics, we will see 95th, 90th, mean, median etc

Yep, that is the downside of using aggregations like percentiles :). In most 
cases, we know that we don't want to look at the tail of operation times because 
of "expected" slowness (e.g. JVM garbage collection or compactions); however, 
there are certainly cases where looking at all of the metrics is necessary to 
identify a culprit: specifically, the case you have identified, where a small 
percentage of requests to HDFS, or, even worse, to a single DataNode, falls 
outside the bounds of normalcy.

At the end of the day, I'm not trying to levy a veto on this kind of change, 
but instead trying to outline some cases in which this will just have you 
blocked on the next question :). Since I've already spent this much time 
thinking about the subject, let me look closer at your patch too:

{noformat}
+      this.conf = new HBaseConfiguration();
{noformat}

The no-args constructor is deprecated.
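A minimal sketch of the usual replacement (the surrounding assignment is 
illustrative):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

// The static factory layers hbase-default.xml and hbase-site.xml on top of the
// Hadoop defaults without going through the deprecated constructor.
Configuration conf = HBaseConfiguration.create();
{noformat}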

{noformat}
+      this.slowReadNs =
+        1000000 * conf.getInt("hbase.regionserver.hlog.slowsync.ms",
+        DEFAULT_SLOW_READ_TIME_MS);
{noformat}

Create a new configuration property, don't reuse the slowsync key.
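A sketch of what I mean; the property name and default below are placeholders, 
not an agreed-upon key:

{noformat}
// Hypothetical key and default, for illustration only.
public static final String SLOW_READ_TIME_MS_KEY = "hbase.regionserver.hfile.slowread.ms";
public static final int DEFAULT_SLOW_READ_TIME_MS = 100;

this.slowReadNs = 1000000L * conf.getInt(SLOW_READ_TIME_MS_KEY, DEFAULT_SLOW_READ_TIME_MS);
{noformat}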

{noformat}
+      long timeInNanos = System.nanoTime() - startNano;
+      if (timeInNanos > this.slowReadNs) {
{noformat}

Nit: combine these two lines.
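Something like the following, assuming {{timeInNanos}} isn't needed again later 
in the block:

{noformat}
if (System.nanoTime() - startNano > this.slowReadNs) {
{noformat}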

{noformat}
+              DFSInputStream dfsInputStream = (DFSInputStream) is.getWrappedStream();
{noformat}

You need to check if this is an instance of {{DFSInputStream}} instead of 
blindly casting it.
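Roughly along these lines (sketch only):

{noformat}
// Only an HDFS-backed stream wraps a DFSInputStream; other filesystems
// (e.g. the local filesystem) wrap different stream types.
if (is.getWrappedStream() instanceof DFSInputStream) {
  DFSInputStream dfsInputStream = (DFSInputStream) is.getWrappedStream();
  // ... collect the slow-read DataNode info ...
}
{noformat}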

{noformat}
+          } catch (Exception e) {
+              e.printStackTrace();
+          }
{noformat}

We do not print to stdout/stderr directly; use a logger for that. In this case, 
since this is not a critical codepath, I'd suggest you make a {{debug}} log 
call, with a message making clear that the exception was caught while trying to 
collect this slow-read information. We want to make sure that if/when this block 
is hit, an admin/user would not interpret the exception as something they need 
to take action on.
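For example, assuming the class already has a {{LOG}} field, as most HBase 
classes do:

{noformat}
} catch (Exception e) {
  // Non-critical codepath: log at DEBUG so an operator doesn't mistake this
  // for something actionable.
  LOG.debug("Exception caught while collecting slow-read DataNode info; ignoring", e);
}
{noformat}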

> add slow read block log entry to alert slow datanodeinfo when reading a block 
> is slow
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-18764
>                 URL: https://issues.apache.org/jira/browse/HBASE-18764
>             Project: HBase
>          Issue Type: Improvement
>          Components: HFile
>    Affects Versions: 1.1.2
>            Reporter: Wang, Xinglong
>            Priority: Minor
>         Attachments: HBASE-18764.rev1.1.2.patch
>
>
> HBase sits on top of HDFS, and both are distributed systems. HBase will also be 
> impacted when there is a straggler datanode due to a network/disk/cpu issue: all 
> HBase reads/scans that hit that datanode will slow down, and it is not easy for 
> an hbase admin to find the straggler datanode in such a case.
> Meanwhile, we already have a log entry known as slow sync; one such entry looks 
> like the following. It helps an hbase admin quickly identify the slow datanode 
> in the pipeline in case of a network/disk/cpu issue with one of the 3 datanodes 
> in the pipeline.
> {noformat}
> 2017-07-08 19:11:30,538 INFO  [sync.3] wal.FSHLog: Slow sync cost: 490189 ms, current pipeline: [DatanodeInfoWithStorage[xx.xx.xx.xx:50010,DS-c391299a-aa9f-4146-ac7e-a493ae536bff,DISK], DatanodeInfoWithStorage[xx.xx.xx.xx:50010,DS-21a85f8b-f389-4f9e-95a8-b711945fd210,DISK], DatanodeInfoWithStorage[xx.xx.xx.xx:50010,DS-aa48cef2-3554-482f-b49d-be4763f4d8b8,DISK]]
> {noformat}
> Inspired by the slow sync log entry, I think it would also be beneficial for us 
> to print a similar entry when we encounter a slow read, so that the slow 
> datanode is easy to identify.


