[
https://issues.apache.org/jira/browse/IMPALA-9033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949017#comment-16949017
]
ASF subversion and git services commented on IMPALA-9033:
---------------------------------------------------------
Commit 89dc05bb24d069b0816b1015795bd3371cd6979c in impala's branch
refs/heads/master from Tim Armstrong
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=89dc05b ]
IMPALA-9033: log on slow HDFS I/Os
This logs a message with the time taken and also logs basic HDFS
statistics from the I/O, which would tell us if it was a remote
read, etc.
The threshold is 10s and is configurable via
--fs_slow_read_log_threshold_ms in case we want to make it more or less
sensitive.
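
The threshold check described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from the commit (the actual change lives in Impala's C++ backend). The names timed_read, read_fn, clock and log are hypothetical, and the strict "greater than" comparison is an assumption.

```python
import time

# Mirrors the 10s default mentioned above; the constant name is hypothetical.
DEFAULT_SLOW_READ_THRESHOLD_MS = 10_000


def timed_read(read_fn, threshold_ms=DEFAULT_SLOW_READ_THRESHOLD_MS,
               clock=time.monotonic, log=print):
    """Run read_fn() and log a message if it took longer than threshold_ms.

    clock and log are injectable so the behaviour can be exercised
    without real sleeps or real log files.
    """
    start = clock()
    result = read_fn()
    elapsed_ms = (clock() - start) * 1000.0
    # Assumption: a read is "slow" only when it strictly exceeds the threshold.
    if elapsed_ms > threshold_ms:
        log(f"Slow FS I/O operation: taken {elapsed_ms:.3f}ms so far "
            f"(threshold {threshold_ms}ms)")
    return result, elapsed_ms
```

Lowering threshold_ms to 500 and passing a read_fn that sleeps for 500ms would reproduce the kind of experiment described below.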
Here's some example output that I obtained by adding a 500ms sleep
to the code path, and lowering the threshold to 500ms:
I1010 12:09:38.211959 30292 hdfs-file-reader.cc:173]
2448e3196bf9ee94:69adb16f00000001] Slow FS I/O operation on
hdfs://localhost:20500/test-warehouse/tpch.lineitem/lineitem.tbl for instance
2448e3196bf9ee94:69adb16f00000001 of query 2448e3196bf9ee94:69adb16f00000000.
Last read returned 8.00 MB. This thread has read 8.00 MB/8.00 MB starting at
offset 394264576 in this I/O scheduling quantum and taken 584.129ms so far. I/O
status: OK
I1010 12:09:38.212011 30292 hdfs-file-reader.cc:353]
2448e3196bf9ee94:69adb16f00000001] Stats for last read by this I/O thread:
totalBytesRead=8388608 totalLocalBytesRead=8388608
totalShortCircuitBytesRead=8388608 totalZeroCopyBytesRead=0
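
As a rough illustration of how the stats line above could be used to tell whether a read was remote, the following Python sketch parses the key=value counters and subtracts local bytes from total bytes. The helper names parse_read_stats and remote_bytes are hypothetical, and treating "total minus local" as the remote byte count is an assumption about how these counters relate.

```python
import re


def parse_read_stats(text):
    """Extract key=value integer counters from a stats log message."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", text)}


def remote_bytes(stats):
    """Bytes not served locally (assumed to be total minus local)."""
    return stats["totalBytesRead"] - stats["totalLocalBytesRead"]


# Counters copied from the example output above.
sample = ("totalBytesRead=8388608 totalLocalBytesRead=8388608 "
          "totalShortCircuitBytesRead=8388608 totalZeroCopyBytesRead=0")
stats = parse_read_stats(sample)
print(remote_bytes(stats))  # 0, i.e. the whole 8 MB read was local
```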
Change-Id: I1929921495706b482d91d91cffe27bee4478f5c4
Reviewed-on: http://gerrit.cloudera.org:8080/14406
Reviewed-by: Impala Public Jenkins
Tested-by: Impala Public Jenkins
> Log metrics for slow I/Os
> -------------------------
>
> Key: IMPALA-9033
> URL: https://issues.apache.org/jira/browse/IMPALA-9033
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Tim Armstrong
> Assignee: Tim Armstrong
> Priority: Major
>
> We should detect I/Os that take longer than a configurable threshold (e.g. 10
> seconds might be reasonable) and then log some extra information about them:
> any of the metrics we get from the HDFS client.
> This will help debug certain perf issues and hangs.