> -----Original Message-----
> From: Jens Axboe [mailto:[email protected]]
> Sent: Tuesday, 16 December, 2014 4:43 PM
...
>
> I think I figured out what was going in, another user reported a similar
> issue. Since you have it running, can you try and gdb attach to one of
> the fio jobs? Then add a breakpoint in backend.c:__update_tv_cache(),
> and print td->tv_cache after each call. Just do that a handful of times,
> enough so we can see if it's incrementing or if it's stuck at the same
> value.
>
That is a static inline function, and its caller do_verify
also got inlined. I tried a breakpoint by line number in
thread_main; although IOs resumed, it doesn't trigger.
(gdb) b backend.c:1536
Breakpoint 2 at 0x44c67b: file backend.c, line 1536.
(gdb) cont
objdump -drS around that address is:
check_update_rusage(td);
if (td->error)
44c66e: 83 bb 90 4e 00 00 00 cmpl $0x0,0x4e90(%rbx)
44c675: 0f 84 0d 01 00 00 je 44c788 <thread_main+0x1538>
fio_gettime(&td->start, NULL);
do_verify(td, verify_bytes);
fio_mutex_down(stat_mutex);
44c67b: 48 8b 3d 16 a4 2d 00 mov 0x2da416(%rip),%rdi #
726a98 <stat_mutex>
44c682: e8 a9 40 fe ff callq 430730 <fio_mutex_down>
td->ts.runtime[DDIR_READ] += mtime_since_now(&td->start);
44c687: 4c 8b bb 44 4c 00 00 mov 0x4c44(%rbx),%r15
44c68e: 48 89 ef mov %rbp,%rdi
44c691: e8 9a e0 fb ff callq 40a730 <mtime_since_now>
fio_gettime(&td->start, NULL);
44c696: 31 f6 xor %esi,%esi
44c698: 48 89 ef mov %rbp,%rdi
fio_gettime(&td->start, NULL);
do_verify(td, verify_bytes);
fio_mutex_down(stat_mutex);
td->ts.runtime[DDIR_READ] += mtime_since_now(&td->start);
44c69b: 4c 01 f8 add %r15,%rax
44c69e: 48 89 83 44 4c 00 00 mov %rax,0x4c44(%rbx)
fio_gettime(&td->start, NULL);
44c6a5: e8 56 de fb ff callq 40a500 <fio_gettime>
fio_mutex_up(stat_mutex);
44c6aa: 48 8b 3d e7 a3 2d 00 mov 0x2da3e7(%rip),%rdi #
726a98 <stat_mutex>
44c6b1: e8 0a 40 fe ff callq 4306c0 <fio_mutex_up>
I printed that value for thread 2, let it run for
a few seconds, then printed again, in case that helps
(tv_usec increments, but tv_sec does not).
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fa92bf87700 (LWP 6733))]#0
0x0000003657600667 in io_submit () from /lib64/libaio.so.1
(gdb) bt
#0 0x0000003657600667 in io_submit () from /lib64/libaio.so.1
#1 0x0000000000457058 in fio_libaio_commit (td=0x7fa9a0dd1860) at
engines/libaio.c:255
#2 0x000000000040b395 in td_io_commit (td=0x7fa9a0dd1860) at ioengines.c:396
#3 0x000000000040bea1 in td_io_queue (td=0x7fa9a0dd1860, io_u=0x7fa8e401c980)
at ioengines.c:343
#4 0x000000000044a75d in do_io (td=0x7fa9a0dd1860) at backend.c:792
#5 0x000000000044c209 in thread_main (data=0x7fa9a0dd1860) at backend.c:1504
#6 0x0000003974c079d1 in start_thread () from /lib64/libpthread.so.0
#7 0x00000039748e8b7d in clone () from /lib64/libc.so.6
(gdb) print td
No symbol "td" in current context.
(gdb) select-frame 5
(gdb) print td->tv_cache
$51 = {tv_sec = 1099511, tv_usec = 641885}
(gdb) cont
Continuing.
^C
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7fa9a99b1720 (LWP 6508)]
0x00000039748acced in nanosleep () from /lib64/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fa92bf87700 (LWP 6733))]#0
0x0000003657600667 in io_submit () from /lib64/libaio.so.1
(gdb) bt
#0 0x0000003657600667 in io_submit () from /lib64/libaio.so.1
#1 0x0000000000457058 in fio_libaio_commit (td=0x7fa9a0dd1860) at
engines/libaio.c:255
#2 0x000000000040b47a in td_io_getevents (td=0x7fa9a0dd1860, min=<value
optimized out>, max=<value optimized out>, t=<value optimized out>) at
ioengines.c:238
#3 0x000000000042b442 in io_u_queued_complete (td=0x7fa9a0dd1860,
min_evts=<value optimized out>, bytes=0x7fa92bf86d00) at io_u.c:1793
#4 0x000000000044a45e in do_io (td=0x7fa9a0dd1860) at backend.c:889
#5 0x000000000044c209 in thread_main (data=0x7fa9a0dd1860) at backend.c:1504
#6 0x0000003974c079d1 in start_thread () from /lib64/libpthread.so.0
#7 0x00000039748e8b7d in clone () from /lib64/libc.so.6
(gdb) select-frame 5
(gdb) print td->tv_cache
$52 = {tv_sec = 1099511, tv_usec = 971364}
and again after about a minute:
(gdb) cont
Continuing.
^C
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7fa9a99b1720 (LWP 6508)]
0x00000039748acced in nanosleep () from /lib64/libc.so.6
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fa92bf87700 (LWP 6733))]#0
0x0000003657600667 in io_submit () from /lib64/libaio.so.1
(gdb) bt
#0 0x0000003657600667 in io_submit () from /lib64/libaio.so.1
#1 0x0000000000457058 in fio_libaio_commit (td=0x7fa9a0dd1860) at
engines/libaio.c:255
#2 0x000000000040b395 in td_io_commit (td=0x7fa9a0dd1860) at ioengines.c:396
#3 0x000000000040bea1 in td_io_queue (td=0x7fa9a0dd1860, io_u=0x7fa8e401fe80)
at ioengines.c:343
#4 0x000000000044a75d in do_io (td=0x7fa9a0dd1860) at backend.c:792
#5 0x000000000044c209 in thread_main (data=0x7fa9a0dd1860) at backend.c:1504
#6 0x0000003974c079d1 in start_thread () from /lib64/libpthread.so.0
#7 0x00000039748e8b7d in clone () from /lib64/libc.so.6
(gdb) select-frame 5
(gdb) print td->tv_cache
$53 = {tv_sec = 1099511, tv_usec = 578033}
--
To unsubscribe from this list: send the line "unsubscribe fio" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html