On 2014-03-16, 11:09 AM, Richard Weinberger wrote:
> On 14.03.2014 15:57, Thomas Meyer wrote:
>>
>> only some processes get stuck.
>>
>> After enabling hung task detection in the kernel I see this in the logs:
>>
>> [ 8040.100000] INFO: task jbd2/ubda-8:308 blocked for more than 120 seconds.
...
>>
>> any ideas? some synchronisation error in ext4?
>
> Hmm, maybe you suffer from the same issue this patch tries to address:
> https://lkml.org/lkml/2014/2/14/733

I'm running into a very similar issue on vanilla 3.14.4 and 3.15-rc5;
both appear to have that patch already applied.

I've tested ext4, ext3, and btrfs, all with the same result: processes
writing to disk in the guest randomly hang permanently after a few
minutes of uptime. The guest reports 100% iowait, while the host has no
load and is otherwise fine.

Any suggestions would be appreciated.

Thanks,

Jonathan


INFO: task jbd2/ubda-8:347 blocked for more than 120 seconds.
       Not tainted 3.14.4 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
jbd2/ubda-8     D 000000000042fc13     0   347      2 0x00000000
Stack:
  60538e60 7fb14820 8044dba0 80d2a4d0
  7fa5a480 60021e1f 60038290 6007db40
  80d2a1c0 6042adf2 00000000 6006d120
Call Trace:
  [<60021e1f>] ? __switch_to+0x4f/0x90
  [<60038290>] ? block_signals+0x0/0x20
  [<6007db40>] ? rcu_sched_qs+0x0/0xb0
  [<6042adf2>] ? __schedule+0x1c2/0x510
  [<6006d120>] ? pick_next_task_fair+0x0/0x190
  [<60119d50>] ? sleep_on_buffer+0x0/0x20
  [<60023340>] ? itimer_read+0x10/0x40
  [<60119d50>] ? sleep_on_buffer+0x0/0x20
  [<6042b173>] ? schedule+0x33/0x80
  [<60234023>] ? submit_bio+0xa3/0x1d0
  [<6042b692>] ? io_schedule+0xb2/0x130
  [<6006fc00>] ? prepare_to_wait+0x0/0x90
  [<60119d60>] ? sleep_on_buffer+0x10/0x20
  [<6042b8b0>] ? __wait_on_bit+0x60/0xa0
  [<60119d50>] ? sleep_on_buffer+0x0/0x20
  [<6042ba2b>] ? out_of_line_wait_on_bit+0x8b/0xa0
  [<601d1061>] ? journal_submit_commit_record.isra.25+0x181/0x240
  [<6006f940>] ? wake_bit_function+0x0/0x40
  [<6011b090>] ? __brelse+0x0/0x20
  [<6011b090>] ? __brelse+0x0/0x20
  [<6042b350>] ? _cond_resched+0x0/0x50
  [<601d25a1>] ? jbd2_journal_commit_transaction+0x1481/0x1770
  [<60038290>] ? block_signals+0x0/0x20
  [<6004cb1f>] ? lock_timer_base.isra.37+0x3f/0x80
  [<6004d6f0>] ? del_timer+0x0/0x60
  [<6006fae0>] ? __wake_up+0x0/0x60
  [<601d5b74>] ? kjournald2+0xd4/0x2b0
  [<6042ae0b>] ? __schedule+0x1db/0x510
  [<6006d120>] ? pick_next_task_fair+0x0/0x190
  [<6006f900>] ? autoremove_wake_function+0x0/0x40
  [<6006f9b0>] ? __init_waitqueue_head+0x0/0x10
  [<601d5aa0>] ? kjournald2+0x0/0x2b0
  [<6006f9b0>] ? __init_waitqueue_head+0x0/0x10
  [<600609ef>] ? kthread+0x10f/0x140
  [<6006817d>] ? finish_task_switch.isra.78+0x2d/0x90
  [<60069612>] ? schedule_tail+0x22/0xd0
  [<60021c02>] ? new_thread_handler+0x82/0xb0

INFO: task git:1150 blocked for more than 120 seconds.
       Not tainted 3.14.4 #5
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
git             D 000000000042fc13     0  1150   1146 0x00000000
Stack:
  60538e60 6052b1e0 7e801ce0 7fb9b750
  807ddc00 60021e1f 60038290 6007db40
  7fb9b440 6042adf2 604324c0 6006d120
Call Trace:
  [<60021e1f>] ? __switch_to+0x4f/0x90
  [<60038290>] ? block_signals+0x0/0x20
  [<6007db40>] ? rcu_sched_qs+0x0/0xb0
  [<6042adf2>] ? __schedule+0x1c2/0x510
  [<6006d120>] ? pick_next_task_fair+0x0/0x190
  [<600385dc>] ? set_signals+0x3c/0x50
  [<6006fd20>] ? prepare_to_wait_event+0x0/0x110
  [<6042b140>] ? schedule+0x0/0x80
  [<6042b173>] ? schedule+0x33/0x80
  [<601d6693>] ? jbd2_log_wait_commit+0x93/0x100
  [<6006f900>] ? autoremove_wake_function+0x0/0x40
  [<6018b942>] ? ext4_sync_file+0x282/0x310
  [<601187aa>] ? do_fsync+0x4a/0x80
  [<60118b12>] ? SyS_fsync+0x12/0x20
  [<600256a0>] ? handle_syscall+0x60/0x80
  [<6003be9b>] ? userspace+0x49b/0x5a0
  [<600256f0>] ? copy_chunk_to_user+0x0/0x30
  [<60025985>] ? do_op_one_page+0x145/0x210
  [<60025c71>] ? copy_to_user+0x61/0xb0
  [<60036f5f>] ? save_registers+0x1f/0x40
  [<6003ec40>] ? arch_prctl+0x190/0x1c0
  [<60021cb5>] ? fork_handler+0x85/0x90

  $ ps -eo pid,user,wchan=WIDE-WCHAN-COLUMN0000000000000 -o s,cmd
   PID USER     WIDE-WCHAN-COLUMN0000000000000 S CMD
     1 root     pick_next_task_fair            S /sbin/init
     2 root     pick_next_task_fair            S [kthreadd]
     3 root     pick_next_task_fair            S [ksoftirqd/0]
     4 root     pick_next_task_fair            S [kworker/0:0]
     5 root     pick_next_task_fair            S [kworker/0:0H]
     6 root     pick_next_task_fair            S [kworker/u2:0]
     7 root     pick_next_task_fair            S [watchdog/0]
     8 root     pick_next_task_fair            S [khelper]
     9 root     pick_next_task_fair            S [kdevtmpfs]
    10 root     pick_next_task_fair            S [netns]
    91 root     pick_next_task_fair            S [writeback]
    93 root     pick_next_task_fair            S [bioset]
    95 root     pick_next_task_fair            S [kblockd]
   113 root     pick_next_task_fair            S [kworker/0:1]
   123 root     pick_next_task_fair            S [khungtaskd]
   124 root     pick_next_task_fair            S [kswapd0]
   125 root     pick_next_task_fair            S [fsnotify_mark]
   249 root     pick_next_task_fair            S [ipv6_addrconf]
   346 root     pick_next_task_fair            S [deferwq]
   347 root     pick_next_task_fair            D [jbd2/ubda-8]
   348 root     pick_next_task_fair            S [ext4-rsv-conver]
   366 root     pick_next_task_fair            S [kworker/0:1H]
   468 root     pick_next_task_fair            S upstart-udev-bridge
   472 root     pick_next_task_fair            S /lib/systemd
   741 root     pick_next_task_fair            S [jbd2/ubdb-8]
   742 root     pick_next_task_fair            S [ext4-rsv-conver]
   890 root     pick_next_task_fair            S /usr/bin/docker -d
   967 syslog   pick_next_task_fair            S rsyslogd
  1006 root     pick_next_task_fair            S [kworker/u2:2]
  1036 root     pick_next_task_fair            S /sbin/getty -8 38400
  1037 root     pick_next_task_fair            S /sbin/getty -8 38400
  1040 root     pick_next_task_fair            S /sbin/getty -8 38400
  1041 root     pick_next_task_fair            S /bin/login --
  1043 root     pick_next_task_fair            S /sbin/getty -8 38400
  1089 root     pick_next_task_fair            S /usr/sbin/sshd -D
  1090 root     pick_next_task_fair            S cron
  1092 root     pick_next_task_fair            S sshd: ubuntu [priv]
  1096 root     pick_next_task_fair            S [kauditd]
  1106 ubuntu   pick_next_task_fair            S sshd: ubuntu@notty
  1107 ubuntu   pick_next_task_fair            S bash
  1112 root     pick_next_task_fair            S /sbin/getty -8 38400
  1132 root     pick_next_task_fair            S upstart-socket-bridge
  1133 root     pick_next_task_fair            S upstart-file-bridge
  1138 ubuntu   pick_next_task_fair            S /bin/bash
  1141 ubuntu   pick_next_task_fair            S git clone
  1142 ubuntu   pick_next_task_fair            S git-remote-https origin
  1146 ubuntu   pick_next_task_fair            S git fetch-pack
  1150 ubuntu   pick_next_task_fair            D git index-pack --stdin
  1160 ubuntu   pick_next_task_fair            S -bash
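
For anyone trying to reproduce this, the tasks in state D can also be
picked out without reading through the full ps listing. The following
is only a rough sketch, not something taken from the report above: it
assumes the guest has Python and a mounted /proc, and the helper names
parse_stat and blocked_tasks are made up for illustration. It looks at
thread-group leaders only (the numeric entries directly under /proc),
so individual threads under /proc/<pid>/task are not covered.

```python
import os


def parse_stat(text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split around the first "(" and the last ")".
    lpar = text.index("(")
    rpar = text.rindex(")")
    pid = int(text[:lpar])
    comm = text[lpar + 1:rpar]
    state = text[rpar + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc="/proc"):
    """Yield (pid, comm) for every process currently in state D."""
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or stat was unreadable
        if state == "D":
            yield pid, comm


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

Run in the guest while the hang is in progress, this should print the
same two PIDs that ps shows in state D above (347 and 1150), assuming
nothing else has blocked in the meantime.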

_______________________________________________
User-mode-linux-user mailing list
User-mode-linux-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-user
