Hello,

Not sure if this is the right place for this, but before going to the BTS
I'd better ask here for some advice.

Recently (since October 8 and not before), some of my servers running an
up-to-date Wheezy with an aacraid card (Adaptec 2020SA) have been going nuts:

(...)
Oct 12 07:38:34 my_machine kernel: [3007914.062687] aacraid: Host adapter abort request (0,0,0,0)
Oct 12 07:38:34 my_machine kernel: [3007914.065137] aacraid: Host adapter reset request. SCSI hang ?
Oct 12 07:38:41 my_machine kernel: [3007920.532027] INFO: task kworker/2:1:37 blocked for more than 120 seconds.
Oct 12 07:38:41 my_machine kernel: [3007920.535101] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 12 07:38:41 my_machine kernel: [3007920.538765] kworker/2:1     D ffff88022fd13780     0    37      2 0x00000000
Oct 12 07:38:41 my_machine kernel: [3007920.538803]  ffff8802258a3180 0000000000000046 0000000000000000 ffff880226cd0780
Oct 12 07:38:41 my_machine kernel: [3007920.538816]  0000000000013780 ffff88022591bfd8 ffff88022591bfd8 ffff8802258a3180
Oct 12 07:38:41 my_machine kernel: [3007920.538901]  0000000000000001 0000000100000400 0000000000000001 ffff8802259c7be8
Oct 12 07:38:41 my_machine kernel: [3007920.538987] Call Trace:
Oct 12 07:38:41 my_machine kernel: [3007920.539003]  [<ffffffff8134fa44>] ? __mutex_lock_common.isra.5+0xff/0x164
Oct 12 07:38:41 my_machine kernel: [3007920.539012]  [<ffffffff8135049f>] ? _raw_spin_unlock_irqrestore+0xe/0xf
Oct 12 07:38:41 my_machine kernel: [3007920.539021]  [<ffffffff8134f932>] ? mutex_lock+0x1a/0x2d
Oct 12 07:38:41 my_machine kernel: [3007920.539077]  [<ffffffffa0093c82>] ? reiserfs_mutex_lock_safe+0x19/0x24 [reiserfs]
Oct 12 07:38:41 my_machine kernel: [3007920.539096]  [<ffffffffa0095059>] ? flush_commit_list+0x11b/0x4fc [reiserfs]
Oct 12 07:38:41 my_machine kernel: [3007920.539105]  [<ffffffff8134f144>] ? _cond_resched+0x7/0x1c
Oct 12 07:38:41 my_machine kernel: [3007920.539123]  [<ffffffffa0095b44>] ? flush_async_commits+0x3b/0x46 [reiserfs]
Oct 12 07:38:41 my_machine kernel: [3007920.539134]  [<ffffffff8105b5f7>] ? process_one_work+0x161/0x269
Oct 12 07:38:41 my_machine kernel: [3007920.539142]  [<ffffffff8105c5c0>] ? worker_thread+0xc2/0x145
Oct 12 07:38:41 my_machine kernel: [3007920.539174]  [<ffffffff8105c4fe>] ? manage_workers.isra.25+0x15b/0x15b
Oct 12 07:38:41 my_machine kernel: [3007920.539183]  [<ffffffff8105f701>] ? kthread+0x76/0x7e
Oct 12 07:38:41 my_machine kernel: [3007920.539216]  [<ffffffff813575b4>] ? kernel_thread_helper+0x4/0x10
Oct 12 07:38:41 my_machine kernel: [3007920.539250]  [<ffffffff8105f68b>] ? kthread_worker_fn+0x139/0x139
Oct 12 07:38:41 my_machine kernel: [3007920.539282]  [<ffffffff813575b0>] ? gs_change+0x13/0x13

The systems do not really hang, but all operations are carried out
slowly, and after a few minutes they return to normal. I recently
triggered these messages simply by running an update (apt-get update
&& apt-get -V dist-upgrade), but given that they can appear again,
I'm not sure what the best next step would be.

Adaptec has a KB article¹ about this, but the "timeout" values on
these machines are already set to the suggested 45 seconds and the
message still appears. Should I increase the timeout to something
higher, like 180 or so? Any tips would be appreciated.
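
For anyone who wants to compare values on their own machines, here is a
minimal sketch of how the timeout could be read and changed. It assumes
the "timeout" in question is the per-disk SCSI command timeout exposed
at /sys/block/sdX/device/timeout and that the arrays show up as sd*
devices; the function names are made up for illustration.

```python
from pathlib import Path


def read_scsi_timeouts(sys_block="/sys/block"):
    """Return {device_name: timeout_seconds} for every sd* disk found."""
    timeouts = {}
    for dev in sorted(Path(sys_block).glob("sd*")):
        timeout_file = dev / "device" / "timeout"
        try:
            timeouts[dev.name] = int(timeout_file.read_text().strip())
        except (OSError, ValueError):
            # Skip disks whose timeout attribute is missing or not numeric.
            continue
    return timeouts


def set_scsi_timeout(device, seconds, sys_block="/sys/block"):
    """Write a new timeout in seconds for one disk (needs root on a real system)."""
    timeout_file = Path(sys_block) / device / "device" / "timeout"
    timeout_file.write_text(f"{int(seconds)}\n")
```

Running read_scsi_timeouts() lists the current values, and something
like set_scsi_timeout("sda", 180) as root would apply the higher value
to that one disk ("sda" is only an example name). A value written this
way does not survive a reboot.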

¹http://ask.adaptec.com/app/answers/detail/a_id/15357/related/1

Greetings,

-- 
Camaleón

