Problems with kernel 3.6.x (vm?) (was: Is kernel 3.6.1 or filestreams option toxic?)
On 22/10/2012 16:14, Yann Dupont wrote:

Hello. This mail is a follow-up to a message on the XFS mailing list. I had a hang with 3.6.1, and then damage on an XFS filesystem.

3.6.1 is not alone. I tried 3.6.2 and had another hang, with quite a different trace this time, so I am not really sure the two problems are related. Anyway, the problem may not be XFS itself; it may just be a consequence of what looks more like a kernel problem.

cc: to linux-kernel

Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.991908] INFO: task ceph-osd:4409 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.991954] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.991999] ceph-osd D 88084c049030 0 4409 1 0x
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992003] 88084c048d60 0086 880a1421de78 880a17caa820
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992054] 880a1421dfd8 880a1421dfd8 880a1421dfd8 88084c048d60
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992105] 03373001 88084c048d60 88051775cb20
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992156] Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992184] [813c52fd] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992215] [812094a3] ? call_rwsem_down_write_failed+0x13/0x20
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992248] [811b83e0] ? cap_mmap_addr+0x50/0x50
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992275] [813c3cbc] ? down_write+0x1c/0x1d
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992303] [810fcf74] ? vm_mmap_pgoff+0x64/0xb0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992331] [8110d4cc] ? sys_mmap_pgoff+0x5c/0x190
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992360] [811357f1] ? do_sys_open+0x161/0x1e0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992387] [813c5ffd] ? system_call_fastpath+0x1a/0x1f
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992423] INFO: task ceph-osd:25297 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992451] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992495] ceph-osd D 8801bce7b1a0 0 25297 1 0x
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992497] 8801bce7aed0 0086 88025d903fd8 880a17cab580
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992548] 88025d903fd8 88025d903fd8 88025d903fd8 8801bce7aed0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992599] 8801bce7aed0 8801bce7aed0 88051775cb20
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992650] Call Trace:
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992673] [813c52fd] ? rwsem_down_failed_common+0xbd/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992702] [81209474] ? call_rwsem_down_read_failed+0x14/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992732] [813c3c9e] ? down_read+0xe/0x10
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992759] [8103129c] ? do_page_fault+0x16c/0x460
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992787] [81305862] ? release_sock+0xd2/0x150
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992815] [8137aceb] ? inet_stream_connect+0x4b/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992844] [81302b55] ? sys_connect+0xa5/0xe0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992871] [811343e3] ? fd_install+0x33/0x70
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992898] [813c5a75] ? page_fault+0x25/0x30
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992925] INFO: task ceph-osd:32469 blocked for more than 120 seconds.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992953] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992996] ceph-osd D 880556237b30 0 32469 1 0x
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.992999] 880556237860 0086 88059fe5dfd8 880a17c742e0
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993050] 88059fe5dfd8 88059fe5dfd8 88059fe5dfd8 880556237860
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993101] 880556237860 880556237860 88051775cb20
Oct 22 20:54:29 braeval.u14.univ-nantes.prive kernel: [629576.993153] Call Trace:
Oct 22 20:54:29
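For anyone triaging a log like this one, a small script can list the blocked tasks and show whether each was waiting for the read or the write side of a semaphore. In the complete traces in this log, one task is stuck in down_write on the mmap path and another in down_read on the page-fault path. This is only a minimal sketch: the names `summarize_hung_tasks` and `lock_mode` are hypothetical, and it assumes the log has one kernel message per line, as in a normal syslog file.

```python
import re

# Hung-task header printed by the kernel, e.g.
# "INFO: task ceph-osd:4409 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")

# Stack frame such as "? down_write+0x1c/0x1d"; the leading address is ignored.
FRAME_RE = re.compile(r"\?\s+([A-Za-z_][A-Za-z0-9_.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(log_text):
    """Return one dict per hung task: name, pid, timeout and frame names."""
    tasks = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "name": header.group(1),
                "pid": int(header.group(2)),
                "seconds": int(header.group(3)),
                "frames": [],
            }
            tasks.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame.group(1))
    return tasks


def lock_mode(frames):
    """Guess which side of a read-write semaphore the task was waiting on."""
    if "down_write" in frames:
        return "write"
    if "down_read" in frames:
        return "read"
    return "unknown"


if __name__ == "__main__":
    import sys

    # Usage: python hung_tasks.py < /var/log/kern.log  (path is an example)
    for task in summarize_hung_tasks(sys.stdin.read()):
        print(task["name"], task["pid"], lock_mode(task["frames"]))
```

The frames are matched by function name only, so the sketch works whether or not the addresses in the trace survived.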