On Sun, Jul 3, 2016 at 7:51 AM, Alex Gorbachev <a...@iss-integration.com> wrote:
>> Thank you Stefan and Campbell for the info - hope 4.7rc5 resolves this
>> for us. Please note that my workload is purely RBD, no QEMU/KVM.
>> Also, we do not have CFQ turned on, nor scsi-mq or blk-mq, so I am
>> surmising ceph-osd must be using something from the fair scheduler.
>> I read that its IO has been switched to blk-mq internally, so maybe
>> there is a relationship there.
>
> If the OSD code is compiled against the source of a buggy fair
> scheduler, then that would be an OSD code issue, correct?

OSD code is not compiled against any kernel code. ceph-osd runs in userspace,
not kernelspace. A userspace process should not be able to crash the kernel;
if it can, that's a kernel bug.
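
For what it's worth, the noop/CFQ choice only controls the block-layer I/O
scheduler; the trace Alex posted is entirely in the CPU scheduler (CFS, i.e.
kernel/sched/fair.c) and its automatic NUMA balancing (the task_numa_*
frames), which runs no matter which block elevator is selected. Here's a
minimal sketch to show both settings side by side - the sysfs/procfs paths
are the usual ones on Linux, but treat them as assumptions for your boxes:

#!/usr/bin/env python3
# Sketch: block I/O scheduler vs. CPU-scheduler NUMA balancing (assumed paths).
import glob

# This is what "noop"/"cfq" refers to: the per-disk block elevator.
for path in glob.glob("/sys/block/sd*/queue/scheduler"):
    with open(path) as f:
        print(path, "->", f.read().strip())

# Automatic NUMA balancing lives in sched/fair.c (task_numa_* in the trace).
try:
    with open("/proc/sys/kernel/numa_balancing") as f:
        print("kernel.numa_balancing =", f.read().strip())
except FileNotFoundError:
    print("kernel built without NUMA balancing")

Until you're on a kernel with the fix, setting kernel.numa_balancing=0 via
sysctl should keep the kernel out of those task_numa_* paths, but that only
sidesteps the crash rather than fixing anything.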

HTH,
Brad
>
>>
>> We had no such problems with kernel 4.2.x, but had other issues with
>> XFS, which do not seem to happen now.
>>
>> Regards,
>> Alex
>>
>>>
>>> Stefan
>>>
>>> Am 29.06.2016 um 11:41 schrieb Campbell Steven:
>>>> Hi Alex/Stefan,
>>>>
>>>> I'm in the middle of testing 4.7rc5 on our test cluster to confirm
>>>> once and for all that this particular issue has been completely
>>>> resolved by Peter's recent patch to sched/fair.c referred to by Stefan
>>>> above. For us, the patches that Stefan applied did not solve the issue,
>>>> and neither did any 4.5.x or 4.6.x released kernel so far; hopefully it
>>>> does the trick for you. We could get about 4 hours of uptime before
>>>> things went haywire.
>>>>
>>>> It's interesting how readily the Ceph workload triggers this bug,
>>>> given it's quite a long-standing issue that has only just been
>>>> resolved. Another user chimed in on the lkml thread a couple of days
>>>> ago, and his trace also had ceph-osd in it.
>>>>
>>>> https://lkml.org/lkml/headers/2016/6/21/491
>>>>
>>>> Campbell
>>>>
>>>> On 29 June 2016 at 18:29, Stefan Priebe - Profihost AG
>>>> <s.pri...@profihost.ag> wrote:
>>>>>
>>>>> Am 29.06.2016 um 04:30 schrieb Alex Gorbachev:
>>>>>> Hi Stefan,
>>>>>>
>>>>>> On Tue, Jun 28, 2016 at 1:46 PM, Stefan Priebe - Profihost AG
>>>>>> <s.pri...@profihost.ag> wrote:
>>>>>>> Please be aware that you may need even more patches. Overall this
>>>>>>> needs three patches: the first two try to fix a bug, and the third
>>>>>>> fixes the fixes plus even more bugs related to the scheduler. I've
>>>>>>> no idea which patch level Ubuntu is at.
>>>>>>
>>>>>> Stefan, could you please point to the other two patches besides
>>>>>> https://lkml.org/lkml/diff/2016/6/22/102/1 ?
>>>>>
>>>>> Sure, sorry, yes:
>>>>>
>>>>> 1.) 2b8c41daba32 ("sched/fair: Initiate a new task's util avg to a
>>>>> bounded value")
>>>>>
>>>>> 2.) 40ed9cba24bb7e01cc380a02d3f04065b8afae1d ("sched/fair: Fix
>>>>> post_init_entity_util_avg() serialization")
>>>>>
>>>>> 3.) the one listed at lkml.
>>>>>
>>>>> Stefan
>>>>>
>>>>>>
>>>>>> Thank you,
>>>>>> Alex
>>>>>>
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>> Excuse my typo sent from my mobile phone.
>>>>>>>
>>>>>>> Am 28.06.2016 um 17:59 schrieb Tim Bishop <tim-li...@bishnet.net>:
>>>>>>>
>>>>>>> Yes - I noticed this today on Ubuntu 16.04 with the default kernel. No
>>>>>>> useful information to add other than it's not just you.
>>>>>>>
>>>>>>> Tim.
>>>>>>>
>>>>>>> On Tue, Jun 28, 2016 at 11:05:40AM -0400, Alex Gorbachev wrote:
>>>>>>>
>>>>>>> After upgrading to kernel 4.4.13 on Ubuntu, we are seeing a few of
>>>>>>> these issues where an OSD would fail with the stack below.  I logged a
>>>>>>> bug at https://bugzilla.kernel.org/show_bug.cgi?id=121101 and there is
>>>>>>> a similar description at https://lkml.org/lkml/2016/6/22/102, but the
>>>>>>> odd part is we have turned off CFQ and blk-mq/scsi-mq and are using
>>>>>>> just the noop scheduler.
>>>>>>>
>>>>>>> Does the ceph kernel code somehow use the fair scheduler code block?
>>>>>>>
>>>>>>> Thanks
>>>>>>> --
>>>>>>> Alex Gorbachev
>>>>>>> Storcium
>>>>>>>
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.684974] CPU: 30 PID: 10403 Comm: ceph-osd Not tainted 4.4.13-040413-generic #201606072354
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.684991] Hardware name: Supermicro X9DRi-LN4+/X9DR3-LN4+/X9DRi-LN4+/X9DR3-LN4+, BIOS 3.2 03/04/2015
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685009] task: ffff880f79df8000 ti: ffff880f79fb8000 task.ti: ffff880f79fb8000
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685024] RIP: 0010:[<ffffffff810b416e>]  [<ffffffff810b416e>] task_numa_find_cpu+0x22e/0x6f0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685051] RSP: 0018:ffff880f79fbb818  EFLAGS: 00010206
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685063] RAX: 0000000000000000 RBX: ffff880f79fbb8b8 RCX: 0000000000000000
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685076] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8810352d4800
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685107] RBP: ffff880f79fbb880 R08: 00000001020cf87c R09: 0000000000ff00ff
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685150] R10: 0000000000000009 R11: 0000000000000006 R12: ffff8807c3adc4c0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685194] R13: 0000000000000006 R14: 000000000000033e R15: fffffffffffffec7
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685238] FS:  00007f30e46b8700(0000) GS:ffff88105f580000(0000) knlGS:0000000000000000
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685283] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685310] CR2: 000000001321a000 CR3: 0000000853598000 CR4: 00000000000406e0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685354] Stack:
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685374]  ffffffff813d050f 000000000000000d 0000000000000045 ffff880f79df8000
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685426]  000000000000033f 0000000000000000 0000000000016b00 000000000000033f
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685477]  ffff880f79df8000 ffff880f79fbb8b8 00000000000001f4 0000000000000054
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685528] Call Trace:
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685555]  [<ffffffff813d050f>] ? cpumask_next_and+0x2f/0x40
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685584]  [<ffffffff810b4a6e>] task_numa_migrate+0x43e/0x9b0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685613]  [<ffffffff810b3acc>] ? update_cfs_shares+0xbc/0x100
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685642]  [<ffffffff810b5059>] numa_migrate_preferred+0x79/0x80
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685672]  [<ffffffff810b9b94>] task_numa_fault+0x7f4/0xd40
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685700]  [<ffffffff813d9634>] ? timerqueue_del+0x24/0x70
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685729]  [<ffffffff810b9205>] ? should_numa_migrate_memory+0x55/0x130
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685762]  [<ffffffff811bd590>] handle_mm_fault+0xbc0/0x1820
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685793]  [<ffffffff810edc00>] ? __hrtimer_init+0x90/0x90
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685822]  [<ffffffff810c211d>] ? remove_wait_queue+0x4d/0x60
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685853]  [<ffffffff8121e20a>] ? poll_freewait+0x4a/0xa0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685882]  [<ffffffff8106a537>] __do_page_fault+0x197/0x400
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685910]  [<ffffffff8106a7c2>] do_page_fault+0x22/0x30
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685939]  [<ffffffff8180a878>] page_fault+0x28/0x30
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685967]  [<ffffffff813e4c5f>] ? copy_page_to_iter_iovec+0x5f/0x300
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.685997]  [<ffffffff810b2795>] ? select_task_rq_fair+0x625/0x700
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686026]  [<ffffffff813e4f16>] copy_page_to_iter+0x16/0xa0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686056]  [<ffffffff816f02ad>] skb_copy_datagram_iter+0x14d/0x280
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686087]  [<ffffffff8174a503>] tcp_recvmsg+0x613/0xbe0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686117]  [<ffffffff8177844e>] inet_recvmsg+0x7e/0xb0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686146]  [<ffffffff816e0d3b>] sock_recvmsg+0x3b/0x50
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686173]  [<ffffffff816e0f91>] SYSC_recvfrom+0xe1/0x160
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686202]  [<ffffffff810f36f5>] ? ktime_get_ts64+0x45/0xf0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686230]  [<ffffffff816e239e>] SyS_recvfrom+0xe/0x10
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686259]  [<ffffffff818086f2>] entry_SYSCALL_64_fastpath+0x16/0x71
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686287] Code: 55 b0 4c 89 f7 e8 53 cd ff ff 48 8b 55 b0 49 8b 4e 78 48 8b 82 d8 01 00 00 48 83 c1 01 31 d2 49 0f af 86 b0 00 00 00 4c 8b 73 78 <48> f7 f1 48 8b 4b 20 49 89 c0 48 29 c1 48 8b 45 d0 4c 03 43 48
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686512] RIP  [<ffffffff810b416e>] task_numa_find_cpu+0x22e/0x6f0
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686544]  RSP <ffff880f79fbb818>
>>>>>>> Jun 28 09:46:41 roc04r-sca090 kernel: [137912.686896] ---[ end trace 544cb9f68cb55c93 ]---
>>>>>>> Jun 28 09:52:15 roc04r-sca090 kernel: [138246.669713] mpt2sas_cm0: log_info(0x30030101): originator(IOP), code(0x03), sub_code(0x0101)
>>>>>>> Jun 28 09:55:01 roc0
>>>>>>>
>>>>>>> Tim.
>>>>>>>
>>>>>>> --
>>>>>>> Tim Bishop
>>>>>>> http://www.bishnet.net/tim/
>>>>>>> PGP Key: 0x6C226B37FDF38D55



-- 
Cheers,
Brad
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
