Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Linh Vu
This might be a networking problem. Are your client nodes on the same subnet as the Ceph
client network (i.e. public_network in ceph.conf)? In my experience, the kernel
client only likes being on the same public_network subnet as the MDS, Mons and
OSDs; otherwise you get tons of weird issues. The fuse client, however, is a lot more
tolerant of this and can jump through gateways etc. with no problem.
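
A quick way to sanity-check the subnet question is sketched below. It is only an illustration: the function name, the client addresses and the 10.128.0.0/16 value are assumptions, not taken from anyone's actual ceph.conf. It treats public_network as one or more comma-separated CIDRs and tests whether a client address falls inside any of them.

```python
import ipaddress


def on_public_network(client_ip: str, public_network: str) -> bool:
    """Return True if client_ip lies inside any subnet in public_network.

    public_network is given in the ceph.conf form: one or more
    comma-separated CIDR blocks.
    """
    addr = ipaddress.ip_address(client_ip)
    subnets = [
        ipaddress.ip_network(item.strip(), strict=False)
        for item in public_network.split(",")
        if item.strip()
    ]
    return any(addr in net for net in subnets)


# Hypothetical values, for illustration only.
PUBLIC_NETWORK = "10.128.0.0/16"
print(on_public_network("10.128.150.20", PUBLIC_NETWORK))  # client on the subnet
print(on_public_network("10.129.1.20", PUBLIC_NETWORK))    # client behind a gateway
```

A False result only says the client reaches the cluster through routing; it does not by itself prove that routing is the cause of the hangs.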



Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Gregory Farnum
The critical bit in your logs below is these lines:
Oct  1 15:04:08 worker1004 kernel: libceph: reset on mds0
Oct  1 15:04:08 worker1004 kernel: ceph: mds0 closed our session
Oct  1 15:04:08 worker1004 kernel: ceph: mds0 reconnect start
...
Oct  1 15:04:08 worker1004 kernel: ceph: mds0 reconnect denied

I couldn't tell you why the kernel client faces disconnects that it
doesn't handle more often than the userspace client does. Perhaps it (or at
least this kernel's version) isn't subscribing to mdsmap updates or
handling them quickly enough.
But that sequence means that the mount is busted and can't recover itself.
-Greg
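
The sequence described above (session closed, reconnect start, reconnect denied) can be spotted mechanically in syslog. The following is a rough sketch rather than an official tool: the function name and the regular expression are my own, and the only message formats it knows are the ones quoted in this thread.

```python
import re

# Matches the three kernel client messages quoted in this thread.
EVENT = re.compile(
    r"kernel: ceph:\s+(mds\d+) "
    r"(closed our session|reconnect start|reconnect denied)"
)


def denied_reconnects(lines):
    """Return the MDS names whose session was closed, retried and then denied."""
    state = {}
    denied = []
    for line in lines:
        match = EVENT.search(line)
        if not match:
            continue
        mds, event = match.groups()
        if event == "closed our session":
            state[mds] = "closed"
        elif event == "reconnect start" and state.get(mds) == "closed":
            state[mds] = "reconnecting"
        elif event == "reconnect denied" and state.get(mds) == "reconnecting":
            state[mds] = "denied"
            denied.append(mds)
    return denied


sample = [
    "Oct  1 15:04:08 worker1004 kernel: libceph: reset on mds0",
    "Oct  1 15:04:08 worker1004 kernel: ceph: mds0 closed our session",
    "Oct  1 15:04:08 worker1004 kernel: ceph: mds0 reconnect start",
    "Oct  1 15:04:08 worker1004 kernel: ceph: mds0 reconnect denied",
]
print(denied_reconnects(sample))
```

Any MDS name it returns corresponds to a mount that, per the explanation above, will not recover on its own.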



Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Andras Pataki
Unfortunately the CentOS kernel (3.10.0-862.14.4.el7.x86_64) has issues 
as well.  Different ones, but the nodes end up with an unusable mount in 
an hour or two.  Here are some syslogs:


Oct  1 11:50:28 worker1004 kernel: INFO: task fio:29007 blocked for more than 120 seconds.
Oct  1 11:50:28 worker1004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 11:50:28 worker1004 kernel: fio D 996d86e92f70 0 29007  28970 0x
Oct  1 11:50:28 worker1004 kernel: Call Trace:
Oct  1 11:50:28 worker1004 kernel: [] ? bit_wait+0x50/0x50
Oct  1 11:50:28 worker1004 kernel: [] schedule+0x29/0x70
Oct  1 11:50:28 worker1004 kernel: [] schedule_timeout+0x239/0x2c0
Oct  1 11:50:28 worker1004 kernel: [] ? ktime_get_ts64+0x52/0xf0
Oct  1 11:50:28 worker1004 kernel: [] ? bit_wait+0x50/0x50
Oct  1 11:50:28 worker1004 kernel: [] io_schedule_timeout+0xad/0x130
Oct  1 11:50:28 worker1004 kernel: [] io_schedule+0x18/0x20
Oct  1 11:50:28 worker1004 kernel: [] bit_wait_io+0x11/0x50
Oct  1 11:50:28 worker1004 kernel: [] __wait_on_bit_lock+0x61/0xc0
Oct  1 11:50:28 worker1004 kernel: [] __lock_page+0x74/0x90
Oct  1 11:50:28 worker1004 kernel: [] ? wake_bit_function+0x40/0x40
Oct  1 11:50:28 worker1004 kernel: [] __find_lock_page+0x54/0x70
Oct  1 11:50:28 worker1004 kernel: [] grab_cache_page_write_begin+0x55/0xc0
Oct  1 11:50:28 worker1004 kernel: [] ceph_write_begin+0x43/0xe0 [ceph]
Oct  1 11:50:28 worker1004 kernel: [] generic_file_buffered_write+0x124/0x2c0
Oct  1 11:50:28 worker1004 kernel: [] ceph_aio_write+0xa3e/0xcb0 [ceph]
Oct  1 11:50:28 worker1004 kernel: [] ? do_numa_page+0x1be/0x250
Oct  1 11:50:28 worker1004 kernel: [] ? handle_pte_fault+0x316/0xd10
Oct  1 11:50:28 worker1004 kernel: [] ? aio_read_events+0x1f3/0x2e0
Oct  1 11:50:28 worker1004 kernel: [] ? security_file_permission+0x27/0xa0
Oct  1 11:50:28 worker1004 kernel: [] ? ceph_direct_read_write+0xcd0/0xcd0 [ceph]
Oct  1 11:50:28 worker1004 kernel: [] do_io_submit+0x3c3/0x870
Oct  1 11:50:28 worker1004 kernel: [] SyS_io_submit+0x10/0x20
Oct  1 11:50:28 worker1004 kernel: [] system_call_fastpath+0x22/0x27
Oct  1 11:52:28 worker1004 kernel: INFO: task fio:29007 blocked for more than 120 seconds.
Oct  1 11:52:28 worker1004 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct  1 11:52:28 worker1004 kernel: fio D 996d86e92f70 0 29007  28970 0x
Oct  1 11:52:28 worker1004 kernel: Call Trace:
Oct  1 11:52:28 worker1004 kernel: [] ? bit_wait+0x50/0x50
Oct  1 11:52:28 worker1004 kernel: [] schedule+0x29/0x70
Oct  1 11:52:28 worker1004 kernel: [] schedule_timeout+0x239/0x2c0
Oct  1 11:52:28 worker1004 kernel: [] ? ktime_get_ts64+0x52/0xf0
Oct  1 11:52:28 worker1004 kernel: [] ? bit_wait+0x50/0x50
Oct  1 11:52:28 worker1004 kernel: [] io_schedule_timeout+0xad/0x130
Oct  1 11:52:28 worker1004 kernel: [] io_schedule+0x18/0x20
Oct  1 11:52:28 worker1004 kernel: [] bit_wait_io+0x11/0x50
Oct  1 11:52:28 worker1004 kernel: [] __wait_on_bit_lock+0x61/0xc0
Oct  1 11:52:28 worker1004 kernel: [] __lock_page+0x74/0x90
Oct  1 11:52:28 worker1004 kernel: [] ? wake_bit_function+0x40/0x40
Oct  1 11:52:28 worker1004 kernel: [] __find_lock_page+0x54/0x70
Oct  1 11:52:28 worker1004 kernel: [] grab_cache_page_write_begin+0x55/0xc0
Oct  1 11:52:28 worker1004 kernel: [] ceph_write_begin+0x43/0xe0 [ceph]
Oct  1 11:52:28 worker1004 kernel: [] generic_file_buffered_write+0x124/0x2c0
Oct  1 11:52:28 worker1004 kernel: [] ceph_aio_write+0xa3e/0xcb0 [ceph]
Oct  1 11:52:28 worker1004 kernel: [] ? do_numa_page+0x1be/0x250
Oct  1 11:52:28 worker1004 kernel: [] ? handle_pte_fault+0x316/0xd10
Oct  1 11:52:28 worker1004 kernel: [] ? aio_read_events+0x1f3/0x2e0
Oct  1 11:52:28 worker1004 kernel: [] ? security_file_permission+0x27/0xa0
Oct  1 11:52:28 worker1004 kernel: [] ? ceph_direct_read_write+0xcd0/0xcd0 [ceph]
Oct  1 11:52:28 worker1004 kernel: [] do_io_submit+0x3c3/0x870
Oct  1 11:52:28 worker1004 kernel: [] SyS_io_submit+0x10/0x20
Oct  1 11:52:28 worker1004 kernel: [] system_call_fastpath+0x22/0x27

Oct  1 15:04:08 worker1004 kernel: libceph: reset on mds0
Oct  1 15:04:08 worker1004 kernel: ceph: mds0 closed our session
Oct  1 15:04:08 worker1004 kernel: ceph: mds0 reconnect start
Oct  1 15:04:08 worker1004 kernel: libceph: osd182 10.128.150.155:6976 socket closed (con state OPEN)
Oct  1 15:04:08 worker1004 kernel: libceph: osd548 10.128.150.171:6936 socket closed (con state OPEN)
Oct  1 15:04:08 worker1004 kernel: libceph: osd59 10.128.150.154:6918 socket closed (con state OPEN)
Oct  1 15:04:08 worker1004 kernel: ceph: mds0 reconnect denied
Oct  1 15:04:08 worker1004 kernel: ceph:  dropping dirty+flushing Fw state for 997ff05193b0 1099516450605
Oct  1 15:04:08 worker1004 kernel: ceph:  dropping dirty+flushing Fw state for 997ff0519930 1099516450607
Oct  1 15:04:08 worker1004 kernel: ceph:  dropping dirty+flushing Fw

Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Andras Pataki
These hangs happen during random I/O fio benchmark loads.  Something 
like 4 or 8 fio processes doing random reads/writes to distinct large 
files (to ensure there is no caching possible).  This is all on CentOS 
7.4 nodes.  Same (and even tougher) tests run without any problems with 
ceph-fuse.  We do have jobs that do heavy parallel I/O (MPI-IO, HDF5 via 
MPI-IO, etc.) - so running 8 parallel random I/O generating processes on 
nodes with 28 cores and plenty of RAM (256GB - 512GB) should not be 
excessive.
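
For anyone wanting to approximate this kind of load, here is a rough sketch that renders a fio job file with one random read/write job per distinct file. It is not the job file actually used in these tests. The directory, file size, block size, runtime and job names are all hypothetical, and the libaio engine is only a guess based on the io_submit frames in the traces.

```python
def fio_job_file(directory, jobs=8, size="64g", block_size="4k", runtime_s=3600):
    """Render a fio job file with `jobs` random read/write jobs.

    Each job gets its own file so the jobs never share data.
    All defaults are illustrative, not measured or recommended values.
    """
    lines = [
        "[global]",
        "ioengine=libaio",   # assumption, suggested by io_submit in the traces
        "rw=randrw",         # mixed random reads and writes
        f"bs={block_size}",
        f"size={size}",
        f"runtime={runtime_s}",
        "time_based",
        f"directory={directory}",
        "",
    ]
    for i in range(jobs):
        lines += [f"[randrw-{i}]", f"filename=fio-test-{i}.dat", ""]
    return "\n".join(lines)


# Hypothetical cephfs mount point.
print(fio_job_file("/mnt/cephfs/bench", jobs=4))
```

The output can be saved to a file and passed to fio as its job file argument.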


I am going to test the latest CentOS kernel next (the one you are 
referencing).  The RedHat/CentOS kernels are not "old kernel clients": 
they contain backports of hundreds of patches to all kinds of 
Linux subsystems.  What is unclear is exactly which ceph client changes 
RedHat is backporting to their kernels.  Any pointers there would be 
helpful.


Andras



Re: [ceph-users] cephfs kernel client stability

2018-10-01 Thread Marc Roos
 
How do you test this? I have had no issues under "normal load" with an 
old kernel client and a stable os.  

CentOS Linux release 7.5.1804 (Core)
Linux c04 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 
x86_64 x86_64 x86_64 GNU/Linux





-Original Message-
From: Andras Pataki [mailto:apat...@flatironinstitute.org] 
Sent: Monday, 1 October 2018 20:10
To: ceph-users
Subject: [ceph-users] cephfs kernel client stability

We have so far been using ceph-fuse for mounting cephfs, but the small 
file performance of ceph-fuse is often problematic.  We've been testing 
the kernel client, and have seen some pretty bad crashes/hangs.

What is the policy on fixes to the kernel client?  Is only the latest 
stable kernel updated (4.18.x nowadays), or are fixes backported to LTS 
kernels also (like 4.14.x or 4.9.x for example)? I've seen various 
threads saying that certain newer features require pretty new kernels, but I'm 
wondering whether newer kernels are also required for better stability, 
or, in general, where the kernel client stability stands nowadays.

Here is an example of kernel hang with 4.14.67.  On heavy loads the 
machine isn't even pingable.

Sep 29 21:10:16 worker1004 kernel: INFO: rcu_sched self-detected stall on CPU
Sep 29 21:10:16 worker1004 kernel: #0111-...: (1 GPs behind) idle=bee/141/0 softirq=21319/21319 fqs=7499
Sep 29 21:10:16 worker1004 kernel: #011 (t=15000 jiffies g=13989 c=13988 q=8334)
Sep 29 21:10:16 worker1004 kernel: NMI backtrace for cpu 1
Sep 29 21:10:16 worker1004 kernel: CPU: 1 PID: 19436 Comm: kworker/1:42 Tainted: P    W  O    4.14.67 #1
Sep 29 21:10:16 worker1004 kernel: Hardware name: Dell Inc. PowerEdge C6320/082F9M, BIOS 2.6.0 10/27/2017
Sep 29 21:10:16 worker1004 kernel: Workqueue: ceph-msgr ceph_con_workfn [libceph]
Sep 29 21:10:16 worker1004 kernel: Call Trace:
Sep 29 21:10:16 worker1004 kernel:
Sep 29 21:10:16 worker1004 kernel: dump_stack+0x46/0x5f
Sep 29 21:10:16 worker1004 kernel: nmi_cpu_backtrace+0xba/0xc0
Sep 29 21:10:16 worker1004 kernel: ? irq_force_complete_move+0xd0/0xd0
Sep 29 21:10:16 worker1004 kernel: nmi_trigger_cpumask_backtrace+0x8a/0xc0
Sep 29 21:10:16 worker1004 kernel: rcu_dump_cpu_stacks+0x81/0xb1
Sep 29 21:10:16 worker1004 kernel: rcu_check_callbacks+0x642/0x790
Sep 29 21:10:16 worker1004 kernel: ? update_wall_time+0x26d/0x6e0
Sep 29 21:10:16 worker1004 kernel: update_process_times+0x23/0x50
Sep 29 21:10:16 worker1004 kernel: tick_sched_timer+0x2f/0x60
Sep 29 21:10:16 worker1004 kernel: __hrtimer_run_queues+0xa3/0xf0
Sep 29 21:10:16 worker1004 kernel: hrtimer_interrupt+0x94/0x170
Sep 29 21:10:16 worker1004 kernel: smp_apic_timer_interrupt+0x4c/0x90
Sep 29 21:10:16 worker1004 kernel: apic_timer_interrupt+0x84/0x90
Sep 29 21:10:16 worker1004 kernel:
Sep 29 21:10:16 worker1004 kernel: RIP: 0010:crush_hash32_3+0x1e5/0x270 [libceph]
Sep 29 21:10:16 worker1004 kernel: RSP: 0018:c9000fdff5d8 EFLAGS: 0a97 ORIG_RAX: ff10
Sep 29 21:10:16 worker1004 kernel: RAX: 06962033 RBX: 883f6e7173c0 RCX: dcdcc373
Sep 29 21:10:16 worker1004 kernel: RDX: bd5425ca RSI: 8a8b0b56 RDI: b1983b87
Sep 29 21:10:16 worker1004 kernel: RBP: 0023 R08: bd5425ca R09: 137904e9
Sep 29 21:10:16 worker1004 kernel: R10:  R11: 0002 R12: b0f29f21
Sep 29 21:10:16 worker1004 kernel: R13: 000c R14: f0ae R15: 0023
Sep 29 21:10:16 worker1004 kernel: crush_bucket_choose+0x2ad/0x340 [libceph]
Sep 29 21:10:16 worker1004 kernel: crush_choose_firstn+0x1b0/0x4c0 [libceph]
Sep 29 21:10:16 worker1004 kernel: crush_choose_firstn+0x48d/0x4c0 [libceph]
Sep 29 21:10:16 worker1004 kernel: crush_do_rule+0x28c/0x5a0 [libceph]
Sep 29 21:10:16 worker1004 kernel: ceph_pg_to_up_acting_osds+0x459/0x850 [libceph]
Sep 29 21:10:16 worker1004 kernel: calc_target+0x213/0x520 [libceph]
Sep 29 21:10:16 worker1004 kernel: ? ixgbe_xmit_frame_ring+0x362/0xe80 [ixgbe]
Sep 29 21:10:16 worker1004 kernel: ? put_prev_entity+0x27/0x620
Sep 29 21:10:16 worker1004 kernel: ? pick_next_task_fair+0x1c7/0x520
Sep 29 21:10:16 worker1004 kernel: scan_requests.constprop.55+0x16f/0x280 [libceph]
Sep 29 21:10:16 worker1004 kernel: handle_one_map+0x175/0x200 [libceph]
Sep 29 21:10:16 worker1004 kernel: ceph_osdc_handle_map+0x390/0x850 [libceph]
Sep 29 21:10:16 worker1004 kernel: ? ceph_x_encrypt+0x46/0x70 [libceph]
Sep 29 21:10:16 worker1004 kernel: dispatch+0x2ef/0xba0 [libceph]
Sep 29 21:10:16 worker1004 kernel: ? read_partial_message+0x215/0x880 [libceph]
Sep 29 21:10:16 worker1004 kernel: ? inet_recvmsg+0x45/0xb0
Sep 29 21:10:16 worker1004 kernel: try_read+0x6f8/0x11b0 [libceph]
Sep 29 21:10:16 worker1004 kernel: ? sched_clock_cpu+0xc/0xa0
Sep 29 21:10:16 worker1004 kernel: ? put_prev_entity+0x27/0x620
Sep 29 21:10:16 worker1004 kernel: ? pick_next_task_fair+0x415/0x520
Sep 29 21:10:16