Re: [ceph-users] ceph log level

2019-12-30 Thread Marc Roos
However, I can not get rid of these messages.

Dec 30 10:13:10 c02 ceph-mgr: 2019-12-30 10:13:10.343 7f7d3a2f8700  0 
log_channel(cluster) log [DBG] : pgmap v710220: 
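
If anyone wants to dig into the same thing, a quick way to see what the 
running mgr actually has configured (mgr.c02 is just this host's daemon, 
adjust the name to yours):

# settings that differ from the built-in defaults on the running daemon
ceph daemon mgr.c02 config diff
# current debug levels and cluster-log related options
ceph daemon mgr.c02 config show | grep -E 'debug_mgr|clog'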


-Original Message-
To: ceph-users; deaderzzs
Subject: Re: [ceph-users] ceph log level

I am decreasing logging with this script.


#!/bin/bash

declare -A logarrosd
declare -A logarrmon
declare -A logarrmgr

# default values luminous 12.2.7
logarrosd[debug_asok]="1/5"
logarrosd[debug_auth]="1/5"
logarrosd[debug_buffer]="0/1"
logarrosd[debug_client]="0/5"
logarrosd[debug_context]="0/1"
logarrosd[debug_crush]="1/1"
logarrosd[debug_filer]="0/1"
logarrosd[debug_filestore]="1/3"
logarrosd[debug_finisher]="1/1"
logarrosd[debug_heartbeatmap]="1/5"
logarrosd[debug_journal]="1/3"
logarrosd[debug_journaler]="0/5"
logarrosd[debug_lockdep]="0/1"
logarrosd[debug_mds]="1/5"
logarrosd[debug_mon]="1/5"
logarrosd[debug_monc]="0/10"
logarrosd[debug_ms]="0/5"
logarrosd[debug_objclass]="0/5"
logarrosd[debug_objectcacher]="0/5"
logarrosd[debug_objecter]="0/1"
logarrosd[debug_optracker]="0/5"
logarrosd[debug_osd]="1/5"
logarrosd[debug_paxos]="1/5"
logarrosd[debug_perfcounter]="1/5"
logarrosd[debug_rados]="0/5"
logarrosd[debug_rbd]="0/5"
logarrosd[debug_rgw]="1/5"
logarrosd[debug_rgw_sync]="1/5"
logarrosd[debug_throttle]="1/1"
logarrosd[debug_timer]="0/1"
logarrosd[debug_tp]="0/5"

logarrosd[debug_mds_balancer]="1/5"
logarrosd[debug_mds_locker]="1/5"
logarrosd[debug_mds_log]="1/5"
logarrosd[debug_mds_log_expire]="1/5"
logarrosd[debug_mds_migrator]="1/5"
logarrosd[debug_striper]="0/1"
logarrosd[debug_rbd_mirror]="0/5"
logarrosd[debug_rbd_replay]="0/5"
logarrosd[debug_crypto]="1/5"
logarrosd[debug_reserver]="1/1"
logarrosd[debug_civetweb]="1/10"
logarrosd[debug_javaclient]="1/5"
logarrosd[debug_xio]="1/5"
logarrosd[debug_compressor]="1/5"
logarrosd[debug_bluestore]="1/5"
logarrosd[debug_bluefs]="1/5" 
logarrosd[debug_bdev]="1/3"
logarrosd[debug_kstore]="1/5"
logarrosd[debug_rocksdb]="4/5"
logarrosd[debug_leveldb]="4/5"
logarrosd[debug_memdb]="4/5"
logarrosd[debug_kinetic]="1/5"
logarrosd[debug_fuse]="1/5"
logarrosd[debug_mgr]="1/5"
logarrosd[debug_mgrc]="1/5"
logarrosd[debug_dpdk]="1/5"
logarrosd[debug_eventtrace]="1/5"
logarrmon[debug_asok]="1/5"
logarrmon[debug_auth]="1/5"
logarrmon[debug_bdev]="1/3"
logarrmon[debug_bluefs]="1/5"
logarrmon[debug_bluestore]="1/5"
logarrmon[debug_buffer]="0/1"
logarrmon[debug_civetweb]="1/10"
logarrmon[debug_client]="0/5"
logarrmon[debug_compressor]="1/5"
logarrmon[debug_context]="0/1"
logarrmon[debug_crush]="1/1"
logarrmon[debug_crypto]="1/5"
logarrmon[debug_dpdk]="1/5"
logarrmon[debug_eventtrace]="1/5"
logarrmon[debug_filer]="0/1"
logarrmon[debug_filestore]="1/3"
logarrmon[debug_finisher]="1/1"
logarrmon[debug_fuse]="1/5"
logarrmon[debug_heartbeatmap]="1/5"
logarrmon[debug_javaclient]="1/5"
logarrmon[debug_journal]="1/3"
logarrmon[debug_journaler]="0/5"
logarrmon[debug_kinetic]="1/5"
logarrmon[debug_kstore]="1/5"
logarrmon[debug_leveldb]="4/5"
logarrmon[debug_lockdep]="0/1"
logarrmon[debug_mds]="1/5"
logarrmon[debug_mds_balancer]="1/5"
logarrmon[debug_mds_locker]="1/5"
logarrmon[debug_mds_log]="1/5"
logarrmon[debug_mds_log_expire]="1/5"
logarrmon[debug_mds_migrator]="1/5"
logarrmon[debug_memdb]="4/5"
logarrmon[debug_mgr]="1/5"
logarrmon[debug_mgrc]="1/5"
logarrmon[debug_mon]="1/5"
logarrmon[debug_monc]="0/10"
logarrmon[debug_ms]="0/0"
logarrmon[debug_none]="0/5"
logarrmon[debug_objclass]="0/5"
logarrmon[debug_objectcacher]="0/5"
logarrmon[debug_objecter]="0/1"
logarrmon[debug_optracker]="0/5"
logarrmon[debug_osd]="1/5"
logarrmon[debug_paxos]="1/5"
logarrmon[debug_perfcounter]="1/5"
logarrmon[debug_rados]="0/5"
logarrmon[debug_rbd]="0/5"
logarrmon[debug_rbd_mirror]="0/5"
logarrmon[debug_rbd_replay]="0/5"
logarrmon[debug_refs]="0/0"
logarrmon[debug_reserver]="1/1"
logarrmon[debug_rgw]="1/5"
logarrmon[debug_rgw_sync]="1/5"
logarrmon[debug_rocksdb]="4/5"
logarrmon[debug_striper]="0/1"
logarrmon[debug_throttle]="1/1"
logarrmon[debug_timer]="0/1"
logarrmon[debug_tp]="0/5"
logarrmon[debug_xio]="1/5"


for osdk in "${!logarrosd[@]}"
do
  ceph tell osd.* injectargs --$osdk=0/0
done

for monk in "${!logarrmon[@]}"
do
  ceph tell mon.* injectargs --$monk=0/0
done
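
A sketch of the persistent alternative on Mimic and newer (injectargs only 
changes the running daemons, while these settings survive restarts):

# store the levels in the cluster's central config database
ceph config set osd debug_osd 0/0
ceph config set mon debug_mon 0/0
ceph config set mgr debug_mgr 0/0
# check what is stored centrally
ceph config dump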



-Original Message-
From: Zhenshi Zhou [mailto:deader...@gmail.com]
Sent: 30 December 2019 05:41
To: ceph-users
Subject: [ceph-users] ceph log level

Hi all,

OSD servers generate a huge number of log messages. I configured 'debug_osd' 
to 1/5 or 1/20, but it seems it is not working. Is there any other option 
which overrides this configuration?

Ceph version mimic(13.2.5)

Thanks


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph log level

2019-12-30 Thread Marc Roos
I am decreasing logging with this script.


#!/bin/bash

declare -A logarrosd
declare -A logarrmon
declare -A logarrmgr

# default values luminous 12.2.7
logarrosd[debug_asok]="1/5"
logarrosd[debug_auth]="1/5"
logarrosd[debug_buffer]="0/1"
logarrosd[debug_client]="0/5"
logarrosd[debug_context]="0/1"
logarrosd[debug_crush]="1/1"
logarrosd[debug_filer]="0/1"
logarrosd[debug_filestore]="1/3"
logarrosd[debug_finisher]="1/1"
logarrosd[debug_heartbeatmap]="1/5"
logarrosd[debug_journal]="1/3"
logarrosd[debug_journaler]="0/5"
logarrosd[debug_lockdep]="0/1"
logarrosd[debug_mds]="1/5"
logarrosd[debug_mon]="1/5"
logarrosd[debug_monc]="0/10"
logarrosd[debug_ms]="0/5"
logarrosd[debug_objclass]="0/5"
logarrosd[debug_objectcacher]="0/5"
logarrosd[debug_objecter]="0/1"
logarrosd[debug_optracker]="0/5"
logarrosd[debug_osd]="1/5"
logarrosd[debug_paxos]="1/5"
logarrosd[debug_perfcounter]="1/5"
logarrosd[debug_rados]="0/5"
logarrosd[debug_rbd]="0/5"
logarrosd[debug_rgw]="1/5"
logarrosd[debug_rgw_sync]="1/5"
logarrosd[debug_throttle]="1/1"
logarrosd[debug_timer]="0/1"
logarrosd[debug_tp]="0/5"

logarrosd[debug_mds_balancer]="1/5"
logarrosd[debug_mds_locker]="1/5"
logarrosd[debug_mds_log]="1/5"
logarrosd[debug_mds_log_expire]="1/5"
logarrosd[debug_mds_migrator]="1/5"
logarrosd[debug_striper]="0/1"
logarrosd[debug_rbd_mirror]="0/5"
logarrosd[debug_rbd_replay]="0/5"
logarrosd[debug_crypto]="1/5"
logarrosd[debug_reserver]="1/1"
logarrosd[debug_civetweb]="1/10"
logarrosd[debug_javaclient]="1/5"
logarrosd[debug_xio]="1/5"
logarrosd[debug_compressor]="1/5"
logarrosd[debug_bluestore]="1/5"
logarrosd[debug_bluefs]="1/5" 
logarrosd[debug_bdev]="1/3"
logarrosd[debug_kstore]="1/5"
logarrosd[debug_rocksdb]="4/5"
logarrosd[debug_leveldb]="4/5"
logarrosd[debug_memdb]="4/5"
logarrosd[debug_kinetic]="1/5"
logarrosd[debug_fuse]="1/5"
logarrosd[debug_mgr]="1/5"
logarrosd[debug_mgrc]="1/5"
logarrosd[debug_dpdk]="1/5"
logarrosd[debug_eventtrace]="1/5"
logarrmon[debug_asok]="1/5"
logarrmon[debug_auth]="1/5"
logarrmon[debug_bdev]="1/3"
logarrmon[debug_bluefs]="1/5"
logarrmon[debug_bluestore]="1/5"
logarrmon[debug_buffer]="0/1"
logarrmon[debug_civetweb]="1/10"
logarrmon[debug_client]="0/5"
logarrmon[debug_compressor]="1/5"
logarrmon[debug_context]="0/1"
logarrmon[debug_crush]="1/1"
logarrmon[debug_crypto]="1/5"
logarrmon[debug_dpdk]="1/5"
logarrmon[debug_eventtrace]="1/5"
logarrmon[debug_filer]="0/1"
logarrmon[debug_filestore]="1/3"
logarrmon[debug_finisher]="1/1"
logarrmon[debug_fuse]="1/5"
logarrmon[debug_heartbeatmap]="1/5"
logarrmon[debug_javaclient]="1/5"
logarrmon[debug_journal]="1/3"
logarrmon[debug_journaler]="0/5"
logarrmon[debug_kinetic]="1/5"
logarrmon[debug_kstore]="1/5"
logarrmon[debug_leveldb]="4/5"
logarrmon[debug_lockdep]="0/1"
logarrmon[debug_mds]="1/5"
logarrmon[debug_mds_balancer]="1/5"
logarrmon[debug_mds_locker]="1/5"
logarrmon[debug_mds_log]="1/5"
logarrmon[debug_mds_log_expire]="1/5"
logarrmon[debug_mds_migrator]="1/5"
logarrmon[debug_memdb]="4/5"
logarrmon[debug_mgr]="1/5"
logarrmon[debug_mgrc]="1/5"
logarrmon[debug_mon]="1/5"
logarrmon[debug_monc]="0/10"
logarrmon[debug_ms]="0/0"
logarrmon[debug_none]="0/5"
logarrmon[debug_objclass]="0/5"
logarrmon[debug_objectcacher]="0/5"
logarrmon[debug_objecter]="0/1"
logarrmon[debug_optracker]="0/5"
logarrmon[debug_osd]="1/5"
logarrmon[debug_paxos]="1/5"
logarrmon[debug_perfcounter]="1/5"
logarrmon[debug_rados]="0/5"
logarrmon[debug_rbd]="0/5"
logarrmon[debug_rbd_mirror]="0/5"
logarrmon[debug_rbd_replay]="0/5"
logarrmon[debug_refs]="0/0"
logarrmon[debug_reserver]="1/1"
logarrmon[debug_rgw]="1/5"
logarrmon[debug_rgw_sync]="1/5"
logarrmon[debug_rocksdb]="4/5"
logarrmon[debug_striper]="0/1"
logarrmon[debug_throttle]="1/1"
logarrmon[debug_timer]="0/1"
logarrmon[debug_tp]="0/5"
logarrmon[debug_xio]="1/5"


for osdk in "${!logarrosd[@]}"
do
  ceph tell osd.* injectargs --$osdk=0/0
done

for monk in "${!logarrmon[@]}"
do
  ceph tell mon.* injectargs --$monk=0/0
done



-Original Message-
From: Zhenshi Zhou [mailto:deader...@gmail.com] 
Sent: 30 December 2019 05:41
To: ceph-users
Subject: [ceph-users] ceph log level

Hi all,

OSD servers generate a huge number of log messages. I configured 'debug_osd' 
to 1/5 or 1/20, but it seems it is not working. Is there any other option 
which overrides this configuration?

Ceph version mimic(13.2.5)

Thanks


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How can I stop this logging?

2019-12-23 Thread Marc Roos



 384 active+clean; 19 TiB data, 45 TiB used, 76 TiB / 122 TiB avail; 3.4 
KiB/s rd, 573 KiB/s wr, 20 op/s
Dec 23 11:58:25 c02 ceph-mgr: 2019-12-23 11:58:25.194 7f7d3a2f8700  0 
log_channel(cluster) log [DBG] : pgmap v411196: 384 pgs: 384 
active+clean; 19 TiB data, 45 TiB used, 76 TiB / 122 TiB avail; 3.3 
KiB/s rd, 521 KiB/s wr, 20 op/s
Dec 23 11:58:27 c02 ceph-mgr: 2019-12-23 11:58:27.196 7f7d3a2f8700  0 
log_channel(cluster) log [DBG] : pgmap v411197: 384 pgs: 384 
active+clean; 19 TiB data, 45 TiB used, 76 TiB / 122 TiB avail; 3.4 
KiB/s rd, 237 KiB/s wr, 19 op/s
Dec 23 11:58:29 c02 ceph-mgr: 2019-12-23 11:58:29.197 7f7d3a2f8700  0 
log_channel(cluster) log [DBG] : pgmap v411198: 384 pgs: 384 
active+clean; 19 TiB data, 45 TiB used, 76 TiB / 122 TiB avail; 3.2 
KiB/s rd, 254 KiB/s wr, 17 op/s
Dec 23 11:58:31 c02 ceph-mgr: 2019-12-23 11:58:31.199 7f7d3a2f8700  0 
log_channel(cluster) log [DBG] : pgmap v411199: 384 pgs: 384 
active+clean; 19 TiB data, 45 TiB used, 76 TiB / 122 TiB avail; 2.9 
KiB/s rd, 258 KiB/s wr, 17 op/s
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Separate disk sets for high IO?

2019-12-16 Thread Marc Roos



You can classify OSDs, e.g. as ssd, and you can assign this class to a pool 
you create. This way you have RBDs running on only SSDs. I think there is 
also a class for nvme, and you can create custom classes.
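
A minimal sketch of that (the OSD ids, rule name and pool name are made up):

# if the OSDs were auto-classified already, clear the old class first
ceph osd crush rm-device-class osd.10 osd.11 osd.12
ceph osd crush set-device-class ssd osd.10 osd.11 osd.12
# a replicated rule that only picks OSDs of class ssd
ceph osd crush rule create-replicated fast-ssd default host ssd
# point an existing pool at that rule
ceph osd pool set rbd-fast crush_rule fast-ssd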


 

-Original Message-
From: Philip Brown [mailto:pbr...@medata.com] 
Sent: 16 December 2019 22:55
To: ceph-users
Subject: [ceph-users] Separate disk sets for high IO?

Still relatively new to ceph, but have been tinkering for a few weeks 
now.

If I'm reading the various docs correctly, then any RBD in a particular 
ceph cluster, will be distributed across ALL OSDs, ALL the time.
There is no way to designate a particular set of disks, AKA OSDs, to be 
a high performance group, and allocate certain RBDs to only use that set 
of disks.
Pools, only control things like the replication count, and number of 
placement groups.

I'd have to set up a whole new ceph cluster for the type of behavior I 
want.

Am I correct?



--
Philip Brown| Sr. Linux System Administrator | Medata, Inc. 
5 Peters Canyon Rd Suite 250
Irvine CA 92606
Office 714.918.1310| Fax 714.918.1325
pbr...@medata.com| www.medata.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deleted snap dirs are back as _origdir_1099536400705

2019-12-16 Thread Marc Roos


Yes, thanks!!! You are right: I deleted the snapshots that had been created 
higher up, and they are now gone.
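
For anyone else hitting this, a small sketch of the idea (mount point and 
snapshot name are made up): the _name_inode entries in a subdirectory's .snap 
are snapshots that were taken on an ancestor directory, and they go away once 
the original snapshot at that ancestor is removed.

ls /mnt/cephfs/.snap            # find the original snapshot at the ancestor
rmdir /mnt/cephfs/.snap/snap-1  # removing it clears the _snap-1_<inode> entries below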
 

-Original Message-
Cc: ceph-users
Subject: Re: [ceph-users] deleted snap dirs are back as 
_origdir_1099536400705

With just the one ls listing and my memory it's not totally clear, but I 
believe this is the output you get when you delete a snapshot folder that is 
still referenced by a different snapshot farther up the hierarchy.
-Greg

On Mon, Dec 16, 2019 at 8:51 AM Marc Roos  
wrote:
>
>
> Am I the only lucky one having this problem? Should I use the 
> bugtracker system for this?
>
> -Original Message-
> From: Marc Roos
> Sent: 14 December 2019 10:05
> Cc: ceph-users
> Subject: Re: [ceph-users] deleted snap dirs are back as
> _origdir_1099536400705
>
>
>
> ceph tell mds.a scrub start / recursive repair Did not fix this.
>
>
>
> -Original Message-
> Cc: ceph-users
> Subject: [ceph-users] deleted snap dirs are back as
> _origdir_1099536400705
>
>
> I thought I deleted snapshot dirs, but I still have them but with a 
> different name. How to get rid of these?
>
> [@ .snap]# ls -1
> _snap-1_1099536400705
> _snap-2_1099536400705
> _snap-3_1099536400705
> _snap-4_1099536400705
> _snap-5_1099536400705
> _snap-6_1099536400705
> _snap-7_1099536400705
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deleted snap dirs are back as _origdir_1099536400705

2019-12-16 Thread Marc Roos
 
Am I the only lucky one having this problem? Should I use the bugtracker 
system for this?

-Original Message-
From: Marc Roos 
Sent: 14 December 2019 10:05
Cc: ceph-users
Subject: Re: [ceph-users] deleted snap dirs are back as 
_origdir_1099536400705

 

ceph tell mds.a scrub start / recursive repair Did not fix this.



-Original Message-
Cc: ceph-users
Subject: [ceph-users] deleted snap dirs are back as
_origdir_1099536400705

 
I thought I deleted snapshot dirs, but I still have them but with a 
different name. How to get rid of these?

[@ .snap]# ls -1
_snap-1_1099536400705
_snap-2_1099536400705
_snap-3_1099536400705
_snap-4_1099536400705
_snap-5_1099536400705
_snap-6_1099536400705
_snap-7_1099536400705

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] deleted snap dirs are back as _origdir_1099536400705

2019-12-14 Thread Marc Roos
 

ceph tell mds.a scrub start / recursive repair
Did not fix this.



-Original Message-
Cc: ceph-users
Subject: [ceph-users] deleted snap dirs are back as 
_origdir_1099536400705

 
I thought I deleted snapshot dirs, but I still have them but with a 
different name. How to get rid of these?

[@ .snap]# ls -1
_snap-1_1099536400705
_snap-2_1099536400705
_snap-3_1099536400705
_snap-4_1099536400705
_snap-5_1099536400705
_snap-6_1099536400705
_snap-7_1099536400705
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph tell mds.a scrub status "problem getting command descriptions"

2019-12-13 Thread Marc Roos
 
client.admin did not have the correct rights:

 ceph auth caps client.admin mds "allow *" mgr "allow *" mon "allow *" 
osd "allow *"


-Original Message-
To: ceph-users
Subject: [ceph-users] ceph tell mds.a scrub status "problem getting 
command descriptions"


ceph tell mds.a scrub status

Generates

2019-12-14 00:46:38.782 7fef4affd700  0 client.3744774 ms_handle_reset 
on v2:192.168.10.111:6800/3517983549 Error EPERM: problem getting 
command descriptions from mds.a 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph tell mds.a scrub status "problem getting command descriptions"

2019-12-13 Thread Marc Roos


ceph tell mds.a scrub status

Generates

2019-12-14 00:46:38.782 7fef4affd700  0 client.3744774 ms_handle_reset 
on v2:192.168.10.111:6800/3517983549
Error EPERM: problem getting command descriptions from mds.a
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] deleted snap dirs are back as _origdir_1099536400705

2019-12-13 Thread Marc Roos
 
I thought I deleted snapshot dirs, but I still have them but with a 
different name. How to get rid of these?

[@ .snap]# ls -1
_snap-1_1099536400705
_snap-2_1099536400705
_snap-3_1099536400705
_snap-4_1099536400705
_snap-5_1099536400705
_snap-6_1099536400705
_snap-7_1099536400705
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] sharing single SSD across multiple HD based OSDs

2019-12-10 Thread Marc Roos
Just also a bit curious: so it just creates a PV on sda, and no partitioning 
is done on sda? 


-Original Message-
From: Daniel Sung [mailto:daniel.s...@quadraturecapital.com] 
Sent: dinsdag 10 december 2019 14:40
To: Philip Brown
Cc: ceph-users
Subject: Re: [ceph-users] sharing single SSD across multiple HD based 
OSDs

It just uses LVM to create a bunch of LVs. It doesn't actually create 
separate partitions on the block devices. You can run the command and it 
will give you a preview of what it will do and ask for confirmation. 
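
A sketch of the dry-run variant (device names as in the example above):

# print the proposed layout without creating anything
ceph-volume lvm batch --report /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf
# after deployment, show which LV backs which OSD and what role it plays
ceph-volume lvm list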

On Tue, 10 Dec 2019 at 13:36, Philip Brown  wrote:


Interesting. What did the partitioning look like?

- Original Message -
From: "Daniel Sung" 
To: "Nathan Fish" 
Cc: "Philip Brown" , "ceph-users" 

Sent: Tuesday, December 10, 2019 1:21:36 AM
Subject: Re: [ceph-users] sharing single SSD across multiple HD 
based OSDs

The way I did this was to use:

ceph-volume lvm batch /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
/dev/sdf etc

Where you just list all of the block devices you want to use in a 
group. It
will automatically determine which devices are SSD and then 
automatically
partition it for you and share it amongst the HDD OSDs.




-- 

Quadrature Capital
daniel.s...@quadraturecapital.com
http://www.quadraturecapital.com
Dir: +44-20-3743-0428 Main: +44-20-3743-0400 Fax: +44-20-3743-0401 The 
Leadenhall Building, 122 Leadenhall Street, London, EC3V 4AB


---
Quadrature Capital Limited, a limited company, registered in England and 
Wales with registered number 09516131, is authorised and regulated by 
the Financial Conduct Authority.

Any e-mail sent from Quadrature Capital may contain information which is 
confidential. Unless you are the intended recipient, you may not 
disclose, copy or use it; please notify the sender immediately and 
delete it and any copies from your systems. You should protect your 
system from viruses etc.; we accept no responsibility for damage that 
may be caused by them.

We may monitor email content for the purposes of ensuring compliance 
with law and our policies, as well as details of correspondents to 
supplement our relationships database.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph-mgr :: Grafana + Telegraf / InfluxDB metrics format

2019-12-10 Thread Marc Roos


 >I am having hard times with creating graphs I want to see. Metrics are 
exported in way that every single one is stored in separated series in 
Influx like:
 >
 >> ceph_pool_stats,cluster=ceph1,metric=read value=1234 
15506589110
 >> ceph_pool_stats,cluster=ceph1,metric=write value=1234 
15506589110
 >> ceph_pool_stats,cluster=ceph1,metric=total value=1234 
15506589110
 >
 >instead of single series like:
 >
 >> ceph_pool_stats,cluster=ceph1 read=1234,write=1234,total=1234 
15506589110
 >

That is how time-series databases work: one value per timestamp.

 >This means when I want to create graph of something like % usage ratio 
(= bytes_used / bytes_total) or number of faulty OSDs (= num_osd_up - 
num_osd_in) I am unable to do it with single query like

I use a static variable in Influx to fix this. I have a static value for 
osds and totalbytes (although newer Ceph versions log this, I think). You 
can also solve this with continuous queries in Influx. 
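
As a rough sketch of the continuous-query approach (the database, measurement 
and tag names are assumptions based on the examples in this thread), two CQs 
can downsample the tagged series into one measurement with named fields, after 
which Grafana can do the subtraction in a single query:

CREATE CONTINUOUS QUERY "cq_osd_up" ON "telegraf" BEGIN
  SELECT last("value") AS "num_osd_up" INTO "ceph_osd_counts"
  FROM "ceph_cluster_stats" WHERE "metric" = 'num_osd_up'
  GROUP BY time(5m), "cluster"
END

CREATE CONTINUOUS QUERY "cq_osd_in" ON "telegraf" BEGIN
  SELECT last("value") AS "num_osd_in" INTO "ceph_osd_counts"
  FROM "ceph_cluster_stats" WHERE "metric" = 'num_osd_in'
  GROUP BY time(5m), "cluster"
END

-- then, in Grafana:
SELECT mean("num_osd_up") - mean("num_osd_in") FROM "ceph_osd_counts"
WHERE "cluster" =~ /^ceph1$/ AND $timeFilter GROUP BY time(5m) fill(null)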

 >> SELECT mean("num_osd_up") - mean("num_osd_in") FROM 
"ceph_cluster_stats" WHERE "cluster" =~ /^ceph1$/ AND time >= now() - 6h 
GROUP BY time(5m) fill(null)
 
SELECT $totalosds - last("value")  FROM "ceph_value" WHERE "type" = 
'ceph_bytes' AND "type_instance" = 'Cluster.numOsdIn' AND $timeFilter 
GROUP BY time($__interval) fill(null)
 
 >but instead it requires two queries followed by math operation, which 
I was unable to get it working in my Grafana nor InfluxDB (I believe 
it's not supported, Influx removed JOIN queries some time ago).
 > 



-Original Message-
From: Miroslav Kalina [mailto:miroslav.kal...@livesport.eu] 
Sent: dinsdag 10 december 2019 13:31
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph-mgr :: Grafana + Telegraf / InfluxDB metrics 
format

Hello guys,

is there anyone using Telegraf / InfluxDB metrics exporter with Grafana 
dashboards? I am asking like that because I was unable to find any 
existing Grafana dashboards based on InfluxDB.

I am having hard times with creating graphs I want to see. Metrics are 
exported in way that every single one is stored in separated series in 
Influx like:

> ceph_pool_stats,cluster=ceph1,metric=read value=1234 
15506589110
> ceph_pool_stats,cluster=ceph1,metric=write value=1234 
15506589110
> ceph_pool_stats,cluster=ceph1,metric=total value=1234 
15506589110

instead of single series like:

> ceph_pool_stats,cluster=ceph1 read=1234,write=1234,total=1234 
15506589110

This means when I want to create graph of something like % usage ratio 
(= bytes_used / bytes_total) or number of faulty OSDs (= num_osd_up - 
num_osd_in) I am unable to do it with single query like

> SELECT mean("num_osd_up") - mean("num_osd_in") FROM 
"ceph_cluster_stats" WHERE "cluster" =~ /^ceph1$/ AND time >= now() - 6h 
GROUP BY time(5m) fill(null)

but instead it requires two queries followed by math operation, which I 
was unable to get it working in my Grafana nor InfluxDB (I believe it's 
not supported, Influx removed JOIN queries some time ago).

I didn't see any possibility how to modify metrics format exported to 
Telegraf. I feel like I am missing something pretty obvious here.

I am currently unable to switch to prometheus exporter (which don't have 
this kind of issue) because of my current infrastructure setup.

Currently I am using following versions:
 * Ceph 14.2.4
 * InfluxDB 1.6.4
 * Grafana 6.4.2

So ... do you have it working anyone? Please could you share your 
dashboards?

Best regards

-- 
Miroslav Kalina
Systems developement specialist

Livesport s.r.o.
Aspira Business Centre
Bucharova 2928/14a, 158 00 Praha 5
www.livesport.eu


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Qemu RBD image usage

2019-12-09 Thread Marc Roos
 
This should get you started with using rbd.

[The libvirt XML snippets that followed were stripped by the list archive. The
first was a <disk> definition for the RBD-backed device (the remnants still
mention a WDC WD40EFRX-68WT0N0); the second was a "cat > secret.xml <<EOF ...
EOF" heredoc defining a ceph secret named "client.rbd.vps secret".]
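
Since the markup is gone, here is a rough sketch of what such snippets usually
look like; the pool/image name, monitor host and secret UUID are placeholders,
not the values from the original mail:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw'/>
  <auth username='rbd.vps'>
    <secret type='ceph' uuid='REPLACE-WITH-SECRET-UUID'/>
  </auth>
  <source protocol='rbd' name='rbd/vps-disk1'>
    <host name='monitor-host' port='6789'/>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>

cat > secret.xml <<EOF
<secret ephemeral='no' private='no'>
  <usage type='ceph'>
    <name>client.rbd.vps secret</name>
  </usage>
</secret>
EOF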

virsh secret-define --file secret.xml

virsh secret-set-value --secret  --base64 `ceph auth get-key 
client.rbd.vps 2>/dev/null`



-Original Message-
To: ceph-users@lists.ceph.com
Cc: d...@ceph.io
Subject: [ceph-users] Qemu RBD image usage

Hi all,
   I want to attach another RBD image into the Qemu VM to be used as 
disk.
   However, it always failed.  The VM definiation xml is attached.
   Could anyone tell me where I did wrong?
   || nstcc3@nstcloudcc3:~$ sudo virsh start ubuntu_18_04_mysql 
--console
   || error: Failed to start domain ubuntu_18_04_mysql
   || error: internal error: process exited while connecting to monitor:
   || 2019-12-09T16:24:30.284454Z qemu-system-x86_64: -drive
   || 
file=rbd:rwl_mysql/mysql_image:auth_supported=none:mon_host=nstcloudcc4\
:6789,format=raw,if=none,id=drive-virtio-disk1:
   || error connecting: Operation not supported


   The cluster info is below:
   || ceph@nstcloudcc3:~$ ceph --version
   || ceph version 14.0.0-16935-g9b6ef711f3 
(9b6ef711f3a40898756457cb287bf291f45943f0) octopus (dev)
   || ceph@nstcloudcc3:~$ ceph -s
   ||   cluster:
   || id: e31502ff-1fb4-40b7-89a8-2b85a77a3b09
   || health: HEALTH_OK
   ||  
   ||   services:
   || mon: 1 daemons, quorum nstcloudcc4 (age 2h)
   || mgr: nstcloudcc4(active, since 2h)
   || osd: 4 osds: 4 up (since 2h), 4 in (since 2h)
   ||  
   ||   data:
   || pools:   1 pools, 128 pgs
   || objects: 6 objects, 6.3 KiB
   || usage:   4.0 GiB used, 7.3 TiB / 7.3 TiB avail
   || pgs: 128 active+clean
   ||  
   || ceph@nstcloudcc3:~$
   || ceph@nstcloudcc3:~$ rbd info rwl_mysql/mysql_image
   || rbd image 'mysql_image':
   || size 100 GiB in 25600 objects
   || order 22 (4 MiB objects)
   || snapshot_count: 0
   || id: 110feda39b1c
   || block_name_prefix: rbd_data.110feda39b1c
   || format: 2
   || features: layering, exclusive-lock, object-map, fast-diff, 
deep-flatten
   || op_features: 
   || flags: 
   || create_timestamp: Mon Dec  9 23:48:17 2019
   || access_timestamp: Mon Dec  9 23:48:17 2019
   || modify_timestamp: Mon Dec  9 23:48:17 2019

B.R.
Changcheng


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] 2 different ceph-users lists?

2019-12-05 Thread Marc Roos
 

ceph-users@lists.ceph.com is the old one; why that is, I also do not know.

https://www.mail-archive.com/search?l=all=ceph


-Original Message-
From: Rodrigo Severo - Fábrica [mailto:rodr...@fabricadeideias.com] 
Sent: donderdag 5 december 2019 20:37
To: ceph-users@lists.ceph.com; ceph-us...@ceph.io
Subject: [ceph-users] 2 different ceph-users lists?

Hi,


Are there 2 different ceph-users list?

ceph-users@lists.ceph.com

and

ceph-us...@ceph.io

Why? What's the difference?


Regards,

Rodrigo Severo
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSDs behind Hardware Raid

2019-12-04 Thread Marc Roos
 
But I guess that in 'ceph osd tree' the SSDs were then also displayed 
as hdd?
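
If they do show up with the wrong class, a sketch of re-labelling them (the 
OSD id is an example):

ceph osd crush rm-device-class osd.3
ceph osd crush set-device-class ssd osd.3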



-Original Message-
From: Stolte, Felix [mailto:f.sto...@fz-juelich.de] 
Sent: woensdag 4 december 2019 9:12
To: ceph-users
Subject: [ceph-users] SSDs behind Hardware Raid

Hi guys,

 

maybe this is common knowledge for the most of you, for me it was not:

 

if you are using SSDs behind a raid controller in raid mode  (not JBOD) 
make sure your operating system treats them correctly as SSDs. I am an 
Ubuntu user but I think the following applies to all linux operating 
systems:

 

/sys/block/<device>/queue/rotational determines whether a device is 
treated as rotational or not: 0 stands for SSD, 1 for rotational.
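
A small sketch of checking and flipping the flag (sdb and the OSD id are 
examples; the echo does not survive a reboot, so a udev rule is usually used 
to make it stick):

cat /sys/block/sdb/queue/rotational      # 1 = treated as rotational
echo 0 > /sys/block/sdb/queue/rotational
systemctl restart ceph-osd@12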

 

In my case Ubuntu treated my SSDs (Raid 0, 1 Disk) as rotational. 
Changing the parameter above for my SSDs to 0 and restarting the 
corresponding osd daemons increased 4K write IOPS drastically:

 

rados -p ssd bench 60 write -b 4K (6 Nodes, 3 SSDs each)

 

Before: ~5200 IOPS

After: ~11500 IOPS

 

@Developers: I am aware that this is not directly a Ceph issue, but maybe 
you could consider adding this hint to your documentation. I could be wrong, 
but I think I am not the only one using a HW RAID controller for OSDs (not 
willingly, by the way).

 

On a sidenote: ceph-volume lvm batch uses the rotational parameter as 
well for identifying SSDs (please correct me if I am wrong).

 

Best regards

Felix

 


-


-

Forschungszentrum Juelich GmbH

52425 Juelich

Sitz der Gesellschaft: Juelich

Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498

Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher

Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),

Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt,

Prof. Dr. Sebastian M. Schmidt


-


-

 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd image size

2019-11-25 Thread Marc Roos
Is there a point to sending such a signature (twice) to a public mailing 
list, having its emails stored on several mailing list websites?









===
PS don't you think this (^) is a nicer line than





/

-Original Message-
To: ceph-users
Subject: [ceph-users] rbd image size

Hello ,  I  use ceph as block storage in kubernetes. I want to get the 
rbd usage by command "rbd diff image_id | awk '{ SUM += $2 } END { print 
SUM/1024/1024 " MB" }’”, but I found it is a lot bigger than the value 
which I got by command “df -h” in the pod. I do not know the reason 
and need your help.

Thanks.




// 
声明:此邮件可能包含依图公司保密或特权信息,并且仅应发送至有权接收该邮件
的收件人。如果您无权收取该邮件,您应当立即删除该邮件并通知发件人,您并被
禁止传播、分发或复制此邮件以及附件。对于此邮件可能携带的病毒引起的任何损
害,本公司不承担任何责任。此外,本公司不保证已正确和完整地传输此信息,也
不接受任何延迟收件的赔偿责任。 




// Notice: 
This email may contain confidential or privileged information of Yitu 
and was sent solely to the intended recipients. If you are unauthorized 
to receive this email, you should delete the email and contact the 
sender immediately. Any unauthorized disclosing, distribution, or 
copying of this email and attachment thereto is prohibited. Yitu does 
not accept any liability for any loss caused by possibly viruses in this 
email. E-mail transmission cannot be guaranteed to be secure or 
error-free and Yitu is not responsible for any delayed transmission.




// 
声明:此邮件可能包含依图公司保密或特权信息,并且仅应发送至有权接收该邮件
的收件人。如果您无权收取该邮件,您应当立即删除该邮件并通知发件人,您并被
禁止传播、分发或复制此邮件以及附件。对于此邮件可能携带的病毒引起的任何损
害,本公司不承担任何责任。此外,本公司不保证已正确和完整地传输此信息,也
不接受任何延迟收件的赔偿责任。 




// Notice: 
This email may contain confidential or privileged information of Yitu 
and was sent solely to the intended recipients. If you are unauthorized 
to receive this email, you should delete the email and contact the 
sender immediately. Any unauthorized disclosing, distribution, or 
copying of this email and attachment thereto is prohibited. Yitu does 
not accept any liability for any loss caused by possibly viruses in this 
email. E-mail transmission cannot be guaranteed to be secure or 
error-free and Yitu is not responsible for any delayed transmission.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cloudstack and CEPH Day London

2019-10-24 Thread Marc Roos
 
I was thinking of going to the Polish one, but I will be tempted to go 
to the London one if you will also be wearing this kilt. ;D


-Original Message-
From: John Hearns [mailto:hear...@googlemail.com] 
Sent: donderdag 24 oktober 2019 8:14
To: ceph-users
Subject: [ceph-users] Cloudstack and CEPH Day London

I will be attending the Cloudstack and CEPH Day in London today.
Please say hello - rotund Scottish guy, not much hair. Glaswegian 
accent!


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] hanging slow requests: failed to authpin, subtree is being exported

2019-10-21 Thread Marc Roos
 
I think I am having this issue also (at least I had it with luminous). I had 
to remove the hidden temp files rsync had left when the cephfs mount 
'stalled'; otherwise I would never be able to complete the rsync.


-Original Message-
Cc: ceph-users
Subject: Re: [ceph-users] hanging slow requests: failed to authpin, 
subtree is being exported


I've made a ticket for this issue: https://tracker.ceph.com/issues/42338

Thanks again!

K

On 15/10/2019 18:00, Kenneth Waegeman wrote:
> Hi Robert, all,
>
>
> On 23/09/2019 17:37, Robert LeBlanc wrote:
>> On Mon, Sep 23, 2019 at 4:14 AM Kenneth Waegeman 
>>  wrote:
>>> Hi all,
>>>
>>> When syncing data with rsync, I'm often getting blocked slow 
>>> requests, which also block access to this path.
>>>
 2019-09-23 11:25:49.477 7f4f401e8700 0 log_channel(cluster) log 
 [WRN]
 : slow request 31.895478 seconds old, received at 2019-09-23
 11:25:17.598152: client_request(client.38352684:92684 lookup
 #0x100152383ce/vsc42531 2019-09-23 11:25:17.598077 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:26:19.477 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 61.896079 seconds old, received at 2019-09-23
 11:25:17.598152: client_request(client.38352684:92684 lookup
 #0x100152383ce/vsc42531 2019-09-23 11:25:17.598077 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:27:19.478 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 121.897268 seconds old, received at 2019-09-23
 11:25:17.598152: client_request(client.38352684:92684 lookup
 #0x100152383ce/vsc42531 2019-09-23 11:25:17.598077 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:29:19.488 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 241.899467 seconds old, received at 2019-09-23
 11:25:17.598152: client_request(client.38352684:92684 lookup
 #0x100152383ce/vsc42531 2019-09-23 11:25:17.598077 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:33:19.680 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 482.087927 seconds old, received at 2019-09-23
 11:25:17.598152: client_request(client.38352684:92684 lookup
 #0x100152383ce/vsc42531 2019-09-23 11:25:17.598077 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:36:09.881 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 32.677511 seconds old, received at 2019-09-23
 11:35:37.217113: client_request(client.38347357:111963 lookup 
 #0x20005b0130c/testing 2019-09-23 11:35:37.217015 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:36:39.881 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 62.678132 seconds old, received at 2019-09-23
 11:35:37.217113: client_request(client.38347357:111963 lookup 
 #0x20005b0130c/testing 2019-09-23 11:35:37.217015 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:37:39.891 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 122.679273 seconds old, received at 2019-09-23
 11:35:37.217113: client_request(client.38347357:111963 lookup 
 #0x20005b0130c/testing 2019-09-23 11:35:37.217015 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:39:39.892 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 242.684667 seconds old, received at 2019-09-23
 11:35:37.217113: client_request(client.38347357:111963 lookup 
 #0x20005b0130c/testing 2019-09-23 11:35:37.217015 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:41:19.893 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 962.305681 seconds old, received at 2019-09-23
 11:25:17.598152: client_request(client.38352684:92684 lookup
 #0x100152383ce/vsc42531 2019-09-23 11:25:17.598077 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:43:39.923 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 482.712888 seconds old, received at 2019-09-23
 11:35:37.217113: client_request(client.38347357:111963 lookup 
 #0x20005b0130c/testing 2019-09-23 11:35:37.217015 caller_uid=0,
 caller_gid=0{0,}) currently failed to authpin, subtree is being 
 exported
 2019-09-23 11:51:40.236 7f4f401e8700  0 log_channel(cluster) log 
 [WRN]
 : slow request 963.037049 seconds old, received at 2019-09-23
 11:35:37.217113: 

Re: [ceph-users] collectd Ceph metric

2019-10-21 Thread Marc Roos
 
The 'xx-*.conf' files are mine, custom, so I would not have to merge changes 
with newer /etc/collectd.conf rpm updates. 

I would suggest getting a small configuration that is working, setting debug 
logging [0], and then growing the configuration in small steps until it 
fails: load the ceph plugin empty, then configure one osd. Does it fail? 
Then try mgr, try mon, etc.


[0] Try something like this?
LoadPlugin logfile
<Plugin logfile>
  #not compiled with debug?
  #also not writing to the logfile
  LogLevel debug
  File "/tmp/collectd.log"
  #File STDOUT
  Timestamp true
  #PrintSeverity false
</Plugin>
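
While building it up, it also helps to confirm that the admin socket a Daemon 
block points at actually exists and answers (sketch, using the path from the 
configs in this thread):

ls -l /var/run/ceph/ceph-osd.1.asok
ceph --admin-daemon /var/run/ceph/ceph-osd.1.asok perf dump | head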



-Original Message-
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] collectd Ceph metric

Is there any instruction to install the plugin configuration?

Attach my RHEL/collectd configuration file under /etc/ directory.
On RHEL:
[rdma@rdmarhel0 collectd.d]$ pwd
/etc/collectd.d
[rdma@rdmarhel0 collectd.d]$ tree .
.

0 directories, 0 files
[rdma@rdmarhel0 collectd.d]$

I've also checked the collectd configuration file under Ubuntu; there's no 
11-network.conf or 12-memory.conf etc. However, it could still collect the 
CPU and memory information.
On Ubuntu:
   nstcc1@nstcloudcc1:collectd$ pwd
   /etc/collectd
   nstcc1@nstcloudcc1:collectd$ tree .
   .
   ├── collectd.conf
   ├── collectd.conf.d
   │   ├── filters.conf
   │   └── thresholds.conf
   └── collection.conf

   1 directory, 4 files
   nstcc1@nstcloudcc1:collectd$

B.R.
Changcheng

On 10:56 Mon 21 Oct, Marc Roos wrote:
> 
> Your collectd starts without the ceph plugin ok? 
> 
> I have also your error " didn't register a configuration callback", 
> because I configured debug logging, but did not enable it by loading 
> the plugin 'logfile'. Maybe it is the order in which your 
> configuration files are read (I think this used to be important with 
> collectd)
> 
> I have only in my collectd.conf these two lines:
> Include "/etc/collectd.d"
> LoadPlugin syslog
> 
> And in /etc/collectd.d/ these files:
> 10-custom.conf(with network section for influx)
> 11-network.conf   (ethx)
> 12-memory.conf
> 50-ceph.conf
> 51-ipmi.conf
> 52-ipmi-power.conf
> 53-disk.conf
> 
> 
> 
> 
> [0] journalctl -u collectd.service
> Oct 21 10:29:39 c02 collectd[1017750]: Exiting normally.
> Oct 21 10:29:39 c02 systemd[1]: Stopping Collectd statistics daemon...
> Oct 21 10:29:39 c02 collectd[1017750]: collectd: Stopping 5 read 
> threads.
> Oct 21 10:29:39 c02 collectd[1017750]: collectd: Stopping 5 write 
> threads.
> Oct 21 10:29:40 c02 systemd[1]: Stopped Collectd statistics daemon.
> Oct 21 10:29:53 c02 systemd[1]: Starting Collectd statistics daemon...
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "syslog" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "network" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: Found a configuration for the 
> `logfile' plugin, but the plugin isn't loaded or didn't register a 
> configuration callback.
> Oct 21 10:29:53 c02 collectd[1031939]: Found a configuration for the 
> `logfile' plugin, but the plugin isn't loaded or didn't register a 
> configuration callback.
> Oct 21 10:29:53 c02 collectd[1031939]: Found a configuration for the 
> `logfile' plugin, but the plugin isn't loaded or didn't register a 
> configuration callback.
> Oct 21 10:29:53 c02 collectd[1031939]: network plugin: The 
> `MaxPacketSize' must be between 1024 and 65535.
> Oct 21 10:29:53 c02 collectd[1031939]: network plugin: Option 
> `CacheFlush' is not allowed here.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "interface" 

> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "memory" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "ceph" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "ipmi" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: ipmi plugin: Legacy 
> configuration found! Please update your config file.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "exec" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: plugin_load: plugin "disk" 
> successfully loaded.
> Oct 21 10:29:53 c02 collectd[1031939]: Systemd detected, trying to 
> signal readyness.
> Oct 21 10:29:53 c02 systemd[1]: Started Collectd statistics daemon.
> Oct 21 10:29:53 c02 collectd[1031939]: Initialization complete, 
> entering read-loop.
> Oct 21 10:30:04 c02 collectd[1031939]: ipmi plugin: sensor_list_add: 
> Ignore sensor `PS2 Status power_supply (10.2)` of `main`, because it 
> is discrete (0x8)! Its t

Re: [ceph-users] collectd Ceph metric

2019-10-21 Thread Marc Roos
: 
sensor `P2-DIMMH3 TEMP memory_device (32.94)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P2-DIMMH2 TEMP memory_device (32.93)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P2-DIMMG3 TEMP memory_device (32.90)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P2-DIMMG2 TEMP memory_device (32.89)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P2-DIMMF3 TEMP memory_device (32.86)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P2-DIMME3 TEMP memory_device (32.82)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P1-DIMMD3 TEMP memory_device (32.78)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P1-DIMMD2 TEMP memory_device (32.77)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P1-DIMMC3 TEMP memory_device (32.74)` of `main` not present.
Oct 21 10:30:55 c02 collectd[1031939]: ipmi plugin: sensor_read_handler: 
sensor `P1-DIMMC2 TEMP memory_device (32.73)` of `main` not present.





-Original Message-
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] collectd Ceph metric

On 10:16 Mon 21 Oct, Marc Roos wrote:
> I have the same. I do not think ConvertSpecialMetricTypes is 
necessary. 
> 
> 
>   Globals true
> 
> 
> 
>   LongRunAvgLatency false
>   ConvertSpecialMetricTypes true
>   
> SocketPath "/var/run/ceph/ceph-osd.1.asok"
>   
> 
Same configuration, but there's below error after "systemctl restart 
collectd"
Have you ever hit this error before?

=Log start===
Oct 21 16:22:52 rdmarhel0 collectd[768000]: Found a configuration for 
the `ceph' plugin, but the plugin isn't loaded or didn't register a 
configuration callback.
Oct 21 16:22:52 rdmarhel0 systemd[1]: Unit collectd.service entered 
failed state.
Oct 21 16:22:52 rdmarhel0 collectd[768000]: Found a configuration for 
the `ceph' plugin, but the plugin isn't loaded or didn't register a 
configuration callback.
Oct 21 16:22:52 rdmarhel0 systemd[1]: collectd.service failed.
Oct 21 16:22:52 rdmarhel0 collectd[768000]: There is a `Daemon' block 
within the configuration for the ceph plugin. The plugin either only 
expects "simple" configuration statements or wasn Oct 21 16:22:52 
rdmarhel0 systemd[1]: collectd.service holdoff time over, scheduling 
restart.
Oct 21 16:22:52 rdmarhel0 systemd[1]: Stopped Collectd statistics 
daemon.
-- Subject: Unit collectd.service has finished shutting down 
=Log end===

B.R.
Changcheng
> 
> 
> -Original Message-
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] collectd Ceph metric
> 
> On 09:50 Mon 21 Oct, Marc Roos wrote:
> > 
> > I am, collectd with luminous, and upgraded to nautilus and collectd
> > 5.8.1-1.el7 this weekend. Maybe increase logging or so. 
> > I had to wait a long time before collectd was supporting the 
> > luminous release, maybe it is the same with octopus (=15?)
> > 
> @Roos: Do you mean that you could run collectd(5.8.1) with 
> Ceph-Nautilus? Below is my collectd configuration with Ceph-Octopus:
> 
> 
> 
>   
> SocketPath "/var/run/ceph/ceph-osd.0.asok"
>   
> 
> 
> Is there anything wrong?
> 
> > 
> >  
> > 
> > -Original Message-
> > From: Liu, Changcheng [mailto:changcheng@intel.com]
> > Sent: maandag 21 oktober 2019 9:41
> > To: ceph-users@lists.ceph.com
> > Subject: [ceph-users] collectd Ceph metric
> > 
> > Hi all,
> >Does anyone succeed to use collectd/ceph plugin to collect ceph
> >cluster data?
> >I'm using collectd(5.8.1) and Ceph-15.0.0. collectd failed to get
> >cluster data with below error:
> >"collectd.service holdoff time over, scheduling restart"
> > 
> > Regards,
> > Changcheng
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
> > 
> 
> 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] collectd Ceph metric

2019-10-21 Thread Marc Roos
I have the same. I do not think ConvertSpecialMetricTypes is necessary. 


<LoadPlugin ceph>
  Globals true
</LoadPlugin>

<Plugin ceph>
  LongRunAvgLatency false
  ConvertSpecialMetricTypes true
  <Daemon "osd.1">
    SocketPath "/var/run/ceph/ceph-osd.1.asok"
  </Daemon>
</Plugin>


-Original Message-
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] collectd Ceph metric

On 09:50 Mon 21 Oct, Marc Roos wrote:
> 
> I am, collectd with luminous, and upgraded to nautilus and collectd
> 5.8.1-1.el7 this weekend. Maybe increase logging or so. 
> I had to wait a long time before collectd was supporting the luminous 
> release, maybe it is the same with octopus (=15?)
> 
@Roos: Do you mean that you could run collectd(5.8.1) with 
Ceph-Nautilus? Below is my collectd configuration with Ceph-Octopus:

[the plugin block tags were stripped by the list archive; the surviving line is]
SocketPath "/var/run/ceph/ceph-osd.0.asok"


Is there anything wrong?

> 
>  
> 
> -Original Message-
> From: Liu, Changcheng [mailto:changcheng@intel.com]
> Sent: maandag 21 oktober 2019 9:41
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] collectd Ceph metric
> 
> Hi all,
>Does anyone succeed to use collectd/ceph plugin to collect ceph
>cluster data?
>I'm using collectd(5.8.1) and Ceph-15.0.0. collectd failed to get
>cluster data with below error:
>"collectd.service holdoff time over, scheduling restart"
> 
> Regards,
> Changcheng
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] collectd Ceph metric

2019-10-21 Thread Marc Roos


I am, collectd with luminous, and upgraded to nautilus and collectd 
5.8.1-1.el7 this weekend. Maybe increase logging or so. 
I had to wait a long time before collectd was supporting the luminous 
release, maybe it is the same with octopus (=15?)


 

-Original Message-
From: Liu, Changcheng [mailto:changcheng@intel.com] 
Sent: maandag 21 oktober 2019 9:41
To: ceph-users@lists.ceph.com
Subject: [ceph-users] collectd Ceph metric

Hi all,
   Does anyone succeed to use collectd/ceph plugin to collect ceph
   cluster data?
   I'm using collectd(5.8.1) and Ceph-15.0.0. collectd failed to get
   cluster data with below error:
   "collectd.service holdoff time over, scheduling restart"

Regards,
Changcheng
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-09 Thread Marc Roos
 
Brad, many thanks!!! My cluster finally has HEALTH_OK after 1.5 years or so! 
:)


-Original Message-
Subject: Re: Ceph pg repair clone_missing?

On Fri, Oct 4, 2019 at 6:09 PM Marc Roos  
wrote:
>
>  >
>  >Try something like the following on each OSD that holds a copy of
>  >rbd_data.1f114174b0dc51.0974 and see what output you 
get.
>  >Note that you can drop the bluestore flag if they are not bluestore  

> >osds and you will need the osd stopped at the time (set noout). Also  

> >note, snapids are displayed in hexadecimal in the output (but then 
'4'
>  >is '4' so not a big issues here).
>  >
>  >$ ceph-objectstore-tool --type bluestore --data-path  
> >/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
>  >rbd_data.1f114174b0dc51.0974
>
> I got these results
>
> osd.7
> Error getting attr on : 17.36_head,#-19:6c00:::scrub_17.36:head#,
> (61) No data available
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> na 
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> na 
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

Ah, so of course the problem is the snapshot is missing. You may need to 
try something like the following on each of those osds.

$ ceph-objectstore-tool --type bluestore --data-path 
/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 
'{"oid":"rbd_data.1f114174b0dc51.0974","key":"","snapid":-2,
"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}'
remove-clone-metadata 4

>
> osd.12
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> na 
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> na 
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
> osd.29
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> na 
> pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
> ["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","s
> na 
> pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
>
>
>  >
>  >The likely issue here is the primary believes snapshot 4 is gone but 
 
> >there is still data and/or metadata on one of the replicas which is  
> >confusing the issue. If that is the case you can use the the  
> >ceph-objectstore-tool to delete the relevant snapshot(s)  >



--
Cheers,
Brad



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-04 Thread Marc Roos
 >
 >Try something like the following on each OSD that holds a copy of
 >rbd_data.1f114174b0dc51.0974 and see what output you get.
 >Note that you can drop the bluestore flag if they are not bluestore
 >osds and you will need the osd stopped at the time (set noout). Also
 >note, snapids are displayed in hexadecimal in the output (but then '4'
 >is '4' so not a big issues here).
 >
 >$ ceph-objectstore-tool --type bluestore --data-path
 >/var/lib/ceph/osd/ceph-XX/ --pgid 17.36 --op list
 >rbd_data.1f114174b0dc51.0974

I got these results

osd.7
Error getting attr on : 17.36_head,#-19:6c00:::scrub_17.36:head#, 
(61) No data available
["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

osd.12
["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]

osd.29
["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
pid":63,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]
["17.36",{"oid":"rbd_data.1f114174b0dc51.0974","key":"","sna
pid":-2,"hash":1357874486,"max":0,"pool":17,"namespace":"","max":0}]


 >
 >The likely issue here is the primary believes snapshot 4 is gone but
 >there is still data and/or metadata on one of the replicas which is
 >confusing the issue. If that is the case you can use the the
 >ceph-objectstore-tool to delete the relevant snapshot(s)
 >
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NFS

2019-10-03 Thread Marc Roos


Thanks Matt! Really useful configs. I am still on luminous, so I can forget 
about this for now :( I will try when I am on nautilus; I have already 
updated my configuration. However, it is interesting that nowhere in the 
configuration is the tenant specified, so I guess it is being extracted from 
the access key / is irrelevant.



-Original Message-
Subject: Re: [ceph-users] NFS

Hi Mark,

Here's an example that should work--userx and usery are RGW users 
created in different tenants, like so:

radosgw-admin --tenant tnt1 --uid userx --display-name "tnt1-userx" \
 --access_key "userxacc" --secret "test123" user create

 radosgw-admin --tenant tnt2 --uid usery --display-name "tnt2-usery" \
 --access_key "useryacc" --secret "test456" user create

Remember that to make use of this feature, you need recent librgw and 
matching nfs-ganesha.  In particular, Ceph should have, among other
changes:

commit 65d0ae733defe277f31825364ee52d5102c06ab9
Author: Matt Benjamin 
Date:   Wed Jun 5 07:25:35 2019 -0400

rgw_file: include tenant in hashes of object

Because bucket names are taken as object names in the top
of an export.  Make hashing by tenant general to avoid disjoint
hashing of bucket.

Fixes: http://tracker.ceph.com/issues/40118

Signed-off-by: Matt Benjamin 
(cherry picked from commit 8e0fd5fbfa7c770f6b668e79b772179946027bce)

commit 459b6b2b224953655fd0360e8098ae598e41d3b2
Author: Matt Benjamin 
Date:   Wed May 15 15:53:32 2019 -0400

rgw_file: include tenant when hashing bucket names

Prevent identical paths from distinct tenants from colliding in
RGW NFS handle cache.

Fixes: http://tracker.ceph.com/issues/40118

Signed-off-by: Matt Benjamin 
(cherry picked from commit b800a9de83dff23a150ed7d236cb61c8b7d971ae)
Signed-off-by: Matt Benjamin 


ganesha.conf.deuxtenant:


EXPORT
{
# Export Id (mandatory, each EXPORT must have a unique Export_Id)
Export_Id = 77;

# Exported path (mandatory)
Path = "/";

# Pseudo Path (required for NFS v4)
Pseudo = "/userx";

# Required for access (default is None)
# Could use CLIENT blocks instead
Access_Type = RW;

SecType = "sys";

Protocols = 3,4;
Transports = UDP,TCP;

#Delegations = Readwrite;

Squash = No_Root_Squash;

# Exporting FSAL
FSAL {
Name = RGW;
User_Id = "userx";
Access_Key_Id = "userxacc";
Secret_Access_Key = "test123";
}
}

EXPORT
{
# Export Id (mandatory, each EXPORT must have a unique Export_Id)
Export_Id = 78;

# Exported path (mandatory)
Path = "/";

# Pseudo Path (required for NFS v4)
Pseudo = "/usery";

# Required for access (default is None)
# Could use CLIENT blocks instead
Access_Type = RW;

SecType = "sys";

Protocols = 3,4;
Transports = UDP,TCP;

#Delegations = Readwrite;

Squash = No_Root_Squash;

# Exporting FSAL
FSAL {
Name = RGW;
User_Id = "usery";
Access_Key_Id = "useryacc";
Secret_Access_Key = "test456";
}
}

#mount at bucket case
EXPORT
{
# Export Id (mandatory, each EXPORT must have a unique Export_Id)
Export_Id = 79;

# Exported path (mandatory)
Path = "/buck5";

# Pseudo Path (required for NFS v4)
Pseudo = "/usery_buck5";

# Required for access (default is None)
# Could use CLIENT blocks instead
Access_Type = RW;

SecType = "sys";

Protocols = 3,4;
Transports = UDP,TCP;

#Delegations = Readwrite;

Squash = No_Root_Squash;

# Exporting FSAL
FSAL {
Name = RGW;
User_Id = "usery";
Access_Key_Id = "useryacc";
Secret_Access_Key = "test456";
}
}



RGW {
ceph_conf = "/home/mbenjamin/ceph-noob/build/ceph.conf";
#init_args = "-d --debug-rgw=16";
init_args = "";
}

NFS_Core_Param {
Nb_Worker = 17;
mount_path_pseudo = true;
}

CacheInode {
Chunks_HWMark = 7;
Entries_Hwmark = 200;
}

NFSV4 {
Graceless = true;
Allow_Numeric_Owners = true;
Only_Numeric_Owners = true;
}

LOG {
Components {
#NFS_READDIR = FULL_DEBUG;
#NFS4 = FULL_DEBUG;
#CACHE_INODE = FULL_DEBUG;
#FSAL = FULL_DEBUG;
}
Facility {
name = FILE;
destination = "/tmp/ganesha-rgw.log";
enable = active;
}
}

On Thu, Oct 3, 2019 at 10:34 AM Marc Roos  
wrote:
>
>
> How should a multi tenant RGW config look like, I am not able get this
> working:
>
> EXPORT {
>Export_ID=301;
>Path = "test:test3";
>#Path = "/";
>Pseudo = "/rgwtester";
>
>   

Re: [ceph-users] NFS

2019-10-03 Thread Marc Roos


How should a multi tenant RGW config look like, I am not able get this 
working:

EXPORT {
   Export_ID=301;
   Path = "test:test3";
   #Path = "/";
   Pseudo = "/rgwtester";

   Protocols = 4;
   FSAL {
   Name = RGW;
   User_Id = "test$tester1";
   Access_Key_Id = "TESTER";
   Secret_Access_Key = "xxx";
   }
   Disable_ACL = TRUE;
   CLIENT { Clients = 192.168.10.0/24; access_type = "RO"; }
}


03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
create_export :FSAL :CRIT :RGW module: librgw init failed (-5)
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
mdcache_fsal_create_export :FSAL :MAJ :Failed to call create_export on 
underlying FSAL RGW
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
fsal_put :FSAL :INFO :FSAL RGW now unused
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
fsal_cfg_commit :CONFIG :CRIT :Could not create export for (/rgwtester) 
to (test:test3)
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
fsal_cfg_commit :FSAL :F_DBG :FSAL RGW refcount 0
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
config_errs_to_log :CONFIG :CRIT :Config File 
(/etc/ganesha/ganesha.conf:216): 1 validation errors in block FSAL
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
config_errs_to_log :CONFIG :CRIT :Config File 
(/etc/ganesha/ganesha.conf:216): Errors processing block (FSAL)
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
config_errs_to_log :CONFIG :CRIT :Config File 
(/etc/ganesha/ganesha.conf:209): 1 validation errors in block EXPORT
03/10/2019 16:15:37 : epoch 5d8d274c : c01 : ganesha.nfsd-4722[sigmgr] 
config_errs_to_log :CONFIG :CRIT :Config File 
(/etc/ganesha/ganesha.conf:209): Errors processing block (EXPORT)

-Original Message-
Subject: Re: [ceph-users] NFS

RGW NFS can support any NFS style of authentication, but users will have 
the RGW access of their nfs-ganesha export.  You can create exports with 
disjoint privileges and, since recent Luminous (L) and Nautilus (N) releases, 
exports scoped to RGW tenants.
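
As a rough sketch (not a tested recipe), a tenanted RGW user whose credentials 
go into the export's FSAL block can be created like this; the tenant "test", 
user "tester1" and keys below mirror the export attempt quoted in this thread 
and are placeholders:

radosgw-admin user create --tenant test --uid tester1 \
    --display-name "Tester One" --access-key TESTER --secret xxx

In the ganesha EXPORT, the FSAL User_Id then takes the "tenant$user" form 
(here "test$tester1"), and a tenanted bucket path is written as "tenant:bucket".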

Matt

On Tue, Oct 1, 2019 at 8:31 AM Marc Roos  
wrote:
>
>  I think you can run into problems
> with a multi user environment of RGW and nfs-ganesha.
>

-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309




Re: [ceph-users] Ceph pg repair clone_missing?

2019-10-03 Thread Marc Roos
 >
 >>
 >> I was following the thread where you adviced on this pg repair
 >>
 >> I ran these rados 'list-inconsistent-obj'/'rados 
 >> list-inconsistent-snapset' and have output on the snapset. I tried 
to 
 >> extrapolate your comment on the data/omap_digest_mismatch_info onto 
my 
 >> situation. But I don't know how to proceed. I got on this mailing 
list 
 >> the advice to delete snapshot 4, but if I see this output, that 
might 
 >> not have been the smartest thing to do.
 >
 >That remains to be seen. Can you post the actual scrub error you are 
getting?

2019-10-03 09:27:07.831046 7fc448bf6700 -1 log_channel(cluster) log 
[ERR] : deep-scrub 17.36 
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:head : expected 
clone 17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4 1 missing

 >>
 >>
 >>
 >>
 >> [0]
 >> http://tracker.ceph.com/issues/24994
 >
 >At first glance this appears to be a different issue to yours.
 >
 >>
 >> [1]
 >> {
 >>   "epoch": 66082,
 >>   "inconsistents": [
 >> {
 >>   "name": "rbd_data.1f114174b0dc51.0974",
 >
 >rbd_data.1f114174b0dc51 is the block_name_prefix for this image. You 
 >can run 'rbd info' on the images in this pool to see which image is
 >actually affected and how important the data is.

Yes, I know which image it is. Deleting data is easy; I'd like to know/learn 
how to fix this.

 >
 >>   "nspace": "",
 >>   "locator": "",
 >>   "snap": "head",
 >>   "snapset": {
 >> "snap_context": {
 >>   "seq": 63,
 >>   "snaps": [
 >> 63,
 >> 35,
 >> 13,
 >> 4
 >>   ]
 >> },
 >> "head_exists": 1,
 >> "clones": [
 >>   {
 >> "snap": 4,
 >> "size": 4194304,
 >> "overlap": "[]",
 >> "snaps": [
 >>   4
 >> ]
 >>   },
 >>   {
 >> "snap": 63,
 >> "size": 4194304,
 >> "overlap": "[0~4194304]",
 >> "snaps": [
 >>   63,
 >>   35,
 >>   13
 >> ]
 >>   }
 >> ]
 >>   },
 >>   "errors": [
 >> "clone_missing"
 >>   ],
 >>   "missing": [
 >> 4
 >>   ]
 >> }
 >>   ]
 >> }
 >
 >


[ceph-users] Ceph pg repair clone_missing?

2019-10-02 Thread Marc Roos


 
Hi Brad, 

I was following the thread where you advised on this pg repair.

I ran 'rados list-inconsistent-obj' / 'rados list-inconsistent-snapset' and 
got output for the snapset. I tried to extrapolate your comment on the 
data/omap_digest_mismatch_info onto my situation, but I don't know how to 
proceed. I got the advice on this mailing list to delete snapshot 4, but 
seeing this output, that might not have been the smartest thing to do.
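
For reference, this is roughly how such checks are invoked against the 
affected PG (17.36 here, taken from the deep-scrub error quoted elsewhere in 
this thread):

rados list-inconsistent-obj 17.36 --format=json-pretty
rados list-inconsistent-snapset 17.36 --format=json-pretty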




[0]
http://tracker.ceph.com/issues/24994

[1]
{
  "epoch": 66082,
  "inconsistents": [
{
  "name": "rbd_data.1f114174b0dc51.0974",
  "nspace": "",
  "locator": "",
  "snap": "head",
  "snapset": {
"snap_context": {
  "seq": 63,
  "snaps": [
63,
35,
13,
4
  ]
},
"head_exists": 1,
"clones": [
  {
"snap": 4,
"size": 4194304,
"overlap": "[]",
"snaps": [
  4
]
  },
  {
"snap": 63,
"size": 4194304,
"overlap": "[0~4194304]",
"snaps": [
  63,
  35,
  13
]
  }
]
  },
  "errors": [
"clone_missing"
  ],
  "missing": [
4
  ]
}
  ]
}


Re: [ceph-users] NFS

2019-10-01 Thread Marc Roos
Yes, indeed: CephFS and RGW backends. I think you can run into problems 
with a multi-user environment of RGW and nfs-ganesha. I am not getting 
this working on Luminous. Your RGW config seems OK.

Add file logging to debug RGW etc.; something like this:


LOG {
## Default log level for all components
# NULL, FATAL, MAJ, CRIT, WARN, EVENT, INFO, DEBUG, MID_DEBUG, 
M_DBG, FULL_DEBUG, F_DBG], default EVENT
default_log_level = INFO;
#default_log_level = DEBUG;

## Configure per-component log levels.
# ALL, 
LOG,LOG_EMERG,MEMLEAKS,FSAL,NFSPROTO(NFS3),NFS_V4(NFSV4),EXPORT,FILEHAND
LE,DISPATCH,CACHE_INODE,
# CACHE_INODE_LRU,HASHTABLE,HASHTABLE_CACHE,DUPREQ,INIT,MAIN, 
IDMAPPER,NFS_READDIR,NFS_V4_LOCK,CONFIG,CLIENTID,
# 
SESSIONS,PNFS,RW_LOCK,NLM,RPC,NFS_CB,THREAD,NFS_V4_ACL,STATE,9P,9P_DISPA
TCH,FSAL_UP,DBUS

Components {
ALL = WARN;
#ALL = DEBUG;
#FSAL = F_DBG;
#NFS4 = F_DBG;
#EXPORT = F_DBG;
#CONFIG = F_DBG;
}

## Where to log
#   Facility {
#   name = FILE;
#   destination = "/var/log/ganesha.log";
#   enable = default;
#   }
}

-Original Message-
Subject: Re: [ceph-users] NFS

Ganesha can export CephFS or RGW.  It cannot export anything else (like 
iscsi or RBD).  Config for RGW looks like this:

EXPORT
{
 Export_ID=1;
 Path = "/";
 Pseudo = "/rgw";
 Access_Type = RW;
 Protocols = 4;
 Transports = TCP;
 FSAL {
 Name = RGW;
 User_Id = "testuser";
 Access_Key_Id ="";
 Secret_Access_Key = "";
 }
}

RGW {
 ceph_conf = "//ceph.conf";
 # for vstart cluster, name = "client.admin"
 name = "client.rgw.foohost";
 cluster = "ceph";
#   init_args = "-d --debug-rgw=16";
}


Daniel

On 9/30/19 3:01 PM, Marc Roos wrote:
>   
> Just install these
> 
> http://download.ceph.com/nfs-ganesha/
> nfs-ganesha-rgw-2.7.1-0.1.el7.x86_64
> nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64
> libnfsidmap-0.25-19.el7.x86_64
> nfs-ganesha-mem-2.7.1-0.1.el7.x86_64
> nfs-ganesha-xfs-2.7.1-0.1.el7.x86_64
> nfs-ganesha-2.7.1-0.1.el7.x86_64
> nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
> 
> 
> And export your cephfs like this:
> EXPORT {
>  Export_Id = 10;
>  Path = /nfs/cblr-repos;
>  Pseudo = /cblr-repos;
>  FSAL { Name = CEPH; User_Id = "cephfs.nfs.cblr"; 
> Secret_Access_Key = "xxx"; }
>  Disable_ACL = FALSE;
>  CLIENT { Clients = 192.168.10.2; access_type = "RW"; }
>  CLIENT { Clients = 192.168.10.253; } }
> 
> 
> -Original Message-
> From: Brent Kennedy [mailto:bkenn...@cfl.rr.com]
> Sent: maandag 30 september 2019 20:56
> To: 'ceph-users'
> Subject: [ceph-users] NFS
> 
> Wondering if there are any documents for standing up NFS with an 
> existing ceph cluster.  We don’t use ceph-ansible or any other tools 
> besides ceph-deploy.  The iscsi directions were pretty good once I got 

> past the dependencies.
> 
>   
> 
> I saw the one based on Rook, but it doesn’t seem to apply to our 
setup 
> of ceph vms with physical hosts doing OSDs.  The official ceph 
> documents talk about using ganesha but doesn’t seem to dive into the 
> details of what the process is for getting it online.  We don’t use 
> cephfs, so that’s not setup either.  The basic docs seem to note this 
is required.
>   Seems my google-fu is failing me when I try to find a more 
> definitive guide.
> 
>   
> 
> The servers are all centos 7 with the latest updates.
> 
>   
> 
> Any guidance would be greatly appreciated!
> 
>   
> 
> Regards,
> 
> -Brent
> 
>   
> 
> Existing Clusters:
> 
> Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/man, 1 gateway, 2 
> iscsi gateways ( all virtual on nvme )
> 
> US Production(HDD): Nautilus 14.2.2 with 13 osd servers, 3 mons, 4 
> gateways, 2 iscsi gateways
> 
> UK Production(HDD): Nautilus 14.2.2 with 25 osd servers, 3 mons/man, 3 

> gateways behind
> 
> US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons/man, 3 
> gateways, 2 iscsi gateways
> 
>   
> 
> 
> 





Re: [ceph-users] NFS

2019-09-30 Thread Marc Roos
 
Just install these

http://download.ceph.com/nfs-ganesha/
nfs-ganesha-rgw-2.7.1-0.1.el7.x86_64
nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64
libnfsidmap-0.25-19.el7.x86_64
nfs-ganesha-mem-2.7.1-0.1.el7.x86_64
nfs-ganesha-xfs-2.7.1-0.1.el7.x86_64
nfs-ganesha-2.7.1-0.1.el7.x86_64
nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64


And export your cephfs like this:
EXPORT {
Export_Id = 10;
Path = /nfs/cblr-repos;
Pseudo = /cblr-repos;
FSAL { Name = CEPH; User_Id = "cephfs.nfs.cblr"; 
Secret_Access_Key = "xxx"; }
Disable_ACL = FALSE;
CLIENT { Clients = 192.168.10.2; access_type = "RW"; }
CLIENT { Clients = 192.168.10.253; }
}


-Original Message-
From: Brent Kennedy [mailto:bkenn...@cfl.rr.com] 
Sent: maandag 30 september 2019 20:56
To: 'ceph-users'
Subject: [ceph-users] NFS

Wondering if there are any documents for standing up NFS with an 
existing ceph cluster.  We don’t use ceph-ansible or any other tools 
besides ceph-deploy.  The iscsi directions were pretty good once I got 
past the dependencies.  

 

I saw the one based on Rook, but it doesn’t seem to apply to our setup 
of ceph VMs with physical hosts doing OSDs.  The official ceph documents 
talk about using ganesha but don’t seem to dive into the details of 
what the process is for getting it online.  We don’t use cephfs, so 
that’s not set up either.  The basic docs seem to note this is required. 
Seems my google-fu is failing me when I try to find a more definitive 
guide.

 

The servers are all centos 7 with the latest updates.

 

Any guidance would be greatly appreciated!

 

Regards,

-Brent

 

Existing Clusters:

Test: Nautilus 14.2.2 with 3 osd servers, 1 mon/man, 1 gateway, 2 iscsi 
gateways ( all virtual on nvme )

US Production(HDD): Nautilus 14.2.2 with 13 osd servers, 3 mons, 4 
gateways, 2 iscsi gateways

UK Production(HDD): Nautilus 14.2.2 with 25 osd servers, 3 mons/man, 3 
gateways behind

US Production(SSD): Nautilus 14.2.2 with 6 osd servers, 3 mons/man, 3 
gateways, 2 iscsi gateways

 




Re: [ceph-users] Commit and Apply latency on nautilus

2019-09-30 Thread Marc Roos


What parameters exactly are you using? I want to do a similar test on 
Luminous before I upgrade to Nautilus. I have quite a lot of them (74+):

type_instance=Osd.opBeforeDequeueOpLat
type_instance=Osd.opBeforeQueueOpLat
type_instance=Osd.opLatency
type_instance=Osd.opPrepareLatency
type_instance=Osd.opProcessLatency
type_instance=Osd.opRLatency
type_instance=Osd.opRPrepareLatency
type_instance=Osd.opRProcessLatency
type_instance=Osd.opRwLatency
type_instance=Osd.opRwPrepareLatency
type_instance=Osd.opRwProcessLatency
type_instance=Osd.opWLatency
type_instance=Osd.opWPrepareLatency
type_instance=Osd.opWProcessLatency
type_instance=Osd.subopLatency
type_instance=Osd.subopWLatency
...
...
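
For reference, these counters can also be read directly from a running OSD's 
admin socket; a minimal sketch, assuming osd.0 and the default socket path:

# dump all perf counters of osd.0 (includes op_*_latency, subop_*_latency, ...)
ceph daemon osd.0 perf dump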





-Original Message-
From: Alex Litvak [mailto:alexander.v.lit...@gmail.com] 
Sent: zondag 29 september 2019 13:06
To: ceph-users@lists.ceph.com
Cc: ceph-de...@vger.kernel.org
Subject: [ceph-users] Commit and Apply latency on nautilus

Hello everyone,

I am running a number of parallel benchmark tests against the cluster 
that should be ready to go to production.
I enabled Prometheus to monitor various metrics, and while the cluster 
stays healthy through the tests with no errors or slow requests,
I noticed apply/commit latency jumping between 40 and 600 ms on 
multiple SSDs.  At the same time, op_read and op_write are on average 
below 0.25 ms in the worst-case scenario.

I am running nautilus 14.2.2, all bluestore, no separate NVME devices 
for WAL/DB, 6 SSDs per node(Dell PowerEdge R440) with all drives Seagate 
Nytro 1551, osd spread across 6 nodes, running in 
containers.  Each node has plenty of RAM with utilization ~ 25 GB during 
the benchmark runs.

Here are benchmarks being run from 6 client systems in parallel, 
repeating the test for each block size in <4k,16k,128k,4M>.

On rbd mapped partition local to each client:

fio --name=randrw --ioengine=libaio --iodepth=4 --rw=randrw 
--bs=<4k,16k,128k,4M> --direct=1 --size=2G --numjobs=8 --runtime=300 
--group_reporting --time_based --rwmixread=70

On mounted cephfs volume with each client storing test file(s) in own 
sub-directory:

fio --name=randrw --ioengine=libaio --iodepth=4 --rw=randrw 
--bs=<4k,16k,128k,4M> --direct=1 --size=2G --numjobs=8 --runtime=300 
--group_reporting --time_based --rwmixread=70

dbench -t 30 30

Could you please let me know whether such a huge jump in apply and commit 
latency is to be expected in my case, and whether I can do anything to 
improve or fix it.  Below is some additional cluster info.
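
For a quick cluster-wide sample of exactly these two values during a benchmark 
run, something like the following sketch can be used (OSD ids and numbers will 
of course differ):

# per-OSD commit/apply latency snapshot as reported via the monitors
ceph osd perf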

Thank you,

root@storage2n2-la:~# podman exec -it ceph-mon-storage2n2-la ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP    META    AVAIL 
  %USE VAR  PGS STATUS
  6   ssd 1.74609  1.0 1.7 TiB  93 GiB  92 GiB 240 MiB  784 MiB 1.7 
TiB 5.21 0.90  44 up
12   ssd 1.74609  1.0 1.7 TiB  98 GiB  97 GiB 118 MiB  906 MiB 1.7 
TiB 5.47 0.95  40 up
18   ssd 1.74609  1.0 1.7 TiB 102 GiB 101 GiB 123 MiB  901 MiB 1.6 
TiB 5.73 0.99  47 up
24   ssd 3.49219  1.0 3.5 TiB 222 GiB 221 GiB 134 MiB  890 MiB 3.3 
TiB 6.20 1.07  96 up
30   ssd 3.49219  1.0 3.5 TiB 213 GiB 212 GiB 151 MiB  873 MiB 3.3 
TiB 5.95 1.03  93 up
35   ssd 3.49219  1.0 3.5 TiB 203 GiB 202 GiB 301 MiB  723 MiB 3.3 
TiB 5.67 0.98 100 up
  5   ssd 1.74609  1.0 1.7 TiB 103 GiB 102 GiB 123 MiB  901 MiB 1.6 
TiB 5.78 1.00  49 up
11   ssd 1.74609  1.0 1.7 TiB 109 GiB 108 GiB  63 MiB  961 MiB 1.6 
TiB 6.09 1.05  46 up
17   ssd 1.74609  1.0 1.7 TiB 104 GiB 103 GiB 205 MiB  819 MiB 1.6 
TiB 5.81 1.01  50 up
23   ssd 3.49219  1.0 3.5 TiB 210 GiB 209 GiB 168 MiB  856 MiB 3.3 
TiB 5.86 1.01  86 up
29   ssd 3.49219  1.0 3.5 TiB 204 GiB 203 GiB 272 MiB  752 MiB 3.3 
TiB 5.69 0.98  92 up
34   ssd 3.49219  1.0 3.5 TiB 198 GiB 197 GiB 295 MiB  729 MiB 3.3 
TiB 5.54 0.96  85 up
  4   ssd 1.74609  1.0 1.7 TiB 119 GiB 118 GiB  16 KiB 1024 MiB 1.6 
TiB 6.67 1.15  50 up
10   ssd 1.74609  1.0 1.7 TiB  95 GiB  94 GiB 183 MiB  841 MiB 1.7 
TiB 5.31 0.92  46 up
16   ssd 1.74609  1.0 1.7 TiB 102 GiB 101 GiB 122 MiB  902 MiB 1.6 
TiB 5.72 0.99  50 up
22   ssd 3.49219  1.0 3.5 TiB 218 GiB 217 GiB 109 MiB  915 MiB 3.3 
TiB 6.11 1.06  91 up
28   ssd 3.49219  1.0 3.5 TiB 198 GiB 197 GiB 343 MiB  681 MiB 3.3 
TiB 5.54 0.96  95 up
33   ssd 3.49219  1.0 3.5 TiB 198 GiB 196 GiB 297 MiB 1019 MiB 3.3 
TiB 5.53 0.96  85 up
  1   ssd 1.74609  1.0 1.7 TiB 101 GiB 100 GiB 222 MiB  802 MiB 1.6 
TiB 5.63 0.97  49 up
  7   ssd 1.74609  1.0 1.7 TiB 102 GiB 101 GiB 153 MiB  871 MiB 1.6 
TiB 5.69 0.99  46 up
13   ssd 1.74609  1.0 1.7 TiB 106 GiB 105 GiB  67 MiB  957 MiB 1.6 
TiB 5.96 1.03  42 up
19   ssd 3.49219  1.0 3.5 TiB 206 GiB 205 GiB 179 MiB  845 MiB 3.3 
TiB 5.77 1.00  83 up
25   ssd 3.49219  1.0 3.5 TiB 195 GiB 194 GiB 352 MiB  672 MiB 3.3 
TiB 5.45 0.94  97 up
31   ssd 3.49219  1.0 3.5 TiB 201 GiB 200 GiB 305 MiB  719 MiB 3.3 
TiB 

Re: [ceph-users] Need advice with setup planning

2019-09-20 Thread Marc Roos
 >
 >> > -   Use 2 HDDs for SO using RAID 1 (I've left 3.5TB unallocated in 
case
 >>
 >> I can use it later for storage)
 >>
 >> OS not? get enterprise ssd as os (I think some recommend it when
 >> colocating monitors, can generate a lot of disk io)
 >
 >Yes, OS. I have no option to get a SSD.
 
One Samsung SM863 240 GB SSD on eBay is US$180. How much are your 2x 
HDDs?

 >
 >>
 >> > -   Install CentOS 7.7
 >>
 >> Good choice
 >>
 >> > -   Use 2 vLANs, one for ceph internal usage and another for 
external
 >>
 >> access. Since they've 4 network adapters, I'll try to bond them in 
pairs
 >> to speed up network (1Gb).
 >>
 >> Bad, get 10Gbit, yes really
 >
 >Again, that's not an option. We'll have to use the hardware we got.

Maybe you can try to convince Ceph development to optimize for bonding 
on 1 Gbit. Beware of this:  
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg35474.html
Make sure you test your requirements, because Ceph adds quite some 
overhead.
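
A minimal sketch of such a requirements test, assuming a throwaway pool named 
"testpool" that can be deleted afterwards:

# 60 seconds of 4 MB object writes, then sequential reads of the same objects
rados bench -p testpool 60 write --no-cleanup
rados bench -p testpool 60 seq
rados -p testpool cleanup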

 >
 >>
 >> > -   I'll try to use ceph-ansible for installation. I failed to use 
it on
 >>
 >> lab, but it seems more recommended.
 >>
 >> Where did you get it from that ansible is recommended? Ansible is a 
tool
 >> to help you automate deployments, but I have the impression it is 
mostly
 >> used as a 'I do not know how to install something' so lets use 
ansible
 >> tool.
 >
 >From reaing various sites/guides for lab.
 >
 >>
 >> > -   Install Ceph Nautilus
 >>
 >> >
 >>
 >> > -   Each server will host OSD, MON, MGR and MDS.
 >>
 >> > -   One VM for ceph-admin: This wil be used to run ceph-ansible 
and
 >>
 >> maybe to host some ceph services later
 >>
 >> Don't waste a vm on this?
 >
 >You think it is a waste to have a VM for this? Won't I need another 
machine to host other ceph services?

I am not using a VM for ceph-admin. It depends on what you are going to 
do and, e.g., how much memory you have / are using. The thing to beware of 
is that you could get kernel deadlocks when running tasks on OSD nodes; 
using a VM prevents this. However, this all depends on the availability of 
memory. I didn't encounter it, and others are also running this way 
successfully AFAIK.




Re: [ceph-users] Need advice with setup planning

2019-09-20 Thread Marc Roos
 


 >- Use 2 HDDs for SO using RAID 1 (I've left 3.5TB unallocated in case 
I can use it later for storage)

For the OS, no? Get an enterprise SSD as the OS disk (I think some recommend 
it when colocating monitors, which can generate a lot of disk I/O).

 >- Install CentOS 7.7

Good choice

 >- Use 2 vLANs, one for ceph internal usage and another for external 
access. Since they've 4 network adapters, I'll try to bond them in pairs 
to speed up network (1Gb).

Bad, get 10Gbit, yes really 

 >- I'll try to use ceph-ansible for installation. I failed to use it on 
lab, but it seems more recommended.

Where did you get the idea that Ansible is recommended? Ansible is a tool 
to help you automate deployments, but I have the impression it is mostly 
used as an 'I do not know how to install something, so let's use Ansible' 
tool.

 >- Install Ceph Nautilus
 >
 >- Each server will host OSD, MON, MGR and MDS.
 >- One VM for ceph-admin: This wil be used to run ceph-ansible and 
maybe to host some ceph services later

Don't waste a vm on this?

 >- I'll have to serve samba, iscsi and probably NFS too. Not sure how 
or on which servers.
 >

If you want to create a fancy solution, you can use something like Mesos 
to manage your NFS, SMB, iSCSI or RGW daemons, so that if you bring down a 
host, the applications automatically move to a different host ;)

 





-Original Message-
From: Salsa [mailto:sa...@protonmail.com] 
Sent: vrijdag 20 september 2019 18:14
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Need advice with setup planning

I have tested Ceph using VMs but never got to put it to use, and I had a 
lot of trouble getting it to install.


Now I've been asked to do a production setup using 3 servers (Dell R740) 
with 12 x 4 TB disks each.


My plan is this:

- Use 2 HDDs for the OS using RAID 1 (I've left 3.5 TB unallocated in case I 
can use it later for storage)

- Install CentOS 7.7

- Use 2 vLANs, one for ceph internal usage and another for external 
access. Since they've 4 network adapters, I'll try to bond them in pairs 
to speed up network (1Gb).

- I'll try to use ceph-ansible for installation. I failed to use it in the 
lab, but it seems to be the recommended approach.

- Install Ceph Nautilus

- Each server will host OSD, MON, MGR and MDS.
- One VM for ceph-admin: This will be used to run ceph-ansible and maybe 
to host some ceph services later
- I'll have to serve samba, iscsi and probably NFS too. Not sure how or 
on which servers.


Am I missing anything? Am I doing anything "wrong"?


I searched for some actual guidance on setup but I couldn't find 
anything complete, like a good tutorial or reference based on possible 
use-cases.


So, is there any suggestions you could share or links and references I 
should take a look?

Thanks;


--

Salsa


Sent with ProtonMail   Secure Email.






Re: [ceph-users] reproducible rbd-nbd crashes

2019-09-20 Thread Marc Schöchlin
Hello Mike and Jason,

as described in my last mail, I converted the filesystem to ext4, set "sysctl 
vm.dirty_background_ratio=0" and put the regular workload on the filesystem 
(used as an NFS mount).
That seems to have prevented crashes for an entire week now (before this, the nbd 
device crashed after hours/~one day).

XFS on top of nbd devices really seems to add additional instability.

The current workaround causes very high CPU load (40-50 on a 4-CPU virtual 
system) and up to ~95% iowait if a single client puts a 20 GB file on that 
volume.

What is your current state in correcting this problem?
Can we support you in testing by running tests with custom kernel or 
rbd-nbd builds?

Regards
Marc

On 13.09.19 at 14:15, Marc Schöchlin wrote:
>>> Nevertheless i will try EXT4 on another system.
> I converted the filesystem to a ext4 filesystem.
>
> I completely deleted the entire rbd ec image and its snapshots (3) and 
> recreated it.
> After mapping and mounting i executed the following command:
>
> sysctl vm.dirty_background_ratio=0
>
> Lets see, what we get now
>
> Regards
> Marc
>
>



Re: [ceph-users] reproducible rbd-nbd crashes

2019-09-13 Thread Marc Schöchlin
Hello Jason,


On 12.09.19 at 16:56, Jason Dillaman wrote:
> On Thu, Sep 12, 2019 at 3:31 AM Marc Schöchlin  wrote:
>
> Whats that, have we seen that before? ("Numerical argument out of domain")
> It's the error that rbd-nbd prints when the kernel prematurely closes
> the socket ... and as we have already discussed, it's closing the
> socket due to the IO timeout being hit ... and it's hitting the IO
> timeout due to a deadlock due to memory pressure from rbd-nbd causing
> IO to pushed from the XFS cache back down into rbd-nbd.
Okay.
>
>> I can try that, but i am skeptical, i am note sure that we are searching on 
>> the right place...
>>
>> Why?
>> - we run hundreds of heavy use rbd-nbd instances in our xen dom-0 systems 
>> for 1.5 years now
>> - we never experienced problems like that in xen dom0 systems
>> - as described these instances run 12.2.5 ceph components with kernel 
>> 4.4.0+10
>> - the domU (virtual machines) are interacting heavily with that dom0 are 
>> using various filesystems
>>-> probably the architecture of the blktap components leads to different 
>> io scenario : https://wiki.xenproject.org/wiki/Blktap
> Are you running a XFS (or any) file system on top of the NBD block
> device in dom0? I suspect you are just passing raw block devices to
> the VMs and therefore they cannot see the same IO back pressure
> feedback loop.

No, we do not directly use a filesystem in dom0 on that nbd device.

Our scenario is:
The xen dom0 maps the NBD devices and connects them via tapdisk to the 
blktap/blkback infrastructure.
(https://wiki.xenproject.org/wiki/File:Blktap$blktap_diagram_differentSymbols.png,
 you can ignore the right upper quadrant of the diagram - tapdisk just maps the 
nbd device)
The blktap/blkback infrastructure in the Xen dom0 uses the device channel 
(shared memory ring) to communicate with the VM (domU) via the blkfront 
infrastructure, and vice versa.
The device is exposed as a /dev/xvd device. These devices are used by our 
virtualized systems as raw devices for disks (using partitions) or for LVM.

I do not know the Xen internals, but I suppose that this usage scenario leads 
to homogeneous I/O request sizes, because it seems to be difficult to 
implement a ring list using shared memory.
Probably a situation which reduces the probability of rbd-nbd crashes 
dramatically.

>> Nevertheless i will try EXT4 on another system.

I converted the filesystem to a ext4 filesystem.

I completely deleted the entire rbd ec image and its snapshots (3) and 
recreated it.
After mapping and mounting i executed the following command:

sysctl vm.dirty_background_ratio=0

Lets see, what we get now

Regards
Marc




[ceph-users] Ceph dovecot again

2019-09-13 Thread Marc Roos
 

How do I actually configure Dovecot to use Ceph for a mailbox? I have 
built the plugins as mentioned here[0].

- but where do I copy/load what module?
- can I configure a specific mailbox only, via e.g. userdb:
  test3:x:8267:231:Account with special settings for 
dovecot:/home/popusers/test3:/bin/false:userdb_mail=mdbox:~/mdbox:INBOX=
/var/spool/mail/%u:INDEX=/var/dovecot/%u/index




[@test2 src]$ ls -lart librmb/.libs/

drwxrwxr-x 5 test test   4096 Sep 12 23:24 ..
-rw-rw-r-- 1 test test920 Sep 12 23:24 librmb.lai
lrwxrwxrwx 1 test test 12 Sep 12 23:24 librmb.la -> ../librmb.la
drwxrwxr-x 2 test test   4096 Sep 12 23:24 .

[@test2 src]$ ls -lart storage-rbox/.libs/
total 2076

-rwxrwxr-x 1 test test 387840 Sep 12 23:24 libstorage_rbox_plugin.so
drwxrwxr-x 4 test test   4096 Sep 12 23:24 ..
-rw-rw-r-- 1 test test   1071 Sep 12 23:24 libstorage_rbox_plugin.lai
lrwxrwxrwx 1 test test 28 Sep 12 23:24 libstorage_rbox_plugin.la -> 
../libstorage_rbox_plugin.la
drwxrwxr-x 2 test test   4096 Sep 12 23:24 .
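
A sketch of what I plan to try (the module directory and the rbox location 
prefix are guesses on my part, so please correct me):

# copy the built storage plugin into dovecot's module directory
# (CentOS default path assumed)
cp storage-rbox/.libs/libstorage_rbox_plugin.so /usr/lib64/dovecot/
# then point a single test account at rbox storage via a userdb extra field,
# analogous to the mdbox example above:
#   userdb_mail=rbox:~/rbox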

[0]
https://github.com/ceph-dovecot/dovecot-ceph-plugin




Re: [ceph-users] reproducible rbd-nbd crashes

2019-09-12 Thread Marc Schöchlin
Hello Jason,

Yesterday I started rbd-nbd in foreground mode to see if there is any 
additional information.

root@int-nfs-001:/etc/ceph# rbd-nbd map rbd_hdd/int-nfs-001_srv-ceph -d --id nfs
2019-09-11 13:07:41.444534 77fe1040  0 ceph version 12.2.12 
(1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable), process rbd-nbd, 
pid 14735
2019-09-11 13:07:41.444555 77fe1040  0 pidfile_write: ignore empty 
--pid-file
/dev/nbd0
-


2019-09-11 21:31:03.126223 7fffc3fff700 -1 rbd-nbd: failed to read nbd request 
header: (33) Numerical argument out of domain

What's that? Have we seen it before? ("Numerical argument out of domain")

On 10.09.19 at 16:10, Jason Dillaman wrote:
> [Tue Sep 10 14:46:51 2019]  ? __schedule+0x2c5/0x850
> [Tue Sep 10 14:46:51 2019]  kthread+0x121/0x140
> [Tue Sep 10 14:46:51 2019]  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
> [Tue Sep 10 14:46:51 2019]  ? kthread+0x121/0x140
> [Tue Sep 10 14:46:51 2019]  ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
> [Tue Sep 10 14:46:51 2019]  ? kthread_park+0x90/0x90
> [Tue Sep 10 14:46:51 2019]  ret_from_fork+0x35/0x40
> Perhaps try it w/ ext4 instead of XFS?

I can try that, but I am skeptical; I am not sure that we are searching in the 
right place...

Why?
- we run hundreds of heavy use rbd-nbd instances in our xen dom-0 systems for 
1.5 years now
- we never experienced problems like that in xen dom0 systems
- as described these instances run 12.2.5 ceph components with kernel 4.4.0+10
- the domUs (virtual machines) that interact heavily with that dom0 are using 
various filesystems
   -> probably the architecture of the blktap components leads to a different 
I/O scenario: https://wiki.xenproject.org/wiki/Blktap

Nevertheless, I will try ext4 on another system.

Regards
Marc



Re: [ceph-users] reproducible rbd-nbd crashes

2019-09-10 Thread Marc Schöchlin
Hello Mike,

as described, I set all the settings.

Unfortunately, it also crashed with these settings :-(

Regards
Marc

[Tue Sep 10 12:25:56 2019] Btrfs loaded, crc32c=crc32c-intel
[Tue Sep 10 12:25:57 2019] EXT4-fs (dm-0): mounted filesystem with ordered data 
mode. Opts: (null)
[Tue Sep 10 12:25:59 2019] systemd[1]: systemd 237 running in system mode. 
(+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP 
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 
default-hierarchy=hybrid)
[Tue Sep 10 12:25:59 2019] systemd[1]: Detected virtualization xen.
[Tue Sep 10 12:25:59 2019] systemd[1]: Detected architecture x86-64.
[Tue Sep 10 12:25:59 2019] systemd[1]: Set hostname to .
[Tue Sep 10 12:26:01 2019] systemd[1]: Started ntp-systemd-netif.path.
[Tue Sep 10 12:26:01 2019] systemd[1]: Created slice System Slice.
[Tue Sep 10 12:26:01 2019] systemd[1]: Listening on udev Kernel Socket.
[Tue Sep 10 12:26:01 2019] systemd[1]: Created slice 
system-serial\x2dgetty.slice.
[Tue Sep 10 12:26:01 2019] systemd[1]: Listening on Journal Socket.
[Tue Sep 10 12:26:01 2019] systemd[1]: Mounting POSIX Message Queue File 
System...
[Tue Sep 10 12:26:01 2019] RPC: Registered named UNIX socket transport module.
[Tue Sep 10 12:26:01 2019] RPC: Registered udp transport module.
[Tue Sep 10 12:26:01 2019] RPC: Registered tcp transport module.
[Tue Sep 10 12:26:01 2019] RPC: Registered tcp NFSv4.1 backchannel transport 
module.
[Tue Sep 10 12:26:01 2019] EXT4-fs (dm-0): re-mounted. Opts: errors=remount-ro
[Tue Sep 10 12:26:01 2019] Loading iSCSI transport class v2.0-870.
[Tue Sep 10 12:26:01 2019] iscsi: registered transport (tcp)
[Tue Sep 10 12:26:01 2019] systemd-journald[497]: Received request to flush 
runtime journal from PID 1
[Tue Sep 10 12:26:01 2019] Installing knfsd (copyright (C) 1996 
o...@monad.swb.de).
[Tue Sep 10 12:26:01 2019] iscsi: registered transport (iser)
[Tue Sep 10 12:26:01 2019] systemd-journald[497]: File 
/var/log/journal/cef15a6d1b80c9fbcb31a3a65aec21ad/system.journal corrupted or 
uncleanly shut down, renaming and replacing.
[Tue Sep 10 12:26:04 2019] EXT4-fs (dm-1): mounted filesystem with ordered data 
mode. Opts: (null)
[Tue Sep 10 12:26:05 2019] EXT4-fs (xvda1): mounted filesystem with ordered 
data mode. Opts: (null)
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.659:2): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="/usr/bin/lxc-start" pid=902 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.675:3): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="/usr/bin/man" pid=904 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.675:4): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="man_filter" pid=904 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.675:5): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="man_groff" pid=904 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.687:6): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="lxc-container-default" pid=900 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.687:7): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="lxc-container-default-cgns" pid=900 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.687:8): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="lxc-container-default-with-mounting" pid=900 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.687:9): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="lxc-container-default-with-nesting" pid=900 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.723:10): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="/usr/lib/snapd/snap-confine" pid=905 comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] audit: type=1400 audit(156866.723:11): 
apparmor="STATUS" operation="profile_load" profile="unconfined" 
name="/usr/lib/snapd/snap-confine//mount-namespace-capture-helper" pid=905 
comm="apparmor_parser"
[Tue Sep 10 12:26:06 2019] new mount options do not match the existing 
superblock, will be ignored
[Tue Sep 10 12:26:09 2019] SGI XFS with ACLs, security attributes, realtime, no 
debug

Re: [ceph-users] reproducible rbd-nbd crashes

2019-09-10 Thread Marc Schöchlin
Hello Mike,

On 03.09.19 at 04:41, Mike Christie wrote:
> On 09/02/2019 06:20 AM, Marc Schöchlin wrote:
>> Hello Mike,
>>
>> i am having a quick look  to this on vacation because my coworker
>> reports daily and continuous crashes ;-)
>> Any updates here (i am aware that this is not very easy to fix)?
> I am still working on it. It basically requires rbd-nbd to be written so
> it preallocates its memory used for IO, and when it can't like when
> doing network IO it requires adding a interface to tell the kernel to
> not use allocation flags that can cause disk IO back on to the device.
>
> There are some workraounds like adding more memory and setting the vm
> values. For the latter, if it seems if you set:
>
> vm.dirty_background_ratio = 0 then it looks like it avoids the problem
> because the kernel will immediately start to write dirty pages from the
> background worker threads, so we do not end up later needing to write
> out pages from the rbd-nbd thread to free up memory.

Sigh, I set this yesterday on my system ("sysctl vm.dirty_background_ratio=0") 
and got an additional crash this night :-(

I now restarted the system and invoked all of the following commands mentioned 
in your last mail:

sysctl vm.dirty_background_ratio=0
sysctl vm.dirty_ratio=0
sysctl vm.vfs_cache_pressure=0
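
To make the workaround survive a reboot, the values could additionally be 
persisted in a sysctl drop-in file (a sketch; the file name is arbitrary):

cat > /etc/sysctl.d/99-rbd-nbd-workaround.conf <<'EOF'
vm.dirty_background_ratio = 0
vm.dirty_ratio = 0
vm.vfs_cache_pressure = 0
EOF
sysctl --system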

Let's see if that helps

Regards

Marc


On 03.09.19 at 04:41, Mike Christie wrote:
> On 09/02/2019 06:20 AM, Marc Schöchlin wrote:
>> Hello Mike,
>>
>> i am having a quick look  to this on vacation because my coworker
>> reports daily and continuous crashes ;-)
>> Any updates here (i am aware that this is not very easy to fix)?
> I am still working on it. It basically requires rbd-nbd to be written so
> it preallocates its memory used for IO, and when it can't like when
> doing network IO it requires adding a interface to tell the kernel to
> not use allocation flags that can cause disk IO back on to the device.
>
> There are some workraounds like adding more memory and setting the vm
> values. For the latter, if it seems if you set:
>
> vm.dirty_background_ratio = 0 then it looks like it avoids the problem
> because the kernel will immediately start to write dirty pages from the
> background worker threads, so we do not end up later needing to write
> out pages from the rbd-nbd thread to free up memory.
>
> or
>
> vm.dirty_ratio = 0 then it looks like it avoids the problem because the
> kernel will just write out the data right away similar to above, but
> from its normally going to be written out from the thread that you are
> running your test from.
>
> and this seems optional and can result in other problems:
>
> vm.vfs_cache_pressure = 0 then for at least XFS it looks like we avoid
> one of the immediate problems where allocations would always cause the
> inode caches to be reclaimed and that memory to be written out to the
> device. For EXT4, I did not see a similar issue.
>
>> I think the severity of this problem
>> <https://tracker.ceph.com/issues/40822> (currently "minor") is not
>> suitable to the consequences of this problem.
>>
>> This reproducible problem can cause:
>>
>>   * random service outage
>>   * data corruption
>>   * long recovery procedures on huge filesystems
>>
>> Is it adequate to increase the severity to major or critical?
>>
>> What might the reason for a very reliable rbd-nbd running on my xen
>> servers as storage repository?
>> (see https://github.com/vico-research-and-consulting/RBDSR/tree/v2.0 -
>> hundreds of devices, high workload)
>>
>> Regards
>> Marc
>>
>> On 15.08.19 at 20:07, Marc Schöchlin wrote:
>>> Hello Mike,
>>>
>>> On 15.08.19 at 19:57, Mike Christie wrote:
>>>>> Don't waste your time. I found a way to replicate it now.
>>>>>
>>>> Just a quick update.
>>>>
>>>> Looks like we are trying to allocate memory in the IO path in a way that
>>>> can swing back on us, so we can end up locking up. You are probably not
>>>> hitting this with krbd in your setup because normally it's preallocating
>>>> structs, using flags like GFP_NOIO, etc. For rbd-nbd, we cannot
>>>> preallocate some structs and cannot control the allocation flags for
>>>> some operations initiated from userspace, so its possible to hit this
>>>> every IO. I can replicate this now in a second just doing a cp -r.
>>>>
>>>> It's not going to be a simple fix. We have had a similar issue for
>>>> storage daemons like iscsid and multipathd since they were created. It's
>>>> le

[ceph-users] Ceph performance paper

2019-08-20 Thread Marc Roos
 
Hi Vitaliy, I just saw you recommend someone use SSDs, and I wanted to use 
the opportunity to thank you for composing this text[0]; I enjoyed reading 
it. 

- What do you mean by "bad-SSD-only"?
- Is this patch[1] in a Nautilus release?
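
One way I could probably check this myself; a sketch, assuming a local clone 
of the ceph repository and the merge commit SHA of that PR (not given here):

git clone https://github.com/ceph/ceph.git && cd ceph
# list Nautilus release tags that already contain the PR's merge commit
git tag --contains <merge-commit-sha> | grep '^v14'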


[0]
https://yourcmc.ru/wiki/Ceph_performance

[1]
https://github.com/ceph/ceph/pull/26909


Re: [ceph-users] reproducible rbd-nbd crashes

2019-08-15 Thread Marc Schöchlin
Hello Mike,

On 15.08.19 at 19:57, Mike Christie wrote:
>
>> Don't waste your time. I found a way to replicate it now.
>>
>
> Just a quick update.
>
> Looks like we are trying to allocate memory in the IO path in a way that
> can swing back on us, so we can end up locking up. You are probably not
> hitting this with krbd in your setup because normally it's preallocating
> structs, using flags like GFP_NOIO, etc. For rbd-nbd, we cannot
> preallocate some structs and cannot control the allocation flags for
> some operations initiated from userspace, so its possible to hit this
> every IO. I can replicate this now in a second just doing a cp -r.
>
> It's not going to be a simple fix. We have had a similar issue for
> storage daemons like iscsid and multipathd since they were created. It's
> less likey to hit with them because you only hit the paths they cannot
> control memory allocation behavior during recovery.
>
> I am looking into some things now.

Great to hear that the problem is now identified.

As described, I'm on vacation; if you need anything after the 8th of September, 
we can probably invest some time to test upcoming fixes.

Regards
Marc


-- 
GPG encryption available: 0x670DCBEC/pool.sks-keyservers.net



[ceph-users] reproducible rbd-nbd crashes

2019-08-14 Thread Marc Schöchlin
44
fs.file-max = 100
kernel.pid_max = 4194303
vm.zone_reclaim_mode = 0
kernel.randomize_va_space = 0
kernel.panic = 0
kernel.panic_on_oops = 0


>> I'm not sure which test it covers above, but for
>> test-with-timeout/ceph-client.archiv.log and dmesg-crash it looks like
>> the command that probably triggered the timeout got stuck in safe_write
>> or write_fd, because we see:
>>
>> // Command completed and right after this log message we try to write
>> the reply and data to the nbd.ko module.
>>
>> 2019-07-29 21:55:21.148118 7fffbf7fe700 20 rbd-nbd: writer_entry: got:
>> [4500 READ 24043755000~2 0]
>>
>> // We got stuck and 2 minutes go by and so the timeout fires. That kills
>> the socket, so we get an error here and after that rbd-nbd is going to exit.
>>
>> 2019-07-29 21:57:21.785111 7fffbf7fe700 -1 rbd-nbd: [4500
>> READ 24043755000~2 0]: failed to write replay data: (32) Broken pipe
>>
>> We could hit this in a couple ways:
>>
>> 1. The block layer sends a command that is larger than the socket's send
>> buffer limits. These are those values you sometimes set in sysctl.conf like:
>>
>> net.core.rmem_max
>> net.core.wmem_max
>> net.core.rmem_default
>> net.core.wmem_default
>> net.core.optmem_max
see attached file.
>> There does not seem to be any checks/code to make sure there is some
>> alignment with limits. I will send a patch but that will not help you
>> right now. The max io size for nbd is 128k so make sure your net values
>> are large enough. Increase the values in sysctl.conf and retry if they
>> were too small.
> Not sure what I was thinking. Just checked the logs and we have done IO
> of the same size that got stuck and it was fine, so the socket sizes
> should be ok.
>
> We still need to add code to make sure IO sizes and the af_unix sockets
> size limits match up.
>
>
>> 2. If memory is low on the system, we could be stuck trying to allocate
>> memory in the kernel in that code path too.
memory was definitely not low, we only had 10% memory usage at the time of the 
crash.
>> rbd-nbd just uses more memory per device, so it could be why we do not
>> see a problem with krbd.
>>
>> 3. I wonder if we are hitting a bug with PF_MEMALLOC Ilya hit with krbd.
>> He removed that code from the krbd. I will ping him on that.

Interesting. I activated core dumps for those processes; probably we can find 
something interesting there...

Regards
Marc




sysctl_settings.txt.gz
Description: application/gzip


Re: [ceph-users] reproducible rbd-nbd crashes

2019-08-13 Thread Marc Schöchlin
Hello Jason,

thanks for your response.
See my inline comments.

On 31.07.19 at 14:43, Jason Dillaman wrote:
> On Wed, Jul 31, 2019 at 6:20 AM Marc Schöchlin  wrote:
>
>
> The problem not seems to be related to kernel releases, filesystem types or 
> the ceph and network setup.
> Release 12.2.5 seems to work properly, and at least releases >= 12.2.10 seems 
> to have the described problem.
>  ...
>
> It's basically just a log message tweak and some changes to how the
> process is daemonized. If you could re-test w/ each release after
> 12.2.5 and pin-point where the issue starts occurring, we would have
> something more to investigate.

Are there changes related to https://tracker.ceph.com/issues/23891?


You showed me the very small number of changes in rbd-nbd.
What about librbd, librados, ...?

What else can we do to find a detailed reason for the crash?
Do you think it is useful to activate coredump-creation for that process?

>> Whats next? Is i a good idea to do a binary search between 12.2.12 and 
>> 12.2.5?
>>
Due to the absence of a coworker, I had almost no capacity to execute deeper 
tests on this problem.
But I can say that I reproduced the problem also with release 12.2.12.

The new (updated) list:

- SUCCESSFUL: kernel 4.15, ceph 12.2.5, 1TB ec-volume, ext4 file system, 120s 
device timeout
  -> 18 hour testrun was successful, no dmesg output
- FAILED: kernel 4.4, ceph 12.2.11, 2TB ec-volume, xfs file system, 120s device 
timeout
  -> failed after < 1 hour, rbd-nbd map/device is gone, mount throws io errors, 
map/mount can be re-created without reboot
  -> parallel krbd device usage with 99% io usage worked without a problem 
while running the test
- FAILED: kernel 4.15, ceph 12.2.11, 2TB ec-volume, xfs file system, 120s 
device timeout
  -> failed after < 1 hour, rbd-nbd map/device is gone, mount throws io errors, 
map/mount can be re-created
  -> parallel krbd device usage with 99% io usage worked without a problem 
while running the test
- FAILED: kernel 4.4, ceph 12.2.11, 2TB ec-volume, xfs file system, no timeout
  -> failed after < 10 minutes
  -> system runs in a high system load, system is almost unusable, unable to 
shutdown the system, hard reset of vm necessary, manual exclusive lock removal 
is necessary before remapping the device
- FAILED: kernel 4.4, ceph 12.2.11, 2TB 3-replica-volume, xfs file system, 120s 
device timeout
  -> failed after < 1 hour, rbd-nbd map/device is gone, mount throws io errors, 
map/mount can be re-created
*- FAILED: kernel 5.0, ceph 12.2.12, 2TB ec-volume, ext4 file system, 120s 
device timeout-> failed after < 1 hour, rbd-nbd map/device is gone, mount 
throws io errors, map/mount can be re-created*

Regards
Marc



Re: [ceph-users] reproducible rbd-nbd crashes

2019-07-31 Thread Marc Schöchlin
Hello Jason,

it seems that there is something wrong in the rbd-nbd implementation.
(added this information also at  https://tracker.ceph.com/issues/40822)

The problem does not seem to be related to kernel releases, filesystem types, 
or the ceph and network setup.
Release 12.2.5 seems to work properly, and at least releases >= 12.2.10 seem 
to have the described problem.

Last night an 18-hour test run with the following procedure was successful:
-
#!/bin/bash
set -x
while true; do
   date
   find /srv_ec -type f -name "*.MYD" -print0 |head -n 50|xargs -0 -P 10 -n 2 
gzip -v
   date
   find /srv_ec -type f -name "*.MYD.gz" -print0 |head -n 50|xargs -0 -P 10 -n 
2 gunzip -v
done
-
Previous tests crashed in a reproducible manner with "-P 1" (single io 
gzip/gunzip) after a few minutes up to 45 minutes.

Overview of my tests:

- SUCCESSFUL: kernel 4.15, ceph 12.2.5, 1TB ec-volume, ext4 file system, 120s 
device timeout
  -> 18 hour testrun was successful, no dmesg output
- FAILED: kernel 4.4, ceph 12.2.11, 2TB ec-volume, xfs file system, 120s device 
timeout
  -> failed after < 1 hour, rbd-nbd map/device is gone, mount throws io errors, 
map/mount can be re-created without reboot
  -> parallel krbd device usage with 99% io usage worked without a problem 
while running the test
- FAILED: kernel 4.15, ceph 12.2.11, 2TB ec-volume, xfs file system, 120s 
device timeout
  -> failed after < 1 hour, rbd-nbd map/device is gone, mount throws io errors, 
map/mount can be re-created
  -> parallel krbd device usage with 99% io usage worked without a problem 
while running the test
- FAILED: kernel 4.4, ceph 12.2.11, 2TB ec-volume, xfs file system, no timeout
  -> failed after < 10 minutes
  -> system runs in a high system load, system is almost unusable, unable to 
shutdown the system, hard reset of vm necessary, manual exclusive lock removal 
is necessary before remapping the device
- FAILED: kernel 4.4, ceph 12.2.11, 2TB 3-replica-volume, xfs file system, 120s 
device timeout
  -> failed after < 1 hour, rbd-nbd map/device is gone, mount throws io errors, 
map/mount can be re-created

All device timeouts were set separately by the nbd_set_ioctl tool, because 
the Luminous rbd-nbd does not provide the possibility to define timeouts.

What's next? Is it a good idea to do a binary search between 12.2.12 and 12.2.5?
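
If so, a sketch of how an intermediate release could be pinned for the bisect 
(12.2.8 is an arbitrary midpoint; the exact version string has to exist in the 
download.ceph.com apt repository):

apt-get install rbd-nbd=12.2.8-1xenial ceph-common=12.2.8-1xenial librbd1=12.2.8-1xenial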

From my point of view (without in-depth knowledge of rbd-nbd/librbd), my 
assumption is that this problem might be caused by rbd-nbd code and not by 
librbd.
The probability that a bug like this survives undiscovered in librbd for such a 
long time seems low to me :-)

Regards
Marc

On 29.07.19 at 22:25, Marc Schöchlin wrote:
> Hello Jason,
>
> i updated the ticket https://tracker.ceph.com/issues/40822
>
> On 24.07.19 at 19:20, Jason Dillaman wrote:
>> On Wed, Jul 24, 2019 at 12:47 PM Marc Schöchlin  wrote:
>>> Testing with a 10.2.5 librbd/rbd-nbd ist currently not that easy for me, 
>>> because the ceph apt source does not contain that version.
>>> Do you know a package source?
>> All the upstream packages should be available here [1], including 12.2.5.
> Ah okay, i will test this tommorow.
>> Did you pull the OSD blocked ops stats to figure out what is going on
>> with the OSDs?
> Yes, see referenced data in the ticket 
> https://tracker.ceph.com/issues/40822#note-15
>
> Regards
> Marc
>


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-29 Thread Marc Schöchlin
Hello Jason,

i updated the ticket https://tracker.ceph.com/issues/40822

On 24.07.19 at 19:20, Jason Dillaman wrote:
> On Wed, Jul 24, 2019 at 12:47 PM Marc Schöchlin  wrote:
>>
>> Testing with a 10.2.5 librbd/rbd-nbd ist currently not that easy for me, 
>> because the ceph apt source does not contain that version.
>> Do you know a package source?
> All the upstream packages should be available here [1], including 12.2.5.
Ah okay, i will test this tommorow.
> Did you pull the OSD blocked ops stats to figure out what is going on
> with the OSDs?
Yes, see referenced data in the ticket 
https://tracker.ceph.com/issues/40822#note-15

Regards
Marc



Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-24 Thread Marc Schöchlin
Hi Jason,

I installed kernel 4.4.0-154.181 (from the Ubuntu package sources) and performed 
the crash reproduction.
The problem also re-appeared with that kernel release.

A run with 10 parallel gunzip processes threw 1600 write and 330 read IOPS against 
the cluster / the rbd_ec volume with a transfer rate of 290 MB/s for 10 minutes.
After that, the same problem re-appeared.

What should we do now?

Testing with a 12.2.5 librbd/rbd-nbd is currently not that easy for me, 
because the Ceph apt source does not contain that version.
Do you know a package source?

How can i support you?

Regards
Marc

On 24.07.19 at 07:55, Marc Schöchlin wrote:
> Hi Jason,
>
> On 24.07.19 at 00:40, Jason Dillaman wrote:
>>> Sure, which kernel do you prefer?
>> You said you have never had an issue w/ rbd-nbd 12.2.5 in your Xen 
>> environment. Can you use a matching kernel version? 
>
> Thats true, our virtual machines of our xen environments completly run on 
> rbd-nbd devices.
> Every host runs dozends of rbd-nbd maps which are visible as xen disks in the 
> virtual systems.
> (https://github.com/vico-research-and-consulting/RBDSR)
>
> It seems that xenserver has a special behavior with device timings because 
> 1.5 years ago we had a outage of 1.5 hours of our ceph cluster which blocked 
> all write requests
> (overfull disks because of huge usage growth). In this situation all 
> virtualmachines continue their work without problems after the cluster was 
> back.
> We haven't set any timeouts using nbd_set_timeout.c on these systems.
>
> We never experienced problems with these rbd-nbd instances.
>
> [root@xen-s31 ~]# rbd nbd ls
> pid   pool   image
>     snap device
> 10405 RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-72f4e61d-acb9-4679-9b1d-fe0324cb5436 -    /dev/nbd3 
> 12731 RBD_XenStorage-PROD-SSD-2-edcf45e6-ca5b-43f9-bafe-c553b1e5dd84 
> RBD-88f8889a-05dc-49ab-a7de-8b5f3961f9c9 -    /dev/nbd4 
> 13123 RBD_XenStorage-PROD-HDD-2-08fdb4aa-81e3-433a-87d7-d5b37012a282 
> RBD-37243066-54b0-453a-8bf3-b958153a680d -    /dev/nbd5 
> 15342 RBD_XenStorage-PROD-SSD-1-cb933ab7-a006-4046-a012-5cbe0c5fbfb5 
> RBD-2bee9bf7-4fed-4735-a749-2d4874181686 -    /dev/nbd6 
> 15702 RBD_XenStorage-PROD-HDD-2-08fdb4aa-81e3-433a-87d7-d5b37012a282 
> RBD-5b93eb93-ebe7-4711-a16a-7893d24c1bbf -    /dev/nbd7 
> 27568 RBD_XenStorage-PROD-HDD-2-08fdb4aa-81e3-433a-87d7-d5b37012a282 
> RBD-616a74b5-3f57-4123-9505-dbd4c9aa9be3 -    /dev/nbd8 
> 21112 RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-5c673a73-7827-44cc-802c-8d626da2f401 -    /dev/nbd9 
> 15726 RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-1069a275-d97f-48fd-9c52-aed1d8ac9eab -    /dev/nbd10
> 4368  RBD_XenStorage-PROD-SSD-2-edcf45e6-ca5b-43f9-bafe-c553b1e5dd84 
> RBD-23b72184-0914-4924-8f7f-10868af7c0ab -    /dev/nbd11
> 4642  RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-bf13cf77-6115-466e-85c5-aa1d69a570a0 -    /dev/nbd12
> 9438  RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-a2071aa0-5f63-4425-9f67-1713851fc1ca -    /dev/nbd13
> 29191 RBD_XenStorage-PROD-HDD-2-08fdb4aa-81e3-433a-87d7-d5b37012a282 
> RBD-fd9a299f-dad9-4ab9-b6c9-2e9650cda581 -    /dev/nbd14
> 4493  RBD_XenStorage-PROD-SSD-2-edcf45e6-ca5b-43f9-bafe-c553b1e5dd84 
> RBD-1bbb4135-e9ed-4720-a41a-a49b998faf42 -    /dev/nbd15
> 4683  RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-374cadac-d969-49eb-8269-aa125cba82d8 -    /dev/nbd16
> 1736  RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-478a20cc-58dd-4cd9-b8b1-6198014e21b1 -    /dev/nbd17
> 3648  RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-6e28ec15-747a-43c9-998d-e9f2a600f266 -    /dev/nbd18
> 9993  RBD_XenStorage-PROD-SSD-2-edcf45e6-ca5b-43f9-bafe-c553b1e5dd84 
> RBD-61ae5ef3-9efb-4fe6-8882-45d54558313e -    /dev/nbd19
> 10324 RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-f7d27673-c268-47b9-bd58-46dcd4626bbb -    /dev/nbd20
> 19330 RBD_XenStorage-PROD-HDD-2-08fdb4aa-81e3-433a-87d7-d5b37012a282 
> RBD-0d4e5568-ac93-4f27-b24f-6624f2fa4a2b -    /dev/nbd21
> 14942 RBD_XenStorage-PROD-SSD-1-cb933ab7-a006-4046-a012-5cbe0c5fbfb5 
> RBD-69832522-fd68-49f9-810f-485947ff5e44 -    /dev/nbd22
> 20859 RBD_XenStorage-PROD-HDD-2-08fdb4aa-81e3-433a-87d7-d5b37012a282 
> RBD-5025b066-723e-48f5-bc4e-9b8bdc1e9326 -    /dev/nbd23
> 19247 RBD_XenStorage-PROD-HDD-1-2d80bec4-0f74-4553-9d87-5ccf650c87a0 
> RBD-095292a0-6cc2-4112-95bf-15cb3dd33e9a -    /dev/nbd24
> 22356 RBD_XenStorage-PROD-SSD-2-edcf45e6-ca5b-43f9-bafe-c553b1e5dd84 
> RBD-f8229ea0-ad7b-4034-9cbe-7353792a2b7c -    /dev/

Re: [ceph-users] Questions regarding backing up Ceph

2019-07-24 Thread Marc Roos
 

>
> complete DR with Ceph to restore it back to how it was at a given 
point in time is a challenge.

>
> Trying to backup a Ceph cluster sounds very 'enterprise' and is 
difficult to scale as well.

Hmmm, I was actually also curious how backups are done, especially on 
these clusters that have 300, 600 or even more OSDs. 

Is it common to do backups 'within' the Ceph cluster, e.g. with snapshots? 
A different pool? If this is common with most Ceph clusters, would that 
not also increase the security requirements for development? 
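
As an illustration of the snapshot approach, a sketch with made-up pool, image 
and host names: keep snapshots in the cluster and ship incremental diffs 
off-cluster.

# create a point-in-time snapshot and export the delta since the previous one
rbd snap create rbd/vm-disk1@backup-20190724
rbd export-diff --from-snap backup-20190723 rbd/vm-disk1@backup-20190724 - \
    | ssh backuphost 'cat > vm-disk1-20190724.diff'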

Recently I read that a cloud hosting provider, Insynq, lost all their data[0]. 
One remote exploit in Ceph, and TBs/PBs of data could be at risk. 



[0]
https://www.insynq.com/support/#status




Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-23 Thread Marc Schöchlin
.2.5-0.el7.x86_64
rbd-nbd-12.2.5-0.el7.x86_64

Therefore I will try to use a 4.4 release, but I suppose that there are some 
patch differences between my Ubuntu 4.4 and XenServer 4.4 kernels.
I will test with "4.4.0-154".

Regards
Marc


>
>> I can test with following releases:
>>
>> # apt-cache search linux-image-4.*.*.*-*-generic 2>&1|sed 
>> '~s,\.[0-9]*-[0-9]*-*-generic - .*,,;~s,linux-image-,,'|sort -u
>> 4.10
>> 4.11
>> 4.13
>> 4.15
>> 4.4
>> 4.8
>>
>> We can also perform tests by using another filesystem (i.e. ext4).
>>
>> From my point of view i suppose that there is something wrong nbd.ko or with 
>> rbd-nbd (excluding rbd-cache functionality) - therefore i do not think that 
>> this very promising
> Agreed. I would also attempt to see if you have blocked ops on the OSD during 
> these events (see Mykola’s ticket comment).
>
>> Regards
>> Marc

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-23 Thread Marc Schöchlin
Hi Jason,

Am 23.07.19 um 14:41 schrieb Jason Dillaman
> Can you please test a consistent Ceph release w/ a known working
> kernel release? It sounds like you have changed two variables, so it's
> hard to know which one is broken. We need *you* to isolate what
> specific Ceph or kernel release causes the break.
Sure, let's find the origin of this problem. :-)
>
> We really haven't made many changes to rbd-nbd, but the kernel has had
> major changes to the nbd driver. As Mike pointed out on the tracker
> ticket, one of those major changes effectively capped the number of
> devices at 256. Can you repeat this with a single device? 


Definitely - the problematic rbd-nbd runs on a virtual system which 
utilizes only a single nbd device and a single krbd device.

To be clear:

# lsb_release -d
Description:    Ubuntu 16.04.5 LTS

# uname -a
Linux archiv-001 4.15.0-45-generic #48~16.04.1-Ubuntu SMP Tue Jan 29 18:03:48 
UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# rbd nbd ls
pid    pool    image snap device   
626931 rbd_hdd archiv-001-srv_ec -    /dev/nbd0 

# rbd showmapped
id pool    image  snap device   
0  rbd_hdd archiv-001_srv -    /dev/rbd0 

# df -h|grep -P "File|nbd|rbd"
Filesystem    Size  Used Avail Use% Mounted on
/dev/rbd0  32T   31T  1.8T  95% /srv
/dev/nbd0 3.0T  1.3T  1.8T  42% /srv_ec

#  mount|grep -P "nbd|rbd"
/dev/rbd0 on /srv type xfs 
(rw,relatime,attr2,largeio,inode64,allocsize=4096k,logbufs=8,logbsize=256k,sunit=8192,swidth=8192,noquota,_netdev)
/dev/nbd0 on /srv_ec type xfs 
(rw,relatime,attr2,discard,largeio,inode64,allocsize=4096k,logbufs=8,logbsize=256k,noquota,_netdev)

# dpkg -l|grep -P "rbd|ceph"
ii  ceph-common   12.2.11-1xenial   
 amd64    common utilities to mount and interact with a ceph storage 
cluster
ii  libcephfs2    12.2.11-1xenial   
 amd64    Ceph distributed file system client library
ii  librbd1   12.2.11-1xenial   
 amd64    RADOS block device client library
ii  python-cephfs 12.2.11-1xenial   
 amd64    Python 2 libraries for the Ceph libcephfs library
ii  python-rbd    12.2.11-1xenial   
 amd64    Python 2 libraries for the Ceph librbd library
ii  rbd-nbd   12.2.12-1xenial   
 amd64    NBD-based rbd client for the Ceph distributed file system

More details regarding the problem environment can be found in my initial 
mail below the heading "Environment". 
> Can you
> repeat this on Ceph rbd-nbd 12.2.11 with an older kernel?

Sure, which kernel do you prefer?

I can test with following releases:

# apt-cache search linux-image-4.*.*.*-*-generic 2>&1|sed 
'~s,\.[0-9]*-[0-9]*-*-generic - .*,,;~s,linux-image-,,'|sort -u
4.10
4.11
4.13
4.15
4.4
4.8

We can also perform tests by using another filesystem (i.e. ext4).

From my point of view I suppose that there is something wrong with nbd.ko or with 
rbd-nbd (excluding the rbd-cache functionality) - therefore I do not think that 
this is very promising.

Regards
Marc
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-23 Thread Marc Schöchlin


Am 23.07.19 um 07:28 schrieb Marc Schöchlin:
>
> Okay, I already experimented with high timeouts (i.e. 600 seconds). As far as I can 
> remember this led to a pretty unusable system if I put high amounts of IO on 
> the EC volume.
> This system also runs a krbd volume which saturates the system with ~30-60% 
> iowait - this volume never had a problem.
>
> A commenter in https://tracker.ceph.com/issues/40822#change-141205 
> suggested that I reduce the rbd cache.
> What do you think about that?

Tests with a reduced rbd cache still fail, therefore I made tests with the rbd 
cache disabled:

- I disabled the rbd cache with "rbd cache = false" (snippet below)
- unmounted and unmapped the image
- mapped and mounted the image
- re-executed my test:
   find /srv_ec -type f -name "*.sql" -exec gzip -v {} \;
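
(For completeness, the cache was disabled on the client side roughly like 
this in ceph.conf - section name assumed:

  [client]
  rbd cache = false

and the image was remapped so that the setting takes effect.)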


It took several hours, but in the end I got the same error situation.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-22 Thread Marc Schöchlin
Hi Mike,

Am 22.07.19 um 16:48 schrieb Mike Christie:
> On 07/22/2019 06:00 AM, Marc Schöchlin wrote:
>>> With older kernels no timeout would be set for each command by default,
>>> so if you were not running that tool then you would not see the nbd
>>> disconnect+io_errors+xfs issue. You would just see slow IOs.
>>>
>>> With newer kernels, like 4.15, nbd.ko always sets a per command timeout
>>> even if you do not set it via a nbd ioctl/netlink command. By default
>>> the timeout is 30 seconds. After the timeout period then the kernel does
>>> that disconnect+IO_errors error handling which causes xfs to get errors.
>>>
>> Did I get you correctly: setting an unlimited timeout should prevent crashes 
>> on kernel 4.15?
> It looks like with newer kernels there is no way to turn it off.
>
> You can set it really high. There is no max check and so it depends on
> various calculations and what some C types can hold and how your kernel
> is compiled. You should be able to set the timer to an hour.

Okay, I already experimented with high timeouts (i.e. 600 seconds). As far as I can 
remember this led to a pretty unusable system if I put high amounts of IO on 
the EC volume.
This system also runs a krbd volume which saturates the system with ~30-60% 
iowait - this volume never had a problem.

A commenter in https://tracker.ceph.com/issues/40822#change-141205 
suggested that I reduce the rbd cache.
What do you think about that?

>
>> For testing purposes I set the timeout to unlimited ("nbd_set_ioctl 
>> /dev/nbd0 0", on an already mounted device).
>> I re-executed the problem procedure and discovered that the 
>> compression procedure does not crash at the same file, but crashes 30 seconds 
>> later with the same crash behavior.
>>
> 0 will cause the default timeout of 30 secs to be used.

Okay, then the usage description of 
https://github.com/OnApp/nbd-kernel_mod/blob/master/nbd_set_timeout.c does not 
seem to be correct :-)
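
(So to actually raise the limit, a non-zero value has to be passed - roughly, 
with the tool from that repository and the argument order used above:

  gcc -o nbd_set_timeout nbd_set_timeout.c
  ./nbd_set_timeout /dev/nbd0 3600    # one hour per-command timeout instead of the 30s default
)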

Regards
Marc

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-22 Thread Marc Schöchlin
Hi Mike,

Am 22.07.19 um 17:01 schrieb Mike Christie:
> On 07/19/2019 02:42 AM, Marc Schöchlin wrote:
>> We have ~500 heavy load rbd-nbd devices in our xen cluster (rbd-nbd 12.2.5, 
>> kernel 4.4.0+10, centos clone) and ~20 high load krbd devices (kernel 
>> 4.15.0-45, ubuntu 16.04) - we never experienced problems like this.
> For this setup, do you have 257 or more rbd-nbd devices running on a
> single system?
No, these rbd-nbds are distributed over more than a dozen Xen dom-0 systems 
on our XenServers.
> If so then you are hitting another bug where newer kernels only support
> 256 devices. It looks like a regression was added when mq and netlink
> support was added upstream. You can create more then 256 devices, but
> some devices will not be able to execute any IO. Commands sent to the
> rbd-nbd device are going to always timeout and you will see the errors
> in your log.
>
> I am testing some patches for that right now.

From my point of view there is no limitation besides IO, from the Ceph 
cluster's perspective.

Regards
Marc

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-22 Thread Marc Schöchlin
Hello Mike,

i attached inline comments.

Am 19.07.19 um 22:20 schrieb Mike Christie:
>
>> We have ~500 heavy load rbd-nbd devices in our xen cluster (rbd-nbd 12.2.5, 
>> kernel 4.4.0+10, centos clone) and ~20 high load krbd devices (kernel 
>> 4.15.0-45, ubuntu 16.04) - we never experienced problems like this.
>> We only experience problems like this with rbd-nbd > 12.2.5 on ubuntu 16.04 
>> (kernel 4.15) or ubuntu 18.04 (kernel 4.15) with erasure encoding or without.
>>
> Are you only using the nbd_set_timeout tool for this newer kernel combo
> to try and workaround the disconnect+io_errors problem in newer kernels,
> or did you use that tool to set a timeout with older kernels? I am just
> trying to clarify the problem, because the kernel changed behavior and I
> am not sure if your issue is the very slow IO or that the kernel now
> escalates its error handler by default.
I only use nbd_set_timeout with the 4.15 kernels on Ubuntu 16.04 and 18.04, 
because we experienced problems with "fstrim" activities a few weeks ago.
Adding timeouts of 60 seconds seemed to help, but did not solve the problem 
completely.

The problem situation described in my request is a different situation, but 
seems to have the same root cause.

Not using the nbd_set_timeout tool results in the same but more prominent 
problem situations :-)
(tested by unloading the nbd module and re-executing the test)
>
> With older kernels no timeout would be set for each command by default,
> so if you were not running that tool then you would not see the nbd
> disconnect+io_errors+xfs issue. You would just see slow IOs.
>
> With newer kernels, like 4.15, nbd.ko always sets a per command timeout
> even if you do not set it via a nbd ioctl/netlink command. By default
> the timeout is 30 seconds. After the timeout period then the kernel does
> that disconnect+IO_errors error handling which causes xfs to get errors.
>
Did I get you correctly: setting an unlimited timeout should prevent crashes on 
kernel 4.15?

For testing purposes I set the timeout to unlimited ("nbd_set_ioctl /dev/nbd0 
0", on an already mounted device).
I re-executed the problem procedure and discovered that the 
compression procedure does not crash at the same file, but crashes 30 seconds 
later with the same crash behavior.

Regards
Marc


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Future of Filestore?

2019-07-22 Thread Marc Roos


 >> Reverting back to filestore is quite a lot of work and time again. 
 >> Maybe see first if with some tuning of the vms you can get better 
results?
 >
 >None of the VMs are particularly disk-intensive.  There's two users 
accessing the system over a WiFi network for email, and some HTTP/SMTP 
traffic coming in via an ADSL2 Internet connection.
 >
 >If Bluestore can't manage this, then I'd consider it totally worthless 
in any enterprise installation -- so clearly something is wrong.


I have a cluster mainly intended for backups to cephfs: 4 nodes, sata 
disks and mostly 5400rpm. Because the cluster is doing nothing, I 
decided to put VMs on it. I am running 15 VMs without problems on 
the hdd pool and am going to move more to it. One of them is a macOS 
machine; a fio test I once ran in it gave me 917 IOPS at 4k random 
reads. (Technically not possible I would say; I have mostly default 
configurations in libvirt.)


 >
 >> What you also can try is for io intensive vm's add an ssd pool?
 >
 >How well does that work in a cluster with 0 SSD-based OSDs?
 >
 >For 3 of the nodes, the cases I'm using for the servers can fit two 
2.5"
 >drives.  I have one 120GB SSD for the OS, that leaves one space spare 
for the OSD.  


I think this could be your bottleneck. I have 31 drives, so the load is 
spread across 31 (hopefully). If you have only 3 drives you have 
3x60 IOPS to share amongst your VMs. 
I am getting the impression that Ceph development is not really 
interested in setups quite different from the advised standards. I once 
made an attempt to get things working better for 1Gb adapters[0].

 >
 >I since added two new nodes, which are Intel NUCs with m.2 SATA SSDs 
for the OS and like the other nodes have a single 2.5" drive bay.
 >
 >This is being done as a hobby and a learning exercise I might add -- 
so while I have spent a lot of money on this, the funds I have to throw 
at this are not infinite.


Same here ;) 


 >
 >> I moved
 >> some exchange servers on them. Tuned down the logging, because that 
is 
 >> writing constantly to disk.
 >> With such setup you are at least secured for the future.
 >
 >The VMs I have are mostly Linux (Gentoo, some Debian/Ubuntu), with a 
few OpenBSD VMs for things like routers between virtual networks.
 >

[0] https://www.mail-archive.com/ceph-users@lists.ceph.com/msg35474.html
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Future of Filestore?

2019-07-20 Thread Marc Roos
 

Reverting back to Filestore is quite a lot of work and time again. Maybe 
first see if you can get better results with some tuning of the VMs? 
What you can also try is adding an SSD pool for IO-intensive VMs. I moved 
some Exchange servers onto one and tuned down the logging, because that 
writes constantly to disk. 
With such a setup you are at least prepared for the future.






-Original Message-
From: Stuart Longland [mailto:stua...@longlandclan.id.au] 
Subject: Re: [ceph-users] Future of Filestore?


>  
> Maybe a bit of topic, just curious what speeds did you get previously? 

> Depending on how you test your native drive of 5400rpm, the 
> performance could be similar. 4k random read of my 7200rpm/5400 rpm 
> results in ~60iops at 260kB/s.

Well, to be honest I never formally tested the performance prior to the 
move to Bluestore.  It was working "acceptably" for my needs, thus I 
never had a reason to test it.

It was never a speed demon, but it did well enough for my needs.  Had 
Filestore on BTRFS remained an option in Ceph v12, I'd have stayed that 
way.

> I also wonder why filestore could be that much faster, is this not 
> something else? Maybe some dangerous caching method was on?

My understanding is that Bluestore does not benefit from the Linux 
kernel filesystem cache.  On paper, Bluestore *should* be faster, but 
it's hard to know for sure.

Maybe I should try migrating back to Filestore and see if that improves 
things?
--
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Future of Filestore?

2019-07-19 Thread Marc Roos
 
Maybe a bit off topic, but just curious: what speeds did you get previously? 
Depending on how you test your native 5400rpm drive, the performance 
could be similar. A 4k random read on my 7200rpm/5400rpm drives results in 
~60 IOPS at 260kB/s.
I also wonder why Filestore could be that much faster - is this not 
something else? Maybe some dangerous caching method was on?



-Original Message-
From: Stuart Longland [mailto:stua...@longlandclan.id.au] 
Sent: vrijdag 19 juli 2019 12:22
To: ceph-users
Subject: [ceph-users] Future of Filestore?

Hi all,

Earlier this year, I did a migration from Ceph 10 to 12.  Previously, I 
was happily running Ceph v10 on Filestore with BTRFS, and getting 
reasonable performance.

Moving to Ceph v12 necessitated a migration away from this set-up, and 
reading the documentation, Bluestore seemed to be "the way", so a hasty 
migration was performed and now my then 3-node cluster moved to 
Bluestore.  I've since added two new nodes to that cluster and replaced 
the disks in all systems, so I have 5 WD20SPZX-00Us storing my data.

I'm now getting about 5MB/sec I/O speeds in my VMs.

I'm contemplating whether I migrate back to using Filestore (on XFS this 
time, since BTRFS appears to be a rude word despite Ceph v10 docs 
suggesting it as a good option), but I'm not sure what the road map is 
for supporting Filestore long-term.

Is Filestore likely to have long term support for the next few years or 
should I persevere with tuning Bluestore to get something that won't be 
outperformed by an early 90s PIO mode 0 IDE HDD?
--
Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind...
  ...it's backed up on a tape somewhere.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] reproducable rbd-nbd crashes

2019-07-19 Thread Marc Schöchlin
Hello Jason,

Am 18.07.19 um 20:10 schrieb Jason Dillaman:
> On Thu, Jul 18, 2019 at 1:47 PM Marc Schöchlin  wrote:
>> Hello cephers,
>>
>> rbd-nbd crashes in a reproducible way here.
> I don't see a crash report in the log below. Is it really crashing or
> is it shutting down? If it is crashing and it's reproducable, can you
> install the debuginfo packages, attach gdb, and get a full backtrace
> of the crash?

I do not get a crash report from rbd-nbd.
It seems that "rbd-nbd" just terminates, which crashes the XFS filesystem because 
the nbd device is not available anymore.
("rbd nbd ls" shows no mapped device anymore)

>
> It seems like your cluster cannot keep up w/ the load and the nbd
> kernel driver is timing out the IO and shutting down. There is a
> "--timeout" option on "rbd-nbd" that you can use to increase the
> kernel IO timeout for nbd.
>
I also have a 36TB XFS (non-EC) volume on this virtual system, mapped by krbd, 
which is under really heavy read/write usage.
I never experienced problems like this on that volume with similar usage 
patterns.

The volume which is involved in the problem only handles a really low load, and 
I was able to reproduce the error situation by using the simple "find . -type f 
-name "*.sql" -exec ionice -c3 nice -n 20 gzip -v {} \;" command.
I copied and read ~1.5 TB of data to this volume without a problem - it seems 
that the gzip command provokes an IO pattern which leads to the error situation.

As described, I use a Luminous "12.2.11" client which does not support that 
"--timeout" option (btw, a backport would be nice).
Our Ceph system runs with a heavy write load, therefore we already set a 60 
second timeout using the following code:
(https://github.com/OnApp/nbd-kernel_mod/blob/master/nbd_set_timeout.c)
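
(On releases where rbd-nbd does have that option, I assume the equivalent 
would be something like

  rbd-nbd map --timeout 60 <pool>/<image>

but with 12.2.11 we have to fall back to the ioctl tool above.)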

We have ~500 heavy-load rbd-nbd devices in our Xen cluster (rbd-nbd 12.2.5, 
kernel 4.4.0+10, CentOS clone) and ~20 high-load krbd devices (kernel 
4.15.0-45, Ubuntu 16.04) - we never experienced problems like this.
We only experience problems like this with rbd-nbd > 12.2.5 on Ubuntu 16.04 
(kernel 4.15) or Ubuntu 18.04 (kernel 4.15), with or without erasure coding.

Regards
Marc


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] reproducable rbd-nbd crashes

2019-07-18 Thread Marc Schöchlin
ass=hdd \
 crush-failure-domain=host
 
    ceph osd pool create rbd_hdd_ec 64 64 erasure archive_profile
    ceph osd pool set rbd_hdd_ec  allow_ec_overwrites true
    ceph osd pool application enable rbd_hdd_ec rbd

What can I do?
I never experienced something like this with krbd or rbd-nbd (12.2.5 in my 
Xen hypervisor, https://github.com/vico-research-and-consulting/RBDSR).

Regards
Marc


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs snapshot scripting questions

2019-07-17 Thread Marc Roos

Hmmm, ok ok - test it first, I can't remember if it is finished. It also 
checks whether it is useful to create a snapshot, by checking the size of 
the directory.

[@ cron.daily]# cat backup-archive-mail.sh
#!/bin/bash


cd /home/

for account in `ls -c1 /home/mail-archive/ | sort`
do
  /usr/local/sbin/backup-snap.sh mail-archive/$account 7
  /usr/local/sbin/backup-snap.sh archiveindex/$account 7
done 



[@ cron.daily]# cat /usr/local/sbin/backup-snap.sh
#!/bin/bash
#
# usage: ./backup-snap.sh dirname ret
#
# dont forget to change
# START_HOURS_RANGE=0-22
#
# command to backup:


# static variables
# BACKUPDIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
BACKUPDIR="$PWD"
BACKUPPREFIX="snap"
CEPHSDIR=".snap"
ARCHIVEDAYS=30
DOW=`date +%u`
DOM=`date +%-d`

if [ $# -lt 2 ]
then
 echo ""
 echo "usage: $0 dirname ret-days "
 echo ""
 echo ""
exit;
fi

# used for hardlinking with rsync to previous version
PREVDAY=
# current version backup location
VB=
# argument variables
ARGS=$*
SNAPBAK=$1
BACKUPDAYS=$2
ARCHIVE=0

function setpreviousday {

PREVDAY=`expr $DOM - 1`

if [ $PREVDAY -eq 0 ]
then
  PREVDAY=$BACKUPDAYS
fi
}

function getdirsize {
local path="$1"
local size=0
local rval=0

size=$(getfattr --only-values --absolute-names -d -m ceph.dir.rbytes 
"$path" 2> /dev/null)
if [ $? -eq 0 ]
then
  rval=$size
fi

echo -n $rval
}

function resetdom {
# fold the day-of-month counter into the range 1..BACKUPDAYS so only XX days of snapshots are kept

if [ $DOM -gt $BACKUPDAYS ]
then
  DOM=`expr $DOM - $BACKUPDAYS`
  resetdom
fi

}

function removewhitespace {
NEW=`echo -n "$1" | sed -e 's/\s//g' `
echo -n "$NEW"
}


function createsnapshot () {
  SNAPBAK="$1"
  SNNAME="$2"

  mkdir "$BACKUPDIR/$SNAPBAK/$CEPHSDIR/$SNNAME"
}

function snapshotremove () {
  SNAPBAK="$1"
  SNNAME="$2"

  #echo "removing snapshot $SNAPBAK"
  rmdir "$BACKUPDIR/$SNAPBAK/$CEPHSDIR/$SNNAME"
  if [ $? -ne 0 ]
  then
echo "error removing old snapshot $SNAPBAK"
exit;
  fi
  sleep 2
}

function snapshotexists () {
# returns 0 if exists
  rval=1
  SNAPBAK="$1"
  SNNAME="$2"

  FOUND="$BACKUPDIR/$SNAPBAK/$CEPHSDIR/$SNNAME"
  if [ -d $FOUND ]
  then
rval=0
  fi

  echo -n $rval
}


# script arguments
#setargs

umask 0027


# reset day of month in scale of 1 to BACKUPDAYS
resetdom
VB=$DOM

# calculate previous day number
setpreviousday

# do server backups
SNNAME=$BACKUPPREFIX"-"$VB


# do only snapshot if there is data
if [ $(getdirsize "$SNAPBAK") -ne 0 ]
then
  if [ $(snapshotexists $SNAPBAK $SNNAME) -eq 0 ]
  then
snapshotremove $SNAPBAK $SNNAME
  fi

  createsnapshot $SNAPBAK $SNNAME
fi





-Original Message-
From: Robert Ruge [mailto:robert.r...@deakin.edu.au] 
Sent: woensdag 17 juli 2019 2:44
To: ceph-users@lists.ceph.com
Subject: [ceph-users] cephfs snapshot scripting questions

Greetings.

 

Before I reinvent the wheel has anyone written a script to maintain X 
number of snapshots on a cephfs file system that can be run through 
cron?

I am aware of the cephfs-snap code but just wondering if there are any 
other options out there.

 

On a related note which of these options would be better?

1.   Maintain one .snap directory at the root of the cephfs tree - 
/ceph/.snap

2.   Have a .snap directory for every second level directory 
/ceph/user/.snap

 

I am thinking the later might make it more obvious for the users to do 
their own restores but wondering what the resource implications of 
either approach might be.

 

The documentation indicates that I should use kernel >= 4.17 for cephfs. 
 I’m currently using Mimic 13.2.6 on Ubuntu 18.04 with kernel version 
4.15.0. What issues might I see with this combination? I’m hesitant to 
upgrade to an unsupported kernel on Ubuntu but wondering if I’m going 
to be playing Russian Roulette with this combo. 

 

Are there any gotcha’s I should be aware of before plunging into full 
blown cephfs snapshotting?

 

Regards and thanks.

Robert Ruge

 


Important Notice: The contents of this email are intended solely for the 
named addressee and are confidential; any unauthorised use, reproduction 
or storage of the contents is expressly prohibited. If you have received 
this email in error, please delete it and any attachments immediately 
and advise the sender by return email or telephone.

Deakin University does not warrant that this email and any attachments 
are error or virus free. 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New best practices for osds???

2019-07-16 Thread Marc Roos
 

With the SAS expander you are putting more drives on 'one port'. Just 
make sure you do not create a bottleneck there by adding too many drives; 
I guess this depends on the speed of the drives. Then you should be fine, 
no?




-Original Message-
From: Stolte, Felix [mailto:f.sto...@fz-juelich.de] 
Sent: dinsdag 16 juli 2019 17:42
To: ceph-users
Subject: [ceph-users] New best practices for osds???

Hi guys,

our Ceph cluster is performing way below what it could, based on the 
disks we are using. We could narrow it down to the storage controller 
(LSI SAS3008 HBA) in combination with an SAS expander. Yesterday we had 
a meeting with our hardware reseller and sales representatives of the 
hardware manufacturer to resolve the issue.

They told us that "best practice" for Ceph would be to deploy each disk as 
a single-disk RAID 0 using a RAID controller with a big 
writeback cache. 

Since this "best practice" is new to me, I would like to hear your 
opinion on this topic.

Regards Felix


-

-
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 
Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), 
Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. 
Dr. Sebastian M. Schmidt

-

-
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-16 Thread Marc Roos
 
Can you give me a little pointer on how and where to enable the verbose 
logging? I am really interested in discovering why I am getting the large 
osdmap message. Maybe it is because of the snapshot being created?
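
My best guess so far would be the kernel's dynamic debug facility - something 
like the following, assuming the kernel was built with CONFIG_DYNAMIC_DEBUG 
(please correct me if that is the wrong place):

  echo 'module libceph +p' > /sys/kernel/debug/dynamic_debug/control
  echo 'module ceph +p'    > /sys/kernel/debug/dynamic_debug/control
  # reproduce the problem, watch dmesg -wT, then switch it off again with -p

That is probably very chatty, so only for a short reproduction window.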



-Original Message-
From: Ilya Dryomov [mailto:idryo...@gmail.com] 
Cc: paul.emmerich; ceph-users
Subject: Re: [ceph-users] "session established", "io error", "session 
lost, hunting for new mon" solution/fix

On Fri, Jul 12, 2019 at 5:38 PM Marc Roos  
wrote:
>
>
> Thanks Ilya for explaining. Am I correct to understand from the 
> link[0] mentioned in the issue, that because eg. I have an unhealthy 
> state for some time (1 pg on a insignificant pool) I have larger 
> osdmaps, triggering this issue? Or is just random bad luck? (Just a 
> bit curious why I have this issue)
>
> [0]
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg51522.html

I'm not sure.  I wouldn't expect one unhealthy PG to trigger a large 
osdmap message.  Only verbose logs can tell.

Thanks,

Ilya


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Returning to the performance in a small cluster topic

2019-07-15 Thread Marc Roos
 

Isn't that why you are supposed to test up front, so you do not have shocking 
surprises? You can also find some performance references in the mailing list 
archives. 
I think it would be good to publish some performance results on the 
ceph.com website. It can't be too difficult to put some default scenarios, 
the hardware used and the performance there in some nice graphs. I take it some 
here would be willing to contribute test results of their 
test/production clusters. This way new ceph'ers would know what to expect 
from similar setups.



-Original Message-
From: Jordan Share [mailto:readm...@krotus.com] 
Sent: maandag 15 juli 2019 20:16
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Returning to the performance in a small 
cluster topic

We found shockingly bad committed IOPS/latencies on ceph.

We could get roughly 20-30 IOPS when running this fio invocation from 
within a vm:
fio --name=seqwrite --rw=write --direct=1 --ioengine=libaio --bs=32k
--numjobs=1 --size=2G --runtime=60 --group_reporting --fsync=1

For non-committed IO, we get about 2800 iops, with this invocation:
fio --name=seqwrite --rw=write --direct=1 --ioengine=libaio --bs=32k
--numjobs=1 --size=2G --runtime=150 --group_reporting

So, maybe, if PostgreSQL has a lot of committed IO needs, you might not 
have the performance you're expecting.

You could try running your fio tests with "--fsync=1" and see if those 
numbers (which I'd expect to be very low) would be in line with your 
PostgreSQL performance.

Jordan


On 7/15/2019 7:08 AM, Drobyshevskiy, Vladimir wrote:
> Dear colleagues,
> 
>    I would like to ask you for help with a performance problem on a 
> site backed with ceph storage backend. Cluster details below.
> 
>    I've got a big problem with PostgreSQL performance. It runs inside 
> a VM with virtio-scsi ceph rbd image. And I see constant ~100% disk 
> load with up to hundreds milliseconds latencies (via atop) even when 
> pg_top shows 10-20 tps. All other resources are almost untouched - 
> there is a lot of memory and free CPU cores, DB fits memory but still 
> has performance issues.
> 
>    The cluster itself:
>    nautilus
>    6 nodes, 7 SSD with 2 OSDs per SSD (14 OSDs in overall).
>    Each node: 2x Intel Xeon E5-2665 v1 (governor = performance, 
> powersaving disabled), 64GB RAM, Samsung SM863 1.92TB SSD, QDR 
Infiniband.
> 
>    I've made fio benchmarking with three type of measures:
>    a VM with virtio-scsi driver,
>    baremetal host with mounted rbd image
>    and the same baremetal host with mounted lvm partition on SM863 SSD 

> drive.
> 
>    I've set bs=8k (as Postgres writes 8k blocks) and tried 1 and 8 
jobs.
> 
>    Here are some results: https://pastebin.com/TFUg5fqA
>    Drives load on the OSD hosts are very low, just a few percent.
> 
>    Here is my ceph config: https://pastebin.com/X5ZwaUrF
> 
>    Numbers don't look very good from my point of view but they are 
> also not really bad (are they?). But I don't really know the next 
> direction I can go to solve the problem with PostgreSQL.
> 
>    I've tried to make an RAID0 with mdraid and 2 virtual drives but 
> haven't noticed any difference.
> 
>    Could you please tell me:
>    Are these performance numbers good or bad according to the 
hardware?
>    Is it possible to tune anything more? May be you can point me to 
> docs or other papers?
>    Does any special VM tuning for the PostgreSQL\ceph cooperation 
exist?
>    Thank you in advance!
> 
> --
> Best regards,
> Vladimir
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-12 Thread Marc Roos
 
Thanks Ilya for explaining. Am I correct to understand from the link[0] 
mentioned in the issue that, because e.g. I have had an unhealthy state for 
some time (1 PG on an insignificant pool), I have larger osdmaps, 
triggering this issue? Or is it just random bad luck? (Just a bit curious 
why I have this issue.)

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg51522.html

-Original Message-
Subject: Re: [ceph-users] "session established", "io error", "session 
lost, hunting for new mon" solution/fix

On Fri, Jul 12, 2019 at 12:33 PM Paul Emmerich  
wrote:
>
>
>
> On Thu, Jul 11, 2019 at 11:36 PM Marc Roos  
wrote:
>> Anyone know why I would get these? Is it not strange to get them in a 

>> 'standard' setup?
>
> you are probably running on an ancient kernel. this bug has been fixed 
a long time ago.

This is not a kernel bug:

http://tracker.ceph.com/issues/38040

It is possible to hit with few OSDs too.  The actual problem is the size 
of the osdmap message which can contain multiple full osdmaps, not the 
number of OSDs.  The size of a full osdmap is proportional to the number 
of OSDs but it's not the only way to get a big osdmap message.

As you have experienced, these settings used to be expressed in the 
number of osdmaps and our defaults were too high for a stream of full 
osdmaps (as opposed to incrementals).  It is now expressed in bytes, the 
patch should be in 12.2.13.
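
(A rough way to check how big a single full osdmap is on your cluster:

  ceph osd getmap -o /tmp/osdmap
  ls -lh /tmp/osdmap

One message can carry several of these, which is what the new byte-based 
limit caps.)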

>
>> -Original Message-
>> Subject: [ceph-users] "session established", "io error", "session 
>> lost, hunting for new mon" solution/fix
>>
>>
>> I have on a cephfs client again (luminous cluster, centos7, only 32 
>> osds!). Wanted to share the 'fix'
>>
>> [Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
>> established [Thu Jul 11 12:16:09 2019] libceph: mon0 
>> 192.168.10.111:6789 io error [Thu Jul 11 12:16:09 2019] libceph: mon0 

>> 192.168.10.111:6789 session lost, hunting for new mon [Thu Jul 11 
>> 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session established 
>> [Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 io error 

>> [Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session 
>> lost, hunting for new mon [Thu Jul 11 12:16:09 2019] libceph: mon0 
>> 192.168.10.111:6789 session established [Thu Jul 11 12:16:09 2019] 
>> libceph: mon0 192.168.10.111:6789 io error [Thu Jul 11 12:16:09 2019] 

>> libceph: mon0 192.168.10.111:6789 session lost, hunting for new mon 
>> [Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session 
>> established [Thu Jul 11 12:16:09 2019] libceph: mon1 
>> 192.168.10.112:6789 io error [Thu Jul 11 12:16:09 2019] libceph: mon1 

>> 192.168.10.112:6789 session lost, hunting for new mon
>>
>> 1) I blocked client access to the monitors with iptables -I INPUT -p 
>> tcp -s 192.168.10.43 --dport 6789 -j REJECT Resulting in
>>
>> [Thu Jul 11 12:34:16 2019] libceph: mon1 192.168.10.112:6789 socket 
>> closed (con state CONNECTING) [Thu Jul 11 12:34:18 2019] libceph: 
>> mon1 192.168.10.112:6789 socket closed (con state CONNECTING) [Thu 
>> Jul 11 12:34:22 2019] libceph: mon1 192.168.10.112:6789 socket closed 

>> (con state CONNECTING) [Thu Jul 11 12:34:26 2019] libceph: mon2 
>> 192.168.10.113:6789 socket closed (con state CONNECTING) [Thu Jul 11 
>> 12:34:27 2019] libceph: mon2 192.168.10.113:6789 socket closed (con 
>> state CONNECTING) [Thu Jul 11 12:34:28 2019] libceph: mon2 
>> 192.168.10.113:6789 socket closed (con state CONNECTING) [Thu Jul 11 
>> 12:34:30 2019] libceph: mon1 192.168.10.112:6789 socket closed (con 
>> state CONNECTING) [Thu Jul 11 12:34:30 2019] libceph: mon2 
>> 192.168.10.113:6789 socket closed (con state CONNECTING) [Thu Jul 11 
>> 12:34:34 2019] libceph: mon2 192.168.10.113:6789 socket closed (con 
>> state CONNECTING) [Thu Jul 11 12:34:42 2019] libceph: mon2 
>> 192.168.10.113:6789 socket closed (con state CONNECTING) [Thu Jul 11 
>> 12:34:44 2019] libceph: mon0 192.168.10.111:6789 socket closed (con 
>> state CONNECTING) [Thu Jul 11 12:34:45 2019] libceph: mon0 
>> 192.168.10.111:6789 socket closed (con state CONNECTING) [Thu Jul 11 
>> 12:34:46 2019] libceph: mon0 192.168.10.111:6789 socket closed (con 
>> state CONNECTING)
>>
>> 2) I applied the suggested changes to the osd map message max, 
>> mentioned
>>
>> in early threads[0]
>> ceph tell osd.* injectargs '--osd_map_message_max=10'
>> ceph tell mon.* injectargs '--osd_map_message_max=10'
>> [@c01 ~]# ceph daemon osd.0 config show|grep message_max
>> "osd_map_message_max": "10",
>

Re: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-12 Thread Marc Roos
 
Paul, this should have been / has been backported to this kernel, no?


-Original Message-
From: Paul Emmerich [mailto:paul.emmer...@croit.io] 
Cc: ceph-users
Subject: Re: [ceph-users] "session established", "io error", "session 
lost, hunting for new mon" solution/fix

 

Anyone know why I would get these? Is it not strange to get them in 
a 
'standard' setup?



you are probably running on an ancient kernel. this bug has been fixed a 
long time ago.


Paul

 






-Original Message-
Subject: [ceph-users] "session established", "io error", "session 
lost, 
hunting for new mon" solution/fix


I have on a cephfs client again (luminous cluster, centos7, only 32 

osds!). Wanted to share the 'fix'

[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 
session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 
session 
lost, hunting for new mon

1) I blocked client access to the monitors with
iptables -I INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT
Resulting in 

[Thu Jul 11 12:34:16 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:18 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:22 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:26 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:27 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:28 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:34 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:42 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:44 2019] libceph: mon0 192.168.10.111:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:45 2019] libceph: mon0 192.168.10.111:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:46 2019] libceph: mon0 192.168.10.111:6789 socket 

closed (con state CONNECTING)

2) I applied the suggested changes to the osd map message max, 
mentioned 

in early threads[0]
ceph tell osd.* injectargs '--osd_map_message_max=10'
ceph tell mon.* injectargs '--osd_map_message_max=10'
[@c01 ~]# ceph daemon osd.0 config show|grep message_max
"osd_map_message_max": "10",
[@c01 ~]# ceph daemon mon.a config show|grep message_max
"osd_map_message_max": "10",

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg54419.htm
l
http://tracker.ceph.com/issues/38040

3) Allow access to a monitor with
iptables -D INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT

Getting 
[Thu Jul 11 12:39:26 2019] libceph: mon0 192.168.10.111:6789 
session 
established
[Thu Jul 11 12:39:26 2019] libceph: osd0 down
[Thu Jul 11 12:39:26 2019] libceph: osd0 up

Problems solved, in D state hung unmount was released. 

I am not sure if the prolonged disconnection to the monitors was 
the 
solution or the osd_map_message_max=10, or both. 





___
ceph-users mailing list

Re: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-12 Thread Marc Roos
 

 
Hi Paul, 

Thanks for your reply. I am running 3.10.0-957.12.2.el7.x86_64, which is 
from May 2019.



-Original Message-
From: Paul Emmerich [mailto:paul.emmer...@croit.io] 
Sent: vrijdag 12 juli 2019 12:34
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] "session established", "io error", "session 
lost, hunting for new mon" solution/fix



On Thu, Jul 11, 2019 at 11:36 PM Marc Roos  
wrote:


 

Anyone know why I would get these? Is it not strange to get them in 
a 
'standard' setup?



you are probably running on an ancient kernel. this bug has been fixed a 
long time ago.


Paul

 






-Original Message-
Subject: [ceph-users] "session established", "io error", "session 
lost, 
hunting for new mon" solution/fix


I have on a cephfs client again (luminous cluster, centos7, only 32 

osds!). Wanted to share the 'fix'

[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 
session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 
session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 
session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 io 
error
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 
session 
lost, hunting for new mon

1) I blocked client access to the monitors with
iptables -I INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT
Resulting in 

[Thu Jul 11 12:34:16 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:18 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:22 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:26 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:27 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:28 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon1 192.168.10.112:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:34 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:42 2019] libceph: mon2 192.168.10.113:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:44 2019] libceph: mon0 192.168.10.111:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:45 2019] libceph: mon0 192.168.10.111:6789 socket 

closed (con state CONNECTING)
[Thu Jul 11 12:34:46 2019] libceph: mon0 192.168.10.111:6789 socket 

closed (con state CONNECTING)

2) I applied the suggested changes to the osd map message max, 
mentioned 

in early threads[0]
ceph tell osd.* injectargs '--osd_map_message_max=10'
ceph tell mon.* injectargs '--osd_map_message_max=10'
[@c01 ~]# ceph daemon osd.0 config show|grep message_max
"osd_map_message_max": "10",
[@c01 ~]# ceph daemon mon.a config show|grep message_max
"osd_map_message_max": "10",

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg54419.htm
l
http://tracker.ceph.com/issues/38040

3) Allow access to a monitor with
iptables -D INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT

Getting 
[Thu Jul 11 12:39:26 2019] libceph: mon0 192.168.10.111:6789 
session 
established
[Thu Jul 11 12:39:26 2019] libceph: osd0 down
[Thu Jul 11 12:39:26 2019] libceph: osd0 up

Problems solved, in D state hung unmount was released. 

I am not sure if the prolo

Re: [ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-11 Thread Marc Roos
 

Anyone know why I would get these? Is it not strange to get them in a 
'standard' setup?





-Original Message-
Subject: [ceph-users] "session established", "io error", "session lost, 
hunting for new mon" solution/fix


I have on a cephfs client again (luminous cluster, centos7, only 32 
osds!). Wanted to share the 'fix'

[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon

1) I blocked client access to the monitors with
iptables -I INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT
Resulting in 

[Thu Jul 11 12:34:16 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:18 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:22 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:26 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:27 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:28 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:34 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:42 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:44 2019] libceph: mon0 192.168.10.111:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:45 2019] libceph: mon0 192.168.10.111:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:46 2019] libceph: mon0 192.168.10.111:6789 socket 
closed (con state CONNECTING)

2) I applied the suggested changes to the osd map message max, mentioned 

in early threads[0]
ceph tell osd.* injectargs '--osd_map_message_max=10'
ceph tell mon.* injectargs '--osd_map_message_max=10'
[@c01 ~]# ceph daemon osd.0 config show|grep message_max
"osd_map_message_max": "10",
[@c01 ~]# ceph daemon mon.a config show|grep message_max
"osd_map_message_max": "10",

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg54419.html
http://tracker.ceph.com/issues/38040

3) Allow access to a monitor with
iptables -D INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT

Getting 
[Thu Jul 11 12:39:26 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 12:39:26 2019] libceph: osd0 down
[Thu Jul 11 12:39:26 2019] libceph: osd0 up

Problems solved, in D state hung unmount was released. 

I am not sure if the prolonged disconnection to the monitors was the 
solution or the osd_map_message_max=10, or both. 





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Libceph clock drift or I guess kernel clock drift issue

2019-07-11 Thread Marc Roos


I noticed that dmesg -T gives an incorrect time; the messages have a 
time in the future compared to the system time. Not sure if this is a 
libceph issue or a kernel issue.

[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd22 192.168.10.111:6811 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
established


[@ ]# uptime
 10:39:17 up 50 days, 13:31,  2 users,  load average: 3.60, 3.02, 2.57


[@~]# uname -a
Linux c01 3.10.0-957.12.2.el7.x86_64 #1 SMP Tue May 14 21:24:32 UTC 2019 
x86_64 x86_64 x86_64 GNU/Linux
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] "session established", "io error", "session lost, hunting for new mon" solution/fix

2019-07-11 Thread Marc Roos


I have on a cephfs client again (luminous cluster, centos7, only 32 
osds!). Wanted to share the 'fix'

[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 12:16:09 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon

1) I blocked client access to the monitors with
iptables -I INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT
Resulting in 

[Thu Jul 11 12:34:16 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:18 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:22 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:26 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:27 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:28 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon1 192.168.10.112:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:30 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:34 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:42 2019] libceph: mon2 192.168.10.113:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:44 2019] libceph: mon0 192.168.10.111:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:45 2019] libceph: mon0 192.168.10.111:6789 socket 
closed (con state CONNECTING)
[Thu Jul 11 12:34:46 2019] libceph: mon0 192.168.10.111:6789 socket 
closed (con state CONNECTING)

2) I applied the suggested changes to the osd map message max, mentioned 
in early threads[0]
ceph tell osd.* injectargs '--osd_map_message_max=10'
ceph tell mon.* injectargs '--osd_map_message_max=10'
[@c01 ~]# ceph daemon osd.0 config show|grep message_max
"osd_map_message_max": "10",
[@c01 ~]# ceph daemon mon.a config show|grep message_max
"osd_map_message_max": "10",

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg54419.html
http://tracker.ceph.com/issues/38040

3) Allow access to a monitor with
iptables -D INPUT -p tcp -s 192.168.10.43 --dport 6789 -j REJECT

Getting 
[Thu Jul 11 12:39:26 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 12:39:26 2019] libceph: osd0 down
[Thu Jul 11 12:39:26 2019] libceph: osd0 up

Problems solved, in D state hung unmount was released. 

I am not sure if the prolonged disconnection to the monitors was the 
solution or the osd_map_message_max=10, or both. 





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] shutdown down all monitors

2019-07-11 Thread Marc Roos



Can I temporarily shut down all my monitors? This only affects new 
connections, right? Existing connections will keep running?
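
Concretely I am thinking of something like this on each monitor host 
(assuming a systemd based deployment):

  systemctl stop ceph-mon.target
  # ... do the maintenance ...
  systemctl start ceph-mon.target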



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous cephfs maybe not to stable as expected?

2019-07-11 Thread Marc Roos
 


I decided to restart osd.0; then the load of the cephfs client and on all OSD 
nodes dropped. After this I still have, on the first server:


[@~]# cat 
/sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client357431
0/osdc
REQUESTS 0 homeless 0
LINGER REQUESTS
BACKOFFS


[@~]# cat 
/sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client358422
4/osdc
REQUESTS 2 homeless 0
317841  osd020.d6ec44c1 20.1[0,28,5]/0  [0,28,5]/0  
e65040  10001b44a70.0x40001c102023  read
317853  osd020.5956d31b 20.1b   [0,5,10]/0  [0,5,10]/0  
e65040  10001ad8962.0x40001c40731   read
LINGER REQUESTS
BACKOFFS

And dmesg -T keeps giving me these (again with wrong timestamps)

[Thu Jul 11 11:23:21 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 11:23:21 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 11:23:21 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 11:23:21 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 11:23:21 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 11:23:21 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 11:23:21 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 11:23:21 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 11:23:21 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 11:23:21 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 11:23:21 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 11:23:21 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 11:23:21 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 11:23:21 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 11:23:21 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon

What to do now? Restarting the monitor did not help.


-Original Message-
Subject: Re: [ceph-users] Luminous cephfs maybe not to stable as 
expected?

 

Forgot to add these

[@ ~]# cat
/sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client357431
0/osdc
REQUESTS 0 homeless 0
LINGER REQUESTS
BACKOFFS

[@~]# cat
/sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client358422
4/osdc
REQUESTS 38 homeless 0
317841  osd020.d6ec44c1 20.1[0,28,5]/0  [0,28,5]/0  
e65040  10001b44a70.0x40001c101139  read
317853  osd020.5956d31b 20.1b   [0,5,10]/0  [0,5,10]/0  
e65040  10001ad8962.0x40001c39847   read
317835  osd320.ede889de 20.1e   [3,12,27]/3 [3,12,27]/3 
e65040  10001ad80f6.0x40001c87758   read
317838  osd320.7b730a4e 20.e[3,31,9]/3  [3,31,9]/3  
e65040  10001ad89d8.0x40001c83444   read
317844  osd320.feead84c 20.c[3,13,18]/3 [3,13,18]/3 
e65040  10001ad8733.0x40001c77267   read
317852  osd320.bd2658e  20.e[3,31,9]/3  [3,31,9]/3  
e65040  10001ad7e00.0x40001c39331   read
317830  osd420.922e6d04 20.4[4,16,27]/4 [4,16,27]/4 
e65040  10001ad80f2.0x40001c86326   read
317837  osd420.fe93d4ab 20.2b   [4,14,25]/4 [4,14,25]/4 
e65040  10001ad80fb.0x40001c78951   read
317839  osd420.d7af926b 20.2b   [4,14,25]/4 [4,14,25]/4 
e65040  10001ad80ee.0x40001c77556   read
317849  osd520.5fcb95c5 20.5[5,18,29]/5 [5,18,29]/5 
e65040  10001ad7f75.0x40001c61147   read
317857  osd520.28764e9a 20.1a   [5,7,28]/5  [5,7,28]/5  
e65040  10001ad8a10.0x40001c30369   read
317859  osd520.7bb79985 20.5[5,18,29]/5 [5,18,29]/5 
e65040  10001ad7fe8.0x40001c27942   read
317836  osd820.e7bf5cf4 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad7d79.0x40001c133699  read
317842  osd820.abbb9df4 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001d5903f.0x40001c125308  read
317850  osd820.ecd0034  20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad89b2.0x40001c68348   read
317854  osd820.cef50134 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad8728.0x40001c57431   read
317861  osd820.3e859bb4 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad8108.0x40001c50642   read
317847  osd920.fc9e9f43 20.3[9,29,17]/9 [9,29,17]/9 
e65040  10001ad8101.0x40001c88464   read
317848  osd920.d32b6ac3 20.3[9,29,17]/9 [9,29,17]/9 
e65040  10001ad8100.0x40001c85929   read
317862  osd11   

Re: [ceph-users] Luminous cephfs maybe not so stable as expected?

2019-07-11 Thread Marc Roos
 

Forgot to add these

[@ ~]# cat 
/sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client357431
0/osdc
REQUESTS 0 homeless 0
LINGER REQUESTS
BACKOFFS

[@~]# cat 
/sys/kernel/debug/ceph/0f1701f5-453a-4a3b-928d-f652a2bbbcb0.client358422
4/osdc
REQUESTS 38 homeless 0
317841  osd020.d6ec44c1 20.1[0,28,5]/0  [0,28,5]/0  
e65040  10001b44a70.0x40001c101139  read
317853  osd020.5956d31b 20.1b   [0,5,10]/0  [0,5,10]/0  
e65040  10001ad8962.0x40001c39847   read
317835  osd320.ede889de 20.1e   [3,12,27]/3 [3,12,27]/3 
e65040  10001ad80f6.0x40001c87758   read
317838  osd320.7b730a4e 20.e[3,31,9]/3  [3,31,9]/3  
e65040  10001ad89d8.0x40001c83444   read
317844  osd320.feead84c 20.c[3,13,18]/3 [3,13,18]/3 
e65040  10001ad8733.0x40001c77267   read
317852  osd320.bd2658e  20.e[3,31,9]/3  [3,31,9]/3  
e65040  10001ad7e00.0x40001c39331   read
317830  osd420.922e6d04 20.4[4,16,27]/4 [4,16,27]/4 
e65040  10001ad80f2.0x40001c86326   read
317837  osd420.fe93d4ab 20.2b   [4,14,25]/4 [4,14,25]/4 
e65040  10001ad80fb.0x40001c78951   read
317839  osd420.d7af926b 20.2b   [4,14,25]/4 [4,14,25]/4 
e65040  10001ad80ee.0x40001c77556   read
317849  osd520.5fcb95c5 20.5[5,18,29]/5 [5,18,29]/5 
e65040  10001ad7f75.0x40001c61147   read
317857  osd520.28764e9a 20.1a   [5,7,28]/5  [5,7,28]/5  
e65040  10001ad8a10.0x40001c30369   read
317859  osd520.7bb79985 20.5[5,18,29]/5 [5,18,29]/5 
e65040  10001ad7fe8.0x40001c27942   read
317836  osd820.e7bf5cf4 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad7d79.0x40001c133699  read
317842  osd820.abbb9df4 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001d5903f.0x40001c125308  read
317850  osd820.ecd0034  20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad89b2.0x40001c68348   read
317854  osd820.cef50134 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad8728.0x40001c57431   read
317861  osd820.3e859bb4 20.34   [8,5,10]/8  [8,5,10]/8  
e65040  10001ad8108.0x40001c50642   read
317847  osd920.fc9e9f43 20.3[9,29,17]/9 [9,29,17]/9 
e65040  10001ad8101.0x40001c88464   read
317848  osd920.d32b6ac3 20.3[9,29,17]/9 [9,29,17]/9 
e65040  10001ad8100.0x40001c85929   read
317862  osd11   20.ee6cc689 20.9[11,0,12]/11[11,0,12]/11
e65040  10001ad7d64.0x40001c40266   read
317843  osd12   20.a801f0e9 20.29   [12,26,8]/12[12,26,8]/12
e65040  10001ad7f07.0x40001c86610   read
317851  osd12   20.8bb48de9 20.29   [12,26,8]/12[12,26,8]/12
e65040  10001ad7e4f.0x40001c46746   read
317860  osd12   20.47815f36 20.36   [12,0,28]/12[12,0,28]/12
e65040  10001ad8035.0x40001c35249   read
317831  osd15   20.9e3acb53 20.13   [15,0,1]/15 [15,0,1]/15 
e65040  10001ad8978.0x40001c85329   read
317840  osd15   20.2a40efdf 20.1f   [15,4,17]/15[15,4,17]/15
e65040  10001ad7ef8.0x40001c76282   read
317846  osd15   20.8143f15f 20.1f   [15,4,17]/15[15,4,17]/15
e65040  10001ad89d1.0x40001c61297   read
317864  osd15   20.c889a49c 20.1c   [15,0,31]/15[15,0,31]/15
e65040  10001ad89fb.0x40001c24385   read
317832  osd18   20.f76227a  20.3a   [18,6,15]/18[18,6,15]/18
e65040  10001ad8020.0x40001c82852   read
317833  osd18   20.d8edab31 20.31   [18,29,14]/18   [18,29,14]/18   
e65040  10001ad8952.0x40001c82852   read
317858  osd18   20.8f69d231 20.31   [18,29,14]/18   [18,29,14]/18   
e65040  10001ad8176.0x40001c32400   read
317855  osd22   20.b3342c0f 20.f[22,18,31]/22   [22,18,31]/22   
e65040  10001ad8146.0x40001c51024   read
317863  osd23   20.cde0ce7b 20.3b   [23,1,6]/23 [23,1,6]/23 
e65040  10001ad856c.0x40001c34521   read
317865  osd23   20.702d2dfe 20.3e   [23,9,22]/23[23,9,22]/23
e65040  10001ad8a5e.0x40001c30664   read
317866  osd23   20.cb4a32fe 20.3e   [23,9,22]/23[23,9,22]/23
e65040  10001ad8575.0x40001c29683   read
317867  osd23   20.9a008910 20.10   [23,12,6]/23[23,12,6]/23
e65040  10001ad7d24.0x40001c29683   read
317834  osd25   20.6efd4911

[ceph-users] Luminous cephfs maybe not so stable as expected?

2019-07-11 Thread Marc Roos

Maybe this requires some attention. I have a default centos7 setup (maybe 
not the most recent kernel though) with ceph luminous, i.e. no exotic 
kernels. 

This is the 2nd or 3rd time that a vm has gone into high load (151) and 
stopped its services. I have two vm's, both mounting the same 2 cephfs 
'shares'. After the last incident I unmounted the shares on the 2nd 
server. (We are migrating to a new environment, so this 2nd server is not 
doing anything.) Last time I thought this could be related to my work on 
switching from the stupid allocator to the bitmap allocator.

Anyway, yesterday I thought let's mount the 2 shares on the 2nd server 
again and see what happens. This morning the high load was back. Afaik 
the 2nd server only runs a cron job on the cephfs mounts, creating 
snapshots.

1) I still have increased load on the osd nodes, coming from cephfs. How 
can I see which client is doing this? I don’t seem to get this from 
'ceph daemon mds.c session ls', however 'ceph osd pool stats | grep 
client -B 1' indicates it is cephfs. (See the sketch at the end of this 
message for one way to dig further.)

2) ceph osd blacklist ls
No blacklist entries

3) the first server keeps generating such messages, while there is no 
issue with connectivity.

[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd25 192.168.10.114:6804 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd18 192.168.10.112:6802 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: osd22 192.168.10.111:6811 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 io error
[Thu Jul 11 10:41:22 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Thu Jul 11 10:41:22 2019] libceph: mon0 192.168.10.111:6789 session 
established

PS: dmesg -T gives me strange times; as you can see these are in the 
future, while the os time is 2 min behind (which is the correct one, ntpd 
synced).
[@ ]# uptime
 10:39:17 up 50 days, 13:31,  2 users,  load average: 3.60, 3.02, 2.57

4) Unmounting the filesystem on the first server fails.

5) Evicting the cephfs sessions of the first server does not change the 
cephfs load on the osd nodes.

6) Unmounting all cephfs clients still leaves me with cephfs activity 
on the data pool and on the osd nodes.

[@c03 ~]# ceph daemon mds.c session ls
[] 

7) On the first server 
[@~]# ps -auxf| grep D
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
root  6716  3.0  0.0  0 0 ?D10:18   0:59  \_ 
[kworker/0:2]
root 20039  0.0  0.0 123520  1212 pts/0D+   10:28   0:00  |  
 \_ umount /home/mail-archive/

[@ ~]# cat /proc/6716/stack
[] __wait_on_freeing_inode+0xb0/0xf0
[] find_inode+0x99/0xc0
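
A possible way to dig further into 1), sketched with example daemon names 
(dump_ops_in_flight on the mds and osd admin sockets should show the 
originating client for each pending request):

# cephfs sessions known to the mds
ceph daemon mds.c session ls
# requests currently in flight, including the client behind each one
ceph daemon mds.c dump_ops_in_flight
ceph daemon osd.0 dump_ops_in_flight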

Re: [ceph-users] writable snapshots in cephfs? GDPR/DSGVO

2019-07-11 Thread Marc Roos


What about creating snapshots at a 'lower level' in the directory 
structure, so you do not need to remove files from a snapshot, as a 
workaround?
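
For example (paths are made up; cephfs snapshots are created per directory 
via the .snap pseudo-directory, so anything outside that directory is not 
part of the snapshot):

mkdir /mnt/cephfs/projects/customer-a/.snap/snap-2019-07-11
# removing it later only affects that subtree
rmdir /mnt/cephfs/projects/customer-a/.snap/snap-2019-07-11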


-Original Message-
From: Lars Marowsky-Bree [mailto:l...@suse.com] 
Sent: donderdag 11 juli 2019 10:21
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] writable snapshots in cephfs? GDPR/DSGVO

On 2019-07-10T09:59:08, Lars Täuber   wrote:

> Hi everbody!
> 
> Is it possible to make snapshots in cephfs writable?
> We need to remove files because of this General Data Protection 
Regulation also from snapshots.

Removing data from existing WORM storage is tricky, snapshots being a 
specific form thereof. If you want to avoid copying and altering all 
existing records - which might clash with the requirement from other 
fields that data needs to be immutable, but I guess you could store 
checksums externally somewhere? -, this is difficult.

I think what you'd need is an additional layer - say, one holding the 
decryption keys for the tenant/user (or whatever granularity you want to 
be able to remove data at) - that you can still modify.

Once the keys have been successfully and permanently wiped, the old data 
is effectively permanently deleted (from all media; whether Ceph snaps 
or tape or other immutable storage).

You may have a record that you *had* the data.

Now, of course, you've got to manage keys, but that's significantly less 
data to massage.
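
As a rough illustration of that idea with stock tools (file names and 
layout are made up, and this is only a sketch of the concept, not a vetted 
GDPR control):

# per-tenant key kept outside the snapshotted tree
openssl rand -hex 32 > /keys/tenant-1234.key
# store only ciphertext on the snapshotted filesystem
openssl enc -aes-256-cbc -pbkdf2 -pass file:/keys/tenant-1234.key \
    -in record.json -out /cephfs/tenant-1234/record.json.enc
# "erasure" later means destroying the key; old snapshots of the
# .enc files then become unreadable
shred -u /keys/tenant-1234.key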

Not a lawyer, either.

Good luck.


Regards,
Lars

--
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 
21284 (AG Nürnberg) "Architects should open possibilities and not 
determine everything." (Ueli Zbinden) 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating a cephfs data pool

2019-06-28 Thread Marc Roos
 
Afaik the mv is now fast because it is not moving any real data, just 
some metadata. A real mv would be slow (only in the case between 
different pools) because it would copy the data to the new pool and, when 
successful, delete the old copy. That of course takes a lot more 
time, but you are at least able to access the cephfs in both locations 
during that time and can fix things in your client access.

My problem with the current mv is that if you accidentally use it between 
data pools, it does not really move the data. 



-Original Message-
From: Robert LeBlanc [mailto:rob...@leblancnet.us] 
Sent: vrijdag 28 juni 2019 18:30
To: Marc Roos
Cc: ceph-users; jgarcia
Subject: Re: [ceph-users] Migrating a cephfs data pool

Given that the MDS knows everything, it seems trivial to add a ceph 'mv' 
command to do this. I looked at using tiering to try and do the move, 
but I don't know how to tell cephfs that the data is now on the new pool 
instead of the old pool name. Since we can't take a long enough downtime 
to move hundreds of terabytes, we need something that can be done 
online; a minute or two of downtime would be okay.


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Fri, Jun 28, 2019 at 9:02 AM Marc Roos  
wrote:


 

1.
change data pool for a folder on the file system:
setfattr -n ceph.dir.layout.pool -v fs_data.ec21 foldername

2. 
cp /oldlocation /foldername
Remember that you preferably want to use mv, but this leaves (meta) 
data 
on the old pool, that is not what you want when you want to delete 
that 
pool.

3. When everything is copied-removed, you should end up with an 
empty 
datapool with zero objects. 

4. Verify here with others, if you can just remove this one.

I think this is a reliable technique to switch, because you use the 

basic cephfs functionality that supposed to work. I prefer that the 
ceph 
guys implement a mv that does what you expect from it. Now it acts 
more 
or less like a linking.




-Original Message-
From: Jorge Garcia [mailto:jgar...@soe.ucsc.edu] 
Sent: vrijdag 28 juni 2019 17:52
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Migrating a cephfs data pool

Are you talking about adding the new data pool to the current 
filesystem? Like:

   $ ceph fs add_data_pool my_ceph_fs new_ec_pool

I have done that, and now the filesystem shows up as having two 
data 
pools:

   $ ceph fs ls
   name: my_ceph_fs, metadata pool: cephfs_meta, data pools: 
[cephfs_data new_ec_pool ]

but then I run into two issues:

1. How do I actually copy/move/migrate the data from the old pool 
to the 
new pool?
2. When I'm done moving the data, how do I get rid of the old data 
pool? 

I know there's a rm_data_pool option, but I have read on the 
mailing 
list that you can't remove the original data pool from a cephfs 
filesystem.

The other option is to create a whole new cephfs with a new 
metadata 
pool and the new data pool, but creating multiple filesystems is 
still 
experimental and not allowed by default...

On 6/28/19 8:28 AM, Marc Roos wrote:
>   
> What about adding the new data pool, mounting it and then moving 
the 
> files? (read copy because move between data pools does not what 
you 
> expect it do)
>
>
> -Original Message-
> From: Jorge Garcia [mailto:jgar...@soe.ucsc.edu]
> Sent: vrijdag 28 juni 2019 17:26
> To: ceph-users
> Subject: *SPAM* [ceph-users] Migrating a cephfs data pool
>
> This seems to be an issue that gets brought up repeatedly, but I 
> haven't seen a definitive answer yet. So, at the risk of 
repeating a 
> question that has already been asked:
>
> How do you migrate a cephfs data pool to a new data pool? The 
obvious 
> case would be somebody that has set up a replicated pool for 
their 
> cephfs data and then wants to convert it to an erasure code pool. 
Is 
> there a simple way to do this, other than creating a whole new 
ceph 
> cluster and copying the data using rsync?
>
> Thanks for any clues
>
> Jorge
>
> ___
> ceph-users mailing list
> cep

Re: [ceph-users] Migrating a cephfs data pool

2019-06-28 Thread Marc Roos
 

1.
change data pool for a folder on the file system:
setfattr -n ceph.dir.layout.pool -v fs_data.ec21 foldername

2. 
cp /oldlocation /foldername
Remember that you would preferably use mv, but that leaves (meta)data 
on the old pool, which is not what you want when you intend to delete 
that pool.

3. When everything is copied and the originals are removed, you should 
end up with an empty data pool with zero objects. 

4. Verify here with others whether you can then simply remove that pool.

I think this is a reliable technique for switching, because you only use 
basic cephfs functionality that is supposed to work. I would prefer that 
the ceph guys implement a mv that does what you expect from it. Now it 
acts more or less like linking.
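
A few commands that may help verify the switch (pool names follow the 
example above; the old data pool is a placeholder):

# confirm the new layout is set on the folder
getfattr -n ceph.dir.layout /mnt/cephfs/foldername
# watch the old data pool drain while you copy/remove
ceph df
rados -p <old_data_pool> ls | head    # should eventually return nothing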




-Original Message-
From: Jorge Garcia [mailto:jgar...@soe.ucsc.edu] 
Sent: vrijdag 28 juni 2019 17:52
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Migrating a cephfs data pool

Are you talking about adding the new data pool to the current 
filesystem? Like:

   $ ceph fs add_data_pool my_ceph_fs new_ec_pool

I have done that, and now the filesystem shows up as having two data 
pools:

   $ ceph fs ls
   name: my_ceph_fs, metadata pool: cephfs_meta, data pools: 
[cephfs_data new_ec_pool ]

but then I run into two issues:

1. How do I actually copy/move/migrate the data from the old pool to the 
new pool?
2. When I'm done moving the data, how do I get rid of the old data pool? 

I know there's a rm_data_pool option, but I have read on the mailing 
list that you can't remove the original data pool from a cephfs 
filesystem.

The other option is to create a whole new cephfs with a new metadata 
pool and the new data pool, but creating multiple filesystems is still 
experimental and not allowed by default...

On 6/28/19 8:28 AM, Marc Roos wrote:
>   
> What about adding the new data pool, mounting it and then moving the 
> files? (read copy because move between data pools does not what you 
> expect it do)
>
>
> -Original Message-
> From: Jorge Garcia [mailto:jgar...@soe.ucsc.edu]
> Sent: vrijdag 28 juni 2019 17:26
> To: ceph-users
> Subject: *SPAM* [ceph-users] Migrating a cephfs data pool
>
> This seems to be an issue that gets brought up repeatedly, but I 
> haven't seen a definitive answer yet. So, at the risk of repeating a 
> question that has already been asked:
>
> How do you migrate a cephfs data pool to a new data pool? The obvious 
> case would be somebody that has set up a replicated pool for their 
> cephfs data and then wants to convert it to an erasure code pool. Is 
> there a simple way to do this, other than creating a whole new ceph 
> cluster and copying the data using rsync?
>
> Thanks for any clues
>
> Jorge
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating a cephfs data pool

2019-06-28 Thread Marc Roos
 
What about adding the new data pool, mounting it and then moving the 
files? (Read: copying, because a move between data pools does not do what 
you expect it to.)


-Original Message-
From: Jorge Garcia [mailto:jgar...@soe.ucsc.edu] 
Sent: vrijdag 28 juni 2019 17:26
To: ceph-users
Subject: *SPAM* [ceph-users] Migrating a cephfs data pool

This seems to be an issue that gets brought up repeatedly, but I haven't 
seen a definitive answer yet. So, at the risk of repeating a question 
that has already been asked:

How do you migrate a cephfs data pool to a new data pool? The obvious 
case would be somebody that has set up a replicated pool for their 
cephfs data and then wants to convert it to an erasure code pool. Is 
there a simple way to do this, other than creating a whole new ceph 
cluster and copying the data using rsync?

Thanks for any clues

Jorge

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Expected IO in luminous Ceph Cluster

2019-06-06 Thread Marc Roos

I am also thinking of moving the wal/db of the sata hdd's to ssd. Did 
you do tests before and after this change, and do you know what the 
difference is in iops? And is the advantage larger or smaller when your 
sata hdd's are slower? 
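
For what it is worth, a new OSD with its db on an ssd partition would be 
created roughly like this (device names are examples; existing OSDs would 
have to be recreated or migrated):

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdg1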


-Original Message-
From: Stolte, Felix [mailto:f.sto...@fz-juelich.de] 
Sent: donderdag 6 juni 2019 10:47
To: ceph-users
Subject: [ceph-users] Expected IO in luminous Ceph Cluster

Hello folks,

we are running a ceph cluster on Luminous consisting of 21 OSD Nodes 
with 9 8TB SATA drives and 3 Intel 3700 SSDs for Bluestore WAL and DB 
(1:3 Ratio). OSDs have 10Gb for Public and Cluster Network. The cluster 
has been running stable for over a year. We didn’t have a closer look at IO 
until one of our customers started to complain about a VM we migrated 
from VMware with Netapp Storage to our Openstack Cloud with ceph 
storage. He sent us a sysbench report from the machine, which I could 
reproduce on other VMs as well as on a mounted RBD on physical hardware:

sysbench --file-fsync-freq=1 --threads=16 fileio --file-total-size=1G 
--file-test-mode=rndrw --file-rw-ratio=2 run sysbench 1.0.11 (using 
system LuaJIT 2.1.0-beta3)

Running the test with following options:
Number of threads: 16
Initializing random number generator from current time

Extra file open flags: 0
128 files, 8MiB each
1GiB total file size
Block size 16KiB
Number of IO requests: 0
Read/Write ratio for combined random IO test: 2.00
Periodic FSYNC enabled, calling fsync() each 1 requests.
Calling fsync() at the end of test, Enabled.
Using synchronous I/O mode
Doing random r/w test

File operations:
reads/s:  36.36
writes/s: 18.18
fsyncs/s: 2318.59

Throughput:
read, MiB/s:  0.57
written, MiB/s:   0.28

General statistics:
total time:  10.0071s
total number of events:  23755

Latency (ms):
 min:  0.01
 avg:  6.74
 max:   1112.58
 95th percentile: 26.68
 sum: 160022.67

Threads fairness:
events (avg/stddev):   1484.6875/52.59
execution time (avg/stddev):   10.0014/0.00

Are these numbers reasonable for a cluster of our size?

Best regards
Felix
IT-Services
Telefon 02461 61-9243
E-Mail: f.sto...@fz-juelich.de

-

-
Forschungszentrum Juelich GmbH
52425 Juelich
Sitz der Gesellschaft: Juelich
Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498 
Vorsitzender des Aufsichtsrats: MinDir Dr. Karl Eugen Huthmacher
Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender), 
Karsten Beneke (stellv. Vorsitzender), Prof. Dr.-Ing. Harald Bolt, Prof. 
Dr. Sebastian M. Schmidt

-

-
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Single threaded IOPS on SSD pool.

2019-06-05 Thread Marc Roos
 We have this, if it is any help

write-4k-seq: (groupid=0, jobs=1): err= 0: pid=1446964: Fri May 24 
19:41:48 2019
  write: IOPS=760, BW=3042KiB/s (3115kB/s)(535MiB/180001msec)
slat (usec): min=7, max=234, avg=16.59, stdev=13.59
clat (usec): min=786, max=167483, avg=1295.60, stdev=1933.25
 lat (usec): min=810, max=167492, avg=1312.46, stdev=1933.67
clat percentiles (usec):
 |  1.00th=[   914],  5.00th=[   979], 10.00th=[  1020], 20.00th=[  
1074],
 | 30.00th=[  1123], 40.00th=[  1172], 50.00th=[  1205], 60.00th=[  
1254],
 | 70.00th=[  1319], 80.00th=[  1401], 90.00th=[  1516], 95.00th=[  
1631],
 | 99.00th=[  2704], 99.50th=[  3949], 99.90th=[  5145], 99.95th=[  
5538],
 | 99.99th=[139461]
   bw (  KiB/s): min=  625, max= 3759, per=80.13%, avg=2436.63, 
stdev=653.68, samples=359
   iops: min=  156, max=  939, avg=608.76, stdev=163.41, 
samples=359
  lat (usec)   : 1000=7.83%
  lat (msec)   : 2=90.27%, 4=1.42%, 10=0.45%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%, 250=0.02%
  cpu  : usr=0.52%, sys=1.55%, ctx=162087, majf=0, minf=28
  IO depths: 1=117.6%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 issued rwt: total=0,136889,0, short=0,0,0, dropped=0,0,0
 latency   : target=0, window=0, percentile=100.00%, depth=1
randwrite-4k-seq: (groupid=1, jobs=1): err= 0: pid=1448032: Fri May 24 
19:41:48 2019
  write: IOPS=655, BW=2620KiB/s (2683kB/s)(461MiB/180001msec)
slat (usec): min=7, max=120, avg=10.79, stdev= 6.22
clat (usec): min=897, max=77251, avg=1512.76, stdev=368.36
 lat (usec): min=906, max=77262, avg=1523.77, stdev=368.54
clat percentiles (usec):
 |  1.00th=[ 1106],  5.00th=[ 1205], 10.00th=[ 1254], 20.00th=[ 
1319],
 | 30.00th=[ 1369], 40.00th=[ 1418], 50.00th=[ 1483], 60.00th=[ 
1532],
 | 70.00th=[ 1598], 80.00th=[ 1663], 90.00th=[ 1778], 95.00th=[ 
1893],
 | 99.00th=[ 2540], 99.50th=[ 2933], 99.90th=[ 3392], 99.95th=[ 
4080],
 | 99.99th=[ 6194]
   bw (  KiB/s): min= 1543, max= 2830, per=79.66%, avg=2087.02, 
stdev=396.14, samples=359
   iops: min=  385, max=  707, avg=521.39, stdev=99.06, 
samples=359
  lat (usec)   : 1000=0.06%
  lat (msec)   : 2=97.19%, 4=2.70%, 10=0.04%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu  : usr=0.39%, sys=1.13%, ctx=118477, majf=0, minf=50
  IO depths: 1=116.6%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 issued rwt: total=0,117905,0, short=0,0,0, dropped=0,0,0
 latency   : target=0, window=0, percentile=100.00%, depth=1
read-4k-seq: (groupid=2, jobs=1): err= 0: pid=1449103: Fri May 24 
19:41:48 2019
   read: IOPS=2736, BW=10.7MiB/s (11.2MB/s)(1924MiB/180001msec)
slat (usec): min=6, max=142, avg= 9.26, stdev= 5.02
clat (usec): min=152, max=13253, avg=353.73, stdev=98.92
 lat (usec): min=160, max=13262, avg=363.24, stdev=99.15
clat percentiles (usec):
 |  1.00th=[  182],  5.00th=[  215], 10.00th=[  239], 20.00th=[  
273],
 | 30.00th=[  306], 40.00th=[  330], 50.00th=[  355], 60.00th=[  
375],
 | 70.00th=[  396], 80.00th=[  420], 90.00th=[  461], 95.00th=[  
498],
 | 99.00th=[  586], 99.50th=[  635], 99.90th=[  775], 99.95th=[  
889],
 | 99.99th=[ 1958]
   bw (  KiB/s): min= 5883, max=13817, per=79.66%, avg=8720.01, 
stdev=1895.05, samples=359
   iops: min= 1470, max= 3454, avg=2179.63, stdev=473.78, 
samples=359
  lat (usec)   : 250=13.13%, 500=82.11%, 750=4.64%, 1000=0.09%
  lat (msec)   : 2=0.02%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu  : usr=1.31%, sys=3.69%, ctx=493433, majf=0, minf=32
  IO depths: 1=115.9%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 issued rwt: total=492640,0,0, short=0,0,0, dropped=0,0,0
 latency   : target=0, window=0, percentile=100.00%, depth=1
randread-4k-seq: (groupid=3, jobs=1): err= 0: pid=1450173: Fri May 24 
19:41:48 2019
   read: IOPS=1812, BW=7251KiB/s (7425kB/s)(1275MiB/180001msec)
slat (usec): min=6, max=161, avg=10.25, stdev= 6.37
clat (usec): min=182, max=23748, avg=538.35, stdev=136.71
 lat (usec): min=189, max=23758, avg=548.86, stdev=137.19
clat percentiles (usec):
 |  1.00th=[  265],  5.00th=[  310], 10.00th=[  351], 20.00th=[  
445],
 | 30.00th=[  494], 40.00th=[  519], 50.00th=[  537], 60.00th=[  
562],
 | 70.00th=[  594], 80.00th=[  644], 90.00th=[  701], 95.00th=[  
742],
 | 99.00th=[  816], 99.50th=[  840], 99.90th=[  914], 99.95th=[ 
1172],
 | 99.99th=[ 2442]
   bw (  KiB/s): min= 4643, 

Re: [ceph-users] How to remove ceph-mgr from a node

2019-06-05 Thread Marc Roos
 
What is wrong with?

service ceph-mgr@c stop
systemctl disable ceph-mgr@c
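
And to remove it completely after that, something like this should do 
(for a mgr named 'c'; names/paths are the defaults, adjust to your setup):

ceph auth del mgr.c                  # drop its cephx key
rm -rf /var/lib/ceph/mgr/ceph-c      # remove the daemon's data directory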


-Original Message-
From: Vandeir Eduardo [mailto:vandeir.edua...@gmail.com] 
Sent: woensdag 5 juni 2019 16:44
To: ceph-users
Subject: [ceph-users] How to remove ceph-mgr from a node

Hi guys,

sorry, but I'm not finding in documentation how to remove ceph-mgr from 
a node. Is it possible?

Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Radosgw in container

2019-06-05 Thread Marc Roos



Has anyone put the radosgw in a container? What files do I need to put 
in the sandbox directory? Are there other things I should consider?
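
A guess at the minimum (untested sketch, names are examples): the 
container needs the ceph.conf plus the rgw keyring, and the daemon can be 
run in the foreground so the container supervisor owns the process:

/etc/ceph/ceph.conf
/etc/ceph/ceph.client.rgw.gw1.keyring

radosgw -f -n client.rgw.gw1 -k /etc/ceph/ceph.client.rgw.gw1.keyring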



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CEPH MDS Damaged Metadata - recovery steps

2019-06-04 Thread Marc Roos
 
How did this get damaged? You had 3x replication on the pool?



-Original Message-
From: Yan, Zheng [mailto:uker...@gmail.com] 
Sent: dinsdag 4 juni 2019 1:14
To: James Wilkins
Cc: ceph-users
Subject: Re: [ceph-users] CEPH MDS Damaged Metadata - recovery steps

On Mon, Jun 3, 2019 at 3:06 PM James Wilkins 
 wrote:
>
> Hi all,
>
> After a bit of advice to ensure we’re approaching this the right way.
>
> (version: 12.2.12, multi-mds, dirfrag is enabled)
>
> We have corrupt meta-data as identified by ceph
>
> health: HEALTH_ERR
> 2 MDSs report damaged metadata
>
> Asking the mds via damage ls
>
> {
> "damage_type": "dir_frag",
> "id": 2265410500,
> "ino": 2199349051809,
> "frag": "*",
> "path": 
"/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html/s
hop/app/cache/prod/smarty/cache/iqitreviews/simple/21832/1"
> }
>
>
> We’ve done the steps outlined here -> 
> http://docs.ceph.com/docs/luminous/cephfs/disaster-recovery/ namely
>
> cephfs-journal-tool –fs:all journal reset (both ranks) 
> cephfs-data-scan scan extents / inodes / links has completed
>
> However when attempting to access the named folder we get:
>
> 2019-05-31 03:16:04.792274 7f56f6fb5700 -1 log_channel(cluster) log 
> [ERR] : dir 0x200136b41a1 object missing on disk; some files may be 
> lost 
> (/projects/17343-5bcdaf07f4055-managed-server-0/apache-echfq-data/html
> /shop/app/cache/prod/smarty/cache/iqitreviews/simple/21832/1)
>
> We get this error followed shortly by an MDS failover
>
> Two questions really
>
> What’s not immediately clear from the documentation is should we/do 
we also need to run the below?
>
> # Session table
> cephfs-table-tool 0 reset session
> # SnapServer
> cephfs-table-tool 0 reset snap
> # InoTable
> cephfs-table-tool 0 reset inode
> # Root inodes ("/" and MDS directory)
> cephfs-data-scan init
>

No, don't do this.

> And secondly – our current train of thought is we need to grab the 
inode number of the parent folder and delete this from the metadata pool 
via rados rmomapkey – is this correct?
>

Yes, find inode number of directory 21832. check if omap key '1_head'
exist in object .. If it exists, 
remove it.

> Any input appreciated
>
> Cheers,
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Marc Roos
 

I had this with the balancer active and "crush-compat":
MIN/MAX VAR: 0.43/1.59  STDDEV: 10.81

By increasing the pg_num of some pools (from 8 to 64) and deleting empty 
pools, I got to this:

MIN/MAX VAR: 0.59/1.28  STDDEV: 6.83

(I do not want to switch to upmap yet.)
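
For reference, the commands involved (pool name is an example; the 
balancer module has to be enabled in the mgr first):

ceph osd pool set rbd pg_num 64
ceph osd pool set rbd pgp_num 64
ceph mgr module enable balancer
ceph balancer mode crush-compat
ceph balancer on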




-Original Message-
From: Tarek Zegar [mailto:tze...@us.ibm.com] 
Sent: woensdag 29 mei 2019 17:52
To: ceph-users
Subject: *SPAM* [ceph-users] Balancer: uneven OSDs

Can anyone help with this? Why can't I optimize this cluster? The pg 
counts and data distribution are way off.
__

I enabled the balancer plugin and even tried to invoke it manually, but 
it won't make any changes. Looking at ceph osd df, the distribution is not 
even at all. Thoughts?

root@hostadmin:~# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE      AVAIL    %USE  VAR  PGS
 1   hdd 0.00980        0    0 B      0 B      0 B      0    0    0
 3   hdd 0.00980      1.0 10 GiB  8.3 GiB  1.7 GiB 82.83 1.14  156
 6   hdd 0.00980      1.0 10 GiB  8.4 GiB  1.6 GiB 83.77 1.15  144
 0   hdd 0.00980        0    0 B      0 B      0 B      0    0    0
 5   hdd 0.00980      1.0 10 GiB  9.0 GiB 1021 MiB 90.03 1.23  159
 7   hdd 0.00980      1.0 10 GiB  7.7 GiB  2.3 GiB 76.57 1.05  141
 2   hdd 0.00980      1.0 10 GiB  5.5 GiB  4.5 GiB 55.42 0.76   90
 4   hdd 0.00980      1.0 10 GiB  5.9 GiB  4.1 GiB 58.78 0.81   99
 8   hdd 0.00980      1.0 10 GiB  6.3 GiB  3.7 GiB 63.12 0.87  111
                TOTAL 90 GiB   53 GiB   37 GiB 72.93
MIN/MAX VAR: 0.76/1.23  STDDEV: 12.67


root@hostadmin:~# osdmaptool om --upmap out.txt --upmap-pool rbd
osdmaptool: osdmap file 'om'
writing upmap command output to: out.txt
checking for upmap cleanups
upmap, max-count 100, max deviation 0.01  <---really? It's not even close to 1% across the drives
 limiting to pools rbd (1)
no upmaps proposed


ceph balancer optimize myplan
Error EALREADY: Unable to find further optimization,or distribution is 
already perfect



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] BlueStore bitmap allocator under Luminous and Mimic

2019-05-28 Thread Marc Roos

I switched on the first of May, and did not notice much difference in memory 
usage. After the restart of the osd's on the node I see the memory 
consumption gradually getting back to what it was before.
I can't say anything about latency.






-Original Message-
From: Konstantin Shalygin  
Sent: dinsdag 28 mei 2019 11:52
To: Wido den Hollander
Cc: ceph-users@lists.ceph.com
Subject: *SPAM* Re: [ceph-users] BlueStore bitmap allocator 
under Luminous and Mimic



Hi,

With the release of 12.2.12 the bitmap allocator for BlueStore is 
now
available under Mimic and Luminous.

[osd]
bluestore_allocator = bitmap
bluefs_allocator = bitmap

Before setting this in production: What might the implications be 
and
what should be thought of?

>From what I've read the bitmap allocator seems to be better in 
read
performance and uses less memory.

In Nautilus bitmap is the default, but L and M still default to 
stupid.

Since the bitmap allocator was backported there must be a use-case 
to
use the bitmap allocator instead of stupid.

Thanks!

Wido




Wido, did you set the allocator to bitmap on L installations these past 
months? Any improvements?







k





test-memory.png
Description: Binary data
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] performance in a small cluster

2019-05-25 Thread Marc Roos
 
Maybe my data can be useful to compare with? I have the samsung sm863. 

This[0] is what I get from fio directly on the ssd, and from an rbd ssd 
pool with 3x replication[1]. 
I have also included a comparison with cephfs[3]; it would be nice if 
there were some sort of manual page describing the ceph overhead one 
should generally expect.
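
The jobs behind these numbers look roughly like this; the exact job file 
is not included, so the io engine and target are assumptions (the rbd and 
cephfs runs point the same job at an rbd-backed device or a file on the 
mount):

fio --name=randwrite-4k-seq --rw=randwrite --bs=4k --iodepth=1 --direct=1 \
    --ioengine=libaio --runtime=180 --time_based --filename=/dev/sdX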


[0] direct
randwrite-4k-seq: (groupid=1, jobs=1): err= 0: pid=522903: Thu Sep  6 
21:04:12 2018
  write: IOPS=17.9k, BW=69.8MiB/s (73.2MB/s)(12.3GiB/180001msec)
slat (usec): min=4, max=333, avg= 9.94, stdev= 5.00
clat (nsec): min=1141, max=1131.2k, avg=42560.69, stdev=9074.14
 lat (usec): min=35, max=1137, avg=52.80, stdev= 9.42
clat percentiles (usec):
 |  1.00th=[   33],  5.00th=[   35], 10.00th=[   35], 20.00th=[   
35],
 | 30.00th=[   36], 40.00th=[   36], 50.00th=[   41], 60.00th=[   
43],
 | 70.00th=[   49], 80.00th=[   54], 90.00th=[   57], 95.00th=[   
58],
 | 99.00th=[   60], 99.50th=[   62], 99.90th=[   67], 99.95th=[   
70],
 | 99.99th=[  174]
   bw (  KiB/s): min=34338, max=92268, per=84.26%, avg=60268.13, 
stdev=12283.36, samples=359
   iops: min= 8584, max=23067, avg=15066.67, stdev=3070.87, 
samples=359
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=71.73%, 100=28.24%
  lat (usec)   : 250=0.01%, 500=0.01%, 750=0.01%
  lat (msec)   : 2=0.01%
  cpu  : usr=12.96%, sys=26.87%, ctx=3218988, majf=0, minf=10962
  IO depths: 1=116.8%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 issued rwt: total=0,3218724,0, short=0,0,0, dropped=0,0,0
 latency   : target=0, window=0, percentile=100.00%, depth=1
randread-4k-seq: (groupid=3, jobs=1): err= 0: pid=523297: Thu Sep  6 
21:04:12 2018
   read: IOPS=10.2k, BW=39.7MiB/s (41.6MB/s)(7146MiB/180001msec)
slat (usec): min=4, max=328, avg=15.39, stdev= 8.62
clat (nsec): min=1600, max=948792, avg=78946.53, stdev=36246.91
 lat (usec): min=39, max=969, avg=94.75, stdev=37.43
clat percentiles (usec):
 |  1.00th=[   38],  5.00th=[   40], 10.00th=[   40], 20.00th=[   
41],
 | 30.00th=[   41], 40.00th=[   52], 50.00th=[   70], 60.00th=[  
110],
 | 70.00th=[  112], 80.00th=[  115], 90.00th=[  125], 95.00th=[  
127],
 | 99.00th=[  133], 99.50th=[  135], 99.90th=[  141], 99.95th=[  
147],
 | 99.99th=[  243]
   bw (  KiB/s): min=19918, max=49336, per=84.40%, avg=34308.52, 
stdev=6891.67, samples=359
   iops: min= 4979, max=12334, avg=8576.75, stdev=1722.92, 
samples=359
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=38.06%, 100=19.88%
  lat (usec)   : 250=42.04%, 500=0.01%, 750=0.01%, 1000=0.01%
  cpu  : usr=8.07%, sys=21.59%, ctx=1829588, majf=0, minf=10954
  IO depths: 1=116.7%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 issued rwt: total=1829296,0,0, short=0,0,0, dropped=0,0,0
 latency   : target=0, window=0, percentile=100.00%, depth=1

[1] rbd ssd 3x
randwrite-4k-seq: (groupid=1, jobs=1): err= 0: pid=1448032: Fri May 24 
19:41:48 2019
  write: IOPS=655, BW=2620KiB/s (2683kB/s)(461MiB/180001msec)
slat (usec): min=7, max=120, avg=10.79, stdev= 6.22
clat (usec): min=897, max=77251, avg=1512.76, stdev=368.36
 lat (usec): min=906, max=77262, avg=1523.77, stdev=368.54
clat percentiles (usec):
 |  1.00th=[ 1106],  5.00th=[ 1205], 10.00th=[ 1254], 20.00th=[ 
1319],
 | 30.00th=[ 1369], 40.00th=[ 1418], 50.00th=[ 1483], 60.00th=[ 
1532],
 | 70.00th=[ 1598], 80.00th=[ 1663], 90.00th=[ 1778], 95.00th=[ 
1893],
 | 99.00th=[ 2540], 99.50th=[ 2933], 99.90th=[ 3392], 99.95th=[ 
4080],
 | 99.99th=[ 6194]
   bw (  KiB/s): min= 1543, max= 2830, per=79.66%, avg=2087.02, 
stdev=396.14, samples=359
   iops: min=  385, max=  707, avg=521.39, stdev=99.06, 
samples=359
  lat (usec)   : 1000=0.06%
  lat (msec)   : 2=97.19%, 4=2.70%, 10=0.04%, 20=0.01%, 50=0.01%
  lat (msec)   : 100=0.01%
  cpu  : usr=0.39%, sys=1.13%, ctx=118477, majf=0, minf=50
  IO depths: 1=116.6%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, 
>=64=0.0%
 submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, 
>=64=0.0%
 issued rwt: total=0,117905,0, short=0,0,0, dropped=0,0,0
 latency   : target=0, window=0, percentile=100.00%, depth=1
randread-4k-seq: (groupid=3, jobs=1): err= 0: pid=1450173: Fri May 24 
19:41:48 2019
   read: IOPS=1812, BW=7251KiB/s (7425kB/s)(1275MiB/180001msec)
slat (usec): min=6, max=161, avg=10.25, stdev= 6.37
clat (usec): min=182, max=23748, avg=538.35, stdev=136.71
 lat (usec): min=189, max=23758, avg=548.86, stdev=137.19
clat 

Re: [ceph-users] performance in a small cluster

2019-05-25 Thread Marc Schöchlin
Hello Robert,

Probably the following tool provides deeper insights into what's happening 
on your osds:

https://github.com/scoopex/ceph/blob/master/src/tools/histogram_dump.py
https://github.com/ceph/ceph/pull/28244
https://user-images.githubusercontent.com/288876/58368661-410afa00-7ef0-11e9-9aca-b09d974024a7.png

Monitoring virtual machine/client behavior in a comparable way would also be a 
good thing.

@All: Do you know suitable tools?

  * kernel rbd
  * rbd-nbd
  * linux native (i.e. if you want to analyze from inside a kvm or xen vm)

(the output of "iostat -N -d -x -t -m 10" seems not to be enough for detailed 
analytics)

Regards
Marc

Am 24.05.19 um 13:22 schrieb Robert Sander:
> Hi,
>
> we have a small cluster at a customer's site with three nodes and 4 SSD-OSDs 
> each.
> Connected with 10G the system is supposed to perform well.
>
> rados bench shows ~450MB/s write and ~950MB/s read speeds with 4MB objects 
> but only 20MB/s write and 95MB/s read with 4KB objects.
>
> This is a little bit disappointing as the 4K performance is also seen in KVM 
> VMs using RBD.
>
> Is there anything we can do to improve performance with small objects / block 
> sizes?
>
> Jumbo frames have already been enabled.
>
> 4MB objects write:
>
> Total time run: 30.218930
> Total writes made:  3391
> Write size: 4194304
> Object size:    4194304
> Bandwidth (MB/sec): 448.858
> Stddev Bandwidth:   63.5044
> Max bandwidth (MB/sec): 552
> Min bandwidth (MB/sec): 320
> Average IOPS:   112
> Stddev IOPS:    15
> Max IOPS:   138
> Min IOPS:   80
> Average Latency(s): 0.142475
> Stddev Latency(s):  0.0990132
> Max latency(s): 0.814715
> Min latency(s): 0.0308732
>
> 4MB objects rand read:
>
> Total time run:   30.169312
> Total reads made: 7223
> Read size:    4194304
> Object size:  4194304
> Bandwidth (MB/sec):   957.662
> Average IOPS: 239
> Stddev IOPS:  23
> Max IOPS: 272
> Min IOPS: 175
> Average Latency(s):   0.0653696
> Max latency(s):   0.517275
> Min latency(s):   0.00201978
>
> 4K objects write:
>
> Total time run: 30.002628
> Total writes made:  165404
> Write size: 4096
> Object size:    4096
> Bandwidth (MB/sec): 21.5351
> Stddev Bandwidth:   2.0575
> Max bandwidth (MB/sec): 22.4727
> Min bandwidth (MB/sec): 11.0508
> Average IOPS:   5512
> Stddev IOPS:    526
> Max IOPS:   5753
> Min IOPS:   2829
> Average Latency(s): 0.00290095
> Stddev Latency(s):  0.0015036
> Max latency(s): 0.0778454
> Min latency(s): 0.00174262
>
> 4K objects read:
>
> Total time run:   30.000538
> Total reads made: 1064610
> Read size:    4096
> Object size:  4096
> Bandwidth (MB/sec):   138.619
> Average IOPS: 35486
> Stddev IOPS:  3776
> Max IOPS: 42208
> Min IOPS: 26264
> Average Latency(s):   0.000443905
> Max latency(s):   0.0123462
> Min latency(s):   0.000123081
>
>
> Regards
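
(For reference, rados bench output of this shape typically comes from 
invocations like the following; the pool name and thread count are 
assumptions:)

rados bench -p rbd 30 write -b 4194304 --no-cleanup   # 4MB object writes
rados bench -p rbd 30 rand                            # random reads of those objects
rados bench -p rbd 30 write -b 4096 --no-cleanup      # 4KB object writes
rados -p rbd cleanup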
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] "allow profile rbd" or "profile rbd"

2019-05-24 Thread Marc Roos


I still have some accounts listing either "allow" or not. Which should 
it be? Should this not be kept uniform?



[client.xxx.xx]
 key = xxx
 caps mon = "allow profile rbd"
 caps osd = "profile rbd pool=rbd,profile rbd pool=rbd.ssd"



[client.xxx]
 key = 
 caps mon = "profile rbd"
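
Afaik the documented form is the one without "allow", and existing caps 
can be rewritten in place with ceph auth caps, e.g. (client names are 
examples):

ceph auth caps client.xxx.xx mon 'profile rbd' \
    osd 'profile rbd pool=rbd,profile rbd pool=rbd.ssd'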





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Ceph dovecot

2019-05-23 Thread Marc Roos


Sorry for not waiting until it is published on the ceph website, but did 
anyone attend this talk? Is it production ready? 

https://cephalocon2019.sched.com/event/M7j8
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Major ceph disaster

2019-05-23 Thread Marc Roos


I have been following this thread for a while, and thought I need to 
have a "major ceph disaster" alert on the monitoring ;)
 http://www.f1-outsourcing.eu/files/ceph-disaster.mp4 




-Original Message-
From: Kevin Flöh [mailto:kevin.fl...@kit.edu] 
Sent: donderdag 23 mei 2019 10:51
To: Robert LeBlanc
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Major ceph disaster

Hi,

we have set the PGs to recover and now they are stuck in 
active+recovery_wait+degraded and instructing them to deep-scrub does 
not change anything. Hence, the rados report is empty. Is there a way to 
stop the recovery wait to start the deep-scrub and get the output? I 
guess the recovery_wait might be caused by missing objects. Do we need 
to delete them first to get the recovery going?


Kevin


On 22.05.19 6:03 nachm., Robert LeBlanc wrote:


On Wed, May 22, 2019 at 4:31 AM Kevin Flöh  
wrote:


Hi,

thank you, it worked. The PGs are not incomplete anymore. 
Still we have 
another problem, there are 7 PGs inconsistent and a cpeh pg 
repair is 
not doing anything. I just get "instructing pg 1.5dd on osd.24 
to 
repair" and nothing happens. Does somebody know how we can get 
the PGs 
to repair?

Regards,

Kevin



Kevin,

I just fixed an inconsistent PG yesterday. You will need to figure 
out why they are inconsistent. Do these steps and then we can figure out 
how to proceed.
1. Do a deep-scrub on each PG that is inconsistent. (This may fix 
some of them)
2. Print out the inconsistent report for each inconsistent PG. 
`rados list-inconsistent-obj  --format=json-pretty`
3. You will want to look at the error messages and see if all the 
shards have the same data.
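
A sketch of the commands for steps 1 and 2, using the PG id mentioned 
earlier in the thread:

ceph pg deep-scrub 1.5dd
rados list-inconsistent-obj 1.5dd --format=json-pretty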

Robert LeBlanc
 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to fix this? session lost, hunting for new mon, session established, io error

2019-05-21 Thread Marc Roos


I am still stuck in this situation, and do not want to restart (reset) 
this host. I tried bringing down the eth interface connected to the client 
network for a while, but after bringing it back up I am getting the same 
messages.



-Original Message-
From: Marc Roos 
Sent: dinsdag 21 mei 2019 11:42
To: ceph-users
Subject: [ceph-users] How to fix this? session lost, hunting for new 
mon, session established, io error



I have this on a cephfs client. I had ceph-common on 12.2.11 and 
upgraded to 12.2.12 while having this error. They write here [0] that you 
need to upgrade the kernel and that it is fixed in 12.2.2

[@~]# uname -a
Linux mail03 3.10.0-957.5.1.el7.x86_6

[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 io error
[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Tue May 21 11:23:26 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Tue May 21 11:23:26 2019] libceph: mon0 192.168.10.111:6789 io error
[Tue May 21 11:23:26 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Tue May 21 11:23:26 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Tue May 21 11:23:26 2019] libceph: mon1 192.168.10.112:6789 
[Tue May 21 11:23:26 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 session 
established



ceph version
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous 

(stable)

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg52177.html
https://tracker.ceph.com/issues/23537
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How to fix this? session lost, hunting for new mon, session established, io error

2019-05-21 Thread Marc Roos



I have this on a cephfs client. I had ceph-common on 12.2.11 and 
upgraded to 12.2.12 while having this error. They write here [0] that you 
need to upgrade the kernel and that it is fixed in 12.2.2

[@~]# uname -a
Linux mail03 3.10.0-957.5.1.el7.x86_6

[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 session 
established
[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 io error
[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 session 
lost, hunting for new mon
[Tue May 21 11:23:26 2019] libceph: mon0 192.168.10.111:6789 session 
established
[Tue May 21 11:23:26 2019] libceph: mon0 192.168.10.111:6789 io error
[Tue May 21 11:23:26 2019] libceph: mon0 192.168.10.111:6789 session 
lost, hunting for new mon
[Tue May 21 11:23:26 2019] libceph: mon1 192.168.10.112:6789 session 
established
[Tue May 21 11:23:26 2019] libceph: mon1 192.168.10.112:6789 
[Tue May 21 11:23:26 2019] libceph: mon1 192.168.10.112:6789 session 
lost, hunting for new mon
[Tue May 21 11:23:26 2019] libceph: mon2 192.168.10.113:6789 session 
established



ceph version
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous 
(stable)

[0]
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg52177.html
https://tracker.ceph.com/issues/23537
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd

2019-05-21 Thread Marc Schöchlin
Hello Jason,

Am 20.05.19 um 23:49 schrieb Jason Dillaman:
> On Mon, May 20, 2019 at 2:17 PM Marc Schöchlin  wrote:
>> Hello cephers,
>>
>> we have a few systems which utilize a rbd-bd map/mount to get access to a 
>> rbd volume.
>> (This problem seems to be related to "[ceph-users] Slow requests from 
>> bluestore osds" (the original thread))
>>
>> Unfortunately the rbd-nbd device of a system crashes three mondays in series 
>> at ~00:00 when the systemd fstrim timer executes "fstrim -av".
>> (which runs in parallel to deep scrub operations)
> That's probably not a good practice if you have lots of VMs doing this
> at the same time *and* you are not using object-map. The reason is
> that "fstrim" could discard huge extents that result around a thousand
> concurrent remove/truncate/zero ops per image being thrown at your
> cluster.
Sure, currently we do not have lots of vms which are capable of running 
fstrim on rbd volumes.
But the RBD images already involved are multiple-TB images with a high 
write/deletion rate.
Therefore I am already working on distributing the fstrims by adding random delays.
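For example, with the stock fstrim.timer this can be done with a drop-in 
(the six hours are just an example):

# systemctl edit fstrim.timer
[Timer]
RandomizedDelaySec=6h

For cron-driven trims the same effect can be had with a random sleep in 
front of the fstrim -av call.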
>
>> After that the device constantly reports io errors every time a access to 
>> the filesystem happens.
>> Unmounting, remapping and mounting helped to get the filesystem/device back 
>> into business :-)
> If the cluster was being DDoSed by the fstrims, the VM OSes' might
> have timed out thinking a controller failure.

Yes and no :-) Probably my problem is related to the kernel release, kernel 
setting or the operating system release.
Why?

  * we run ~800 RBD images on that ceph cluster with rbd-nbd 12.2.5 in our xen 
cluster as dom0-storage repository device without any timeout problems
(kernel 4.4.0+10, centos 7)
  * we run some 35TB kRBD images with multiples of the write/read/deletion 
load of the crashed rbd-nbd, without any timeout problems
  * the timeout problem appears on two vms (ubuntu 18.04, ubuntu 16.04) which 
utilize the described settings

From my point of view, the error behavior is currently reproducible with a good 
probability.
Do you have suggestions how to find the root cause of this problem?

Regards
Marc

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cephfs client evicted, how to unmount the filesystem on the client?

2019-05-21 Thread Marc Roos





[@ceph]# ps -aux | grep D
USER   PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
root 12527  0.0  0.0 123520   932 pts/1D+   09:26   0:00 umount 
/home/mail-archive
root 14549  0.2  0.0  0 0 ?D09:29   0:09 
[kworker/0:0]
root 23350  0.0  0.0 123520   932 pts/0D09:38   0:00 umount 
/home/archiveindex
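
What sometimes still works at this point is a forced or lazy unmount; no 
guarantee, it depends on the state of the kernel client:

umount -f /home/mail-archive    # force unmount
umount -l /home/mail-archive    # lazy unmount, detaches the mountpoint immediately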

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is a not active mds doing something?

2019-05-21 Thread Marc Roos



I have not configured anything for the msd except this


[mds]
# 100k+ files in 2 folders
mds bal fragment size max = 12
# maybe for nfs-ganesha problems?
# http://docs.ceph.com/docs/master/cephfs/eviction/
mds_session_blacklist_on_timeout = false
mds_session_blacklist_on_evict = false
mds_cache_memory_limit = 80
# faster fail over?
#mds beacon grace = 5


-Original Message-
From: Eugen Block [mailto:ebl...@nde.ag] 
Sent: dinsdag 21 mei 2019 10:18
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Is a not active mds doing something?

Hi Marc,

have you configured the other MDS to be standby-replay for the active 
MDS? I have three MDS servers, one is active, the second is 
active-standby and the third just standby. If the active fails, the 
second takes over within seconds. This is what I have in my ceph.conf:

[mds.]
mds_standby_replay = true
mds_standby_for_rank = 0


Regards,
Eugen


Zitat von Marc Roos :

> Should a not active mds be doing something??? When I restarted the not 

> active mds.c, My client io on the fs_data pool disappeared.
>
>
>   services:
> mon: 3 daemons, quorum a,b,c
> mgr: c(active), standbys: a, b
> mds: cephfs-1/1/1 up  {0=a=up:active}, 1 up:standby
> osd: 32 osds: 32 up, 32 in
> rgw: 2 daemons active
>
>
>
> -Original Message-
> From: Marc Roos
> Sent: dinsdag 21 mei 2019 10:01
> To: ceph-users@lists.ceph.com; Marc Roos
> Subject: RE: [ceph-users] cephfs causing high load on vm, taking down 
> 15 min later another cephfs vm
>
>
> I have evicted all client connections and have still high load on 
> osd's
>
> And ceph osd pool stats shows still client activity?
>
> pool fs_data id 20
>   client io 565KiB/s rd, 120op/s rd, 0op/s wr
>
>
>
>
> -Original Message-
> From: Marc Roos
> Sent: dinsdag 21 mei 2019 9:51
> To: ceph-users@lists.ceph.com; Marc Roos
> Subject: RE: [ceph-users] cephfs causing high load on vm, taking down 
> 15 min later another cephfs vm
>
>
> I have got this today again? I cannot unmount the filesystem and 
> looks like some osd's are having 100% cpu utilization?
>
>
> -Original Message-
> From: Marc Roos
> Sent: maandag 20 mei 2019 12:42
> To: ceph-users
> Subject: [ceph-users] cephfs causing high load on vm, taking down 15 
> min later another cephfs vm
>
>
>
> I got my first problem with cephfs in a production environment. Is it 
> possible from these logfiles to deduct what happened?
>
> svr1 is connected to ceph client network via switch
> svr2 vm is collocated on c01 node.
> c01 has osd's and the mon.a colocated.
>
> svr1 was the first to report errors at 03:38:44. I have no error 
> messages reported of a network connection problem by any of the ceph 
> nodes. I have nothing in dmesg on c01.
>
> [@c01 ~]# cat /etc/redhat-release
> CentOS Linux release 7.6.1810 (Core)
> [@c01 ~]# uname -a
> Linux c01 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 
> 2019
>
> x86_64 x86_64 x86_64 GNU/Linux
> [@c01 ~]# ceph versions
> {
> "mon": {
> "ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777)
>
> luminous (stable)": 3
> },
> "mgr": {
> "ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777)
>
> luminous (stable)": 3
> },
> "osd": {
> "ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777)
>
> luminous (stable)": 32
> },
> "mds": {
> "ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777)
>
> luminous (stable)": 2
> },
> "rgw": {
> "ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777)
>
> luminous (stable)": 2
> },
> "overall": {
> "ceph version 12.2.12 
> (1436006594665279fe734b4c15d7e08c13ebd777)
>
> luminous (stable)": 42
> }
> }
>
>
>
>
> [0] svr1 messages
> May 20 03:36:01 svr1 systemd: Started Session 308978 of user root.
> May 20 03:36:01 svr1 systemd: Started Session 308979 of user root.
> May 20 03:36:01 svr1 systemd: Started Session 308979 of user root.
> May 20 03:36:01 svr1 systemd: Started Session 308980 of user root.
> May 20 03:36:01 svr1 systemd: Started Session 308980 of user root.
> May 20 03:38:01 svr1 systemd: Started Session 308981 of user root.
> May 20 03:38:01 svr1 systemd: Started Session 308981 of user root.
> May 20 03:38:01 svr1 systemd: Started Session 308982 of user root.
> May 20 03:38:01 svr1 systemd: Started Session 308982 of user root.
> May 20 03:38:01 svr1 systemd: Started Session 308983 of 

[ceph-users] Is a not active mds doing something?

2019-05-21 Thread Marc Roos


Should a non-active mds be doing something??? When I restarted the 
non-active mds.c, my client io on the fs_data pool disappeared. 


  services:
mon: 3 daemons, quorum a,b,c
mgr: c(active), standbys: a, b
mds: cephfs-1/1/1 up  {0=a=up:active}, 1 up:standby
osd: 32 osds: 32 up, 32 in
rgw: 2 daemons active



-Original Message-
From: Marc Roos 
Sent: dinsdag 21 mei 2019 10:01
To: ceph-users@lists.ceph.com; Marc Roos
Subject: RE: [ceph-users] cephfs causing high load on vm, taking down 15 
min later another cephfs vm

 
I have evicted all client connections and still have a high load on the 
osd's.

And ceph osd pool stats still shows client activity?

pool fs_data id 20
  client io 565KiB/s rd, 120op/s rd, 0op/s wr




-Original Message-
From: Marc Roos
Sent: dinsdag 21 mei 2019 9:51
To: ceph-users@lists.ceph.com; Marc Roos
Subject: RE: [ceph-users] cephfs causing high load on vm, taking down 15 
min later another cephfs vm


I got this today again. I cannot unmount the filesystem, and it 
looks like some osd's are at 100% cpu utilization.


-Original Message-
From: Marc Roos
Sent: maandag 20 mei 2019 12:42
To: ceph-users
Subject: [ceph-users] cephfs causing high load on vm, taking down 15 min 
later another cephfs vm



I got my first problem with cephfs in a production environment. Is it 
possible from these logfiles to deduct what happened?

svr1 is connected to ceph client network via switch
svr2 vm is collocated on c01 node.
c01 has osd's and the mon.a colocated. 

svr1 was the first to report errors at 03:38:44. I have no error 
messages reported of a network connection problem by any of the ceph 
nodes. I have nothing in dmesg on c01.

[@c01 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[@c01 ~]# uname -a
Linux c01 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019 

x86_64 x86_64 x86_64 GNU/Linux
[@c01 ~]# ceph versions
{
"mon": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 

luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 

luminous (stable)": 3
},
"osd": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 

luminous (stable)": 32
},
"mds": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 

luminous (stable)": 2
},
"rgw": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 

luminous (stable)": 2
},
"overall": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 

luminous (stable)": 42
}
}




[0] svr1 messages 
May 20 03:36:01 svr1 systemd: Started Session 308978 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308979 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308979 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308980 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308980 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308981 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308981 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308982 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308982 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308983 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308983 of user root.
May 20 03:38:44 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:44 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: last message repeated 5 times
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: last message repeated 5 times
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:

Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread Marc Roos
 

No, but even if there were, I have never had any issues when multiple scrubs were running.



-Original Message-
From: EDH - Manuel Rios Fernandez [mailto:mrios...@easydatahost.com] 
Sent: Tuesday, 21 May 2019 10:03
To: Marc Roos; 'ceph-users'
Subject: RE: [ceph-users] cephfs causing high load on vm, taking down 15 
min later another cephfs vm

Hi Marc

Is there any scrub / deep-scrub running on the affected OSDs?

Best Regards,
Manuel
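
(A quick sketch for checking exactly that: list the PGs currently in 
(deep-)scrub together with their acting OSDs, and optionally pause 
scrubbing while investigating.)

ceph pg dump pgs_brief 2>/dev/null | grep scrubbing
ceph osd set noscrub          # temporarily block new scrubs
ceph osd set nodeep-scrub
# ceph osd unset noscrub      # re-enable afterwards
# ceph osd unset nodeep-scrub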


Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread Marc Roos
 
I have evicted all client connections and still have high load on the OSDs.

And ceph osd pool stats still shows client activity:

pool fs_data id 20
  client io 565KiB/s rd, 120op/s rd, 0op/s wr
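
The eviction was done along these lines (a sketch; the client id is only 
an example, and the "ceph daemon" calls need the MDS admin socket on that 
host):

ceph daemon mds.a session ls            # connected cephfs clients and their ids
ceph daemon mds.a session evict 4305    # evict a single client by id
ceph osd blacklist ls                   # evicted clients stay blacklisted for a while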





Re: [ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-21 Thread Marc Roos


I have got this again today. I cannot unmount the filesystem, and it 
looks like some OSDs are at 100% CPU utilization.
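
A minimal sketch for checking what such an OSD is actually doing (osd.51 
is only an example id; run the "ceph daemon" calls on the host of that 
OSD):

top -b -n 1 -o %CPU | grep ceph-osd | head   # which ceph-osd processes are busy
ceph daemon osd.51 dump_ops_in_flight        # operations currently being processed
ceph daemon osd.51 dump_historic_ops         # recently completed slow operations
ceph osd perf                                # commit/apply latency per OSD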



Re: [ceph-users] Slow requests from bluestore osds / crashing rbd-nbd

2019-05-20 Thread Marc Schöchlin
Hello cephers,

we have a few systems which utilize an rbd-nbd map/mount to get access to an rbd 
volume.
(This problem seems to be related to "[ceph-users] Slow requests from bluestore 
osds" (the original thread))

Unfortunately, the rbd-nbd device of one system has crashed three Mondays in a row at 
~00:00, when the systemd fstrim timer executes "fstrim -av" 
(which runs in parallel with deep-scrub operations).

After that, the device constantly reports I/O errors every time the 
filesystem is accessed.
Unmounting, remapping and mounting got the filesystem/device back 
into business :-)
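
The recovery amounts to something like this (a sketch; device, mount point 
and image names are placeholders):

umount /srv/data                   # unmount the stale filesystem
rbd-nbd unmap /dev/nbd0            # drop the dead mapping
rbd-nbd map rbd/nfs-volume         # re-map the image; prints the new /dev/nbdX
mount /dev/nbd0 /srv/data          # and mount it again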

Manual 30-minute stress tests using the following fio command did not produce 
any problems on the client side 
(the Ceph storage reported some slow requests while testing).

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test 
--filename=test --bs=4k --iodepth=64 --size=4G --readwrite=randrw 
--rwmixread=50 --numjobs=50 --loops=10

It seems that others have also experienced this problem: 
https://ceph-users.ceph.narkive.com/2FIfyx1U/rbd-nbd-timeout-and-crash
The change for setting device timeouts does not seem to have been merged into luminous.
Experiments setting the timeout manually after mapping, using 
https://github.com/OnApp/nbd-kernel_mod/blob/master/nbd_set_timeout.c, haven't 
changed the situation.
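
To confirm the correlation with the weekly trim, the timer can be inspected 
and paused while debugging (a sketch, assuming the stock fstrim.timer unit):

systemctl list-timers fstrim.timer       # last/next run of the trim job
systemctl disable --now fstrim.timer     # pause weekly trims temporarily
# systemctl enable --now fstrim.timer    # re-enable afterwards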

Do you have any suggestions on how to analyze or solve this situation?

Regards
Marc
--



The client kernel throws messages like this:

May 19 23:59:01 int-nfs-001 CRON[836295]: (root) CMD (command -v debian-sa1 > 
/dev/null && debian-sa1 60 2)
May 20 00:00:30 int-nfs-001 systemd[1]: Starting Discard unused blocks...
May 20 00:01:02 int-nfs-001 kernel: [1077851.623582] block nbd0: Connection 
timed out
May 20 00:01:02 int-nfs-001 kernel: [1077851.623613] block nbd0: shutting down 
sockets
May 20 00:01:02 int-nfs-001 kernel: [1077851.623617] print_req_error: I/O 
error, dev nbd0, sector 84082280
May 20 00:01:02 int-nfs-001 kernel: [1077851.623632] block nbd0: Connection 
timed out
May 20 00:01:02 int-nfs-001 kernel: [1077851.623636] print_req_error: I/O 
error, dev nbd0, sector 92470887
May 20 00:01:02 int-nfs-001 kernel: [1077851.623642] block nbd0: Connection 
timed out

Ceph throws messages like this:

2019-05-20 00:00:00.000124 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173572 : 
cluster [INF] overall HEALTH_OK
2019-05-20 00:00:54.249998 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173586 : 
cluster [WRN] Health check failed: 644 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:00.330566 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173587 : 
cluster [WRN] Health check update: 594 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:09.768476 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173591 : 
cluster [WRN] Health check update: 505 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:14.768769 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173592 : 
cluster [WRN] Health check update: 497 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:20.610398 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173593 : 
cluster [WRN] Health check update: 509 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:28.721891 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173594 : 
cluster [WRN] Health check update: 501 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:34.909842 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173596 : 
cluster [WRN] Health check update: 494 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:44.770330 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173597 : 
cluster [WRN] Health check update: 500 slow requests are blocked > 32 sec. 
Implicated osds 51 (REQUEST_SLOW)
2019-05-20 00:01:49.770625 mon.ceph-mon-s43 mon.0 10.23.27.153:6789/0 173599 : 
cluster [WRN] Heal

[ceph-users] cephfs causing high load on vm, taking down 15 min later another cephfs vm

2019-05-20 Thread Marc Roos



I got my first problem with cephfs in a production environment. Is it 
possible to deduce from these logfiles what happened?

svr1 is connected to the ceph client network via a switch.
The svr2 vm is colocated on the c01 node.
c01 has OSDs and mon.a colocated. 

svr1 was the first to report errors, at 03:38:44. None of the ceph 
nodes reported a network connection problem, and there is nothing in 
dmesg on c01.
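
For completeness, a sketch of where such an event also leaves traces (the 
paths assume the standard kernel client debugfs layout and default log 
locations):

# on the client (svr1): requests stuck towards OSDs or the MDS?
cat /sys/kernel/debug/ceph/*/osdc
cat /sys/kernel/debug/ceph/*/mdsc
# on a mon host: cluster log around the time of the errors
grep ' 03:3[89]' /var/log/ceph/ceph.log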

[@c01 ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
[@c01 ~]# uname -a
Linux c01 3.10.0-957.10.1.el7.x86_64 #1 SMP Mon Mar 18 15:06:45 UTC 2019 
x86_64 x86_64 x86_64 GNU/Linux
[@c01 ~]# ceph versions
{
"mon": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 
luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 
luminous (stable)": 3
},
"osd": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 
luminous (stable)": 32
},
"mds": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 
luminous (stable)": 2
},
"rgw": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 
luminous (stable)": 2
},
"overall": {
"ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) 
luminous (stable)": 42
}
}




[0] svr1 messages 
May 20 03:36:01 svr1 systemd: Started Session 308978 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308979 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308979 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308980 of user root.
May 20 03:36:01 svr1 systemd: Started Session 308980 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308981 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308981 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308982 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308982 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308983 of user root.
May 20 03:38:01 svr1 systemd: Started Session 308983 of user root.
May 20 03:38:44 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:44 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: last message repeated 5 times
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: last message repeated 5 times
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon1 192.168.x.112:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon2 192.168.x.113:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon2 192.168.x.113:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: libceph: osd0 192.168.x.111:6814 io error
May 20 03:38:45 svr1 kernel: libceph: mon2 192.168.x.113:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon2 192.168.x.113:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon2 192.168.x.113:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon2 192.168.x.113:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
established
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 io error
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
lost, hunting for new mon
May 20 03:38:45 svr1 kernel: libceph: mon0 192.168.x.111:6789 session 
established


[1] svr2 messages 
May 20 03:40:01 svr2 systemd: Stopping User Slice of root.
May 20 03:40:01 svr2 systemd: Removed slice User Slice of root.
May 20 03:40:01 

Re: [ceph-users] Default min_size value for EC pools

2019-05-19 Thread Marc Roos
 

https://ceph.com/community/new-luminous-erasure-coding-rbd-cephfs/


-Original Message-
From: Florent B [mailto:flor...@coppint.com] 
Sent: Sunday, 19 May 2019 12:06
To: Paul Emmerich
Cc: Ceph Users
Subject: Re: [ceph-users] Default min_size value for EC pools

Thank you, Paul, for your answer. On a recent Luminous setup I had 
min_size=k+m by default, with k=2 and m=1.

When you say unsafe: what I would like to do with k=2 and m=1 is the 
equivalent of a replicated size=2 pool. In this context, is it really 
equivalent, or is the EC pool really unsafe?

Thank you.
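
For reference, the profile and the pool's min_size can be checked and 
adjusted like this (a sketch; profile and pool names are placeholders):

ceph osd erasure-code-profile get myprofile   # shows k, m, plugin, ...
ceph osd pool get ecpool min_size
ceph osd pool set ecpool min_size 3           # k+1 for k=2, m=1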


On 19/05/2019 11:41, Paul Emmerich wrote:


Default is k+1 or k if m == 1

min_size = k is unsafe and should never be set.


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at 
https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Sun, May 19, 2019 at 11:31 AM Florent B  
wrote:


Hi,

I would like to know why the default min_size value for EC pools 
is k+m?

In this context, when a single OSD is down, the pool is 
unavailable (pgs are "incomplete" and stuck queries start to grow).

Setting min_size=k seems to be the right setting, isn't it?

Thank you.

Florent








