[ceph-users] kernel cephfs - too many caps used by client

2019-10-17 Thread Lei Liu
Hi cephers,

We have some Ceph clusters that use CephFS in production (mounted with the
kernel client), but several clients often keep a large number of caps
(millions) unreleased.
I believe this happens because the clients fail to complete the cache release;
errors may have been encountered, but there are no logs.

client kernel version is 3.10.0-957.21.3.el7.x86_64
ceph version is mostly v12.2.8

ceph status shows:

x clients failing to respond to cache pressure

client kernel debug shows:

# cat
/sys/kernel/debug/ceph/a00cc99c-f9f9-4dd9-9281-43cd12310e41.client11291811/caps
total 23801585
avail 1074
used 23800511
reserved 0
min 1024

mds config:
[mds]
mds_max_caps_per_client = 10485760
# 50G
mds_cache_memory_limit = 53687091200

Is there a Ceph configuration setting that can solve this problem?

Any suggestions?
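For reference, the per-client caps can also be inspected from the MDS side,
and as a stopgap the client can be told to drop its dentry/inode cache, which
lets unused caps be returned. A rough sketch (the MDS name is a placeholder
and the exact output fields vary by version):

# on a monitor or MDS host, shows num_caps per client session:
ceph daemon mds.<name> session ls

# on the affected client, as root, drop dentries/inodes so their caps can be released:
sync; echo 2 > /proc/sys/vm/drop_caches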

Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph iscsi question

2019-10-17 Thread 展荣臻(信泰)


> Have you updated your "/etc/multipath.conf" as documented here [1]?
> You should have ALUA configured but it doesn't appear that's the case
> w/ your provided output.
>
> [1] https://docs.ceph.com/ceph-prs/30912/rbd/iscsi-initiator-linux/

 Thank you Jason. I updated /etc/multipath.conf as you said, and it works
successfully.
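For anyone who hits this later: the relevant part of the linked documentation
is roughly a device section like the one below (this is from memory, so please
check the page for the exact contents for your ceph-iscsi version). After
editing, reload the maps (for example with "multipath -r" or by restarting
multipathd) so that "multipath -ll" shows ALUA priorities instead of
"failed faulty".

devices {
        device {
                vendor                 "LIO-ORG"
                product                "TCMU device"
                hardware_handler       "1 alua"
                path_grouping_policy   "failover"
                path_selector          "queue-length 0"
                failback               60
                path_checker           tur
                prio                   alua
                prio_args              exclusive_pref_bit
                fast_io_fail_tmo       25
                no_path_retry          queue
        }
}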

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Crashed MDS (segfault)

2019-10-17 Thread Gustavo Tonini
Hi Zheng,
the cluster is running Ceph Mimic. This warning about the network only appears
when using Nautilus' cephfs-journal-tool.

"cephfs-data-scan scan_links" does not report any issue.

How could the variable "newparent" be NULL at
https://github.com/ceph/ceph/blob/master/src/mds/SnapRealm.cc#L599 ? Is
there a way to fix this?
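If we do end up trying a journal reset, the plan would be to export the
journal first so that we can roll back, along the lines of (the backup path is
just an example):

ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal export /backup/fs_padrao.mds0.journal.bin
ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal reset   # only after the export succeeds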

On Thu, Oct 17, 2019 at 9:58 PM Yan, Zheng  wrote:

> On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini 
> wrote:
> >
> > No. The cluster was just rebalancing.
> >
> > The journal seems damaged:
> >
> > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect
> > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket
> couldn't create socket (97) Address family not supported by protocol
>
> corrupted journal shouldn't cause error like this. This is more like
> network issue. please double check network config of your cluster.
>
> > Overall journal integrity: DAMAGED
> > Corrupt regions:
> > 0x1c5e4d904ab-1c5e4d9ddbc
> > ceph@deployer:~$
> >
> > Could a journal reset help with this?
> >
> > I could snapshot all FS pools and export the journal before to guarantee
> a rollback to this state if something goes wrong with jounal reset.
> >
> > On Thu, Oct 17, 2019, 09:07 Yan, Zheng  wrote:
> >>
> >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini <
> gustavoton...@gmail.com> wrote:
> >> >
> >> > Dear ceph users,
> >> > we're experiencing a segfault during MDS startup (replay process)
> which is making our FS inaccessible.
> >> >
> >> > MDS log messages:
> >> >
> >> > Oct 15 03:41:39.894584 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c08f49700  1 -- 192.168.8.195:6800/3181891717 <== osd.26
> 192.168.8.209:6821/2419345 3  osd_op_reply(21 1. [getxattr]
> v0'0 uv0 ondisk = -61 ((61) No data available)) v8  154+0+0 (3715233608
> 0 0) 0x2776340 con 0x18bd500
> >> > Oct 15 03:41:39.894584 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> >> > Oct 15 03:41:39.894658 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
> >> > Oct 15 03:41:39.894658 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 mds.0.cache.ino(0x100)  magic is 'ceph fs volume v011'
> (expecting 'ceph fs volume v011')
> >> > Oct 15 03:41:39.894735 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10  mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents
> [1,head]
> >> > Oct 15 03:41:39.894735 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head]
> ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17
> 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800]
> >> > Oct 15 03:41:39.894821 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> >> > Oct 15 03:41:39.894821 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
> >> > Oct 15 03:41:39.894891 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201
> 7f3c00589700 10 mds.0.cache.ino(0x1)  magic is 'ceph fs volume v011'
> (expecting 'ceph fs volume v011')
> >> > Oct 15 03:41:39.894958 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.205
> 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread
> 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6
> (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1:
> (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm
> const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308)
> [0x72f488]#012 4: (CInode::decode_snap_blob(ceph::buffer::list&)+0x53)
> [0x6e1f63]#012 5:
> (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012
> 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&,
> Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119)
> [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e)
> [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: (clone()+0x6d)
> [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or `objdump -rdS
> ` is needed to interpret this.
> >> > Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels ---
> >> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 5 none
> >> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 1 lockdep
> >> >
> >>
> >> looks like snap info for root inode is corrupted. did you do any
> >> unusually operation before this happened?
> >>
> >>
> >> >
> >> > Cluster status information:
> >> >
> >> >   cluster:
> >> > id: b8205875-e56f-4280-9e52-6aab9c758586
> >> > health: HEALTH_WARN
> >> > 1 filesystem is degraded
> >> > 1 nearfull osd(s)
> >> > 11 pool(s) nearfull
> >> >
> >> >   services:
> >> > mon: 3 daemons, quorum mon1,mon2,mon3
> >> > mgr: mon1(active), standbys: mon2, mon3
> >> > mds: 

Re: [ceph-users] Crashed MDS (segfault)

2019-10-17 Thread Yan, Zheng
On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini  wrote:
>
> No. The cluster was just rebalancing.
>
> The journal seems damaged:
>
> ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect
> 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket couldn't 
> create socket (97) Address family not supported by protocol

A corrupted journal shouldn't cause an error like this. It looks more like a
network issue; please double-check the network configuration of your cluster.

> Overall journal integrity: DAMAGED
> Corrupt regions:
> 0x1c5e4d904ab-1c5e4d9ddbc
> ceph@deployer:~$
>
> Could a journal reset help with this?
>
> I could snapshot all FS pools and export the journal before to guarantee a 
> rollback to this state if something goes wrong with jounal reset.
>
> On Thu, Oct 17, 2019, 09:07 Yan, Zheng  wrote:
>>
>> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini  
>> wrote:
>> >
>> > Dear ceph users,
>> > we're experiencing a segfault during MDS startup (replay process) which is 
>> > making our FS inaccessible.
>> >
>> > MDS log messages:
>> >
>> > Oct 15 03:41:39.894584 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c08f49700  1 -- 192.168.8.195:6800/3181891717 <== osd.26 
>> > 192.168.8.209:6821/2419345 3  osd_op_reply(21 1. [getxattr] 
>> > v0'0 uv0 ondisk = -61 ((61) No data available)) v8  154+0+0 
>> > (3715233608 0 0) 0x2776340 con 0x18bd500
>> > Oct 15 03:41:39.894584 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
>> > Oct 15 03:41:39.894658 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
>> > Oct 15 03:41:39.894658 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 mds.0.cache.ino(0x100)  magic is 'ceph fs volume v011' 
>> > (expecting 'ceph fs volume v011')
>> > Oct 15 03:41:39.894735 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10  mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents 
>> > [1,head]
>> > Oct 15 03:41:39.894735 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] 
>> > ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 
>> > 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 
>> > 0x18bf800]
>> > Oct 15 03:41:39.894821 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
>> > Oct 15 03:41:39.894821 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
>> > Oct 15 03:41:39.894891 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
>> > 7f3c00589700 10 mds.0.cache.ino(0x1)  magic is 'ceph fs volume v011' 
>> > (expecting 'ceph fs volume v011')
>> > Oct 15 03:41:39.894958 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.205 
>> > 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread 
>> > 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 
>> > (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: 
>> > (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm 
>> > const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) 
>> > [0x72f488]#012 4: (CInode::decode_snap_blob(ceph::buffer::list&)+0x53) 
>> > [0x6e1f63]#012 5: 
>> > (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 
>> > 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, 
>> > Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) 
>> > [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e) 
>> > [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: 
>> > (clone()+0x6d) [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or 
>> > `objdump -rdS ` is needed to interpret this.
>> > Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels ---
>> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 5 none
>> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 1 lockdep
>> >
>>
>> looks like snap info for root inode is corrupted. did you do any
>> unusually operation before this happened?
>>
>>
>> >
>> > Cluster status information:
>> >
>> >   cluster:
>> > id: b8205875-e56f-4280-9e52-6aab9c758586
>> > health: HEALTH_WARN
>> > 1 filesystem is degraded
>> > 1 nearfull osd(s)
>> > 11 pool(s) nearfull
>> >
>> >   services:
>> > mon: 3 daemons, quorum mon1,mon2,mon3
>> > mgr: mon1(active), standbys: mon2, mon3
>> > mds: fs_padrao-1/1/1 up  {0=mds1=up:replay(laggy or crashed)}
>> > osd: 90 osds: 90 up, 90 in
>> >
>> >   data:
>> > pools:   11 pools, 1984 pgs
>> > objects: 75.99 M objects, 285 TiB
>> > usage:   457 TiB used, 181 TiB / 639 TiB avail
>> > pgs: 1896 active+clean
>> >  87   active+clean+scrubbing+deep+repair
>> >  1active+clean+scrubbing
>> >
>> >   io:

Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-17 Thread Robert LeBlanc
On Thu, Oct 17, 2019 at 12:35 PM huxia...@horebdata.cn
 wrote:
>
> hello, Robert
>
> thanks for the quick reply. I did test with  osd op queue = wpq , and osd 
> op queue cut off = high
> and
> osd_recovery_op_priority = 1
> osd recovery delay start = 20
> osd recovery max active = 1
> osd recovery max chunk = 1048576
> osd recovery sleep = 1
> osd recovery sleep hdd = 1
> osd recovery sleep ssd = 1
> osd recovery sleep hybrid = 1
> osd recovery priority = 1
> osd max backfills = 1
> osd backfill scan max = 16
> osd backfill scan min = 4
> osd_op_thread_suicide_timeout = 300
>
> But still the ceph cluster showed extremely hug recovery activities during 
> the beginning of the recovery, and after ca. 5-10 minutes, the recovery 
> gradually get under the control. I guess this is quite similar to what you 
> encountered in Nov. 2015.
>
> It is really annoying, and what else can i do to mitigate this weird 
> inital-recovery issue? any suggestions are much appreciated.

Hmm, on our Luminous cluster we have the defaults for everything other than
the op queue and cut-off settings, and bringing in a node has nearly zero
impact on client traffic. Those two settings need to be set on all OSDs to be
completely effective. Maybe go back to the defaults for the rest?
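For reference, the two settings amount to the following in ceph.conf (they
only take effect when the OSD is restarted):

[osd]
osd op queue = wpq
osd op queue cut off = high

You can confirm what a running OSD is actually using via its admin socket,
for example:

ceph daemon osd.0 config get osd_op_queue
ceph daemon osd.0 config get osd_op_queue_cut_off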


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-17 Thread huxia...@horebdata.cn
hello, Robert 

Thanks for the quick reply. I did test with osd op queue = wpq and osd op
queue cut off = high, and
osd_recovery_op_priority = 1  
osd recovery delay start = 20  
osd recovery max active = 1  
osd recovery max chunk = 1048576  
osd recovery sleep = 1   
osd recovery sleep hdd = 1
osd recovery sleep ssd = 1
osd recovery sleep hybrid = 1 
osd recovery priority = 1
osd max backfills = 1 
osd backfill scan max = 16  
osd backfill scan min = 4   
osd_op_thread_suicide_timeout = 300   

But the cluster still showed extremely heavy recovery activity at the
beginning of the recovery, and only after ca. 5-10 minutes did the recovery
gradually come under control. I guess this is quite similar to what you
encountered in Nov. 2015.

It is really annoying. What else can I do to mitigate this weird
initial-recovery issue? Any suggestions are much appreciated.

thanks again,

samuel



huxia...@horebdata.cn
 
From: Robert LeBlanc
Date: 2019-10-17 21:23
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph 
recovery
On Thu, Oct 17, 2019 at 12:08 PM huxia...@horebdata.cn
 wrote:
>
> I happened to find a note that you wrote in Nov 2015: 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html
> and I believe this is what i just hit exactly the same behavior : a host down 
> will badly take the client performance down 1/10 (with 200MB/s recovery 
> workload) and then took ten minutes  to get good control of OSD recovery.
>
> Could you please share how did you eventally solve that issue? by seting a 
> fair large OSD recovery delay start or any other parameter?
 
Wow! Dusting off the cobwebs here. I think this is what lead me to dig
into the code and write the WPQ scheduler. I can't remember doing
anything specific. I'm sorry I'm not much help in this regard.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph iscsi question

2019-10-17 Thread Mike Christie
On 10/17/2019 10:52 AM, Mike Christie wrote:
> On 10/16/2019 01:35 AM, 展荣臻(信泰) wrote:
>> hi,all
>>   we deploy ceph with ceph-ansible.osds,mons and daemons of iscsi runs in 
>> docker.
>>   I create iscsi target according to 
>> https://docs.ceph.com/docs/luminous/rbd/iscsi-target-cli/.
>>   I discovered and logined iscsi target on another host,as show below:
>>
>> [root@node1 tmp]# iscsiadm -m discovery -t sendtargets -p 192.168.42.110
>> 192.168.42.110:3260,1 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
>> 192.168.42.111:3260,2 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
>> [root@node1 tmp]# iscsiadm -m node -T 
>> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw -p 192.168.42.110 -l
>> Logging in to [iface: default, target: 
>> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] 
>> (multiple)
>> Login to [iface: default, target: 
>> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] 
>> successful.
>>
>>  /dev/sde is mapped,when i mkfs.xfs -f /dev/sde, an Error occur,
>>
>> [root@node1 tmp]# mkfs.xfs -f /dev/sde
>> meta-data=/dev/sde   isize=512agcount=4, agsize=1966080 blks
>>  =   sectsz=512   attr=2, projid32bit=1
>>  =   crc=1finobt=0, sparse=0
>> data =   bsize=4096   blocks=7864320, imaxpct=25
>>  =   sunit=0  swidth=0 blks
>> naming   =version 2  bsize=4096   ascii-ci=0 ftype=1
>> log  =internal log   bsize=4096   blocks=3840, version=2
>>  =   sectsz=512   sunit=0 blks, lazy-count=1
>> realtime =none   extsz=4096   blocks=0, rtextents=0
>> existing superblock read failed: Input/output error
>> mkfs.xfs: pwrite64 failed: Input/output error
>>
>> message in /var/log/messages:
>> Oct 16 14:01:44 localhost kernel: Dev sde: unable to read RDB block 0
>> Oct 16 14:01:44 localhost kernel: sde: unable to read partition table
>> Oct 16 14:02:17 localhost kernel: Dev sde: unable to read RDB block 0
>> Oct 16 14:02:17 localhost kernel: sde: unable to read partition table
>>
> 
> Is there any errors before ofter this? Something about a transport or
> hardware error or something about SCSI sense?
> 
> What does "iscsiadm -m session -P 3"
> 
> report? Does it report failed for the login or device states?
> 
> If logged in ok, can you just do a
> 
> sg_tur /dev/sde

Instead of sg_tur can you give me the output of

sg_rtpg -v /dev/sde

?

Can you also tell me the tcmu-runner and ceph-iscsi or
ceph-iscsi-cli/ceph-iscsi-config versions?
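Since your daemons run in docker, something along these lines on a gateway
host should be enough to get the versions (the container name is just a
placeholder):

docker exec <iscsi-gw-container> rpm -qa | grep -E 'tcmu-runner|ceph-iscsi'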


> 
> ?
> 
> On the target side are there errors in /var/log/messages or
> /var/log/tcmu-runner.log?
> 
> 
> 
> 
>> we use Luminous ceph.
>> what cause this error? how debug it.any suggestion is appreciative. 
>>
>>   
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-17 Thread Robert LeBlanc
On Thu, Oct 17, 2019 at 12:08 PM huxia...@horebdata.cn
 wrote:
>
> I happened to find a note that you wrote in Nov 2015: 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html
> and I believe this is what i just hit exactly the same behavior : a host down 
> will badly take the client performance down 1/10 (with 200MB/s recovery 
> workload) and then took ten minutes  to get good control of OSD recovery.
>
> Could you please share how did you eventally solve that issue? by seting a 
> fair large OSD recovery delay start or any other parameter?

Wow! Dusting off the cobwebs here. I think this is what led me to dig
into the code and write the WPQ scheduler. I can't remember doing
anything specific. I'm sorry I'm not much help in this regard.

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-17 Thread huxia...@horebdata.cn
Hello, Robert,

I happened to find a note that you wrote in Nov 2015: 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html
and I believe this is what i just hit exactly the same behavior : a host down 
will badly take the client performance down 1/10 (with 200MB/s recovery 
workload) and then took ten minutes  to get good control of OSD recovery.

Could you please share how did you eventally solve that issue? by seting a fair 
large OSD recovery delay start or any other parameter?

best regards,

samuel
 



huxia...@horebdata.cn
 
From: Robert LeBlanc
Date: 2019-10-16 21:46
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph 
recovery
On Wed, Oct 16, 2019 at 11:53 AM huxia...@horebdata.cn
 wrote:
>
> My Ceph version is Luminuous 12.2.12. Do you think should i upgrade to 
> Nautilus, or will Nautilus have a better control of recovery/backfilling?
 
We have a Jewel cluster and Luminuous cluster that we have changed
these settings on and it really helped both of them.
 

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph iscsi question

2019-10-17 Thread Mike Christie
On 10/16/2019 01:35 AM, 展荣臻(信泰) wrote:
> hi,all
>   we deploy ceph with ceph-ansible.osds,mons and daemons of iscsi runs in 
> docker.
>   I create iscsi target according to 
> https://docs.ceph.com/docs/luminous/rbd/iscsi-target-cli/.
>   I discovered and logined iscsi target on another host,as show below:
> 
> [root@node1 tmp]# iscsiadm -m discovery -t sendtargets -p 192.168.42.110
> 192.168.42.110:3260,1 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
> 192.168.42.111:3260,2 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
> [root@node1 tmp]# iscsiadm -m node -T 
> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw -p 192.168.42.110 -l
> Logging in to [iface: default, target: 
> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] 
> (multiple)
> Login to [iface: default, target: iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, 
> portal: 192.168.42.110,3260] successful.
> 
>  /dev/sde is mapped,when i mkfs.xfs -f /dev/sde, an Error occur,
> 
> [root@node1 tmp]# mkfs.xfs -f /dev/sde
> meta-data=/dev/sde   isize=512agcount=4, agsize=1966080 blks
>  =   sectsz=512   attr=2, projid32bit=1
>  =   crc=1finobt=0, sparse=0
> data =   bsize=4096   blocks=7864320, imaxpct=25
>  =   sunit=0  swidth=0 blks
> naming   =version 2  bsize=4096   ascii-ci=0 ftype=1
> log  =internal log   bsize=4096   blocks=3840, version=2
>  =   sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none   extsz=4096   blocks=0, rtextents=0
> existing superblock read failed: Input/output error
> mkfs.xfs: pwrite64 failed: Input/output error
> 
> message in /var/log/messages:
> Oct 16 14:01:44 localhost kernel: Dev sde: unable to read RDB block 0
> Oct 16 14:01:44 localhost kernel: sde: unable to read partition table
> Oct 16 14:02:17 localhost kernel: Dev sde: unable to read RDB block 0
> Oct 16 14:02:17 localhost kernel: sde: unable to read partition table
> 

Are there any errors before or after this? Something about a transport or
hardware error, or something about SCSI sense data?

What does "iscsiadm -m session -P 3"

report? Does it report failed for the login or device states?

If logged in ok, can you just do a

sg_tur /dev/sde

?

On the target side are there errors in /var/log/messages or
/var/log/tcmu-runner.log?




> we use Luminous ceph.
> what cause this error? how debug it.any suggestion is appreciative. 
> 
>   
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NFS

2019-10-17 Thread Brent Kennedy
Awesome, thank you for giving an overview of these features; it sounds like
the correct direction then!

-Brent

-Original Message-
From: Daniel Gryniewicz  
Sent: Thursday, October 3, 2019 8:20 AM
To: Brent Kennedy 
Cc: Marc Roos ; ceph-users 
Subject: Re: [ceph-users] NFS

So, Ganesha is an NFS gateway, living in userspace.  It provides access via NFS 
(for any NFS client) to a number of clustered storage systems, or to local 
filesystems on its host.  It can run on any system that has access to the 
cluster (ceph in this case).  One Ganesha instance can serve quite a few 
clients (the limit typically being either memory on the Ganesha node or network 
bandwidth).
Ganesha's configuration lives in /etc/ganesha/ganesha.conf.  There should be 
man pages related to Ganesha and its configuration installed when Ganesha is 
installed.

Ganesha has a number of FSALs (File System Abstraction Layers) that work with a 
number of different clustered storage systems.  For Ceph, Ganesha has 2 FSALs: 
FSAL_CEPH works on top of CephFS, and FSAL_RGW works on top of RadosGW.  
FSAL_CEPH provides full NFS semantics, sinces CephFS is a full POSIX 
filesystem; FSAL_RGW provides slightly limited semantics, since RGW itself it 
not POSIX and doesn't provide everything.  For example, you cannot write into 
an arbitrary location within a file, you can only overwrite the entire file.

Anything you can store in the underlying storage (CephFS or RadosGW) can be 
stored/accessed by Ganesha.  So, 20+GB files should work fine on either one.
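Once an export is in place, any NFS client can use a plain NFS mount. As a
rough example (hostname and pseudo path are placeholders, assuming an export
with Pseudo = "/cephfs" on host ganesha1):

mount -t nfs -o nfsvers=4.1,proto=tcp ganesha1:/cephfs /mnt/cephfs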

Daniel

On Tue, Oct 1, 2019 at 10:45 PM Brent Kennedy  wrote:
>
> We might have to backup a step here so I can understand.  Are you 
> saying stand up a new VM with just those packages installed, then 
> configure the export file  ( the file location isn’t mentioned in the 
> ceph docs ) and supposedly a client can connect to them?  ( only linux 
> clients or any NFS client? )
>
> I don’t use cephFS, so being that it will be an object storage backend, will 
> that be ok with multiple hosts accessing files through the NFS one gateway or 
> should I configure multiple gateways ( one for each share )?
>
> I was hoping to save large files( 20+ GB ), should I stand up cephFS instead 
> for this?
>
> I am used to using a NAS storage appliance server(or freeNAS ), so 
> using ceph as a NAS backend is new to me ( thus I might be over 
> thinking this )  :)
>
> -Brent
>
> -Original Message-
> From: Daniel Gryniewicz 
> Sent: Tuesday, October 1, 2019 8:20 AM
> To: Marc Roos ; bkennedy 
> ; ceph-users 
> Subject: Re: [ceph-users] NFS
>
> Ganesha can export CephFS or RGW.  It cannot export anything else (like iscsi 
> or RBD).  Config for RGW looks like this:
>
> EXPORT
> {
>  Export_ID=1;
>  Path = "/";
>  Pseudo = "/rgw";
>  Access_Type = RW;
>  Protocols = 4;
>  Transports = TCP;
>  FSAL {
>  Name = RGW;
>  User_Id = "testuser";
>  Access_Key_Id ="";
>  Secret_Access_Key = "";
>  }
> }
>
> RGW {
>  ceph_conf = "//ceph.conf";
>  # for vstart cluster, name = "client.admin"
>  name = "client.rgw.foohost";
>  cluster = "ceph";
> #   init_args = "-d --debug-rgw=16";
> }
>
>
> Daniel
>
> On 9/30/19 3:01 PM, Marc Roos wrote:
> >
> > Just install these
> >
> > http://download.ceph.com/nfs-ganesha/
> > nfs-ganesha-rgw-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64
> > libnfsidmap-0.25-19.el7.x86_64
> > nfs-ganesha-mem-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-xfs-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-2.7.1-0.1.el7.x86_64
> > nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64
> >
> >
> > And export your cephfs like this:
> > EXPORT {
> >  Export_Id = 10;
> >  Path = /nfs/cblr-repos;
> >  Pseudo = /cblr-repos;
> >  FSAL { Name = CEPH; User_Id = "cephfs.nfs.cblr"; 
> > Secret_Access_Key = "xxx"; }
> >  Disable_ACL = FALSE;
> >  CLIENT { Clients = 192.168.10.2; access_type = "RW"; }
> >  CLIENT { Clients = 192.168.10.253; } }
> >
> >
> > -Original Message-
> > From: Brent Kennedy [mailto:bkenn...@cfl.rr.com]
> > Sent: maandag 30 september 2019 20:56
> > To: 'ceph-users'
> > Subject: [ceph-users] NFS
> >
> > Wondering if there are any documents for standing up NFS with an 
> > existing ceph cluster.  We don’t use ceph-ansible or any other tools 
> > besides ceph-deploy.  The iscsi directions were pretty good once I 
> > got past the dependencies.
> >
> >
> >
> > I saw the one based on Rook, but it doesn’t seem to apply to our 
> > setup of ceph vms with physical hosts doing OSDs.  The official ceph 
> > documents talk about using ganesha but doesn’t seem to dive into the 
> > details of what the process is for getting it online.  We don’t use 
> > cephfs, so that’s not setup either.  The basic docs seem to note this is 
> > required.
> >   Seems my google-fu is failing me when I try to 

Re: [ceph-users] krbd / kcephfs - jewel client features question

2019-10-17 Thread Lei Liu
Well, I know that kernel 4.13 has full support for Ceph Luminous, and I know
that Red Hat backports a lot of things to the 3.10 kernel. So which is the
matching version: 3.10-862 or a later one?

Thanks

Sent from my iPhone

-- Original --
From: 刘磊 (Lei Liu)
Date: Thu, Oct 17, 2019 9:38 PM
To: ceph-users
Subject: Re: krbd / kcephfs - jewel client features question

[quoted original post trimmed; it is identical to "[ceph-users] krbd /
kcephfs - jewel client features question" reproduced in full below]
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] krbd / kcephfs - jewel client features question

2019-10-17 Thread Ilya Dryomov
On Thu, Oct 17, 2019 at 3:38 PM Lei Liu  wrote:
>
> Hi Cephers,
>
> We have some ceph clusters in 12.2.x version, now we want to use upmap 
> balancer,but when i set set-require-min-compat-client to luminous, it's failed
>
> # ceph osd set-require-min-compat-client luminous
> Error EPERM: cannot set require_min_compat_client to luminous: 6 connected 
> client(s) look like jewel (missing 0xa20); 1 connected client(s) 
> look like jewel (missing 0x800); 1 connected client(s) look like 
> jewel (missing 0x820); add --yes-i-really-mean-it to do it anyway
>
> ceph features
>
> "client": {
> "group": {
> "features": "0x40106b84a842a52",
> "release": "jewel",
> "num": 6
> },
> "group": {
> "features": "0x7010fb86aa42ada",
> "release": "jewel",
> "num": 1
> },
> "group": {
> "features": "0x7fddff8ee84bffb",
> "release": "jewel",
> "num": 1
> },
> "group": {
> "features": "0x3ffddff8eea4fffb",
> "release": "luminous",
> "num": 7
> }
> }
>
> and sessions
>
> "MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *, features 
> 0x40106b84a842a52 (jewel))",
> "MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *, features 
> 0x40106b84a842a52 (jewel))",
> "MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features 
> 0x7fddff8ee84bffb (jewel))",
> "MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features 
> 0x7010fb86aa42ada (jewel))"
>
> can i use --yes-i-really-mean-it to force enable it ?

No.  0x40106b84a842a52 and 0x7fddff8ee84bffb are too old.

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] krbd / kcephfs - jewel client features question

2019-10-17 Thread Lei Liu
Hi Cephers,

We have some Ceph clusters on version 12.2.x, and now we want to use the
upmap balancer, but when I set set-require-min-compat-client to luminous it
fails:

# ceph osd set-require-min-compat-client luminous
Error EPERM: cannot set require_min_compat_client to luminous: 6 connected
client(s) look like jewel (missing 0xa20); 1 connected
client(s) look like jewel (missing 0x800); 1 connected
client(s) look like jewel (missing 0x820); add
--yes-i-really-mean-it to do it anyway

ceph features

"client": {
"group": {
"features": "0x40106b84a842a52",
"release": "jewel",
"num": 6
},
"group": {
"features": "0x7010fb86aa42ada",
"release": "jewel",
"num": 1
},
"group": {
"features": "0x7fddff8ee84bffb",
"release": "jewel",
"num": 1
},
"group": {
"features": "0x3ffddff8eea4fffb",
"release": "luminous",
"num": 7
}
}

and sessions

"MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *, features
0x40106b84a842a52 (jewel))",
"MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *, features
0x40106b84a842a52 (jewel))",
"MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features
0x7fddff8ee84bffb (jewel))",
"MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features
0x7010fb86aa42ada (jewel))"

Can I use --yes-i-really-mean-it to force-enable it?

I know that kernel 4.13 has full support for Ceph Luminous, and I know that
Red Hat backports a lot of things to the 3.10 kernel, so which is the
matching version?
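For context, the end goal once the old clients are upgraded would be roughly
the following (a sketch, using the mgr balancer module):

# ceph osd set-require-min-compat-client luminous
# ceph mgr module enable balancer
# ceph balancer mode upmap
# ceph balancer on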
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Crashed MDS (segfault)

2019-10-17 Thread Yan, Zheng
On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini  wrote:
>
> Dear ceph users,
> we're experiencing a segfault during MDS startup (replay process) which is 
> making our FS inaccessible.
>
> MDS log messages:
>
> Oct 15 03:41:39.894584 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c08f49700  1 -- 192.168.8.195:6800/3181891717 <== osd.26 
> 192.168.8.209:6821/2419345 3  osd_op_reply(21 1. [getxattr] v0'0 
> uv0 ondisk = -61 ((61) No data available)) v8  154+0+0 (3715233608 0 0) 
> 0x2776340 con 0x18bd500
> Oct 15 03:41:39.894584 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> Oct 15 03:41:39.894658 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
> Oct 15 03:41:39.894658 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 mds.0.cache.ino(0x100)  magic is 'ceph fs volume v011' 
> (expecting 'ceph fs volume v011')
> Oct 15 03:41:39.894735 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10  mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents 
> [1,head]
> Oct 15 03:41:39.894735 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] 
> ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 
> 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800]
> Oct 15 03:41:39.894821 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> Oct 15 03:41:39.894821 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
> Oct 15 03:41:39.894891 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.201 
> 7f3c00589700 10 mds.0.cache.ino(0x1)  magic is 'ceph fs volume v011' 
> (expecting 'ceph fs volume v011')
> Oct 15 03:41:39.894958 mds1 ceph-mds:   -472> 2019-10-15 00:40:30.205 
> 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread 
> 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 
> (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: (()+0x11390) 
> [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm const&)+0x42) 
> [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) [0x72f488]#012 4: 
> (CInode::decode_snap_blob(ceph::buffer::list&)+0x53) [0x6e1f63]#012 5: 
> (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 6: 
> (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, Context*)+0x1b2) 
> [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) [0x74fcc9]#012 8: 
> (Finisher::finisher_thread_entry()+0x12e) [0x7f3c0ebffece]#012 9: (()+0x76ba) 
> [0x7f3c0e4806ba]#012 10: (clone()+0x6d) [0x7f3c0dca941d]#012 NOTE: a copy of 
> the executable, or `objdump -rdS ` is needed to interpret this.
> Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels ---
> Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 5 none
> Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 1 lockdep
>

It looks like the snap info for the root inode is corrupted. Did you do any
unusual operation before this happened?


>
> Cluster status information:
>
>   cluster:
> id: b8205875-e56f-4280-9e52-6aab9c758586
> health: HEALTH_WARN
> 1 filesystem is degraded
> 1 nearfull osd(s)
> 11 pool(s) nearfull
>
>   services:
> mon: 3 daemons, quorum mon1,mon2,mon3
> mgr: mon1(active), standbys: mon2, mon3
> mds: fs_padrao-1/1/1 up  {0=mds1=up:replay(laggy or crashed)}
> osd: 90 osds: 90 up, 90 in
>
>   data:
> pools:   11 pools, 1984 pgs
> objects: 75.99 M objects, 285 TiB
> usage:   457 TiB used, 181 TiB / 639 TiB avail
> pgs: 1896 active+clean
>  87   active+clean+scrubbing+deep+repair
>  1active+clean+scrubbing
>
>   io:
> client:   89 KiB/s wr, 0 op/s rd, 3 op/s wr
>
> Has anyone seen anything like this?
>
> Regards,
> Gustavo.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph iscsi question

2019-10-17 Thread Jason Dillaman
Have you updated your "/etc/multipath.conf" as documented here [1]?
You should have ALUA configured but it doesn't appear that's the case
w/ your provided output.

On Wed, Oct 16, 2019 at 11:36 PM 展荣臻(信泰)  wrote:
>
>
>
>
> > -原始邮件-
> > 发件人: "Jason Dillaman" 
> > 发送时间: 2019-10-17 09:54:30 (星期四)
> > 收件人: "展荣臻(信泰)" 
> > 抄送: dillaman , ceph-users 
> > 主题: Re: [ceph-users] ceph iscsi question
> >
> > On Wed, Oct 16, 2019 at 9:52 PM 展荣臻(信泰)  wrote:
> > >
> > >
> > >
> > >
> > > > -原始邮件-
> > > > 发件人: "Jason Dillaman" 
> > > > 发送时间: 2019-10-16 20:33:47 (星期三)
> > > > 收件人: "展荣臻(信泰)" 
> > > > 抄送: ceph-users 
> > > > 主题: Re: [ceph-users] ceph iscsi question
> > > >
> > > > On Wed, Oct 16, 2019 at 2:35 AM 展荣臻(信泰)  
> > > > wrote:
> > > > >
> > > > > hi,all
> > > > >   we deploy ceph with ceph-ansible.osds,mons and daemons of iscsi 
> > > > > runs in docker.
> > > > >   I create iscsi target according to 
> > > > > https://docs.ceph.com/docs/luminous/rbd/iscsi-target-cli/.
> > > > >   I discovered and logined iscsi target on another host,as show below:
> > > > >
> > > > > [root@node1 tmp]# iscsiadm -m discovery -t sendtargets -p 
> > > > > 192.168.42.110
> > > > > 192.168.42.110:3260,1 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
> > > > > 192.168.42.111:3260,2 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
> > > > > [root@node1 tmp]# iscsiadm -m node -T 
> > > > > iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw -p 192.168.42.110 -l
> > > > > Logging in to [iface: default, target: 
> > > > > iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 
> > > > > 192.168.42.110,3260] (multiple)
> > > > > Login to [iface: default, target: 
> > > > > iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 
> > > > > 192.168.42.110,3260] successful.
> > > > >
> > > > >  /dev/sde is mapped,when i mkfs.xfs -f /dev/sde, an Error occur,
> > > > >
> > > > > [root@node1 tmp]# mkfs.xfs -f /dev/sde
> > > > > meta-data=/dev/sde   isize=512agcount=4, 
> > > > > agsize=1966080 blks
> > > > >  =   sectsz=512   attr=2, projid32bit=1
> > > > >  =   crc=1finobt=0, sparse=0
> > > > > data =   bsize=4096   blocks=7864320, 
> > > > > imaxpct=25
> > > > >  =   sunit=0  swidth=0 blks
> > > > > naming   =version 2  bsize=4096   ascii-ci=0 ftype=1
> > > > > log  =internal log   bsize=4096   blocks=3840, version=2
> > > > >  =   sectsz=512   sunit=0 blks, 
> > > > > lazy-count=1
> > > > > realtime =none   extsz=4096   blocks=0, rtextents=0
> > > > > existing superblock read failed: Input/output error
> > > > > mkfs.xfs: pwrite64 failed: Input/output error
> > > > >
> > > > > message in /var/log/messages:
> > > > > Oct 16 14:01:44 localhost kernel: Dev sde: unable to read RDB block 0
> > > > > Oct 16 14:01:44 localhost kernel: sde: unable to read partition table
> > > > > Oct 16 14:02:17 localhost kernel: Dev sde: unable to read RDB block 0
> > > > > Oct 16 14:02:17 localhost kernel: sde: unable to read partition table
> > > > >
> > > > > we use Luminous ceph.
> > > > > what cause this error? how debug it.any suggestion is appreciative.
> > > >
> > > > Please use the associated multipath device, not the raw block device.
> > > >
> > > hi,Jason
> > >   Thanks for your reply
> > >   The multipath device is the same error as raw block device.
> > >
> >
> > What does "multipath -ll" show?
> >
> [root@node1 ~]# multipath -ll
> mpathf (36001405366100aeda2044f286329b57a) dm-2 LIO-ORG ,TCMU device
> size=30G features='0' hwhandler='0' wp=rw
> |-+- policy='service-time 0' prio=0 status=enabled
> | `- 13:0:0:0 sde 8:64 failed faulty running
> `-+- policy='service-time 0' prio=0 status=enabled
>   `- 14:0:0:0 sdf 8:80 failed faulty running
> [root@node1 ~]#
>
> I don't know if it is related to that our all daemons run in docker while 
> docker runs on kvm.
>
>
>
>
>
>
>

[1] https://docs.ceph.com/ceph-prs/30912/rbd/iscsi-initiator-linux/

-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] OSD PGs are not being removed - Full OSD issues

2019-10-17 Thread Philippe D'Anjou
This is related to https://tracker.ceph.com/issues/42341 and to 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-October/037017.html

After closer inspection yesterday we found that PGs are not being removed from
OSDs, which then leads to nearfull errors and explains why reweights don't
work. This is a BIG issue because I have to intervene manually all the time to
keep the cluster from dying. This is 14.2.4, a fresh setup, all defaults. The
PG balancer is turned off now; I am beginning to wonder if it is at fault.

My crush map: https://termbin.com/3t8l
It was mentioned that the bucket weights are WEIRD. I never touched them. The
crush weights that are unusual are for the nearfull osd53, and some are set to
10 from a previous manual intervention.

That the PGs are not being purged is one issue; the original issue is why Ceph
fills ONLY my nearfull OSDs in the first place. It seems to always select the
fullest OSD to write more data onto. If I reweight it, it starts giving alerts
for another almost-full OSD because it intends to write everything there,
despite everything else being only at about 60%.
I don't know how to debug this; it's a MAJOR PITA.
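For anyone who wants to look at the same data, the fill levels and the PGs an
OSD still claims to hold can be pulled with, for example:

ceph osd df tree
ceph pg ls-by-osd 53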


I hope someone has an idea, because I can't fight this 24/7; I'm getting
pretty tired of it.
Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com