[ceph-users] kernel cephfs - too many caps used by client
Hi cephers,

We have some Ceph clusters using CephFS in production (mounted with the kernel client), and several clients often keep a large number of caps (millions) unreleased. I believe the clients are failing to complete cache release; errors may have been encountered, but there are no logs.

Client kernel version: 3.10.0-957.21.3.el7.x86_64
Ceph version: mostly v12.2.8

ceph status shows: x clients failing to respond to cache pressure

Client kernel debug shows:

# cat /sys/kernel/debug/ceph/a00cc99c-f9f9-4dd9-9281-43cd12310e41.client11291811/caps
total 23801585
avail 1074
used 23800511
reserved 0
min 1024

MDS config:

[mds]
mds_max_caps_per_client = 10485760
mds_cache_memory_limit = 53687091200  # 50G

I want to know if some Ceph configuration can solve this problem. Any suggestions? Thanks.

___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
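A quick way to find which sessions are holding the most caps is to dump `ceph daemon mds.<id> session ls` on the active MDS and sort by `num_caps`. A minimal sketch, assuming the Luminous-era session-dump field names (the sample JSON below is illustrative, not output from the cluster above):

```python
import json

# Illustrative stand-in for `ceph daemon mds.<id> session ls` output;
# field names assume the Luminous-era session dump format.
raw = """[
  {"id": 11291811, "num_caps": 23800511, "client_metadata": {"hostname": "host-a"}},
  {"id": 11291812, "num_caps": 1532, "client_metadata": {"hostname": "host-b"}}
]"""

sessions = json.loads(raw)
# Print sessions sorted by cap count, largest holders first
for s in sorted(sessions, key=lambda s: s["num_caps"], reverse=True):
    print(f'{s["id"]}\t{s["num_caps"]}\t{s["client_metadata"]["hostname"]}')
```

Once the worst offenders are identified, evicting the session or unmounting/remounting on that client (to force cap release) is a per-client decision.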
Re: [ceph-users] ceph iscsi question
> Have you updated your "/etc/multipath.conf" as documented here [1]? > You should have ALUA configured but it doesn't appear that's the case > w/ your provided output. > > [1] https://docs.ceph.com/ceph-prs/30912/rbd/iscsi-initiator-linux/ Thank you, Jason. I updated /etc/multipath.conf as you said, and it works successfully.
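For reference, the ALUA multipath settings the linked page documents look roughly like the stanza below (reproduced from memory of the upstream docs, so treat it as a sketch and check the page Jason linked for the authoritative version):

```
devices {
    device {
        vendor               "LIO-ORG"
        product              "TCMU device"
        hardware_handler     "1 alua"
        path_grouping_policy "failover"
        path_selector        "queue-length 0"
        failback             60
        path_checker         tur
        prio                 alua
        prio_args            exclusive_pref_bit
        fast_io_fail_tmo     25
        no_path_retry        queue
    }
}
```

After editing /etc/multipath.conf, reload multipathd and check `multipath -ll` to confirm the paths now show ALUA priority groups.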
Re: [ceph-users] Crashed MDS (segfault)
Hi Zheng, the cluster is running ceph mimic. This warning about network only appears when using nautilus' cephfs-journal-tool. "cephfs-data-scan scan_links" does not report any issue. How could variable "newparent" be NULL at https://github.com/ceph/ceph/blob/master/src/mds/SnapRealm.cc#L599 ? Is there a way to fix this? On Thu, Oct 17, 2019 at 9:58 PM Yan, Zheng wrote: > On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini > wrote: > > > > No. The cluster was just rebalancing. > > > > The journal seems damaged: > > > > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect > > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket > couldn't create socket (97) Address family not supported by protocol > > corrupted journal shouldn't cause error like this. This is more like > network issue. please double check network config of your cluster. > > > Overall journal integrity: DAMAGED > > Corrupt regions: > > 0x1c5e4d904ab-1c5e4d9ddbc > > ceph@deployer:~$ > > > > Could a journal reset help with this? > > > > I could snapshot all FS pools and export the journal before to guarantee > a rollback to this state if something goes wrong with jounal reset. > > > > On Thu, Oct 17, 2019, 09:07 Yan, Zheng wrote: > >> > >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini < > gustavoton...@gmail.com> wrote: > >> > > >> > Dear ceph users, > >> > we're experiencing a segfault during MDS startup (replay process) > which is making our FS inaccessible. > >> > > >> > MDS log messages: > >> > > >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 > 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. 
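If it does come to a journal reset, the usual sequence from the CephFS disaster-recovery docs is roughly the one below — a sketch only, to be attempted after the backup you describe (exporting the journal and snapshotting the FS pools), since recovery/reset discards whatever cannot be salvaged:

```
# Back up the journal first
cephfs-journal-tool --rank=fs_padrao:0 journal export backup.bin

# Salvage what can be salvaged from the journal into the metadata store
cephfs-journal-tool --rank=fs_padrao:0 event recover_dentries summary

# Then reset the journal and the session table
cephfs-journal-tool --rank=fs_padrao:0 journal reset
cephfs-table-tool all reset session
```

Whether this is safe given the corrupted snaprealm Zheng pointed at is a separate question; I would wait for his advice before resetting anything.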
[getxattr] > v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 (3715233608 > 0 0) 0x2776340 con 0x18bd500 > >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched > >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544 > >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs volume v011' > (expecting 'ceph fs volume v011') > >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents > [1,head] > >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] > ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 > 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800] > >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched > >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482 > >> > Oct 15 03:41:39.894891 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 > 7f3c00589700 10 mds.0.cache.ino(0x1) magic is 'ceph fs volume v011' > (expecting 'ceph fs volume v011') > >> > Oct 15 03:41:39.894958 mds1 ceph-mds: -472> 2019-10-15 00:40:30.205 > 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread > 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 > (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: > (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm > const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) > [0x72f488]#012 4: 
(CInode::decode_snap_blob(ceph::buffer::list&)+0x53) > [0x6e1f63]#012 5: > (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 > 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, > Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) > [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e) > [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: (clone()+0x6d) > [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or `objdump -rdS > ` is needed to interpret this. > >> > Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels --- > >> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 5 none > >> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 1 lockdep > >> > > >> > >> looks like snap info for root inode is corrupted. did you do any > >> unusually operation before this happened? > >> > >> > >> > > >> > Cluster status information: > >> > > >> > cluster: > >> > id: b8205875-e56f-4280-9e52-6aab9c758586 > >> > health: HEALTH_WARN > >> > 1 filesystem is degraded > >> > 1 nearfull osd(s) > >> > 11 pool(s) nearfull > >> > > >> > services: > >> > mon: 3 daemons, quorum mon1,mon2,mon3 > >> > mgr: mon1(active), standbys: mon2, mon3 > >> > mds:
Re: [ceph-users] Crashed MDS (segfault)
On Thu, Oct 17, 2019 at 10:19 PM Gustavo Tonini wrote: > > No. The cluster was just rebalancing. > > The journal seems damaged: > > ceph@deployer:~$ cephfs-journal-tool --rank=fs_padrao:0 journal inspect > 2019-10-16 17:46:29.596 7fcd34cbf700 -1 NetHandler create_socket couldn't > create socket (97) Address family not supported by protocol corrupted journal shouldn't cause error like this. This is more like network issue. please double check network config of your cluster. > Overall journal integrity: DAMAGED > Corrupt regions: > 0x1c5e4d904ab-1c5e4d9ddbc > ceph@deployer:~$ > > Could a journal reset help with this? > > I could snapshot all FS pools and export the journal before to guarantee a > rollback to this state if something goes wrong with jounal reset. > > On Thu, Oct 17, 2019, 09:07 Yan, Zheng wrote: >> >> On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini >> wrote: >> > >> > Dear ceph users, >> > we're experiencing a segfault during MDS startup (replay process) which is >> > making our FS inaccessible. >> > >> > MDS log messages: >> > >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26 >> > 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. 
[getxattr] >> > v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 >> > (3715233608 0 0) 0x2776340 con 0x18bd500 >> > Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544 >> > Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs volume v011' >> > (expecting 'ceph fs volume v011') >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents >> > [1,head] >> > Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] >> > ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 >> > 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) >> > 0x18bf800] >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched >> > Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482 >> > Oct 15 03:41:39.894891 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 >> > 7f3c00589700 10 mds.0.cache.ino(0x1) magic is 'ceph fs volume v011' >> > (expecting 'ceph fs volume v011') >> > Oct 15 03:41:39.894958 mds1 ceph-mds: -472> 2019-10-15 00:40:30.205 >> > 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread >> > 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 >> > (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: >> > (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm >> > const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) >> > 
[0x72f488]#012 4: (CInode::decode_snap_blob(ceph::buffer::list&)+0x53) >> > [0x6e1f63]#012 5: >> > (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 >> > 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, >> > Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) >> > [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e) >> > [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: >> > (clone()+0x6d) [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or >> > `objdump -rdS ` is needed to interpret this. >> > Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels --- >> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 5 none >> > Oct 15 03:41:39.895473 mds1 ceph-mds:0/ 1 lockdep >> > >> >> looks like snap info for root inode is corrupted. did you do any >> unusually operation before this happened? >> >> >> > >> > Cluster status information: >> > >> > cluster: >> > id: b8205875-e56f-4280-9e52-6aab9c758586 >> > health: HEALTH_WARN >> > 1 filesystem is degraded >> > 1 nearfull osd(s) >> > 11 pool(s) nearfull >> > >> > services: >> > mon: 3 daemons, quorum mon1,mon2,mon3 >> > mgr: mon1(active), standbys: mon2, mon3 >> > mds: fs_padrao-1/1/1 up {0=mds1=up:replay(laggy or crashed)} >> > osd: 90 osds: 90 up, 90 in >> > >> > data: >> > pools: 11 pools, 1984 pgs >> > objects: 75.99 M objects, 285 TiB >> > usage: 457 TiB used, 181 TiB / 639 TiB avail >> > pgs: 1896 active+clean >> > 87 active+clean+scrubbing+deep+repair >> > 1active+clean+scrubbing >> > >> > io:
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
On Thu, Oct 17, 2019 at 12:35 PM huxia...@horebdata.cn wrote: > > hello, Robert > > thanks for the quick reply. I did test with osd op queue = wpq , and osd > op queue cut off = high > and > osd_recovery_op_priority = 1 > osd recovery delay start = 20 > osd recovery max active = 1 > osd recovery max chunk = 1048576 > osd recovery sleep = 1 > osd recovery sleep hdd = 1 > osd recovery sleep ssd = 1 > osd recovery sleep hybrid = 1 > osd recovery priority = 1 > osd max backfills = 1 > osd backfill scan max = 16 > osd backfill scan min = 4 > osd_op_thread_suicide_timeout = 300 > > But still the ceph cluster showed extremely hug recovery activities during > the beginning of the recovery, and after ca. 5-10 minutes, the recovery > gradually get under the control. I guess this is quite similar to what you > encountered in Nov. 2015. > > It is really annoying, and what else can i do to mitigate this weird > inital-recovery issue? any suggestions are much appreciated. Hmm, on our Luminous cluster we have the defaults other than the op queue and cut off, and bringing in a node has nearly zero impact on client traffic. Those would need to be set on all OSDs to be completely effective. Maybe go back to the defaults? Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
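The two non-default settings Robert refers to would look like this in ceph.conf (a sketch; on Luminous both take effect only after an OSD restart, and they need to be applied to every OSD to be fully effective):

```
[osd]
osd_op_queue = wpq
osd_op_queue_cut_off = high
```

With these two in place, leaving the recovery/backfill throttles at their defaults is often the better starting point, as Robert suggests.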
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
hello, Robert

thanks for the quick reply. I did test with osd op queue = wpq and osd op queue cut off = high, and
osd_recovery_op_priority = 1
osd recovery delay start = 20
osd recovery max active = 1
osd recovery max chunk = 1048576
osd recovery sleep = 1
osd recovery sleep hdd = 1
osd recovery sleep ssd = 1
osd recovery sleep hybrid = 1
osd recovery priority = 1
osd max backfills = 1
osd backfill scan max = 16
osd backfill scan min = 4
osd_op_thread_suicide_timeout = 300

But the cluster still showed extremely huge recovery activity during the beginning of the recovery, and only after ca. 5-10 minutes did the recovery gradually come under control. I guess this is quite similar to what you encountered in Nov. 2015.

It is really annoying; what else can I do to mitigate this weird initial-recovery issue? Any suggestions are much appreciated.

thanks again, samuel

huxia...@horebdata.cn

From: Robert LeBlanc
Date: 2019-10-17 21:23
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

On Thu, Oct 17, 2019 at 12:08 PM huxia...@horebdata.cn wrote: > > I happened to find a note that you wrote in Nov 2015: > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html > and I believe this is what i just hit exactly the same behavior : a host down > will badly take the client performance down 1/10 (with 200MB/s recovery > workload) and then took ten minutes to get good control of OSD recovery. > > Could you please share how did you eventally solve that issue? by seting a > fair large OSD recovery delay start or any other parameter? Wow! Dusting off the cobwebs here. I think this is what lead me to dig into the code and write the WPQ scheduler. I can't remember doing anything specific. I'm sorry I'm not much help in this regard.
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] ceph iscsi question
On 10/17/2019 10:52 AM, Mike Christie wrote: > On 10/16/2019 01:35 AM, 展荣臻(信泰) wrote: >> hi,all >> we deploy ceph with ceph-ansible.osds,mons and daemons of iscsi runs in >> docker. >> I create iscsi target according to >> https://docs.ceph.com/docs/luminous/rbd/iscsi-target-cli/. >> I discovered and logined iscsi target on another host,as show below: >> >> [root@node1 tmp]# iscsiadm -m discovery -t sendtargets -p 192.168.42.110 >> 192.168.42.110:3260,1 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw >> 192.168.42.111:3260,2 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw >> [root@node1 tmp]# iscsiadm -m node -T >> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw -p 192.168.42.110 -l >> Logging in to [iface: default, target: >> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] >> (multiple) >> Login to [iface: default, target: >> iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] >> successful. >> >> /dev/sde is mapped,when i mkfs.xfs -f /dev/sde, an Error occur, >> >> [root@node1 tmp]# mkfs.xfs -f /dev/sde >> meta-data=/dev/sde isize=512agcount=4, agsize=1966080 blks >> = sectsz=512 attr=2, projid32bit=1 >> = crc=1finobt=0, sparse=0 >> data = bsize=4096 blocks=7864320, imaxpct=25 >> = sunit=0 swidth=0 blks >> naming =version 2 bsize=4096 ascii-ci=0 ftype=1 >> log =internal log bsize=4096 blocks=3840, version=2 >> = sectsz=512 sunit=0 blks, lazy-count=1 >> realtime =none extsz=4096 blocks=0, rtextents=0 >> existing superblock read failed: Input/output error >> mkfs.xfs: pwrite64 failed: Input/output error >> >> message in /var/log/messages: >> Oct 16 14:01:44 localhost kernel: Dev sde: unable to read RDB block 0 >> Oct 16 14:01:44 localhost kernel: sde: unable to read partition table >> Oct 16 14:02:17 localhost kernel: Dev sde: unable to read RDB block 0 >> Oct 16 14:02:17 localhost kernel: sde: unable to read partition table >> > > Is there any errors before ofter this? 
Something about a transport or > hardware error or something about SCSI sense? > > What does "iscsiadm -m session -P 3" > > report? Does it report failed for the login or device states? > > If logged in ok, can you just do a > > sg_tur /dev/sde Instead of sg_tur can you give me the output of sg_rtpg -v /dev/sde ? Can you also tell me the tcmu-runner and ceph-iscsi or ceph-iscsi-cli/ceph-iscsi-config versions? > > ? > > On the target side are there errors in /var/log/messages or > /var/log/tcmu-runner.log? > > > > >> we use Luminous ceph. >> what cause this error? how debug it.any suggestion is appreciative.
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
On Thu, Oct 17, 2019 at 12:08 PM huxia...@horebdata.cn wrote: > > I happened to find a note that you wrote in Nov 2015: > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html > and I believe this is what i just hit exactly the same behavior : a host down > will badly take the client performance down 1/10 (with 200MB/s recovery > workload) and then took ten minutes to get good control of OSD recovery. > > Could you please share how did you eventally solve that issue? by seting a > fair large OSD recovery delay start or any other parameter? Wow! Dusting off the cobwebs here. I think this is what led me to dig into the code and write the WPQ scheduler. I can't remember doing anything specific. I'm sorry I'm not much help in this regard. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery
Hello, Robert,

I happened to find a note that you wrote in Nov 2015: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-November/006173.html and I believe I just hit exactly the same behavior: a host going down badly takes client performance down to about 1/10 (with a 200MB/s recovery workload), and it then took ten minutes to get OSD recovery under control.

Could you please share how you eventually solved that issue? By setting a fairly large OSD recovery delay start, or some other parameter?

best regards, samuel

huxia...@horebdata.cn

From: Robert LeBlanc
Date: 2019-10-16 21:46
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

On Wed, Oct 16, 2019 at 11:53 AM huxia...@horebdata.cn wrote: > > My Ceph version is Luminuous 12.2.12. Do you think should i upgrade to > Nautilus, or will Nautilus have a better control of recovery/backfilling? We have a Jewel cluster and a Luminous cluster that we have changed these settings on, and it really helped both of them. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Re: [ceph-users] ceph iscsi question
On 10/16/2019 01:35 AM, 展荣臻(信泰) wrote: > hi,all > we deploy ceph with ceph-ansible.osds,mons and daemons of iscsi runs in > docker. > I create iscsi target according to > https://docs.ceph.com/docs/luminous/rbd/iscsi-target-cli/. > I discovered and logined iscsi target on another host,as show below: > > [root@node1 tmp]# iscsiadm -m discovery -t sendtargets -p 192.168.42.110 > 192.168.42.110:3260,1 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw > 192.168.42.111:3260,2 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw > [root@node1 tmp]# iscsiadm -m node -T > iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw -p 192.168.42.110 -l > Logging in to [iface: default, target: > iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] > (multiple) > Login to [iface: default, target: iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, > portal: 192.168.42.110,3260] successful. > > /dev/sde is mapped,when i mkfs.xfs -f /dev/sde, an Error occur, > > [root@node1 tmp]# mkfs.xfs -f /dev/sde > meta-data=/dev/sde isize=512agcount=4, agsize=1966080 blks > = sectsz=512 attr=2, projid32bit=1 > = crc=1finobt=0, sparse=0 > data = bsize=4096 blocks=7864320, imaxpct=25 > = sunit=0 swidth=0 blks > naming =version 2 bsize=4096 ascii-ci=0 ftype=1 > log =internal log bsize=4096 blocks=3840, version=2 > = sectsz=512 sunit=0 blks, lazy-count=1 > realtime =none extsz=4096 blocks=0, rtextents=0 > existing superblock read failed: Input/output error > mkfs.xfs: pwrite64 failed: Input/output error > > message in /var/log/messages: > Oct 16 14:01:44 localhost kernel: Dev sde: unable to read RDB block 0 > Oct 16 14:01:44 localhost kernel: sde: unable to read partition table > Oct 16 14:02:17 localhost kernel: Dev sde: unable to read RDB block 0 > Oct 16 14:02:17 localhost kernel: sde: unable to read partition table > Are there any errors before or after this? Something about a transport or hardware error, or something about SCSI sense? What does "iscsiadm -m session -P 3" report? 
Does it report a failed state for the login or devices? If logged in OK, can you just do a

sg_tur /dev/sde

? On the target side, are there errors in /var/log/messages or /var/log/tcmu-runner.log? > we use Luminous ceph. > what cause this error? how debug it.any suggestion is appreciative.
Re: [ceph-users] NFS
Awesome, thank you for giving an overview of these features; sounds like the correct direction then! -Brent

-Original Message- From: Daniel Gryniewicz Sent: Thursday, October 3, 2019 8:20 AM To: Brent Kennedy Cc: Marc Roos ; ceph-users Subject: Re: [ceph-users] NFS

So, Ganesha is an NFS gateway, living in userspace. It provides access via NFS (for any NFS client) to a number of clustered storage systems, or to local filesystems on its host. It can run on any system that has access to the cluster (ceph in this case). One Ganesha instance can serve quite a few clients (the limit typically being either memory on the Ganesha node or network bandwidth).

Ganesha's configuration lives in /etc/ganesha/ganesha.conf. There should be man pages related to Ganesha and its configuration installed when Ganesha is installed.

Ganesha has a number of FSALs (File System Abstraction Layers) that work with a number of different clustered storage systems. For Ceph, Ganesha has 2 FSALs: FSAL_CEPH works on top of CephFS, and FSAL_RGW works on top of RadosGW. FSAL_CEPH provides full NFS semantics, since CephFS is a full POSIX filesystem; FSAL_RGW provides slightly limited semantics, since RGW itself is not POSIX and doesn't provide everything. For example, you cannot write into an arbitrary location within a file, you can only overwrite the entire file.

Anything you can store in the underlying storage (CephFS or RadosGW) can be stored/accessed by Ganesha. So, 20+GB files should work fine on either one.

Daniel

On Tue, Oct 1, 2019 at 10:45 PM Brent Kennedy wrote: > > We might have to backup a step here so I can understand. Are you > saying stand up a new VM with just those packages installed, then > configure the export file ( the file location isn’t mentioned in the > ceph docs ) and supposedly a client can connect to them? ( only linux > clients or any NFS client? 
) > > I don’t use cephFS, so being that it will be an object storage backend, will > that be ok with multiple hosts accessing files through the NFS one gateway or > should I configure multiple gateways ( one for each share )? > > I was hoping to save large files( 20+ GB ), should I stand up cephFS instead > for this? > > I am used to using a NAS storage appliance server(or freeNAS ), so > using ceph as a NAS backend is new to me ( thus I might be over > thinking this ) :) > > -Brent > > -Original Message- > From: Daniel Gryniewicz > Sent: Tuesday, October 1, 2019 8:20 AM > To: Marc Roos ; bkennedy > ; ceph-users > Subject: Re: [ceph-users] NFS > > Ganesha can export CephFS or RGW. It cannot export anything else (like iscsi > or RBD). Config for RGW looks like this: > > EXPORT > { > Export_ID=1; > Path = "/"; > Pseudo = "/rgw"; > Access_Type = RW; > Protocols = 4; > Transports = TCP; > FSAL { > Name = RGW; > User_Id = "testuser"; > Access_Key_Id =""; > Secret_Access_Key = ""; > } > } > > RGW { > ceph_conf = "//ceph.conf"; > # for vstart cluster, name = "client.admin" > name = "client.rgw.foohost"; > cluster = "ceph"; > # init_args = "-d --debug-rgw=16"; > } > > > Daniel > > On 9/30/19 3:01 PM, Marc Roos wrote: > > > > Just install these > > > > http://download.ceph.com/nfs-ganesha/ > > nfs-ganesha-rgw-2.7.1-0.1.el7.x86_64 > > nfs-ganesha-vfs-2.7.1-0.1.el7.x86_64 > > libnfsidmap-0.25-19.el7.x86_64 > > nfs-ganesha-mem-2.7.1-0.1.el7.x86_64 > > nfs-ganesha-xfs-2.7.1-0.1.el7.x86_64 > > nfs-ganesha-2.7.1-0.1.el7.x86_64 > > nfs-ganesha-ceph-2.7.1-0.1.el7.x86_64 > > > > > > And export your cephfs like this: > > EXPORT { > > Export_Id = 10; > > Path = /nfs/cblr-repos; > > Pseudo = /cblr-repos; > > FSAL { Name = CEPH; User_Id = "cephfs.nfs.cblr"; > > Secret_Access_Key = "xxx"; } > > Disable_ACL = FALSE; > > CLIENT { Clients = 192.168.10.2; access_type = "RW"; } > > CLIENT { Clients = 192.168.10.253; } } > > > > > > -Original Message- > > From: Brent Kennedy 
[mailto:bkenn...@cfl.rr.com] > > Sent: maandag 30 september 2019 20:56 > > To: 'ceph-users' > > Subject: [ceph-users] NFS > > > > Wondering if there are any documents for standing up NFS with an > > existing ceph cluster. We don’t use ceph-ansible or any other tools > > besides ceph-deploy. The iscsi directions were pretty good once I > > got past the dependencies. > > > > > > > > I saw the one based on Rook, but it doesn’t seem to apply to our > > setup of ceph vms with physical hosts doing OSDs. The official ceph > > documents talk about using ganesha but doesn’t seem to dive into the > > details of what the process is for getting it online. We don’t use > > cephfs, so that’s not setup either. The basic docs seem to note this is > > required. > > Seems my google-fu is failing me when I try to
Re: [ceph-users] krbd / kcephfs - jewel client features question
Well, I know kernel 4.13 fully supports Ceph Luminous, and I know that Red Hat backports a lot of things to the 3.10 kernel. So which is the matching version: 3.10-862 or a later one? Thanks

Sent from my iPhone

-- Original --
From: 刘磊
Date: Thu, Oct 17, 2019 9:38 PM
To: ceph-users
Subject: Re: krbd / kcephfs - jewel client features question

Hi Cephers,

We have some ceph clusters on 12.2.x; now we want to use the upmap balancer, but when I set set-require-min-compat-client to luminous, it fails:

# ceph osd set-require-min-compat-client luminous
Error EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0xa20); 1 connected client(s) look like jewel (missing 0x800); 1 connected client(s) look like jewel (missing 0x820); add --yes-i-really-mean-it to do it anyway

ceph features

"client": {
    "group": { "features": "0x40106b84a842a52", "release": "jewel", "num": 6 },
    "group": { "features": "0x7010fb86aa42ada", "release": "jewel", "num": 1 },
    "group": { "features": "0x7fddff8ee84bffb", "release": "jewel", "num": 1 },
    "group": { "features": "0x3ffddff8eea4fffb", "release": "luminous", "num": 7 }
}

and sessions

"MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *, features 0x40106b84a842a52 (jewel))",
"MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *, features 0x40106b84a842a52 (jewel))",
"MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features 0x7fddff8ee84bffb (jewel))",
"MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features 0x7010fb86aa42ada (jewel))"

Can I use --yes-i-really-mean-it to force enable it? I know kernel 4.13 fully supports Ceph Luminous and I know that Red Hat backports a lot of things to the 3.10 kernel, so which is the matching version?
Re: [ceph-users] krbd / kcephfs - jewel client features question
On Thu, Oct 17, 2019 at 3:38 PM Lei Liu wrote: > > Hi Cephers, > > We have some ceph clusters in 12.2.x version, now we want to use upmap > balancer,but when i set set-require-min-compat-client to luminous, it's failed > > # ceph osd set-require-min-compat-client luminous > Error EPERM: cannot set require_min_compat_client to luminous: 6 connected > client(s) look like jewel (missing 0xa20); 1 connected client(s) > look like jewel (missing 0x800); 1 connected client(s) look like > jewel (missing 0x820); add --yes-i-really-mean-it to do it anyway > > ceph features > > "client": { > "group": { > "features": "0x40106b84a842a52", > "release": "jewel", > "num": 6 > }, > "group": { > "features": "0x7010fb86aa42ada", > "release": "jewel", > "num": 1 > }, > "group": { > "features": "0x7fddff8ee84bffb", > "release": "jewel", > "num": 1 > }, > "group": { > "features": "0x3ffddff8eea4fffb", > "release": "luminous", > "num": 7 > } > } > > and sessions > > "MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *, features > 0x40106b84a842a52 (jewel))", > "MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *, features > 0x40106b84a842a52 (jewel))", > "MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features > 0x7fddff8ee84bffb (jewel))", > "MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features > 0x7010fb86aa42ada (jewel))" > > can i use --yes-i-really-mean-it to force enable it ? No. 0x40106b84a842a52 and 0x7fddff8ee84bffb are too old. Thanks, Ilya
[ceph-users] krbd / kcephfs - jewel client features question
Hi Cephers,

We have some ceph clusters on 12.2.x; now we want to use the upmap balancer, but when I set set-require-min-compat-client to luminous, it fails:

# ceph osd set-require-min-compat-client luminous
Error EPERM: cannot set require_min_compat_client to luminous: 6 connected client(s) look like jewel (missing 0xa20); 1 connected client(s) look like jewel (missing 0x800); 1 connected client(s) look like jewel (missing 0x820); add --yes-i-really-mean-it to do it anyway

ceph features

"client": {
    "group": { "features": "0x40106b84a842a52", "release": "jewel", "num": 6 },
    "group": { "features": "0x7010fb86aa42ada", "release": "jewel", "num": 1 },
    "group": { "features": "0x7fddff8ee84bffb", "release": "jewel", "num": 1 },
    "group": { "features": "0x3ffddff8eea4fffb", "release": "luminous", "num": 7 }
}

and sessions

"MonSession(unknown.0 10.10.100.6:0/1603916368 is open allow *, features 0x40106b84a842a52 (jewel))",
"MonSession(unknown.0 10.10.100.2:0/2484488531 is open allow *, features 0x40106b84a842a52 (jewel))",
"MonSession(client.? 10.10.100.6:0/657483412 is open allow *, features 0x7fddff8ee84bffb (jewel))",
"MonSession(unknown.0 10.10.14.67:0/500706582 is open allow *, features 0x7010fb86aa42ada (jewel))"

Can I use --yes-i-really-mean-it to force enable it? I know kernel 4.13 fully supports Ceph Luminous and I know that Red Hat backports a lot of things to the 3.10 kernel, so which is the matching version?
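You can see why forcing it would not help by comparing the masks from `ceph features` above as plain bitmasks: each jewel-era client is missing bits that the luminous-level clients have. A small sketch (this only illustrates the bitmask comparison, not the exact check the monitor performs):

```python
# Feature masks quoted from `ceph features` in the message above.
luminous = 0x3ffddff8eea4fffb
jewel_clients = [0x40106b84a842a52, 0x7010fb86aa42ada, 0x7fddff8ee84bffb]

for mask in jewel_clients:
    # Bits present in the luminous-level feature set but absent in this client
    missing = luminous & ~mask
    print(f"client {mask:#x} missing bits: {missing:#x}")
```

Every client mask here comes out with missing bits, which matches the EPERM message: those kernels simply do not speak the features upmap requires, and overriding the check would leave them unable to understand the resulting OSD maps.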
Re: [ceph-users] Crashed MDS (segfault)
On Tue, Oct 15, 2019 at 12:03 PM Gustavo Tonini wrote:
>
> Dear ceph users,
> we're experiencing a segfault during MDS startup (replay process) which is making our FS inaccessible.
>
> MDS log messages:
>
> Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c08f49700  1 -- 192.168.8.195:6800/3181891717 <== osd.26 192.168.8.209:6821/2419345 3 osd_op_reply(21 1. [getxattr] v0'0 uv0 ondisk = -61 ((61) No data available)) v8 154+0+0 (3715233608 0 0) 0x2776340 con 0x18bd500
> Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched got 0 and 544
> Oct 15 03:41:39.894658 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
> Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.snaprealm(0x100 seq 1 0x1799c00) open_parents [1,head]
> Oct 15 03:41:39.894735 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x100) _fetched [inode 0x100 [...2,head] ~mds0/ auth v275131 snaprealm=0x1799c00 f(v0 1=1+0) n(v76166 rc2020-07-17 15:29:27.00 b41838692297 -3184=-3168+-16)/n() (iversion lock) 0x18bf800]
> Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 MDSIOContextBase::complete: 18C_IO_Inode_Fetched
> Oct 15 03:41:39.894821 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) _fetched got 0 and 482
> Oct 15 03:41:39.894891 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201 7f3c00589700 10 mds.0.cache.ino(0x1) magic is 'ceph fs volume v011' (expecting 'ceph fs volume v011')
> Oct 15 03:41:39.894958 mds1 ceph-mds: -472> 2019-10-15 00:40:30.205 7f3c00589700 -1 *** Caught signal (Segmentation fault) **#012 in thread 7f3c00589700 thread_name:fn_anonymous#012#012 ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)#012 1: (()+0x11390) [0x7f3c0e48a390]#012 2: (operator<<(std::ostream&, SnapRealm const&)+0x42) [0x72cb92]#012 3: (SnapRealm::merge_to(SnapRealm*)+0x308) [0x72f488]#012 4: (CInode::decode_snap_blob(ceph::buffer::list&)+0x53) [0x6e1f63]#012 5: (CInode::decode_store(ceph::buffer::list::iterator&)+0x76) [0x702b86]#012 6: (CInode::_fetched(ceph::buffer::list&, ceph::buffer::list&, Context*)+0x1b2) [0x702da2]#012 7: (MDSIOContextBase::complete(int)+0x119) [0x74fcc9]#012 8: (Finisher::finisher_thread_entry()+0x12e) [0x7f3c0ebffece]#012 9: (()+0x76ba) [0x7f3c0e4806ba]#012 10: (clone()+0x6d) [0x7f3c0dca941d]#012 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> Oct 15 03:41:39.895400 mds1 ceph-mds: --- logging levels ---
> Oct 15 03:41:39.895473 mds1 ceph-mds:    0/ 5 none
> Oct 15 03:41:39.895473 mds1 ceph-mds:    0/ 1 lockdep

Looks like the snap info for the root inode is corrupted. Did you do any unusual operation before this happened?

> Cluster status information:
>
>   cluster:
>     id:     b8205875-e56f-4280-9e52-6aab9c758586
>     health: HEALTH_WARN
>             1 filesystem is degraded
>             1 nearfull osd(s)
>             11 pool(s) nearfull
>
>   services:
>     mon: 3 daemons, quorum mon1,mon2,mon3
>     mgr: mon1(active), standbys: mon2, mon3
>     mds: fs_padrao-1/1/1 up {0=mds1=up:replay(laggy or crashed)}
>     osd: 90 osds: 90 up, 90 in
>
>   data:
>     pools:   11 pools, 1984 pgs
>     objects: 75.99 M objects, 285 TiB
>     usage:   457 TiB used, 181 TiB / 639 TiB avail
>     pgs:     1896 active+clean
>              87   active+clean+scrubbing+deep+repair
>              1    active+clean+scrubbing
>
>   io:
>     client: 89 KiB/s wr, 0 op/s rd, 3 op/s wr
>
> Has anyone seen anything like this?
>
> Regards,
> Gustavo.
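A side note on reading that trace: syslog flattens the multi-line MDS backtrace into a single line, encoding each embedded newline as "#012" (octal 012 = '\n'), which is why the frames run together. A small shell sketch to re-expand such a line (the sample is abridged from the posted log):

```shell
#!/bin/sh
# syslog encodes embedded newlines as "#012", so the MDS backtrace arrives
# as one long line. awk's gsub can restore the original line breaks.
# The sample line below is abridged from the log in this thread.
line='*** Caught signal (Segmentation fault) **#012 in thread 7f3c00589700 thread_name:fn_anonymous#012 1: (()+0x11390) [0x7f3c0e48a390]'
printf '%s\n' "$line" | awk '{ gsub(/#012/, "\n"); print }'
```

Run against the full journalctl/syslog line, this prints each stack frame on its own line, which makes the `SnapRealm::merge_to` path much easier to spot.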
Re: [ceph-users] ceph iscsi question
Have you updated your "/etc/multipath.conf" as documented here [1]? You should have ALUA configured, but it doesn't appear that's the case with your provided output.

On Wed, Oct 16, 2019 at 11:36 PM 展荣臻(信泰) wrote:
>
> > -----Original Message-----
> > From: "Jason Dillaman"
> > Sent: 2019-10-17 09:54:30 (Thursday)
> > To: "展荣臻(信泰)"
> > Cc: dillaman, ceph-users
> > Subject: Re: [ceph-users] ceph iscsi question
> >
> > On Wed, Oct 16, 2019 at 9:52 PM 展荣臻(信泰) wrote:
> > >
> > > > -----Original Message-----
> > > > From: "Jason Dillaman"
> > > > Sent: 2019-10-16 20:33:47 (Wednesday)
> > > > To: "展荣臻(信泰)"
> > > > Cc: ceph-users
> > > > Subject: Re: [ceph-users] ceph iscsi question
> > > >
> > > > On Wed, Oct 16, 2019 at 2:35 AM 展荣臻(信泰) wrote:
> > > > >
> > > > > hi, all
> > > > > We deploy ceph with ceph-ansible; the OSDs, mons and iscsi daemons run in docker.
> > > > > I created an iscsi target according to https://docs.ceph.com/docs/luminous/rbd/iscsi-target-cli/.
> > > > > I discovered and logged in to the iscsi target on another host, as shown below:
> > > > >
> > > > > [root@node1 tmp]# iscsiadm -m discovery -t sendtargets -p 192.168.42.110
> > > > > 192.168.42.110:3260,1 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
> > > > > 192.168.42.111:3260,2 iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw
> > > > > [root@node1 tmp]# iscsiadm -m node -T iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw -p 192.168.42.110 -l
> > > > > Logging in to [iface: default, target: iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] (multiple)
> > > > > Login to [iface: default, target: iqn.2003-01.com.teamsun.iscsi-gw:iscsi-igw, portal: 192.168.42.110,3260] successful.
> > > > >
> > > > > /dev/sde is mapped; when I run mkfs.xfs -f /dev/sde, an error occurs:
> > > > >
> > > > > [root@node1 tmp]# mkfs.xfs -f /dev/sde
> > > > > meta-data=/dev/sde               isize=512    agcount=4, agsize=1966080 blks
> > > > >          =                       sectsz=512   attr=2, projid32bit=1
> > > > >          =                       crc=1        finobt=0, sparse=0
> > > > > data     =                       bsize=4096   blocks=7864320, imaxpct=25
> > > > >          =                       sunit=0      swidth=0 blks
> > > > > naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
> > > > > log      =internal log           bsize=4096   blocks=3840, version=2
> > > > >          =                       sectsz=512   sunit=0 blks, lazy-count=1
> > > > > realtime =none                   extsz=4096   blocks=0, rtextents=0
> > > > > existing superblock read failed: Input/output error
> > > > > mkfs.xfs: pwrite64 failed: Input/output error
> > > > >
> > > > > Messages in /var/log/messages:
> > > > > Oct 16 14:01:44 localhost kernel: Dev sde: unable to read RDB block 0
> > > > > Oct 16 14:01:44 localhost kernel: sde: unable to read partition table
> > > > > Oct 16 14:02:17 localhost kernel: Dev sde: unable to read RDB block 0
> > > > > Oct 16 14:02:17 localhost kernel: sde: unable to read partition table
> > > > >
> > > > > We use Luminous ceph.
> > > > > What causes this error? How can I debug it? Any suggestion is appreciated.
> > > >
> > > > Please use the associated multipath device, not the raw block device.
> > >
> > > hi, Jason
> > > Thanks for your reply.
> > > The multipath device gives the same error as the raw block device.
> >
> > What does "multipath -ll" show?
>
> [root@node1 ~]# multipath -ll
> mpathf (36001405366100aeda2044f286329b57a) dm-2 LIO-ORG ,TCMU device
> size=30G features='0' hwhandler='0' wp=rw
> |-+- policy='service-time 0' prio=0 status=enabled
> | `- 13:0:0:0 sde 8:64 failed faulty running
> `-+- policy='service-time 0' prio=0 status=enabled
>   `- 14:0:0:0 sdf 8:80 failed faulty running
> [root@node1 ~]#
>
> I don't know if it is related to the fact that all our daemons run in docker while docker runs on KVM.
> > > > [1] https://docs.ceph.com/ceph-prs/30912/rbd/iscsi-initiator-linux/

-- 
Jason
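For reference, the ALUA device stanza the linked initiator guide prescribes for /etc/multipath.conf looks roughly like the following. This is reproduced from memory of the Ceph iSCSI initiator docs, so treat it as a sketch and verify against the linked page before deploying; with hwhandler='0' in the `multipath -ll` output above, no hardware handler is in effect at all.

```
devices {
        device {
                vendor                 "LIO-ORG"
                product                "TCMU device"
                hardware_handler       "1 alua"
                path_grouping_policy   "failover"
                path_selector          "queue-length 0"
                failback               60
                path_checker           tur
                prio                   alua
                prio_args              exclusive_pref_bit
                fast_io_fail_tmo       25
                no_path_retry          queue
        }
}
```

After editing, reload with `systemctl reload multipathd` and re-check `multipath -ll`; the hwhandler field should then show '1 alua' and prio should be non-zero on the active path.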
[ceph-users] OSD PGs are not being removed - Full OSD issues
This is related to https://tracker.ceph.com/issues/42341 and to http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-October/037017.html

After closer inspection yesterday we found that PGs are not being removed from OSDs, which then leads to nearfull errors and explains why reweights don't work. This is a BIG issue because I have to constantly intervene manually to keep the cluster from dying.

Version 14.2.4. Fresh setup, all defaults. The PG balancer is turned off now; I begin to wonder if it's at fault.

My crush map: https://termbin.com/3t8l

What was mentioned is that the bucket weights are WEIRD. I never touched them. The unusual crush weights are on the nearfull osd.53, and some are set to 10 from a previous manual intervention.

The PGs not being purged is one issue; the original issue is why on earth ceph fills ONLY my nearfull OSDs in the first place. It seems to always select the fullest OSD to write more data onto. If I reweight it, it starts alerting for another almost-full OSD because it intends to write everything there, despite everything else being only at about 60%.

I don't know how to debug this; it's a MAJOR PITA. Hope someone has an idea, because I can't fight this 24/7 and I'm getting pretty tired of it.

Thanks
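To keep an eye on which OSDs are running away from the rest, a small filter over `ceph osd df` output helps. A hedged sketch: `ceph_osd_df` below is a stand-in function with made-up numbers so the pipeline runs anywhere; on a real cluster, pipe actual `ceph osd df` output (same column order assumed: ID, SIZE, USE, %USE) into the awk filter instead.

```shell
#!/bin/sh
# Sketch: flag OSDs above 80% utilization, the ones where reweights keep
# being needed. ceph_osd_df is a stand-in with made-up sample numbers;
# replace it with real `ceph osd df` output on a live cluster.
ceph_osd_df() {
cat <<'EOF'
ID  SIZE  USE   %USE
0   1.0   0.60  60.1
53  1.0   0.86  86.4
7   1.0   0.59  59.8
EOF
}

# print any OSD whose 4th column (%USE) exceeds 80
ceph_osd_df | awk 'NR>1 && $4+0 > 80 { print "osd." $1 " nearfull at " $4 "%" }'
# -> osd.53 nearfull at 86.4%
```

Running this periodically (against the real command) makes it easy to see whether the same OSD keeps re-filling after a reweight, which is the symptom described above.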