Hi John,
Seems like you're right... strange that it appeared to work with only one MDS
before I shut the cluster down. Here is the `ceph fs get` output for the two
file systems:
[root@carf-ceph-osd15 ~]# ceph fs get carf_ceph_kube01
Filesystem 'carf_ceph_kube01' (2)
fs_name carf_ceph_kube01
epoch 22
flags 8
created 2017-08-21 12:10:57.948579
modified 2017-08-21 12:10:57.948579
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 1218
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses
versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
max_mds 1
in 0
up {}
failed 0
damaged
stopped
data_pools [23]
metadata_pool 24
inline_data disabled
balancer
standby_count_wanted 0
[root@carf-ceph-osd15 ~]#
[root@carf-ceph-osd15 ~]# ceph fs get carf_ceph02
Filesystem 'carf_ceph02' (1)
fs_name carf_ceph02
epoch 26
flags 8
created 2017-08-18 14:20:50.152054
modified 2017-08-18 14:20:50.152054
tableserver 0
root 0
session_timeout 60
session_autoclose 300
max_file_size 1099511627776
last_failure 0
last_failure_osd_epoch 1198
compat compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds uses
versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
max_mds 1
in 0
up {0=474299}
failed
damaged
stopped
data_pools [21]
metadata_pool 22
inline_data disabled
balancer
standby_count_wanted 0
474299: 7.128.13.69:6800/304042158 'carf-ceph-osd15' mds.0.23 up:active seq 5
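Comparing the two outputs: carf_ceph_kube01 shows "up {}" with no MDS in the
map, while carf_ceph02 has mds.0 up:active on carf-ceph-osd15, so it does look
like I have two file systems but only one MDS daemon. If that's the problem, I
assume the fix is just to start a second MDS on another node, roughly like this
(the second host name here is hypothetical, and this assumes an MDS has already
been provisioned there with its keyring):

[root@carf-ceph-osd16 ~]# systemctl start ceph-mds@carf-ceph-osd16
[root@carf-ceph-osd15 ~]# ceph mds stat

after which "ceph mds stat" should report an active MDS for each file system.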
I also tried specifying the mds_namespace option on the mount command
(http://docs.ceph.com/docs/master/cephfs/kernel/), but it doesn't seem to be
accepted:
[ceph-admin@carf-ceph-osd04 ~]$ sudo mount -t ceph carf-ceph-osd15:6789:/
/mnt/carf_ceph02/ -o
mds_namespace=carf_ceph02,name=cephfs.k8test,secretfile=k8test.secret
mount error 22 = Invalid argument
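From what I can tell, error 22 (EINVAL) here probably just means the kernel
client on this node is too old to recognize the mds_namespace option (I believe
support was only added around kernel 4.8). If so, ceph-fuse may work instead,
since it takes the file system name as a client-side option; something like
this (client name and mount point carried over from the attempt above, with the
keyring assumed to be in the default /etc/ceph location):

[ceph-admin@carf-ceph-osd04 ~]$ sudo ceph-fuse --id cephfs.k8test --client_mds_namespace=carf_ceph02 /mnt/carf_ceph02/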
Thanks,
-Bryan
-----Original Message-----
From: John Spray [mailto:[email protected]]
Sent: Tuesday, August 22, 2017 11:18 AM
To: Bryan Banister <[email protected]>
Cc: [email protected]
Subject: Re: [ceph-users] Help with file system with failed mds daemon
On Tue, Aug 22, 2017 at 4:58 PM, Bryan Banister
<[email protected]<mailto:[email protected]>> wrote:
> Hi all,
>
>
>
> I’m still new to ceph and cephfs. Trying out the multi-fs configuration on
> a Luminous test cluster. I shut down the cluster to do an upgrade, and when
> I brought the cluster back up I got a warning that one of the file
> systems has a failed mds daemon:
>
>
>
> 2017-08-21 17:00:00.000081 mon.carf-ceph-osd15 [WRN] overall HEALTH_WARN 1
> filesystem is degraded; 1 filesystem is have a failed mds daemon; 1 pools
> have many more objects per pg than average; application not enabled on 9
> pool(s)
>
>
>
> I tried restarting the mds service on the system and it doesn’t seem to
> indicate any problems:
>
> 2017-08-21 16:13:40.979449 7fffed8b0700 1 mds.0.20 shutdown: shutting down
> rank 0
>
> 2017-08-21 16:13:41.012167 7ffff7fde1c0 0 set uid:gid to 167:167
> (ceph:ceph)
>
> 2017-08-21 16:13:41.012180 7ffff7fde1c0 0 ceph version 12.1.4
> (a5f84b37668fc8e03165aaf5cbb380c78e4deba4) luminous (rc), process (unknown),
> pid 16656
>
> 2017-08-21 16:13:41.014105 7ffff7fde1c0 0 pidfile_write: ignore empty
> --pid-file
>
> 2017-08-21 16:13:45.541442 7ffff10b7700 1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:45.541449 7ffff10b7700 1 mds.0.23 handle_mds_map state
> change up:boot --> up:replay
>
> 2017-08-21 16:13:45.541459 7ffff10b7700 1 mds.0.23 replay_start
>
> 2017-08-21 16:13:45.541466 7ffff10b7700 1 mds.0.23 recovery set is
>
> 2017-08-21 16:13:45.541475 7ffff10b7700 1 mds.0.23 waiting for osdmap 1198
> (which blacklists prior instance)
>
> 2017-08-21 16:13:45.565779 7fffea8aa700 0 mds.0.cache creating system inode
> with ino:0x100
>
> 2017-08-21 16:13:45.565920 7fffea8aa700 0 mds.0.cache creating system inode
> with ino:0x1
>
> 2017-08-21 16:13:45.571747 7fffe98a8700 1 mds.0.23 replay_done
>
> 2017-08-21 16:13:45.571751 7fffe98a8700 1 mds.0.23 making mds journal
> writeable
>
> 2017-08-21 16:13:46.542148 7ffff10b7700 1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:46.542149 7ffff10b7700 1 mds.0.23 handle_mds_map state
> change up:replay --> up:reconnect
>
> 2017-08-21 16:13:46.542158 7ffff10b7700 1 mds.0.23 reconnect_start
>
> 2017-08-21 16:13:46.542161 7ffff10b7700 1 mds.0.23 reopen_log
>
> 2017-08-21 16:13:46.542171 7ffff10b7700 1 mds.0.23 reconnect_done
>
> 2017-08-21 16:13:47.543612 7ffff10b7700 1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:47.543616 7ffff10b7700 1 mds.0.23 handle_mds_map state
> change up:reconnect --> up:rejoin
>
> 2017-08-21 16:13:47.543623 7ffff10b7700 1 mds.0.23 rejoin_start
>
> 2017-08-21 16:13:47.543638 7ffff10b7700 1 mds.0.23 rejoin_joint_start
>
> 2017-08-21 16:13:47.543666 7ffff10b7700 1 mds.0.23 rejoin_done
>
> 2017-08-21 16:13:48.544768 7ffff10b7700 1 mds.0.23 handle_mds_map i am now
> mds.0.23
>
> 2017-08-21 16:13:48.544771 7ffff10b7700 1 mds.0.23 handle_mds_map state
> change up:rejoin --> up:active
>
> 2017-08-21 16:13:48.544779 7ffff10b7700 1 mds.0.23 recovery_done --
> successful recovery!
>
> 2017-08-21 16:13:48.544924 7ffff10b7700 1 mds.0.23 active_start
>
> 2017-08-21 16:13:48.544954 7ffff10b7700 1 mds.0.23 cluster recovered.
>
>
>
> This seems like an easy problem to fix. Any help is greatly appreciated!
I wonder if you have two filesystems but only one MDS? Ceph will then
think that the second filesystem "has a failed MDS" because there
isn't an MDS online to service it.
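Something like "ceph mds stat" or "ceph fs status" should confirm it: with two
filesystems you need at least two MDS daemons, so that one can go active for
each (plus ideally a standby).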
John
>
> -Bryan
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com