Oh my god. The network configuration was the problem. After fixing the network
configuration, I successfully created CephFS. Thank you very much.
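For the record, the mismatch John identified below was in the [osd.10] through
[osd.15] sections of my ceph.conf, whose public_addr values sat on the
192.168.10.x network instead of the intended 192.168.40.x public network. A
minimal sketch of the kind of correction involved (shown for osd.10 only, and
assuming the 192.168.40.x address is the intended one):

  [osd.10]
  host = hpc9
  public_addr = 192.168.40.18    # was 192.168.10.18
  cluster_addr = 192.168.40.18

followed by restarting the affected OSDs (on a systemd-based install, something
like 'systemctl restart ceph-osd@10' on the host that owns that OSD). The
commands for spotting this kind of mismatch are sketched after the quoted
thread below.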
-Kisik

On Tue, Oct 16, 2018 at 9:58 PM, John Spray <jsp...@redhat.com> wrote:
> On Mon, Oct 15, 2018 at 7:15 PM Kisik Jeong <kisik.je...@csl.skku.edu> wrote:
> >
> > I attached the osd & fs dumps. There are clearly two pools (cephfs_data,
> > cephfs_metadata) for CephFS. And this system's network is 40Gbps ethernet
> > for both public & cluster, so I don't think network speed would be the
> > problem. Thank you.
>
> Ah, your pools do exist; I had just been looking at the start of the
> MDS log, where it hadn't seen the osdmap yet.
>
> Looking again at your original log together with your osdmap, I notice
> that your stuck operations are targeting OSDs 10, 11, 13, 14, 15, and all
> of these OSDs have public addresses in the 192.168.10.x range rather than
> the 192.168.40.x range like the others.
>
> So my guess would be that you are intending your OSDs to be in the
> 192.168.40.x range, but are missing some config settings for certain
> daemons.
>
> John
>
> > On Tue, Oct 16, 2018 at 1:18 AM, John Spray <jsp...@redhat.com> wrote:
> >>
> >> On Mon, Oct 15, 2018 at 4:24 PM Kisik Jeong <kisik.je...@csl.skku.edu> wrote:
> >> >
> >> > Thank you for your reply, John.
> >> >
> >> > I restarted my Ceph cluster and captured the mds logs.
> >> >
> >> > I found that the mds shows slow requests because some OSDs are laggy.
> >> >
> >> > I followed the Ceph mds troubleshooting for 'mds slow request', but
> >> > there is no operation in flight:
> >> >
> >> > root@hpc1:~/iodc# ceph daemon mds.hpc1 dump_ops_in_flight
> >> > {
> >> >     "ops": [],
> >> >     "num_ops": 0
> >> > }
> >> >
> >> > Is there any other reason that the mds shows slow requests? Thank you.
> >>
> >> Those requests seem to be stuck because they're targeting pools
> >> that don't exist. Has something strange happened in the history of
> >> this cluster that might have left a filesystem referencing pools that
> >> no longer exist? Ceph is not supposed to permit removal of pools in
> >> use by CephFS, but perhaps something went wrong.
> >>
> >> Check out the "ceph osd dump --format=json-pretty" and "ceph fs dump
> >> --format=json-pretty" outputs and how the pool IDs relate. According
> >> to those logs, the data pool with ID 1 and the metadata pool with ID 2
> >> do not exist.
> >>
> >> John
> >>
> >> > -Kisik
> >> >
> >> > On Mon, Oct 15, 2018 at 11:43 PM, John Spray <jsp...@redhat.com> wrote:
> >> >>
> >> >> On Mon, Oct 15, 2018 at 3:34 PM Kisik Jeong <kisik.je...@csl.skku.edu> wrote:
> >> >> >
> >> >> > Hello,
> >> >> >
> >> >> > I successfully deployed a Ceph cluster with 16 OSDs and created
> >> >> > CephFS before. But after rebooting due to an mds slow request
> >> >> > problem, when creating CephFS, the Ceph mds goes into creating
> >> >> > status and never changes. Looking at the Ceph status, I don't
> >> >> > think there is any other problem. Here is the 'ceph -s' result:
> >> >>
> >> >> That's pretty strange. Usually if an MDS is stuck in "creating", it's
> >> >> because an OSD operation is stuck, but in your case all your PGs are
> >> >> healthy.
> >> >>
> >> >> I would suggest setting "debug mds=20" and "debug objecter=10" on
> >> >> your MDS, restarting it and capturing those logs so that we can see
> >> >> where it got stuck.
> >> >>
> >> >> John
> >> >>
> >> >> > csl@hpc1:~$ ceph -s
> >> >> >   cluster:
> >> >> >     id:     1a32c483-cb2e-4ab3-ac60-02966a8fd327
> >> >> >     health: HEALTH_OK
> >> >> >
> >> >> >   services:
> >> >> >     mon: 1 daemons, quorum hpc1
> >> >> >     mgr: hpc1(active)
> >> >> >     mds: cephfs-1/1/1 up {0=hpc1=up:creating}
> >> >> >     osd: 16 osds: 16 up, 16 in
> >> >> >
> >> >> >   data:
> >> >> >     pools:   2 pools, 640 pgs
> >> >> >     objects: 7 objects, 124B
> >> >> >     usage:   34.3GiB used, 116TiB / 116TiB avail
> >> >> >     pgs:     640 active+clean
> >> >> >
> >> >> > However, CephFS still works in the case of 8 OSDs.
> >> >> >
> >> >> > If you have any idea about this phenomenon, please let me know.
> >> >> > Thank you.
> >> >> >
> >> >> > PS. I attached my ceph.conf contents:
> >> >> >
> >> >> > [global]
> >> >> > fsid = 1a32c483-cb2e-4ab3-ac60-02966a8fd327
> >> >> > mon_initial_members = hpc1
> >> >> > mon_host = 192.168.40.10
> >> >> > auth_cluster_required = cephx
> >> >> > auth_service_required = cephx
> >> >> > auth_client_required = cephx
> >> >> >
> >> >> > public_network = 192.168.40.0/24
> >> >> > cluster_network = 192.168.40.0/24
> >> >> >
> >> >> > [osd]
> >> >> > osd journal size = 1024
> >> >> > osd max object name len = 256
> >> >> > osd max object namespace len = 64
> >> >> > osd mount options f2fs = active_logs=2
> >> >> >
> >> >> > [osd.0]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.40.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.1]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.40.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.2]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.40.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.3]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.40.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.4]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.40.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.5]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.40.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.6]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.40.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.7]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.40.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.8]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.40.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.9]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.40.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.10]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.10.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.11]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.10.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.12]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.10.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.13]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.10.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > [osd.14]
> >> >> > host = hpc9
> >> >> > public_addr = 192.168.10.18
> >> >> > cluster_addr = 192.168.40.18
> >> >> >
> >> >> > [osd.15]
> >> >> > host = hpc10
> >> >> > public_addr = 192.168.10.19
> >> >> > cluster_addr = 192.168.40.19
> >> >> >
> >> >> > --
> >> >> > Kisik Jeong
> >> >> > Ph.D. Student
> >> >> > Computer Systems Laboratory
> >> >> > Sungkyunkwan University
> >> >
> >> > --
> >> > Kisik Jeong
> >> > Ph.D. Student
> >> > Computer Systems Laboratory
> >> > Sungkyunkwan University
> >
> > --
> > Kisik Jeong
> > Ph.D. Student
> > Computer Systems Laboratory
> > Sungkyunkwan University
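For anyone who hits the same symptom, the checks John suggests above can be run
roughly as follows. This is only a sketch: it assumes jq is installed, and the
JSON field names (osds[].public_addr, filesystems[].mdsmap.data_pools, etc.)
are taken from a Luminous-era release like the one used here.

  # List each OSD's registered addresses to spot daemons on the wrong subnet
  ceph osd dump --format=json-pretty | jq '.osds[] | {osd, public_addr, cluster_addr}'

  # Compare the pool IDs the filesystem references with the pools that exist
  ceph fs dump --format=json-pretty | jq '.filesystems[].mdsmap | {data_pools, metadata_pool}'
  ceph osd pool ls detail

  # To capture the verbose MDS logs John asked for, add the following to the
  # [mds] section of ceph.conf on the MDS host and restart the MDS:
  #   debug mds = 20
  #   debug objecter = 10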
--
Kisik Jeong
Ph.D. Student
Computer Systems Laboratory
Sungkyunkwan University