Thanks for the advice. I dumped the filesystem contents, then deleted the cephfs, deleted the pools, and recreated from scratch.
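For anyone hitting the same thing, the delete-and-recreate path looks roughly like this. This is a sketch only: the filesystem/pool names, pg counts, and backup destination are illustrative, and the flags are the hammer-era (0.94.x) syntax.

```shell
# Copy the data out first (destination is illustrative):
rsync -a /mnt/cephfs/ /backup/cephfs-dump/

# Take the MDS cluster down and remove the filesystem:
ceph mds cluster_down
ceph mds fail 0
ceph fs rm cephfs --yes-i-really-mean-it

# Delete the old pools (the name must be given twice, plus the safety flag):
ceph osd pool delete cephfs_data cephfs_data --yes-i-really-really-mean-it
ceph osd pool delete cephfs_metadata cephfs_metadata --yes-i-really-really-mean-it

# Recreate the pools and the filesystem, then copy the data back:
ceph osd pool create cephfs_metadata 64
ceph osd pool create cephfs_data 64
ceph fs new cephfs cephfs_metadata cephfs_data
```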
I did not track the specific issue in fuse, sorry. It gave an "endpoint disconnected" message. I will next time for sure. After the dump and recreate, all was good. Until... I now have a file with a slightly different symptom: I can stat it, but not read it:

don@nubo-2:~$ cat .profile
cat: .profile: Input/output error
don@nubo-2:~$ stat .profile
  File: ‘.profile’
  Size: 675        Blocks: 2          IO Block: 4194304 regular file
Device: 0h/0d      Inode: 1099511687525  Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/     don)   Gid: ( 1000/     don)
Access: 2015-12-04 05:08:35.247603061 +0000
Modify: 2015-12-04 05:08:35.247603061 +0000
Change: 2015-12-04 05:13:29.395252968 +0000
 Birth: -
don@nubo-2:~$ sum .profile
sum: .profile: Input/output error
don@nubo-2:~$ ls -il .profile
1099511687525 -rw-r--r-- 1 don don 675 Dec  4 05:08 .profile

Would this be a similar problem? Should I give up on cephfs? It has been working fine for me for some time, but two errors in four days makes me very nervous.

On 4 December 2015 at 08:16, Yan, Zheng <uker...@gmail.com> wrote:
> On Fri, Dec 4, 2015 at 10:39 AM, Don Waterloo <don.water...@gmail.com> wrote:
> > i have a file which is untouchable: ls -i gives an error, stat gives an
> > error. it shows ??? for all fields except name.
> >
> > How do i clean this up?
>
> The safest way to clean this up is to create a new directory, move the
> remaining files into the new directory, move the old directory somewhere
> you don't touch, and replace the old directory with the new directory.
>
> If you are still uncomfortable with that, you can use 'rados -p metadata
> rmomapkey ...' to forcibly remove the corrupted file.
>
> First flush the journal:
>
> #ceph daemon mds.nubo-2 flush journal
>
> Then find the inode number of the directory which contains the corrupted
> file, and list its entries:
>
> #rados -p metadata listomapkeys <dir inode number in hex>.00000000
>
> The output should include the name (with suffix _head) of the corrupted
> file. Remove that key:
>
> #rados -p metadata rmomapkey <dir inode number in hex>.00000000 <omap key for the corrupted file>
>
> Now the file is deleted, but the directory becomes un-deletable. You can
> fix the directory as follows.
>
> Make sure the 'mds verify scatter' config is disabled:
>
> #ceph daemon mds.nubo-2 config set mds_verify_scatter 0
>
> Fragment the directory:
>
> #ceph mds tell 0 fragment_dir <path of the un-deletable directory in the FS> '0/0' 1
>
> Create a file in the directory:
>
> #touch <path of the un-deletable directory>/foo
>
> The above two steps fix the directory's stats; now you can delete the
> directory:
>
> #rm -rf <path of the un-deletable directory>
>
> > I'm on ubuntu 15.10, running 0.94.5
> >
> > # ceph -v
> > ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> >
> > the node that accessed the file then caused a problem with mds:
> >
> > root@nubo-1:/home/git/go/src/github.com/gogits/gogs# ceph status
> >     cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
> >      health HEALTH_WARN
> >             mds0: Client nubo-1 failing to respond to capability release
> >      monmap e1: 3 mons at {nubo-1=10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
> >             election epoch 906, quorum 0,1,2 nubo-1,nubo-2,nubo-3
> >      mdsmap e418: 1/1/1 up {0=nubo-2=up:active}, 2 up:standby
> >      osdmap e2081: 6 osds: 6 up, 6 in
> >       pgmap v95696: 560 pgs, 6 pools, 131 GB data, 97784 objects
> >             265 GB used, 5357 GB / 5622 GB avail
> >                  560 active+clean
> >
> > Trying a different node, i see the same problem.
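As a worked example of the object name used in the procedure above: the inode you need is the *parent directory's* (obtainable with `stat -c %i`), not the corrupted file's. The file inode 1099511687525 from my stat output is used below only to illustrate the decimal-to-hex conversion; the pool name `metadata` is taken from Yan's commands.

```shell
# Get the inode of the directory containing the corrupted file
# (the home directory here, since .profile lives there):
dir_ino=$(stat -c %i "$HOME")          # decimal inode number

# The metadata-pool object for a directory is "<inode-in-hex>.00000000":
obj=$(printf '%x.00000000' "$dir_ino")

# List the omap keys (directory entries); the corrupted entry ends in "_head":
# rados -p metadata listomapkeys "$obj"

# The conversion itself, using the file inode from the stat output above:
printf '%x.00000000\n' 1099511687525   # -> 1000000e965.00000000
```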
> >
> > I'm getting this error dumped to dmesg:
> >
> > [670243.421212] Workqueue: ceph-msgr con_work [libceph]
> > [670243.421213] 0000000000000000 00000000e800e516 ffff8810cd68f9d8 ffffffff817e8c09
> > [670243.421215] 0000000000000000 0000000000000000 ffff8810cd68fa18 ffffffff8107b3c6
> > [670243.421217] ffff8810cd68fa28 00000000ffffffea 0000000000000000 0000000000000000
> > [670243.421218] Call Trace:
> > [670243.421221] [<ffffffff817e8c09>] dump_stack+0x45/0x57
> > [670243.421223] [<ffffffff8107b3c6>] warn_slowpath_common+0x86/0xc0
> > [670243.421225] [<ffffffff8107b4fa>] warn_slowpath_null+0x1a/0x20
> > [670243.421229] [<ffffffffc06ebb1c>] fill_inode.isra.18+0xc5c/0xc90 [ceph]
> > [670243.421233] [<ffffffff81217427>] ? inode_init_always+0x107/0x1b0
> > [670243.421236] [<ffffffffc06e95e0>] ? ceph_mount+0x7e0/0x7e0 [ceph]
> > [670243.421241] [<ffffffffc06ebe82>] ceph_fill_trace+0x332/0x910 [ceph]
> > [670243.421248] [<ffffffffc0709db5>] handle_reply+0x525/0xb70 [ceph]
> > [670243.421255] [<ffffffffc070cac8>] dispatch+0x3c8/0xbb0 [ceph]
> > [670243.421260] [<ffffffffc069daeb>] con_work+0x57b/0x1770 [libceph]
> > [670243.421262] [<ffffffff810b2d7b>] ? dequeue_task_fair+0x36b/0x700
> > [670243.421263] [<ffffffff810b2141>] ? put_prev_entity+0x31/0x420
> > [670243.421265] [<ffffffff81013689>] ? __switch_to+0x1f9/0x5c0
> > [670243.421267] [<ffffffff8109412a>] process_one_work+0x1aa/0x440
> > [670243.421269] [<ffffffff8109440b>] worker_thread+0x4b/0x4c0
> > [670243.421271] [<ffffffff810943c0>] ? process_one_work+0x440/0x440
> > [670243.421273] [<ffffffff810943c0>] ? process_one_work+0x440/0x440
> > [670243.421274] [<ffffffff8109a7c8>] kthread+0xd8/0xf0
> > [670243.421276] [<ffffffff8109a6f0>] ? kthread_create_on_node+0x1f0/0x1f0
> > [670243.421277] [<ffffffff817efe1f>] ret_from_fork+0x3f/0x70
> > [670243.421279] [<ffffffff8109a6f0>] ? kthread_create_on_node+0x1f0/0x1f0
> > [670243.421280] ---[ end trace 5cded7a882dfd5d1 ]---
> > [670243.421282] ceph: fill_inode badness ffff88179e2d9f28 10000004e91.fffffffffffffffe
> >
> > this problem persisted through a reboot, and there is no fsck to help me.
> >
> > I also tried with ceph-fuse, but it crashes when I access the file.
>
> How did ceph-fuse crash? Please send the backtrace to us.
>
> Regards
> Yan, Zheng
>
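To capture the backtrace Yan asks for the next time ceph-fuse crashes, something along these lines should work (a sketch; the mountpoint is illustrative):

```shell
# Allow core dumps, then run ceph-fuse in the foreground so the crash
# is tied to this shell's core-dump settings:
ulimit -c unlimited
ceph-fuse -f /mnt/cephfs        # -f: stay in the foreground

# After reproducing the crash, pull the backtrace out of the core file:
gdb -batch -ex 'thread apply all bt' "$(which ceph-fuse)" core
```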
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com