Thanks for the advice.

I dumped the filesystem contents, then deleted the cephfs, deleted the
pools, and recreated from scratch.

I did not capture the specific ceph-fuse failure, sorry; it gave an
'endpoint disconnected' message. I will be sure to next time.

After the dump and recreate, all was good. Until... I now have a file with
a slightly different symptom. I can stat it, but not read it:

don@nubo-2:~$ cat .profile
cat: .profile: Input/output error
don@nubo-2:~$ stat .profile
  File: ‘.profile’
  Size: 675             Blocks: 2          IO Block: 4194304 regular file
Device: 0h/0d   Inode: 1099511687525  Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/     don)   Gid: ( 1000/     don)
Access: 2015-12-04 05:08:35.247603061 +0000
Modify: 2015-12-04 05:08:35.247603061 +0000
Change: 2015-12-04 05:13:29.395252968 +0000
 Birth: -
don@nubo-2:~$ sum .profile
sum: .profile: Input/output error
don@nubo-2:~$ ls -il .profile
1099511687525 -rw-r--r-- 1 don don 675 Dec  4 05:08 .profile
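
For what it's worth, a check I could still run (illustrative only: it
assumes the CephFS data pool is named 'data', and 1000000e965 is just the
decimal inode 1099511687525 above written in hex) to see whether the
file's first backing object still exists in RADOS:

$ printf '%x\n' 1099511687525        # prints 1000000e965
$ rados -p data stat 1000000e965.00000000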

Would this be a similar problem? Should I give up on cephfs? It's been
working fine for me for some time, but two errors in four days make me
very nervous.


On 4 December 2015 at 08:16, Yan, Zheng <uker...@gmail.com> wrote:

> On Fri, Dec 4, 2015 at 10:39 AM, Don Waterloo <don.water...@gmail.com>
> wrote:
> > I have a file which is untouchable: ls -i gives an error, stat gives an
> > error. It shows ??? for all fields except the name.
> >
> > How do i clean this up?
> >
>
> The safest way to clean this up is to create a new directory, move the
> rest of the files into the new directory, move the old directory
> somewhere you don't touch, and replace the old directory with the new
> directory.
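>
> For illustration, a minimal sketch of that sequence (the paths are
> placeholders: assume the filesystem is mounted at /mnt/cephfs and the
> untouchable file lives in /mnt/cephfs/somedir):
>
> #mkdir /mnt/cephfs/somedir.new
> #mv /mnt/cephfs/somedir/* /mnt/cephfs/somedir.new/
> (the untouchable entry will fail to move and stays behind; note that '*'
> does not match dot-files, so move those explicitly)
> #mkdir -p /mnt/cephfs/quarantine
> #mv /mnt/cephfs/somedir /mnt/cephfs/quarantine/
> #mv /mnt/cephfs/somedir.new /mnt/cephfs/somedir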
>
>
> If you are still uncomfortable with that, you can use 'rados -p metadata
> rmomapkey ...' to forcibly remove the corrupted file.
>
> first, flush the journal
> #ceph daemon mds.nubo-2 flush journal
>
> find the inode number of the directory that contains the corrupted file
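>
> one way to get the inode, assuming the directory is still reachable
> through the mount (the path below is a placeholder; stat prints the
> decimal inode and printf converts it to hex):
> #printf '%x\n' $(stat -c %i /mnt/cephfs/somedir)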
>
> #rados -p metadata listomapkeys <dir inode number in hex>.00000000
>
> the output should include the name (with suffix _head) of the corrupted file
>
> #rados -p metadata rmomapkey <dir inode number in hex>.00000000
> <omapkey for the corrupted file>
>
> now the file is deleted, but the directory becomes un-deletable. you
> can fix the directory as follows:
>
> make sure the 'mds verify scatter' config is disabled
> #ceph daemon mds.nubo-2 config set mds_verify_scatter 0
>
> fragment the directory
> #ceph mds tell 0 fragment_dir <path of the un-deletable directory in
> the FS>  '0/0' 1
>
> create a file in the directory
> #touch <path of the un-deletable directory>/foo
>
> the above two steps will fix the directory's stats; now you can delete the directory
> #rm -rf <path of the un-deletable directory>
>
>
> > I'm on Ubuntu 15.10, running 0.94.5
> > # ceph -v
> > ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
> >
> > the node that accessed the file then caused a problem with the mds:
> >
> > root@nubo-1:/home/git/go/src/github.com/gogits/gogs# ceph status
> >     cluster b23abffc-71c4-4464-9449-3f2c9fbe1ded
> >      health HEALTH_WARN
> >             mds0: Client nubo-1 failing to respond to capability release
> >      monmap e1: 3 mons at
> > {nubo-1=
> 10.100.10.60:6789/0,nubo-2=10.100.10.61:6789/0,nubo-3=10.100.10.62:6789/0}
> >             election epoch 906, quorum 0,1,2 nubo-1,nubo-2,nubo-3
> >      mdsmap e418: 1/1/1 up {0=nubo-2=up:active}, 2 up:standby
> >      osdmap e2081: 6 osds: 6 up, 6 in
> >       pgmap v95696: 560 pgs, 6 pools, 131 GB data, 97784 objects
> >             265 GB used, 5357 GB / 5622 GB avail
> >                  560 active+clean
> >
> > Trying a different node, I see the same problem.
> >
> > I'm getting this error dumped to dmesg:
> >
> > [670243.421212] Workqueue: ceph-msgr con_work [libceph]
> > [670243.421213]  0000000000000000 00000000e800e516 ffff8810cd68f9d8
> > ffffffff817e8c09
> > [670243.421215]  0000000000000000 0000000000000000 ffff8810cd68fa18
> > ffffffff8107b3c6
> > [670243.421217]  ffff8810cd68fa28 00000000ffffffea 0000000000000000
> > 0000000000000000
> > [670243.421218] Call Trace:
> > [670243.421221]  [<ffffffff817e8c09>] dump_stack+0x45/0x57
> > [670243.421223]  [<ffffffff8107b3c6>] warn_slowpath_common+0x86/0xc0
> > [670243.421225]  [<ffffffff8107b4fa>] warn_slowpath_null+0x1a/0x20
> > [670243.421229]  [<ffffffffc06ebb1c>] fill_inode.isra.18+0xc5c/0xc90
> [ceph]
> > [670243.421233]  [<ffffffff81217427>] ? inode_init_always+0x107/0x1b0
> > [670243.421236]  [<ffffffffc06e95e0>] ? ceph_mount+0x7e0/0x7e0 [ceph]
> > [670243.421241]  [<ffffffffc06ebe82>] ceph_fill_trace+0x332/0x910 [ceph]
> > [670243.421248]  [<ffffffffc0709db5>] handle_reply+0x525/0xb70 [ceph]
> > [670243.421255]  [<ffffffffc070cac8>] dispatch+0x3c8/0xbb0 [ceph]
> > [670243.421260]  [<ffffffffc069daeb>] con_work+0x57b/0x1770 [libceph]
> > [670243.421262]  [<ffffffff810b2d7b>] ? dequeue_task_fair+0x36b/0x700
> > [670243.421263]  [<ffffffff810b2141>] ? put_prev_entity+0x31/0x420
> > [670243.421265]  [<ffffffff81013689>] ? __switch_to+0x1f9/0x5c0
> > [670243.421267]  [<ffffffff8109412a>] process_one_work+0x1aa/0x440
> > [670243.421269]  [<ffffffff8109440b>] worker_thread+0x4b/0x4c0
> > [670243.421271]  [<ffffffff810943c0>] ? process_one_work+0x440/0x440
> > [670243.421273]  [<ffffffff810943c0>] ? process_one_work+0x440/0x440
> > [670243.421274]  [<ffffffff8109a7c8>] kthread+0xd8/0xf0
> > [670243.421276]  [<ffffffff8109a6f0>] ?
> kthread_create_on_node+0x1f0/0x1f0
> > [670243.421277]  [<ffffffff817efe1f>] ret_from_fork+0x3f/0x70
> > [670243.421279]  [<ffffffff8109a6f0>] ?
> kthread_create_on_node+0x1f0/0x1f0
> > [670243.421280] ---[ end trace 5cded7a882dfd5d1 ]---
> > [670243.421282] ceph: fill_inode badness ffff88179e2d9f28
> > 10000004e91.fffffffffffffffe
> >
> > this problem persisted through a reboot, and there is no fsck to help me.
> >
> > I also tried with ceph-fuse, but it crashes when I access the file.
>
> how did ceph-fuse crash? please send the backtrace to us.
>
> Regards
> Yan, Zheng
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
