Hi,
I had a chance to catch John Spray at the Ceph Day, and he suggested that I try
to reproduce this bug in Luminous.
To fix my immediate problem we discussed 2 ideas:
1. Manually edit the metadata. Unfortunately, I was not able to find any
information on how the metadata is structured :-(
2. Edit the code to set the link count to 0 if it is negative:
diff --git a/src/mds/StrayManager.cc b/src/mds/StrayManager.cc
index 9e53907..2ca1449 100644
--- a/src/mds/StrayManager.cc
+++ b/src/mds/StrayManager.cc
@@ -553,6 +553,10 @@ bool StrayManager::__eval_stray(CDentry *dn, bool delay)
     logger->set(l_mdc_num_strays_delayed, num_strays_delayed);
   }
+  if (in->inode.nlink < 0) {
+    in->inode.nlink=0;
+  }
+
   // purge?
   if (in->inode.nlink == 0) {
     // past snaprealm parents imply snapped dentry remote links.
diff --git a/src/xxHash b/src/xxHash
--- a/src/xxHash
+++ b/src/xxHash
@@ -1 +1 @@
I'm not sure whether this works. The patched mds no longer crashes, but I
expected that this value:
root@mds02:~ # ceph daemonperf mds.1
-----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
rlat inos caps|hsr  hcs  hcr |writ read actv|recd recy stry purg|segs evts subm|
   0 100k    0|  0    0    0 |   0    0    0|   0    0 625k    0|  30  25k    0
                                                       ^^^^
should go down, but it stays at 625k. Unfortunately, I don't have another system
to compare it with.
After I started the patched mds once, I reverted back to an unpatched mds, and it also
stopped crashing, so I guess it did "fix" something.
One question, just out of curiosity: I tried to log these events with something
like:
dout(10) << "Fixed negative inode count";
or
derr << "Fixed negative inode count";
But my compiler yelled at me for trying this.
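Looking at the existing log statements in the MDS sources, they all seem to end
with `<< dendl`, which my attempts above lacked, so that may be what the compiler
was complaining about. Below is a minimal stand-alone sketch of how such a macro
pair can work. The names my_dout, my_dendl, log_sink, log_level and fix_nlink are
made up for illustration; they are not Ceph's real definitions, and the real
macros do more than this.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-ins for a dout/dendl-style macro pair.
// These are NOT Ceph's real definitions, only an illustration.
static std::ostringstream log_sink;  // collects log lines for this sketch
static int log_level = 10;           // messages above this level are dropped

// The opening macro starts a block and a stream expression...
#define my_dout(v) \
  do { if ((v) <= log_level) { std::ostream& _out = log_sink; _out
// ...and the closing macro ends the line and closes that block.
// Leaving it out leaves the braces unbalanced, so the code does not compile.
#define my_dendl "\n"; } } while (0)

// Clamp a negative link count to zero and log that it happened.
int fix_nlink(int nlink) {
  if (nlink < 0) {
    my_dout(10) << "Fixed negative link count " << nlink << my_dendl;
    nlink = 0;
  }
  return nlink;
}
```

With this sketch, fix_nlink(-3) returns 0 and appends one line to log_sink, while
a non-negative value is returned unchanged and nothing is logged.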
Micha Krause