If diskfs_user_read_node returns an error, we need to clean up the node
and avoid a deadlock. This can happen if there is disk corruption, or if
one holds a stale ino (inode numbers can get repurposed on restart).
It is easy to reproduce this bug by placing a 'poison' in
diskfs_user_read_node for a particular ino and then trying to access the
corresponding file from the live system.
Before this fix:
- deadlock
With this fix:
- 'stat: cannot statx '<filename>': Input/output error'
---
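Note for reviewers: the 'poison' mentioned above can be sketched roughly
as follows. This is a standalone illustration, not code from the tree:
POISON_INO and poison_check are hypothetical names, and EIO is an
assumption chosen because it matches the 'Input/output error' reported
by stat. In a real test, the same comparison would sit at the top of the
translator's diskfs_user_read_node, using the node's inode number.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical inode number to poison; pick the ino of a scratch file.  */
#define POISON_INO ((ino_t) 12345)

/* Stand-in for the check placed in diskfs_user_read_node: fail with EIO
   (an assumed error code) for the poisoned ino, succeed for all others.  */
static int
poison_check (ino_t ino)
{
  if (ino == POISON_INO)
    return EIO;
  return 0;
}
```

Running stat on the file that owns the poisoned ino then exercises the
error path changed by this patch.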
libdiskfs/node-cache.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/libdiskfs/node-cache.c b/libdiskfs/node-cache.c
index 1ff19ade..d11e5866 100644
--- a/libdiskfs/node-cache.c
+++ b/libdiskfs/node-cache.c
@@ -112,7 +112,22 @@ diskfs_cached_lookup_context (ino_t inum, struct node **npp,
/* Get the contents of NP off disk. */
err = diskfs_user_read_node (np, ctx);
if (err)
+ {
+ pthread_rwlock_wrlock (&nodecache_lock);
+ hurd_ihash_remove (&nodecache, (hurd_ihash_key_t) &np->cache_id);
+ pthread_rwlock_unlock (&nodecache_lock);
+
+ /* Don't delete from disk. */
+ np->dn_stat.st_nlink = 1;
+ np->allocsize = 0;
+ np->dn_set_ctime = 0;
+ np->dn_set_atime = 0;
+ np->dn_set_mtime = 0;
+ diskfs_nput (np);
+ *npp = NULL;
+
return err;
+ }
else
{
*npp = np;
--
2.51.0