On 01/18/2012 02:24 AM, Pranith Kumar K wrote:
On 01/17/2012 05:54 PM, Dan Bretherton wrote:
Dear All-
I have been having problems with rebalance ... self-heal again with
Glusterfs version 3.2.5, this time related to "no gfid found"
errors. A fix-layout operation has stalled because errors like the
following are being reported for a large number of files.
[2012-01-17 10:48:02.138837] W [fuse-resolve.c:273:fuse_resolve_deep_cbk] 0-fuse: /users/mvc/WORK/ORCA1/ORCA1-MV01-DIMGPROC/RUNTMP_Exp61/ORCA1-MV01_2D_y2007m01d05.dimgproc.020: no gfid found
I thought GFID errors were being fixed in version 3.2.5. How can I
fix these errors to allow rebalance...fix-layout to run normally? I
am also very worried that the lack of GFID entries for files and
directories could stop file replication and other GlusterFS
operations from working properly. All comments and suggestions would
be much appreciated.
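
To get an idea of how many files are affected, the warnings could be pulled
out of the client log with a short script. This is only a rough sketch: it
assumes each warning occupies a single line in the log file, in the same form
as the message quoted above, and the function name paths_without_gfid is
made up for illustration. It only reads log text and does not touch the
volume.

```python
import re

# Assumed format, one warning per log line:
# [timestamp] W [fuse-resolve.c:LINE:FUNC] 0-fuse: PATH: no gfid found
NO_GFID_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+W\s+\[fuse-resolve\.c:\d+:\w+\]\s+\S+:\s+"
    r"(?P<path>/.*):\s+no gfid found\s*$"
)


def paths_without_gfid(lines):
    """Return the sorted, de-duplicated paths named in 'no gfid found' warnings."""
    paths = set()
    for line in lines:
        match = NO_GFID_RE.match(line)
        if match:
            paths.add(match.group("path"))
    return sorted(paths)


if __name__ == "__main__":
    # Sample input built from the warning quoted in this message.
    sample = [
        "[2012-01-17 10:48:02.138837] W "
        "[fuse-resolve.c:273:fuse_resolve_deep_cbk] 0-fuse: "
        "/users/mvc/WORK/ORCA1/ORCA1-MV01-DIMGPROC/RUNTMP_Exp61/"
        "ORCA1-MV01_2D_y2007m01d05.dimgproc.020: no gfid found",
    ]
    for path in paths_without_gfid(sample):
        print(path)
```

In practice the lines would come from the client log file rather than a
hard-coded list, for example by passing an open file object to the function.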
Regards
Dan.
_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
Dan,
We would like to reproduce this problem in-house. Could you give
more details on how you got into this situation?
Pranith
Hello Pranith,
The errors were probably the result of a server that became unresponsive
for a few hours and had to be restarted. When the server was not
responding properly it was still showing as Connected in the output of
"gluster peer status", but the load was growing quite large and it was
impossible to log on. I restarted the server and triggered a self-heal
operation on all the volumes in case any files had not been copied
correctly to the unresponsive server. Later on I noticed some layout
related error messages mentioning "anomalies", so I started a fix-layout
operation to correct them. The fix-layout didn't complete the first time
because of "no gfid found" errors, as I reported to the mailing list. A
couple of days later I stopped fix-layout and started it again on
another server, and that time it ran to completion. I then re-ran the
self-heal operation and didn't find any new layout errors. I don't know
if the second fix-layout attempt worked because it was performed on a
different server, or if the "no gfid found" errors had been corrected
automatically by GlusterFS in the days between the two fix-layout
attempts. Either way I am very relieved, and I apologise for the false
alarm. GlusterFS version 3.2.5 does appear to be able to correct GFID
errors automatically, although the process can apparently take a long time.
Regards
-Dan.