Response inline
On 01/28/2016 07:28 AM, Ronny Adsetts wrote:
Hi all,
Have an issue I'm having trouble explaining or getting to the bottom of. I have
a two node, two brick replicated 75G volume containing ~4661 files:
gotham:~# gluster volume status software
Status of volume: software
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick gotham.stor.graysofwestminster.co.uk:/data/gluste
rfs/software/brick1/brick 49152 Y 27296
Brick metropolis.stor.graysofwestminster.co.uk:/data/gl
usterfs/software/brick1/brick 49152 Y 30335
NFS Server on localhost 2049 Y 27309
Self-heal Daemon on localhost N/A Y 27316
NFS Server on metropolis.stor.graysofwestminster.co.uk 2049 Y 30348
Self-heal Daemon on metropolis.stor.graysofwestminster.
co.uk N/A Y 30355
Task Status of Volume software
------------------------------------------------------------------------------
There are no active volume tasks
gotham:~# gluster volume heal software info
Brick gotham.graysofwestminster.co.uk:/data/glusterfs/software/brick1/brick/
Number of entries: 0
Brick metropolis.graysofwestminster.co.uk:/data/glusterfs/software/brick1/brick/
Number of entries: 0
I have the volume mounted on both nodes at /stor/software. On one of the nodes,
there are lots of files showing with a link count of 0 which is leading to the
Windows software using the data having validation issues:
metropolis:~# ls -ltr /stor/software/win_patches/ | head
total 21585884
-rwxr--r-- 0 ainet Domain Admins 27816 Jan 4 2006 nullpatch.exe
-rwxrwxr-- 0 ronny Domain Admins 135477136 Oct 29 2006 W2KA_SP4_128_x86_ENU.exe
-rwxrwxr-- 0 ronny Domain Admins 432568 Oct 29 2006 MSXML26SP3.exe
-rwxrwxr-- 0 ronny Domain Admins 1070592 Oct 29 2006 msxml3sp7.msi
-rwxrwxr-- 0 ronny Domain Admins 711160 Oct 29 2006 Windows2000-KB842773.EXE
-rwxrwxr-- 0 ronny Domain Admins 760824 Oct 29 2006 Windows2000-KB842933.EXE
-rwxrwxr-- 0 ronny Domain Admins 2585872 Oct 29 2006 WindowsInstaller-KB893803v2.exe
-rwxrwxr-- 0 ronny Domain Admins 106240 Oct 29 2006 Windows-KB870669.exe
-rwxrwxr-- 0 ronny Domain Admins 1260024 Oct 29 2006 Windows2000-KB908506.EXE
From the other node:
gotham:~# ls -ltr /stor/software/win_patches/ | head
total 21585884
-rwxr--r-- 1 1066 1016 27816 Jan 4 2006 nullpatch.exe
-rwxrwxr-- 1 1045 1016 135477136 Oct 29 2006 W2KA_SP4_128_x86_ENU.exe
-rwxrwxr-- 1 1045 1016 432568 Oct 29 2006 MSXML26SP3.exe
-rwxrwxr-- 1 1045 1016 1070592 Oct 29 2006 msxml3sp7.msi
-rwxrwxr-- 1 1045 1016 711160 Oct 29 2006 Windows2000-KB842773.EXE
-rwxrwxr-- 1 1045 1016 760824 Oct 29 2006 Windows2000-KB842933.EXE
-rwxrwxr-- 1 1045 1016 2585872 Oct 29 2006 WindowsInstaller-KB893803v2.exe
-rwxrwxr-- 1 1045 1016 106240 Oct 29 2006 Windows-KB870669.exe
-rwxrwxr-- 1 1045 1016 1260024 Oct 29 2006 Windows2000-KB908506.EXE
I'm seeing the following and lots more similar in the logs:
[2016-01-28 15:13:15.099478] W [client-rpc-fops.c:2772:client3_3_lookup_cbk]
0-software-client-1: remote operation failed: No such file or directory. Path:
/win_patches/SkypeSetup71732106.msi (3489f648-9fde-4bcc-b906-6ef88ffcf90f)
client-1 is brick2, metropolis. That error says the file is missing, but
it also shows the gfid. Perhaps the gfid hard link (under the brick's
.glusterfs tree) is missing.
Check the metadata for the file on both bricks:
getfattr -m . -d -e hex /data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
Ensure the gfid is the same on both. Check the number of links on both;
there should be 2. With a gfid of 3489f648-9fde-4bcc-b906-6ef88ffcf90f,
that file should be hard linked to
.glusterfs/34/89/3489f648-9fde-4bcc-b906-6ef88ffcf90f on both bricks.
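As a rough sketch of that check, something like the following could be run on each brick server. It takes the gfid exactly as it appears in the log line above (3489f648-9fde-4bcc-b906-6ef88ffcf90f), builds the expected path under .glusterfs from its first two pairs of hex characters, and compares inodes. The helper names gfid_path, same_inode and check_gfid_link are made up for illustration; they are not gluster commands.

```shell
# Map a gfid to its expected hard-link path, relative to the brick root.
# The first two and next two hex characters become the directory levels.
gfid_path() {
    gfid=$1
    d1=$(printf '%s' "$gfid" | cut -c1-2)
    d2=$(printf '%s' "$gfid" | cut -c3-4)
    printf '.glusterfs/%s/%s/%s\n' "$d1" "$d2" "$gfid"
}

# Succeed only if both paths exist and share an inode (i.e. are hard links).
same_inode() {
    a=$(ls -di "$1" 2>/dev/null | awk '{print $1}')
    b=$(ls -di "$2" 2>/dev/null | awk '{print $1}')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Compare a file on the brick with its gfid link on the same brick.
# Arguments: brick root, path relative to the brick, gfid.
check_gfid_link() {
    brick=$1 relpath=$2 gfid=$3
    if same_inode "$brick/$relpath" "$brick/$(gfid_path "$gfid")"; then
        echo "gfid link ok"
    else
        echo "gfid link missing or mismatched"
    fi
}
```

For example, on each of gotham and metropolis:

check_gfid_link /data/glusterfs/software/brick1/brick win_patches/SkypeSetup71732106.msi 3489f648-9fde-4bcc-b906-6ef88ffcf90f

This only looks at the brick directly, so it says nothing about what clients see through the mount.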
File does exist on both bricks:
metropolis:~# ls -l
/data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
-rwxr--r-- 1 ainet Domain Admins 44380160 Dec 31 11:41
/data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
gotham:~# ls -l
/data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
-rwxr--r-- 2 1066 1016 44380160 Dec 31 11:41
/data/glusterfs/software/brick1/brick/win_patches/SkypeSetup71732106.msi
Any pointers to get to the bottom of this would be greatly appreciated.
Look for log entries in the brick logs that correspond to the timestamps
on the client side and see if there's some correlation.
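One simple way to do that is to grep the brick log for the same second as the client warning. The helper below is only a sketch; log_at is a made-up name, and the brick log path in the usage line is an assumption based on the usual default location, so adjust it to wherever your brick logs actually live.

```shell
# Print lines from a log file whose bracketed timestamp starts with the
# given prefix, e.g. "2016-01-28 15:13:15" to match a single second.
log_at() {
    prefix=$1 logfile=$2
    # -F treats the pattern as a fixed string, so "[" is matched literally.
    grep -F "[$prefix" "$logfile"
}
```

For example, on metropolis (log path assumed):

log_at "2016-01-28 15:13:15" /var/log/glusterfs/bricks/data-glusterfs-software-brick1-brick.log

Shortening the prefix (for instance to "2016-01-28 15:13") widens the window to the whole minute.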
Thanks.
Ronny
_______________________________________________
Gluster-users mailing list
[email protected]
http://www.gluster.org/mailman/listinfo/gluster-users