On 03/18/2013 10:12 PM, Marc Seeger wrote:
Sadly, we keep seeing those. The logs display the same pattern:

[2013-03-18 05:22:49.174382] I 
[afr-self-heal-common.c:1941:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 
0-replicate0: Non blocking entrylks failed.
[2013-03-18 05:22:49.174382] E 
[afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-replicate0: 
background  meta-data data entry missing-entry gfid self-heal failed on 
/home/vcltest256/.drush/vcltest256.aliases.drushrc.php.lock
[2013-03-18 05:22:49.174382] W [inode.c:914:inode_lookup] 
(-->/usr/lib/libglusterfs.so.0(default_lookup_cbk+0xc1) [0x7f14a4cd6d01] 
(-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xf198) [0x7f14a27fe198] 
(-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xeffb) [0x7f14a27fdffb]))) 
0-fuse: inode not found



On Mar 7, 2013, at 1:14 AM, Marc Seeger <[email protected]> wrote:

Almost forgot: These operations were done on a symlinked directory (/home is linked to /mnt/gfs/home where /mnt/gfs is the gluster mountpoint)

On Mar 7, 2013, at 1:08 AM, Marc Seeger <[email protected]> wrote:

Hey, in our testing we sometimes run into a problem with GlusterFS breaking.
The most recent occurrence involved two processes on two machines trying to stat 
the same lock file:
2013-03-06T16:41:27+00:00 daemon.notice : creating directory: 
dir=/home/vcltest464/.drush, user=10036, group=10036, mode=0700
2013-03-06T16:41:27+00:00 daemon.notice : PHP Warning:  stat(): stat failed for 
/home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock in [...].php on 
line 695
2013-03-06T16:41:27+00:00 daemon.notice : PHP Warning:  stat(): stat failed for 
/home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock in [...].php on 
line 695
2013-03-06T16:41:27+00:00 daemon.notice : PHP Warning:  stat(): stat failed for 
/home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock in [...].php on 
line 695
(This warning repeated a few thousand times; a bug in our locking code was triggered when the stat failed.)
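As an aside, a lock implementation built on stat()-then-create can both race between processes and spin indefinitely when stat() fails on a flaky network filesystem. The original PHP code isn't shown here, so the following is only an illustrative Python sketch of a more defensive pattern (the function names, retry count, and delay are assumptions, not anything from the thread): create the lock file atomically with O_CREAT|O_EXCL and bound the retries.

```python
import errno
import os
import time


def acquire_lock(path, retries=5, delay=0.1):
    """Try to take a lock by creating the lock file atomically.

    O_CREAT|O_EXCL cannot race with another process the way a
    stat()-then-create sequence can, and any I/O error other than
    "file exists" is raised immediately instead of being retried
    forever.
    """
    for _ in range(retries):
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
            os.close(fd)
            return True  # lock acquired
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise  # real I/O error: surface it, do not loop on it
            time.sleep(delay)  # lock held elsewhere; bounded retry
    return False  # give up after a fixed number of attempts


def release_lock(path):
    """Drop the lock; tolerate it already being gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

With a bounded retry count, a failure on the shared filesystem produces a clear error or a `False` return rather than thousands of repeated warnings.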
It ends up as something like this in the gluster log files:
[2013-03-06 16:34:36.875559] W [client3_1-fops.c:2457:client3_1_link_cbk] 
0-remote8: remote operation failed: File exists 
(00000000-0000-0000-0000-000000000000 -> /home/vcltest418/prod)
[2013-03-06 16:34:36.875559] W [client3_1-fops.c:2457:client3_1_link_cbk] 
0-remote7: remote operation failed: File exists 
(00000000-0000-0000-0000-000000000000 -> /home/vcltest418/prod)
[2013-03-06 16:36:24.809098] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 
0-remote8: remote operation failed: File exists. Path: /vcltest473/php_sessions 
(00000000-0000-0000-0000-000000000000)
[2013-03-06 16:36:24.809098] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 
0-remote7: remote operation failed: File exists. Path: /vcltest473/php_sessions 
(00000000-0000-0000-0000-000000000000)
[2013-03-06 16:36:24.809098] W [fuse-bridge.c:292:fuse_entry_cbk] 
0-glusterfs-fuse: 9061: MKDIR() /vcltest473/php_sessions => -1 (File exists)
[2013-03-06 16:36:26.179144] I 
[afr-self-heal-common.c:1189:afr_sh_missing_entry_call_impunge_recreate] 
0-replicate0: no missing files - 
/home/vcltest473/.drush/vcltest473.aliases.drushrc.php.lock. proceeding to 
metadata check
[2013-03-06 16:36:34.899435] I [afr-self-heal-entry.c:2333:afr_sh_entry_fix] 
0-replicate0: /vcltest473/livedev: Performing conservative merge
[2013-03-06 16:41:02.118580] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 
0-remote8: remote operation failed: File exists. Path: 
/vcltest723/files-private (00000000-0000-0000-0000-000000000000)
[2013-03-06 16:41:02.118580] W [client3_1-fops.c:327:client3_1_mkdir_cbk] 
0-remote7: remote operation failed: File exists. Path: 
/vcltest723/files-private (00000000-0000-0000-0000-000000000000)
[2013-03-06 16:41:02.118580] W [fuse-bridge.c:292:fuse_entry_cbk] 
0-glusterfs-fuse: 12435: MKDIR() /vcltest723/files-private => -1 (File exists)
[2013-03-06 16:41:27.179425] I 
[afr-self-heal-common.c:1941:afr_sh_post_nb_entrylk_conflicting_sh_cbk] 
0-replicate0: Non blocking entrylks failed.
[2013-03-06 16:41:27.179425] E 
[afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-replicate0: 
background  meta-data data entry missing-entry gfid self-heal failed on 
/home/vcltest464/.drush/vcltest464.aliases.drushrc.php.lock
[2013-03-06 16:41:27.179425] W [inode.c:914:inode_lookup] 
(-->/usr/lib/libglusterfs.so.0(default_lookup_cbk+0xc1) [0x7f93c6ac7d01] 
(-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xf198) [0x7f93c45ef198] 
(-->/usr/lib/glusterfs/3.3.1/xlator/mount/fuse.so(+0xeffb) [0x7f93c45eeffb]))) 
0-fuse: inode not found
After this, the mountpoint stopped responding to file_exists() calls, which usually means the client died. Any idea what could have caused this behaviour?

What do you mean by 'client died'? Does it mean the glusterfs mount process crashed? If so, could you share the crash log from the mount process's log file?

Cheers,
Marc




_______________________________________________
Gluster-users mailing list
[email protected]
http://supercolony.gluster.org/mailman/listinfo/gluster-users
