Hello Shylesh,
Thanks for looking into this for me. I think the ext4 features are missing because the filesystems were accidentally formatted as ext3 and then mounted as ext4. I didn't realise that was possible until I started investigating this fix-layout problem. I don't know how I managed to make the same mistake on both replicated bricks, but I can't think of any other explanation.

I mounted the filesystems as ext3 and tried the rebalance again, but the result was the same. Then I tried converting the filesystems to ext4, as described in various CentOS forums and blogs including this one: http://blog.secaserver.com/2011/08/linux-converting-ext3-ext4-for-centos-5. Unfortunately the "Operation not supported" errors were still there during the fix-layout, so it seems that the damage has already been done by mounting the ext3 filesystems as ext4.

Perhaps xattrs on new files would be created correctly in the converted bricks, but I really need to find a way to repair the GlusterFS xattrs on the existing files. Is there a way of doing this?

Regards
Dan.
Hi Dan,

I created two bricks, both with ext4 filesystems.

The issue seems to be in fs features that you have disabled.

Formatted *brick1* with ext4:

[root@SERVER1 mnt]# dumpe2fs /dev/sda | grep 'Filesystem features'
dumpe2fs 1.41.12 (17-May-2010)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize

Formatted *brick2* with ext4:

[root@SERVER2 ~]# dumpe2fs /dev/sda | grep 'Filesystem features'
dumpe2fs 1.41.12 (17-May-2010)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file

As you said, I have disabled some of the features on *brick2*.
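The difference between the two feature lists above can be computed rather than read by eye. The following is only a sketch: the function names `features_of` and `missing_features` are made up for illustration, and it assumes the usual `dumpe2fs` output format with a single "Filesystem features:" line.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of e2fsprogs).

# Read dumpe2fs output on stdin and print only the feature list.
features_of() {
    sed -n 's/^Filesystem features:[[:space:]]*//p'
}

# Print the features that are in the first list but not in the second,
# one per line, sorted. Both arguments are space-separated feature lists.
missing_features() {
    _ref=$(mktemp) || return 1
    _other=$(mktemp) || { rm -f "$_ref"; return 1; }
    printf '%s\n' "$1" | tr -s ' \t' '\n' | grep -v '^$' | sort -u > "$_ref"
    printf '%s\n' "$2" | tr -s ' \t' '\n' | grep -v '^$' | sort -u > "$_other"
    # comm -23 keeps the lines that appear only in the first sorted file.
    comm -23 "$_ref" "$_other"
    rm -f "$_ref" "$_other"
}

# Example use on a server, as root (the device name is taken from the
# commands above):
#   local_features=$(dumpe2fs -h /dev/sda 2>/dev/null | features_of)
#   missing_features "$reference_features" "$local_features"
```

Note that `needs_recovery` will show up in such a comparison whenever only one of the two filesystems is mounted, so it can be ignored when comparing format-time features.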

I created a distribute volume with these two bricks, created some files on the mount point and tried setting xattrs on these files.

I got these error messages:
=======================================================================================
[2011-12-30 01:57:22.551634] I [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote operation failed: Operation not supported
[2011-12-30 01:57:22.551658] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 201305: SETXATTR() /92 => -1 (Operation not supported)
[2011-12-30 01:57:22.556490] I [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote operation failed: Operation not supported
[2011-12-30 01:57:22.556520] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 201311: SETXATTR() /95 => -1 (Operation not supported)
[2011-12-30 01:57:22.564089] I [client3_1-fops.c:818:client3_1_setxattr_cbk] 1-test-client-1: remote operation failed: Operation not supported
[2011-12-30 01:57:22.564114] W [fuse-bridge.c:850:fuse_err_cbk] 0-glusterfs-fuse: 201321: SETXATTR() /100 => -1 (Operation not supported)
========================================================================================
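To see at a glance which callbacks report the error, the log can be summarised per function. This is a rough sketch, not a GlusterFS tool: the function name `count_enotsup_callers` is invented here, and it assumes the `[file.c:line:function]` tag format seen in the messages above, with each log entry on its own line as in the real log file.

```shell
#!/bin/sh
# Hypothetical helper: read GlusterFS log text on stdin and print
# "<count> <function>" for every source tag found on a line that
# mentions "Operation not supported".
count_enotsup_callers() {
    grep 'Operation not supported' \
        | grep -oE '[A-Za-z0-9_-]+\.c:[0-9]+:[A-Za-z0-9_]+' \
        | cut -d: -f3 \
        | sort | uniq -c \
        | awk '{ print $1, $2 }'
}

# Example use (the log path is a placeholder, not taken from this thread):
#   count_enotsup_callers < /path/to/mount.log
```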

Whereas when I created another volume with only *brick1*, everything went smoothly. So I suspect the problem is not with rebalance but with the ext4 features that are disabled on *brick2*.

Please let me know if I am missing anything that can be tried.




Thanks,
Shylesh

------------------------------------------------------------------------
*From:* [email protected] [[email protected]] on behalf of Dan Bretherton [[email protected]]
*Sent:* Thursday, December 29, 2011 6:05 AM
*To:* gluster-users
*Subject:* [Gluster-users] fix-layout stalls with xattr errors

Hello All-
I am having problems with rebalance ... fix-layout in version 3.2.5. I extended a volume with add-brick but the fix-layout stalls after a small number of layout fixes and does not make any more progress. I have tried the operation twice on different servers with the same result. The following errors are found in the fuse mount log file on the server carrying out the operation.

    [2011-12-28 21:38:14.840013] I
    [afr-common.c:1038:afr_launch_self_heal] 0-nemo2-replicate-4:
    background  data self-heal triggered. path:
    /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
    [2011-12-28 21:38:15.93079] E
    [client3_1-fops.c:1498:client3_1_fxattrop_cbk] 0-nemo2-client-8:
    remote operation failed: Operation not supported
    [2011-12-28 21:38:15.93141] E
    [client3_1-fops.c:1498:client3_1_fxattrop_cbk] 0-nemo2-client-9:
    remote operation failed: Operation not supported
    [2011-12-28 21:38:15.93385] I
    [client3_1-fops.c:1187:client3_1_fstat_cbk] 0-nemo2-client-8:
    remote operation failed: Operation not supported
    [2011-12-28 21:38:15.93521] I
    [client3_1-fops.c:1187:client3_1_fstat_cbk] 0-nemo2-client-9:
    remote operation failed: Operation not supported


The file in the error message is a link, and it is not broken as seen from the volume mount point or the bricks.

There are some worrying error messages in the brick log files for nemo2-client-8 and nemo2-client-9. Here are some excerpts from the nemo2-client-8 log, which is similar to the nemo2-client-9 log.

    [2011-12-28 21:23:05.827877] W [posix.c:3928:do_xattrop]
    0-nemo2-posix: Extended attributes not supported by filesystem
    [2011-12-28 21:23:05.827932] I
    [server3_1-fops.c:1705:server_fxattrop_cbk] 0-nemo2-server: 8438:
    FXATTROP 0 (-2111276040) ==> -1 (Operation not supported)
    [2011-12-28 21:23:05.828848] E [posix.c:4200:posix_fstat]
    0-nemo2-posix: fstat failed on fd=0x2aaaac703804: Operation not
    supported
    [2011-12-28 21:23:05.828879] I
    [server3_1-fops.c:1113:server_fstat_cbk] 0-nemo2-server: 8439:
    FSTAT 0 (-2111276040) ==> -1 (Operation not supported)
    [2011-12-28 21:29:29.871213] W
    [socket.c:1494:__socket_proto_state_machine] 0-tcp.nemo2-server:
    reading from socket failed. Error (Transport endpoint is not
    connected), peer (192.171.166.81:1003)
    [2011-12-28 21:29:29.871305] I
    [server-helpers.c:360:do_lock_table_cleanup] 0-nemo2-server:
    inodelk released on
    /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc
    [2011-12-28 21:29:29.871345] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup on
    /users/hzu/DATA/ERAINT/ORCA025/2010/snow_ERAINT_2010.nc

    [2011-12-28 21:34:36.190023] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup on /
    [2011-12-28 21:34:36.190055] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
    on /users
    [2011-12-28 21:34:36.190086] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
    on /users/hzu
    [2011-12-28 21:34:36.190102] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
    on /users/hzu/DATA
    [2011-12-28 21:34:36.190135] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
    on /users/hzu/DATA/ERAINT
    [2011-12-28 21:34:36.190154] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
    on /users/hzu/DATA/ERAINT/ORCA025
    [2011-12-28 21:34:36.190171] I
    [server-helpers.c:485:do_fd_cleanup] 0-nemo2-server: fd cleanup
    on /users/hzu/DATA/ERAINT/ORCA025/2009

    [2011-12-28 21:38:15.92433] I
    [server3_1-fops.c:1705:server_fxattrop_cbk] 0-nemo2-server:
    12228: FXATTROP 7 (-2111276040) ==> -1 (Operation not supported)
    [2011-12-28 21:38:15.92743] E [posix.c:4200:posix_fstat]
    0-nemo2-posix: fstat failed on fd=0x2aaaac703804: Operation not
    supported
    [2011-12-28 21:38:15.92775] I
    [server3_1-fops.c:1113:server_fstat_cbk] 0-nemo2-server: 12229:
    FSTAT 7 (-2111276040) ==> -1 (Operation not supported)


The backend filesystems are ext4 and they are mounted with options "acl,user_xattr". I tested extended attribute support (as suggested here: http://gluster.org/pipermail/gluster-users/2010-December/006257.html) and could not find any problems, so I don't understand the "Extended attributes not supported by filesystem" error. The only unusual thing about the filesystems is the reduced number of filesystem features enabled compared to other bricks. These are the ext4 features enabled:

has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file

All the other bricks in the volume have these features plus extent, flex_bg, huge_file, uninit_bg, dir_nlink and extra_isize. I don't know if any of these missing ext4 features are part of the problem. Does anybody know what's going on here?
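For reference, an extended attribute check of the kind mentioned above can be sketched as follows. This is not the exact test from the linked post; the function name `probe_xattr` and the attribute name `probe` are made up for illustration. GlusterFS keeps its own metadata in the `trusted.*` namespace, which only root can write, so it may be worth probing that namespace as well as `user.*`.

```shell
#!/bin/sh
# Hypothetical helper: set and read back one extended attribute on a
# scratch file in DIR. NAMESPACE is "user" (default) or "trusted";
# writing trusted.* attributes requires root.
# Prints "supported" or "unsupported"; returns 0 only when supported.
probe_xattr() {
    _dir=$1
    _ns=${2:-user}
    _f=$(mktemp "$_dir/xattr-probe.XXXXXX") || return 2
    if setfattr -n "${_ns}.probe" -v working "$_f" 2>/dev/null \
        && getfattr -n "${_ns}.probe" "$_f" >/dev/null 2>&1; then
        _result=supported
    else
        _result=unsupported
    fi
    rm -f "$_f"
    echo "$_result"
    [ "$_result" = supported ]
}

# Example use, as root, on a brick directory (placeholder path):
#   probe_xattr /path/to/brick user
#   probe_xattr /path/to/brick trusted
```

A passing probe only shows that plain set and get calls work on a new file; it does not exercise the fxattrop path that fails in the brick logs above.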

Regards
Dan.



_______________________________________________
Gluster-users mailing list
[email protected]
http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
