Recently, we lost a brick in a 4-node distribute + replica 2 volume.  The host 
itself was fine, so we simply fixed the hardware failure, recreated the zpool 
and ZFS filesystem, set the correct trusted.glusterfs.volume-id xattr on the 
new brick directory, restarted the Gluster daemons on the host, and the heal 
got to work.  We're running GlusterFS 3.7.4 atop Ubuntu Trusty.

However, we’ve noticed that directories on the brick being healed are not 
getting the correct mtime and ctime.  Files, on the other hand, are being set 
correctly.
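We’re checking this with plain stat against the brick backends and against the 
FUSE client mount, roughly as below (the mount point and the directory path 
are illustrative):

  # Against the healing brick directly, on its host
  $ stat /fs4/edc1/<path-to>/BSA_9781483021973

  # Against the FUSE mount
  $ stat /mnt/edc1/<path-to>/BSA_9781483021973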

$ gluster volume info edc1
 
Volume Name: edc1
Type: Distributed-Replicate
Volume ID: 2f6b5804-e2d8-4400-93e9-b172952b1aae
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: fs4:/fs4/edc1
Brick2: fs5:/fs5/edc1
Brick3: hdfs5:/hdfs5/edc1
Brick4: hdfs6:/hdfs6/edc1
Options Reconfigured:
performance.write-behind-window-size: 1GB
performance.cache-size: 1GB
performance.readdir-ahead: enable
performance.read-ahead: enable

Example:

On the glusterfs mount:

  File: ‘BSA_9781483021973’
  Size: 36              Blocks: 2          IO Block: 131072 directory
Device: 19h/25d Inode: 11345194644681878130  Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 04:01:49.520001319 -0800
Modify: 2014-08-29 09:20:50.006294000 -0700
Change: 2015-02-16 00:04:21.312079523 -0800
 Birth: -

On the surviving brick:

  File: ‘BSA_9781483021973’
  Size: 10              Blocks: 6          IO Block: 1024   directory
Device: 1ah/26d Inode: 25261       Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 04:01:49.520001319 -0800
Modify: 2014-08-29 09:20:50.006294000 -0700
Change: 2015-02-16 00:04:21.312079523 -0800
 Birth: -

On the failed brick that’s healing:

  File: ‘BSA_9781483021973’
  Size: 10              Blocks: 6          IO Block: 131072 directory
Device: 17h/23d Inode: 252324      Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 10:10:35.441261192 -0800
Modify: 2015-11-25 04:07:36.354860631 -0800
Change: 2015-11-25 04:07:36.354860631 -0800
 Birth: -

Normally this wouldn’t be an issue, except that the glusterfs mount is now 
reporting the healing brick’s ctime and mtime for directories for which the 
failed node has become the authoritative replica.  An example:

On the surviving brick:

  File: ‘BSA_9780792765073’
  Size: 23              Blocks: 6          IO Block: 3072   directory
Device: 1ah/26d Inode: 3734793     Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 10:22:25.374931735 -0800
Modify: 2015-03-24 13:56:53.371733811 -0700
Change: 2015-03-24 13:56:53.371733811 -0700
 Birth: -

On the glusterfs mount:

  File: ‘BSA_9780792765073’
  Size: 97              Blocks: 2          IO Block: 131072 directory
Device: 19h/25d Inode: 13293019492851992284  Links: 2
Access: (0777/drwxrwxrwx)  Uid: ( 1007/ UNKNOWN)   Gid: ( 1007/ UNKNOWN)
Access: 2015-11-27 10:22:20.922782180 -0800
Modify: 2015-11-25 04:03:21.889978948 -0800
Change: 2015-11-25 04:03:21.889978948 -0800
 Birth: -
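
In case it’s relevant, this is roughly how we can inspect the heal state and 
the AFR xattrs for one of these directories on each brick of the affected 
replica pair (which hosts hold this directory, and its path, are illustrative 
here):

  $ gluster volume heal edc1 info

  # On each brick host of the replica pair holding the directory
  $ getfattr -d -m trusted.afr -e hex /fs5/edc1/<path-to>/BSA_9780792765073
  $ getfattr -d -m trusted.afr -e hex /fs4/edc1/<path-to>/BSA_9780792765073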

Thanks,
-t

