Hi Ashish, sure, no problem! We are a little bit worried, but we can wait :-) Thank you very much for your support and your availability.
Regards,
Mauro

> On 26 Sep 2018, at 19:33, Ashish Pandey <[email protected]> wrote:
>
> Hi Mauro,
>
> Yes, I can provide you with a step-by-step procedure to correct it.
> Is it fine if I provide you the steps tomorrow? It is quite late over here
> and I don't want to miss anything in a hurry.
>
> ---
> Ashish
>
> From: "Mauro Tridici" <[email protected]>
> To: "Ashish Pandey" <[email protected]>
> Cc: "gluster-users" <[email protected]>
> Sent: Wednesday, September 26, 2018 6:54:19 PM
> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
>
> Hi Ashish,
>
> attached you can find the rebalance log file and the most recently updated
> brick log file (the other files in the /var/log/glusterfs/bricks directory
> seem to be too old).
> I just stopped the running rebalance (as you can see at the bottom of the
> rebalance log file).
> So, if a safe procedure exists to correct the problem, I would like to
> execute it.
>
> I don't know if I may ask this, but, if possible, could you please describe,
> step by step, the right procedure to remove the newly added bricks without
> losing the data that has already been rebalanced?
>
> The following outputs show the result of the "df -h" command executed on one
> of the first 3 nodes (s01, s02, s03), which already existed, and on one of
> the last 3 nodes (s04, s05, s06), which were added recently.
>
> [root@s06 bricks]# df -h
> Filesystem                           Size  Used Avail Use% Mounted on
> /dev/mapper/cl_s06-root              100G  2,1G   98G   3% /
> devtmpfs                              32G     0   32G   0% /dev
> tmpfs                                 32G  4,0K   32G   1% /dev/shm
> tmpfs                                 32G   26M   32G   1% /run
> tmpfs                                 32G     0   32G   0% /sys/fs/cgroup
> /dev/mapper/cl_s06-var               100G  2,0G   99G   2% /var
> /dev/mapper/cl_s06-gluster           100G   33M  100G   1% /gluster
> /dev/sda1                           1014M  152M  863M  15% /boot
> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  807G  8,3T   9% /gluster/mnt3
> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  807G  8,3T   9% /gluster/mnt6
> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  807G  8,3T   9% /gluster/mnt2
> /dev/mapper/gluster_vge-gluster_lve  9,0T  807G  8,3T   9% /gluster/mnt4
> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  887G  8,2T  10% /gluster/mnt9
> /dev/mapper/gluster_vgb-gluster_lvb  9,0T  807G  8,3T   9% /gluster/mnt1
> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  887G  8,2T  10% /gluster/mnt7
> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  807G  8,3T   9% /gluster/mnt5
> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  887G  8,2T  10% /gluster/mnt8
> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  887G  8,2T  10% /gluster/mnt11
> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  887G  8,2T  10% /gluster/mnt10
> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  887G  8,2T  10% /gluster/mnt12
> tmpfs                                6,3G     0  6,3G   0% /run/user/0
>
> [root@s01 ~]# df -h
> Filesystem                           Size  Used Avail Use% Mounted on
> /dev/mapper/cl_s01-root              100G  5,3G   95G   6% /
> devtmpfs                              32G     0   32G   0% /dev
> tmpfs                                 32G   39M   32G   1% /dev/shm
> tmpfs                                 32G   26M   32G   1% /run
> tmpfs                                 32G     0   32G   0% /sys/fs/cgroup
> /dev/mapper/cl_s01-var               100G   11G   90G  11% /var
> /dev/md127                          1015M  151M  865M  15% /boot
> /dev/mapper/cl_s01-gluster           100G   33M  100G   1% /gluster
> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  5,5T  3,6T  61% /gluster/mnt7
> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  5,4T  3,6T  61% /gluster/mnt11
> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  5,7T  3,4T  63% /gluster/mnt4
> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  5,8T  3,3T  64% /gluster/mnt10
> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  5,5T  3,6T  61% /gluster/mnt8
> /dev/mapper/gluster_vgn-gluster_lvn  9,0T  5,4T  3,6T  61% /gluster/mnt12
> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  5,8T  3,3T  64% /gluster/mnt9
> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  5,6T  3,5T  63% /gluster/mnt6
> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  5,6T  3,5T  63% /gluster/mnt5
> /dev/mapper/gluster_vge-gluster_lve  9,0T  5,7T  3,4T  63% /gluster/mnt3
> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  5,6T  3,5T  62% /gluster/mnt1
> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  5,6T  3,5T  62% /gluster/mnt2
> tmpfs                                6,3G     0  6,3G   0% /run/user/0
> s01-stg:tier2                        420T  159T  262T  38% /tier2
>
> As you can see, the used space on each brick of the new servers is about
> 800GB.
>
> Thank you,
> Mauro
>
> On 26 Sep 2018, at 14:51, Ashish Pandey <[email protected]> wrote:
>
> Hi Mauro,
>
> The rebalance and brick logs are the first thing we should go through.
>
> There is a procedure to correct the configuration/setup, but the situation
> you are in makes that procedure difficult to follow.
> You should have added the bricks hosted on s04-stg, s05-stg and s06-stg the
> same way as the previous configuration,
> that is, 2 bricks on each node for each subvolume.
> The procedure would require a lot of replace-brick operations, which would
> again need healing and so on. In addition to that, we would have to wait
> for the rebalance to complete.
>
> I would suggest that, if the whole data set has not been rebalanced yet and
> you can stop the rebalance and remove the newly added bricks properly, then
> you should remove them.
> After that, add the bricks back so that you have 2 bricks of each subvolume
> on the 3 newly added nodes.
>
> Yes, it is like undoing the whole effort, but it is better to do it now than
> to face issues in the future, when it will be almost impossible to correct
> these things once you have lots of data.
>
> ---
> Ashish
>
> From: "Mauro Tridici" <[email protected]>
> To: "Ashish Pandey" <[email protected]>
> Cc: "gluster-users" <[email protected]>
> Sent: Wednesday, September 26, 2018 5:55:02 PM
> Subject: Re: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
>
> Dear Ashish,
>
> thank you for your answer.
> I can provide you with the entire log files related to glusterd, glusterfsd
> and the rebalance.
> Please, could you indicate which one you need first?
>
> Yes, we added the last 36 bricks after creating the volume. Is there a
> procedure to correct this error? Is it still possible to do it?
>
> Many thanks,
> Mauro
>
> On 26 Sep 2018, at 14:13, Ashish Pandey <[email protected]> wrote:
>
> I think we don't have enough logs to debug this, so I would suggest that
> you provide more logs/info.
> I have also observed that the configuration and setup of your volume is not
> very efficient.
>
> For example:
> Brick37: s04-stg:/gluster/mnt1/brick
> Brick38: s04-stg:/gluster/mnt2/brick
> Brick39: s04-stg:/gluster/mnt3/brick
> Brick40: s04-stg:/gluster/mnt4/brick
> Brick41: s04-stg:/gluster/mnt5/brick
> Brick42: s04-stg:/gluster/mnt6/brick
> Brick43: s04-stg:/gluster/mnt7/brick
> Brick44: s04-stg:/gluster/mnt8/brick
> Brick45: s04-stg:/gluster/mnt9/brick
> Brick46: s04-stg:/gluster/mnt10/brick
> Brick47: s04-stg:/gluster/mnt11/brick
> Brick48: s04-stg:/gluster/mnt12/brick
>
> These 12 bricks are all on the same node, and the subvolumes made up of
> them live entirely on that node, which is not good. The same is true for
> the bricks hosted on s05-stg and s06-stg.
> I think you added these bricks after creating the volume. The probability
> of a disruption in the connection to these bricks is higher in this case.
>
> ---
> Ashish
>
> From: "Mauro Tridici" <[email protected]>
> To: "gluster-users" <[email protected]>
> Sent: Wednesday, September 26, 2018 3:38:35 PM
> Subject: [Gluster-users] Rebalance failed on Distributed Disperse volume based on 3.12.14 version
>
> Dear All, Dear Nithya,
>
> after upgrading from version 3.10.5 to 3.12.14, I tried to start a
> rebalance process to distribute data across the bricks, but something went
> wrong.
> The rebalance failed on different nodes, and the estimated time needed to
> complete the procedure seems to be very high.
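Ashish's point above can be checked mechanically: GlusterFS forms each disperse subvolume from consecutive groups of (4+2) = 6 bricks in "vol info" order. A minimal Python sketch (ours, not from the thread), using the host and mount names shown in this thread, contrasts the original interleaved layout of bricks 1-36 with the per-node layout of bricks 37-72:

```python
# Each consecutive group of (4+2) = 6 bricks in "vol info" order forms one
# disperse subvolume; count the distinct hosts per group.

def subvolume_hosts(bricks, group_size=6):
    """Distinct hosts in each consecutive group of `group_size` bricks."""
    return [
        {b.split(":")[0] for b in bricks[i:i + group_size]}
        for i in range(0, len(bricks), group_size)
    ]

# Bricks 1-36: interleaved across s01/s02/s03, as in the vol info below.
old_bricks = [
    f"s0{n}-stg:/gluster/mnt{m}/brick" for m in range(1, 13) for n in (1, 2, 3)
]
# Bricks 37-72: 12 consecutive bricks per newly added node.
new_bricks = [
    f"s0{n}-stg:/gluster/mnt{m}/brick" for n in (4, 5, 6) for m in range(1, 13)
]

hosts_per_subvol = [len(h) for h in subvolume_hosts(old_bricks + new_bricks)]
print(hosts_per_subvol)  # [3, 3, 3, 3, 3, 3, 1, 1, 1, 1, 1, 1]
```

The first six subvolumes span 3 hosts each, so any single node failure costs at most 2 bricks per subvolume (within the redundancy of 2). The last six span only 1 host each: losing one of s04/s05/s06 takes all 6 bricks of two subvolumes offline.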
>
> [root@s01 ~]# gluster volume rebalance tier2 status
>      Node  Rebalanced-files      size  scanned  failures  skipped       status  run time in h:m:s
> ---------  ----------------  --------  -------  --------  -------  -----------  -----------------
> localhost                19   161.6GB      537         2        2  in progress            0:32:23
>   s02-stg                25   212.7GB      526         5        2  in progress            0:32:25
>   s03-stg                 4    69.1GB      511         0        0  in progress            0:32:25
>   s04-stg                 4  484Bytes    12283         0        3  in progress            0:32:25
>   s05-stg                23  484Bytes    11049         0       10  in progress            0:32:25
>   s06-stg                 3     1.2GB     8032        11        3       failed            0:17:57
> Estimated time left for rebalance to complete : 3601:05:41
> volume rebalance: tier2: success
>
> When the rebalance processes fail, I can see the following kinds of errors
> in /var/log/glusterfs/tier2-rebalance.log:
>
> Error type 1)
>
> [2018-09-26 08:50:19.872575] W [MSGID: 122053] [ec-common.c:269:ec_check_status] 0-tier2-disperse-10: Operation failed on 2 of 6 subvolumes.(up=111111, mask=100111, remaining=000000, good=100111, bad=011000)
> [2018-09-26 08:50:19.901792] W [MSGID: 122053] [ec-common.c:269:ec_check_status] 0-tier2-disperse-11: Operation failed on 1 of 6 subvolumes.(up=111111, mask=111101, remaining=000000, good=111101, bad=000010)
>
> Error type 2)
>
> [2018-09-26 08:53:31.566836] W [socket.c:600:__socket_rwv] 0-tier2-client-53: readv on 192.168.0.55:49153 failed (Connection reset by peer)
>
> Error type 3)
>
> [2018-09-26 08:57:37.852590] W [MSGID: 122035] [ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with some subvolumes unavailable (10)
> [2018-09-26 08:57:39.282306] W [MSGID: 122035] [ec-common.c:571:ec_child_select] 0-tier2-disperse-9: Executing operation with some subvolumes unavailable (10)
> [2018-09-26 09:02:04.928408] W [MSGID: 109023] [dht-rebalance.c:1013:__dht_check_free_space] 0-tier2-dht: data movement of file {blocks:0 name:(/OPA/archive/historical/dts/MREA/Observations/Observations/MREA14/Cs-1/CMCC/raw/CS013.ext)} would result in dst node (tier2-disperse-5:2440190848) having lower disk space than the source node (tier2-disperse-11:71373083776). Skipping file.
>
> Error type 4)
>
> W [rpc-clnt-ping.c:223:rpc_clnt_ping_cbk] 0-tier2-client-7: socket disconnected
>
> Error type 5)
>
> [2018-09-26 09:07:42.333720] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e25) [0x7f0417e0ee25] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x5590086004b5] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55900860032b] ) 0-: received signum (15), shutting down
>
> Error type 6)
>
> [2018-09-25 08:09:18.340658] C [rpc-clnt-ping.c:166:rpc_clnt_ping_timer_expired] 0-tier2-client-4: server 192.168.0.52:49153 has not responded in the last 42 seconds, disconnecting.
>
> It seems that there are some network or timeout problems, but the network
> usage/traffic values are not that high.
> Do you think that, with my volume configuration, I should modify some
> volume options related to thread and/or network parameters?
> Could you please help me understand the cause of the problems above?
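The bitmaps in error type 1 become readable once decoded bit by bit. A small sketch (ours; we assume the leftmost character maps to the first brick of the disperse subvolume, though the exact ordering does not matter for counting healthy versus failed bricks):

```python
# Decode the EC (erasure-coding) bitmaps printed by ec_check_status.
# Note: the log says "subvolumes", but for a disperse set these are the
# 6 member bricks of that (4+2) subvolume.

def set_bits(mask):
    """0-based positions of '1' bits in a bitmap string from the EC logs."""
    return [i for i, bit in enumerate(mask) if bit == "1"]

# From: "0-tier2-disperse-10: Operation failed on 2 of 6 subvolumes.
#        (up=111111, mask=100111, remaining=000000, good=100111, bad=011000)"
up, good, bad = "111111", "100111", "011000"

print(len(set_bits(up)))    # 6: all bricks of the subvolume were connected
print(len(set_bits(good)))  # 4: exactly the minimum (k = 4 data bricks)
print(len(set_bits(bad)))   # 2: at the redundancy limit -- one more failed
                            #    brick and the operation cannot succeed
```

In other words, disperse-10 was operating at the edge of its redundancy when these warnings were logged, which is consistent with the intermittent disconnects in error types 2, 4 and 6.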
>
> You can find our volume info below.
> (The volume is implemented on 6 servers; each server has: 2 CPUs with 10
> cores each, 64GB RAM, 1 SSD dedicated to the OS, and 12 x 10TB HDDs.)
>
> [root@s04 ~]# gluster vol info
>
> Volume Name: tier2
> Type: Distributed-Disperse
> Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 12 x (4 + 2) = 72
> Transport-type: tcp
> Bricks:
> Brick1: s01-stg:/gluster/mnt1/brick
> Brick2: s02-stg:/gluster/mnt1/brick
> Brick3: s03-stg:/gluster/mnt1/brick
> Brick4: s01-stg:/gluster/mnt2/brick
> Brick5: s02-stg:/gluster/mnt2/brick
> Brick6: s03-stg:/gluster/mnt2/brick
> Brick7: s01-stg:/gluster/mnt3/brick
> Brick8: s02-stg:/gluster/mnt3/brick
> Brick9: s03-stg:/gluster/mnt3/brick
> Brick10: s01-stg:/gluster/mnt4/brick
> Brick11: s02-stg:/gluster/mnt4/brick
> Brick12: s03-stg:/gluster/mnt4/brick
> Brick13: s01-stg:/gluster/mnt5/brick
> Brick14: s02-stg:/gluster/mnt5/brick
> Brick15: s03-stg:/gluster/mnt5/brick
> Brick16: s01-stg:/gluster/mnt6/brick
> Brick17: s02-stg:/gluster/mnt6/brick
> Brick18: s03-stg:/gluster/mnt6/brick
> Brick19: s01-stg:/gluster/mnt7/brick
> Brick20: s02-stg:/gluster/mnt7/brick
> Brick21: s03-stg:/gluster/mnt7/brick
> Brick22: s01-stg:/gluster/mnt8/brick
> Brick23: s02-stg:/gluster/mnt8/brick
> Brick24: s03-stg:/gluster/mnt8/brick
> Brick25: s01-stg:/gluster/mnt9/brick
> Brick26: s02-stg:/gluster/mnt9/brick
> Brick27: s03-stg:/gluster/mnt9/brick
> Brick28: s01-stg:/gluster/mnt10/brick
> Brick29: s02-stg:/gluster/mnt10/brick
> Brick30: s03-stg:/gluster/mnt10/brick
> Brick31: s01-stg:/gluster/mnt11/brick
> Brick32: s02-stg:/gluster/mnt11/brick
> Brick33: s03-stg:/gluster/mnt11/brick
> Brick34: s01-stg:/gluster/mnt12/brick
> Brick35: s02-stg:/gluster/mnt12/brick
> Brick36: s03-stg:/gluster/mnt12/brick
> Brick37: s04-stg:/gluster/mnt1/brick
> Brick38: s04-stg:/gluster/mnt2/brick
> Brick39: s04-stg:/gluster/mnt3/brick
> Brick40: s04-stg:/gluster/mnt4/brick
> Brick41: s04-stg:/gluster/mnt5/brick
> Brick42: s04-stg:/gluster/mnt6/brick
> Brick43: s04-stg:/gluster/mnt7/brick
> Brick44: s04-stg:/gluster/mnt8/brick
> Brick45: s04-stg:/gluster/mnt9/brick
> Brick46: s04-stg:/gluster/mnt10/brick
> Brick47: s04-stg:/gluster/mnt11/brick
> Brick48: s04-stg:/gluster/mnt12/brick
> Brick49: s05-stg:/gluster/mnt1/brick
> Brick50: s05-stg:/gluster/mnt2/brick
> Brick51: s05-stg:/gluster/mnt3/brick
> Brick52: s05-stg:/gluster/mnt4/brick
> Brick53: s05-stg:/gluster/mnt5/brick
> Brick54: s05-stg:/gluster/mnt6/brick
> Brick55: s05-stg:/gluster/mnt7/brick
> Brick56: s05-stg:/gluster/mnt8/brick
> Brick57: s05-stg:/gluster/mnt9/brick
> Brick58: s05-stg:/gluster/mnt10/brick
> Brick59: s05-stg:/gluster/mnt11/brick
> Brick60: s05-stg:/gluster/mnt12/brick
> Brick61: s06-stg:/gluster/mnt1/brick
> Brick62: s06-stg:/gluster/mnt2/brick
> Brick63: s06-stg:/gluster/mnt3/brick
> Brick64: s06-stg:/gluster/mnt4/brick
> Brick65: s06-stg:/gluster/mnt5/brick
> Brick66: s06-stg:/gluster/mnt6/brick
> Brick67: s06-stg:/gluster/mnt7/brick
> Brick68: s06-stg:/gluster/mnt8/brick
> Brick69: s06-stg:/gluster/mnt9/brick
> Brick70: s06-stg:/gluster/mnt10/brick
> Brick71: s06-stg:/gluster/mnt11/brick
> Brick72: s06-stg:/gluster/mnt12/brick
> Options Reconfigured:
> network.ping-timeout: 60
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: on
> cluster.server-quorum-type: server
> features.default-soft-limit: 90
> features.quota-deem-statfs: on
> performance.io-thread-count: 16
> disperse.cpu-extensions: auto
> performance.io-cache: off
> network.inode-lru-limit: 50000
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> cluster.readdir-optimize: on
> performance.parallel-readdir: off
> performance.readdir-ahead: on
> cluster.lookup-optimize: on
> client.event-threads: 4
> server.event-threads: 4
> nfs.disable: on
> transport.address-family: inet
> cluster.quorum-type: auto
> cluster.min-free-disk: 10
> performance.client-io-threads: on
> features.quota: on
> features.inode-quota: on
> features.bitrot: on
> features.scrub: Active
> cluster.brick-multiplex: on
> cluster.server-quorum-ratio: 51%
>
> If it can help, I paste here the output of the "free -m" command executed
> on all the cluster nodes.
> The result is almost the same on every node. In your opinion, is the
> available RAM enough to support the data movement?
>
> [root@s06 ~]# free -m
>               total        used        free      shared  buff/cache   available
> Mem:          64309       10409         464          15       53434       52998
> Swap:         65535         103       65432
>
> Thank you in advance.
> Sorry for the long message, but I'm trying to give you all the available
> information.
>
> Regards,
> Mauro
>
> _______________________________________________
> Gluster-users mailing list
> [email protected]
> https://lists.gluster.org/mailman/listinfo/gluster-users

-------------------------
Mauro Tridici

Fondazione CMCC
CMCC Supercomputing Center
presso Complesso Ecotekne - Università del Salento -
Strada Prov.le Lecce - Monteroni sn
73100 Lecce IT
http://www.cmcc.it

mobile: (+39) 327 5630841
email: [email protected]
https://it.linkedin.com/in/mauro-tridici-5977238b
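For reference, the brick ordering that matches Ashish's advice (2 bricks per new node in every (4+2) subvolume, mirroring the s01-s03 layout shown in the vol info) can be generated mechanically. This is a sketch of ours, not a command from the thread: the bricks would first have to be drained and removed via "gluster volume remove-brick tier2 <bricks> start", monitored with "status", and committed with "commit", before being re-added in this order.

```python
# Generate an add-brick ordering where each consecutive group of 6 bricks
# (one (4+2) disperse subvolume) holds exactly 2 bricks from each new node,
# mirroring the interleaved layout of bricks 1-36 in the vol info.

def interleaved_bricks(nodes, mounts=12):
    """Brick list where each consecutive group of 6 has 2 bricks per node."""
    order = []
    for first in range(1, mounts + 1, 2):    # two mount points per subvolume
        for mnt in (first, first + 1):
            for node in nodes:
                order.append(f"{node}:/gluster/mnt{mnt}/brick")
    return order

bricks = interleaved_bricks(["s04-stg", "s05-stg", "s06-stg"])
# Print the bricks in the order they would be passed to add-brick.
print("gluster volume add-brick tier2 \\\n  " + " \\\n  ".join(bricks))
```

With this ordering, losing any one of s04/s05/s06 costs at most 2 bricks per subvolume, which the redundancy of 2 can tolerate, matching the fault tolerance of the original s01-s03 groups.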
