Hi Mauro,

Yes, a rebalance consists of two operations for every directory:

1. Fix the layout for the new volume config (newly added or removed bricks).
2. Migrate files to their new hashed subvols based on the new layout.

Are you running a rebalance because you added new bricks to the volume? As per an earlier email, you have already run a fix-layout. On s04, please check the rebalance log file to see why the rebalance failed.
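Something like this should surface the relevant entries (a sketch; the path assumes the default log location for a volume named tier2):

    # On s04: show the most recent error-level entries in the rebalance log
    grep ' E ' /var/log/glusterfs/tier2-rebalance.log | tail -n 20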
Regards,
Nithya

On 8 October 2018 at 13:22, Mauro Tridici <[email protected]> wrote:

> Hi All,
>
> for your information, this is the current rebalance status:
>
> [root@s01 ~]# gluster volume rebalance tier2 status
>   Node       Rebalanced-files      size    scanned   failures   skipped       status   run time in h:m:s
>   ---------  ----------------  --------  ---------  ---------  --------  -----------  ------------------
>   localhost            551922    20.3TB    2349397          0     61849  in progress            55:25:38
>   s02-stg              287631    13.2TB     959954          0     30262  in progress            55:25:39
>   s03-stg              288523    12.7TB     973111          0     30220  in progress            55:25:39
>   s04-stg                   0    0Bytes          0          0         0       failed             0:00:37
>   s05-stg                   0    0Bytes          0          0         0    completed            48:33:03
>   s06-stg                   0    0Bytes          0          0         0    completed            48:33:02
> Estimated time left for rebalance to complete : 1023:49:56
> volume rebalance: tier2: success
>
> Rebalance is migrating files on the s05 and s06 servers, and on s04 too (although it is marked as failed).
> The s05 and s06 tasks are completed.
>
> Questions:
>
> 1) It seems that the rebalance is moving files, but is it also fixing the layout? Is that normal?
> 2) When the rebalance is completed, what do we need to do before returning the gluster storage to the users? Do we have to launch the rebalance again in order to involve the s04 server too, or a fix-layout to fix any errors on s04?
>
> Thank you very much,
> Mauro
>
>
> On 7 October 2018, at 10:29, Mauro Tridici <[email protected]> wrote:
>
> Hi All,
>
> some important updates about the issue mentioned below.
> After the rebalance failed on all the servers, I decided to:
>
> - stop the gluster volume
> - reboot the servers
> - start the gluster volume
> - change some gluster volume options
> - start the rebalance again
>
> I changed the options listed below after reading some threads on the gluster-users mailing list:
>
> BEFORE CHANGE:
>
> gluster volume set tier2 network.ping-timeout 02
> gluster volume set all cluster.brick-multiplex on
> gluster volume set tier2 cluster.server-quorum-ratio 51%
> gluster volume set tier2 cluster.server-quorum-type server
> gluster volume set tier2 cluster.quorum-type auto
>
> AFTER CHANGE:
>
> gluster volume set tier2 network.ping-timeout 42
> gluster volume set all cluster.brick-multiplex off
> gluster volume set tier2 cluster.server-quorum-ratio none
> gluster volume set tier2 cluster.server-quorum-type none
> gluster volume set tier2 cluster.quorum-type none
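>
> To confirm that the new values actually took effect, the options can be queried back, e.g. (a sketch; same option names as set above):
>
>   # Read back a reconfigured option from the volume
>   gluster volume get tier2 network.ping-timeout
>   gluster volume get tier2 cluster.quorum-type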
>
> The result was that the rebalance started moving data from the s01, s02 and s03 servers to the s05 and s06 servers (the newly added ones), but it failed on the s04 server after 37 seconds.
> The rebalance is still running and moving data, as you can see from the output:
>
> [root@s01 ~]# gluster volume rebalance tier2 status
>   Node       Rebalanced-files      size    scanned   failures   skipped       status   run time in h:m:s
>   ---------  ----------------  --------  ---------  ---------  --------  -----------  ------------------
>   localhost            286680    12.6TB    1217960          0     43343  in progress            32:10:24
>   s02-stg              126291    12.4TB     413077          0     21932  in progress            32:10:25
>   s03-stg              126516    11.9TB     433014          0     21870  in progress            32:10:25
>   s04-stg                   0    0Bytes          0          0         0       failed             0:00:37
>   s05-stg                   0    0Bytes          0          0         0  in progress            32:10:25
>   s06-stg                   0    0Bytes          0          0         0  in progress            32:10:25
> Estimated time left for rebalance to complete : 624:47:48
> volume rebalance: tier2: success
>
> When this rebalance has completed, we are planning to re-launch it to try to involve the s04 server as well.
> Do you have any idea about what happened in my previous message, and why the rebalance is now running even though it does not involve the s04 server?
>
> Attached, the complete tier2-rebalance.log file related to the s04 server.
>
> Thank you very much for your help,
> Mauro
>
>
> <tier2-rebalance.log.gz>
>
> On 6 October 2018, at 02:01, Mauro Tridici <[email protected]> wrote:
>
> Hi All,
>
> since we need to restore the gluster storage as soon as possible, we decided to ignore the few files that could be lost and to go ahead.
> So we cleaned all the brick contents of servers s04, s05 and s06.
>
> As planned some days ago, we executed the following commands:
>
> gluster peer detach s04
> gluster peer detach s05
> gluster peer detach s06
>
> gluster peer probe s04
> gluster peer probe s05
> gluster peer probe s06
>
> gluster volume add-brick tier2 \
>   s04-stg:/gluster/mnt1/brick s05-stg:/gluster/mnt1/brick s06-stg:/gluster/mnt1/brick \
>   s04-stg:/gluster/mnt2/brick s05-stg:/gluster/mnt2/brick s06-stg:/gluster/mnt2/brick \
>   s04-stg:/gluster/mnt3/brick s05-stg:/gluster/mnt3/brick s06-stg:/gluster/mnt3/brick \
>   s04-stg:/gluster/mnt4/brick s05-stg:/gluster/mnt4/brick s06-stg:/gluster/mnt4/brick \
>   s04-stg:/gluster/mnt5/brick s05-stg:/gluster/mnt5/brick s06-stg:/gluster/mnt5/brick \
>   s04-stg:/gluster/mnt6/brick s05-stg:/gluster/mnt6/brick s06-stg:/gluster/mnt6/brick \
>   s04-stg:/gluster/mnt7/brick s05-stg:/gluster/mnt7/brick s06-stg:/gluster/mnt7/brick \
>   s04-stg:/gluster/mnt8/brick s05-stg:/gluster/mnt8/brick s06-stg:/gluster/mnt8/brick \
>   s04-stg:/gluster/mnt9/brick s05-stg:/gluster/mnt9/brick s06-stg:/gluster/mnt9/brick \
>   s04-stg:/gluster/mnt10/brick s05-stg:/gluster/mnt10/brick s06-stg:/gluster/mnt10/brick \
>   s04-stg:/gluster/mnt11/brick s05-stg:/gluster/mnt11/brick s06-stg:/gluster/mnt11/brick \
>   s04-stg:/gluster/mnt12/brick s05-stg:/gluster/mnt12/brick s06-stg:/gluster/mnt12/brick \
>   force
>
> gluster volume rebalance tier2 fix-layout start
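>
> As an aside, a brick list this long can also be generated instead of typed by hand; a minimal sketch that reproduces the same brick order (disperse sets are built from consecutive bricks, so the order matters):
>
>   # Build the 36-brick argument list for add-brick with a loop
>   BRICKS=""
>   for i in $(seq 1 12); do
>     for s in s04-stg s05-stg s06-stg; do
>       BRICKS="$BRICKS $s:/gluster/mnt$i/brick"
>     done
>   done
>   gluster volume add-brick tier2 $BRICKS force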
>
> Everything seemed to be fine and the fix-layout completed:
>
> [root@s01 ~]# gluster volume rebalance tier2 status
>   Node                      status   run time in h:m:s
>   ---------   --------------------   -----------------
>   localhost   fix-layout completed   12:11:6
>   s02-stg     fix-layout completed   12:11:18
>   s03-stg     fix-layout completed   12:11:12
>   s04-stg     fix-layout completed   12:11:20
>   s05-stg     fix-layout completed   12:11:14
>   s06-stg     fix-layout completed   12:10:47
> volume rebalance: tier2: success
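>
> To spot-check that the new layout ranges were actually written onto the new bricks, the dht xattr can be inspected directly on a brick; a sketch (run on s04; the directory path is just an example):
>
>   # Print the hashed layout range assigned to this directory on this brick
>   getfattr -n trusted.glusterfs.dht -e hex /gluster/mnt1/brick/somedir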
>
> [root@s01 ~]# gluster volume info
>
> Volume Name: tier2
> Type: Distributed-Disperse
> Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 12 x (4 + 2) = 72
> Transport-type: tcp
> Bricks:
> Brick1: s01-stg:/gluster/mnt1/brick
> Brick2: s02-stg:/gluster/mnt1/brick
> Brick3: s03-stg:/gluster/mnt1/brick
> Brick4: s01-stg:/gluster/mnt2/brick
> Brick5: s02-stg:/gluster/mnt2/brick
> Brick6: s03-stg:/gluster/mnt2/brick
> Brick7: s01-stg:/gluster/mnt3/brick
> Brick8: s02-stg:/gluster/mnt3/brick
> Brick9: s03-stg:/gluster/mnt3/brick
> Brick10: s01-stg:/gluster/mnt4/brick
> Brick11: s02-stg:/gluster/mnt4/brick
> Brick12: s03-stg:/gluster/mnt4/brick
> Brick13: s01-stg:/gluster/mnt5/brick
> Brick14: s02-stg:/gluster/mnt5/brick
> Brick15: s03-stg:/gluster/mnt5/brick
> Brick16: s01-stg:/gluster/mnt6/brick
> Brick17: s02-stg:/gluster/mnt6/brick
> Brick18: s03-stg:/gluster/mnt6/brick
> Brick19: s01-stg:/gluster/mnt7/brick
> Brick20: s02-stg:/gluster/mnt7/brick
> Brick21: s03-stg:/gluster/mnt7/brick
> Brick22: s01-stg:/gluster/mnt8/brick
> Brick23: s02-stg:/gluster/mnt8/brick
> Brick24: s03-stg:/gluster/mnt8/brick
> Brick25: s01-stg:/gluster/mnt9/brick
> Brick26: s02-stg:/gluster/mnt9/brick
> Brick27: s03-stg:/gluster/mnt9/brick
> Brick28: s01-stg:/gluster/mnt10/brick
> Brick29: s02-stg:/gluster/mnt10/brick
> Brick30: s03-stg:/gluster/mnt10/brick
> Brick31: s01-stg:/gluster/mnt11/brick
> Brick32: s02-stg:/gluster/mnt11/brick
> Brick33: s03-stg:/gluster/mnt11/brick
> Brick34: s01-stg:/gluster/mnt12/brick
> Brick35: s02-stg:/gluster/mnt12/brick
> Brick36: s03-stg:/gluster/mnt12/brick
> Brick37: s04-stg:/gluster/mnt1/brick
> Brick38: s05-stg:/gluster/mnt1/brick
> Brick39: s06-stg:/gluster/mnt1/brick
> Brick40: s04-stg:/gluster/mnt2/brick
> Brick41: s05-stg:/gluster/mnt2/brick
> Brick42: s06-stg:/gluster/mnt2/brick
> Brick43: s04-stg:/gluster/mnt3/brick
> Brick44: s05-stg:/gluster/mnt3/brick
> Brick45: s06-stg:/gluster/mnt3/brick
> Brick46: s04-stg:/gluster/mnt4/brick
> Brick47: s05-stg:/gluster/mnt4/brick
> Brick48: s06-stg:/gluster/mnt4/brick
> Brick49: s04-stg:/gluster/mnt5/brick
> Brick50: s05-stg:/gluster/mnt5/brick
> Brick51: s06-stg:/gluster/mnt5/brick
> Brick52: s04-stg:/gluster/mnt6/brick
> Brick53: s05-stg:/gluster/mnt6/brick
> Brick54: s06-stg:/gluster/mnt6/brick
> Brick55: s04-stg:/gluster/mnt7/brick
> Brick56: s05-stg:/gluster/mnt7/brick
> Brick57: s06-stg:/gluster/mnt7/brick
> Brick58: s04-stg:/gluster/mnt8/brick
> Brick59: s05-stg:/gluster/mnt8/brick
> Brick60: s06-stg:/gluster/mnt8/brick
> Brick61: s04-stg:/gluster/mnt9/brick
> Brick62: s05-stg:/gluster/mnt9/brick
> Brick63: s06-stg:/gluster/mnt9/brick
> Brick64: s04-stg:/gluster/mnt10/brick
> Brick65: s05-stg:/gluster/mnt10/brick
> Brick66: s06-stg:/gluster/mnt10/brick
> Brick67: s04-stg:/gluster/mnt11/brick
> Brick68: s05-stg:/gluster/mnt11/brick
> Brick69: s06-stg:/gluster/mnt11/brick
> Brick70: s04-stg:/gluster/mnt12/brick
> Brick71: s05-stg:/gluster/mnt12/brick
> Brick72: s06-stg:/gluster/mnt12/brick
> Options Reconfigured:
> network.ping-timeout: 42
> features.scrub: Active
> features.bitrot: on
> features.inode-quota: on
> features.quota: on
> performance.client-io-threads: on
> cluster.min-free-disk: 10
> cluster.quorum-type: none
> transport.address-family: inet
> nfs.disable: on
> server.event-threads: 4
> client.event-threads: 4
> cluster.lookup-optimize: on
> performance.readdir-ahead: on
> performance.parallel-readdir: off
> cluster.readdir-optimize: on
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> network.inode-lru-limit: 50000
> performance.io-cache: off
> disperse.cpu-extensions: auto
> performance.io-thread-count: 16
> features.quota-deem-statfs: on
> features.default-soft-limit: 90
> cluster.server-quorum-type: none
> diagnostics.latency-measurement: on
> diagnostics.count-fop-hits: on
> cluster.brick-multiplex: off
> cluster.server-quorum-ratio: 51%
>
> The last step should be the data rebalance between the servers, but the rebalance failed almost immediately, with many errors like the following ones:
>
> [2018-10-05 23:48:38.644978] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-tier2-client-70: Server lk version = 1
> [2018-10-05 23:48:44.735323] I [dht-rebalance.c:4512:gf_defrag_start_crawl] 0-tier2-dht: gf_defrag_start_crawl using commit hash 3720331860
> [2018-10-05 23:48:44.736205] W [MSGID: 122040] [ec-common.c:1097:ec_prepare_update_cbk] 0-tier2-disperse-7: Failed to get size and version [Input/output error]
> [2018-10-05 23:48:44.736266] E [MSGID: 122034] [ec-common.c:613:ec_child_select] 0-tier2-disperse-7: Insufficient available children for this request (have 0, need 4)
> [2018-10-05 23:48:44.736282] E [MSGID: 122037] [ec-common.c:2040:ec_update_size_version_done] 0-tier2-disperse-7: Failed to update version and size [Input/output error]
> [2018-10-05 23:48:44.736377] W [MSGID: 122040] [ec-common.c:1097:ec_prepare_update_cbk] 0-tier2-disperse-8: Failed to get size and version [Input/output error]
> [2018-10-05 23:48:44.736436] E [MSGID: 122034] [ec-common.c:613:ec_child_select] 0-tier2-disperse-8: Insufficient available children for this request (have 0, need 4)
> [2018-10-05 23:48:44.736459] E [MSGID: 122037] [ec-common.c:2040:ec_update_size_version_done] 0-tier2-disperse-8: Failed to update version and size [Input/output error]
> [2018-10-05 23:48:44.736460] W [MSGID: 122040] [ec-common.c:1097:ec_prepare_update_cbk] 0-tier2-disperse-10: Failed to get size and version [Input/output error]
> [2018-10-05 23:48:44.736537] W [MSGID: 122040] [ec-common.c:1097:ec_prepare_update_cbk] 0-tier2-disperse-9: Failed to get size and version [Input/output error]
> [2018-10-05 23:48:44.736571] E [MSGID: 122034] [ec-common.c:613:ec_child_select] 0-tier2-disperse-10: Insufficient available children for this request (have 0, need 4)
> [2018-10-05 23:48:44.736574] E [MSGID: 122034] [ec-common.c:613:ec_child_select] 0-tier2-disperse-9: Insufficient available children for this request (have 0, need 4)
> [2018-10-05 23:48:44.736604] E [MSGID: 122037] [ec-common.c:2040:ec_update_size_version_done] 0-tier2-disperse-9: Failed to update version and size [Input/output error]
> [2018-10-05 23:48:44.736604] E [MSGID: 122037] [ec-common.c:2040:ec_update_size_version_done] 0-tier2-disperse-10: Failed to update version and size [Input/output error]
> [2018-10-05 23:48:44.736827] W [MSGID: 122040] [ec-common.c:1097:ec_prepare_update_cbk] 0-tier2-disperse-11: Failed to get size and version [Input/output error]
> [2018-10-05 23:48:44.736887] E [MSGID: 122034] [ec-common.c:613:ec_child_select] 0-tier2-disperse-11: Insufficient available children for this request (have 0, need 4)
> [2018-10-05 23:48:44.736904] E [MSGID: 122037] [ec-common.c:2040:ec_update_size_version_done] 0-tier2-disperse-11: Failed to update version and size [Input/output error]
> [2018-10-05 23:48:44.740337] W [MSGID: 122040] [ec-common.c:1097:ec_prepare_update_cbk] 0-tier2-disperse-6: Failed to get size and version [Input/output error]
> [2018-10-05 23:48:44.740381] E [MSGID: 122034] [ec-common.c:613:ec_child_select] 0-tier2-disperse-6: Insufficient available children for this request (have 0, need 4)
> [2018-10-05 23:48:44.740394] E [MSGID: 122037] [ec-common.c:2040:ec_update_size_version_done] 0-tier2-disperse-6: Failed to update version and size [Input/output error]
> [2018-10-05 23:48:50.066103] I [MSGID: 109081] [dht-common.c:4379:dht_setxattr] 0-tier2-dht: fixing the layout of /
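>
> If I read these correctly, the "(have 0, need 4)" errors say that none of the bricks in disperse sets 6-11 (presumably the newly added ones) were reachable at that moment. To check whether the corresponding brick processes were actually online, something like this can be used (a sketch):
>
>   # List brick processes and their online status for the volume
>   gluster volume status tier2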
>
> Attached, you can find the first logs captured during the rebalance execution.
> In your opinion, is there a way to restore the gluster storage, or has all the data been lost?
>
> Thank you in advance,
> Mauro
>
> <rebalance_log.txt>
>
>
> On 4 October 2018, at 15:31, Mauro Tridici <[email protected]> wrote:
>
> Hi Nithya,
>
> thank you very much.
> This is the current "gluster volume info" output after removing the bricks (and after the peer detach commands).
>
> [root@s01 ~]# gluster volume info
>
> Volume Name: tier2
> Type: Distributed-Disperse
> Volume ID: a28d88c5-3295-4e35-98d4-210b3af9358c
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 6 x (4 + 2) = 36
> Transport-type: tcp
> Bricks:
> Brick1: s01-stg:/gluster/mnt1/brick
> Brick2: s02-stg:/gluster/mnt1/brick
> Brick3: s03-stg:/gluster/mnt1/brick
> Brick4: s01-stg:/gluster/mnt2/brick
> Brick5: s02-stg:/gluster/mnt2/brick
> Brick6: s03-stg:/gluster/mnt2/brick
> Brick7: s01-stg:/gluster/mnt3/brick
> Brick8: s02-stg:/gluster/mnt3/brick
> Brick9: s03-stg:/gluster/mnt3/brick
> Brick10: s01-stg:/gluster/mnt4/brick
> Brick11: s02-stg:/gluster/mnt4/brick
> Brick12: s03-stg:/gluster/mnt4/brick
> Brick13: s01-stg:/gluster/mnt5/brick
> Brick14: s02-stg:/gluster/mnt5/brick
> Brick15: s03-stg:/gluster/mnt5/brick
> Brick16: s01-stg:/gluster/mnt6/brick
> Brick17: s02-stg:/gluster/mnt6/brick
> Brick18: s03-stg:/gluster/mnt6/brick
> Brick19: s01-stg:/gluster/mnt7/brick
> Brick20: s02-stg:/gluster/mnt7/brick
> Brick21: s03-stg:/gluster/mnt7/brick
> Brick22: s01-stg:/gluster/mnt8/brick
> Brick23: s02-stg:/gluster/mnt8/brick
> Brick24: s03-stg:/gluster/mnt8/brick
> Brick25: s01-stg:/gluster/mnt9/brick
> Brick26: s02-stg:/gluster/mnt9/brick
> Brick27: s03-stg:/gluster/mnt9/brick
> Brick28: s01-stg:/gluster/mnt10/brick
> Brick29: s02-stg:/gluster/mnt10/brick
> Brick30: s03-stg:/gluster/mnt10/brick
> Brick31: s01-stg:/gluster/mnt11/brick
> Brick32: s02-stg:/gluster/mnt11/brick
> Brick33: s03-stg:/gluster/mnt11/brick
> Brick34: s01-stg:/gluster/mnt12/brick
> Brick35: s02-stg:/gluster/mnt12/brick
> Brick36: s03-stg:/gluster/mnt12/brick
> Options Reconfigured:
> network.ping-timeout: 0
> features.scrub: Active
> features.bitrot: on
> features.inode-quota: on
> features.quota: on
> performance.client-io-threads: on
> cluster.min-free-disk: 10
> cluster.quorum-type: auto
> transport.address-family: inet
> nfs.disable: on
> server.event-threads: 4
> client.event-threads: 4
> cluster.lookup-optimize: on
> performance.readdir-ahead: on
> performance.parallel-readdir: off
> cluster.readdir-optimize: on
> features.cache-invalidation: on
> features.cache-invalidation-timeout: 600
> performance.stat-prefetch: on
> performance.cache-invalidation: on
> performance.md-cache-timeout: 600
> network.inode-lru-limit: 50000
> performance.io-cache: off
> disperse.cpu-extensions: auto
> performance.io-thread-count: 16
> features.quota-deem-statfs: on
> features.default-soft-limit: 90
> cluster.server-quorum-type: server
> diagnostics.latency-measurement: on
> diagnostics.count-fop-hits: on
> cluster.brick-multiplex: on
> cluster.server-quorum-ratio: 51%
>
> Regards,
> Mauro
>
> On 4 October 2018, at 15:22, Nithya Balachandran <[email protected]> wrote:
>
> Hi Mauro,
>
> The files on s04 and s05 can be deleted safely as long as those bricks have been removed from the volume and their brick processes are not running.
>
> .glusterfs/indices/xattrop/xattrop-* entries are links to files that need to be healed.
> .glusterfs/quarantine/stub-00000000-0000-0000-0000-000000000008 links to files that bitrot (if enabled) says are corrupted. (None in this case.)
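>
> If you want a rough count of how many heal-pending entries a brick is holding, listing those links is enough; a sketch (the brick path is an example, and the count includes the base xattrop file itself):
>
>   # Count pending-heal markers on one brick
>   ls /gluster/mnt1/brick/.glusterfs/indices/xattrop | wc -l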
>
> I will get back to you on s06. Can you please provide the output of gluster volume info again?
>
> Regards,
> Nithya
>
> On 4 October 2018 at 13:47, Mauro Tridici <[email protected]> wrote:
>
>> Dear Ashish, Dear Nithya,
>>
>> I'm writing this message only to summarize and simplify the information about the "not migrated" files left on the removed bricks on servers s04, s05 and s06.
>> In attachment, you can find 3 files (1 file for each server) containing the lists of "not migrated" files and the related brick numbers.
>>
>> In particular:
>>
>> - the s04 and s05 bricks contain "not migrated" files only in the hidden directories "/gluster/mnt#/brick/.glusterfs" (can I delete them?);
>> - the s06 bricks contain:
>>   - "not migrated" files in the hidden directories "/gluster/mnt#/brick/.glusterfs";
>>   - "not migrated" files with size equal to 0;
>>   - "not migrated" files with size greater than 0.
>>
>> I collected and summarized this information to simplify your analysis.
>> Thank you very much,
>> Mauro
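>>
>> For reference, a per-brick list like the attached ones can be produced with a find that skips gluster's internal metadata; a sketch (the brick path and output file name are examples):
>>
>>   # List regular files left on a removed brick, excluding .glusterfs
>>   find /gluster/mnt1/brick -path '*/.glusterfs' -prune -o -type f -print > /tmp/not_migrated_mnt1.txt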
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
