On Tue, 18 Dec 2018 at 21:11, Stephen Remde <[email protected]> wrote:
> Nithya,
>
> You are correct, but as you stated earlier, it also has to migrate data
> from other bricks on the same host, so another 74TB on dc4-03 /dev/md0
> needs to be migrated?

Ah, ok. Let me clarify - it will migrate files to and from dc4-03 /dev/md0 to
use the new directory layouts (which is correct behaviour for a rebalance but
unnecessary in this case, as it can potentially cause the remove-brick
operation to take longer). It should not migrate all the data from this brick,
so you should still see lots of files on it. Only the data on the removed
brick will be moved off completely.

I recommend that you let the remove-brick proceed and keep an eye on the disk
usage of dc4-03 /dev/md0 as well. Let me know if you see it falling
drastically.

If you can afford some downtime, we could probably try alternate methods, but
those would mean that users would not have access to some files on the volume.

Regards,
Nithya
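For anyone following along, a minimal sketch of how one might watch both the
migration and the disk usage from dc4-03, using the volume and brick names in
this thread (adjust the mount paths to your own layout):

    # Progress of the data migration started by remove-brick
    gluster volume remove-brick video-backup 10.0.0.43:/export/md1/brick status

    # Disk usage of both bricks on dc4-03: /export/md1 (the brick being removed)
    # should drain, while /export/md0 should not empty out even though the
    # rebalance moves files to and from it
    df -h /export/md0 /export/md1

Re-running these every few minutes (for example under watch) is enough to spot
the "falling drastically" case described above.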
>> This is the current behaviour of rebalance and nothing to be concerned
>> about - it will migrate data on all bricks on the nodes which host the
>> bricks being removed
>
> Steve
>
> On Tue, 18 Dec 2018 at 15:37, Nithya Balachandran <[email protected]> wrote:
>
>> On Tue, 18 Dec 2018 at 14:56, Stephen Remde <[email protected]> wrote:
>>
>>> Nithya,
>>>
>>> I've realised I will not have enough space on the other bricks in my
>>> cluster to migrate data off the server so I can remove the single brick -
>>> is there a workaround?
>>>
>>> As you can see below, the new brick was created with the wrong RAID
>>> configuration, so I want to remove it, recreate the RAID, and re-add it.
>>>
>>> xxxxxx  Filesystem  Size  Used  Avail  Use%  Mounted on
>>> dc4-01  /dev/md0     95T   87T   8.0T   92%  /export/md0
>>> dc4-01  /dev/md1     95T   87T   8.4T   92%  /export/md1
>>> dc4-01  /dev/md2     95T   86T   9.3T   91%  /export/md2
>>> dc4-01  /dev/md3     95T   86T   8.9T   91%  /export/md3
>>> dc4-02  /dev/md0     95T   89T   6.5T   94%  /export/md0
>>> dc4-02  /dev/md1     95T   87T   8.4T   92%  /export/md1
>>> dc4-02  /dev/md2     95T   87T   8.6T   91%  /export/md2
>>> dc4-02  /dev/md3     95T   86T   8.8T   91%  /export/md3
>>> dc4-03  /dev/md0     95T   74T    21T   78%  /export/md0
>>> dc4-03  /dev/md1    102T  519G   102T    1%  /export/md1
>>>
>> I believe this is the brick being removed - the one that has about 519G of
>> data? If I have understood the scenario properly, there seems to be plenty
>> of free space on the other bricks (most seem to have terabytes free). Is
>> there something I am missing?
>>
>> Regards,
>> Nithya
>>
>>> This is the backup storage, so if I HAVE to lose the 519GB and resync,
>>> that's an acceptable worst-case.
>>>
>>> gluster> v info video-backup
>>>
>>> Volume Name: video-backup
>>> Type: Distribute
>>> Volume ID: 887bdc2a-ca5e-4ca2-b30d-86831839ed04
>>> Status: Started
>>> Snapshot Count: 0
>>> Number of Bricks: 10
>>> Transport-type: tcp
>>> Bricks:
>>> Brick1: 10.0.0.41:/export/md0/brick
>>> Brick2: 10.0.0.42:/export/md0/brick
>>> Brick3: 10.0.0.43:/export/md0/brick
>>> Brick4: 10.0.0.41:/export/md1/brick
>>> Brick5: 10.0.0.42:/export/md1/brick
>>> Brick6: 10.0.0.41:/export/md2/brick
>>> Brick7: 10.0.0.42:/export/md2/brick
>>> Brick8: 10.0.0.41:/export/md3/brick
>>> Brick9: 10.0.0.42:/export/md3/brick
>>> Brick10: 10.0.0.43:/export/md1/brick
>>> Options Reconfigured:
>>> cluster.rebal-throttle: aggressive
>>> cluster.min-free-disk: 1%
>>> transport.address-family: inet
>>> performance.readdir-ahead: on
>>> nfs.disable: on
>>>
>>> Best,
>>>
>>> Steve
>>>
>>> On Wed, 12 Dec 2018 at 03:07, Nithya Balachandran <[email protected]> wrote:
>>>
>>>> This is the current behaviour of rebalance and nothing to be concerned
>>>> about - it will migrate data on all bricks on the nodes which host the
>>>> bricks being removed. The data on the removed bricks will be moved to
>>>> other bricks; some of the data on the other bricks on the node will just
>>>> be moved to other bricks based on the new directory layouts.
>>>> I will fix this in the near future, but you don't need to stop the
>>>> remove-brick operation.
>>>>
>>>> Regards,
>>>> Nithya
>>>>
>>>> On Wed, 12 Dec 2018 at 06:36, Stephen Remde <[email protected]> wrote:
>>>>
>>>>> I requested a brick be removed from a distribute-only volume and it
>>>>> seems to be migrating data from the wrong brick... unless I am reading
>>>>> this wrong, which I doubt, because the disk usage is definitely
>>>>> decreasing on the wrong brick.
>>>>>
>>>>> gluster> volume status
>>>>> Status of volume: video-backup
>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick 10.0.0.41:/export/md0/brick           49172     0          Y       5306
>>>>> Brick 10.0.0.42:/export/md0/brick           49172     0          Y       3651
>>>>> Brick 10.0.0.43:/export/md0/brick           49155     0          Y       2826
>>>>> Brick 10.0.0.41:/export/md1/brick           49173     0          Y       5311
>>>>> Brick 10.0.0.42:/export/md1/brick           49173     0          Y       3656
>>>>> Brick 10.0.0.41:/export/md2/brick           49174     0          Y       5316
>>>>> Brick 10.0.0.42:/export/md2/brick           49174     0          Y       3662
>>>>> Brick 10.0.0.41:/export/md3/brick           49175     0          Y       5322
>>>>> Brick 10.0.0.42:/export/md3/brick           49175     0          Y       3667
>>>>> Brick 10.0.0.43:/export/md1/brick           49156     0          Y       4836
>>>>>
>>>>> Task Status of Volume video-backup
>>>>> ------------------------------------------------------------------------------
>>>>> Task                 : Rebalance
>>>>> ID                   : 7895be7c-4ab9-440d-a301-c11dae0dd9e1
>>>>> Status               : completed
>>>>>
>>>>> gluster> volume remove-brick video-backup 10.0.0.43:/export/md1/brick start
>>>>> volume remove-brick start: success
>>>>> ID: f666a196-03c2-4940-bd38-45d8383345a4
>>>>>
>>>>> gluster> volume status
>>>>> Status of volume: video-backup
>>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick 10.0.0.41:/export/md0/brick           49172     0          Y       5306
>>>>> Brick 10.0.0.42:/export/md0/brick           49172     0          Y       3651
>>>>> Brick 10.0.0.43:/export/md0/brick           49155     0          Y       2826
>>>>> Brick 10.0.0.41:/export/md1/brick           49173     0          Y       5311
>>>>> Brick 10.0.0.42:/export/md1/brick           49173     0          Y       3656
>>>>> Brick 10.0.0.41:/export/md2/brick           49174     0          Y       5316
>>>>> Brick 10.0.0.42:/export/md2/brick           49174     0          Y       3662
>>>>> Brick 10.0.0.41:/export/md3/brick           49175     0          Y       5322
>>>>> Brick 10.0.0.42:/export/md3/brick           49175     0          Y       3667
>>>>> Brick 10.0.0.43:/export/md1/brick           49156     0          Y       4836
>>>>>
>>>>> Task Status of Volume video-backup
>>>>> ------------------------------------------------------------------------------
>>>>> Task                 : Remove brick
>>>>> ID                   : f666a196-03c2-4940-bd38-45d8383345a4
>>>>> Removed bricks:
>>>>> 10.0.0.43:/export/md1/brick
>>>>> Status               : in progress
>>>>>
>>>>> But when I check the rebalance log on the host with the brick being
>>>>> removed, it is actually migrating data from the other brick on the same
>>>>> host 10.0.0.43:/export/md0/brick
>>>>>
>>>>> .....
>>>>> [2018-12-11 11:59:52.572657] I [MSGID: 109086]
>>>>> [dht-shared.c:297:dht_parse_decommissioned_bricks] 0-video-backup-dht:
>>>>> *decommissioning subvolume video-backup-client-9*
>>>>> ....
>>>>> 29: volume video-backup-client-2
>>>>> 30:     type protocol/client
>>>>> 31:     option clnt-lk-version 1
>>>>> 32:     option volfile-checksum 0
>>>>> 33:     option volfile-key rebalance/video-backup
>>>>> 34:     option client-version 3.8.15
>>>>> 35:     option process-uuid node-dc4-03-25536-2018/12/11-11:59:47:551328-video-backup-client-2-0-0
>>>>> 36:     option fops-version 1298437
>>>>> 37:     option ping-timeout 42
>>>>> 38:     option remote-host 10.0.0.43
>>>>> 39:     option remote-subvolume /export/md0/brick
>>>>> 40:     option transport-type socket
>>>>> 41:     option transport.address-family inet
>>>>> 42:     option username 9e7fe743-ecd7-40aa-b3db-e112086b2fc7
>>>>> 43:     option password dab178d6-ecb4-4293-8c1d-6281ec2cafc2
>>>>> 44: end-volume
>>>>> ...
>>>>> 112: volume video-backup-client-9
>>>>> 113:     type protocol/client
>>>>> 114:     option ping-timeout 42
>>>>> 115:     option remote-host 10.0.0.43
>>>>> 116:     option remote-subvolume /export/md1/brick
>>>>> 117:     option transport-type socket
>>>>> 118:     option transport.address-family inet
>>>>> 119:     option username 9e7fe743-ecd7-40aa-b3db-e112086b2fc7
>>>>> 120:     option password dab178d6-ecb4-4293-8c1d-6281ec2cafc2
>>>>> 121: end-volume
>>>>> ...
>>>>> [2018-12-11 11:59:52.608698] I [dht-rebalance.c:3668:gf_defrag_start_crawl]
>>>>> 0-video-backup-dht: gf_defrag_start_crawl using commit hash 3766302106
>>>>> [2018-12-11 11:59:52.609478] I [MSGID: 109081] [dht-common.c:4198:dht_setxattr]
>>>>> 0-video-backup-dht: fixing the layout of /
>>>>> [2018-12-11 11:59:52.615348] I [MSGID: 0] [dht-rebalance.c:3746:gf_defrag_start_crawl]
>>>>> 0-video-backup-dht: local subvols are video-backup-client-2
>>>>> [2018-12-11 11:59:52.615378] I [MSGID: 0] [dht-rebalance.c:3746:gf_defrag_start_crawl]
>>>>> 0-video-backup-dht: local subvols are video-backup-client-9
>>>>> ...
>>>>> [2018-12-11 11:59:52.616554] I [dht-rebalance.c:2652:gf_defrag_process_dir]
>>>>> 0-video-backup-dht: migrate data called on /
>>>>> [2018-12-11 11:59:54.000363] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /symlinks.txt: attempting to move from
>>>>> video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:55.110549] I [MSGID: 109022] [dht-rebalance.c:1703:dht_migrate_file]
>>>>> 0-video-backup-dht: completed migration of /symlinks.txt from subvolume
>>>>> video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:58.100931] I [MSGID: 109081] [dht-common.c:4198:dht_setxattr]
>>>>> 0-video-backup-dht: fixing the layout of /A6
>>>>> [2018-12-11 11:59:58.107389] I [dht-rebalance.c:2652:gf_defrag_process_dir]
>>>>> 0-video-backup-dht: migrate data called on /A6
>>>>> [2018-12-11 11:59:58.132138] I [dht-rebalance.c:2866:gf_defrag_process_dir]
>>>>> 0-video-backup-dht: Migration operation on dir /A6 took 0.02 secs
>>>>> [2018-12-11 11:59:58.330393] I [MSGID: 109081] [dht-common.c:4198:dht_setxattr]
>>>>> 0-video-backup-dht: fixing the layout of /A6/2017
>>>>> [2018-12-11 11:59:58.337601] I [dht-rebalance.c:2652:gf_defrag_process_dir]
>>>>> 0-video-backup-dht: migrate data called on /A6/2017
>>>>> [2018-12-11 11:59:58.493906] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /A6/2017/57c81ed09f31cd6c1c8990ae20160908101048:
>>>>> attempting to move from video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:58.706068] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /A6/2017/57c81ed09f31cd6c1c8990ae20160908120734132317:
>>>>> attempting to move from video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:58.783952] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /A6/2017/584a8bcdaca0515f595dff8820161124091841:
>>>>> attempting to move from video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:58.843315] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /A6/2017/584a8bcdaca0515f595dff8820161124135453:
>>>>> attempting to move from video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:58.951637] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /A6/2017/584a8bcdaca0515f595dff8820161122111252:
>>>>> attempting to move from video-backup-client-2 to video-backup-client-4
>>>>> [2018-12-11 11:59:59.005324] I [dht-rebalance.c:2866:gf_defrag_process_dir]
>>>>> 0-video-backup-dht: Migration operation on dir /A6/2017 took 0.67 secs
>>>>> [2018-12-11 11:59:59.005362] I [dht-rebalance.c:1230:dht_migrate_file]
>>>>> 0-video-backup-dht: /A6/2017/58906aaaaca0515f5994104d20170213154555:
>>>>> attempting to move from video-backup-client-2 to video-backup-client-4
>>>>>
>>>>> etc...
>>>>>
>>>>> Can I stop/cancel it without data loss? How can I make gluster remove the
>>>>> correct brick?
>>>>>
>>>>> Thanks
>>>>>
>
> --
>
> Dr Stephen Remde
> Director, Innovation and Research
>
> T: 01535 280066
> M: 07764 740920
> E: [email protected]
> W: www.gaist.co.uk
>
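The subvolume names in the log map back to bricks through the client volfile
that the rebalance process dumps into its log: video-backup-client-2 has
remote-subvolume /export/md0/brick, while video-backup-client-9 (the one being
decommissioned) has /export/md1/brick. A quick way to pull that mapping out on
dc4-03 might look like the following, assuming the usual
/var/log/glusterfs/<volname>-rebalance.log location (adjust the path if your
install logs elsewhere):

    # Show which host and brick path a DHT client subvolume points at
    grep -A 12 'volume video-backup-client-9' /var/log/glusterfs/video-backup-rebalance.log \
        | grep -E 'remote-(host|subvolume)'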
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
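For completeness, the overall sequence described in the thread (drain the
mis-built brick, rebuild the array, bring it back) would normally look
something like the sketch below. This is only an outline using the names from
this thread; it assumes the remove-brick data migration finishes cleanly and
that /export/md1 comes back as a freshly created filesystem:

    # 1. Wait for the data migration to complete, then detach the brick for good
    gluster volume remove-brick video-backup 10.0.0.43:/export/md1/brick status
    gluster volume remove-brick video-backup 10.0.0.43:/export/md1/brick commit

    # (An unwanted remove-brick can instead be aborted with 'stop'; files that
    #  have already been migrated stay on their new bricks.)

    # 2. On dc4-03: rebuild the RAID array, recreate the filesystem, and
    #    remount it on /export/md1

    # 3. Re-add the brick and rebalance so it receives its share of the data
    gluster volume add-brick video-backup 10.0.0.43:/export/md1/brick
    gluster volume rebalance video-backup start
    gluster volume rebalance video-backup status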
