Hi All,

We were able to get all 4 bricks distributed, and we can see the right amount of space. However, the rebalance has been running for 4 days on 16 TB and has only reached 8 TB. Is there a way to speed it up? There is also data we can remove from the volume to speed things up, but what is the best procedure for removing it: deleting from the main Gluster export point, or going to each brick and removing it there? We would like to stop the rebalance, delete the data, and then rebalance again.
Is there a downside to doing this? What does Gluster do about missing data during a rebalance?

Thanks,
Jose

---------------------------------
Jose Sanchez
Systems/Network Analyst 1
Center of Advanced Research Computing
1601 Central Ave.
MSC 01 1190
Albuquerque, NM 87131-0001
carc.unm.edu <http://carc.unm.edu/>
575.636.4232

> On Apr 27, 2018, at 4:16 AM, Hari Gowtham <[email protected]> wrote:
>
> Hi Jose,
>
> Why are all the bricks visible in volume info if the pre-validation
> for add-brick failed? I suspect that the remove brick wasn't done
> properly.
>
> You can provide the cmd_history.log to verify this. Better to get the
> other log messages.
>
> Also I need to know what are the bricks that were actually removed,
> the command used and its output.
>
> On Thu, Apr 26, 2018 at 3:47 AM, Jose Sanchez <[email protected]> wrote:
>> Looking at the logs, it seems that it is trying to add using the same port
>> that was assigned for gluster01ib:
>>
>> Any Ideas??
>>
>> Jose
>>
>> [2018-04-25 22:08:55.169302] I [MSGID: 106482]
>> [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management:
>> Received add brick req
>> [2018-04-25 22:08:55.186037] I [run.c:191:runner_log]
>> (-->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0x33045)
>> [0x7f5464b9b045]
>> -->/usr/lib64/glusterfs/3.8.15/xlator/mgmt/glusterd.so(+0xcbd85)
>> [0x7f5464c33d85] -->/lib64/libglusterfs.so.0(runner_log+0x115)
>> [0x7f54704cf1e5] ) 0-management: Ran script:
>> /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh
>> --volname=scratch --version=1 --volume-op=add-brick
>> --gd-workdir=/var/lib/glusterd
>> [2018-04-25 22:08:55.309534] I [MSGID: 106143]
>> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
>> /gdata/brick1/scratch on port 49152
>> [2018-04-25 22:08:55.309659] I [MSGID: 106143]
>> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
>> /gdata/brick1/scratch.rdma on port 49153
>> [2018-04-25 22:08:55.310231] E [MSGID: 106005]
>> [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start
>> brick gluster02ib:/gdata/brick1/scratch
>> [2018-04-25 22:08:55.310275] E [MSGID: 106074]
>> [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add
>> bricks
>> [2018-04-25 22:08:55.310304] E [MSGID: 106123]
>> [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
>> failed.
>> [2018-04-25 22:08:55.310316] E [MSGID: 106123]
>> [glusterd-mgmt.c:1427:glusterd_mgmt_v3_commit] 0-management: Commit failed
>> for operation Add brick on local node
>> [2018-04-25 22:08:55.310330] E [MSGID: 106123]
>> [glusterd-mgmt.c:2018:glusterd_mgmt_v3_initiate_all_phases] 0-management:
>> Commit Op Failed
>> [2018-04-25 22:09:11.678141] E [MSGID: 106452]
>> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
>> gluster02ib:/gdata/brick1/scratch not available. Brick may be containing or
>> be contained by an existing brick
>> [2018-04-25 22:09:11.678184] W [MSGID: 106122]
>> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
>> prevalidation failed.
>> [2018-04-25 22:09:11.678200] E [MSGID: 106122]
>> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
>> Pre Validation failed on operation Add brick
>> [root@gluster02 glusterfs]# gluster volume status scratch
>> Status of volume: scratch
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick gluster01ib:/gdata/brick1/scratch     49152     49153      Y       1819
>> Brick gluster01ib:/gdata/brick2/scratch     49154     49155      Y       1827
>> Brick gluster02ib:/gdata/brick1/scratch     N/A       N/A        N       N/A
>>
>> Task Status of Volume scratch
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> [root@gluster02 glusterfs]#
>>
>> On Apr 25, 2018, at 3:23 PM, Jose Sanchez <[email protected]> wrote:
>>
>> Hello Karthik
>>
>> Im having trouble adding the two bricks back online. Any help is
>> appreciated
>>
>> thanks
>>
>> when i try to add-brick command this is what i get
>>
>> [root@gluster01 ~]# gluster volume add-brick scratch
>> gluster02ib:/gdata/brick2/scratch/
>> volume add-brick: failed: Pre Validation failed on gluster02ib. Brick:
>> gluster02ib:/gdata/brick2/scratch not available.
>> Brick may be containing or
>> be contained by an existing brick
>>
>> I have run the following commands and remove the .glusterfs hidden
>> directories
>>
>> [root@gluster02 ~]# setfattr -x trusted.glusterfs.volume-id
>> /gdata/brick2/scratch/
>> setfattr: /gdata/brick2/scratch/: No such attribute
>> [root@gluster02 ~]# setfattr -x trusted.gfid /gdata/brick2/scratch/
>> setfattr: /gdata/brick2/scratch/: No such attribute
>> [root@gluster02 ~]#
>>
>> this is what I get when I run status and info
>>
>> [root@gluster01 ~]# gluster volume info scratch
>>
>> Volume Name: scratch
>> Type: Distribute
>> Volume ID: 23f1e4b1-b8e0-46c3-874a-58b4728ea106
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 4
>> Transport-type: tcp,rdma
>> Bricks:
>> Brick1: gluster01ib:/gdata/brick1/scratch
>> Brick2: gluster01ib:/gdata/brick2/scratch
>> Brick3: gluster02ib:/gdata/brick1/scratch
>> Brick4: gluster02ib:/gdata/brick2/scratch
>> Options Reconfigured:
>> nfs.disable: on
>> performance.readdir-ahead: on
>> [root@gluster01 ~]#
>>
>> [root@gluster02 ~]# gluster volume status scratch
>> Status of volume: scratch
>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>> ------------------------------------------------------------------------------
>> Brick gluster01ib:/gdata/brick1/scratch     49156     49157      Y       1819
>> Brick gluster01ib:/gdata/brick2/scratch     49158     49159      Y       1827
>> Brick gluster02ib:/gdata/brick1/scratch     N/A       N/A        N       N/A
>> Brick gluster02ib:/gdata/brick2/scratch     N/A       N/A        N       N/A
>>
>> Task Status of Volume scratch
>> ------------------------------------------------------------------------------
>> There are no active volume tasks
>>
>> [root@gluster02 ~]#
>>
>> This are the logs files from Gluster ETC
>>
>> [2018-04-25 20:56:54.390662] I [MSGID: 106143]
>> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding brick
>> /gdata/brick1/scratch on port 49152
>> [2018-04-25 20:56:54.390798] I [MSGID: 106143]
>> [glusterd-pmap.c:250:pmap_registry_bind] 0-pmap: adding
>> brick
>> /gdata/brick1/scratch.rdma on port 49153
>> [2018-04-25 20:56:54.391401] E [MSGID: 106005]
>> [glusterd-utils.c:4877:glusterd_brick_start] 0-management: Unable to start
>> brick gluster02ib:/gdata/brick1/scratch
>> [2018-04-25 20:56:54.391457] E [MSGID: 106074]
>> [glusterd-brick-ops.c:2493:glusterd_op_add_brick] 0-glusterd: Unable to add
>> bricks
>> [2018-04-25 20:56:54.391476] E [MSGID: 106123]
>> [glusterd-mgmt.c:294:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit
>> failed.
>> [2018-04-25 20:56:54.391490] E [MSGID: 106123]
>> [glusterd-mgmt-handler.c:603:glusterd_handle_commit_fn] 0-management: commit
>> failed on operation Add brick
>> [2018-04-25 20:58:55.332262] I [MSGID: 106499]
>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>> Received status volume req for volume scratch
>> [2018-04-25 21:02:07.464357] E [MSGID: 106452]
>> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
>> gluster02ib:/gdata/brick1/scratch not available. Brick may be containing or
>> be contained by an existing brick
>> [2018-04-25 21:02:07.464395] W [MSGID: 106122]
>> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
>> prevalidation failed.
>> [2018-04-25 21:02:07.464414] E [MSGID: 106122]
>> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
>> Pre Validation failed on operation Add brick
>> [2018-04-25 21:04:56.198662] E [MSGID: 106452]
>> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
>> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
>> be contained by an existing brick
>> [2018-04-25 21:04:56.198700] W [MSGID: 106122]
>> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
>> prevalidation failed.
>> [2018-04-25 21:04:56.198716] E [MSGID: 106122]
>> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
>> Pre Validation failed on operation Add brick
>> [2018-04-25 21:07:11.084205] I [MSGID: 106482]
>> [glusterd-brick-ops.c:447:__glusterd_handle_add_brick] 0-management:
>> Received add brick req
>> [2018-04-25 21:07:11.087682] E [MSGID: 106452]
>> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
>> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
>> be contained by an existing brick
>> [2018-04-25 21:07:11.087716] W [MSGID: 106122]
>> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
>> prevalidation failed.
>> [2018-04-25 21:07:11.087729] E [MSGID: 106122]
>> [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre
>> Validation failed for operation Add brick on local node
>> [2018-04-25 21:07:11.087741] E [MSGID: 106122]
>> [glusterd-mgmt.c:2009:glusterd_mgmt_v3_initiate_all_phases] 0-management:
>> Pre Validation Failed
>> [2018-04-25 21:12:22.340221] E [MSGID: 106452]
>> [glusterd-utils.c:6064:glusterd_new_brick_validate] 0-management: Brick:
>> gluster02ib:/gdata/brick2/scratch not available. Brick may be containing or
>> be contained by an existing brick
>> [2018-04-25 21:12:22.340259] W [MSGID: 106122]
>> [glusterd-mgmt.c:188:gd_mgmt_v3_pre_validate_fn] 0-management: ADD-brick
>> prevalidation failed.
>> [2018-04-25 21:12:22.340274] E [MSGID: 106122]
>> [glusterd-mgmt-handler.c:337:glusterd_handle_pre_validate_fn] 0-management:
>> Pre Validation failed on operation Add brick
>> [2018-04-25 21:18:13.427036] I [MSGID: 106499]
>> [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management:
>> Received status volume req for volume scratch
>> [root@gluster02 glusterfs]#
>>
>> ---------------------------------
>> Jose Sanchez
>> Systems/Network Analyst 1
>> Center of Advanced Research Computing
>> 1601 Central Ave.
>> MSC 01 1190
>> Albuquerque, NM 87131-0001
>> carc.unm.edu
>> 575.636.4232
>>
>> On Apr 12, 2018, at 12:11 AM, Karthik Subrahmanya <[email protected]>
>> wrote:
>>
>> On Wed, Apr 11, 2018 at 7:38 PM, Jose Sanchez <[email protected]> wrote:
>>>
>>> Hi Karthik
>>>
>>> Looking at the information you have provided me, I would like to make sure
>>> that I’m running the right commands.
>>>
>>> 1. gluster volume heal scratch info
>>
>> If the count is non zero, trigger the heal and wait for heal info count to
>> become zero.
>>>
>>> 2. gluster volume remove-brick scratch replica 1
>>> gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force
>>>
>>> 3. gluster volume add-brick "#" scratch gluster02ib:/gdata/brick1/scratch
>>> gluster02ib:/gdata/brick2/scratch
>>>
>>> Based on the configuration I have, Brick 1 from Node A and B are tied
>>> together and Brick 2 from Node A and B are also tied together. Looking at
>>> your remove command (step #2), it seems that you want me to remove Brick 1
>>> and 2 from Node B (gluster02ib). is that correct? I thought the data was
>>> distributed in bricks 1 between nodes A and B) and duplicated on Bricks 2
>>> (node A and B).
>>
>> Data is duplicated between bricks 1 of nodes A & B and bricks 2 of nodes A &
>> B and data is distributed between these two pairs.
>> You need not always remove the bricks 1 & 2 from node B itself. The idea
>> here is to keep one copy from both the replica pairs.
>>>
>>> Also when I add the bricks back to gluster, do I need to specify if it is
>>> distributed or replicated?? and Do i need a configuration #?? for example on
>>> your command (Step #2) you have “replica 1” when remove bricks, do I need to
>>> do the same when adding the nodes back ?
>>
>> No. You just need to erase the data on those bricks and add those bricks
>> back to the volume. The previous remove-brick command will make the volume
>> plain distribute.
>> Then simply adding the bricks without specifying any "#"
>> will expand the volume as a plain distribute volume.
>>>
>>> Im planning on moving with this changes in few days. At this point each
>>> brick has 14tb and adding bricks 1 from node A and B, i have a total of
>>> 28tb, After doing all the process, (removing and adding bricks) I should be
>>> able to see a total of 56Tb right ?
>>
>> Yes after all these you will have 56TB in total.
>> After adding the bricks, do volume rebalance, so that the data which were
>> present previously, will be moved to the correct bricks.
>>
>> HTH,
>> Karthik
>>>
>>> Thanks
>>>
>>> Jose
>>>
>>> ---------------------------------
>>> Jose Sanchez
>>> Systems/Network Analyst 1
>>> Center of Advanced Research Computing
>>> 1601 Central Ave.
>>> MSC 01 1190
>>> Albuquerque, NM 87131-0001
>>> carc.unm.edu
>>> 575.636.4232
>>>
>>> On Apr 7, 2018, at 8:29 AM, Karthik Subrahmanya <[email protected]>
>>> wrote:
>>>
>>> Hi Jose,
>>>
>>> Thanks for providing the volume info. You have 2 subvolumes. Data is
>>> replicated within the bricks of that subvolumes.
>>> First one consisting of Node A's brick1 & Node B's brick1 and the second
>>> one consisting of Node A's brick2 and Node B's brick2.
>>> You don't have the same data on all the 4 bricks. Data are distributed
>>> between these two subvolumes.
>>> To remove the replica you can use the command
>>> gluster volume remove-brick scratch replica 1
>>> gluster02ib:/gdata/brick1/scratch gluster02ib:/gdata/brick2/scratch force
>>> So you will have one copy of data present from both the distributes.
>>> Before doing this make sure "gluster volume heal scratch info" value is
>>> zero. So copies you retain will have the correct data.
>>> After the remove-brick erase the data from the backend.
>>> Then you can expand the volume by following the steps at [1].
>>>
>>> [1]
>>> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#expanding-volumes
>>>
>>> Regards,
>>> Karthik
>>>
>>> On Fri, Apr 6, 2018 at 11:39 PM, Jose Sanchez <[email protected]>
>>> wrote:
>>>>
>>>> Hi Karthik
>>>>
>>>> this is our configuration, is 2x2 =4 , they are all replicated , each
>>>> brick has 14tb. we have 2 nodes A and B, each one with brick 1 and 2.
>>>>
>>>> Node A (replicated A1 (14tb) and B1 (14tb) ) same with node B
>>>> (Replicated A2 (14tb) and B2 (14tb)).
>>>>
>>>> Do you think we need to degrade the node first before removing it. i
>>>> believe the same copy of data is on all 4 bricks, we would like to keep one
>>>> of them, and add the other bricks as extra space
>>>>
>>>> Thanks for your help on this
>>>>
>>>> Jose
>>>>
>>>> [root@gluster01 ~]# gluster volume info scratch
>>>>
>>>> Volume Name: scratch
>>>> Type: Distributed-Replicate
>>>> Volume ID: 23f1e4b1-b8e0-46c3-874a-58b4728ea106
>>>> Status: Started
>>>> Snapshot Count: 0
>>>> Number of Bricks: 2 x 2 = 4
>>>> Transport-type: tcp,rdma
>>>> Bricks:
>>>> Brick1: gluster01ib:/gdata/brick1/scratch
>>>> Brick2: gluster02ib:/gdata/brick1/scratch
>>>> Brick3: gluster01ib:/gdata/brick2/scratch
>>>> Brick4: gluster02ib:/gdata/brick2/scratch
>>>> Options Reconfigured:
>>>> performance.readdir-ahead: on
>>>> nfs.disable: on
>>>>
>>>> [root@gluster01 ~]# gluster volume status all
>>>> Status of volume: scratch
>>>> Gluster process                             TCP Port  RDMA Port  Online  Pid
>>>> ------------------------------------------------------------------------------
>>>> Brick gluster01ib:/gdata/brick1/scratch     49152     49153      Y       1743
>>>> Brick gluster02ib:/gdata/brick1/scratch     49156     49157      Y       1732
>>>> Brick gluster01ib:/gdata/brick2/scratch     49154     49155      Y       1738
>>>> Brick gluster02ib:/gdata/brick2/scratch     49158     49159      Y       1733
>>>> Self-heal Daemon on localhost               N/A       N/A        Y       1728
>>>> Self-heal Daemon on gluster02ib             N/A       N/A        Y       1726
>>>>
>>>> Task Status of Volume scratch
>>>> ------------------------------------------------------------------------------
>>>> There are no active volume tasks
>>>>
>>>> ---------------------------------
>>>> Jose Sanchez
>>>> Systems/Network Analyst 1
>>>> Center of Advanced Research Computing
>>>> 1601 Central Ave.
>>>> MSC 01 1190
>>>> Albuquerque, NM 87131-0001
>>>> carc.unm.edu
>>>> 575.636.4232
>>>>
>>>> On Apr 6, 2018, at 3:49 AM, Karthik Subrahmanya <[email protected]>
>>>> wrote:
>>>>
>>>> Hi Jose,
>>>>
>>>> By switching into pure distribute volume you will lose availability if
>>>> something goes bad.
>>>>
>>>> I am guessing you have a nX2 volume.
>>>> If you want to preserve one copy of the data in all the distributes, you
>>>> can do that by decreasing the replica count in the remove-brick operation.
>>>> If you have any inconsistency, heal them first using the "gluster volume
>>>> heal <volname>" command and wait till the
>>>> "gluster volume heal <volname> info" output becomes zero, before removing
>>>> the bricks, so that you will have the correct data.
>>>> If you do not want to preserve the data then you can directly remove the
>>>> bricks.
>>>> Even after removing the bricks the data will be present in the backend of
>>>> the removed bricks. You have to manually erase them (both data and
>>>> .glusterfs folder).
>>>> See [1] for more details on remove-brick.
>>>>
>>>> [1].
>>>> https://docs.gluster.org/en/latest/Administrator%20Guide/Managing%20Volumes/#shrinking-volumes
>>>>
>>>> HTH,
>>>> Karthik
>>>>
>>>> On Thu, Apr 5, 2018 at 8:17 PM, Jose Sanchez <[email protected]>
>>>> wrote:
>>>>>
>>>>> We have a Gluster setup with 2 nodes (distributed replication) and we
>>>>> would like to switch it to the distributed mode. I know the data is
>>>>> duplicated between those nodes, what is the proper way of switching it to a
>>>>> distributed, we would like to double or gain the storage space on our
>>>>> gluster storage node.
>>>>> what happens with the data, do i need to erase one of
>>>>> the nodes?
>>>>>
>>>>> Jose
>>>>>
>>>>> ---------------------------------
>>>>> Jose Sanchez
>>>>> Systems/Network Analyst
>>>>> Center of Advanced Research Computing
>>>>> 1601 Central Ave.
>>>>> MSC 01 1190
>>>>> Albuquerque, NM 87131-0001
>>>>> carc.unm.edu
>>>>> 575.636.4232
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> [email protected]
>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
>>>>
>>>
>>
>> _______________________________________________
>> Gluster-users mailing list
>> [email protected]
>> http://lists.gluster.org/mailman/listinfo/gluster-users
>
> --
> Regards,
> Hari Gowtham.
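
For reference, the stop, delete, and rebalance-again sequence asked about at the top of this message might look like the dry-run sketch below. It only prints commands and runs nothing. The mount point /mnt/scratch and the directory name old_results are hypothetical placeholders that do not come from this thread, and the time estimate assumes that the 8 TB of 16 TB in 4 days continues at a constant rate, which may not hold in practice.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the commands, it does not execute them.
# Assumptions (not from the thread): the volume is client-mounted at
# /mnt/scratch and the directory to delete is called old_results.

# Print the stop -> delete -> restart sequence for one volume.
plan_cleanup() {
    local volume="$1" mount_point="$2" doomed_dir="$3"
    echo "gluster volume rebalance ${volume} status"
    echo "gluster volume rebalance ${volume} stop"
    # Delete through the client mount, not inside a brick directory,
    # so that Gluster's own metadata stays consistent.
    echo "rm -rf -- ${mount_point}/${doomed_dir}"
    echo "gluster volume rebalance ${volume} start"
}

# Rough days remaining, assuming the rebalance rate stays constant.
# Integer arithmetic only: (total - done) * elapsed / done.
estimate_days_left() {
    local total_tb="$1" done_tb="$2" elapsed_days="$3"
    echo $(( (total_tb - done_tb) * elapsed_days / done_tb ))
}

plan_cleanup scratch /mnt/scratch old_results
estimate_days_left 16 8 4
```

With the figures from this message (16 TB total, 8 TB done, 4 days elapsed), the estimate prints 4, that is, about four more days at the current pace. Removing data first would shrink that only to the extent that the deleted files had not already been migrated.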
_______________________________________________
Gluster-users mailing list
[email protected]
http://lists.gluster.org/mailman/listinfo/gluster-users
