That makes sense. The system is made of four data arrays with hardware RAID 6, and then the distributed volume on top. I honestly don't know how that works, but the previous administrator said we had redundancy. I'm hoping there is a way to bypass the safeguard of migrating data when removing a brick from the volume, which, in my beginner's mind, would be a straightforward way of remedying the problem. Hopefully once the empty bricks are removed, the "missing" data will be visible again in the volume.
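If that's a sane approach, here is roughly what I have in mind. This is untested, and "newserver" and "dataX" below are just the placeholders from my earlier message, standing in for each of the four new bricks:

    # drop one of the empty, wrongly named bricks without migrating data
    gluster volume remove-brick vol.name newserver:/bricks/dataX force

    # after clearing the leftover metadata on that brick, re-add it
    # under the correct path
    gluster volume add-brick vol.name newserver:/bricks/dataX/vol.name

    # then confirm the volume sees all 20 bricks and the extra 400T
    gluster volume info vol.name
    df -h /vol.name

Does "force" on remove-brick actually skip the migration step, or is there a safer way to tell gluster those bricks are empty?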
On Wed, Feb 27, 2019 at 3:59 PM Jim Kinney <[email protected]> wrote:

> Keep in mind that gluster is a metadata process. It doesn't really touch
> the actual volume files. The exception is the .glusterfs and .trashcan
> folders in the very top directory of the gluster volume.
>
> When you create a gluster volume from bricks, it doesn't format the
> filesystem. It uses what's already there.
>
> So if you remove a volume and all its bricks, you've not deleted data.
>
> That said, if you are using anything but replicated bricks, which is what
> I use exclusively for my needs, then reassembling them into a new volume
> with the correct name might be tricky. By listing the bricks in the exact
> same order as they were listed when creating the wrongly named volume,
> the correctly named volume should use the same method to put data on the
> drives as before and not scramble anything.
>
> On Wed, 2019-02-27 at 14:24 -0500, Tami Greene wrote:
>
> > I sent this and realized I hadn't registered. My apologies for the
> > duplication.
> >
> > Subject: Added bricks with wrong name and now need to remove them
> > without destroying volume.
> > To: <[email protected]>
> >
> > Yes, I broke it. Now I need help fixing it.
> >
> > I have an existing Gluster volume, spread over 16 bricks and 4 servers;
> > 1.5P of space with 49% currently used. Added an additional 4 bricks and
> > a server as we expect a large influx of data in the next 4 to 6 months.
> > The system had been established by my predecessor, who is no longer
> > here.
> >
> > This was my first solo addition of bricks to gluster.
> >
> > Everything went smoothly until "gluster volume add-brick Volume
> > newserver:/bricks/dataX/vol.name". (I don't have the exact response, as
> > I worked on this for almost 5 hours last night.) Unable to add-brick as
> > "it is already mounted", or something to that effect.
> >
> > Double checked my instructions and the names of the bricks. Everything
> > seemed correct. Tried to add again, adding "force." Again, "unable to
> > add-brick."
> >
> > Because of the keyword (in my mind) "mounted" in the error, I checked
> > /etc/fstab, where the name of the mount point is simply /bricks/dataX.
> > This convention was the same across all servers, so I thought I had
> > discovered an error in my notes and changed the name to
> > newserver:/bricks/dataX.
> >
> > Still had to use force, but the bricks were added.
> >
> > Restarted the gluster volume vol.name. No errors.
> >
> > Rebooted, but /vol.name did not mount on reboot as /etc/fstab
> > instructs. So I attempted to mount manually and discovered I had a big
> > mess on my hands: "Transport endpoint not connected", in addition to
> > other messages.
> >
> > Discovered an issue between the certificates and the auth.ssl-allow
> > list because of the hostname of the new server. I made the correction
> > and /vol.name mounted.
> >
> > However, df -h indicated the 4 new bricks were not being seen, as 400T
> > was missing from what should have been available.
> >
> > Thankfully, I could add something to vol.name on one machine and see it
> > on another machine, and I wrongly assumed the volume was operational,
> > even if the new bricks were not recognized.
> >
> > So I tried to correct the main issue with
> >
> > gluster volume remove vol.name newserver/bricks/dataX/
> >
> > received a prompt – data will be migrated before the brick is removed,
> > continue? (or something to that effect) – and I started the process,
> > thinking it wouldn't take long because there is no data.
> >
> > After 10 minutes and no apparent progress on the process, I did panic,
> > thinking worst-case scenario – it is writing zeros over my data.
> >
> > Executed the stop command and there was still no progress, and I assume
> > it was due to there being no data on the brick to be removed, causing
> > the program to hang. Found the process ID and killed it.
> >
> > This morning, while all clients and servers can access /vol.name, not
> > all of the data is present. I can find it on the cluster, but users
> > cannot reach it. I am, again, assuming it is because of the 4 bricks
> > that have been added, but aren't really a part of the volume because of
> > their incorrect name.
> >
> > So – how do I proceed from here?
> >
> > 1. Remove the 4 empty bricks from the volume without damaging data.
> >
> > 2. Correctly clear any metadata about these 4 bricks ONLY so they may
> > be added correctly.
> >
> > If this doesn't restore the volume to full functionality, I'll write
> > another post if I cannot find the answer in my notes or online.
> >
> > Tami
>
> --
> James P. Kinney III
>
> Every time you stop a school, you will have to build a jail. What you
> gain at one end you lose at the other. It's like feeding a dog on his
> own tail. It won't fatten the dog. - Speech 11/23/1900 Mark Twain
> http://heretothereideas.blogspot.com/

--
Tami
_______________________________________________
Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
