Thanks Gobinda,
I am in the process of finishing up the 9-node cluster; once done, I will
test this ansible role...
On Fri, Jun 14, 2019 at 12:45 PM Gobinda Das wrote:
We have an ansible role to replace a gluster node. I think it works only with
the same FQDN.
https://github.com/sac/gluster-ansible-maintenance
I am not sure if it covers all scenarios, but you can try with the same FQDN.
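If you want to try it, the invocation should look roughly like this (the
playbook name is made up and the variable names are from memory, so please
double-check them against the role's README):

ansible-playbook -i inventory replace_node.yml \
    -e gluster_maintenance_old_node=vmm10.mydomain.com \
    -e gluster_maintenance_new_node=vmm10.mydomain.com \
    -e gluster_maintenance_cluster_node=vmm11.mydomain.com \
    -e gluster_maintenance_cluster_node_2=vmm12.mydomain.com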
On Fri, Jun 14, 2019 at 7:13 AM Adrian Quintero
wrote:
Strahil,
Thanks for all the follow up. I will try to reproduce the same scenario
today: deploy a 9-node cluster, completely kill the initiating node (vmm10),
and see if I can recover using the extra-server approach (different
IP/FQDN). If I am able to recover, I will also try to test with your
Hi Adrian,
Please keep in mind that when a server dies, the easiest way to recover is to
get another freshly installed server with a different IP/FQDN.
Then you will need to use 'replace-brick', and once gluster replaces that node
you should be able to remove the old entry in oVirt.
Once the
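The brick replacement itself is a one-liner per brick (volume name and
brick paths are placeholders, adjust them to your layout):

gluster volume replace-brick vmstore1 \
    old-host.mydomain.com:/gluster_bricks/vmstore1/vmstore1 \
    new-host.mydomain.com:/gluster_bricks/vmstore1/vmstore1 \
    commit force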
Strahil, I don't use the GUI that much; in this case I need to understand
how it is all tied together if I want to move to production. As far as Gluster
goes, I can do the administration through the CLI; however, my test
environment was set up using gdeploy for the hyperconverged
setup
Do you have empty space to store the VMs? If yes, you can always script the
migration of the disks via the API. Even a bash script and curl can do the
trick.
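For example, something along these lines should move a disk to another
storage domain (a sketch from memory against the v4 REST API; the engine
URL, credentials, disk ID and target domain are all placeholders, so check
the API docs before using it):

curl -k -u 'admin@internal:PASSWORD' \
    -H 'Content-Type: application/xml' \
    -d '<action><storage_domain><name>data1</name></storage_domain></action>' \
    'https://engine.mydomain.com/ovirt-engine/api/disks/DISK_ID/move'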
About the /dev/sdb, I still don't get it. A pure 'df -hT' from a node will
make it much clearer. I guess '/dev/sdb' is a PV and you got
Adding the output of 'gluster pool list':
UUID Hostname State
2c86fa95-67a2-492d-abf0-54da625417f8 vmm12.mydomain.com Connected
ab099e72-0f56-4d33-a16b-ba67d67bdf9d vmm13.mydomain.com Connected
c35ad74d-1f83-4032-a459-079a27175ee4 vmm14.mydomain.com Connected
aeb7712a-e74e-4492-b6af-9c266d69bfd3
Strahil,
Looking at your suggestions I think I need to provide a bit more info on my
current setup.
1. I have 9 hosts in total
2. I have 5 storage domains:
   - hosted_storage (Data Master)
   - vmstore1 (Data)
   - data1 (Data)
   - data2
Hi Adrian,
You have several options:
A) If you have space on another gluster volume (or volumes) or on NFS-based
storage, you can migrate all VMs live. Once you do it, the simple way will be
to stop and remove the storage domain (from the UI) and the gluster volume
that correspond to the problematic
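On the gluster side, removing the volume afterwards is just (volume name is
an example):

gluster volume stop vmstore1
gluster volume delete vmstore1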
Thanks for pointing me in the right direction, I was able to add the server
to the cluster by adding /etc/vdsm/vdsm.id.
I will now try to create the new bricks and try a replacement brick; this
part I think I will have to do through the command line, because my
hyperconverged setup with a replica 3 is as
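From what I can tell, the brick creation itself boils down to the usual
LVM/XFS steps, roughly like this (device, VG/LV and mount point names are
placeholders following the common hyperconverged layout; the real deployment
also uses thin pools and tuned mkfs options):

pvcreate /dev/sdb
vgcreate gluster_vg_sdb /dev/sdb
lvcreate -l 100%FREE -n gluster_lv_vmstore1 gluster_vg_sdb
mkfs.xfs /dev/gluster_vg_sdb/gluster_lv_vmstore1
mkdir -p /gluster_bricks/vmstore1
mount /dev/gluster_vg_sdb/gluster_lv_vmstore1 /gluster_bricks/vmstore1
mkdir /gluster_bricks/vmstore1/vmstore1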
https://stijn.tintel.eu/blog/2013/03/02/ovirt-problem-duplicate-uuids
On Mon, Jun 10, 2019, 18:13 Adrian Quintero
wrote:
Hi, I think you can generate and use a new UUID, although I am not sure
about the procedure right now...
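If it is the duplicate vdsm UUID issue (the stijn.tintel.eu link in this
thread covers it), I believe the trick is to write a fresh UUID before
adding the host, i.e. (from memory, please verify):

uuidgen > /etc/vdsm/vdsm.id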
On Mon, Jun 10, 2019, 18:13 Adrian Quintero
wrote:
Can you let me know how to fix the gluster volume and the missing brick?
I tried removing it by going to "Storage > Volumes > vmstore > Bricks" and
selecting the brick.
However, it is showing an unknown status (which is expected because the
server was completely wiped), so if I try to "remove" or "replace brick"
At this point I'd go to the engine VM and remove the host from the postgres
DB manually.
A bit of a hack, but...
ssh root@
su - postgres
cd /opt/rh/rh-postgresql10/
source enable
psql engine
select vds_id from vds_static where host_name='myhost1.mydomain.com';
select DeleteVds('');
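(the vds_id returned by the first select is what goes between the quotes in
DeleteVds)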
Of course, keep in
Hi Adrian,
Did you fix the issue with the gluster and the missing brick?
If yes, try to set the 'old' host in maintenance and then forcefully remove it
from oVirt.
If it succeeds (and it should), then you can add the server back and then check
what happens.
Best Regards,
Strahil Nikolov
Ok I have tried reinstalling the server from scratch with a different name
and IP address and when trying to add it to cluster I get the following
error:
Event details
ID: 505
Time: Jun 10, 2019, 10:00:00 AM
Message: Host myshost2.virt.iad3p installation failed. Host
myhost2.mydomain.com reports
Leo,
I did try putting it under maintenance and checking the option to ignore
gluster, and it did not work.
Error while executing action:
-Cannot remove host. Server having gluster volume.
Note: the server was already reinstalled so gluster will never see the
volumes or bricks for this server.
I will rename
You will need to remove the storage role from that server first (so it is
no longer part of the gluster cluster).
I cannot test this right now on production, but maybe putting the host,
although it already died, under "maintenance" while checking the option to
ignore the gluster warning will let you remove it.
Maybe I am wrong
Can you remove the bricks that belong to the fried server, either from the
GUI or the CLI?
You should be able to do so, and then it should allow you to remove the host
from the oVirt setup.
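From the CLI, that would be something along these lines for a plain replica
3 volume, dropping the dead brick by reducing the replica count (names are
placeholders, and on a distributed-replicated volume you would have to
handle the whole replica set):

gluster volume remove-brick vmstore1 replica 2 \
    dead-host.mydomain.com:/gluster_bricks/vmstore1/vmstore1 force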
--
Dmitry Filonov
Linux Administrator
SBGrid Core | Harvard Medical School
250 Longwood Ave, SGM-114
Boston, MA 02115
I'll presume you didn't fully back up the root file system of the host
which was fried.
It may be easier to replace it with a new hostname/IP.
I would focus on the gluster config first, since it was hyperconverged.
I don't know which method the engine UI uses to detect the gluster mount on
a missing
It definitely is a challenge trying to replace a bad host.
So let me tell you what I see and have done so far:
1.-I have a host that went bad due to HW issues.
2.-This bad host is still showing in the compute --> hosts section.
3.-This host was part of a hyperconverged setup with Gluster.
4.-The
Have you tried ticking "Force remove"?
Best Regards,
Strahil Nikolov
On Thursday, June 6, 2019 at 21:47:20 GMT+3, Adrian Quintero wrote:
I tried removing the bad host but ran into the following issue, any idea?
Operation Canceled
Error while executing action:
host1.mydomain.com
- Cannot remove Host. Server having Gluster volume.
On Thu, Jun 6, 2019 at 11:18 AM Adrian Quintero
wrote:
Leo, I forgot to mention that I have 1 SSD disk for caching purposes, and I
am wondering how that setup should be achieved.
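My guess is that it is plain lvmcache on the brick VG, i.e. something
roughly like the following (the device and VG/LV names are pure
placeholders), but please confirm:

pvcreate /dev/sdX
vgextend gluster_vg_sdb /dev/sdX
lvcreate --type cache-pool -L 400G -n lv_cachepool gluster_vg_sdb /dev/sdX
lvconvert --type cache --cachepool gluster_vg_sdb/lv_cachepool \
    gluster_vg_sdb/gluster_lv_vmstore1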
thanks,
Adrian
On Wed, Jun 5, 2019 at 11:25 PM Adrian Quintero
wrote:
Hi Leo, yes, this helps a lot; it confirms the plan we had in mind.
Will test tomorrow and post the results.
Thanks again
Adrian
On Wed, Jun 5, 2019 at 11:18 PM Leo David wrote:
Hi Adrian,
I think the steps are:
- reinstall the host
- join it to the virtualisation cluster
And if it was a member of the gluster cluster as well:
- go to host - storage devices
- create the bricks on the devices - as they are on the other hosts
- go to storage - volumes
- replace each failed brick with the
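After each replaced brick you can watch the self-heal from the CLI with
something like (volume name is just an example):

gluster volume heal vmstore1 info

and wait for the pending entry count to drop to zero before replacing the
next one.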