Re: Disaster after maintenance

2019-03-20 Thread Sergey Levitskiy
+1 on the advice to start from scratch. Provisioning is failing because it can’t spin up either the SSVM or the console proxy due to insufficient capacity. The reason might be: * Not enough CPU or RAM capacity. Increasing the overprovisioning factors or raising the disable thresholds might help. *
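The capacity point above can be illustrated with a small sketch. This is a deliberately simplified model, not CloudStack's actual allocator: it assumes the allocatable amount is physical capacity times an overprovisioning factor times a disable threshold (the fraction that may be allocated before new deployments are refused). The function names and the numbers are hypothetical.

```python
def allocatable(physical: float, overprovisioning_factor: float,
                disable_threshold: float) -> float:
    """Capacity that may be allocated under the simplified model (assumption)."""
    return physical * overprovisioning_factor * disable_threshold


def can_deploy(requested: float, allocated: float, physical: float,
               overprovisioning_factor: float, disable_threshold: float) -> bool:
    """True if the new request still fits below the allocatable limit."""
    limit = allocatable(physical, overprovisioning_factor, disable_threshold)
    return allocated + requested <= limit


# Hypothetical host: 65536 MiB RAM, 53248 MiB already allocated,
# and a system VM asking for 4096 MiB.
fits_default = can_deploy(4096, 53248, 65536, 1.0, 0.85)
fits_overprovisioned = can_deploy(4096, 53248, 65536, 1.5, 0.85)
print(fits_default, fits_overprovisioned)
```

In this model both knobs move the same limit, so a system VM that does not fit with a factor of 1.0 can fit once the factor is increased.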

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
Hi Jevgeni, I would perhaps suggest you continue with plan B from your separate email thread (root volumes --> create snapshots, convert snapshots to template, download template somewhere safe - for DATA volumes, also create snapshots, then convert to volume and download it (or simply directly

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
It started with 4.10 and was then gradually upgraded, with all stops, as new releases became available. >>> Why do you have 3 zones in this installation - what is the setup ? >>> SSVM and CPVM (for whatever zone) are failing to be created... It's a result of attempts to create a new zone and somehow

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
Hi, 2019-03-20 06:41:50,446 INFO [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) DB version = 4.10.0.0 Code Version = 4.11.2.0 2019-03-20 06:41:50,447 DEBUG [c.c.u.DatabaseUpgradeChecker] (main:null) (logid:) Running upgrade Upgrade41000to41100 to upgrade from 4.10.0.0-4.11.0.0 to 4.11.0.0
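The log excerpt above shows the database schema version lagging behind the code version, which is why the upgrade steps run at startup. As a minimal sketch, the two versions can be pulled out of such a line and compared. The regular expression and the helper names are assumptions for illustration, and the parser only handles purely numeric dotted versions.

```python
import re

# Sample line copied from the management server log quoted above.
LINE = ("2019-03-20 06:41:50,446 INFO [c.c.u.DatabaseUpgradeChecker] "
        "(main:null) (logid:) DB version = 4.10.0.0 Code Version = 4.11.2.0")

# Hypothetical pattern matching the "DB version = X Code Version = Y" part.
VERSION_RE = re.compile(r"DB version = (\S+) Code Version = (\S+)")


def parse_versions(line):
    """Return (db_version, code_version) as integer tuples, or None if absent."""
    match = VERSION_RE.search(line)
    if match is None:
        return None
    return tuple(tuple(int(part) for part in v.split(".")) for v in match.groups())


def needs_upgrade(line):
    """True when the database version is older than the code version."""
    versions = parse_versions(line)
    if versions is None:
        return False
    db_version, code_version = versions
    return db_version < code_version


print(parse_versions(LINE), needs_upgrade(LINE))
```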

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
Basic Zone - Yes. Router has been actually started/created on KVM side - not created, not started. That's the main problem, I guess. agent.log: https://drive.google.com/open?id=1rATxHKqgNKo2kD23BtlrZy_9gFXC-Bq- management log: https://drive.google.com/open?id=1H2jI0roeiWxtzReB8qV6QxDkNpaki99A >>

Re: Disaster after maintenance

2019-03-20 Thread Dag Sonstebo
Jevgeni, Can you also explain your infrastructure - you said you have two hosts only, where does CloudStack management run? Reason I'm asking is when checking your logs from yesterday the IP address 192.168.1.14 seems to be used for management, NFS and a KVM host? Is this the case, do you

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic
Just to confirm, you are using Basic Zone in CloudStack, right? Can you confirm that the router has actually been started/created on the KVM side? Again, as requested, please post logs (mgmt and agent - and note the time around which you tried to start VR last time it partially succeeded) - we can't

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
After a dozen attempts, the Virtual Router could finally be recreated. But it's in eternal Starting status, the console says it requires an upgrade, and Version is UNKNOWN. It does not resolve the problem; I cannot move further from this point. Any hints? Or am I condemned to do a reinstall

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
l.invoke0(Native Method) at

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic

Re: Disaster after maintenance

2019-03-20 Thread Jevgeni Zolotarjov
org.springframework.aop.framework.ReflectiveMethodInvocation.inv

Re: Disaster after maintenance

2019-03-20 Thread Andrija Panic

Re: Disaster after maintenance

2019-03-19 Thread Sergey Levitskiy
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174) at

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov

Re: Disaster after maintenance

2019-03-19 Thread Sergey Levitskiy
Source) at java.lang.Thread.run(Unknown Source)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov

Re: Disaster after maintenance

2019-03-19 Thread Sergey Levitskiy

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
ingframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:197) at

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:174)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:51) at

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
org.springframework.aop.intercepto

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185) at

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
org.apache.cloudstack.api.command.user.network.RestartNetworkCmd.execute(RestartNetworkCmd.java:99)

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
piAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:108) at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(Async

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56) at

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
ultManagedContext.java:103) at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53) at org.apache

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
current.Executors$RunnableAdapter.call(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) at java.uti

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
libvirtd.service - Virtualization daemon Loaded: loaded (/usr/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled) Active: failed (Result: start-limit) sin

Re: Disaster after maintenance

2019-03-19 Thread Andrija Panic
omplete async job-5093, jobStatus: FAILED, resultCode: 530, result: org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":530,"errortext":"Resource [DataCenter:1] is un

Re: Disaster after maintenance

2019-03-19 Thread Boris Stoyanov
ue, Mar 19, 2019 at 4:19 PM Andrija Panic wrote: > Your network can't be deleted due to "Can't delete the network, not all user vms are expunged. Vm VM[User|i-2-11-VM] is in Stopped state"

RE: Disaster after maintenance

2019-03-19 Thread Paul Angus
: Jevgeni Zolotarjov Sent: 19 March 2019 17:29 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance Guys, please help with it. What can be done here? There is too much valuable data. On Tue, Mar 19, 2019 at 4:21 PM Jevgeni Zolotarjov wrote: > Tried that just now and got er

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
>> VM[User|i-2-11-VM] is in Stopped state" - which is fine. You should be able to just start the user VM - but if you have actually deleted the VR itself, then just do a Network restart with "cleanup" and it will recreate a new VR, after which you

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
Andrija -Original Message- From: Jevgeni Zolotarjov Sent: 19 March 2019 15:10 To: users@cloudstack

RE: Disaster after maintenance

2019-03-19 Thread Andrija Panic
2019 15:10 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance I mean I cannot delete network: In the management server log I see == 2019-03-19 14:06:36,316 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] (API-Job-Executor-1:c

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
ailed to start Virtualization daemon. Mar 19 13:03:07 mtl1-apphst04.mt.pbt.com.mt systemd[1]: Unit libvirtd.service entered failed state. Mar 19 13:03:07 mtl1-apphst04.mt.pbt.com.mt systemd[1]: libvirtd.service failed

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
the host and the agent logs (usual logs directory). Also worth checking that libvirt has started ok. Do you have some NUMA constraints or anything which requires particular RAM configuration?

RE: Disaster after maintenance

2019-03-19 Thread Paul Angus
Subject: Re: Disaster after maintenance That's it - libvirtd failed to start on second host. Tried restarting, but it does not start. >> Do you have some NUMA constraints or anything which requires particular RAM configuration? No. libvirtd.service - Virtualization daemon Lo

Re: Disaster after maintenance

2019-03-19 Thread Ivan Kudryavtsev

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
From: Jevgeni Zolotarjov Sent: 19 March 2019 14:49 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance Can you try migrating a VM to the server that you changed the RAM amount? Also: What is the hypervisor version? KVM QEMU V

RE: Disaster after maintenance

2019-03-19 Thread Paul Angus
-Original Message- From: Jevgeni Zolotarjov Sent: 19 March 2019 14:49 To: users@cloudstack.apache.org Subject: Re: Disaster after maintenance Can you try migrating a VM to the server that you changed the RAM amount? Also: What

Re: Disaster after maintenance

2019-03-19 Thread Rafael Weingärtner
that is why nothing deploys there. You need to connect this host to ACS; otherwise, it will just be ignored. Did you check the log files in the agent (in the host)? And, of course, in ACS? On Tue, Mar 19, 2019 at 9:49 AM Jevgeni Zolotarjov wrote: > Can you try migrating a VM to the server that

Re: Disaster after maintenance

2019-03-19 Thread Jevgeni Zolotarjov
Can you try migrating a VM to the server that you changed the RAM amount? Also: What is the hypervisor version? KVM QEMU Version : 2.0.0 Release : 1.el7.6 Host status in ACS? 1st server: Unsecure 2nd server: Disconnected Did you try to force a VM to start/deploy in this server where

Re: Disaster after maintenance

2019-03-19 Thread Rafael Weingärtner
Can you try migrating a VM to the server that you changed the RAM amount? Also: What is the hypervisor version? Host status in ACS? Did you try to force a VM to start/deploy in this server where you changed the RAM? On Tue, Mar 19, 2019 at 9:39 AM Jevgeni Zolotarjov wrote: > We have