GitHub user arpanbht created a discussion: Apache CloudStack Multi-Management 
Server Across Different Subnets with Shared MySQL DB – HA and Cross-Zone 
Snapshot Copy Issues

Hi everyone,

I am currently testing a multi-management-server Apache CloudStack setup across 
two separate private networks and would like clarification on management 
server failover behavior and a cross-zone snapshot/template copy issue.

### Current Architecture

I have two separate networks:
- 192.168.1.0/24
- 192.168.2.0/24

Both networks are reachable from each other.

### Infrastructure Layout

### Network 1 (192.168.1.0/24)

- Management Server - 192.168.1.101
- KVM Host - 192.168.1.102
- NFS Secondary Storage - 192.168.1.103

### Network 2 (192.168.2.0/24)

- Management Server - 192.168.2.101
- KVM Host - 192.168.2.105
- NFS Secondary Storage - 192.168.2.102

### Setup Process

- Initially, I deployed the entire CloudStack infrastructure in the 
192.168.1.0/24 network.
- Later, I created another standalone management server in the 192.168.2.0/24 
network.
- Instead of using a separate database, I pointed the second management server 
to the same MySQL DB used by the first management server.
- I changed the MySQL configuration to `bind-address = 0.0.0.0` so that both 
management servers can connect to the same DB.
- I set `management.network.cidr=192.168.0.0/16` so that both subnets fall 
within the management network range.

- Then I created another zone:
  - Added the 192.168.2.x KVM host
  - Added the second NFS server as secondary storage for that zone
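As a sanity check on the `management.network.cidr` value, here is a minimal Python sketch that confirms both subnets and every address from the layout above fall inside `192.168.0.0/16`. The helper name `outside_cidr` is only illustrative, and the check says nothing about how CloudStack itself interprets the setting.

```python
import ipaddress

# Value of management.network.cidr from the setup above.
MANAGEMENT_CIDR = ipaddress.ip_network("192.168.0.0/16")

# Subnets and addresses taken from the infrastructure layout.
SUBNETS = ["192.168.1.0/24", "192.168.2.0/24"]
ADDRESSES = [
    "192.168.1.101", "192.168.1.102", "192.168.1.103",
    "192.168.2.101", "192.168.2.105", "192.168.2.102",
]


def outside_cidr(cidr, subnets, addresses):
    """Return the subnets and addresses that are NOT contained in cidr."""
    bad = [s for s in subnets if not ipaddress.ip_network(s).subnet_of(cidr)]
    bad += [a for a in addresses if ipaddress.ip_address(a) not in cidr]
    return bad


if __name__ == "__main__":
    uncovered = outside_cidr(MANAGEMENT_CIDR, SUBNETS, ADDRESSES)
    print(uncovered if uncovered else "everything is inside the CIDR")
```

An empty result only means the CIDR is wide enough to cover both networks. It does not show that a /16 is the recommended way to span them.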

### What Is Working

- Both zones are operational
- All 4 System VMs are running properly
- Hosts can communicate across subnets
- Shared database connectivity works
- Both management servers are operational

### Observed Concern

Even though I have two management servers, all KVM hosts appear to connect only 
to the first management server (192.168.1.101), which also hosts the MySQL DB.

I want to understand how failover behaves in this setup.

### HA / Failover Question

If the first management server (`192.168.1.101`) goes down:

- Will the hosts automatically reconnect to the second management server 
(`192.168.2.101`)?
- Will CloudStack continue functioning normally?
- Is additional HA or load balancer configuration required?
- Does CloudStack management server failover work automatically when multiple 
management servers share the same DB?
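Before testing failover by stopping the first management server, it may help to confirm from each KVM host that both management servers accept connections on the agent port. The sketch below assumes the default agent port 8250. The helper names `can_connect` and `reachability` are illustrative. It shows TCP reachability only and cannot tell whether an agent would actually reconnect.

```python
import socket

# Management servers from the layout above. 8250 is assumed to be the
# agent port (the CloudStack default); adjust it if the setup differs.
MANAGEMENT_SERVERS = ["192.168.1.101", "192.168.2.101"]
AGENT_PORT = 8250


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def reachability(servers, port, timeout=3.0):
    """Map each server address to True/False for TCP reachability on port."""
    return {server: can_connect(server, port, timeout) for server in servers}
```

Running `print(reachability(MANAGEMENT_SERVERS, AGENT_PORT))` on each KVM host would show whether the second management server is reachable on that port from both subnets.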

### Cross-Zone Snapshot/Template Copy Problem

Everything works until I try to:

- Copy templates across zones
- Copy VM snapshots across zones

Then I receive:

`HTTP Server returned 403 (expected 200 OK)`

### Relevant Logs
Failed to copy snapshot: 9d3f5a15-c8f8-4c00-9d08-1854ba68a90e with error:
HTTP Server returned 403 (expected 200 OK)

Additional stack trace:

```
[Failed to copy snapshot: 9d3f5a15-c8f8-4c00-9d08-1854ba68a90e with error:  HTTP Server returned 403 (expected 200 OK) ].
2026-05-22 06:10:04,241 ERROR [o.a.c.s.i.BaseImageStoreDriverImpl] (RemoteHostEndPoint-1:[ctx-ce32d305]) (logid:93c14a4e) Failed to copy snapshot: 9d3f5a15-c8f8-4c00-9d08-1854ba68a90e with error:  HTTP Server returned 403 (expected 200 OK)
2026-05-22 06:10:04,250 ERROR [c.c.a.ApiAsyncJobDispatcher] (API-Job-Executor-4:[ctx-d6dd543b, job-571]) (logid:c12eb79c) Unexpected exception while executing org.apache.cloudstack.api.command.user.snapshot.CopySnapshotCmd
com.cloud.utils.exception.CloudRuntimeException: Failed to copy snapshot
        at com.cloud.storage.snapshot.SnapshotManagerImpl.copySnapshot(SnapshotManagerImpl.java:2315)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.springframework.aop.support.AopUtils.invokeJoinpointUsingReflection(AopUtils.java:344)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.invokeJoinpoint(ReflectiveMethodInvocation.java:198)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163)
        at org.apache.cloudstack.network.contrail.management.EventUtils$EventInterceptor.invoke(EventUtils.java:109)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
        at com.cloud.event.ActionEventInterceptor.invoke(ActionEventInterceptor.java:52)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:175)
        at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:97)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:186)
        at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:215)
        at com.sun.proxy.$Proxy244.copySnapshot(Unknown Source)
        at org.apache.cloudstack.api.command.user.snapshot.CopySnapshotCmd.execute(CopySnapshotCmd.java:188)
        at com.cloud.api.ApiDispatcher.dispatch(ApiDispatcher.java:173)
        at com.cloud.api.ApiAsyncJobDispatcher.runJob(ApiAsyncJobDispatcher.java:110)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.runInContext(AsyncJobManagerImpl.java:698)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable$1.run(ManagedContextRunnable.java:49)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext$1.call(DefaultManagedContext.java:56)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.callWithContext(DefaultManagedContext.java:103)
        at org.apache.cloudstack.managed.context.impl.DefaultManagedContext.runWithContext(DefaultManagedContext.java:53)
        at org.apache.cloudstack.managed.context.ManagedContextRunnable.run(ManagedContextRunnable.java:46)
        at org.apache.cloudstack.framework.jobs.impl.AsyncJobManagerImpl$5.run(AsyncJobManagerImpl.java:646)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
```
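The error text suggests that an HTTP download step during the copy received a 403. Assuming the URL being fetched can be recovered from the management server or SSVM logs (I have not confirmed where it is logged), a minimal Python sketch like the one below could request that URL from a machine in the destination zone and print the status code. That would help separate a web-server permission problem from a routing problem. The helper name `http_status` is illustrative.

```python
import sys
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code the server sends for a GET of url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses; the status code
        # is still available on the exception object.
        return err.code


if __name__ == "__main__":
    # Usage: pass the download URL from the logs as the first argument.
    if len(sys.argv) > 1:
        print(http_status(sys.argv[1]))
```

A 403 from this request would point at the serving side (permissions or access rules). A timeout or connection error raises `urllib.error.URLError` instead, which would point at network reachability.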

### Questions Regarding the 403 Error

1. Is this issue related to:
   - Secondary storage permissions?
   - Apache/Nginx configuration?
   - SSVM-to-secondary-storage communication?
   - Cross-zone NFS accessibility?
2. Do both secondary storage servers need:
   - Mutual access?
   - Proper HTTP export configuration?
   - Additional ACL or firewall rules?
3. Since both zones are in different subnets, is any additional configuration 
   needed for:
   - SSVM routing?
   - Secondary storage access?
   - Template/snapshot copy traffic?
4. Is the `management.network.cidr=192.168.0.0/16` configuration sufficient for 
   this type of architecture?

### Additional Notes

- Both subnets can ping each other
- System VMs are running in both zones
- KVM hosts are healthy
- Secondary storage mounts work locally
- Shared MySQL DB works from both management servers

I would appreciate guidance on:

- Proper multi-management-server HA behavior
- Best practices for multi-subnet CloudStack deployments
- Root cause of the HTTP 403 during cross-zone snapshot/template copy

GitHub link: https://github.com/apache/cloudstack/discussions/13220
