Re: Detected that another management node with the same IP is already running, please check your cluster configuration

2023-09-19 Thread Jithin Raju
Hi Jaejong,

Are you able to start the CloudStack service without any errors? If not, check which 
service is using port 9090 and disable/stop it.

systemctl list-sockets
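
For reference, one way to find out what owns TCP 9090 and stop it, assuming the
conflict turns out to be a systemd socket unit such as cockpit.socket, which
listens on 9090 by default on many Rocky Linux installs (please verify on your
host before disabling anything):

$ sudo ss -tlnp | grep ':9090'
$ sudo systemctl list-sockets --all | grep 9090
$ sudo systemctl disable --now cockpit.socket    # only if cockpit owns 9090 and is not needed
$ sudo systemctl restart cloudstack-management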


-Jithin

From: jaejong 
Date: Wednesday, 20 September 2023 at 10:06 AM
To: users@cloudstack.apache.org 
Subject: Re: Detected that another management node with the same IP is already 
running, please check your cluster configuration
Hello Jithin,

I rebooted the host with cloudstack-management enabled.

The port status is:
$ sudo netstat -nap | grep 8080

tcp6   0  0 :::8080 :::*LISTEN
1147/java

tcp6   0  0 :::9090 :::*LISTEN  
1/systemd
tcp6   0  0 10.0.33.1:9090  10.0.33.1:47813 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:42097 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:44439 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:52903 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:33893 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:45235 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:60319 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:49279 TIME_WAIT   
-

tcp6   0  0 :::8250 :::*LISTEN
1147/java

But after rebooting with cloudstack-management disabled,
only port 9090 is in use.

$ sudo netstat -nap | grep 9090
tcp6 0  0 :::9090  :::*   LISTEN 1/systemd

thanks a lot.




 


-Original Message-
From: "Jithin Raju"
To: "users@cloudstack.apache.org";
Cc:
Sent: 2023-09-20 (수) 12:46:36 (GMT+09:00)
Subject: Re: Detected that another management node with the same IP is already 
running, please check your cluster configuration

Hi Jaejong,

Could you check whether any of the TCP ports 8250, 9090 or 8080 are already in 
use? Or reboot the OS?
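
For example, a quick check for those three ports could look like this (assuming
ss is available on the host; netstat -tlnp works equally well):

$ sudo ss -tlnp | egrep ':8080|:8250|:9090'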

-Jithin

From: jaejong 
Date: Tuesday, 19 September 2023 at 4:20 PM
To: users@cloudstack.apache.org 
Subject: Detected that another management node with the same IP is already 
running, please check your cluster configuration
Rocky Linux 9.2
MySQL 8.0.32
A single Management Server node with MySQL on the same node

1. After rebooting the management host I get the following error messages:

sudo systemctl status cloudstack-management

Loaded: loaded (/usr/lib/systemd/system/cloudstack-management.service; 
enabled; preset: disabled)
Active: active (running) since Tue 2023-09-19 17:35:39 KST; 2min 3s ago
  Main PID: 1148 (java)
 Tasks: 49 (limit: 408699)
Memory: 1.1G
   CPU: 40.653s
CGroup: /system.slice/cloudstack-management.service
└─1148 /usr/bin/java 
-Djava.security.properties=/etc/cloudstack/management/java.security.ciphers 
-Djava.awt.headless=true -Dcom.sun.management.>

java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
java[1148]: at org.eclipse.jetty.server.Server.start(Server.java:423)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
java[1148]: at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
java[1148]: at org.eclipse.jetty.server.Server.doStart(Server.java:387)
java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:192)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.main(ServerDaemon.java:107)
java[1148]: INFO  [o.a.c.s.NfsMountManager] (main:null) (logid:) Clean up 
mounted NFS mount points used in current session

server.log
2023-09-19 17:35:54,352 ERROR [c.c.c.ClusterManagerImpl] (main:null) (logid:) 
Detected that another management node with the same IP 10.0.33.1 is already 
running, please check your cluster configuration
2023-09-19 17:35:54,353 ERROR [o.a.c.s.l.CloudStackExtendedLifeCycle] 
(main:null) (logid:) Failed to configure ClusterManagerImpl
javax.naming.ConfigurationException: Detected that another management node with 
the same IP  is already running, please check your cluster configuration
   at 
com.cloud.cluster.ClusterManagerImpl.checkConflicts(ClusterManagerImpl.java:1245)
   at 
com.cloud.cluster.ClusterManagerImpl.configure(ClusterManagerImpl.java:1115)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle$3.with(CloudStackExtendedLifeCycle.java:114)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.with(CloudStackExtendedLifeCycle.java:153)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.configure(CloudStackExtendedLifeCycle.java:110)
   at 

Re: Detected that another management node with the same IP is already running, please check your cluster configuration

2023-09-19 Thread jaejong
Hello Jithin,

I rebooted the host with cloudstack-management enabled.

The port status is:
$ sudo netstat -nap | grep 8080

tcp6   0  0 :::8080 :::*LISTEN  
1147/java

tcp6   0  0 :::9090 :::*LISTEN  
1/systemd
tcp6   0  0 10.0.33.1:9090  10.0.33.1:47813 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:42097 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:44439 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:52903 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:33893 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:45235 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:60319 TIME_WAIT   
-
tcp6   0  0 10.0.33.1:9090  10.0.33.1:49279 TIME_WAIT   
-

tcp6   0  0 :::8250 :::*LISTEN  
1147/java

But after rebooting with cloudstack-management disabled,
only port 9090 is in use.

$ sudo netstat -nap | grep 9090
tcp6 0  0 :::9090  :::*   LISTEN 1/systemd

thanks a lot.




-Original Message-
From: "Jithin Raju"
To: "users@cloudstack.apache.org";
Cc:
Sent: 2023-09-20 (수) 12:46:36 (GMT+09:00)
Subject: Re: Detected that another management node with the same IP is already 
running, please check your cluster configuration

Hi Jaejong,

Could you check whether any of the TCP ports 8250, 9090 or 8080 are already in 
use? Or reboot the OS?

-Jithin

From: jaejong 
Date: Tuesday, 19 September 2023 at 4:20 PM
To: users@cloudstack.apache.org 
Subject: Detected that another management node with the same IP is already 
running, please check your cluster configuration
Rocky Linux 9.2
MySQL 8.0.32
A single Management Server node with MySQL on the same node

1. After rebooting the management host I get the following error messages:

sudo systemctl status cloudstack-management

Loaded: loaded (/usr/lib/systemd/system/cloudstack-management.service; 
enabled; preset: disabled)
Active: active (running) since Tue 2023-09-19 17:35:39 KST; 2min 3s ago
  Main PID: 1148 (java)
 Tasks: 49 (limit: 408699)
Memory: 1.1G
   CPU: 40.653s
CGroup: /system.slice/cloudstack-management.service
└─1148 /usr/bin/java 
-Djava.security.properties=/etc/cloudstack/management/java.security.ciphers 
-Djava.awt.headless=true -Dcom.sun.management.>

java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
java[1148]: at org.eclipse.jetty.server.Server.start(Server.java:423)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
java[1148]: at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
java[1148]: at org.eclipse.jetty.server.Server.doStart(Server.java:387)
java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:192)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.main(ServerDaemon.java:107)
java[1148]: INFO  [o.a.c.s.NfsMountManager] (main:null) (logid:) Clean up 
mounted NFS mount points used in current session

server.log
2023-09-19 17:35:54,352 ERROR [c.c.c.ClusterManagerImpl] (main:null) (logid:) 
Detected that another management node with the same IP 10.0.33.1 is already 
running, please check your cluster configuration
2023-09-19 17:35:54,353 ERROR [o.a.c.s.l.CloudStackExtendedLifeCycle] 
(main:null) (logid:) Failed to configure ClusterManagerImpl
javax.naming.ConfigurationException: Detected that another management node with 
the same IP  is already running, please check your cluster configuration
   at 
com.cloud.cluster.ClusterManagerImpl.checkConflicts(ClusterManagerImpl.java:1245)
   at 
com.cloud.cluster.ClusterManagerImpl.configure(ClusterManagerImpl.java:1115)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle$3.with(CloudStackExtendedLifeCycle.java:114)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.with(CloudStackExtendedLifeCycle.java:153)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.configure(CloudStackExtendedLifeCycle.java:110)
   at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.start(CloudStackExtendedLifeCycle.java:55)
   at 
org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
   at 
org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:54)
   at 

Re: Detected that another management node with the same IP is already running, please check your cluster configuration

2023-09-19 Thread Jithin Raju
Hi Jaejong,

Could you check whether any of the TCP ports 8250, 9090 or 8080 are already in 
use? Or reboot the OS?

-Jithin

From: jaejong 
Date: Tuesday, 19 September 2023 at 4:20 PM
To: users@cloudstack.apache.org 
Subject: Detected that another management node with the same IP is already 
running, please check your cluster configuration
Rocky Linux 9.2
MySQL 8.0.32
A single Management Server node with MySQL on the same node

1. After rebooting the management host I get the following error messages:

sudo systemctl status cloudstack-management

 Loaded: loaded (/usr/lib/systemd/system/cloudstack-management.service; 
enabled; preset: disabled)
 Active: active (running) since Tue 2023-09-19 17:35:39 KST; 2min 3s ago
   Main PID: 1148 (java)
  Tasks: 49 (limit: 408699)
 Memory: 1.1G
CPU: 40.653s
 CGroup: /system.slice/cloudstack-management.service
 └─1148 /usr/bin/java 
-Djava.security.properties=/etc/cloudstack/management/java.security.ciphers 
-Djava.awt.headless=true -Dcom.sun.management.>

java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
java[1148]: at org.eclipse.jetty.server.Server.start(Server.java:423)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
java[1148]: at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
java[1148]: at org.eclipse.jetty.server.Server.doStart(Server.java:387)
java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:192)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.main(ServerDaemon.java:107)
java[1148]: INFO  [o.a.c.s.NfsMountManager] (main:null) (logid:) Clean up 
mounted NFS mount points used in current session

server.log
2023-09-19 17:35:54,352 ERROR [c.c.c.ClusterManagerImpl] (main:null) (logid:) 
Detected that another management node with the same IP 10.0.33.1 is already 
running, please check your cluster configuration
2023-09-19 17:35:54,353 ERROR [o.a.c.s.l.CloudStackExtendedLifeCycle] 
(main:null) (logid:) Failed to configure ClusterManagerImpl
javax.naming.ConfigurationException: Detected that another management node with 
the same IP  is already running, please check your cluster configuration
at 
com.cloud.cluster.ClusterManagerImpl.checkConflicts(ClusterManagerImpl.java:1245)
at 
com.cloud.cluster.ClusterManagerImpl.configure(ClusterManagerImpl.java:1115)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle$3.with(CloudStackExtendedLifeCycle.java:114)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.with(CloudStackExtendedLifeCycle.java:153)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.configure(CloudStackExtendedLifeCycle.java:110)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.start(CloudStackExtendedLifeCycle.java:55)
at 
org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
at 
org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:54)
at 
org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:356)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at 
org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:155)
at 
org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:123)
at 
org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:937)
at 
org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.loadContext(DefaultModuleDefinitionSet.java:144)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet$2.with(DefaultModuleDefinitionSet.java:121)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:244)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:249)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:249)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:232)
at 

Re: Multilevel NAT with private networks

2023-09-19 Thread Nux
Perhaps it's the late hour, but I am still not sure I understood your 
"common network" and "branch network", though I feel like I very vaguely 
got it.
Anyway, you cannot port forward to a VM that is not in that respective network, 
so it would be complicated, if not impossible, to do what you want.


If I were you and had a real shortage of IPs, I'd perhaps set up a new 
network, maybe shared or L2, put a Linux/OPNsense/etc. box in there, assign it a 
public IP and port forward from it to the VMs you'd connect in the same or a 
different shared or L2 network (all hooked up to your custom router).


hth

On 2023-09-19 23:43, Emil Karlsson wrote:

Hi,

Thanks for the quick response.

My bad, I meant Isolated networks.

The idea was to be able to isolate some VMs if needed by putting them
in the "branch" isolated network, "under" the root network, as
described in my previous email.

(Which means two port forwarding rules would be needed.)

Now, it seems that I am not able to port forward from one isolated
network to another isolated network. And thus I was wondering whether this
is even possible in CloudStack, or whether I can achieve similar results in
some other way?

To recap, it is ideal for us to be able to access any VM in the group
of isolated networks using one public IP.

Best regards,
Emil

On Tue, Sep 19, 2023, 22:19 Nux  wrote:


Hello Emil,

I am not sure I follow.
What type of networks are those? Isolated networks, shared networks or
L2 networks? Or VPC tiers/networks?

On 2023-09-19 10:40, Emil Karlsson wrote:

Hi all,

We're currently using CloudStack as a deployment platform, and I am
interested to know if it's possible to port forward from one private
network to another private network.

Our use case:
We have a common network, and private networks as "branches" (both are of
type "Private networks" in CloudStack's terminology), where a VM can exist
in the common network and thus port forwarding is only required in the main
router -> VM. But they can also exist in any branch underneath, such that a
port forwarding rule is needed from root -> branch router -> VM. As below:

internet --- > common network  --- > private network 1
- vm 1   - vm 3
- vm 2   - vm 4

The reason for this is that it would require only one Public IP address.
However, it appears I am unable to do this, as the create
portforwardingrule requires a vmID in the network.

Is there some way to achieve this using only CloudStack?

Best regards,
Emil Karlsson
kthcloud


Re: Multilevel NAT with private networks

2023-09-19 Thread Nux

Hello Emil,

I am not sure I follow.
What type of networks are those? Isolated networks, shared networks or 
L2 networks? Or VPC tiers/networks?



On 2023-09-19 10:40, Emil Karlsson wrote:

Hi all,

We're currently using CloudStack as a deployment platform, and I am
interested to know if it's possible to port forward from one private
network to another private network.

Our use case:
We have a common network, and private networks as "branches" (both are of
type "Private networks" in CloudStack's terminology), where a VM can exist
in the common network and thus port forwarding is only required in the main
router -> VM. But they can also exist in any branch underneath, such that a
port forwarding rule is needed from root -> branch router -> VM. As below:


internet --- > common network  --- > private network 1
- vm 1   - vm 3
- vm 2   - vm 4

The reason for this is that it would require only one Public IP 
address.

However, it appears I am unable to do this, as the create
portforwardingrule requires a vmID in the network.

Is there some way to achieve this using only CloudStack?

Best regards,
Emil Karlsson
kthcloud


Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Jayanth Reddy
Hello Mosharaf,

Please also tail 100 lines of libvirtd like

# journalctl -n 100 -u libvirtd --no-pager

Thanks,
Jayanth



From: Jayanth Reddy 
Sent: Tuesday, September 19, 2023 11:07:51 PM
To: Mosharaf Hossain 
Cc: users@cloudstack.apache.org ; Andrija Panic 
; Product Development | BEXIMCO IT 

Subject: Re: CloudStack agent can't connect to upgraded CEPH Cluster

Hello Mosharaf,

Right. So, libvirt is unable to communicate with Ceph for some reason, then. 
Would you also please tell us what `ceph -s` and `ceph -W cephadm` say? Do 
you see any abnormalities?

Please also confirm your CloudStack version. AFAIK, the Read Balancer just changes 
the primary OSDs for the PGs, but it appears the issue might be related to this 
or could be something else. It could be your monitors, or the OSDs flapping, 
which could potentially make your Ceph pools unavailable for clients. Please 
share your monitor map as well.

PS: We've had a similar issue with Quincy where our OSDs were flapping 
continuously, marking healthier ones as dead as well. The issue was with our Intel 810 
series NICs, where the `ice` driver couldn't cope under heavy network 
load (40 to 50 Gbps+ and huge PPS). We solved it eventually, but I'm trying to 
correlate it with your problem here.

Thanks,
Jayanth

From: Mosharaf Hossain 
Sent: Tuesday, September 19, 2023 10:58:06 PM
To: Jayanth Reddy 
Cc: users@cloudstack.apache.org ; Andrija Panic 
; Product Development | BEXIMCO IT 

Subject: Re: CloudStack agent can't connect to upgraded CEPH Cluster

Hello Reddy
virsh secret-list works, but virsh pool-list doesn't show anything and seems stuck.
[image.png]

Regards
Mosharaf Hossain
Manager, Product Development
IT Division

Bangladesh Export Import Company Ltd.

Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh

Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757

Cell: +8801787680828, Email: 
mosharaf.hoss...@bol-online.com, Web: 
www.bol-online.com



On Tue, Sep 19, 2023 at 9:05 PM Jayanth Reddy 
mailto:jayanthreddy5...@gmail.com>> wrote:
Hello Mosharaf,

I also see that you've created a thread on the Ceph-users mailing list 
regarding this. Did you get a chance to disable the Read Balancer as one of the 
devs suggested?

At the CloudStack end, in order to see if libvirt has issues communicating with 
Ceph, please try executing the command below continuously on your hosts:

# virsh pool-list

Please let me know if it freezes or doesn't return any response sometimes. 
AFAIK, there shouldn't be any compatibility issues as one of my Cloudstack 
deployments (v4.18.0.0) is running with Reef 18.2.0. Guess it has something to 
do with the Read balancer alone. Please also share your hosts' information, 
I'll see if I can reproduce.

Thanks,
Jayanth


From: Simon Weller mailto:siwelle...@gmail.com>>
Sent: Tuesday, September 19, 2023 8:25:17 PM
To: users@cloudstack.apache.org 
mailto:users@cloudstack.apache.org>>
Cc: Andrija Panic mailto:andrija.pa...@gmail.com>>; 
Product Development | BEXIMCO IT 
mailto:p...@bol-online.com>>
Subject: Re: CloudStack agent can't connect to upgraded CEPH Cluster

Mosharaf,

Did you upgrade the Ceph client on your hosts as well?

What does "ceph -s" report? Is your cluster healthy?

Do you have any logs that indicate OSDs are disconnecting?

I'm not very familiar with the new read balancer feature in Reef. Can you
disable it and see if your performance improves?

-Si











On Tue, Sep 19, 2023 at 1:25 AM Mosharaf Hossain <
mosharaf.hoss...@bol-online.com> wrote:

> Hello Andrija
>
>  Presently, CloudStack's host lists exhibited stability prior to the
> disaster, but their statuses are currently fluctuating continuously. Some
> hosts are initially marked as disconnected, but after a period, they
> transition to a connected state."
>
>
>
>
> [image: image.png]
>
> *Using virsh we are getting VM status on cshost1 as below*
> root@cshost1:~# virsh list
>  IdName   State
> ---
>  10i-14-597-VMrunning
>  61r-757-VM   running
>  69i-24-767-VMrunning
>  76r-71-VMrunning
>  82i-24-797-VMrunning
>  113   r-335-VM   running
>  128   r-577-VM   running
>  148   i-14-1151-VM   running
>  164   i-2-1253-VMrunning
>
>
> Regards
> Mosharaf Hossain
> Manager, Product Development
> IT Division
>
> Bangladesh Export Import Company Ltd.
>
> Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
>
> Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
>
> Cell: +8801787680828, Email: 
> mosharaf.hoss...@bol-online.com, Web:
> 

Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Jayanth Reddy
Hello Mosharaf,

Right. So, libvirt is unable to communicate with Ceph for some reason, then. 
Would you also please tell us what `ceph -s` and `ceph -W cephadm` say? Do 
you see any abnormalities?

Please also confirm your CloudStack version. AFAIK, the Read Balancer just changes 
the primary OSDs for the PGs, but it appears the issue might be related to this 
or could be something else. It could be your monitors, or the OSDs flapping, 
which could potentially make your Ceph pools unavailable for clients. Please 
share your monitor map as well.
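
For instance, something along these lines should capture the relevant state
(standard Ceph CLI commands; adjust to your deployment):

$ ceph -s                 # overall cluster health
$ ceph -W cephadm         # watch the cephadm channel of the cluster log
$ ceph mon dump           # monitor map
$ ceph osd tree           # spot down or flapping OSDs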

PS: We've had a similar issue with Quincy where our OSDs were flapping 
continuously, marking healthier ones as dead as well. The issue was with our Intel 810 
series NICs, where the `ice` driver couldn't cope under heavy network 
load (40 to 50 Gbps+ and huge PPS). We solved it eventually, but I'm trying to 
correlate it with your problem here.

Thanks,
Jayanth

From: Mosharaf Hossain 
Sent: Tuesday, September 19, 2023 10:58:06 PM
To: Jayanth Reddy 
Cc: users@cloudstack.apache.org ; Andrija Panic 
; Product Development | BEXIMCO IT 

Subject: Re: CloudStack agent can't connect to upgraded CEPH Cluster

Hello Reddy
virsh secret-list works, but virsh pool-list doesn't show anything and seems stuck.
[image.png]

Regards
Mosharaf Hossain
Manager, Product Development
IT Division

Bangladesh Export Import Company Ltd.

Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh

Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757

Cell: +8801787680828, Email: 
mosharaf.hoss...@bol-online.com, Web: 
www.bol-online.com



On Tue, Sep 19, 2023 at 9:05 PM Jayanth Reddy 
mailto:jayanthreddy5...@gmail.com>> wrote:
Hello Mosharaf,

I also see that you've created a thread on the Ceph-users mailing list 
regarding this. Did you get a chance to disable the Read Balancer as one of the 
devs suggested?

At the CloudStack end, in order to see if libvirt has issues communicating with 
Ceph, please try executing the command below continuously on your hosts:

# virsh pool-list

Please let me know if it freezes or doesn't return any response sometimes. 
AFAIK, there shouldn't be any compatibility issues as one of my Cloudstack 
deployments (v4.18.0.0) is running with Reef 18.2.0. Guess it has something to 
do with the Read balancer alone. Please also share your hosts' information, 
I'll see if I can reproduce.

Thanks,
Jayanth


From: Simon Weller mailto:siwelle...@gmail.com>>
Sent: Tuesday, September 19, 2023 8:25:17 PM
To: users@cloudstack.apache.org 
mailto:users@cloudstack.apache.org>>
Cc: Andrija Panic mailto:andrija.pa...@gmail.com>>; 
Product Development | BEXIMCO IT 
mailto:p...@bol-online.com>>
Subject: Re: CloudStack agent can't connect to upgraded CEPH Cluster

Mosharaf,

Did you upgrade the Ceph client on your hosts as well?

What does "ceph -s" report? Is your cluster healthy?

Do you have any logs that indicate OSDs are disconnecting?

I'm not very familiar with the new read balancer feature in Reef. Can you
disable it and see if your performance improves?

-Si











On Tue, Sep 19, 2023 at 1:25 AM Mosharaf Hossain <
mosharaf.hoss...@bol-online.com> wrote:

> Hello Andrija
>
>  Presently, CloudStack's host lists exhibited stability prior to the
> disaster, but their statuses are currently fluctuating continuously. Some
> hosts are initially marked as disconnected, but after a period, they
> transition to a connected state."
>
>
>
>
> [image: image.png]
>
> *Using virsh we are getting VM status on cshost1 as below*
> root@cshost1:~# virsh list
>  IdName   State
> ---
>  10i-14-597-VMrunning
>  61r-757-VM   running
>  69i-24-767-VMrunning
>  76r-71-VMrunning
>  82i-24-797-VMrunning
>  113   r-335-VM   running
>  128   r-577-VM   running
>  148   i-14-1151-VM   running
>  164   i-2-1253-VMrunning
>
>
> Regards
> Mosharaf Hossain
> Manager, Product Development
> IT Division
>
> Bangladesh Export Import Company Ltd.
>
> Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
>
> Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
>
> Cell: +8801787680828, Email: 
> mosharaf.hoss...@bol-online.com, Web:
> www.bol-online.com
>
> 
>
>
>
> On Mon, Sep 18, 2023 at 12:43 PM Andrija Panic 
> mailto:andrija.pa...@gmail.com>>
> wrote:
>
>> Hi,
>>
>> the message " Agent-Handler-1:null) (logid:) Connection with libvirtd is
>> broken: invalid connection pointer in 

Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Mosharaf Hossain
Hello Reddy
virsh secret-list works, but virsh pool-list doesn't show anything and seems stuck.
[image: image.png]

Regards
Mosharaf Hossain
Manager, Product Development
IT Division

Bangladesh Export Import Company Ltd.

Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh

Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757

Cell: +8801787680828, Email: mosharaf.hoss...@bol-online.com, Web:
www.bol-online.com




On Tue, Sep 19, 2023 at 9:05 PM Jayanth Reddy 
wrote:

> Hello Mosharaf,
>
> I also see that you've created a thread on the Ceph-users mailing list
> regarding this. Did you get a chance to disable the Read Balancer as one of
> the devs suggested?
>
> At the CloudStack end, in order to see if libvirt has issues communicating with
> Ceph, please try executing the command below continuously on your hosts:
>
> # virsh pool-list
>
> Please let me know if it freezes or doesn't return any response sometimes.
> AFAIK, there shouldn't be any compatibility issues as one of my Cloudstack
> deployments (v4.18.0.0) is running with Reef 18.2.0. Guess it has something
> to do with the Read balancer alone. Please also share your hosts'
> information, I'll see if I can reproduce.
>
> Thanks,
> Jayanth
>
> --
> *From:* Simon Weller 
> *Sent:* Tuesday, September 19, 2023 8:25:17 PM
> *To:* users@cloudstack.apache.org 
> *Cc:* Andrija Panic ; Product Development |
> BEXIMCO IT 
> *Subject:* Re: CloudStack agent can't connect to upgraded CEPH Cluster
>
> Mosharaf,
>
> Did you upgrade the Ceph client on your hosts as well?
>
> What does "ceph -s" report? Is your cluster healthy?
>
> Do you have any logs that indicate OSDs are disconnecting?
>
> I'm not very familiar with the new read balancer feature in Reef. Can you
> disable it and see if your performance improves?
>
> -Si
>
>
>
>
>
>
>
>
>
>
>
> On Tue, Sep 19, 2023 at 1:25 AM Mosharaf Hossain <
> mosharaf.hoss...@bol-online.com> wrote:
>
> > Hello Andrija
> >
> >  Presently, CloudStack's host lists exhibited stability prior to the
> > disaster, but their statuses are currently fluctuating continuously. Some
> > hosts are initially marked as disconnected, but after a period, they
> > transition to a connected state."
> >
> >
> >
> >
> > [image: image.png]
> >
> > *Using virsh we are getting VM status on cshost1 as below*
> > root@cshost1:~# virsh list
> >  IdName   State
> > ---
> >  10i-14-597-VMrunning
> >  61r-757-VM   running
> >  69i-24-767-VMrunning
> >  76r-71-VMrunning
> >  82i-24-797-VMrunning
> >  113   r-335-VM   running
> >  128   r-577-VM   running
> >  148   i-14-1151-VM   running
> >  164   i-2-1253-VMrunning
> >
> >
> > Regards
> > Mosharaf Hossain
> > Manager, Product Development
> > IT Division
> >
> > Bangladesh Export Import Company Ltd.
> >
> > Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
> >
> > Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
> >
> > Cell: +8801787680828, Email: mosharaf.hoss...@bol-online.com, Web:
> > www.bol-online.com
> >
> > <
> https://www.google.com/url?q=http://www.bol-online.com=D=hangouts=1557908951423000=AFQjCNGMxIuHSHsD3qO6y5JddpEZ0S592A
> >
> >
> >
> >
> > On Mon, Sep 18, 2023 at 12:43 PM Andrija Panic 
> > wrote:
> >
> >> Hi,
> >>
> >> the message " Agent-Handler-1:null) (logid:) Connection with libvirtd is
> >> broken: invalid connection pointer in virConnectGetVersion " - is a
> false
> >> alarm and does NOT mean any actual error.
> >>
> >> I can see that ACS agent sees different storage pools - namely
> >> "daab90ad-42d3-3c48-a9e4-b4c3c7fcdc84" and
> >> "a2d455c6-68cb-303f-a7fa-287e62a5be9c" - and I don't see any explicit
> error
> >> message about these 2 pools (both RBD/Ceph) pools.
> >>
> >> Also I can see that the cloudstack agent says it's connected to the mgmt
> >> host - which means that all pools are in place (otherwise the agent
> would
> >> not connect)
> >>
> >> 1. Are your KVM hosts all green when checking in the CloudStack UI
> >> (Connected/Up)?
> >> 2. You can always use virsh to list pools and see if they are there
> >>
> >> Best,
> >>
> >> On Wed, 13 Sept 2023 at 13:54, Mosharaf Hossain <
> >> mosharaf.hoss...@bol-online.com> wrote:
> >>
> >>> Hello Folks
> >>> We've recently performed an upgrade on our Cephadm cluster,
> transitioning
> >>> from Ceph Quincy to Reef. However, following the manual implementation
> >>> of
> >>> a read balancer in the Reef cluster, we've experienced a significant
> >>> slowdown in client I/O operations within the Ceph cluster, affecting
> both
> >>> client bandwidth and overall cluster performance.
> >>>
> >>> This slowdown has resulted in unresponsiveness across all virtual
> >>> machines
> >>> within the cluster, despite the fact that the cluster 

Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Jayanth Reddy
Hello Mosharaf,

I also see that you've created a thread on the Ceph-users mailing list 
regarding this. Did you get a chance to disable the Read Balancer as one of the 
devs suggested?

At the CloudStack end, in order to see if libvirt has issues communicating with 
Ceph, please try executing the command below continuously on your hosts:

# virsh pool-list
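
One way to run it continuously and spot a hang is a simple shell loop like the
one below (a sketch; the 30-second timeout is an arbitrary choice):

$ while true; do timeout 30 virsh pool-list || echo "$(date) virsh pool-list hung or failed"; sleep 5; done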

Please let me know if it freezes or doesn't return any response sometimes. 
AFAIK, there shouldn't be any compatibility issues as one of my Cloudstack 
deployments (v4.18.0.0) is running with Reef 18.2.0. Guess it has something to 
do with the Read balancer alone. Please also share your hosts' information, 
I'll see if I can reproduce.

Thanks,
Jayanth


From: Simon Weller 
Sent: Tuesday, September 19, 2023 8:25:17 PM
To: users@cloudstack.apache.org 
Cc: Andrija Panic ; Product Development | BEXIMCO IT 

Subject: Re: CloudStack agent can't connect to upgraded CEPH Cluster

Mosharaf,

Did you upgrade the Ceph client on your hosts as well?

What does "ceph -s" report? Is your cluster healthy?

Do you have any logs that indicate OSDs are disconnecting?

I'm not very familiar with the new read balancer feature in Reef. Can you
disable it and see if your performance improves?

-Si











On Tue, Sep 19, 2023 at 1:25 AM Mosharaf Hossain <
mosharaf.hoss...@bol-online.com> wrote:

> Hello Andrija
>
>  Presently, CloudStack's host lists exhibited stability prior to the
> disaster, but their statuses are currently fluctuating continuously. Some
> hosts are initially marked as disconnected, but after a period, they
> transition to a connected state."
>
>
>
>
> [image: image.png]
>
> *Using virsh we are getting VM status on cshost1 as below*
> root@cshost1:~# virsh list
>  IdName   State
> ---
>  10i-14-597-VMrunning
>  61r-757-VM   running
>  69i-24-767-VMrunning
>  76r-71-VMrunning
>  82i-24-797-VMrunning
>  113   r-335-VM   running
>  128   r-577-VM   running
>  148   i-14-1151-VM   running
>  164   i-2-1253-VMrunning
>
>
> Regards
> Mosharaf Hossain
> Manager, Product Development
> IT Division
>
> Bangladesh Export Import Company Ltd.
>
> Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
>
> Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
>
> Cell: +8801787680828, Email: mosharaf.hoss...@bol-online.com, Web:
> www.bol-online.com
>
> 
>
>
>
> On Mon, Sep 18, 2023 at 12:43 PM Andrija Panic 
> wrote:
>
>> Hi,
>>
>> the message " Agent-Handler-1:null) (logid:) Connection with libvirtd is
>> broken: invalid connection pointer in virConnectGetVersion " - is a false
>> alarm and does NOT mean any actual error.
>>
>> I can see that ACS agent sees different storage pools - namely
>> "daab90ad-42d3-3c48-a9e4-b4c3c7fcdc84" and
>> "a2d455c6-68cb-303f-a7fa-287e62a5be9c" - and I don't see any explicit error
>> message about these 2 pools (both RBD/Ceph) pools.
>>
>> Also I can see that the cloudstack agent says it's connected to the mgmt
>> host - which means that all pools are in place (otherwise the agent would
>> not connect)
>>
>> 1. Are your KVM hosts all green when checking in the CloudStack UI
>> (Connected/Up)?
>> 2. You can always use virsh to list pools and see if they are there
>>
>> Best,
>>
>> On Wed, 13 Sept 2023 at 13:54, Mosharaf Hossain <
>> mosharaf.hoss...@bol-online.com> wrote:
>>
>>> Hello Folks
>>> We've recently performed an upgrade on our Cephadm cluster, transitioning
>>> from Ceph Quincy to Reef. However, following the manual implementation
>>> of
>>> a read balancer in the Reef cluster, we've experienced a significant
>>> slowdown in client I/O operations within the Ceph cluster, affecting both
>>> client bandwidth and overall cluster performance.
>>>
>>> This slowdown has resulted in unresponsiveness across all virtual
>>> machines
>>> within the cluster, despite the fact that the cluster exclusively
>>> utilizes
>>> SSD storage."
>>>
>>> In the CloudStack agent, we are getting libvirt can't connect to the CEPH
>>> pool
>>> and generating an error message.
>>>
>>> 2023-09-13 16:57:51,660 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>>> (logid:) Lost connection to host: 10.10.11.61. Attempting reconnection
>>> while we still have 1 command in progress.
>>> 2023-09-13 16:57:51,661 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) NioClient connection closed
>>> 2023-09-13 16:57:51,662 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>>> (logid:) Reconnecting to host:10.10.11.62
>>> 2023-09-13 16:57:51,662 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) Connecting to 10.10.11.62:8250
>>> 2023-09-13 16:57:51,663 INFO  [utils.nio.Link] (Agent-Handler-4:null)
>>> (logid:) Conf file found: 

Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Simon Weller
Mosharaf,

Did you upgrade the Ceph client on your hosts as well?

What does "ceph -s" report? Is your cluster healthy?

Do you have any logs that indicate OSDs are disconnecting?

I'm not very familiar with the new read balancer feature in Reef. Can you
disable it and see if your performance improves?

-Si











On Tue, Sep 19, 2023 at 1:25 AM Mosharaf Hossain <
mosharaf.hoss...@bol-online.com> wrote:

> Hello Andrija
>
>  Presently, CloudStack's host lists exhibited stability prior to the
> disaster, but their statuses are currently fluctuating continuously. Some
> hosts are initially marked as disconnected, but after a period, they
> transition to a connected state."
>
>
>
>
> [image: image.png]
>
> *Using virsh we are getting VM status on cshost1 as below*
> root@cshost1:~# virsh list
>  IdName   State
> ---
>  10i-14-597-VMrunning
>  61r-757-VM   running
>  69i-24-767-VMrunning
>  76r-71-VMrunning
>  82i-24-797-VMrunning
>  113   r-335-VM   running
>  128   r-577-VM   running
>  148   i-14-1151-VM   running
>  164   i-2-1253-VMrunning
>
>
> Regards
> Mosharaf Hossain
> Manager, Product Development
> IT Division
>
> Bangladesh Export Import Company Ltd.
>
> Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
>
> Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
>
> Cell: +8801787680828, Email: mosharaf.hoss...@bol-online.com, Web:
> www.bol-online.com
>
> 
>
>
>
> On Mon, Sep 18, 2023 at 12:43 PM Andrija Panic 
> wrote:
>
>> Hi,
>>
>> the message " Agent-Handler-1:null) (logid:) Connection with libvirtd is
>> broken: invalid connection pointer in virConnectGetVersion " - is a false
>> alarm and does NOT mean any actual error.
>>
>> I can see that ACS agent sees different storage pools - namely
>> "daab90ad-42d3-3c48-a9e4-b4c3c7fcdc84" and
>> "a2d455c6-68cb-303f-a7fa-287e62a5be9c" - and I don't see any explicit error
>> message about these 2 pools (both RBD/Ceph) pools.
>>
>> Also I can see that the cloudstack agent says it's connected to the mgmt
>> host - which means that all pools are in place (otherwise the agent would
>> not connect)
>>
>> 1. Are your KVM hosts all green when checking in the CloudStack UI
>> (Connected/Up)?
>> 2. You can always use virsh to list pools and see if they are there
>>
>> Best,
>>
>> On Wed, 13 Sept 2023 at 13:54, Mosharaf Hossain <
>> mosharaf.hoss...@bol-online.com> wrote:
>>
>>> Hello Folks
>>> We've recently performed an upgrade on our Cephadm cluster, transitioning
>>> from Ceph Quincy to Reef. However, following the manual implementation
>>> of
>>> a read balancer in the Reef cluster, we've experienced a significant
>>> slowdown in client I/O operations within the Ceph cluster, affecting both
>>> client bandwidth and overall cluster performance.
>>>
>>> This slowdown has resulted in unresponsiveness across all virtual
>>> machines
>>> within the cluster, despite the fact that the cluster exclusively
>>> utilizes
>>> SSD storage."
>>>
>>> In the CloudStack agent, we are getting libvirt can't connect to the CEPH
>>> pool
>>> and generating an error message.
>>>
>>> 2023-09-13 16:57:51,660 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>>> (logid:) Lost connection to host: 10.10.11.61. Attempting reconnection
>>> while we still have 1 command in progress.
>>> 2023-09-13 16:57:51,661 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) NioClient connection closed
>>> 2023-09-13 16:57:51,662 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>>> (logid:) Reconnecting to host:10.10.11.62
>>> 2023-09-13 16:57:51,662 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) Connecting to 10.10.11.62:8250
>>> 2023-09-13 16:57:51,663 INFO  [utils.nio.Link] (Agent-Handler-4:null)
>>> (logid:) Conf file found: /etc/cloudstack/agent/agent.properties
>>> 2023-09-13 16:57:51,779 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) SSL: Handshake done
>>> 2023-09-13 16:57:51,779 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) Connected to 10.10.11.62:8250
>>> 2023-09-13 16:57:51,815 INFO  [utils.linux.KVMHostInfo]
>>> (Agent-Handler-1:null) (logid:) Fetching CPU speed from command "lscpu".
>>> 2023-09-13 16:57:51,836 INFO  [utils.linux.KVMHostInfo]
>>> (Agent-Handler-1:null) (logid:) Command [lscpu | grep -i 'Model name' |
>>> head -n 1 | egrep -o '[[:digit:]].[[:digit:]]+GHz' | sed 's/GHz//g']
>>> resulted in the value [2100] for CPU speed.
>>> 2023-09-13 16:57:51,900 INFO  [kvm.storage.LibvirtStorageAdaptor]
>>> (Agent-Handler-1:null) (logid:) Attempting to create storage pool
>>> e205cf5f-ea32-46c7-ba18-d18f62772b80 (Filesystem) in libvirt
>>> 2023-09-13 16:57:51,901 ERROR [kvm.resource.LibvirtConnection]
>>> (Agent-Handler-1:null) (logid:) Connection with 

Multilevel NAT with private networks

2023-09-19 Thread Emil Karlsson
Hi all,

We're currently using CloudStack as a deployment platform, and I am
interested to know if it's possible to port forward from one private
network to another private network.

Our use case:
We have a common network, and private networks as "branches" (both are of
type "Private networks" in CloudStack's terminology), where a VM can exist
in the common network and thus port forwarding is only required in the main
router -> VM. But they can also exist in any branch underneath, such that a
port forwarding rule is needed from root -> branch router -> VM. As below:

internet --- > common network  --- > private network 1
- vm 1   - vm 3
- vm 2   - vm 4

The reason for this is that it would require only one Public IP address.
However, it appears I am unable to do this, as the create
portforwardingrule requires a vmID in the network.
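
For reference, this is roughly what the call looks like via CloudMonkey (cmk);
the parameter names follow the createPortForwardingRule API, and the IDs below
are placeholders:

$ cmk create portforwardingrule ipaddressid=<public-ip-uuid> protocol=tcp \
      publicport=8443 privateport=443 virtualmachineid=<vm-uuid>

Since virtualmachineid is mandatory and the VM has to sit in the network the
rule belongs to, a rule cannot target a VM behind a second, nested network.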

Is there some way to achieve this using only CloudStack?

Best regards,
Emil Karlsson
kthcloud


Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Andrija Panic
Hi,

ok, thx for the info.

I meant to use virsh to list pools (storage pools), not VMs - to see if the
storage pools are created inside libvirt.
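
For example (standard virsh commands; the UUID is whichever storage pool
CloudStack defined on that host):

$ virsh pool-list --all
$ virsh pool-info <pool-uuid>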

Best,

On Tue, 19 Sept 2023 at 08:25, Mosharaf Hossain <
mosharaf.hoss...@bol-online.com> wrote:

> Hello Andrija
>
>  Presently, CloudStack's host lists exhibited stability prior to the
> disaster, but their statuses are currently fluctuating continuously. Some
> hosts are initially marked as disconnected, but after a period, they
> transition to a connected state."
>
>
>
>
> [image: image.png]
>
> *Using virsh we are getting VM status on cshost1 as below*
> root@cshost1:~# virsh list
>  IdName   State
> ---
>  10i-14-597-VMrunning
>  61r-757-VM   running
>  69i-24-767-VMrunning
>  76r-71-VMrunning
>  82i-24-797-VMrunning
>  113   r-335-VM   running
>  128   r-577-VM   running
>  148   i-14-1151-VM   running
>  164   i-2-1253-VMrunning
>
>
> Regards
> Mosharaf Hossain
> Manager, Product Development
> IT Division
>
> Bangladesh Export Import Company Ltd.
>
> Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh
>
> Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757
>
> Cell: +8801787680828, Email: mosharaf.hoss...@bol-online.com, Web:
> www.bol-online.com
>
> 
>
>
>
> On Mon, Sep 18, 2023 at 12:43 PM Andrija Panic 
> wrote:
>
>> Hi,
>>
>> the message " Agent-Handler-1:null) (logid:) Connection with libvirtd is
>> broken: invalid connection pointer in virConnectGetVersion " - is a false
>> alarm and does NOT mean any actual error.
>>
>> I can see that ACS agent sees different storage pools - namely
>> "daab90ad-42d3-3c48-a9e4-b4c3c7fcdc84" and
>> "a2d455c6-68cb-303f-a7fa-287e62a5be9c" - and I don't see any explicit error
>> message about these 2 pools (both RBD/Ceph) pools.
>>
>> Also I can see that the cloudstack agent says it's connected to the mgmt
>> host - which means that all pools are in place (otherwise the agent would
>> not connect)
>>
>> 1. Are your KVM hosts all green when checking in the CloudStack UI
>> (Connected/Up)?
>> 2. You can always use virsh to list pools and see if they are there
>>
>> Best,
>>
>> On Wed, 13 Sept 2023 at 13:54, Mosharaf Hossain <
>> mosharaf.hoss...@bol-online.com> wrote:
>>
>>> Hello Folks
>>> We've recently performed an upgrade on our Cephadm cluster, transitioning
>>> from Ceph Quiency to Reef. However, following the manual implementation
>>> of
>>> a read balancer in the Reef cluster, we've experienced a significant
>>> slowdown in client I/O operations within the Ceph cluster, affecting both
>>> client bandwidth and overall cluster performance.
>>>
>>> This slowdown has resulted in unresponsiveness across all virtual
>>> machines
>>> within the cluster, despite the fact that the cluster exclusively
>>> utilizes
>>> SSD storage."
>>>
>>> In the CloudStack agent, we are getting libvirrt can't connect to CEPH
>>> pool
>>> and generating an error message.
>>>
>>> 2023-09-13 16:57:51,660 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>>> (logid:) Lost connection to host: 10.10.11.61. Attempting reconnection
>>> while we still have 1 command in progress.
>>> 2023-09-13 16:57:51,661 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) NioClient connection closed
>>> 2023-09-13 16:57:51,662 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>>> (logid:) Reconnecting to host:10.10.11.62
>>> 2023-09-13 16:57:51,662 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) Connecting to 10.10.11.62:8250
>>> 2023-09-13 16:57:51,663 INFO  [utils.nio.Link] (Agent-Handler-4:null)
>>> (logid:) Conf file found: /etc/cloudstack/agent/agent.properties
>>> 2023-09-13 16:57:51,779 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) SSL: Handshake done
>>> 2023-09-13 16:57:51,779 INFO  [utils.nio.NioClient]
>>> (Agent-Handler-4:null)
>>> (logid:) Connected to 10.10.11.62:8250
>>> 2023-09-13 16:57:51,815 INFO  [utils.linux.KVMHostInfo]
>>> (Agent-Handler-1:null) (logid:) Fetching CPU speed from command "lscpu".
>>> 2023-09-13 16:57:51,836 INFO  [utils.linux.KVMHostInfo]
>>> (Agent-Handler-1:null) (logid:) Command [lscpu | grep -i 'Model name' |
>>> head -n 1 | egrep -o '[[:digit:]].[[:digit:]]+GHz' | sed 's/GHz//g']
>>> resulted in the value [2100] for CPU speed.
>>> 2023-09-13 16:57:51,900 INFO  [kvm.storage.LibvirtStorageAdaptor]
>>> (Agent-Handler-1:null) (logid:) Attempting to create storage pool
>>> e205cf5f-ea32-46c7-ba18-d18f62772b80 (Filesystem) in libvirt
>>> 2023-09-13 16:57:51,901 ERROR [kvm.resource.LibvirtConnection]
>>> (Agent-Handler-1:null) (logid:) Connection with libvirtd is broken:
>>> invalid
>>> connection pointer in virConnectGetVersion
>>> 2023-09-13 16:57:51,903 INFO  [kvm.storage.LibvirtStorageAdaptor]
>>> 

Detected that another management node with the same IP is already running, please check your cluster configuration

2023-09-19 Thread jaejong
Rocky Linux 9.2
MySQL 8.0.32
A single Management Server node with MySQL on the same node

1. After rebooting the management host I get the following error messages:

sudo systemctl status cloudstack-management

 Loaded: loaded (/usr/lib/systemd/system/cloudstack-management.service; 
enabled; preset: disabled)
 Active: active (running) since Tue 2023-09-19 17:35:39 KST; 2min 3s ago
   Main PID: 1148 (java)
  Tasks: 49 (limit: 408699)
 Memory: 1.1G
CPU: 40.653s
 CGroup: /system.slice/cloudstack-management.service
 └─1148 /usr/bin/java 
-Djava.security.properties=/etc/cloudstack/management/java.security.ciphers 
-Djava.awt.headless=true -Dcom.sun.management.>

java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.start(ContainerLifeCycle.java:169)
java[1148]: at org.eclipse.jetty.server.Server.start(Server.java:423)
java[1148]: at 
org.eclipse.jetty.util.component.ContainerLifeCycle.doStart(ContainerLifeCycle.java:110)
java[1148]: at 
org.eclipse.jetty.server.handler.AbstractHandler.doStart(AbstractHandler.java:97)
java[1148]: at org.eclipse.jetty.server.Server.doStart(Server.java:387)
java[1148]: at 
org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:73)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.start(ServerDaemon.java:192)
java[1148]: at 
org.apache.cloudstack.ServerDaemon.main(ServerDaemon.java:107)
java[1148]: INFO  [o.a.c.s.NfsMountManager] (main:null) (logid:) Clean up 
mounted NFS mount points used in current session

server.log
2023-09-19 17:35:54,352 ERROR [c.c.c.ClusterManagerImpl] (main:null) (logid:) 
Detected that another management node with the same IP 10.0.33.1 is already 
running, please check your cluster configuration
2023-09-19 17:35:54,353 ERROR [o.a.c.s.l.CloudStackExtendedLifeCycle] 
(main:null) (logid:) Failed to configure ClusterManagerImpl
javax.naming.ConfigurationException: Detected that another management node with 
the same IP  is already running, please check your cluster configuration
at 
com.cloud.cluster.ClusterManagerImpl.checkConflicts(ClusterManagerImpl.java:1245)
at 
com.cloud.cluster.ClusterManagerImpl.configure(ClusterManagerImpl.java:1115)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle$3.with(CloudStackExtendedLifeCycle.java:114)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.with(CloudStackExtendedLifeCycle.java:153)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.configure(CloudStackExtendedLifeCycle.java:110)
at 
org.apache.cloudstack.spring.lifecycle.CloudStackExtendedLifeCycle.start(CloudStackExtendedLifeCycle.java:55)
at 
org.springframework.context.support.DefaultLifecycleProcessor.doStart(DefaultLifecycleProcessor.java:178)
at 
org.springframework.context.support.DefaultLifecycleProcessor.access$200(DefaultLifecycleProcessor.java:54)
at 
org.springframework.context.support.DefaultLifecycleProcessor$LifecycleGroup.start(DefaultLifecycleProcessor.java:356)
at java.base/java.lang.Iterable.forEach(Iterable.java:75)
at 
org.springframework.context.support.DefaultLifecycleProcessor.startBeans(DefaultLifecycleProcessor.java:155)
at 
org.springframework.context.support.DefaultLifecycleProcessor.onRefresh(DefaultLifecycleProcessor.java:123)
at 
org.springframework.context.support.AbstractApplicationContext.finishRefresh(AbstractApplicationContext.java:937)
at 
org.springframework.context.support.AbstractApplicationContext.refresh(AbstractApplicationContext.java:586)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.loadContext(DefaultModuleDefinitionSet.java:144)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet$2.with(DefaultModuleDefinitionSet.java:121)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:244)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:249)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:249)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.withModule(DefaultModuleDefinitionSet.java:232)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.loadContexts(DefaultModuleDefinitionSet.java:116)
at 
org.apache.cloudstack.spring.module.model.impl.DefaultModuleDefinitionSet.load(DefaultModuleDefinitionSet.java:78)
at 
org.apache.cloudstack.spring.module.factory.ModuleBasedContextFactory.loadModules(ModuleBasedContextFactory.java:37)
   

Re: [DISCUSS] New Design for the Apache CloudStack Website

2023-09-19 Thread Ivet Petrova
Hi all,

Thanks for the good feedback and positive words.
I would suggest that if no one has anything against the new design, we start 
development of the new home page.
For the internal pages, which are mainly text, we can add simple headers and 
footers and something lightweight.

I will wait till next Wednesday to see if someone wants to add something more.

Kind regards,


 

On 7 Sep 2023, at 18:48, Foysal Kayum 
mailto:foy...@bol-online.com>> wrote:

Seems Great .



On Thu, Sep 7, 2023, 4:07 PM Wido den Hollander 
mailto:w...@widodh.nl>> wrote:


Op 31-08-2023 om 11:41 schreef Rohit Yadav:
> Thanks Ivet, the new iterated design looks impressive especially some of the 
> new graphical elements.
>
> +1 to moving forward with the proposed design. Let's collect any further 
> feedback and ideas from the community and if there are no objections go ahead 
> with updating the preview/staging site 
> (https://cloudstack.staged.apache.org/) and eventually publishing the website.
>
> Just to be clear on the CMS integration - the ASF infra has declined 
> supporting a 3rd party git-based CMS that can be used with the project 
> website. This is such a shame as I had created PoCs with some rather okay-ist 
> git-based CMS such as Netlify-CMS, TinaCMS and SpinalCMS which would give 
> similar UI-based workflow like the old Roller-based blog did.
>
> Nevertheless, for the purposes of publishing blogs the new Github-based 
> document/markdown editor/CMS is fair enough and now allows uploading of 
> assets (images, files etc.) and blog that can be edited directly for any 
> committer incl. PMC members logged into Github and upon saving such changes 
> the website is published by an automatic Github action that builds and 
> published the websites. Unfortunately, any non-committer would need to follow 
> the PR workflow. I had this documented at 
> https://cloudstack.staged.apache.org/website-guide/ that I can help further 
> update in this regard.
>

Thank you Ivet and Rohit, this is looking great! Much better than what we had.

I have no real comments nor objections. It's a great improvement
over what we currently have.

Wido

>
> Regards.
>
> 
> From: Ivet Petrova 
> mailto:ivet.petr...@shapeblue.com>>
> Sent: Wednesday, August 30, 2023 19:04
> To: Giles Sirett 
> mailto:giles.sir...@shapeblue.com>>
> Cc: d...@cloudstack.apache.org 
> mailto:d...@cloudstack.apache.org>>; 
> users@cloudstack.apache.org 
> mailto:users@cloudstack.apache.org>>; Marketing 
> mailto:market...@shapeblue.com>>
> Subject: Re: [DISCUSS] New Design for the Apache CloudStack Website
>
> Hi All,
>
> I uploaded the design here: 
> https://drive.google.com/file/d/1pef7xWWMPYAA5UkbS_XMUxrz53KB7J5t/view?usp=sharing
>
>
> Kind regards,
>
>
>
>
>
>
>
> On 30 Aug 2023, at 16:31, Giles Sirett 
> mailto:giles.sir...@shapeblue.com>>>
>  wrote:
>
> Hi Ivet – thanks for pushing forward with this – excited to review a new 
> design.
>
> On that note, I cant see a link in your mail ☹
>
> Kind Regards
> Giles
>
>
> Giles Sirett
> CEO
> giles.sir...@shapeblue.com>
> www.shapeblue.com
>
>
>
>
> From: Ivet Petrova 
> mailto:ivet.petr...@shapeblue.com>>>
> Sent: Wednesday, August 30, 2023 10:14 AM
> To: 
> users@cloudstack.apache.org>;
>  Marketing 
> mailto:market...@shapeblue.com>>>
> Cc: dev 
> mailto:d...@cloudstack.apache.org>>>
> Subject: [DISCUSS] New Design for the Apache CloudStack Website
>
> Hello,
>
> I would like to start a discussion on the design of the Apache CloudStack 
> Website and to propose a new design for it.
>
> As we all know, the website has not been changed for years in terms of design 
> and information. The biggest issue we know we have is that the website is not 
> showing the full potential of CloudStack. In addition to it during 
> discussions with many community members, I have noted the following issues:
> - the existing website design is old-school
> - the current homepage does not collect enough information to show 
> CloudStack's strengths
> - current website design is missing images from the ACS UI and cannot create 
> a feel for the product in the users
> - the website has issues on a mobile device
> - we lack any graphic and diagrams
> - some important information like how to download is not very visible
>
> I collected a lot of feedback during last months and want to propose a new up 
> 

Re: CloudStack agent can't connect to upgraded CEPH Cluster

2023-09-19 Thread Mosharaf Hossain
Hello Andrija

 Presently, CloudStack's host list exhibited stability prior to the
disaster, but host statuses are currently fluctuating continuously. Some
hosts are initially marked as disconnected, but after a period they
transition to a connected state.




[image: image.png]

*Using virsh we are getting VM status on cshost1 as below*
root@cshost1:~# virsh list
 IdName   State
---
 10i-14-597-VMrunning
 61r-757-VM   running
 69i-24-767-VMrunning
 76r-71-VMrunning
 82i-24-797-VMrunning
 113   r-335-VM   running
 128   r-577-VM   running
 148   i-14-1151-VM   running
 164   i-2-1253-VMrunning


Regards
Mosharaf Hossain
Manager, Product Development
IT Division

Bangladesh Export Import Company Ltd.

Level-8, SAM Tower, Plot #4, Road #22, Gulshan-1, Dhaka-1212,Bangladesh

Tel: +880 9609 000 999, +880 2 5881 5559, Ext: 14191, Fax: +880 2 9895757

Cell: +8801787680828, Email: mosharaf.hoss...@bol-online.com, Web:
www.bol-online.com




On Mon, Sep 18, 2023 at 12:43 PM Andrija Panic 
wrote:

> Hi,
>
> the message " Agent-Handler-1:null) (logid:) Connection with libvirtd is
> broken: invalid connection pointer in virConnectGetVersion " - is a false
> alarm and does NOT mean any actual error.
>
> I can see that ACS agent sees different storage pools - namely
> "daab90ad-42d3-3c48-a9e4-b4c3c7fcdc84" and
> "a2d455c6-68cb-303f-a7fa-287e62a5be9c" - and I don't see any explicit error
> message about these 2 pools (both RBD/Ceph) pools.
>
> Also I can see that the cloudstack agent says it's connected to the mgmt
> host - which means that all pools are in place (otherwise the agent would
> not connect)
>
> 1. Are your KVM hosts all green when checking in the CloudStack UI
> (Connected/Up)?
> 2. You can always use virsh to list pools and see if they are there
>
> Best,
>
> On Wed, 13 Sept 2023 at 13:54, Mosharaf Hossain <
> mosharaf.hoss...@bol-online.com> wrote:
>
>> Hello Folks
>> We've recently performed an upgrade on our Cephadm cluster, transitioning
>> from Ceph Quincy to Reef. However, following the manual implementation of
>> a read balancer in the Reef cluster, we've experienced a significant
>> slowdown in client I/O operations within the Ceph cluster, affecting both
>> client bandwidth and overall cluster performance.
>>
>> This slowdown has resulted in unresponsiveness across all virtual machines
>> within the cluster, despite the fact that the cluster exclusively utilizes
>> SSD storage."
>>
>> In the CloudStack agent, we are getting libvirt can't connect to the CEPH
>> pool
>> and generating an error message.
>>
>> 2023-09-13 16:57:51,660 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>> (logid:) Lost connection to host: 10.10.11.61. Attempting reconnection
>> while we still have 1 command in progress.
>> 2023-09-13 16:57:51,661 INFO  [utils.nio.NioClient] (Agent-Handler-4:null)
>> (logid:) NioClient connection closed
>> 2023-09-13 16:57:51,662 INFO  [cloud.agent.Agent] (Agent-Handler-4:null)
>> (logid:) Reconnecting to host:10.10.11.62
>> 2023-09-13 16:57:51,662 INFO  [utils.nio.NioClient] (Agent-Handler-4:null)
>> (logid:) Connecting to 10.10.11.62:8250
>> 2023-09-13 16:57:51,663 INFO  [utils.nio.Link] (Agent-Handler-4:null)
>> (logid:) Conf file found: /etc/cloudstack/agent/agent.properties
>> 2023-09-13 16:57:51,779 INFO  [utils.nio.NioClient] (Agent-Handler-4:null)
>> (logid:) SSL: Handshake done
>> 2023-09-13 16:57:51,779 INFO  [utils.nio.NioClient] (Agent-Handler-4:null)
>> (logid:) Connected to 10.10.11.62:8250
>> 2023-09-13 16:57:51,815 INFO  [utils.linux.KVMHostInfo]
>> (Agent-Handler-1:null) (logid:) Fetching CPU speed from command "lscpu".
>> 2023-09-13 16:57:51,836 INFO  [utils.linux.KVMHostInfo]
>> (Agent-Handler-1:null) (logid:) Command [lscpu | grep -i 'Model name' |
>> head -n 1 | egrep -o '[[:digit:]].[[:digit:]]+GHz' | sed 's/GHz//g']
>> resulted in the value [2100] for CPU speed.
>> 2023-09-13 16:57:51,900 INFO  [kvm.storage.LibvirtStorageAdaptor]
>> (Agent-Handler-1:null) (logid:) Attempting to create storage pool
>> e205cf5f-ea32-46c7-ba18-d18f62772b80 (Filesystem) in libvirt
>> 2023-09-13 16:57:51,901 ERROR [kvm.resource.LibvirtConnection]
>> (Agent-Handler-1:null) (logid:) Connection with libvirtd is broken:
>> invalid
>> connection pointer in virConnectGetVersion
>> 2023-09-13 16:57:51,903 INFO  [kvm.storage.LibvirtStorageAdaptor]
>> (Agent-Handler-1:null) (logid:) Found existing defined storage pool
>> e205cf5f-ea32-46c7-ba18-d18f62772b80, using it.
>> 2023-09-13 16:57:51,904 INFO  [kvm.storage.LibvirtStorageAdaptor]
>> (Agent-Handler-1:null) (logid:) Trying to fetch storage pool
>> e205cf5f-ea32-46c7-ba18-d18f62772b80 from libvirt
>> 2023-09-13 16:57:51,924 INFO  [cloud.agent.Agent] (Agent-Handler-2:null)
>> (logid:) Process agent startup answer, agent id = 0
>>