Hello all,

After trying to collect correct metrics from the API, we had to fall back on a lower-level set of collectors, because some of the metrics (e.g. per-VM resource usage) were incorrect.
Our current setup is netdata, using its cgroups collector to gather metrics from KVM/QEMU. We match the VM names against metadata fetched directly from the API to enrich the metrics with labels. We then store everything in Prometheus and expose the graphs through a separate Grafana.

Overall, the feedback from our end users is that seeing incorrect metrics (mostly in the used-RAM counters) left them really confused about the actual state of their project. It is unsettling to see a VM configured with multiple GB of RAM appear to use 100% of it immediately at boot (which is wrong, because the wrong underlying counter was used). We have also had cases where CPU usage went above 100%, which looks really confusing. Finally, users were sceptical as to why there are two separate measurements for CPU, one being the total frequency (whatever that is supposed to mean) and the other the number of cores used. All of this really hurts when we advocate it as a trustworthy system to our users.

We have tried some workarounds. For example, we tried compensating for the incorrect counters by enabling memory ballooning, which helped somewhat with getting correct values but brought its own set of issues.

In the future, we plan to run many more tests on the metrics, comparing what CloudStack gives us with what we actually see inside our systems. We hope to eventually find the correct configuration elements, and to flag issues if we find any.

To sum up, we believe the current implementation of metrics isn't nearly accurate enough (at least out of the box) to be shown to all users as a landing page. We also fully understand that CloudStack is a complex and complete ecosystem with ties and dependencies to many sub-components, which makes it difficult to test all the possible use cases.
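For readers who want a concrete picture of the label-enrichment step mentioned above (matching VM names from the cgroups collector against API metadata), here is a minimal sketch in Python. Everything in it is hypothetical: the `VmMetadata` fields, the sample layout, and the `enrich_sample` helper are illustrative names, not the actual netdata or CloudStack schema. It also assumes that CPU readings above 100% come from usage being summed across vCPUs, which is an unconfirmed guess, and shows one way such a value could be normalised against the VM's allocation.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class VmMetadata:
    """Per-VM fields fetched from the API (field names are assumptions)."""
    instance_name: str  # hypervisor-level VM name, e.g. "i-2-345-VM"
    display_name: str
    account: str
    vcpus: int


def enrich_sample(sample: Dict[str, object],
                  metadata: Dict[str, VmMetadata]) -> Optional[Dict[str, object]]:
    """Attach API-derived labels to one cgroup sample and normalise CPU usage.

    `sample` is assumed to look like {"vm": "i-2-345-VM", "cpu_percent": 250.0},
    where cpu_percent is summed over cores and can therefore exceed 100.
    Returns None when the VM name has no match in the API metadata.
    """
    meta = metadata.get(str(sample["vm"]))
    if meta is None:
        return None
    raw_cpu = float(sample["cpu_percent"])
    # Guard against a zero vCPU count so the division cannot fail.
    vcpus = max(meta.vcpus, 1)
    return {
        "vm": meta.instance_name,
        "display_name": meta.display_name,
        "account": meta.account,
        "vcpus": meta.vcpus,
        "cpu_percent_raw": raw_cpu,
        "cpu_percent_of_allocation": raw_cpu / vcpus,
    }


# Hypothetical usage with made-up values.
example_metadata = {
    "i-2-345-VM": VmMetadata("i-2-345-VM", "web-01", "acme", 4),
}
print(enrich_sample({"vm": "i-2-345-VM", "cpu_percent": 250.0}, example_metadata))
```

Keeping both the raw and the normalised CPU value lets a dashboard show a 0 to 100% gauge without hiding the underlying counter.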
We hope to soon reach a point where we have used the system enough, and understand it well enough, to contribute in a meaningful way.

Regards,
Vladimir

> On Mon, 7 Mar 2022 at 12:02, Ivet Petrova <ivet.petr...@shapeblue.com> wrote:
> Maybe this talk from the last CloudStack Collaboration Conference can be
> useful: https://www.youtube.com/watch?v=m8mYdWHoxLY
>
> Kind regards,
>
> On 7 Mar 2022, at 12:10, benoit lair <kurushi4...@gmail.com> wrote:
>
> Hi,
>
> We are now using Centreon with a custom autodeclaration feature built with
> our own templates (now ACS 4.16 in production).
> If we could use something more out of the box, I would be happy to switch.
> We used to use Zenoss for our ACS 4.3, which has a plugin specific to
> Cloudstack, but it is no longer maintained.
>
> Regards, Benoit
>
> On Sun, 6 Mar 2022 at 16:40, Paul Angus <pau...@apache.org> wrote:
>
> Hi Nux,
>
> At Ticketmaster we use the Prometheus exporter. We are about to work on adding
> more detail to what's exported with regard to VMs, as it is very
> infrastructure-focused out of the box.
>
> Kind regards
>
> Paul Angus
>
> -----Original Message-----
> From: Nux <n...@li.nux.ro>
> Sent: 02 March 2022 10:56
> To: users@cloudstack.apache.org; d...@cloudstack.apache.org
> Subject: Re: How are you monitoring Cloudstack?
>
> Hi!
>
> Another nudge on the $subject in case people missed this.
>
> If you have a functioning way of monitoring Cloudstack & co in your
> organisation I'd like to hear about it.
> It doesn't have to be anything exotic, so don't be shy as long as we have
> anything to talk about.
>
> Thanks :)
>
> On 2022-02-21 14:38, Nux! wrote:
> Hi folks,
>
> If anyone cares to share (on or off list) with me a few words about
> how they are monitoring Cloudstack and related infrastructure that'd
> be lovely.
> I'm trying to find out what the choices currently are and how we can
> improve the overall experience.
>
> Don't be shy!
>
> Cheers