On Sun, Jan 7, 2018 at 11:24 PM, Tristan Cacqueray <tdeca...@redhat.com> wrote:
> Hello David and Wesley, please find some comments inlined below.
>
> On January 5, 2018 6:39 pm, Wesley Hayutin wrote:
>> On Fri, Jan 5, 2018 at 12:36 PM, David Moreau Simard <d...@redhat.com>
>> wrote:
>>
>>> There are already plans [1] to add the software factory implementation
>>> of Grafana on review.rdoproject.org; you can see what it looks like on
>>> softwarefactory-project.io [2].
>>>
>>> The backend to this grafana implementation is currently influxdb, not
>>> graphite. However, there are ongoing discussions about deploying both
>>> graphite and influxdb simultaneously, or optionally just one of them.
>>>
>>> We're interested in leveraging this influxdb (or graphite) and grafana
>>> implementation for monitoring data in general (uptime, resources, disk
>>> space, load, etc.), so our goals align here.
>>> We both agree that using graphite would be a plus in order to re-use
>>> the same queries in the grafana dashboard, but at the same time
>>> influxdb is more "modern" and easier to work with -- this is why we
>>> might end up deploying both; we'll see.
>>>
>>> [1]: https://bugzilla.redhat.com/show_bug.cgi?id=1514086
>>> [2]: https://softwarefactory-project.io/grafana/
>
> Note that the current influxdb/grafana integration is for instance system
> metrics (cpu, mem, network and i/o). We are working on getting zuul and
> nodepool metrics, but the upstream queries need to be adapted for
> influxdb, hence we may look at integrating graphite/carbon too so that
> it is easier. There is also this tool that can make influxdb a backend
> for graphite:
> https://github.com/InfluxGraph/influxgraph
>
> Also note that we are integrating grafyaml into the config repo so that
> grafana dashboards can be proposed and updated by regular users too.
>
>> This is great news David, thank you for sharing.
>> Given that this is already planned in Software Factory and we have an
>> immediate need, I'm wondering how to proceed.
>> Does the RDO Infra team have an estimate when graphite/influxdb/grafana
>> will be moved to production?
>
> While we could set up the grafana/influxdb service, and we should
> in the near future, it seems like this CI use-case needs some more
> tinkering, and I think it would be easier to start with another
> dedicated setup until the requirements are better defined.
>
>> Some possibilities come to mind, depending on when it moves to prod:
>>
>> 1. The TripleO-CI team waits for prod.
>> 2. TripleO CI would stand up a test instance of graphite/influxdb and
>>    grafana and start to work out what we need to send and how to send
>>    data.
>> 3. Is it possible to use the stage instance of RDO SF as a testbed for
>>    TripleO-CI's work? Meaning we send metrics and use the stage
>>    instance with backing up the data in mind?
>>
>> What do you think?
>> Thanks
>
> I think 1. will happen shortly, and this will bring a grafana setup
> accessible from the top menu.
>
> Though I think 2. is probably easier to begin with, and we could
> configure the new graphite/influxdb backend in the existing grafana.
>
> Not sure what you mean by 3. If there is a graphite/influxdb service in
> the rdo-prod tenant, then you could use it for tripleo-ci work of course.
> The backup of RDO SF is managed by this playbook:
> https://softwarefactory-project.io/r/gitweb?p=software-factory/sf-ops.git;a=blob;f=backup/ansible/backup.yml
> We could add_host the new backend and back up its data similarly.
>
> Here are some more thoughts:
>
> Depending on how the metrics are pushed, we may need some kind of
> authorization mechanism and a job secret to allow external clients to
> push new metrics.
>
> It seems like we could set up a post run to push job metrics. Perhaps
> we could leverage the ara sqldump to extract per-task duration.
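[Editor's note: the post-run push Tristan describes could be as simple as
writing to Graphite's plaintext listener. A minimal sketch follows; the
hostname, port, and metric path are hypothetical placeholders, not values
from this thread.]

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one metric in Graphite's plaintext protocol:
    '<metric.path> <value> <unix-timestamp>\n'."""
    ts = int(timestamp if timestamp is not None else time.time())
    return "%s %s %d\n" % (path, value, ts)


def push_metric(path, value, host="graphite.example.com", port=2003):
    """Send a single metric to carbon over TCP (fire-and-forget)."""
    line = graphite_line(path, value)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# e.g. a per-task duration extracted from an ara sqldump:
# push_metric("tripleo.ci.featureset018.overcloud_deploy.duration", 2145.3)
```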
> Software Factory may also automatically set up a job-duration graph
> dashboard per project; here is a new user-story to track this work:
> https://tree.taiga.io/project/morucci-software-factory/us/897
>
> Alternatively, we could also use the zuul sql reporter database, which
> already records the start/end time of each job. Here is a gnuplot of
> that data:
> https://fedorapeople.org/~tdecacqu/tripleo-ci/periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset018-pike.png
> This could probably be integrated into the zuul-web dashboard upstream.
>
> Alternatively, the elasticsearch data could also be used to construct a
> similar graph in kibana, though it seems to be missing a duration field.
>
> Regards,
> -Tristan

Thanks for the feedback David, Tristan. We will be discussing your
feedback tomorrow directly after the tripleo meeting on #tripleo. You are
always welcome to join; just ping on #oooq / #tripleo for details about
the meeting. We're going to spend about 20 minutes in a Q&A session about
the tools. We'll follow up with our plans to this thread.

Thanks all!

>>> David Moreau Simard
>>> Senior Software Engineer | OpenStack RDO
>>>
>>> dmsimard = [irc, github, twitter]
>>>
>>> On Fri, Jan 5, 2018 at 12:13 PM, Wesley Hayutin <whayu...@redhat.com>
>>> wrote:
>>>
>>>> Greetings,
>>>>
>>>> At the end of 2017, a number of the upstream multinode scenario jobs
>>>> started to run over our required deployment times [1]. In an effort
>>>> to better understand the performance of the deployment and CI, the
>>>> tripleo cores requested that a Graphite and Grafana server be stood
>>>> up so that we can analyze the core issues more effectively.
>>>>
>>>> There is a certain amount of urgency with this issue, as our upstream
>>>> coverage is impacted. The TripleO-CI team is working on the
>>>> deployment of both tools in a dev-ops style in RDO-Cloud this sprint.
>>>> Nothing yet has been deployed.
>>>> The TripleO CI team is also working with upstream infra to send
>>>> metrics and data to the upstream Graphite and Grafana servers. It is
>>>> not clear yet whether we have permission or access to the upstream
>>>> tools.
>>>>
>>>> I wanted to publicly announce this work to the RDO infra community to
>>>> inform and to gather any feedback anyone may have. There are two
>>>> scopes of work here: the initial tooling to stand up the infra, and
>>>> the longer term maintenance of the tools. Perhaps there are plans to
>>>> build these into RDO SF already, etc.
>>>>
>>>> Please reply with your comments and concerns.
>>>> Thank you!
>>>>
>>>> [1] https://github.com/openstack-infra/tripleo-ci/commit/7a2edf70eccfc7002d26fd1ce1eef803ce8d0ba8
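[Editor's note: Tristan's zuul SQL reporter suggestion above could be
prototyped with a duration query along these lines. The table and column
names below only approximate the zuul v3 SQL reporter schema and should
be verified against the deployed zuul version; the sample data is made
up for illustration.]

```python
import sqlite3

# In-memory stand-in for the zuul SQL reporter's build table
# (schema approximated; check against the real database).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE zuul_build (
                  job_name   TEXT,
                  result     TEXT,
                  start_time TEXT,
                  end_time   TEXT)""")
conn.executemany(
    "INSERT INTO zuul_build VALUES (?, ?, ?, ?)",
    [("periodic-featureset018", "SUCCESS",
      "2018-01-05 10:00:00", "2018-01-05 12:30:00"),
     ("periodic-featureset018", "SUCCESS",
      "2018-01-06 10:00:00", "2018-01-06 12:00:00")])

# Average wall-clock duration in minutes per job, successes only --
# the same quantity Tristan's gnuplot graph tracks over time.
rows = conn.execute("""
    SELECT job_name,
           AVG((julianday(end_time) - julianday(start_time)) * 24 * 60)
    FROM zuul_build
    WHERE result = 'SUCCESS'
    GROUP BY job_name""").fetchall()
print(rows)
```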
_______________________________________________
dev mailing list
dev@lists.rdoproject.org
http://lists.rdoproject.org/mailman/listinfo/dev

To unsubscribe: dev-unsubscr...@lists.rdoproject.org