Thanks, two follow-ups if you have a second:

- What draws you to that page in the morning - a habit, an alert, something else? And is this for one system or several?
- And once you see something firing, what's the next step: jump straight to the system, check the underlying metric, or ping someone?

PiotrW

On Mon, Jun 8, 2026 at 11:09 AM Edmore Tshuma <[email protected]> wrote:

> I check the Grafana Alert Rules page first. It shows the state of all
> configured alerts - which ones are firing/not firing.
>
> NB: It’s not a dashboard, just some menu item in Grafana
>
> On Mon 8. 6. 2026 at 10:05, Piotr Wargulak <[email protected]>
> wrote:
>
>> Hello all,
>>
>> @Edmore Tshuma: Thanks for the detailed answer last week, that gave me a
>> good list of what a mature observability setup should cover.
>>
>> I'm still hoping to hear from people who actually run Fineract (or
>> similar OSS) in production. Not how it should be done in theory, but how
>> you actually do it day to day.
>>
>> If a full conversation feels like too much, here's a smaller question:
>> what's the first thing you check in the morning to see if the system is OK?
>> A dashboard, a log, a Slack channel, or do you "wait for someone to
>> complain"?
>>
>> I'm especially interested in hearing from operators who lack a budget for
>> a dedicated SRE team.
>>
>> Best regards,
>> PiotrW
>> SolDevelo
>>
>> On Mon, Jun 1, 2026 at 1:08 PM Edmore Tshuma <[email protected]> wrote:
>>
>>> Observability/SRE expert here :)
>>>
>>> well … it’s quite a lot.
>>>
>>> At a minimum you need:
>>>
>>> 1. A minimal full-fledged Observability stack (can be pure OSS tools)
>>> which handles:
>>>
>>> - which telemetry you want to collect and from which workloads
>>> - who decides what is worth collecting
>>> - who decides what to drop and when
>>> - alert rules standards, runbooks standards
>>> - dashboards standards
>>> - SLIs, SLOs
>>> - what alerts to route to which teams ?
>>> - which alerts should go to only prod environment/ staging/ dev ?
>>> - data residency and compliance (where can your telemetry be stored
>>> compliance-wise)
>>> - securing your telemetry
>>> - which alerts need to be suppressed
>>> - Cost-management for the stack (THIS IS IMPORTANT !!)
>>>
>>> 2. Incident Management program
>>> - on-call schedules, post-mortems
>>> - notifications channels (slack/teams/pagerduty/phone/email ??)
>>> - incident escalation policies (Manager on Duty ? Engineer on Duty ?)
>>> - which incidents to ignore on weekends ?
>>> - customer relations -> who contacts the client during an incident & at
>>> what point during the incident timeline?
>>>
>>> 3. Meta-monitoring
>>> - monitoring your Monitoring/Observability stack
>>> - implementing DR and resiliency
>>>
>>> PS: I scratched the surface here but this will probably answer you 80%
>>> of the way.
>>>
>>> On Mon 1. 6. 2026 at 12:40, Piotr Wargulak via dev <
>>> [email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> This is a slightly different request than usual for this list.
>>>> I want to understand the operations side of running OSS systems in
>>>> production, not the build side. Not necessarily Fineract specifically; any
>>>> OSS stack you're actually keeping alive day to day.
>>>>
>>>> A few things I'm trying to figure out: how do you know your system's OK
>>>> right now? Dashboards, alerts, someone checking every morning, or you find
>>>> out when users complain?
>>>> When something breaks, how do you find out, and how long until you know
>>>> what actually went wrong? Who handles it on the ground: an internal team, a
>>>> vendor, or that one person who built it?
>>>>
>>>> It would be most useful to hear from smaller operators (MFIs, credit
>>>> unions, smaller fintechs running Fineract or similar) where Datadog-tier
>>>> tooling isn't in the budget and there's no dedicated SRE team.
>>>>
>>>> For context, I'm at SolDevelo. We do OSS work, and we're thinking about
>>>> how to better support organizations running these kinds of systems in
>>>> production. I'd rather understand how the day-to-day actually looks first
>>>> than guess.
>>>>
>>>> Reply on the list if it's a short answer, or DM me / schedule a 15
>>>> minute Google Meet if you want to talk it through. If a few people share,
>>>> I'll write up an anonymized summary and send it back here.
>>>>
>>>> Best regards,
>>>> PiotrW
>>>> SolDevelo
>>>>
>>>>
>>>> *SolDevelo* Sp. z o.o. [LLC] / www.soldevelo.com
>>>> Al. Zwycięstwa 96/98
>>>> <https://www.google.com/maps/search/Al.+Zwyci%C4%99stwa+96%2F98?entry=gmail&source=g>,
>>>> 81-451, Gdynia, Poland
>>>> Phone: +48 58 782 45 40 / Fax: +48 58 782 45 41
>>>>
