Hello all,

@Edmore Tshuma: Thanks for the detailed answer last week, that gave me a
good list of what a mature observability setup should cover.

I'm still hoping to hear from people who actually run Fineract (or similar
OSS) in production. Not how it should be done in theory, but how you
actually do it day to day.

If a full conversation feels like too much, here's a smaller question:
what's the first thing you check in the morning to see if the system is OK?
A dashboard, a log, a Slack channel, or do you "wait for someone to
complain"?

I'm especially interested in hearing from operators who lack a budget for a
dedicated SRE team.

Best regards,
PiotrW
SolDevelo

On Mon, Jun 1, 2026 at 1:08 PM Edmore Tshuma <[email protected]> wrote:

> Observability/SRE expert here :)
>
> Well… it’s quite a lot.
>
> At a minimum you need:
>
> 1. A minimal full-fledged observability stack (can be pure OSS tools)
> that covers:
>
> - which telemetry you want to collect and from which workloads
> - who decides what is worth collecting
> - who decides what to drop and when
> - alert rule standards, runbook standards
> - dashboard standards
> - SLIs, SLOs
> - which alerts to route to which teams
> - which alerts should fire in prod only, and which also in staging/dev
> - data residency and compliance (where your telemetry may be stored)
> - securing your telemetry
> - which alerts need to be suppressed
> - cost management for the stack (THIS IS IMPORTANT!!)
>
> 2. Incident management program
> - on-call schedules, post-mortems
> - notification channels (Slack/Teams/PagerDuty/phone/email?)
> - incident escalation policies (Manager on Duty? Engineer on Duty?)
> - which incidents to ignore on weekends
> - customer relations: who contacts the client during an incident, and at
> what point in the incident timeline?
>
> 3. Meta-monitoring
> - monitoring your Monitoring/Observability stack
> - implementing DR and resiliency
>
> PS: I've only scratched the surface here, but this will probably get you
> 80% of the way.
>
>
>
> On Mon 1. 6. 2026 at 12:40, Piotr Wargulak via dev <
> [email protected]> wrote:
>
>> Hi all,
>>
>> This is a slightly different request than usual for this list.
>> I want to understand the operations side of running OSS systems in
>> production, not the build side. Not necessarily Fineract specifically; any
>> OSS stack you're actually keeping alive day to day.
>>
>> A few things I'm trying to figure out: how do you know your system's OK
>> right now? Dashboards, alerts, someone checking every morning, or you find
>> out when users complain?
>> When something breaks, how do you find out, and how long until you know
>> what actually went wrong? Who handles it on the ground: an internal team, a
>> vendor, or that one person who built it?
>>
>> It would be most useful to hear from smaller operators (MFIs, credit
>> unions, smaller fintechs running Fineract or similar) where Datadog-tier
>> tooling isn't in the budget and there's no dedicated SRE team.
>>
>> For context, I'm at SolDevelo. We do OSS work, and we're thinking about
>> how to better support organizations running these kinds of systems in
>> production. I'd rather understand how the day-to-day actually looks first
>> than guess.
>>
>> Reply on the list if it's a short answer, or DM me / schedule a 15 minute
>> Google Meet if you want to talk it through. If a few people share, I'll
>> write up an anonymized summary and send it back here.
>>
>> Best regards,
>> PiotrW
>> SolDevelo
>>
>>
>> *SolDevelo* Sp. z o.o. [LLC] / www.soldevelo.com
>> Al. Zwycięstwa 96/98, 81-451, Gdynia, Poland
>> Phone: +48 58 782 45 40 / Fax: +48 58 782 45 41
>>
>
