On Wed,  5 Sep 2018, John Spray said:
> On Wed, Sep 5, 2018 at 8:38 AM Marc Roos <m.r...@f1-outsourcing.eu> wrote:
> >
> >
> > The advised solution is to upgrade ceph only in a HEALTH_OK state. And I
> > also read somewhere that it is bad to have your cluster in a HEALTH_ERR
> > state for a long time.
> >
> > But why is this bad?

See https://ceph.com/community/new-luminous-pg-overdose-protection
under "Problems with past intervals"

"if the cluster becomes unhealthy, and especially if it remains unhealthy for 
an extended period of time, a combination of effects can cause problems."

"If a cluster is unhealthy for an extended period of time (e.g., days or even 
weeks), the past interval set can become large enough to require a significant 
amount of memory."


Sean
 
> Aside from the obvious (errors are bad things!), many people have
> external monitoring systems that will alert them on the transitions
> between OK/WARN/ERR.  If the system is stuck in ERR for a long time,
> they are unlikely to notice new errors or warnings.  These systems can
> accumulate faults without the operator noticing.
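
A minimal sketch of the point above, assuming a hypothetical monitor that polls the health status and alerts only when it changes. The function name and the poll history are made up for illustration and are not taken from any real monitoring tool.

```python
def transitions(statuses):
    """Return (index, old, new) for each change between consecutive polls."""
    events = []
    for i in range(1, len(statuses)):
        if statuses[i] != statuses[i - 1]:
            events.append((i, statuses[i - 1], statuses[i]))
    return events


# Hypothetical poll history: the cluster reaches HEALTH_ERR and stays there.
history = ["HEALTH_OK", "HEALTH_WARN", "HEALTH_ERR", "HEALTH_ERR", "HEALTH_ERR"]
alerts = transitions(history)

# Any new fault that appears during the last polls raises no alert here,
# because the overall status does not change while it is already HEALTH_ERR.
for index, old, new in alerts:
    print(f"poll {index}: {old} -> {new}")
```

With this kind of alerting, only the first two changes are reported; anything that breaks while the status is pinned at HEALTH_ERR goes unannounced.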
> 
> > Why is this bad during upgrading?
> 
> It depends what's gone wrong.  For example:
> - If your cluster is degraded (fewer than the desired number of replicas
> of data) then taking more services offline (even briefly) to do an
> upgrade will create greater risk to the data by reducing the number of
> copies available.
> - If your system is in an error state because something has gone bad
> on disk, then recovering it with the same software that wrote the data
> is a more tested code path than running some newer code against a
> system left in a strange state by an older version.
> 
> There will always be exceptions to this (e.g. where the upgrade is the
> fix for whatever caused the error), but the general purpose advice is
> to get a system nice and clean before starting the upgrade.
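
One way to apply this advice is to gate an upgrade script on the cluster health. The sketch below assumes the JSON printed by `ceph health --format json` carries the state in a "status" field (with "overall_status" on older releases); verify the field names against your own release before relying on it. The helper names are hypothetical.

```python
import json
import subprocess


def health_status(raw_json):
    """Extract the overall health string from `ceph health --format json` output."""
    report = json.loads(raw_json)
    # Assumption: newer releases use "status", older ones "overall_status".
    return report.get("status") or report.get("overall_status")


def safe_to_upgrade(raw_json):
    """Follow the general advice: proceed only when the cluster is HEALTH_OK."""
    return health_status(raw_json) == "HEALTH_OK"


def read_cluster_health():
    """Run the ceph CLI and return its JSON output (requires a reachable cluster)."""
    result = subprocess.run(
        ["ceph", "health", "--format", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Example with canned output, so the sketch runs without a cluster.
sample_ok = '{"status": "HEALTH_OK", "checks": {}}'
sample_err = '{"status": "HEALTH_ERR", "checks": {}}'
print(safe_to_upgrade(sample_ok), safe_to_upgrade(sample_err))
```

This deliberately ignores the exceptions mentioned above, such as an upgrade that is itself the fix; in that case the operator would override the check by hand.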
> 
> John
> 
> > Can I quantify how bad it is (for example, by the size of a log/journal file)?
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com