Hi all,
As far as I understand, the monitor stores will grow while not HEALTH_OK as
they hold onto all cluster maps. Is this true for all HEALTH_WARN reasons? Our
cluster recently went into HEALTH_WARN due to a few weeks of backfilling onto
new hardware pushing the monitors' data stores over
question about HEALTH_WARN and monitors holding onto cluster maps
> On 05/17/2018 04:37 PM, Thomas Byrne - UKRI STFC wrote:
> > Hi all,
> >
> >
> >
> > As far as I understand, the monitor stores will grow while not
> > HEALTH_OK as they
it with its full weight and having everything move at once.
On Thu, May 17, 2018 at 12:56 PM Thomas Byrne - UKRI STFC
<tom.by...@stfc.ac.uk<mailto:tom.by...@stfc.ac.uk>> wrote:
That seems like a sane way to do it, thanks for the clarification Wido.
As a follow-up, do you
Assuming I understand it correctly:
"pg_upmap_items 6.0 [40,20]" refers to replacing (upmapping?) osd.40 with
osd.20 in the acting set of the placement group '6.0'. Assuming it's a 3
replica PG, the other two OSDs in the set remain unchanged from the CRUSH
calculation.
"pg_upmap_items 6.6
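The replacement behaviour described above can be sketched in a few lines. This is an illustrative model, not Ceph code: the helper name and the sample up set are hypothetical, and the flat pair list mirrors the `[from, to, ...]` form shown in the OSDMap dump.

```python
# Illustrative sketch of how a pg_upmap_items entry remaps a PG's OSD set.
# A pair list like [40, 20] means "wherever osd.40 appears, use osd.20
# instead"; the other members of the set are left as CRUSH computed them.

def apply_upmap_items(up_set, pairs):
    """Apply pg_upmap_items pairs to a CRUSH-computed OSD set.

    pairs is a flat list [from1, to1, from2, to2, ...].
    """
    mapping = dict(zip(pairs[0::2], pairs[1::2]))
    return [mapping.get(osd, osd) for osd in up_set]

# Hypothetical 3-replica PG computed by CRUSH as [40, 3, 7],
# with the upmap entry [40, 20] from the example above:
print(apply_upmap_items([40, 3, 7], [40, 20]))  # → [20, 3, 7]
```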
I recently spent some time looking at this, I believe the 'summary' and
'overall_status' sections are now deprecated. The 'status' and 'checks' fields
are the ones to use now.
The 'status' field gives you the OK/WARN/ERR, but returning the most severe
error condition from the 'checks' section
> In previous versions of Ceph, I was able to determine which PGs had
> scrub errors, and then a cron.hourly script ran "ceph pg repair" for them,
> provided that they were not already being scrubbed. In Luminous, the bad
> PG is not visible in "ceph --status" anywhere. Should I use
For what it's worth, I think the behaviour Pardhiv and Bryan are describing is
not quite normal, and sounds similar to something we see on our large luminous
cluster with elderly (created as jewel?) monitors. After large operations which
result in the mon stores growing to 20GB+, leaving the
June 2019 17:30
To: Byrne, Thomas (STFC,RAL,SC)
Cc: ceph-users
Subject: Re: [ceph-users] OSDs taking a long time to boot due to
'clear_temp_objects', even with fresh PGs
On Mon, Jun 24, 2019 at 9:06 AM Thomas Byrne - UKRI STFC
wrote:
>
> Hi all,
>
>
>
> Some bluestore OSDs in
Hi all,
Some bluestore OSDs in our Luminous test cluster have started becoming
unresponsive and booting very slowly.
These OSDs have been used for stress testing for hardware destined for our
production cluster, so have had a number of pools on them with many, many
objects in the past.
Hi Torben,
> Is it allowed to have the scrub period cross midnight ? eg have start time at
> 22:00 and end time 07:00 next morning.
Yes, I think that's the way it is mostly used, primarily to reduce the
scrub impact during waking/working hours.
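For reference, a window crossing midnight is expressed by setting the begin hour later than the end hour. A sketch using the standard scrub scheduling options (values illustrative, matching the 22:00-07:00 example):

```ini
[osd]
osd_scrub_begin_hour = 22
osd_scrub_end_hour = 7
```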
> I assume that if you only configure the
Hi all,
I'm investigating an issue with the (non-Ceph) caching layers of our large EC
cluster. It seems to be turning users' requests for whole objects into lots of
small byte range requests reaching the OSDs, but I'm not sure how inefficient
this behaviour is in reality.
My limited
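To give a feel for why small byte-range reads can be inefficient on EC pools: an object's bytes are striped in stripe-unit-sized pieces across the k data chunks, so a range read only needs the shards whose stripe units overlap the range. This is an illustrative model under assumed parameters (the function name, the 4 KiB stripe unit, and k=8 are hypothetical), not Ceph code:

```python
# Illustrative sketch: with k data chunks and a stripe unit of stripe_unit
# bytes, object bytes are laid out round-robin across the k data shards.
# A small range read touches only the shards its stripe units fall on;
# a read spanning a full stripe touches every data shard.

def chunks_for_range(offset, length, k, stripe_unit=4096):
    """Return the set of data-chunk indices a byte range touches."""
    first = offset // stripe_unit
    last = (offset + length - 1) // stripe_unit
    return {s % k for s in range(first, last + 1)}

print(sorted(chunks_for_range(0, 4096, k=8)))   # one stripe unit, one shard
print(sorted(chunks_for_range(0, 65536, k=8)))  # spans full stripes, all shards
```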
Sent: 09 September 2019 23:25
> To: Byrne, Thomas (STFC,RAL,SC)
> Cc: ceph-users
> Subject: Re: [ceph-users] Help understanding EC object reads
>
> On Thu, Aug 29, 2019 at 4:57 AM Thomas Byrne - UKRI STFC
> wrote:
> >
> > Hi all,
> >
> > I’m investigating an is
As a counterpoint, adding large amounts of new hardware in gradually (or more
specifically in a few steps) has a few benefits IMO.
- Being able to pause the operation and confirm the new hardware (and cluster)
is operating as expected. You can identify problems with hardware with OSDs at
10%