Hello!
I can only answer some of your questions:
- The backfill process obeys a "nearfull_ratio" limit (I think it defaults
to 85%); above that the system will stop repairing itself, so it won't
fill up to 100%.
- Normal write ops obey a "full_ratio" too (I think the default is 95%);
above that, no write I/O is accepted.
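To illustrate what those two thresholds mean in practice, here is a quick sketch (the 100 TiB raw capacity is an illustrative assumption, and the 85%/95% figures are the defaults mentioned above — your cluster may be configured differently):

```python
# What the (assumed) default ratios mean for an illustrative 100 TiB raw cluster.
raw_tib = 100.0          # illustrative raw capacity, not from the thread
nearfull_ratio = 0.85    # point where backfill/repair stops (per the answer above)
full_ratio = 0.95        # point where client write I/O stops
size = 2                 # replication factor used on this cluster

recovery_ceiling_tib = raw_tib * nearfull_ratio   # raw usage before repair halts
write_ceiling_tib = raw_tib * full_ratio          # raw usage before writes halt
logical_capacity_tib = write_ceiling_tib / size   # logical data stored at the write ceiling

print(recovery_ceiling_tib, write_ceiling_tib, logical_capacity_tib)
# 85.0 95.0 47.5
```

Note that with size=2 every logical byte occupies two raw bytes, so the usable logical capacity is half the raw write ceiling.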
okay another day another nightmare ;-)
So far we discussed pools as bundles of:
- pool 1) 15 HDD-OSDs (25 physical HDDs in total: 5 single HDDs and ten
raid0 pairs, as mentioned before)
- pool 2) 6 SSD-OSDs
unfortunately (well) on the "physical" pool 1 there are two "logical"
pools
Quoting tim taler (robur...@gmail.com):
> And I'm still puzzled about the implication of the cluster size on the
> amount of OSD failures.
> With size=2 min_size=1 one host could die and (if by chance there is
> NO read error on any bit on the living host) I could (theoretically)
> recover, is
Yep, you are correct, thanks!
On 04.12.2017 19:18, tim taler wrote:
In size=2, losing any 2 discs on different hosts would probably cause data to
be unavailable / lost, as the PG copies are randomly distributed across the
OSDs. Chances are that you can find a PG whose acting set is the two
failed OSDs (you lost all your replicas).
"The journals can only be moved back by a complete rebuild of that osd as to
my knowledge."
I'm assuming that since this is a cluster that he's inherited and that it's
configured like this that it's probably not running luminous or bluestore
OSDs. Again more information needed about your cluster
> In size=2, losing any 2 discs on different hosts would probably cause data to
> be unavailable / lost, as the PG copies are randomly distributed across the
> OSDs. Chances are that you can find a PG whose acting set is the two
> failed OSDs (you lost all your replicas)
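That risk can be put into rough numbers. A sketch, assuming uniform PG placement across 2 hosts with one replica per host (the OSD and PG counts are illustrative, not taken from the thread):

```python
# Back-of-the-envelope: expected number of PGs losing both replicas
# when one specific OSD on each host dies (size=2, one replica per host).
# All numbers are illustrative assumptions.
osds_per_host = 15
pg_count = 1024

# Each PG maps to one (OSD on host A, OSD on host B) pair; with uniform
# placement every pair is equally likely to be a PG's acting set.
possible_pairs = osds_per_host * osds_per_host
expected_lost_pgs = pg_count / possible_pairs

print(round(expected_lost_pgs, 2))  # roughly 4.55
```

In other words, with any realistic PG count the expected number of PGs mapped to a given failed pair is well above zero, which is why losing one disc per host almost certainly loses data at size=2.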
okay I see, getting
Hi,
I would not rip out the discs, but I would reweight the OSD to 0, wait
for the cluster to reconfigure, and when it is done you can remove the
disc / raid pair without ever going down to 1 copy only.
The journals can only be moved back by a complete rebuild of that OSD,
as to my knowledge.
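The drain-first approach described above would look roughly like this (a sketch; osd.12 is a hypothetical OSD id, and the commands assume an admin keyring on the node):

```shell
# Drain the OSD gradually instead of ripping the disc out.
ceph osd crush reweight osd.12 0   # stop new data landing here; triggers backfill
ceph -s                            # wait until all PGs are active+clean again
ceph osd out 12                    # mark the OSD out
systemctl stop ceph-osd@12         # stop the daemon
ceph osd crush remove osd.12       # remove it from the CRUSH map
ceph auth del osd.12               # remove its auth key
ceph osd rm 12                     # remove it from the osdmap
```

The key point is waiting for `ceph -s` to report a healthy state between the reweight and the removal, so the pool never drops below its replica count.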
Flushing a journal and creating a new journal device before turning the
OSD on is viable and simple enough to do. Moving a raid0 when the new
host doesn't have the same controller wouldn't be recommended, for obvious
reasons. That would change my recommendation for how to distribute the
OSDs,
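For FileStore OSDs, the flush-and-recreate sequence mentioned here is roughly the following (a sketch; OSD id 12 and the journal device path are hypothetical placeholders):

```shell
# Stop the OSD, flush its journal, repoint it at a new journal device,
# create a fresh journal, and restart. FileStore only.
systemctl stop ceph-osd@12
ceph-osd -i 12 --flush-journal
# Repoint the journal symlink at the new partition (path is a placeholder)
ln -sf /dev/disk/by-partuuid/XXXX /var/lib/ceph/osd/ceph-12/journal
ceph-osd -i 12 --mkjournal
systemctl start ceph-osd@12
```

The flush step is what makes this safe: it writes any pending journal entries to the OSD's data store before the old journal partition is abandoned.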
thnx a lot again,
makes sense to me.
We have all journals of the HDD-OSDs on partitions on an extra
SSD-raid1 (each OSD got its own journal partition on that raid1),
but as I understand they could be moved back to the OSDs, at least for
the time of the restructuring.
What makes my tummy turn
Hi,
On 12/04/2017 12:12 PM, tim taler wrote:
Hi,
thnx a lot for the quick response
and for laying out some of the issues
I'm also new, but I'll try to help. IMHO most of the pros here would be quite
worried about this cluster if it is production:
thought so ;-/
-A prod ceph cluster
Your current node configuration cannot do size=3 for any pools. You only
have 2 hosts with HDDs and 2 hosts with SSDs in each root. You cannot put
3 copies of data for an HDD pool on 3 separate nodes when you only have 2
nodes with HDDs... In this configuration, size=2 is putting a copy of the
Hi,
I'm also new, but I'll try to help. IMHO most of the pros here would be
quite worried about this cluster if it is production:
-A prod ceph cluster should not be run with size=2 min_size=1, because:
--In case of a downed OSD / host the cluster could have problems
determining which data
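Checking and changing the replication settings discussed here is a one-liner per pool (a sketch; the pool name "rbd" is a placeholder, and raising size=3 only makes sense once a third failure domain exists):

```shell
# Inspect the current replication settings of a pool (name is a placeholder)
ceph osd pool get rbd size
ceph osd pool get rbd min_size
# Raise them once there are enough hosts to hold 3 separate copies
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2
```

With min_size=2 the pool stops accepting I/O when only one copy is left, which avoids the split-brain-style ambiguity described above.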
Hi
I'm new to ceph but have the honor of looking after a cluster that I
haven't set up myself.
Rushing to the ceph docs and having a first glimpse at our cluster, I'm
starting to worry about our setup,
so I need some advice and guidance here.
The set up is:
3 machines, each running a ceph-monitor.
all