> On 10 Dec 2015, at 15:14, Sage Weil wrote:
>
> On Thu, 10 Dec 2015, Jan Schermer wrote:
Removing a snapshot means looking for every *potential* object the snapshot can
have, and this takes a very long time (a 6TB snapshot will consist of 1.5M
objects (in one replica) assuming the default 4MB object size). The same
applies to large thin volumes (don't try creating and then dropping a 1 TB
thin volume).
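The arithmetic behind that 1.5M figure, as a quick sketch using the default 4 MB object size:

```shell
# 6 TB volume divided into 4 MB RADOS objects (per replica)
echo $(( 6 * 1024 * 1024 / 4 ))   # 1572864, i.e. ~1.5M objects
```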
Just try to give the booting OSD and all MONs the resources they ask for (CPU,
memory).
Yes, it causes disruption but only for a select group of clients, and only for
a moment (<20s with my extremely high number of PGs).
From a service provider perspective this might break SLAs.
You can setup logrotate however you want - not sure what the default is for
your distro.
Usually logrotate doesn't touch files that are smaller than some size even if
they are old. It will also not delete logs for OSDs that no longer exist.
Ceph itself has nothing to do with log rotation - logrotate (or whatever your
distro uses) handles that.
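For reference, a minimal logrotate stanza of the kind distros ship for Ceph might look like this (illustrative only - the exact paths, directives, and postrotate signal depend on your distro's package; SIGHUP makes the daemons reopen their log files):

```
/var/log/ceph/*.log {
    rotate 7
    daily
    compress
    missingok
    notifempty
    sharedscripts
    postrotate
        killall -q -1 ceph-mon ceph-osd ceph-mds || true
    endscript
}
```

The `notifempty`/size-based directives are what cause small-but-old files to be skipped, as described above.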
Have you tried running iperf between the nodes? Capturing a pcap of the
(failing) Ceph comms from both sides could help narrow it down.
Is there any SDN layer involved that could add overhead/padding to the frames?
What about some intermediate MTU like 8000 - does that work?
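A quick way to test an intermediate MTU end-to-end is to send unfragmentable pings that exactly fill it (the arithmetic below; the iperf/ethtool commands in the comments are the manual diagnostics mentioned above, with placeholder hostnames/interfaces):

```shell
# Largest ICMP payload for a given MTU: MTU minus 20 (IP header) minus 8 (ICMP header)
MTU=8000
echo $(( MTU - 28 ))              # 7972 -> test with: ping -M do -s 7972 <peer>
# Other diagnostics to run by hand (names are placeholders):
#   iperf3 -s                     # on one node
#   iperf3 -c <peer> -t 10        # on the other
#   ethtool -S eth0 | grep -iE 'err|drop|pause'
```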
Are there any errors on the NICs? (ethtool -S ethX)
Also take a look at the switch and look for flow control statistics - do you
have flow control enabled or disabled?
We had to disable flow control because it would pause all IO on the port
whenever any path got congested, which you don't want to happen.
> On 09/09/2015 10:54, "ceph-devel-ow...@vger.kernel.org on behalf of Jan
> Schermer"
> wrote:
>
I looked at THP before. It comes enabled on RHEL6 and on our KVM hosts it
merges a lot (~300GB hugepages on a 400GB KVM footprint).
I am probably going to disable it and see if it introduces any problems for me
- the most important gain here is better utilization of the processor's memory
lookup tables (TLB cache).
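As a back-of-the-envelope check on those numbers (the sysfs path in the comments is the upstream one; RHEL 6 uses a redhat_-prefixed directory, so verify on your kernel):

```shell
# ~300 GB of guest memory collapsed into 2 MB transparent hugepages:
echo $(( 300 * 1024 / 2 ))   # 153600 hugepages backing the 400 GB KVM footprint
# To inspect/disable THP at runtime (upstream path; needs root):
#   cat /sys/kernel/mm/transparent_hugepage/enabled
#   echo never > /sys/kernel/mm/transparent_hugepage/enabled
```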
Could someone clarify what the impact of this bug is?
We did increase pg_num/pgp_num and we are on dumpling (0.67.12 unofficial
snapshot).
Most of our clients are likely restarted already, but not all. Should we be
worried?
Thanks
Jan
> On 11 Aug 2015, at 17:31, Dan van der Ster wrote:
>
Hi,
comments inline.
> On 05 Aug 2015, at 05:45, Jevon Qiao wrote:
>
> Hi Jan,
>
> Thank you for the detailed suggestion. Please see my reply in-line.
> On 5/8/15 01:23, Jan Schermer wrote:
I think I wrote about my experience with this about 3 months ago, including
what techniques I used to minimize impact on production.
Basically we had to
1) increase pg_num in small increments only, because creating the placement
groups themselves caused slow requests on OSDs
2) increase pgp_num in small increments as well
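The steps above can be sketched as follows (the pool name "rbd" and the step sizes are placeholders, not from the original thread; run each step only after the previous PGs finish creating and peering):

```shell
# step pg_num/pgp_num up gradually instead of jumping straight to the target
for pg in 6144 8192 12288 16384; do
  echo "ceph osd pool set rbd pg_num $pg    # wait for PG creation to finish"
  echo "ceph osd pool set rbd pgp_num $pg   # then let rebalancing settle"
done
```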
Not at all.
We have this: http://ceph.com/docs/master/releases/
I would expect that whatever distribution I install Ceph LTS release on will
be supported for the time specified.
That means if I install Hammer on CentOS 6 now it will stay supported
until 3Q/2016.
Of course if in the meantime the distribution itself goes EOL, that's a
different matter.
I understand your reasons, but dropping support for LTS release like this
is not right.
You should properly support every distribution the LTS release could have
ever been installed on - that’s what the LTS label is for and what we rely on
once we build a project on top of it.
CentOS 6 in particular is still widely deployed.
Remember that with pg_num, once you go up you can’t go down.
Jan
> On 01 Jun 2015, at 10:57, huang jun wrote:
>
> hi,jan
>
> 2015-06-01 15:43 GMT+08:00 Jan Schermer :
We had to disable deep scrub or the cluster would be unusable - we need to turn
it back on sooner or later, though.
With minimal scrubbing and recovery settings, everything is mostly good. It
turned out many of the issues we had were due to too few PGs - once we
increased them from 4K to 16K everything sped up.
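For context, "minimal scrubbing and recovery settings" means throttles of this kind in ceph.conf (option names as they existed around dumpling; the values are illustrative, verify against your version's defaults):

```
[osd]
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 1
osd scrub load threshold = 0.5
osd deep scrub interval = 2592000   ; 30 days
```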