[ceph-users] osd_scrub_max_preemptions for large OSDs or large EC pgs

2021-06-16 Thread Dave Hall
Hello,

I would like to ask about osd_scrub_max_preemptions in 14.2.20 for large
OSDs (mine are 12TB) and/or large k+m EC pools (mine are 8+2).  I searched
the archives of this list, but I did not see any reference.

Symptoms:

I have been seeing a behavior in my cluster over the past 2 or 3 weeks
where, for no apparent reason, there are suddenly slow ops, followed by a
brief OSD down, massive but brief degradation/activating/peering, and then
back to normal.

I had thought this might have to do with some backfill activity due to a
recently failed OSD (as in down and out, and the process wouldn't start), but
now all of that is over and the cluster is mostly back to HEALTH_OK.

Thinking this might be something that was introduced between 14.2.9 and
14.2.16, I upgraded to 14.2.20 this morning.  However, I just saw the same
kind of event happen twice again.  At the time, the only non-client
activity was a single deep-scrub.

Question:

The description for osd_scrub_max_preemptions indicates that a deep scrub
process will allow itself to be preempted a fixed number of times by client
I/O and will then block client I/O until it finishes.  Although I don't
fully understand the deep scrub process, it seems that either the size of
the HDD or the k+m count of the EC Pool could affect the time needed to
complete a deep scrub and thus increase the likelihood that more than the
default 5 preemptions will occur.

Please tell me if my understanding is correct.  If so, is there any
guideline for increasing osd_scrub_max_preemptions just enough to balance
scrub progress against client responsiveness?

Or perhaps there are other scrub attributes that should be tuned instead?
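
For reference, the kind of runtime adjustments I have in mind look roughly
like this (a sketch only; the values are placeholders rather than
recommendations, and I would watch client latency after each change):

# current values on one OSD
ceph config get osd.0 osd_scrub_max_preemptions
ceph config get osd.0 osd_deep_scrub_interval

# raise the preemption budget cluster-wide, e.g. from the default 5 to 10
ceph config set osd osd_scrub_max_preemptions 10

# other scrub-related knobs that seem relevant
ceph config set osd osd_scrub_sleep 0.1
ceph config set osd osd_scrub_load_threshold 0.5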

Thanks.

-Dave

--
Dave Hall
Binghamton University
kdh...@binghamton.edu
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Strange (incorrect?) upmap entries in OSD map

2021-06-16 Thread Andras Pataki
I've been working on some improvements to our large cluster's space 
balancing, when I noticed that sometimes the OSD maps have strange upmap 
entries.  Here is an example on a clean cluster (PGs are active+clean):


    {
    "pgid": "1.1cb7",
...
    "up": [
    891,
    170,
    1338
    ],
    "acting": [
    891,
    170,
    1338
    ],
...
    },

with an upmap entry:

pg_upmap_items 1.1cb7 [170,891]

this would make the "up" list [ 170, 170, 1338 ], which isn't allowed.  
So the cluster just seems to ignore this upmap.  When I remove the 
upmap, nothing changes in the PG state, and I can even re-insert it 
(without any effect).  Any ideas why this upmap doesn't simply get 
rejected/removed?


However, if I were to insert an upmap [170, 892], it gets rejected 
correctly (since 891 and 892 are on the same host - violating crush rules).
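
For completeness, the commands I'm using to inspect and toggle the entry (a
quick sketch, using the PG from the example above):

# list the upmap exceptions currently in the osdmap
ceph osd dump | grep pg_upmap_items

# remove the entry for this PG
ceph osd rm-pg-upmap-items 1.1cb7

# re-insert the same pair as printed by "ceph osd dump"
ceph osd pg-upmap-items 1.1cb7 170 891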


Any insights would be helpful,

Andras
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: JSON output schema

2021-06-16 Thread Vladimir Prokofev
This is a great start, thank you! Basically I can look through the code to
get the keys I need.
But maybe I'm approaching this task wrong? Maybe there's already some
better solution to monitor cluster health details?
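
For now I'm leaning towards treating every key as optional and supplying
defaults, e.g. with jq (a sketch only; the exact key names per release are an
assumption I still need to verify):

# missing counters become 0 instead of breaking the parser
ceph status -f json | jq -r '.pgmap.read_bytes_sec // 0'

# the health key moved between releases, so try both variants
ceph status -f json | jq -r '.health.status // .health.overall_status // "UNKNOWN"'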

Wed, 16 Jun 2021 at 02:47, Anthony D'Atri :

> Before Luminous, mon clock skew was part of the health status JSON.  With
> Luminous and later releases, one has to invoke a separate command to get
> the info.
>
> This is a royal PITA for monitoring / metrics infrastructure and I’ve
> never seen a reason why it was done.
>
> You might find the code here
> https://github.com/digitalocean/ceph_exporter
> useful.  Note that there are multiple branches, which can be confusing.
>
> > On Jun 15, 2021, at 4:21 PM, Vladimir Prokofev  wrote:
> >
> > Good day.
> >
> > I'm writing some code for parsing output data for monitoring purposes.
> > The data is that of "ceph status -f json", "ceph df -f json", "ceph osd
> > perf -f json" and "ceph osd pool stats -f json".
> > I also need support for all major Ceph releases, from Jewel through
> > Pacific.
> >
> > What I've stumbled upon is that:
> > - keys in JSON output are not present if there's no appropriate data.
> > For example the key ['pgmap', 'read_bytes_sec'] will not be present in
> > "ceph status" output if there's no read activity in the cluster;
> > - some keys changed between versions. For example ['health']['status']
> key
> > is not present in Jewel, but is available in all the following versions;
> > vice-versa, key ['osdmap', 'osdmap'] is not present in Pacific, but is in
> > all the previous versions.
> >
> > So I need to get a list of all possible keys for all Ceph releases. Any
> > ideas how this can be achieved? My only thought atm is to build a "failing"
> > cluster with all the possible states and get reference data out of it.
> > Not only is this tedious work, since it requires each possible cluster
> > version, but it is also prone to error.
> > Is there any publicly available JSON schema for output?
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph osd df return null

2021-06-16 Thread Konstantin Shalygin
Perhaps these OSDs are offline / out?
Please, upload your `ceph osd df tree` & `ceph osd tree` to pastebin


Thanks,
k

> On 16 Jun 2021, at 10:43, julien lenseigne  
> wrote:
> 
> When I do ceph osd df,
> 
> some OSDs return a null size. For example:
> 
>  0   hdd  7.27699  1.0  0B  0B  0B  0B 0B  0B 0    0  29
>  1   hdd  7.27698  1.0  0B  0B  0B  0B 0B  0B 0    0  23
> 11   ssd  0.5  1.0  0B  0B  0B  0B 0B  0B 0    0  49
>  2   hdd  7.27699  1.0  0B  0B  0B  0B 0B  0B 0    0  30
>  4   hdd  7.27699  1.0  0B  0B  0B  0B 0B  0B 0    0  37
> 12   ssd  0.5  1.0  0B  0B  0B  0B 0B  0B 0    0  39
>  7   hdd 24.55698  1.0 24.6TiB  663GiB  663GiB  0B  0B 23.9TiB  2.64 0.79 302
> 42   hdd 10.69179  1.0 10.7TiB  346GiB  344GiB 30.0MiB 1.13GiB 10.4TiB  3.16 0.95  98
> 43   hdd 10.69179  1.0 10.7TiB  357GiB  355GiB 49.1MiB 1.17GiB 10.3TiB  3.26 0.98 131
> 44   hdd 10.69179  1.0 10.7TiB  298GiB  297GiB 30.4MiB 1.05GiB 10.4TiB  2.72 0.81  92
> 45   hdd 10.69179  1.0 10.7TiB  342GiB  341GiB 26.6MiB 1.11GiB 10.4TiB  3.12 0.94 105
> 40   ssd  0.87270  1.0  894GiB  180GiB  179GiB 33.5MiB 990MiB  714GiB 20.10 6.02  53
> 41   ssd  0.87270  1.0  894GiB  254GiB  253GiB 39.0MiB 1.00GiB  640GiB 28.38 8.50  61
> 46   ssd  0.87270  1.0  894GiB  255GiB  254GiB 26.8MiB 1.03GiB  639GiB 28.55 8.55  76
>  3   hdd 24.55699  1.0 24.6TiB  537GiB  536GiB 67.9MiB 1.36GiB 24.0TiB  2.14 0.64 252
> 
> Do you know where this might come from?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph monitor won't start after Ubuntu update

2021-06-16 Thread Petr
Hello Konstantin,

Wednesday, June 16, 2021, 1:50:55 PM, you wrote:

> Hi,

>> On 16 Jun 2021, at 01:33, Petr  wrote:
>> 
>> I've upgraded my Ubuntu server from 18.04.5 LTS to Ubuntu 20.04.2 LTS via 
>> 'do-release-upgrade',
>> during that process ceph packages were upgraded from Luminous to Octopus and 
>> now ceph-mon daemon(I have only one) won't start, log error is:
>> "2021-06-15T20:23:41.843+ 7fbb55e9b540 -1 mon.target@-1(probing) e2 
>> current monmap has recorded min_mon_release 12 (luminous) is >2 releases 
>> older than installed 15 (octopus);
>> you can only upgrade 2 releases at a time you should first upgrade to 13 
>> (mimic) or 14 (nautilus) stopping."
>> 
>> Is there any way to get cluster running or at least get data from OSDs?

> Ceph does not support +3 release upgrades, only +1 or +2.
Yep, already got that.

> I suggest installing the Nautilus packages and starting the cluster again.
I would like to, but there are no Nautilus packages for Ubuntu 20.04 (focal).
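
In the meantime I'll at least try to confirm what the mon store has recorded,
roughly like this (a sketch only, untested on this box; the mon id and path
are mine, and the mon must not be running):

ceph-mon -i target --extract-monmap /tmp/monmap
monmaptool --print /tmp/monmap    # should show min_mon_release among other fields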


> k
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io



-- 
Best regards,
Petr
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Drop old SDD / HDD Host crushmap rules

2021-06-16 Thread Denny Fuchs
Hello,

In one DC I still have, from the beginning, very old crush map rules that split
HDD and SSD disks. They have been obsolete since Luminous (device classes) and
I want to drop them:

# ceph osd crush rule ls

replicated_rule
fc-r02-ssdpool
fc-r02-satapool
fc-r02-ssd

=
[
{
"rule_id": 0,
"rule_name": "replicated_rule",
"ruleset": 0,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 1,
"rule_name": "fc-r02-ssdpool",
"ruleset": 1,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{
"op": "take",
"item": -15,
"item_name": "r02-ssds"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 2,
"rule_name": "fc-r02-satapool",
"ruleset": 2,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{
"op": "take",
"item": -16,
"item_name": "r02-sata"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
},
{
"rule_id": 3,
"rule_name": "fc-r02-ssd",
"ruleset": 3,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{
"op": "take",
"item": -4,
"item_name": "default~ssd"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]

==
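
My plan, unless someone sees a problem with it, is roughly the following (a
sketch; rule names as listed above, and I would of course first check that no
pool references these rules any more):

# which pools still reference which rule
ceph osd pool ls detail | grep crush_rule

# once nothing uses rules 1-3 any more
ceph osd crush rule rm fc-r02-ssdpool
ceph osd crush rule rm fc-r02-satapool
ceph osd crush rule rm fc-r02-ssd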


# ceph osd tree

ID  CLASS WEIGHT   TYPE NAME                STATUS REWEIGHT PRI-AFF
-14        0       root sata
-18        0       datacenter fc-sata
-16        0       rack r02-sata
-13       18.55234 root ssds
-17       18.55234 datacenter fc-ssds
-15       18.55234 rack r02-ssds
 -6        3.09206 host fc-r02-ceph-osd-01
 41  nvme  0.36388     osd.41               up     1.0      1.0
  0   ssd  0.45470     osd.0                up     1.0      1.0
  1   ssd  0.45470     osd.1                up     1.0      1.0
  2   ssd  0.45470     osd.2                up     1.0      1.0
  3   ssd  0.45470     osd.3                up     1.0      1.0
  4   ssd  0.45470     osd.4                up     1.0      1.0
  5   ssd  0.45470     osd.5                up     1.0      1.0
 -2        3.09206 host fc-r02-ceph-osd-02
 36  nvme  0.36388     osd.36               up     1.0      1.0
  6   ssd  0.45470     osd.6                up     1.0      1.0
  7   ssd  0.45470     osd.7                up     1.0      1.0
  8   ssd  0.45470     osd.8                up     1.0      1.0
  9   ssd  0.45470     osd.9                up     1.0      1.0
 10   ssd  0.45470     osd.10               up     1.0      1.0
 29   ssd  0.45470     osd.29               up     1.0      1.0
 -5        3.45593 host fc-r02-ceph-osd-03
 38  nvme  0.36388     osd.38               up     1.0      1.0
 40  nvme  0.36388     osd.40               up     1.0      1.0
 11   ssd  0.45470     osd.11               up     1.0      1.0
 12   ssd  0.45470     osd.12               up     1.0      1.0
 13   ssd  0.45470     osd.13               up     1.0      1.0
 14   ssd  0.45470     osd.14               up     1.0      1.0
 15   ssd  0.45470     osd.15               up     1.0      1.0
 16   ssd  0.45470     osd.16               up     1.0      1.0
 -9        3.09206 host fc-r02-ceph-osd-04
 37  nvme  0.36388     osd.37               up     1.0      1.0
 30   ssd  0.45470     osd.30               up     1.0      1.0
 31   ssd  0.45470     osd.31               up     1.0      1.0
 32   ssd  0.45470     osd.32               up     1.0      1.0
 33   ssd  0.45470     osd.33               up     1.0      1.0
 34   

[ceph-users] Likely date for Pacific backport for RGW fix?

2021-06-16 Thread Chris Palmer

Hi

Our first upgrade (non-cephadm) from Octopus to Pacific 16.0.4 went very 
smoothly. Thanks for all the effort.


The only thing that has bitten us is https://tracker.ceph.com/issues/50556,
which prevents a multipart upload to an RGW bucket that has a bucket policy.
While I've been able to rewrite the most urgent scripts to use s3api put-object
(which doesn't do multipart), that only works for objects up to a certain size.
Removing the bucket policies isn't an option.


I can see that it has been fixed, and is now pending backport
(https://tracker.ceph.com/issues/51001). Will this be included in
16.0.5? And do we have an estimated date for that?


We can wait a little longer, but otherwise I have to do some more 
drastic changes to an application. Having an indication of date would 
help me choose which...


Many thanks, Chris

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Month June Schedule Now Available

2021-06-16 Thread Mike Perez
Hi everyone,

Here's today schedule for Ceph Month:

9:00 ET / 15:00 CEST Project Aquarium - An easy-to-use storage
appliance wrapped around Ceph [Joao Eduardo Luis]
9:30 ET / 15:30 CEST [lightning] Qemu: librbd vs krbd performance
[Wout van Heeswijk]
9:40 ET / 15:40 CEST [lightning] Evaluation of RBD replication options
@CERN [Arthur Outhenin-Chalandre]

Meeting link: https://bluejeans.com/908675367
Full schedule: https://pad.ceph.com/p/ceph-month-june-2021


On Tue, Jun 15, 2021 at 5:52 AM Mike Perez  wrote:
>
> Hi everyone,
>
> Here's today's schedule for Ceph Month:
>
> 9:00ET / 15:00 CEST Dashboard Update [Ernesto]
> 9:30 ET / 15:30 CEST [lightning] RBD latency with QD=1 bs=4k [Wido
> den Hollander]
> 9:40 ET / 15:40 CEST [lightning] From Open Source to Open Ended in
> Ceph with Lua [Yuval Lifshitz]
>
> Full schedule: https://pad.ceph.com/p/ceph-month-june-2021
> Meeting link: https://bluejeans.com/908675367
>
> On Mon, Jun 14, 2021 at 6:50 AM Mike Perez  wrote:
> >
> > Hi everyone,
> >
> > In ten minutes, Ceph Month continues with the following schedule today:
> >
> > 10:00 ET / 16:00 CEST RBD update [Ilya Dryomov]
> > 10:30 ET / 16:30 CEST 5 more ways to break your ceph cluster [Wout van 
> > Heeswijk]
> >
> > Full schedule: https://pad.ceph.com/p/ceph-month-june-2021
> > Meeting link: https://bluejeans.com/908675367
> >
> >
> > On Fri, Jun 11, 2021 at 6:50 AM Mike Perez  wrote:
> > >
> > > Hi everyone,
> > >
> > > In ten minutes, join us for the next Ceph Month presentation on Intel
> > > QLC SSD: Cost-Effective Ceph Deployments by Anthony D'Atri
> > >
> > > https://bluejeans.com/908675367
> > > https://pad.ceph.com/p/ceph-month-june-2021
> > >
> > > On Fri, Jun 11, 2021 at 5:50 AM Mike Perez  wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > In ten minutes, join us for the next Ceph Month presentation on
> > > > Performance Optimization for All Flash-based on aarch64 by Chunsong
> > > > Feng
> > > >
> > > > https://pad.ceph.com/p/ceph-month-june-2021
> > > > https://bluejeans.com/908675367
> > > >
> > > > On Thu, Jun 10, 2021 at 6:00 AM Mike Perez  wrote:
> > > > >
> > > > > Hi everyone,
> > > > >
> > > > > We're about to start Ceph Month 2021 with Casey Bodley giving a RGW 
> > > > > update!
> > > > >
> > > > > Afterward we'll have two BoF discussions on:
> > > > >
> > > > > 9:30 ET / 15:30 CEST [BoF] Ceph in Research & Scientific Computing
> > > > > [Kevin Hrpcek]
> > > > >
> > > > > 10:10 ET / 16:10 CEST [BoF] The go-ceph get together [John Mulligan]
> > > > >
> > > > > Join us now on the stream:
> > > > >
> > > > > https://bluejeans.com/908675367
> > > > >
> > > > > On Tue, Jun 1, 2021 at 6:50 AM Mike Perez  wrote:
> > > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > In ten minutes, join us for the start of the Ceph Month June event!
> > > > > > The schedule and meeting link can be found on this etherpad:
> > > > > >
> > > > > > https://pad.ceph.com/p/ceph-month-june-2021
> > > > > >
> > > > > > On Tue, May 25, 2021 at 11:56 AM Mike Perez  
> > > > > > wrote:
> > > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > The Ceph Month June schedule is now available:
> > > > > > >
> > > > > > > https://pad.ceph.com/p/ceph-month-june-2021
> > > > > > >
> > > > > > > We have great sessions from component updates, performance best
> > > > > > > practices, Ceph on different architectures, BoF sessions to get 
> > > > > > > more
> > > > > > > involved with working groups in the community, and more! You may 
> > > > > > > also
> > > > > > > leave open discussion topics for the listed talks that we'll get 
> > > > > > > to
> > > > > > > each Q/A portion.
> > > > > > >
> > > > > > > I will provide the video stream link on this thread and etherpad 
> > > > > > > once
> > > > > > > it's available. You can also add the Ceph community calendar, 
> > > > > > > which
> > > > > > > will have the Ceph Month sessions prefixed with "Ceph Month" to 
> > > > > > > get
> > > > > > > local timezone conversions.
> > > > > > >
> > > > > > > https://calendar.google.com/calendar/embed?src=9ts9c7lt7u1vic2ijvvqqlfpo0%40group.calendar.google.com
> > > > > > >
> > > > > > > Thank you to our speakers for taking the time to share with us 
> > > > > > > all the
> > > > > > > latest best practices and usage with Ceph!
> > > > > > >
> > > > > > > --
> > > > > > > Mike Perez
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RADOSGW Keystone integration - S3 bucket policies targeting not just other tenants / projects ?

2021-06-16 Thread Christian Rohmann

Hallo Ceph-Users,

I've been wondering about the state of OpenStack Keystone Auth in RADOSGW.


1) Even though the general documentation on RADOSGW S3 bucket policies at
https://docs.ceph.com/en/latest/radosgw/bucketpolicy/#creation-and-removal
is a little "misleading" in showing users being referred to as Principal,
the documentation about Keystone integration at
https://docs.ceph.com/en/latest/radosgw/keystone/#integrating-with-openstack-keystone
clearly states that "A Ceph Object Gateway user is mapped into a
Keystone tenant".


In the keystone authentication code it strictly only takes the project 
from the authenticating user:


 * 
https://github.com/ceph/ceph/blob/6ce6874bae8fbac8921f0bdfc3931371fc61d4ff/src/rgw/rgw_auth_keystone.cc#L127
 * 
https://github.com/ceph/ceph/blob/6ce6874bae8fbac8921f0bdfc3931371fc61d4ff/src/rgw/rgw_auth_keystone.cc#L515



This is rather unfortunate, as it reduces the usually powerful S3
bucket policies to something rather basic: granting access to all users
(with a certain role) of a project or, more importantly, to all users of
another project / tenant, as in using


  arn:aws:iam::$OS_REMOTE_PROJECT_ID:root

as principal.
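
i.e. a bucket policy of this shape (an illustrative sketch only; the bucket
name and project id are placeholders):

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam::$OS_REMOTE_PROJECT_ID:root"]},
    "Action": ["s3:ListBucket", "s3:GetObject"],
    "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
  }]
}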


Or am I misreading something here, or is this really all that can be
done when using native Keystone auth?




2) There is a PR open implementing generic external authentication 
https://github.com/ceph/ceph/pull/34093


Apparently this also addresses the lack of support for subusers
for Keystone - if I understand this correctly, I could then grant access
to individual users, as in


  arn:aws:iam::$OS_REMOTE_PROJECT_ID:$user


Are there any plans on the roadmap to extend the functionality in 
regards to keystone as authentication backend?





I know a similar question has been asked before 
(https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/GY7VUKCQ5QUMDYSFUJE233FKBRADXRZK/#GY7VUKCQ5QUMDYSFUJE233FKBRADXRZK)

but unfortunately with no discussion / responses then.



Regards


Christian

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-16 Thread mabi
‐‐‐ Original Message ‐‐‐
On Wednesday, June 16, 2021 10:18 AM, Andrew Walker-Brown 
 wrote:

> With active mode, you then have a transmit hashing policy, usually set 
> globally.
>
> On Linux the bond would be set as ‘bond-mode 802.3ad’ and then 
> ‘bond-xmit-hash-policy layer3+4’ - or whatever hashing policy you want.

Does the transmit hash policy need to be the same on the Linux server and on 
the switch side? or can they be different?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] libceph: monX session lost, hunting for new mon

2021-06-16 Thread Magnus HAGDORN
Hi all,
I know this came up before but I couldn't find a resolution.
We get the error
libceph: monX session lost, hunting for new mon
a lot on our samba servers that reexport cephfs. A lot means more than
once a minute. On other machines that are less busy we get it about
every 10-30 minutes. We only use a single network for both client and
backend traffic on bonded 10GE links.

So, my questions are: is this expected and normal behaviour? how to
track this problem down?
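
In case it helps, this is what I was planning to look at so far (a sketch;
paths and the mon id are examples, and the debugfs files need debugfs mounted):

# on the samba/cephfs client: state of the kernel client's mon/osd sessions
cat /sys/kernel/debug/ceph/*/monc
cat /sys/kernel/debug/ceph/*/osdc

# on a monitor: how many client sessions it currently holds
ceph daemon mon.$(hostname -s) sessions | head

# and watch the cluster log around the timestamps of the messages
ceph -w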

Regards
magnus
The University of Edinburgh is a charitable body, registered in Scotland, with 
registration number SC005336. Is e buidheann carthannais a th’ ann an Oilthigh 
Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph monitor won't start after Ubuntu update

2021-06-16 Thread Konstantin Shalygin
Hi,

> On 16 Jun 2021, at 01:33, Petr  wrote:
> 
> I've upgraded my Ubuntu server from 18.04.5 LTS to Ubuntu 20.04.2 LTS via 
> 'do-release-upgrade',
> during that process ceph packages were upgraded from Luminous to Octopus and 
> now ceph-mon daemon(I have only one) won't start, log error is:
> "2021-06-15T20:23:41.843+ 7fbb55e9b540 -1 mon.target@-1(probing) e2 
> current monmap has recorded min_mon_release 12 (luminous) is >2 releases 
> older than installed 15 (octopus);
> you can only upgrade 2 releases at a time you should first upgrade to 13 
> (mimic) or 14 (nautilus) stopping."
> 
> Is there any way to get cluster running or at least get data from OSDs?

Ceph does not support +3 release upgrades, only +1 or +2.
I suggest installing the Nautilus packages and starting the cluster again.



k
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Strategy for add new osds

2021-06-16 Thread Nico Schottelius

I have to say I'm reading quite a few interesting strategies in this
thread and I'd like to take a moment to compare them:

1) one by one osd adding

- least amount of pg rebalance
- will potentially re-re-balance data that has just been distributed
  with the next OSD phase in
- limits the impact if you have a bug in the hdd/ssd series

The biggest problem with this approach is that you will re-re-re-balance
data over and over again, and that will slow down the process significantly.

2) reweighted phase in

- Starting slow with reweighting to a small amount of its potential
- Allows to see how the new OSD performs
- Needs manual interaction for growing
- delays the phase in possibly for "longer than necessary"

We use this approach when phasing in multiple, larger OSDs that are from
a newer / not so well known series of disks.

3) noin / norebalance based phase in

- Interesting approach to delay rebalancing until the "proper/final" new
  storage is in place
- Unclear how much of a difference it makes if you insert the new set of
  osds within a short timeframe (i.e. adding 1 osd at minute 0, 2nd at
  minute 1, etc.)


4) All at once / randomly

- Least amount of manual tuning
- In a way something one "would expect" ceph to do right (but in
  practice doesn't all the time)
- Might (likely) cause short term re-adjustments
- Might cause client i/o slowdown (see next point)

5) General slowing down

What we actually do at datacenterlight.ch is slow down phase-ins by
default via the following tunings:

# Restrain recovery operations so that normal cluster is not affected
[osd]
osd max backfills = 1
osd recovery max active = 1
osd recovery op priority = 2

This works well in about 90% of the cases for us.
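
For completeness, the runtime equivalent of the above, plus the flags from
strategy 3), would look roughly like this (a sketch; adjust to taste):

# the same throttles as the [osd] section above, applied at runtime
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_op_priority 2

# strategy 3): hold off rebalancing while a batch of OSDs is being created
ceph osd set noin
ceph osd set norebalance
# ... create the new OSDs, then release:
ceph osd unset noin
ceph osd unset norebalance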

Quite an interesting thread, thanks everyone for sharing!

Cheers,

Nico


Anthony D'Atri  writes:

>> Hi,
>>
>> as far as I understand it,
>>
>> you get no real benefit from doing them one by one, as each OSD add can
>> cause a lot of data to be moved to a different OSD, even though you just
>> rebalanced it.
>
> Less than with older releases, but yeah.
>
> I’ve known someone who advised against doing them in parallel because one 
> would — for a time — have PGs with multiple remaps in the acting set.  The 
> objection may have been paranoia, I’m not sure.
>
> One compromise is to upweight the new OSDs one node at a time, so the churn 
> is limited to one failure domain at a time.
>
> — aad
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


--
Sustainable and modern Infrastructures by ungleich.ch
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-16 Thread Andrew Walker-Brown
Ideally you’d want to have the transmit hash the same...but so long as load is
being pretty evenly spread over all the links in the LAG, then you’re fine.

Sent from Mail for Windows 10

From: mabi
Sent: 16 June 2021 09:29
To: Andrew Walker-Brown
Cc: huxia...@horebdata.cn; Joe 
Comeau; ceph-users
Subject: Re: [ceph-users] Re: Fwd: Re: Issues with Ceph network redundancy 
using L2 MC-LAG

‐‐‐ Original Message ‐‐‐
On Wednesday, June 16, 2021 10:18 AM, Andrew Walker-Brown 
 wrote:

> With active mode, you then have a transmit hashing policy, usually set 
> globally.
>
> On Linux the bond would be set as ‘bond-mode 802.3ad’ and then 
> ‘bond-xmit-hash-policy layer3+4’ - or whatever hashing policy you want.

Does the transmit hash policy need to be the same on the Linux server and on 
the switch side? or can they be different?

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Strategy for add new osds

2021-06-16 Thread Anthony D'Atri


> Hi,
> 
> as far as I understand it,
> 
> you get no real benefit from doing them one by one, as each OSD add can
> cause a lot of data to be moved to a different OSD, even though you just
> rebalanced it.

Less than with older releases, but yeah.  

I’ve known someone who advised against doing them in parallel because one would 
— for a time — have PGs with multiple remaps in the acting set.  The objection 
may have been paranoia, I’m not sure.

One compromise is to upweight the new OSDs one node at a time, so the churn is 
limited to one failure domain at a time.

— aad
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-16 Thread Serkan Çoban
You cannot do much if the link is flapping or the cable is bad.
Maybe you can write some rules to shut the port down on the switch if
the error packet ratio goes up.
I also remember there is some config on the switch side for link flapping.

On Wed, Jun 16, 2021 at 10:57 AM huxia...@horebdata.cn
 wrote:
>
> Is it true that MC-LAG and 802.3ad, by default, work in active-active mode?
>
> What else should I take care of to ensure fault tolerance when one path is bad?
>
> best regards,
>
> samuel
>
>
>
> huxia...@horebdata.cn
>
> From: Joe Comeau
> Date: 2021-06-15 23:44
> To: ceph-users@ceph.io
> Subject: [ceph-users] Fwd: Re: Issues with Ceph network redundancy using L2 
> MC-LAG
> We also run with Dell VLT switches (40 GB)
> everything is active/active, so multiple paths as Andrew describes in
> his config
> Our config allows us:
>bring down one of the switches for upgrades
>bring down an iscsi gatway for patching
> all the while at least one path is up and servicing
> Thanks Joe
>
>
> >>> Andrew Walker-Brown  6/15/2021 10:26 AM
> >>>
> With an unstable link/port you could see the issues you describe.  Ping
> doesn’t have the packet rate for you to necessarily have a packet in
> transit at exactly the same time as the port fails temporarily.  Iperf
> on the other hand could certainly show the issue, higher packet rate and
> more likely to have packets in flight at the time of a link
> fail...combined with packet loss/retries gives poor throughput.
>
> Depending on what you want to happen, there are a number of tuning
> options both on the switches and Linux.  If you want the LAG to be down
> if any link fails, then you should be able to config this on the switches
> and/or Linux  (minimum number of links = 2 if you have 2 links in the
> lag).
>
> You can also tune the link monitoring, how frequently the links are
> checked (e.g. miimon) etc.  Bringing this value down from the default of
> 100ms may allow you to detect a link failure more quickly.  But you then
> run into the chance of detecting a transient failure that wouldn’t have
> caused any issues and the LAG becoming more unstable.
>
> Flapping/unstable links are the worst kind of situation.  Ideally you’d
> pick that up quickly from monitoring/alerts and either fix immediately
> or take the link down until you can fix it.
>
> I run 2x10G from my hosts into separate switches (Dell S series – VLT
> between switches).  Pulling a single interface has no impact on Ceph,
> any packet loss is tiny and we’re not exceeding 10G bandwidth per host.
>
> If you’re running 1G links and the LAG is already busy, a link failure
> could be causing slow writes to the host, just down to
> congestion...which then starts to impact the wider cluster based on how
> Ceph works.
>
> Just caveating the above with - I’m relatively new to Ceph myself
>
> Sent from Mail for
> Windows 10
>
> From: huxia...@horebdata.cn
> Sent: 15 June 2021 17:52
> To: Serkan Çoban
> Cc: ceph-users
> Subject: [ceph-users] Re: Issues with Ceph network redundancy using L2
> MC-LAG
>
> When i pull out the cable, then the bond is working properly.
>
> Does it mean that the port is somehow flapping? Ping can still work,
> but the iperf test yields very low results.
>
>
>
>
>
> huxia...@horebdata.cn
>
> From: Serkan Çoban
> Date: 2021-06-15 18:47
> To: huxia...@horebdata.cn
> CC: ceph-users
> Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2
> MC-LAG
> Do you observe the same behaviour when you pull a cable?
> Maybe a flapping port might cause this kind of behaviour, other than
> that you should't see any network disconnects.
> Are you sure about LACP configuration, what is the output of 'cat
> /proc/net/bonding/bond0'
>
> On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
>  wrote:
> >
> > Dear Cephers,
> >
> > I encountered the following networking issue several times, and i
> wonder whether there is a solution for networking HA solution.
> >
> > We build ceph using L2 multi chassis link aggregation group (MC-LAG )
> to provide switch redundancy. On each host, we use 802.3ad, LACP
> > mode for NIC redundancy. However, we observe several times, when a
> single network port, either the cable, or the SFP+ optical module fails,
> Ceph cluster  is badly affected by networking, although in theory it
> should be able to tolerate.
> >
> > Did i miss something important here? and how to really achieve
> networking HA in Ceph cluster?
> >
> > best regards,
> >
> > Samuel
> >
> >
> >
> >
> > huxia...@horebdata.cn
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
>
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
> 

[ceph-users] Re: Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-16 Thread Andrew Walker-Brown
Depends on how you configure the switch port. For Dell:

Interface Ethernet 1/1/20
No switchport
Channel-group 10 mode active
!

‘Mode active’ sets it as a dynamic LACP LAG. Otherwise it would be ‘mode static’.

With active mode, you then have a transmit hashing policy, usually set 
globally. 

On Linux the bond would be set as ‘bond-mode 802.3ad’ and then 
‘bond-xmit-hash-policy layer3+4’ - or whatever hashing policy you want. 
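
As a concrete example, the whole bond stanza in /etc/network/interfaces would
look something like this (a sketch only; interface names, addressing and the
min-links value are placeholders to adapt):

auto bond0
iface bond0 inet static
    address 192.0.2.10
    netmask 255.255.255.0
    bond-slaves eno1 eno2
    bond-mode 802.3ad
    bond-xmit-hash-policy layer3+4
    bond-miimon 100
    bond-lacp-rate fast
    bond-min-links 1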


Sent from my iPhone

On 16 Jun 2021, at 08:57, huxia...@horebdata.cn wrote:

Is it true that MC-LAG and 802.3ad, by default, work in active-active mode?

What else should I take care of to ensure fault tolerance when one path is bad?

best regards,

samuel



huxia...@horebdata.cn

From: Joe Comeau
Date: 2021-06-15 23:44
To: ceph-users@ceph.io
Subject: [ceph-users] Fwd: Re: Issues with Ceph network redundancy using L2 
MC-LAG
We also run with Dell VLT switches (40 GB)
everything is active/active, so multiple paths as Andrew describes in
his config
Our config allows us:
  bring down one of the switches for upgrades
  bring down an iscsi gatway for patching
all the while at least one path is up and servicing
Thanks Joe


>>> Andrew Walker-Brown  6/15/2021 10:26 AM
>>> 
With an unstable link/port you could see the issues you describe.  Ping
doesn’t have the packet rate for you to necessarily have a packet in
transit at exactly the same time as the port fails temporarily.  Iperf
on the other hand could certainly show the issue, higher packet rate and
more likely to have packets in flight at the time of a link
fail...combined with packet loss/retries gives poor throughput.

Depending on what you want to happen, there are a number of tuning
options both on the switches and Linux.  If you want the LAG to be down
if any link fails, then you should be able to config this on the switches
and/or Linux  (minimum number of links = 2 if you have 2 links in the
lag).

You can also tune the link monitoring, how frequently the links are
checked (e.g. miimon) etc.  Bringing this value down from the default of
100ms may allow you to detect a link failure more quickly.  But you then
run into the chance of detecting a transient failure that wouldn’t have
caused any issues and the LAG becoming more unstable.

Flapping/unstable links are the worst kind of situation.  Ideally you’d
pick that up quickly from monitoring/alerts and either fix immediately
or take the link down until you can fix it.

I run 2x10G from my hosts into separate switches (Dell S series – VLT
between switches).  Pulling a single interface has no impact on Ceph,
any packet loss is tiny and we’re not exceeding 10G bandwidth per host.

If you’re running 1G links and the LAG is already busy, a link failure
could be causing slow writes to the host, just down to
congestion...which then starts to impact the wider cluster based on how
Ceph works.

Just caveating the above with - I’m relatively new to Ceph myself

Sent from Mail for Windows 10

From: huxia...@horebdata.cn
Sent: 15 June 2021 17:52
To: Serkan Çoban
Cc: ceph-users
Subject: [ceph-users] Re: Issues with Ceph network redundancy using L2
MC-LAG

When i pull out the cable, then the bond is working properly.

Does it mean that the port is somehow flapping? Ping can still work,
but the iperf test yields very low results.





huxia...@horebdata.cn

From: Serkan Çoban
Date: 2021-06-15 18:47
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2
MC-LAG
Do you observe the same behaviour when you pull a cable?
Maybe a flapping port might cause this kind of behaviour, other than
that you shouldn't see any network disconnects.
Are you sure about LACP configuration, what is the output of 'cat
/proc/net/bonding/bond0'

On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
> 
> Dear Cephers,
> 
> I encountered the following networking issue several times, and i
wonder whether there is a solution for networking HA solution.
> 
> We build ceph using L2 multi chassis link aggregation group (MC-LAG )
to provide switch redundancy. On each host, we use 802.3ad, LACP
> mode for NIC redundancy. However, we observe several times, when a
single network port, either the cable, or the SFP+ optical module fails,
Ceph cluster  is badly affected by networking, although in theory it
should be able to tolerate.
> 
> Did i miss something important here? and how to really achieve
networking HA in Ceph cluster?
> 
> best regards,
> 
> Samuel
> 
> 
> 
> 
> 

[ceph-users] Re: Fwd: Re: Issues with Ceph network redundancy using L2 MC-LAG

2021-06-16 Thread huxia...@horebdata.cn
Is it true that MC-LAG and 802.3ad, by default, work in active-active mode?

What else should I take care of to ensure fault tolerance when one path is bad?

best regards,

samuel



huxia...@horebdata.cn
 
From: Joe Comeau
Date: 2021-06-15 23:44
To: ceph-users@ceph.io
Subject: [ceph-users] Fwd: Re: Issues with Ceph network redundancy using L2 
MC-LAG
We also run with Dell VLT switches (40 GB)
everything is active/active, so multiple paths as Andrew describes in
his config
Our config allows us:
   bring down one of the switches for upgrades
   bring down an iscsi gatway for patching
all the while at least one path is up and servicing
Thanks Joe
 
 
>>> Andrew Walker-Brown  6/15/2021 10:26 AM
>>>
With an unstable link/port you could see the issues you describe.  Ping
doesn’t have the packet rate for you to necessarily have a packet in
transit at exactly the same time as the port fails temporarily.  Iperf
on the other hand could certainly show the issue, higher packet rate and
more likely to have packets in flight at the time of a link
fail...combined with packet loss/retries gives poor throughput.
 
Depending on what you want to happen, there are a number of tuning
options both on the switches and Linux.  If you want the LAG to be down
if any link fails, then you should be able to config this on the switches
and/or Linux  (minimum number of links = 2 if you have 2 links in the
lag).
 
You can also tune the link monitoring, how frequently the links are
checked (e.g. miimon) etc.  Bringing this value down from the default of
100ms may allow you to detect a link failure more quickly.  But you then
run into the chance of detecting a transient failure that wouldn’t have
caused any issues and the LAG becoming more unstable.
 
Flapping/unstable links are the worst kind of situation.  Ideally you’d
pick that up quickly from monitoring/alerts and either fix immediately
or take the link down until you can fix it.
 
I run 2x10G from my hosts into separate switches (Dell S series – VLT
between switches).  Pulling a single interface has no impact on Ceph,
any packet loss is tiny and we’re not exceeding 10G bandwidth per host.
 
If you’re running 1G links and the LAG is already busy, a link failure
could be causing slow writes to the host, just down to
congestion...which then starts to impact the wider cluster based on how
Ceph works.
 
Just caveating the above with - I’m relatively new to Ceph myself
 
Sent from Mail for
Windows 10
 
From: huxia...@horebdata.cn
Sent: 15 June 2021 17:52
To: Serkan Çoban
Cc: ceph-users
Subject: [ceph-users] Re: Issues with Ceph network redundancy using L2
MC-LAG
 
When i pull out the cable, then the bond is working properly.
 
Does it mean that the port is somehow flapping? Ping can still work,
but the iperf test yields very low results.
 
 
 
 
 
huxia...@horebdata.cn
 
From: Serkan Çoban
Date: 2021-06-15 18:47
To: huxia...@horebdata.cn
CC: ceph-users
Subject: Re: [ceph-users] Issues with Ceph network redundancy using L2
MC-LAG
Do you observe the same behaviour when you pull a cable?
Maybe a flapping port might cause this kind of behaviour, other than
that you shouldn't see any network disconnects.
Are you sure about LACP configuration, what is the output of 'cat
/proc/net/bonding/bond0'
 
On Tue, Jun 15, 2021 at 7:19 PM huxia...@horebdata.cn
 wrote:
>
> Dear Cephers,
>
> I encountered the following networking issue several times, and i
wonder whether there is a solution for networking HA solution.
>
> We build ceph using L2 multi chassis link aggregation group (MC-LAG )
to provide switch redundancy. On each host, we use 802.3ad, LACP
> mode for NIC redundancy. However, we observe several times, when a
single network port, either the cable, or the SFP+ optical module fails,
Ceph cluster  is badly affected by networking, although in theory it
should be able to tolerate.
>
> Did i miss something important here? and how to really achieve
networking HA in Ceph cluster?
>
> best regards,
>
> Samuel
>
>
>
>
> huxia...@horebdata.cn
> ___
> ceph-users mailing list -- ceph-users@ceph.io
 
> To unsubscribe send an email to ceph-users-le...@ceph.io
 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-06-16 Thread Ackermann, Christoph
Good Morning Dan,

adjusting  "ceph osd setmaxosd 76"  solved the problem so far. :-)
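
For the archives: the gap-filling variant Dan describes in the quoted thread
below could be scripted roughly like this (a sketch only, untested here; it
assumes bash and uuidgen):

# find ids below max_osd that don't exist, create placeholders, then purge them
max=$(ceph osd getmaxosd | awk '{print $3}')
existing=$(ceph osd ls)
gaps=""
for id in $(seq 0 $((max - 1))); do
    if ! echo "$existing" | grep -qx "$id"; then
        ceph osd new "$(uuidgen)"   # takes the lowest free id, i.e. this gap
        gaps="$gaps $id"
    fi
done
# once the filesystem mounts again, remove the placeholder entries
for id in $gaps; do
    ceph osd purge "$id" --yes-i-really-mean-it
done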

Thanks and Best regards,
Christoph

On Tue, 15 Jun 2021 at 21:14, Ackermann, Christoph <
c.ackerm...@infoserve.de> wrote:

> Hi Dan,
>
> Thanks for the hint, i'll try this tomorrow with a test bed first. This
> evening I had to fix some  Bareos client systems to get a  quiet sleep. ;-)
>
> Will give you feedback asap.
>
> Best regards,
> Christoph
>
> On Tue, 15 Jun 2021 at 21:03, Dan van der Ster <
> d...@vanderster.com> wrote:
>
>> Hi Christoph,
>>
>> What about the max osd? If "ceph osd getmaxosd" is not 76 on this
>> cluster, then set it: `ceph osd setmaxosd 76`.
>>
>> -- dan
>>
>> On Tue, Jun 15, 2021 at 8:54 PM Ackermann, Christoph
>>  wrote:
>> >
>> > Dan,
>> >
>> > sorry, we have no gaps in osd numbering:
>> > isceph@ceph-deploy:~$ sudo ceph osd ls |wc -l; sudo ceph osd tree |
>> sort -n -k1  |tail
>> > 76
>> > [..]
>> >  73ssd0.28600  osd.73  up
>>  1.0  1.0
>> >  74ssd0.27689  osd.74  up
>>  1.0  1.0
>> >  75ssd0.28600  osd.75  up
>>  1.0  1.0
>> >
>> > The (quite old) cluster is running v15.2.13 very well. :-)   OSDs are
>> > running on top of (newest) CentOS 8.4 bare metal, mon/mds run on (newest)
>> > CentOS 7.9 VMs.  The problem only appears with the newest CentOS 8 client
>> > libceph.
>> >
>> > Christoph
>> >
>> >
>> >
>> >
>> >
>> > On Tue, 15 Jun 2021 at 20:26, Dan van der Ster <
>> d...@vanderster.com> wrote:
>> >>
>> >> Replying to own mail...
>> >>
>> >> On Tue, Jun 15, 2021 at 7:54 PM Dan van der Ster 
>> wrote:
>> >> >
>> >> > Hi Ilya,
>> >> >
>> >> > We're now hitting this on CentOS 8.4.
>> >> >
>> >> > The "setmaxosd" workaround fixed access to one of our clusters, but
>> >> > isn't working for another, where we have gaps in the osd ids, e.g.
>> >> >
>> >> > # ceph osd getmaxosd
>> >> > max_osd = 553 in epoch 691642
>> >> > # ceph osd tree | sort -n -k1 | tail
>> >> >  541   ssd   0.87299 osd.541up  1.0
>> 1.0
>> >> >  543   ssd   0.87299 osd.543up  1.0
>> 1.0
>> >> >  548   ssd   0.87299 osd.548up  1.0
>> 1.0
>> >> >  552   ssd   0.87299 osd.552up  1.0
>> 1.0
>> >> >
>> >> > Is there another workaround for this?
>> >>
>> >> The following seems to have fixed this cluster:
>> >>
>> >> 1. Fill all gaps with: ceph osd new `uuid`
>> >> ^^ after this, the cluster is still not mountable.
>> >> 2. Purge all the gap osds: ceph osd purge 
>> >>
>> >> I filled/purged a couple hundred gap osds, and now the cluster can be
>> mounted.
>> >>
>> >> Cheers!
>> >>
>> >> Dan
>> >>
>> >> P.S. The bugzilla is not public:
>> >> https://bugzilla.redhat.com/show_bug.cgi?id=1972278
>> >>
>> >> >
>> >> > Cheers, dan
>> >> >
>> >> >
>> >> > On Mon, May 3, 2021 at 12:32 PM Ilya Dryomov 
>> wrote:
>> >> > >
>> >> > > On Mon, May 3, 2021 at 12:27 PM Magnus Harlander 
>> wrote:
>> >> > > >
>> >> > > > Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
>> >> > > >
>> >> > > > ceph osd setmaxosd 10
>> >> > > >
>> >> > > > Bingo! Mount works again.
>> >> > > >
>> >> > > > Vry strange things are going on here (-:
>> >> > > >
>> >> > > > Thanx a lot for now!! If I can help to track it down, please let
>> me know.
>> >> > >
>> >> > > Good to know it helped!  I'll think about this some more and
>> probably
>> >> > > plan to patch the kernel client to be less stringent and not choke
>> on
>> >> > > this sort of misconfiguration.
>> >> > >
>> >> > > Thanks,
>> >> > >
>> >> > > Ilya
>> >> > > ___
>> >> > > ceph-users mailing list -- ceph-users@ceph.io
>> >> > > To unsubscribe send an email to ceph-users-le...@ceph.io
>> >> ___
>> >> ceph-users mailing list -- ceph-users@ceph.io
>> >> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph osd df return null

2021-06-16 Thread julien lenseigne

Hi,

When I do ceph osd df,

some OSDs return a null size. For example:

 0   hdd  7.27699  1.0  0B  0B  0B  0B 0B  0B 0    0  29
 1   hdd  7.27698  1.0  0B  0B  0B  0B 0B  0B 0    0  23
11   ssd  0.5  1.0  0B  0B  0B  0B 0B  0B 0    0  49
 2   hdd  7.27699  1.0  0B  0B  0B  0B 0B  0B 0    0  30
 4   hdd  7.27699  1.0  0B  0B  0B  0B 0B  0B 0    0  37
12   ssd  0.5  1.0  0B  0B  0B  0B 0B  0B 0    0  39
 7   hdd 24.55698  1.0 24.6TiB  663GiB  663GiB  0B  0B 23.9TiB  2.64 0.79 302
42   hdd 10.69179  1.0 10.7TiB  346GiB  344GiB 30.0MiB 1.13GiB 10.4TiB  3.16 0.95  98
43   hdd 10.69179  1.0 10.7TiB  357GiB  355GiB 49.1MiB 1.17GiB 10.3TiB  3.26 0.98 131
44   hdd 10.69179  1.0 10.7TiB  298GiB  297GiB 30.4MiB 1.05GiB 10.4TiB  2.72 0.81  92
45   hdd 10.69179  1.0 10.7TiB  342GiB  341GiB 26.6MiB 1.11GiB 10.4TiB  3.12 0.94 105
40   ssd  0.87270  1.0  894GiB  180GiB  179GiB 33.5MiB 990MiB  714GiB 20.10 6.02  53
41   ssd  0.87270  1.0  894GiB  254GiB  253GiB 39.0MiB 1.00GiB  640GiB 28.38 8.50  61
46   ssd  0.87270  1.0  894GiB  255GiB  254GiB 26.8MiB 1.03GiB  639GiB 28.55 8.55  76
 3   hdd 24.55699  1.0 24.6TiB  537GiB  536GiB 67.9MiB 1.36GiB 24.0TiB  2.14 0.64 252


Do you know where this might come from?

Thanks.

--
Julien Lenseigne
Responsable informatique LMD
Tel: 0169335172
Ecole Polytechnique, route de saclay, 91128 Palaiseau
Bat 83 - Bureau 83.30.13

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io