[ceph-users] Multiple corrupt bluestore osds, Host Machine attacks VM OSDs

2019-07-25 Thread Daniel Williams
Hey, I have a machine with 5 drives in a VM and 5 drives that were on the same host machine. I've made this mistake once before: running ceph-volume activate --all on the host machine's drives takes over the 5 drives in the VM as well and corrupts them. I've actually lost data this time. Erasure encoded
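
For reference, a minimal sketch (not from the thread) of activating only specific OSDs rather than everything ceph-volume can see, assuming the host's OSDs are LVM-based:

    # List the OSDs ceph-volume recognises on this host, then activate only
    # the ones that actually belong to it, instead of --all (which will also
    # grab any passthrough devices it can read):
    ceph-volume lvm list
    ceph-volume lvm activate <osd-id> <osd-fsid>    # id and fsid taken from 'lvm list' output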

Re: [ceph-users] New best practices for osds???

2019-07-25 Thread Anthony D'Atri
> We run a few hundred HDD OSDs for our backup cluster, we set one RAID 0 per HDD > in order to be able > to use the -battery protected- write cache from the RAID controller. It really > improves performance, for both > bluestore and filestore OSDs. Having run something like 6000 HDD-based FileStore O

Re: [ceph-users] Upgrading and lost OSDs

2019-07-25 Thread Bob R
I would try 'mv /etc/ceph/osd{,.old}' then run 'ceph-volume simple scan' again. We had some problems upgrading due to OSDs (perhaps initially installed as firefly?) missing the 'type' attribute and iirc the 'ceph-volume simple scan' command refused to overwrite existing json files after I made som
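
A sketch of the steps described above (the final activate line is my assumption, not part of the message):

    # Move the stale json files aside so 'simple scan' can regenerate them
    # (it refuses to overwrite existing files):
    mv /etc/ceph/osd{,.old}
    ceph-volume simple scan
    ceph-volume simple activate --all    # assumption: re-activate from the fresh json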

Re: [ceph-users] Future of Filestore?

2019-07-25 Thread Stuart Longland
On 25/7/19 9:32 pm, Виталий Филиппов wrote: > Hi again, > > I reread your initial email - do you also run a nanoceph on some SBCs > each having one 2.5" 5400rpm HDD plugged into it? What SBCs do you use? :-) I presently have a 5-node Ceph cluster: - 3× Supermicro A1SAi-2750F with 1 120GB 2.5" SS

Re: [ceph-users] [Disarmed] Re: ceph-ansible firewalld blocking ceph comms

2019-07-25 Thread DHilsbos
Nathan; I'm not an expert on firewalld, but shouldn't you have a list of open ports (a non-empty "ports:" line)? Here's the configuration on my test cluster:
public (active)
  target: default
  icmp-block-inversion: no
  interfaces: bond0
  sources:
  services: ssh dhcpv6-client
  ports: 6789/tcp 3300/tcp 680
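
One hedged way to get those ports open with firewalld, assuming your firewalld build ships the 'ceph' and 'ceph-mon' service definitions (otherwise add the ports explicitly):

    firewall-cmd --zone=public --permanent --add-service=ceph-mon   # 6789/tcp (and 3300/tcp in newer service files)
    firewall-cmd --zone=public --permanent --add-service=ceph       # 6800-7300/tcp for OSD/MGR/MDS
    firewall-cmd --reload
    firewall-cmd --zone=public --list-all                           # verify the services/ports took effect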

Re: [ceph-users] ceph-ansible firewalld blocking ceph comms

2019-07-25 Thread Nathan Harper
This is a new issue for us, and we did not have the same problem running the same activity on our test system. Regards, Nathan > On 25 Jul 2019, at 22:00, solarflow99 wrote: > > I used ceph-ansible just fine, never had this problem. > >> On Thu, Jul 25, 2019 at 1:31 PM Nathan Harper >> wrote

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-07-25 Thread Janek Bevendorff
I am not sure if making caps recall more aggressive helps. It seems to be the client failing to respond to it (at least that's what the warnings say). But I will try your newly suggested settings as soon as I get the chance and will report back with the results. On 25 Jul 2019 11:00 pm, Patrick Donnel

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-07-25 Thread Patrick Donnelly
On Thu, Jul 25, 2019 at 12:49 PM Janek Bevendorff wrote: > > > > Based on that message, it would appear you still have an inode limit > > in place ("mds_cache_size"). Please unset that config option. Your > > mds_cache_memory_limit is apparently ~19GB. > > No, I do not have an inode limit set. Onl
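
For anyone following along, a hypothetical example of what Patrick is asking for here (assuming the limit was set in the config database rather than in ceph.conf):

    ceph config rm mds mds_cache_size                      # drop the old inode-count limit
    ceph config get mds mds_cache_memory_limit             # the memory limit stays (~19G in this case)
    ceph config show mds.<name> | grep mds_cache           # confirm what the running MDS actually sees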

Re: [ceph-users] ceph-ansible firewalld blocking ceph comms

2019-07-25 Thread solarflow99
I used ceph-ansible just fine, never had this problem. On Thu, Jul 25, 2019 at 1:31 PM Nathan Harper wrote: > Hi all, > > We've run into a strange issue with one of our clusters managed with > ceph-ansible. We're adding some RGW nodes to our cluster, and so re-ran > site.yml against the cluste

[ceph-users] ceph-ansible firewalld blocking ceph comms

2019-07-25 Thread Nathan Harper
Hi all, We've run into a strange issue with one of our clusters managed with ceph-ansible. We're adding some RGW nodes to our cluster, and so re-ran site.yml against the cluster. The new RGWs were added successfully, but when we did, we started to get slow requests, effectively across the whole

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-07-25 Thread Janek Bevendorff
> Based on that message, it would appear you still have an inode limit > in place ("mds_cache_size"). Please unset that config option. Your > mds_cache_memory_limit is apparently ~19GB. No, I do not have an inode limit set. Only the memory limit. > There is another limit mds_max_caps_per_clien

Re: [ceph-users] Large OMAP Objects in zone.rgw.log pool

2019-07-25 Thread Brett Chancellor
14.2.1 Thanks, I'll try that. On Thu, Jul 25, 2019 at 2:54 PM Casey Bodley wrote: > What ceph version is this cluster running? Luminous or later should not > be writing any new meta.log entries when it detects a single-zone > configuration. > > I'd recommend editing your zonegroup configuration

Re: [ceph-users] Large OMAP Objects in zone.rgw.log pool

2019-07-25 Thread Casey Bodley
What ceph version is this cluster running? Luminous or later should not be writing any new meta.log entries when it detects a single-zone configuration. I'd recommend editing your zonegroup configuration (via 'radosgw-admin zonegroup get' and 'put') to set both log_meta and log_data to false,
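
A rough sketch of that edit, assuming a typical single-zone setup (I'm using 'zonegroup set' for the write-back step and adding the period commit; check against your own configuration first):

    radosgw-admin zonegroup get > zonegroup.json
    # edit zonegroup.json: set "log_meta": "false" and "log_data": "false"
    radosgw-admin zonegroup set --infile zonegroup.json
    radosgw-admin period update --commit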

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-07-25 Thread Patrick Donnelly
On Thu, Jul 25, 2019 at 3:08 AM Janek Bevendorff wrote: > > The rsync job has been copying quite happily for two hours now. The good > news is that the cache size isn't increasing unboundedly with each > request anymore. The bad news is that it still is increasing afterall, > though much slower. I

Re: [ceph-users] Large OMAP Objects in zone.rgw.log pool

2019-07-25 Thread Brett Chancellor
Casey, These clusters were setup with the intention of one day doing multi site replication. That has never happened. The cluster has a single realm, which contains a single zonegroup, and that zonegroup contains a single zone. -Brett On Thu, Jul 25, 2019 at 2:16 PM Casey Bodley wrote: > Hi B

Re: [ceph-users] Large OMAP Objects in zone.rgw.log pool

2019-07-25 Thread Casey Bodley
Hi Brett, These meta.log objects store the replication logs for metadata sync in multisite. Log entries are trimmed automatically once all other zones have processed them. Can you verify that all zones in the multisite configuration are reachable and syncing? Does 'radosgw-admin sync status'

Re: [ceph-users] how to power off a cephfs cluster cleanly

2019-07-25 Thread Patrick Donnelly
On Thu, Jul 25, 2019 at 7:48 AM Dan van der Ster wrote: > > Hi all, > > In September we'll need to power down a CephFS cluster (currently > mimic) for a several-hour electrical intervention. > > Having never done this before, I thought I'd check with the list. > Here's our planned procedure: > > 1

[ceph-users] Large OMAP Objects in zone.rgw.log pool

2019-07-25 Thread Brett Chancellor
I'm having an issue similar to http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-March/033611.html . I don't see where any solution was proposed.
$ ceph health detail
HEALTH_WARN 1 large omap objects
LARGE_OMAP_OBJECTS 1 large omap objects
    1 large objects found in pool 'us-prd-1.rgw.log

Re: [ceph-users] Nautilus: significant increase in cephfs metadata pool usage

2019-07-25 Thread Nathan Fish
I have seen significant increases (1GB -> 8GB) proportional to number of inodes open, just like the MDS cache grows. These went away once the stat-heavy workloads (multiple parallel rsyncs) stopped. I disabled autoscale warnings on the metadata pools due to this fluctuation. On Thu, Jul 25, 2019 a

Re: [ceph-users] test, please ignore

2019-07-25 Thread Federico Lucifredi
Anything for our favorite esquire! :-) -F2

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread Matthew Vernon
On 24/07/2019 20:06, Paul Emmerich wrote: +1 on adding them all at the same time. All these methods that gradually increase the weight aren't really necessary in newer releases of Ceph. FWIW, we added a rack-full (9x60 = 540 OSDs) in one go to our production cluster (then running Jewel) taki

Re: [ceph-users] how to power off a cephfs cluster cleanly

2019-07-25 Thread DHilsbos
Dan; I don't have a lot of experience with Ceph, but I generally set all of the following before taking a cluster offline: ceph osd set noout ceph osd set nobackfill ceph osd set norecover ceph osd set norebalance ceph osd set nodown ceph osd set pause I then unset them in the opposite order:
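
Laid out as commands (the unset list is cut off in the preview, so the reverse-order half below is my reconstruction):

    ceph osd set noout
    ceph osd set nobackfill
    ceph osd set norecover
    ceph osd set norebalance
    ceph osd set nodown
    ceph osd set pause
    # ...after powering back up, unset in the opposite order:
    ceph osd unset pause
    ceph osd unset nodown
    ceph osd unset norebalance
    ceph osd unset norecover
    ceph osd unset nobackfill
    ceph osd unset noout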

[ceph-users] how to power off a cephfs cluster cleanly

2019-07-25 Thread Dan van der Ster
Hi all, In September we'll need to power down a CephFS cluster (currently mimic) for a several-hour electrical intervention. Having never done this before, I thought I'd check with the list. Here's our planned procedure: 1. umount /cephfs from all hpc clients. 2. ceph osd set noout 3. wait unti

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Xavier Trilla
We had a similar situation, with one machine resetting when we restarted another one. I'm not 100% sure why it happened, but I would bet it was related to several thousand client connections migrating from the machine we restarted to another one. We have a similar setup to yours, and if you c

Re: [ceph-users] Future of Filestore?

2019-07-25 Thread Виталий Филиппов
Hi again, I reread your initial email - do you also run a nanoceph on some SBCs each having one 2.5" 5400rpm HDD plugged into it? What SBCs do you use? :-) -- With best regards, Vitaliy Filippov

Re: [ceph-users] Fwd: [lca-announce] linux.conf.au 2020 - Call for Sessions and Miniconfs now open!

2019-07-25 Thread Tim Serong
Hi All, Just a reminder, there's only a few days left to submit talks for this most excellent conference; the CFP is open until Sunday 28 July Anywhere on Earth. (I've submitted a Data Storage miniconf day, fingers crossed...) Regards, Tim On 6/26/19 2:09 PM, Tim Serong wrote: > Here we go aga

[ceph-users] test, please ignore

2019-07-25 Thread Tim Serong
Sorry for the noise, I was getting "Remote Server returned '550 Cannot process address'" errors earlier trying to send to ceph-users@lists.ceph.com, and wanted to re-test. -- Tim Serong Senior Clustering Engineer SUSE tser...@suse.com

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread zhanrzh...@teamsun.com.cn
Hi, Janne. Thank you for correcting my mistake. Maybe the first piece of advice was unclearly described; I meant that you should add OSDs into one failure domain at a time, so that only one PG among the up set has to remap at a time. -- zhanrzh...@teamsun.com.cn >On Thu, 25 Jul 2019 at 10:47, 展荣臻(信泰

Re: [ceph-users] Please help: change IP address of a cluster

2019-07-25 Thread ST Wong (ITSC)
Hi, Migrated the cluster to the new IP range. Regarding the MON that doesn't listen on the v1 port: maybe I ran the command as mentioned in the manual, but it seems the part after the comma doesn't work and tells the mon to listen on the v2 port only. ceph-mon -i cmon5 --public-addr v2:10.0.1.97:3300,v1:10.0.1.97:67

Re: [ceph-users] Nautilus: significant increase in cephfs metadata pool usage

2019-07-25 Thread Dietmar Rieder
On 7/25/19 11:55 AM, Konstantin Shalygin wrote: >> we just recently upgraded our cluster from luminous 12.2.10 to nautilus >> 14.2.1 and I noticed a massive increase of the space used on the cephfs >> metadata pool although the used space in the 2 data pools basically did >> not change. See the at

Re: [ceph-users] Nautilus: significant increase in cephfs metadata pool usage

2019-07-25 Thread Konstantin Shalygin
we just recently upgraded our cluster from luminous 12.2.10 to nautilus 14.2.1 and I noticed a massive increase of the space used on the cephfs metadata pool although the used space in the 2 data pools basically did not change. See the attached graph (NOTE: log10 scale on y-axis) Is there any re

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread Janne Johansson
On Thu, 25 Jul 2019 at 10:47, 展荣臻(信泰) wrote: > > 1. Adding OSDs in the same failure domain is to ensure only one PG in the pg up > set (as ceph pg dump shows) has to remap. > 2. Setting "osd_pool_default_min_size=1" is to ensure objects can be read/written > uninterruptedly while PGs remap. > Is this wrong? > How d

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Wido den Hollander
On 7/25/19 9:56 AM, Xiaoxi Chen wrote: > The real impact of changing min_size to 1 is not the possibility > of losing data, but how much data you will lose. In both cases you will > lose some data, the question is just how much. > > Let PG X -> (osd A, B, C), min_size = 2, size = 3 > In your description,

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread 展荣臻(信泰)
1. Adding OSDs in the same failure domain is to ensure only one PG in the pg up set (as ceph pg dump shows) has to remap. 2. Setting "osd_pool_default_min_size=1" is to ensure objects can be read/written uninterruptedly while PGs remap. Is this wrong? -Original Message- From: "Janne Johansson" Sent: 2019-07-25 15:01:37

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread Thomas Byrne - UKRI STFC
As a counterpoint, adding large amounts of new hardware in gradually (or more specifically in a few steps) has a few benefits IMO. - Being able to pause the operation and confirm the new hardware (and cluster) is operating as expected. You can identify problems with hardware with OSDs at 10% we

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Xiaoxi Chen
The real impact of changing min_size to 1 is not the possibility of losing data, but how much data you will lose. In both cases you will lose some data, the question is just how much. Let PG X -> (osd A, B, C), min_size = 2, size = 3 In your description, T1, OSD A goes down due to upgrade, now the PG is

Re: [ceph-users] [Ceph-users] Re: MDS failing under load with large cache sizes

2019-07-25 Thread Janek Bevendorff
It's possible the MDS is not being aggressive enough with asking the single (?) client to reduce its cache size. There were recent changes [1] to the MDS to improve this. However, the defaults may not be aggressive enough for your client's workload. Can you try: ceph config set mds mds_recall_
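
The preview cuts the option names off; the recall-related knobs I'm aware of in Nautilus are mds_recall_max_caps and mds_recall_max_decay_rate, so the lines below are a guess at the suggestion, not a quote:

    # Hypothetical option names and values -- tune to your own client workload:
    ceph config set mds mds_recall_max_caps 10000
    ceph config set mds mds_recall_max_decay_rate 1.0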

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Wido den Hollander
On 7/25/19 9:19 AM, Xiaoxi Chen wrote: > We had hit this case in production but my solution will be change > min_size = 1 immediately so that PG back to active right after. > > It somewhat tradeoff reliability(durability) with availability during > that window of 15 mins but if you are certain o

Re: [ceph-users] Kernel, Distro & Ceph

2019-07-25 Thread Dietmar Rieder
On 7/24/19 10:05 PM, Wido den Hollander wrote: > > > On 7/24/19 9:38 PM, dhils...@performair.com wrote: >> All; >> >> There's been a lot of discussion of various kernel versions on this list >> lately, so I thought I'd seek some clarification. >> >> I prefer to run CentOS, and I prefer to keep t

Re: [ceph-users] Anybody using 4x (size=4) replication?

2019-07-25 Thread Wido den Hollander
On 7/25/19 8:55 AM, Janne Johansson wrote: > On Wed, 24 Jul 2019 at 21:48, Wido den Hollander wrote: > > Right now I'm just trying to find a clever solution to this. It's a 2k > OSD cluster and the likelihood of a host or OSD crashing is reasonable > while y

Re: [ceph-users] How to add 100 new OSDs...

2019-07-25 Thread Janne Johansson
On Thu, 25 Jul 2019 at 04:36, zhanrzh...@teamsun.com.cn < zhanrzh...@teamsun.com.cn> wrote: > I think you should set "osd_pool_default_min_size=1" before you add OSDs, > and the OSDs that you add at a time should be in the same failure domain. > That sounds like weird or even bad advice? What is the