Re: [ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-19 Thread Kurt Bauer
Known problem, see http://tracker.ceph.com/issues/24326 Would nevertheless be nice to know if it's planned to get this fixed. br, Kurt Ketil Froyn wrote on 19.02.19 08:47: I think there may be something wrong with the apt repository for bionic, actually. Compare the packages available for

Re: [ceph-users] IRC channels now require registered and identified users

2019-02-19 Thread Joao Eduardo Luis
On 02/18/2019 07:17 PM, David Turner wrote: > Is this still broken in the 1-way direction where Slack users' comments > do not show up in IRC?  That would explain why nothing I ever type (as > either helping someone or asking a question) ever have anyone respond to > them. I noticed that

Re: [ceph-users] Prevent rebalancing in the same host?

2019-02-19 Thread Christian Balzer
On Tue, 19 Feb 2019 09:21:21 +0100 Marco Gaiarin wrote: > Little cluster, 3 nodes, 4 OSDs per node. > > An OSD died, and ceph started to rebalance data between the OSDs of the > same node (not completing it, leading to a 'nearfull' warning). > > As there exists: > mon osd down out subtree limit =
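For readers following along: a minimal sketch of where that option lives, plus one blunt way to hold off rebalancing while a dead OSD is replaced (not necessarily what Christian recommends; remember to unset the flag afterwards):

    # ceph.conf on the monitors: don't auto-mark OSDs "out" when a whole
    # host-level subtree goes down (does not cover a single dead OSD)
    [mon]
    mon osd down out subtree limit = host

    # Temporarily stop any rebalancing while swapping the dead OSD
    ceph osd set noout      # and later: ceph osd unset noout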

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Igor Fedotov
Hi Alexander, I think op_w_process_latency includes replication times, not 100% sure though. So restarting other nodes might affect latencies at this specific OSD. Thanks, Igor On 2/16/2019 11:29 AM, Alexandre DERUMIER wrote: There are 10 OSDs in these systems with 96GB of memory in
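For reference, the counter discussed here can be read straight from the OSD admin socket on the OSD host; a small sketch (osd.0 is a placeholder id):

    # Dump perf counters and look at the write-path process latency
    ceph daemon osd.0 perf dump | grep -A 3 '"op_w_process_latency"'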

Re: [ceph-users] RGW: Reshard index of non-master zones in multi-site

2019-02-19 Thread Iain Buclaw
On Wed, 6 Feb 2019 at 09:28, Iain Buclaw wrote: > > On Tue, 5 Feb 2019 at 10:04, Iain Buclaw wrote: > > > > On Tue, 5 Feb 2019 at 09:46, Iain Buclaw wrote: > > > > > > Hi, > > > > > > Following the update of one secondary site from 12.2.8 to 12.2.11, the > > > following warning has come up. >

[ceph-users] Prevent rebalancing in the same host?

2019-02-19 Thread Marco Gaiarin
Little cluster, 3 nodes, 4 OSDs per node. An OSD died, and ceph started to rebalance data between the OSDs of the same node (not completing it, leading to a 'nearfull' warning). As there exists: mon osd down out subtree limit = host to prevent host rebalancing, is there some way to prevent

Re: [ceph-users] RGW: Reshard index of non-master zones in multi-site

2019-02-19 Thread Iain Buclaw
On Tue, 19 Feb 2019 at 09:59, Iain Buclaw wrote: > > On Wed, 6 Feb 2019 at 09:28, Iain Buclaw wrote: > > > > On Tue, 5 Feb 2019 at 10:04, Iain Buclaw wrote: > > > > > > On Tue, 5 Feb 2019 at 09:46, Iain Buclaw wrote: > > > > > > > > Hi, > > > > > > > > Following the update of one secondary

Re: [ceph-users] CephFS: client hangs

2019-02-19 Thread Hennen, Christian
Hi! >mon_max_pg_per_osd = 400 > >In the ceph.conf and then restart all the services / or inject the config >into the running admin I restarted each server (MONs and OSDs weren’t enough) and now the health warning is gone. Still no luck accessing CephFS though. > MDS show a client got evicted.
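A hedged sketch of the two options quoted above (the value 400 comes from the thread; on some releases you may have to address each mon by name instead of using the wildcard):

    # Persist in ceph.conf on the monitor nodes
    [global]
    mon_max_pg_per_osd = 400

    # Or inject into the running monitors without a restart
    ceph tell mon.* injectargs '--mon_max_pg_per_osd=400'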

Re: [ceph-users] RGW: Reshard index of non-master zones in multi-site

2019-02-19 Thread Iain Buclaw
On Tue, 19 Feb 2019 at 10:05, Iain Buclaw wrote: > > On Tue, 19 Feb 2019 at 09:59, Iain Buclaw wrote: > > > > On Wed, 6 Feb 2019 at 09:28, Iain Buclaw wrote: > > > > > > On Tue, 5 Feb 2019 at 10:04, Iain Buclaw wrote: > > > > > > > > On Tue, 5 Feb 2019 at 09:46, Iain Buclaw wrote: > > > > > >

Re: [ceph-users] CephFS: client hangs

2019-02-19 Thread Yan, Zheng
On Tue, Feb 19, 2019 at 5:10 PM Hennen, Christian wrote: > > Hi! > > >mon_max_pg_per_osd = 400 > > > >In the ceph.conf and then restart all the services / or inject the config > >into the running admin > > I restarted each server (MONs and OSDs weren’t enough) and now the health > warning is

[ceph-users] Ceph OSD: how to keep files after umount or reboot vs tempfs ?

2019-02-19 Thread PHARABOT Vincent
Hello Cephers, I have an issue with the OSD device mount on tmpfs with bluestore. On some occasions, I need to keep the files on the tiny bluestore fs (especially the keyring and maybe other useful files needed for the osd to work) on a working OSD. Since the osd partition is mounted as tmpfs, these files are
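For context, bluestore OSDs deployed with ceph-volume keep only small, regenerable files (keyring, symlinks, metadata) in that tmpfs directory; a hedged sketch of recreating them from the LVM tags after a reboot (ids are placeholders):

    # Show which OSDs/LVs ceph-volume knows about
    ceph-volume lvm list
    # Recreate the tmpfs mount and its files for all OSDs, or a single one
    ceph-volume lvm activate --all
    ceph-volume lvm activate <osd-id> <osd-fsid>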

Re: [ceph-users] Ceph OSD: how to keep files after umount or reboot vs tempfs ?

2019-02-19 Thread PHARABOT Vincent
Ok, thank you for the confirmation Burkhard, I'm trying this. Vincent From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of Burkhard Linke Sent: Tuesday, 19 February 2019 13:20 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Ceph OSD: how to keep files after umount or reboot

Re: [ceph-users] Ceph OSD: how to keep files after umount or reboot vs tempfs ?

2019-02-19 Thread Burkhard Linke
Hi, On 2/19/19 11:52 AM, PHARABOT Vincent wrote: Hello Cephers, I have an issue with the OSD device mount on tmpfs with bluestore. On some occasions, I need to keep the files on the tiny bluestore fs (especially the keyring and maybe other useful files needed for the osd to work) on a working OSD

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-19 Thread David Turner
I don't know that there's anything that can be done to resolve this yet without rebuilding the OSD. Based on a Nautilus tool being able to resize the DB device, I'm assuming that Nautilus is also capable of migrating the DB/WAL between devices. That functionality would allow anyone to migrate
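For reference, Nautilus ships ceph-bluestore-tool commands along these lines; a hedged sketch only, run against a stopped OSD and with placeholder paths (check the release notes before relying on it):

    # Grow BlueFS into a resized DB partition
    ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0

    # Move BlueFS (DB) data from the slow device to a new dedicated device
    ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-0 \
        --devs-source /var/lib/ceph/osd/ceph-0/block --dev-target /dev/nvme0n1p1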

[ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Balazs Soltesz
Hi all, I'm experimenting with CephFS as storage to a bitbucket cluster. One problem to tackle is replicating the filesystem contents between ceph clusters in different sites around the globe. I've read about pool replication, but I've also read that replicating pools under a CephFS is not

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Alexandre DERUMIER
>>I think op_w_process_latency includes replication times, not 100% sure >>though. >> >>So restarting other nodes might affect latencies at this specific OSD. Seems to be the case, I have compared with sub_op_latency. I have changed my graph, to clearly identify the osd where the latency is

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Alexandre DERUMIER
Hi, I think that cephfs snap mirroring is coming for nautilus https://www.openstack.org/assets/presentation-media/2018.11.15-openstack-ceph-data-services.pdf (slide 26) But I don't know if it's already ready in master? - Original Message - From: "Vitaliy Filippov" To: "Marc Roos" ,

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-19 Thread Konstantin Shalygin
On 2/19/19 11:46 PM, David Turner wrote: I don't know that there's anything that can be done to resolve this yet without rebuilding the OSD.  Based on a Nautilus tool being able to resize the DB device, I'm assuming that Nautilus is also capable of migrating the DB/WAL between devices.  That

Re: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-19 Thread Brian Topping
> On Feb 19, 2019, at 3:30 PM, Vitaliy Filippov wrote: > > In our russian-speaking Ceph chat we swear at "ceph inside kuber" people all the > time because they often do not understand in what state their cluster is at > all Agreed 100%. This is a really good way to lock yourself out of your data

[ceph-users] krbd: Can I just update the krbd module without updating the kernel?

2019-02-19 Thread Wei Zhao
Hi: Because of some reasons, I can't update the kernel to a higher version. So I wonder if I can just update the krbd kernel module? Has anyone done this before?

Re: [ceph-users] krbd: Can I just update the krbd module without updating the kernel?

2019-02-19 Thread Konstantin Shalygin
Because of some reasons, I can't update the kernel to a higher version. So I wonder if I can just update the krbd kernel module? Has anyone done this before? Of course you can. You "just" need to make a krbd patch from the upstream kernel and apply it to your kernel tree. It's a lot of work and

[ceph-users] How to change/anable/activate a different osd_memory_target value

2019-02-19 Thread Götz Reinicke
Hi, we ran into some OSD node freezes with out-of-memory, eating all swap too. Till we get more physical RAM I'd like to reduce the osd_memory_target, but can't find where and how to set it. We have 24 bluestore disks in 64 GB CentOS nodes with Luminous v12.2.11. Thanks for hints
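A hedged sketch of where the option goes and how it can be applied at runtime; the 2 GiB figure is only an example and has to be sized against 24 OSDs in a 64 GB node (leave headroom for the OS and other daemons):

    # ceph.conf on the OSD nodes
    [osd]
    osd_memory_target = 2147483648    # 2 GiB per OSD, example value only

    # Apply to running OSDs (some releases may still ask for a restart)
    ceph tell osd.* injectargs '--osd_memory_target 2147483648'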

Re: [ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-19 Thread solarflow99
no, but I know that if the wear leveling isn't right then I wouldn't expect them to last long. FW updates on SSDs are very important. On Mon, Feb 18, 2019 at 7:44 AM David Turner wrote: > We have 2 clusters of [1] these disks that have 2 Bluestore OSDs per disk > (partitioned), 3 disks per

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Marc Roos
>> > >I'm not saying CephFS snapshots are 100% stable, but for certain >use-cases they can be. > >Try to avoid: > >- Multiple CephFS in same cluster >- Snapshot the root (/) >- Having a lot of snapshots How many is a lot? Having a lot of snapshots in total? Or having a lot of

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Marc Roos
>> >> >> >> > >> >I'm not saying CephFS snapshots are 100% stable, but for certain >> >use-cases they can be. >> > >> >Try to avoid: >> > >> >- Multiple CephFS in same cluster >> >- Snapshot the root (/) >> >- Having a lot of snapshots >> >> How many is a lot? Having a

Re: [ceph-users] CephFS: client hangs

2019-02-19 Thread Hennen, Christian
> sounds like network issue. are there firewall/NAT between nodes? No, there is currently no firewall in place. Nodes and clients are on the same network. MTUs match, ports are opened according to nmap. > try running ceph-fuse on the node that run mds, check if it works properly. When I try to

Re: [ceph-users] CephFS: client hangs

2019-02-19 Thread David Turner
You're attempting to use mismatching client name and keyring. You want to use matching name and keyring. For your example, you would want to either use `--keyring /etc/ceph/ceph.client.admin.keyring --name client.admin` or `--keyring /etc/ceph/ceph.client.cephfs.keyring --name client.cephfs`.
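Put together, a hedged example of the second variant (the mount point /mnt/cephfs is only an assumption):

    ceph-fuse --keyring /etc/ceph/ceph.client.cephfs.keyring --name client.cephfs /mnt/cephfs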

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-19 Thread David Turner
If your client needs to be able to handle the writes like that on its own, RBDs might be the more appropriate use case. You lose the ability to have multiple clients accessing the data as easily as with CephFS, but you would gain the features you're looking for. On Tue, Feb 12, 2019 at 1:43 PM

Re: [ceph-users] faster switch to another mds

2019-02-19 Thread Fyodor Ustinov
Hi! From the documentation: mds beacon grace Description: The interval without beacons before Ceph declares an MDS laggy (and possibly replace it). Type: Float Default: 15 I do not understand: is 15 seconds or beacons? And an additional misunderstanding - if we gently turn off
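For reference, both values are in seconds; a small ceph.conf sketch with the defaults (the beacon itself is sent every mds_beacon_interval seconds):

    [global]
    mds_beacon_interval = 4    # seconds between MDS beacons (default)
    mds_beacon_grace = 15      # seconds without a beacon before the MDS is marked laggy (default)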

[ceph-users] ceph-ansible try to recreate existing osds in osds.yml

2019-02-19 Thread Jawad Ahmed
Hi all, I have a running cluster deployed with ceph-ansible. Why does ceph-ansible try to recreate, and give errors on, existing osds mentioned in osds.yml? Shouldn't it just skip the existing osds and find disks/volumes which are empty? Some info: I am using osd_scenario: lvm with bluestore. I mean,
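For context, a hedged sketch of the lvm scenario in osds.yml as used around this ceph-ansible release; the VG/LV names are placeholders, and whether already-prepared OSDs are skipped depends on the playbook version:

    # group_vars/osds.yml
    osd_scenario: lvm
    osd_objectstore: bluestore
    lvm_volumes:
      - data: data-lv1        # placeholder LV name
        data_vg: ceph-vg1     # placeholder VG name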

Re: [ceph-users] faster switch to another mds

2019-02-19 Thread David Turner
It's also been mentioned a few times that when MDS and MON are on the same host that the downtime for MDS is longer when both daemons stop at about the same time. It's been suggested to stop the MDS daemon, wait for `ceph mds stat` to reflect the change, and then restart the rest of the server.
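A hedged sketch of that ordering (the systemd unit name usually follows the short hostname, but may differ on your deployment):

    # Stop only the MDS first
    systemctl stop ceph-mds@$(hostname -s)
    # Wait until a standby has taken over
    ceph mds stat
    # Then restart or reboot the rest of the host (mon, osds, ...)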

Re: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-19 Thread David Turner
Have you ever seen an example of a Ceph cluster being run and managed by Rook? It's a really cool idea and takes care of containerizing mons, rgw, mds, etc that I've been thinking about doing anyway. Having those containerized means that if you can upgrade all of the mon services before any of

Re: [ceph-users] Ceph cluster stability

2019-02-19 Thread David Turner
With a RACK failure domain, you should be able to have an entire rack powered down without noticing any major impact on the clients. I regularly take down OSDs and nodes for maintenance and upgrades without seeing any problems with client IO. On Tue, Feb 12, 2019 at 5:01 AM M Ranga Swami Reddy
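For illustration, a hedged example of a rack-level replicated CRUSH rule (rule and pool names are placeholders):

    # Replicated rule that places each copy in a different rack
    ceph osd crush rule create-replicated rack-rule default rack
    # Assign it to a pool
    ceph osd pool set mypool crush_rule rack-rule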

Re: [ceph-users] crush map has straw_calc_version=0 and legacy tunables on luminous

2019-02-19 Thread David Turner
[1] Here is a really cool set of slides from Ceph Day Berlin where Dan van der Ster uses the mgr balancer module with upmap to gradually change the tunables of a cluster without causing major client impact. The down side for you is that upmap requires all luminous or newer clients, but if you
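For reference, the upmap balancer workflow from those slides looks roughly like this; only set the min-compat-client once you are sure every client really is Luminous or newer:

    ceph osd set-require-min-compat-client luminous
    ceph mgr module enable balancer      # if not already enabled
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status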

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Wido den Hollander
On 2/19/19 6:00 PM, Balazs Soltesz wrote: > Hi all, > >   > > I’m experimenting with CephFS as storage to a bitbucket cluster. > >   > > One problems to tackle is replicating the filesystem contents between > ceph clusters in different sites around the globe. > > I’ve read about pool

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Wido den Hollander
On 2/19/19 6:28 PM, Marc Roos wrote: > > >> > > > >I'm not saying CephFS snapshots are 100% stable, but for certain > >use-cases they can be. > > > >Try to avoid: > > > >- Multiple CephFS in same cluster > >- Snapshot the root (/) > >- Having a lot of snapshots > > How many is a

Re: [ceph-users] Replicating CephFS between clusters

2019-02-19 Thread Vitaliy Filippov
Ah, yes, good question. I don't know if there is a true upper limit, but leaving old snapshots around could hurt you when replaying journals and such. Is it still so in mimic? Should I live in fear if I keep old snapshots all the time (because I'm using them as "checkpoints")? :) -- With

Re: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-19 Thread Vitaliy Filippov
In our russian-speaking Ceph chat we swear at "ceph inside kuber" people all the time because they often do not understand in what state their cluster is at all // Sorry to intervene :)) -- With best regards, Vitaliy Filippov

Re: [ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-19 Thread Alexandre DERUMIER
I'm running some S4610s (SSDPE2KE064T8), with firmware VDV10140. I haven't had any problems with them for 6 months. But I remember that around September 2017, Supermicro warned me about a firmware bug on the S4600 (don't know which firmware version). - Original Message - From: "David Turner"