Re: [ceph-users] Mimic 13.2.3?

2019-01-08 Thread David Galloway
On 1/8/19 9:05 AM, Matthew Vernon wrote: > Dear Greg, > > On 04/01/2019 19:22, Gregory Farnum wrote: > >> Regarding Ceph releases more generally: > > [snip] > >> I imagine we will discuss all this in more detail after the release, >> but everybody's patience is appreciated as we work through

Re: [ceph-users] Mimic 13.2.3?

2019-01-08 Thread Ken Dreyer
On Fri, Jan 4, 2019 at 12:23 PM Gregory Farnum wrote: > I imagine we will discuss all this in more detail after the release, > but everybody's patience is appreciated as we work through these > challenges. We have some people on the list asking for more frequent releases, and some people on the

[ceph-users] Ceph Dashboard Rewrite

2019-01-08 Thread Marc Schöchlin
Hello ceph-users, we are using ceph luminous 12.2.10. We run 3 mgrs - if I access the dashboard on a non-active mgr I get a location redirect to the hostname. Because this is not an FQDN, I cannot access the dashboard in a convenient way because my workstation does not append the datacenter

Re: [ceph-users] Ceph Dashboard Rewrite

2019-01-08 Thread Wido den Hollander
On 1/8/19 8:58 PM, Marc Schöchlin wrote: > Hello ceph-users, > > we are using ceph luminous 12.2.10. > > We run 3 mgrs - if I access the dashboard on a non-active mgr I get a > location redirect to the hostname. > Because this is not an FQDN, I cannot access the dashboard in a convenient >
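
One possible workaround, assuming the documented Luminous dashboard config keys, is to bind each mgr's dashboard to an explicit address (or put all mgrs behind a reverse proxy with a stable FQDN); $name and the address below are placeholders:

# bind the dashboard of a given mgr instance to an explicit address/port
ceph config-key set mgr/dashboard/$name/server_addr 192.0.2.10
ceph config-key set mgr/dashboard/$name/server_port 7000
# re-enable the module (or restart the mgr) so the change takes effect
ceph mgr module disable dashboard
ceph mgr module enable dashboard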

Re: [ceph-users] ceph health JSON format has changed

2019-01-08 Thread Gregory Farnum
On Fri, Jan 4, 2019 at 1:19 PM Jan Kasprzak wrote: > > Gregory Farnum wrote: > : On Wed, Jan 2, 2019 at 5:12 AM Jan Kasprzak wrote: > : > : > Thomas Byrne - UKRI STFC wrote: > : > : I recently spent some time looking at this, I believe the 'summary' and > : > : 'overall_status' sections are now
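
For scripts that used to read 'overall_status' or 'summary', the same information now lives under 'status' and 'checks' in the JSON output; a minimal sketch of reading the new fields, assuming jq is available:

# overall state: HEALTH_OK / HEALTH_WARN / HEALTH_ERR
ceph status --format json | jq -r '.health.status'
# names of the individual health checks currently firing
ceph health --format json | jq -r '.checks | keys[]'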

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Yoann Moulin
Hello, > Hi guys, I need your help. > I'm new with Cephfs and we started using it as file storage. > Today we are getting no space left on device but I'm seeing that we have > plenty space on the filesystem. > Filesystem              Size  Used Avail Use% Mounted on >

Re: [ceph-users] Mimic 13.2.3?

2019-01-08 Thread Matthew Vernon
Dear Greg, On 04/01/2019 19:22, Gregory Farnum wrote: > Regarding Ceph releases more generally: [snip] > I imagine we will discuss all this in more detail after the release, > but everybody's patience is appreciated as we work through these > challenges. Thanks for this. Could you confirm

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi Yoann, thanks for your response. Here are the results of the commands. root@pf-us1-dfs2:/var/log/ceph# ceph osd df ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS 0 hdd 7.27739 1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310 5 hdd 7.27739 1.0 7.3 TiB 5.6 TiB 1.7 TiB

Re: [ceph-users] Is it possible to increase Ceph Mon store?

2019-01-08 Thread Dan van der Ster
On Tue, Jan 8, 2019 at 12:48 PM Thomas Byrne - UKRI STFC wrote: > > For what it's worth, I think the behaviour Pardhiv and Bryan are describing > is not quite normal, and sounds similar to something we see on our large > luminous cluster with elderly (created as jewel?) monitors. After large >

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Kevin Olbrich
It would, but you should not: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-December/014846.html Kevin On Tue, Jan 8, 2019 at 15:35, Rodrigo Embeita wrote: > > Thanks again Kevin. > If I reduce the size flag to a value of 2, that should fix the problem? > > Regards > > On Tue,

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Yoann Moulin
> root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule dump > [ >    { >    "rule_id": 0, >    "rule_name": "replicated_rule", >    "ruleset": 0, >    "type": 1, >    "min_size": 1, >    "max_size": 10, >    "steps": [ >    { >    "op": "take", >

[ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi guys, I need your help. I'm new with Cephfs and we started using it as file storage. Today we are getting no space left on device but I'm seeing that we have plenty space on the filesystem. Filesystem Size Used Avail Use% Mounted on

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Kevin Olbrich
You use replication 3 failure-domain host. OSD 2 and 4 are full, that's why your pool is also full. You need to add two disks to pf-us1-dfs3 or swap one from the larger nodes to this one. Kevin On Tue, Jan 8, 2019 at 15:20, Rodrigo Embeita wrote: > > Hi Yoann, thanks for your response. >
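
To confirm the replica count and failure domain Kevin describes, the pool and its CRUSH rule can be inspected directly; a sketch using the pool and rule names seen elsewhere in this thread:

ceph osd pool get cephfs_data size          # replica count (3 here)
ceph osd pool get cephfs_data crush_rule    # rule used by the pool
ceph osd crush rule dump replicated_rule    # shows "chooseleaf ... type host"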

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Yoann Moulin
Hello, > Hi Yoann, thanks for your response. > Here are the results of the commands. > > root@pf-us1-dfs2:/var/log/ceph# ceph osd df > ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS > 0 hdd 7.27739 1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310 > 5 hdd 7.27739 1.0

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Thanks again Kevin. If I reduce the size flag to a value of 2, that should fix the problem? Regards On Tue, Jan 8, 2019 at 11:28 AM Kevin Olbrich wrote: > You use replication 3 failure-domain host. > OSD 2 and 4 are full, that's why your pool is also full. > You need to add two disks to

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Kevin Olbrich
Looks like the same problem as mine: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032054.html The free space shown is the total, while Ceph is limited by the smallest free space (the fullest OSD). Please check your (re-)weights. Kevin On Tue, Jan 8, 2019 at 14:32, Rodrigo Embeita wrote: > >

Re: [ceph-users] rocksdb mon stores growing until restart

2019-01-08 Thread Wido den Hollander
On 8/30/18 10:28 AM, Dan van der Ster wrote: > Hi, > > Is anyone else seeing rocksdb mon stores slowly growing to >15GB, > eventually triggering the 'mon is using a lot of disk space' warning? > > Since upgrading to luminous, we've seen this happen at least twice. > Each time, we restart all
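
If restarting the mons is what triggers the trim, an online compaction may achieve the same without a restart; a hedged sketch (the mon id and config placement are examples):

# compact one monitor's store on the fly
ceph tell mon.mon01 compact
# or compact automatically at every monitor (re)start (ceph.conf, [mon] section)
mon compact on start = true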

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
I believe I found something but I don't know how to fix it. I ran "ceph df" and I'm seeing that cephfs_data and cephfs_metadata are at 100% USED. How can I increase the cephfs_data and cephfs_metadata pools? Sorry, I'm new with Ceph. root@pf-us1-dfs1:/etc/ceph# ceph df GLOBAL: SIZE AVAIL
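
The %USED and MAX AVAIL figures in "ceph df" are derived from the fullest OSD the pool's CRUSH rule can write to, so one nearly-full OSD caps the whole pool even if other OSDs still have space; comparing the per-pool and per-OSD views makes this visible:

ceph df detail     # per-pool USED / MAX AVAIL
ceph osd df tree   # per-OSD %USE; the highest %USE in the rule drives MAX AVAIL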

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi Kevin, thanks for your answer. How can I check the (re-)weights? On Tue, Jan 8, 2019 at 10:36 AM Kevin Olbrich wrote: > Looks like the same problem as mine: > > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032054.html > > The free space is total while Ceph uses the
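
The reweights Kevin mentions appear in the REWEIGHT column of "ceph osd df tree"; a hedged sketch of checking and adjusting them (osd.2 and the 0.95 value are only examples):

ceph osd df tree                        # compare WEIGHT vs REWEIGHT per OSD
ceph osd reweight osd.2 0.95            # push some data off an overfull OSD
ceph osd test-reweight-by-utilization   # dry run of automatic reweighting
ceph osd reweight-by-utilization        # apply it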

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi Yoann, thanks a lot for your help. root@pf-us1-dfs3:/home/rodrigo# ceph osd crush tree ID CLASS WEIGHT TYPE NAME -1 72.77390 root default -3 29.10956 host pf-us1-dfs1 0 hdd 7.27739 osd.0 5 hdd 7.27739 osd.5 6 hdd 7.27739 osd.6 8 hdd

Re: [ceph-users] Is it possible to increase Ceph Mon store?

2019-01-08 Thread Wido den Hollander
On 1/7/19 11:15 PM, Pardhiv Karri wrote: > Thank you Bryan, for the information. We have 816 OSDs of size 2TB each. > The 'mon store is too big' warning popped up when no rebalancing happened in that > month. It is slightly above the 15360 threshold, around 15900 or 16100, > and stayed there for more than a
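
For reference, 15360 MB lines up with the default mon_data_size_warn threshold of 15 GiB; if the store size is expected, the threshold can simply be raised. A sketch, assuming the option takes bytes (20 GiB shown) and that injectargs is acceptable on these mons:

# raise the warning threshold at runtime on all monitors
ceph tell mon.* injectargs '--mon_data_size_warn=21474836480'
# and persist it in ceph.conf under [mon]
mon data size warn = 21474836480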

[ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread David Young
Hi all, One of my OSD hosts recently ran into RAM contention (was swapping heavily), and after rebooting, I'm seeing this error on random OSDs in the cluster: --- Jan 08 03:34:36 prod1 ceph-osd[3357939]: ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable) Jan 08

Re: [ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread Paul Emmerich
I've seen this before a few times but unfortunately there doesn't seem to be a good solution at the moment :( See also: http://tracker.ceph.com/issues/23145 Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247

Re: [ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread Sage Weil
I've seen this on luminous, but not on mimic. Can you generate a log with debug osd = 20 leading up to the crash? Thanks! sage On Tue, 8 Jan 2019, Paul Emmerich wrote: > I've seen this before a few times but unfortunately there doesn't seem > to be a good solution at the moment :( > > See
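
For anyone else hitting this, one way to capture the log Sage asks for is to raise the OSD debug level before the crash, either persistently in ceph.conf or at runtime via the admin socket, then attach the resulting /var/log/ceph/ceph-osd.N.log to the tracker issue (the osd id below is an example):

# ceph.conf on the affected host, then restart the OSD
[osd]
    debug osd = 20

# or at runtime, without a restart
ceph daemon osd.12 config set debug_osd 20/20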

[ceph-users] All monitors fail

2019-01-08 Thread Fatih BİLGE
Hi, I use the mimic version in my test environment and have lost all monitors. I found a piece of code to save my cluster but I don't know how to use it. Can anybody help me, please? Thank you. Fatih

ms=/tmp/mon-store
mkdir $ms
# collect the cluster map from OSDs
for host in $hosts; do
  rsync -avz $ms
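
The fragment above is the start of the "recovery using OSDs" procedure from the Ceph monitor troubleshooting documentation; a hedged sketch of the whole loop follows (hostnames, paths and the keyring are placeholders - check the docs for your exact release before running it). The rebuilt store then replaces /var/lib/ceph/mon/<mon-id>/store.db on one monitor, which is started alone before the others are re-added.

ms=/tmp/mon-store
mkdir $ms
# pull the cluster maps out of every (stopped) OSD on every host
for host in $hosts; do
  rsync -avz $ms/. root@$host:$ms.remote
  rm -rf $ms
  ssh root@$host <<EOF
    for osd in /var/lib/ceph/osd/ceph-*; do
      ceph-objectstore-tool --data-path \$osd \
        --op update-mon-db --mon-store-path $ms.remote
    done
EOF
  rsync -avz root@$host:$ms.remote/. $ms
done
# rebuild a monitor store from the collected maps (cephx keyring needed)
ceph-monstore-tool $ms rebuild -- --keyring /path/to/admin.keyring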

Re: [ceph-users] ceph-mgr fails to restart after upgrade to mimic

2019-01-08 Thread Randall Smith
Thanks to everyone who has tried to help so far. I have filed a bug report on this issue at http://tracker.ceph.com/issues/37835. I hope we can get this fixed so I can finish this upgrade. On Fri, Jan 4, 2019 at 7:26 AM Randall Smith wrote: > Greetings, > > I'm upgrading my cluster from

Re: [ceph-users] [Ceph-maintainers] v13.2.4 Mimic released

2019-01-08 Thread Patrick Donnelly
On Mon, Jan 7, 2019 at 7:10 AM Alexandre DERUMIER wrote: > > Hi, > > >>* Ceph v13.2.2 includes a wrong backport, which may cause mds to go into > >>'damaged' state when upgrading Ceph cluster from previous version. > >>The bug is fixed in v13.2.3. If you are already running v13.2.2, > >>upgrading

Re: [ceph-users] [Ceph-maintainers] v13.2.4 Mimic released

2019-01-08 Thread Neha Ojha
When upgrading from 13.2.1 to 13.2.4, you should be careful about http://tracker.ceph.com/issues/36686. It might be worth considering the workaround mentioned here: https://github.com/ceph/ceph/blob/master/doc/releases/mimic.rst#v1322-mimic. Thanks, Neha On Tue, Jan 8, 2019 at 9:42 AM Patrick

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Janne Johansson
On Tue, Jan 8, 2019 at 16:05, Yoann Moulin wrote: > The best thing you can do here is to add two disks to pf-us1-dfs3. After that, get a fourth host with 4 OSDs on it and add it to the cluster. If you have 3 replicas (which is good!), then any downtime will mean the cluster is kept in a degraded

Re: [ceph-users] osdmaps not being cleaned up in 12.2.8

2019-01-08 Thread Bryan Stillwell
I was able to get the osdmaps to slowly trim (maybe 50 would trim with each change) by making small changes to the CRUSH map like this:

for i in {1..100}; do
  ceph osd crush reweight osd.1754 4.1
  sleep 5
  ceph osd crush reweight osd.1754 4
  sleep 5
done

I believe this was the

Re: [ceph-users] ceph-mgr fails to restart after upgrade to mimic

2019-01-08 Thread Paul Emmerich
Can you try a 13.2.2 mgr? Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Mon, Jan 7, 2019 at 11:52 PM Randall Smith wrote: > > More follow up because, obviously,

Re: [ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread Peter Woodman
For the record, in the linked issue, it was thought that this might be due to write caching. This seems not to be the case, as it happened again to me with write caching disabled. On Tue, Jan 8, 2019 at 11:15 AM Sage Weil wrote: > > I've seen this on luminous, but not on mimic. Can you generate

Re: [ceph-users] Is it possible to increase Ceph Mon store?

2019-01-08 Thread Thomas Byrne - UKRI STFC
For what it's worth, I think the behaviour Pardhiv and Bryan are describing is not quite normal, and sounds similar to something we see on our large luminous cluster with elderly (created as jewel?) monitors. After large operations which result in the mon stores growing to 20GB+, leaving the