Hello guys,
Thank you so much for your responses, I really appreciate it. But I would
like to mention one more thing which I forgot in my last email: I am
going to use this storage for OpenStack VMs. So will the answer still be
the same, that I should use 1 GB for the WAL?
On Wed, 14 Aug 2019 at
Hi all,
Yesterday I marked out all the osds on one node in our new cluster to
reconfigure them with WAL/DB on their NVMe devices, but it is taking
ages to rebalance. The whole cluster (and thus the osds) is only ~1%
full, therefore the full ratio is nowhere in sight.
We have 14 osd nodes with 12
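A minimal sketch of commands that can help gauge how such a rebalance is
progressing (the OSD id below is just a placeholder):

  # overall health and recovery/backfill progress
  ceph -s
  # per-pool recovery and backfill rates
  ceph osd pool stats
  # current backfill/recovery tunables on one OSD, e.g. osd.0
  ceph daemon osd.0 config show | grep -E 'osd_max_backfills|osd_recovery_max_active'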
Hi, I just double-checked the stack trace and can confirm it is the same
as in the tracker.
Compaction also worked for me; I can now mount CephFS without problems.
Thanks for the help,
Serkan
On Tue, Aug 13, 2019 at 6:44 PM Ilya Dryomov wrote:
>
> On Tue, Aug 13, 2019 at 4:30 PM Serkan Çoban wrote:
> >
> >
On 8/14/19 9:33 AM, Hemant Sonawane wrote:
> Hello guys,
>
> Thank you so much for your responses, I really appreciate it. But I would
> like to mention one more thing which I forgot in my last email: I am
> going to use this storage for OpenStack VMs. So will the answer still
> be the same, that I should use 1 GB for the WAL?
On 8/14/19 9:48 AM, Simon Oosthoek wrote:
> Hi all,
>
> Yesterday I marked out all the osds on one node in our new cluster to
> reconfigure them with WAL/DB on their NVMe devices, but it is taking
> ages to rebalance. The whole cluster (and thus the osds) is only ~1%
> full, therefore the full ratio is nowhere in sight.
On Wed, 14 Aug 2019 at 09:49, Simon Oosthoek wrote:
> Hi all,
>
> Yesterday I marked out all the osds on one node in our new cluster to
> reconfigure them with WAL/DB on their NVMe devices, but it is taking
> ages to rebalance.
>
> > ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
> > c
Hi,
please keep in mind that due to the RocksDB level concept, only certain
DB partition sizes are useful. Larger partitions are a waste of
capacity, since RocksDB will only use whole level sizes.
There has been a lot of discussion about this on the mailing list in the
last months. A plain
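As a rough illustration of the level-size argument (a sketch assuming the
RocksDB defaults used by BlueStore at the time, max_bytes_for_level_base =
256 MB and a level multiplier of 10):

  # sum up RocksDB level sizes to see which DB partition sizes are fully usable
  base=256; mult=10; total=0; level=$base   # sizes in MB, defaults assumed
  for i in 1 2 3 4; do
    total=$((total + level))
    echo "levels up to L$i hold ~${total} MB of DB data"
    level=$((level * mult))
  done
  # prints ~256 MB, ~2.8 GB, ~28 GB, ~284 GB - hence the often-quoted
  # ~3 GB / 30 GB / 300 GB useful partition sizes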
A starting point to debug/fix this would be to extract the osdmap from
one of the dead OSDs:
ceph-objectstore-tool --op get-osdmap --data-path /var/lib/ceph/osd/...
Then try to run osdmaptool on that osdmap to see if it also crashes,
set some --debug options (don't know which one off the top of my
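Roughly, that sequence could look like this (a sketch; the data path, OSD id
and output file name are placeholders):

  # pull the osdmap out of the stopped/dead OSD
  ceph-objectstore-tool --op get-osdmap \
      --data-path /var/lib/ceph/osd/ceph-45 --file osdmap.45
  # inspect it offline; if osdmaptool also crashes, the map itself is damaged
  osdmaptool --print osdmap.45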
Hi Torben,
> Is it allowed to have the scrub period cross midnight? E.g. have the start time at
> 22:00 and the end time at 07:00 the next morning.
Yes, I think that's the way it is mostly used, primarily to reduce the
scrub impact during waking/working hours.
> I assume that if you only configure the on
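For reference, the relevant options can be set like this (a sketch; the
values simply mirror the 22:00-07:00 window from the question, and a window
crossing midnight is expressed by begin > end):

  ceph config set osd osd_scrub_begin_hour 22
  ceph config set osd osd_scrub_end_hour 7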
On Tue, Aug 13, 2019 at 10:56 PM Tim Bishop wrote:
>
> Hi,
>
> This email is mostly a heads up for others who might be using
> Canonical's livepatch on Ubuntu on a CephFS client.
>
> I have an Ubuntu 18.04 client with the standard kernel currently at
> version linux-image-4.15.0-54-generic 4.15.0-
Hi Wido & Hemant.
On 8/14/2019 11:36 AM, Wido den Hollander wrote:
On 8/14/19 9:33 AM, Hemant Sonawane wrote:
Hello guys,
Thank you so much for your responses, I really appreciate it. But I would
like to mention one more thing which I forgot in my last email: I am
going to use this storage for OpenStack VMs.
Got it! I can calculate individual clone usage using “rbd du”, but does
anything exist to show total clone usage across the pool? Otherwise it looks
like phantom space is just missing.
Thanks,
--
Kenneth Van Alstyne
Systems Architect
M: 228.547.8045
15052 Conference Center Dr, Chantilly, VA 2
On Wed, Aug 14, 2019 at 12:44:15PM +0200, Ilya Dryomov wrote:
> On Tue, Aug 13, 2019 at 10:56 PM Tim Bishop wrote:
> > This email is mostly a heads up for others who might be using
> > Canonical's livepatch on Ubuntu on a CephFS client.
> >
> > I have an Ubuntu 18.04 client with the standard kerne
Hello Mike,
see my inline comments.
On 14.08.19 at 02:09, Mike Christie wrote:
>>> -
>>> Previous tests crashed in a reproducible manner with "-P 1" (single io
>>> gzip/gunzip) after a few minutes up to 45 minutes.
>>>
>>> Overview of my tests:
>>>
>>> - SUCCESSFUL: kernel 4.15, ceph 12.2.5
On 14/08/2019 10:44, Wido den Hollander wrote:
>
>
> On 8/14/19 9:48 AM, Simon Oosthoek wrote:
>> Is it a good idea to give the above commands or other commands to speed
>> up the backfilling? (e.g. like increasing "osd max backfills")
>>
>
> Yes, as right now the OSDs aren't doing that many backfills
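Something along these lines is often used to temporarily speed things up (a
sketch; the values are examples, and injectargs changes do not persist across
OSD restarts):

  # raise concurrent backfills and recovery ops on all OSDs at runtime
  ceph tell 'osd.*' injectargs '--osd-max-backfills 8 --osd-recovery-max-active 8'
  # revert to the defaults once the cluster has rebalanced
  ceph tell 'osd.*' injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'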
On Wed, Aug 14, 2019 at 1:54 PM Tim Bishop wrote:
>
> On Wed, Aug 14, 2019 at 12:44:15PM +0200, Ilya Dryomov wrote:
> > On Tue, Aug 13, 2019 at 10:56 PM Tim Bishop wrote:
> > > This email is mostly a heads up for others who might be using
> > > Canonical's livepatch on Ubuntu on a CephFS client.
Hi!
As I understand it, the iSCSI gateway is part of Ceph.
The documentation says:
Note
The iSCSI management functionality of Ceph Dashboard depends on the latest
version 3 of the ceph-iscsi project. Make sure that your operating system
provides the correct version, otherwise the dashboard won’t enable the
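One way to check what is actually installed (a sketch; the package is assumed
to be called ceph-iscsi in version 3, while older releases shipped it as
ceph-iscsi-cli/ceph-iscsi-config):

  # on RPM-based systems
  rpm -q ceph-iscsi
  # on Debian/Ubuntu
  dpkg -l | grep ceph-iscsi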
Paul,
Thanks for the reply. All of these seemed to fail except for pulling
the osdmap from the live cluster.
-Troy
-[~:#]- ceph-objectstore-tool --op get-osdmap --data-path
/var/lib/ceph/osd/ceph-45/ --file osdmap45
terminate called after throwing an instance of
'ceph::buffer::malformed_input'
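If the copy stored in the dead OSD is corrupt but the live cluster still has
that epoch, one possible approach (a sketch; the epoch, paths and file names
are placeholders) is to fetch the map from the cluster and write it back into
the OSD's store:

  # fetch a specific osdmap epoch from the running cluster
  ceph osd getmap 1234 -o osdmap.1234
  # write it into the dead OSD's object store, replacing the damaged copy
  ceph-objectstore-tool --op set-osdmap \
      --data-path /var/lib/ceph/osd/ceph-45/ --file osdmap.1234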
I was able to get this resolved, thanks again to Pierre Dittes!
The reason the recovery did not work the first time I tried it was
because I still had the filesystem mounted (or at least attempted to
have it mounted). This was causing sessions to be active. After
rebooting all the machines which
All;
We're working to deploy our first production Ceph cluster, and we've run into a
snag.
The MONs start, but the "cluster" doesn't appear to come up; ceph -s never
returns.
These are the last lines in the event log of one of the mons:
2019-08-13 16:20:03.706 7f668108f180 0 starting mon.s70
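A couple of local checks usually help when the mons form no quorum and
ceph -s hangs (a sketch; the mon name s70 is taken from the log line above):

  # query the local mon directly over its admin socket (no quorum needed)
  ceph daemon mon.s70 mon_status
  # with the mon stopped: extract and print the on-disk monmap to verify
  # that the v2 addresses use port 3300 and the v1 addresses port 6789
  ceph-mon -i s70 --extract-monmap /tmp/monmap
  monmaptool --print /tmp/monmap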
On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote:
>
> I just had a minor CephFS meltdown caused by underprovisioned RAM on the
> MDS servers. This is a CephFS with two ranks; I manually failed over the
> first rank and the new MDS server ran out of RAM in the rejoin phase
> (ceph-mds didn't get
> Actually a standalone WAL is required when you have either a very small fast
> device (and don't want the DB to use it) or three devices (different in
> performance) behind an OSD (e.g. hdd, ssd, nvme). So the WAL is to be located
> on the fastest one.
>
> For the given use case you just have HDD and NVMe and D
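For the HDD + NVMe case described above, OSD creation could then look
something like this (a sketch; the device names are placeholders):

  # data on the HDD, RocksDB DB (and implicitly the WAL) on the NVMe partition
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1
  # no separate --block.wal: without one, the WAL is kept on the DB device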
On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote:
> On Tue, Aug 13, 2019 at 1:06 PM Hector Martin wrote:
> > I just had a minor CephFS meltdown caused by underprovisioned RAM on the
> > MDS servers. This is a CephFS with two ranks; I manually failed over the
> > first rank and the new MDS ser
On 8/14/19 1:06 PM, solarflow99 wrote:
Actually a standalone WAL is required when you have either a very small fast
device (and don't want the DB to use it) or three devices (different in
performance) behind an OSD (e.g. hdd, ssd, nvme). So the WAL is to be
located on the fastest one.
All;
We found the problem, we had the v2 ports incorrect in the monmap.
Thank you,
Dominic L. Hilsbos, MBA
Director - Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com
I've had no luck in tracing this down. I've tried setting debugging and
log channels to try and find what is failing with no success.
With debug_mgr at 20/20, the logs will show:
log_channel(audit) log [DBG] : from='client.10424012 -'
entity='client.admin' cmd=[{"prefix": "device ls", "
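One more thing that sometimes helps to see where such a command gets stuck is
client-side debugging (a sketch; the options shown are standard Ceph debug
switches):

  # trace the messenger and mon/mgr client while running the failing command
  ceph --debug-ms 1 --debug-monc 20 device ls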
On 08/14/2019 07:35 AM, Marc Schöchlin wrote:
>>> 3. I wonder if we are hitting a bug with PF_MEMALLOC Ilya hit with krbd.
>>> He removed that code from the krbd. I will ping him on that.
>
> Interesting. I activated core dumps for those processes - probably we can
> find something interesting here.
Good points in both posts, but I think there’s still some unclarity.
Absolutely let’s talk about DB and WAL together. By “bluestore goes on flash”
I assume you mean WAL+DB?
“Simply allocate DB and WAL will appear there automatically”
Forgive me please if this is obvious, but I’d like to see a
On 08/14/2019 02:09 PM, Mike Christie wrote:
> On 08/14/2019 07:35 AM, Marc Schöchlin wrote:
3. I wonder if we are hitting a bug with PF_MEMALLOC Ilya hit with krbd.
He removed that code from the krbd. I will ping him on that.
>>
>> Interesting. I activated core dumps for those processes -
On 8/14/19 6:19 PM, Kenneth Van Alstyne wrote:
Got it! I can calculate individual clone usage using “rbd du”, but
does anything exist to show total clone usage across the pool?
Otherwise it looks like phantom space is just missing.
rbd du for each snapshot, I think...
k
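There is no single built-in total that I know of, but rbd du can report every
image and snapshot in a pool in one go, which makes summing easy (a sketch;
the pool name is a placeholder):

  # per-image and per-snapshot provisioned vs. used sizes for the whole pool
  rbd du -p rbd_pool
  # machine-readable output, if you want to sum the used_size fields yourself
  rbd du -p rbd_pool --format json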
On Thu, 15 Aug 2019 at 00:16, Anthony D'Atri wrote:
> Good points in both posts, but I think there’s still some unclarity.
>
...
> We’ve seen good explanations on the list of why only specific DB sizes,
> say 30GB, are actually used _for the DB_.
> If the WAL goes along with the DB, shouldn’t