Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Hector Martin
On 12/06/2019 22.33, Yan, Zheng wrote: > I have tracked down the bug. Thank you for reporting this. 'echo 2 > /proc/sys/vm/drop_caches' should fix the hang. If you can compile ceph from source, please try the following patch. > diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc > index
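As a rough sketch of the workaround quoted above (the assumption being that it is run on the client whose caps are stuck; adjust to your situation):

    # free reclaimable dentries/inodes on the affected CephFS client
    # (2 = slab only; 3 would also drop the page cache)
    sync
    echo 2 > /proc/sys/vm/drop_caches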

[ceph-users] Enable buffered write for bluestore

2019-06-12 Thread Trilok Agarwal
Hi, how can we enable bluestore_default_buffered_write using the ceph-conf utility? Any pointers would be appreciated.
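For reference, a minimal sketch of how this BlueStore option is usually set; whether ceph-conf itself can inject it is left open here, but the common routes are the config file or, on Mimic and later, the monitor config store:

    # ceph.conf route (restart the OSDs afterwards)
    [osd]
    bluestore_default_buffered_write = true

    # centralized config store route (Mimic/Nautilus and later)
    ceph config set osd bluestore_default_buffered_write true

    # check what a running OSD is actually using
    ceph daemon osd.0 config get bluestore_default_buffered_write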

Re: [ceph-users] Verifying current configuration values

2019-06-12 Thread Mark Nelson
On 6/12/19 5:51 PM, Jorge Garcia wrote: I'm following the bluestore config reference guide and trying to change the value for osd_memory_target. I added the following entry in the /etc/ceph/ceph.conf file:   [osd]   osd_memory_target = 2147483648 and restarted the osd daemons doing

[ceph-users] Verifying current configuration values

2019-06-12 Thread Jorge Garcia
I'm following the bluestore config reference guide and trying to change the value for osd_memory_target. I added the following entry in the /etc/ceph/ceph.conf file:   [osd]   osd_memory_target = 2147483648 and restarted the osd daemons doing "systemctl restart ceph-osd.target". Now, how do
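One way to check which value is actually in effect (a sketch, assuming shell access to an OSD host and that osd.0 runs there):

    # ask the running daemon via its admin socket
    ceph daemon osd.0 config get osd_memory_target

    # or ask the monitors for the effective value (Mimic/Nautilus and later)
    ceph config show osd.0 osd_memory_target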

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Wed, 12 Jun 2019, Sage Weil wrote: > On Thu, 13 Jun 2019, Simon Leinen wrote: > > Sage Weil writes: > > >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column > > >> families: [default] > > >> Unrecognized command: stats > > >> ceph-kvstore-tool:

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Simon Leinen
[Sorry for the piecemeal information... it's getting late here] > Oops, I forgot: Before it crashed, it did modify /mnt/ceph/db; the > overall size of that directory increased(!) from 3.9GB to 12GB. The > compaction seems to have eaten two .log files, but created many more > .sst files. ...and

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Thu, 13 Jun 2019, Simon Leinen wrote: > Sage Weil writes: > >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families: > >> [default] > >> Unrecognized command: stats > >> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356: > >>

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Simon Leinen
Simon Leinen writes: > Sage Weil writes: >> Try 'compact' instead of 'stats'? > That run for a while and then crashed, also in the destructor for > rocksdb::Version, but with an otherwise different backtrace. [...] Oops, I forgot: Before it crashed, it did modify /mnt/ceph/db; the overall size

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Simon Leinen
Sage Weil writes: >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families: >> [default] >> Unrecognized command: stats >> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356: >> rocksdb::Version::~Version(): Assertion `path_id < >>

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Wed, 12 Jun 2019, Simon Leinen wrote: > We hope that we can get some access to S3 bucket indexes back, possibly > by somehow dropping and re-creating those indexes. Are all 3 OSDs crashing in the same way? My guess is that the reshard process triggered some massive rocksdb transaction that

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Wed, 12 Jun 2019, Simon Leinen wrote: > Sage Weil writes: > > What happens if you do > > > ceph-kvstore-tool rocksdb /mnt/ceph/db stats > > (I'm afraid that our ceph-kvstore-tool doesn't know about a "stats" > command; but it still tries to open the database.) > > That aborts after

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Simon Leinen
Sage Weil writes: > What happens if you do > ceph-kvstore-tool rocksdb /mnt/ceph/db stats (I'm afraid that our ceph-kvstore-tool doesn't know about a "stats" command; but it still tries to open the database.) That aborts after complaining about many missing files in /mnt/ceph/db. When I ( cd
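For reference, the invocations discussed in this thread look roughly like this (a sketch; /mnt/ceph/db is the directory holding the exported RocksDB files, and subcommand support varies by release):

    # inspect / compact the exported RocksDB
    ceph-kvstore-tool rocksdb /mnt/ceph/db stats
    ceph-kvstore-tool rocksdb /mnt/ceph/db compact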

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Wed, 12 Jun 2019, Simon Leinen wrote: > Dear Sage, > > > Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm > > pretty sure it'll crash in the same spot, but just want to confirm > > it's a bluefs issue. > > To my surprise, this actually seems to have worked: > > $ time

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Simon Leinen
Dear Sage, > Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm > pretty sure it'll crash in the same spot, but just want to confirm > it's a bluefs issue. To my surprise, this actually seems to have worked: $ time sudo ceph-bluestore-tool --out-dir /mnt/ceph bluefs-export
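The export step, roughly as it was run (a sketch; the OSD data path and id NNN are illustrative, not taken from the thread):

    # dump the OSD's BlueFS contents (the RocksDB files) to a plain directory
    time sudo ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-NNN \
        --out-dir /mnt/ceph bluefs-export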

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Wed, 12 Jun 2019, Harald Staub wrote: > On 12.06.19 17:40, Sage Weil wrote: > > On Wed, 12 Jun 2019, Harald Staub wrote: > > > Also opened an issue about the rocksdb problem: > > > https://tracker.ceph.com/issues/40300 > > > > Thanks! > > > > The 'rocksdb: Corruption: file is too short' the

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Harald Staub
On 12.06.19 17:40, Sage Weil wrote: On Wed, 12 Jun 2019, Harald Staub wrote: Also opened an issue about the rocksdb problem: https://tracker.ceph.com/issues/40300 Thanks! The 'rocksdb: Corruption: file is too short' is the root of the problem here. Can you try starting the OSD with

Re: [ceph-users] Error when I compare hashes of export-diff / import-diff

2019-06-12 Thread Rafael Diaz Maurin
On 12/06/2019 at 16:01, Jason Dillaman wrote: On Wed, Jun 12, 2019 at 9:50 AM Rafael Diaz Maurin wrote: Hello Jason, On 11/06/2019 at 15:31, Jason Dillaman wrote: 4- I export the snapshot from the source pool and I import the snapshot towards the destination pool (in the pipe) rbd

[ceph-users] Ceph Cluster Replication / Disaster Recovery

2019-06-12 Thread DHilsbos
All: I'm testing and evaluating Ceph for the next generation of storage architecture for our company, and so far I'm fairly impressed, but I've got a couple of questions around cluster replication and disaster recovery. First, intended uses: Ceph Object Gateway will be used to support new

Re: [ceph-users] [Ceph-large] Large Omap Warning on Log pool

2019-06-12 Thread Aaron Bassett
Correct, it was pre-jewel. I believe we toyed with multisite replication back then so it may have gotten baked into the zonegroup inadvertently. Thanks for the info! > On Jun 12, 2019, at 11:08 AM, Casey Bodley wrote: > > Hi Aaron, > > The data_log objects are storing logs for multisite

[ceph-users] RGW Multisite Q's

2019-06-12 Thread Peter Eisch
Hi, Could someone point me to a blog or documentation page which helps me resolve the issues noted below? All nodes are Luminous, 12.2.12; one realm, one zonegroup (clustered haproxies fronting), two zones (three rgw in each); all endpoint references to each zone go through an haproxy.

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Sage Weil
On Wed, 12 Jun 2019, Harald Staub wrote: > Also opened an issue about the rocksdb problem: > https://tracker.ceph.com/issues/40300 Thanks! The 'rocksdb: Corruption: file is too short' is the root of the problem here. Can you try starting the OSD with 'debug_bluestore=20' and 'debug_bluefs=20'?
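A sketch of one way to capture that logging, assuming the OSD can be run by hand in the foreground on its host (the osd id NNN and log path are illustrative):

    # start the failing OSD in the foreground with verbose bluestore/bluefs logging
    ceph-osd -f -i NNN --debug_bluestore 20 --debug_bluefs 20 \
        --log-file /var/log/ceph/ceph-osd.NNN.debug.log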

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Casey Bodley
Hi Harald, If the bucket reshard didn't complete, it's most likely one of the new bucket index shards that got corrupted here and the original index shard should still be intact. Does $BAD_BUCKET_ID correspond to the new/resharded instance id? If so, once the rocksdb/osd issues are resolved,

Re: [ceph-users] [Ceph-large] Large Omap Warning on Log pool

2019-06-12 Thread Casey Bodley
Hi Aaron, The data_log objects are storing logs for multisite replication. Judging by the pool name '.us-phx2.log', this cluster was created before jewel. Are you (or were you) using multisite or radosgw-agent? If not, you'll want to turn off the logging (log_meta and log_data -> false) in
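A sketch of the zonegroup edit being suggested (field names as they appear in current releases; adjust to whatever the dump actually shows):

    # dump, edit, and reload the zonegroup, then commit the period
    radosgw-admin zonegroup get > zonegroup.json
    #   in zonegroup.json set: "log_meta": "false", "log_data": "false"
    radosgw-admin zonegroup set < zonegroup.json
    radosgw-admin period update --commit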

Re: [ceph-users] RFC: relicence Ceph LGPL-2.1 code as LGPL-2.1 or LGPL-3.0

2019-06-12 Thread Sage Weil
On Fri, 10 May 2019, Sage Weil wrote: > Hi everyone, > > -- What -- > > The Ceph Leadership Team[1] is proposing a change of license from > *LGPL-2.1* to *LGPL-2.1 or LGPL-3.0* (dual license). The specific changes > are described by this pull request: > >

Re: [ceph-users] [Ceph-community] Monitors not in quorum (1 of 3 live)

2019-06-12 Thread Lluis Arasanz i Nonell - Adam
Hi, If there is nothing special about the defined “initial monitors” on the cluster, we’ll try to remove mon01 from the cluster. I mention the “initial monitor” because in our ceph deployment there is only one monitor listed as “initial: [root@mon01 ceph]# cat /etc/ceph/ceph.conf [global] fsid =

Re: [ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Harald Staub
Also opened an issue about the rocksdb problem: https://tracker.ceph.com/issues/40300 On 12.06.19 16:06, Harald Staub wrote: We ended up in a bad situation with our RadosGW (Cluster is Nautilus 14.2.1, 350 OSDs with BlueStore): 1. There is a bucket with about 60 million objects, without shards.

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Nathan Fish
I have run into a similar hang on 'ls .snap' recently: https://tracker.ceph.com/issues/40101#note-2 On Wed, Jun 12, 2019 at 9:33 AM Yan, Zheng wrote: > > On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote: > > > > Hi list, > > > > I have a setup where two clients mount the same filesystem and

[ceph-users] rocksdb corruption, stale pg, rebuild bucket index

2019-06-12 Thread Harald Staub
We ended up in a bad situation with our RadosGW (Cluster is Nautilus 14.2.1, 350 OSDs with BlueStore): 1. There is a bucket with about 60 million objects, without shards. 2. radosgw-admin bucket reshard --bucket $BIG_BUCKET --num-shards 1024 3. Resharding looked fine at first; it counted up to the
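For context, the reshard from step 2 and the usual ways to check on it look roughly like this (a sketch using the same bucket variable as above):

    radosgw-admin bucket reshard --bucket $BIG_BUCKET --num-shards 1024

    # inspect pending/ongoing reshard state afterwards
    radosgw-admin reshard list
    radosgw-admin reshard status --bucket $BIG_BUCKET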

Re: [ceph-users] ceph threads and performance

2019-06-12 Thread Paul Emmerich
If there were an optimal setting, it would be the default. Also, both of these options were removed in Luminous ~2 years ago. Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel:
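One way to see which thread-related options a running OSD still knows about (a sketch, assuming admin-socket access on the OSD host and that osd.0 runs there):

    # list any thread-related settings the daemon actually carries
    ceph daemon osd.0 config show | grep -i thread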

Re: [ceph-users] Error when I compare hashes of export-diff / import-diff

2019-06-12 Thread Jason Dillaman
On Wed, Jun 12, 2019 at 9:50 AM Rafael Diaz Maurin wrote: > > Hello Jason, > > On 11/06/2019 at 15:31, Jason Dillaman wrote: > >> 4- I export the snapshot from the source pool and I import the snapshot > >> towards the destination pool (in the pipe) > >> rbd export-diff --from-snap ${LAST-SNAP}

Re: [ceph-users] ceph threads and performance

2019-06-12 Thread Bastiaan Visser
On both larger and smaller clusters I have never had problems with the default values, so I guess that's a pretty good start. - Original Message - From: "tim taler" To: "Paul Emmerich" Cc: "ceph-users" Sent: Wednesday, June 12, 2019 3:51:43 PM Subject: Re: [ceph-users] ceph threads

Re: [ceph-users] ceph threads and performance

2019-06-12 Thread tim taler
I will look into that, but: is there a rule of thumb to determine the optimal setting for osd disk threads and osd op threads? TIA On Wed, Jun 12, 2019 at 3:22 PM Paul Emmerich wrote: > > > > On Wed, Jun 12, 2019 at 10:57 AM tim taler wrote: >> >> We experience absurdly slow I/O in the VMs and I

Re: [ceph-users] Error when I compare hashes of export-diff / import-diff

2019-06-12 Thread Rafael Diaz Maurin
Hello Jason, On 11/06/2019 at 15:31, Jason Dillaman wrote: 4- I export the snapshot from the source pool and I import the snapshot towards the destination pool (in the pipe) rbd export-diff --from-snap ${LAST-SNAP} ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - | rbd -c ${BACKUP-CLUSTER}
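The pipeline under discussion, plus one way to hash both sides for comparison (a sketch using the placeholder names from this thread; the md5sum step is an illustration of the comparison idea, not necessarily the poster's exact method):

    # replicate the incremental diff to the backup cluster
    rbd export-diff --from-snap ${LAST-SNAP} ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - \
      | rbd -c ${BACKUP-CLUSTER} import-diff - ${POOL-DESTINATION}/${KVM-IMAGE}

    # hash a full export of the same snapshot on each side and compare
    rbd export ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - | md5sum
    rbd -c ${BACKUP-CLUSTER} export ${POOL-DESTINATION}/${KVM-IMAGE}@${TODAY-SNAP} - | md5sum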

Re: [ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Yan, Zheng
On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote: > > Hi list, > > I have a setup where two clients mount the same filesystem and > read/write from mostly non-overlapping subsets of files (Dovecot mail > storage/indices). There is a third client that takes backups by > snapshotting the

[ceph-users] Fwd: ceph threads and performance

2019-06-12 Thread tim taler
Hi all, we have a 5-node ceph cluster with 44 OSDs where all nodes also serve as virtualization hosts, running about 22 virtual machines with all in all about 75 RBDs (158 including snapshots). We experience absurdly slow I/O in the VMs and I suspect our thread settings in ceph.conf to be one of

Re: [ceph-users] ceph threads and performance

2019-06-12 Thread Paul Emmerich
On Wed, Jun 12, 2019 at 10:57 AM tim taler wrote: > We experience absurdly slow I/O in the VMs and I suspect > our thread settings in ceph.conf to be one of the culprits. This is probably not the cause, but someone might be able to help you if you share details on your setup (hardware,

Re: [ceph-users] [Ceph-community] Monitors not in quorum (1 of 3 live)

2019-06-12 Thread Paul Emmerich
On Wed, Jun 12, 2019 at 11:45 AM Lluis Arasanz i Nonell - Adam < lluis.aras...@adam.es> wrote: > - Be careful adding or removing monitors in an unhealthy monitor cluster: if they lose quorum you will run into problems. Safe procedure: remove the dead monitor before adding a new one.
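The safe order described here is, as a sketch (the mon id is taken from this thread):

    # drop the dead monitor from the monmap before deploying its replacement
    ceph mon remove mon01
    ceph mon stat        # confirm the remaining monitors still form a quorum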

Re: [ceph-users] Any CEPH's iSCSI gateway users?

2019-06-12 Thread Paul Emmerich
On Wed, Jun 12, 2019 at 6:48 AM Glen Baars wrote: > Interesting performance increase! I'm running iSCSI at a few installations and now I wonder what version of CentOS is required to improve performance! Did the cluster go from Luminous to Mimic? Wild guess: probably related to updating

Re: [ceph-users] [Ceph-community] Monitors not in quorum (1 of 3 live)

2019-06-12 Thread Lluis Arasanz i Nonell - Adam
Hi all, Here is our story. Perhaps some day it could help someone. Bear in mind that English is not my native language, so sorry if I make mistakes. Our system is: Ceph 0.87.2 (Giant), with 5 OSD servers (116 1TB OSDs total) and 3 monitors. After a nightmare time, we initially "corrected" the ceph monitor

Re: [ceph-users] Expected IO in luminous Ceph Cluster

2019-06-12 Thread Виталий Филиппов
Hi Felix, Better use fio, like: fio -ioengine=rbd -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128 -rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg (for peak parallel random IOPS), or the same with -iodepth=1 for the latency test. Here you usually get
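Spelled out, the two suggested runs look like this (pool and image names are the poster's examples):

    # peak parallel random-write IOPS against an RBD image
    fio -ioengine=rbd -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128 \
        -rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg

    # same job at iodepth=1 to measure latency instead
    fio -ioengine=rbd -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=1 \
        -rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg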

[ceph-users] ceph threads and performance

2019-06-12 Thread tim taler
Hi all, we have a 5-node ceph cluster with 44 OSDs where all nodes also serve as virtualization hosts, running about 22 virtual machines with all in all about 75 RBDs (158 including snapshots). We experience absurdly slow I/O in the VMs and I suspect our thread settings in ceph.conf to be one of

[ceph-users] MDS getattr op stuck in snapshot

2019-06-12 Thread Hector Martin
Hi list, I have a setup where two clients mount the same filesystem and read/write from mostly non-overlapping subsets of files (Dovecot mail storage/indices). There is a third client that takes backups by snapshotting the top-level directory, then rsyncing the snapshot over to another location.

Re: [ceph-users] Large OMAP object in RGW GC pool

2019-06-12 Thread Wido den Hollander
On 6/11/19 9:48 PM, J. Eric Ivancich wrote: > Hi Wido, > > Interleaving below > > On 6/11/19 3:10 AM, Wido den Hollander wrote: >> >> I thought it was resolved, but it isn't. >> >> I counted all the OMAP values for the GC objects and I got back: >> >> gc.0: 0 >> gc.11: 0 >> gc.14: 0 >>
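The per-object counting referenced above can be done with rados directly; a sketch, assuming the default 32 GC shards and a default-named log pool (the pool and namespace differ between RGW versions, so adjust to your layout):

    # count omap keys on each RGW GC object
    for i in $(seq 0 31); do
      printf 'gc.%s: ' "$i"
      rados -p default.rgw.log --namespace gc listomapkeys gc.$i | wc -l
    done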

Re: [ceph-users] MDS hangs in "heartbeat_map" deadlock

2019-06-12 Thread Stefan Kooman
Quoting Patrick Donnelly (pdonn...@redhat.com): > Hi Stefan, > > Sorry I couldn't get back to you sooner. NP. > Looks like you hit the infinite loop bug in OpTracker. It was fixed in > 12.2.11: https://tracker.ceph.com/issues/37977 > > The problem was introduced in 12.2.8. We've been quite