[ceph-users] Cache pool latency impact

2015-01-14 Thread Pavan Rallabhandi
This is regarding cache pools and the impact of flush/evict on client IO latencies. I am seeing a direct impact on client IO latencies (making them worse) when flush/evict is triggered on the cache pool. Under a constant ingress of IOs on the cache pool, the write performance is no
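
For context, a minimal sketch of the cache-tier thresholds that drive flush/evict (the pool name 'cachepool' and all values here are hypothetical examples, not recommendations):

    # dirty ratio at which the tier starts flushing objects to the base pool
    ceph osd pool set cachepool cache_target_dirty_ratio 0.4
    # full ratio at which the tier starts evicting clean objects
    ceph osd pool set cachepool cache_target_full_ratio 0.8
    # absolute caps that also trigger flush/evict
    ceph osd pool set cachepool target_max_bytes 1099511627776
    ceph osd pool set cachepool target_max_objects 1000000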

Re: [ceph-users] RGW hammer/master woes

2015-03-05 Thread Pavan Rallabhandi
Is anyone else hitting this? Any help on this is much appreciated. -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Pavan Rallabhandi Sent: Saturday, February 28, 2015 11:42 PM To: ceph-us...@ceph.com Subject: [ceph-users] RGW hammer

[ceph-users] RGW hammer/master woes

2015-02-28 Thread Pavan Rallabhandi
I am struggling to get a basic PUT through via the swift client with RGW and Ceph binaries built out of the Hammer/Master codebase, whereas the same command on the same setup goes through with RGW and Ceph binaries built out of Giant. Find below the RGW log snippet and the command that was run. Am I

Re: [ceph-users] FW: RGW performance issue

2015-11-12 Thread Pavan Rallabhandi
If you are on >=hammer builds, you might want to consider the option of using 'rgw_num_rados_handles', which opens up more handles to the cluster from RGW. This would help in scenarios where you have enough OSDs to drive the cluster bandwidth, which I guess is the case with you.
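
As a sketch, the option would sit in the RGW client section of ceph.conf (the instance name and the handle count are hypothetical examples):

    [client.rgw.gateway1]
    # number of RADOS cluster handles RGW opens; more handles can improve parallelism
    rgw num rados handles = 8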

Re: [ceph-users] FW: RGW performance issue

2015-11-13 Thread Pavan Rallabhandi
To: Pavan Rallabhandi Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] FW: RGW performance issue 2015-11-13 5:47 GMT+01:00 Pavan Rallabhandi <pavan.rallabha...@sandisk.com>: > If you are on >=hammer builds, you might want to consider the option > of using 'rgw_num_rados_handles',

Re: [ceph-users] rgw bucket deletion woes

2016-06-15 Thread Pavan Rallabhandi
To update this thread, this is now fixed via https://github.com/ceph/ceph/pull/8679 Thanks! From: Ben Hines <bhi...@gmail.com> Date: Thursday, March 17, 2016 at 4:47 AM To: Yehuda Sadeh-Weinraub <yeh...@redhat.com>

[ceph-users] rgw bucket deletion woes

2016-03-16 Thread Pavan Rallabhandi
Hi, I find this has been discussed here before, but couldn't find any solution, hence the mail. In RGW, for a bucket holding objects in the range of ~millions, one can find it taking forever to delete the bucket (via radosgw-admin). I understand the gc (and its parameters) that would reclaim the
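
For illustration, these are the gc parameters typically tuned to speed up reclamation, shown as a hedged ceph.conf sketch (the instance name and values are examples only):

    [client.rgw.gateway1]
    rgw gc max objs = 97               # number of gc shard objects
    rgw gc obj min wait = 300          # seconds before deleted tail objects become eligible for gc
    rgw gc processor period = 600      # how often the gc thread runs (seconds)
    rgw gc processor max time = 600    # max run time per gc cycle (seconds)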

Re: [ceph-users] CBT results parsing/plotting

2016-07-06 Thread Pavan Rallabhandi
06/2016 03:22 AM, Pavan Rallabhandi wrote: > Wanted to check if there are any readily available tools that the community > is aware of/using for parsing/plotting CBT run results. Am particularly > interested in tools for the CBT librbdfio runs, where in the aggregated > BW/IOPS/L

[ceph-users] CBT results parsing/plotting

2016-07-06 Thread Pavan Rallabhandi
Wanted to check if there are any readily available tools that the community is aware of/using for parsing/plotting CBT run results. Am particularly interested in tools for the CBT librbdfio runs, wherein the aggregated BW/IOPS/Latency reports are generated in either CSV or graphical form. Thanks!

[ceph-users] rgw meta pool

2016-09-08 Thread Pavan Rallabhandi
Trying it one more time on the users list. In our clusters running Jewel 10.2.2, I see the default.rgw.meta pool running into a large number of objects, potentially in the same range as the number of objects contained in the data pool. I understand that the immutable metadata entries are now stored in this heap
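
A quick way to reproduce the observation is to compare per-pool object counts, for example:

    # per-pool usage and object counts
    rados df
    # or scoped to the pool in question
    ceph df detail | grep default.rgw.meta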

Re: [ceph-users] rgw meta pool

2016-09-09 Thread Pavan Rallabhandi
Any help on this is much appreciated; I am considering fixing this, given it's a confirmed issue, unless I am missing something obvious. Thanks, -Pavan. On 9/8/16, 5:04 PM, "ceph-users on behalf of Pavan Rallabhandi" <ceph-users-boun...@lists.ceph.com on behalf of prallabha...@wa

Re: [ceph-users] rgw meta pool

2016-09-10 Thread Pavan Rallabhandi
> 2013585 > default.rgw.meta 25 743M 0 172T > 1975937 > > We're concerned this will be unmanageable over time. > > Warren Wang > > > On 9/9/16, 10:54 AM, "ceph-users on behalf of Pavan Rallabhand

Re: [ceph-users] Same pg scrubbed over and over (Jewel)

2016-09-21 Thread Pavan Rallabhandi
We see this as well in our freshly built Jewel clusters, and it seems to happen only with a handful of PGs from a couple of pools. Thanks! On 9/21/16, 3:14 PM, "ceph-users on behalf of Tobias Böhm" wrote: Hi, there is an

Re: [ceph-users] OSDs are flapping and marked down wrongly

2016-10-17 Thread Pavan Rallabhandi
Regarding mon_osd_min_down_reports, I was looking at it recently; this could provide some insight: https://github.com/ceph/ceph/commit/0269a0c17723fd3e22738f7495fe017225b924a4 Thanks! On 10/17/16, 1:36 PM, "ceph-users on behalf of Somnath Roy"

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
Thanks for verifying at your end Jason. It’s pretty weird that the difference is >~10X, with "rbd_cache_writethrough_until_flush = true" I see ~400 IOPS vs with "rbd_cache_writethrough_until_flush = false" I see them to be ~6000 IOPS. The QEMU cache is none for all of the rbd drives. On that
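
For reference, a sketch of the client-side settings being compared, as they would appear in ceph.conf on the librbd (QEMU) host:

    [client]
    rbd cache = true
    # stay in writethrough mode until the guest issues its first flush
    rbd cache writethrough until flush = true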

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
.” Thanks, -Pavan. On 10/21/16, 11:31 PM, "Jason Dillaman" <jdill...@redhat.com> wrote: On Fri, Oct 21, 2016 at 1:15 PM, Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote: > The QEMU cache is none for all of the rbd drives Hmm -- if you have QEMU cache

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
ll/10797/commits/5498377205523052476ed81aebb2c2e6973f67ef On Fri, Oct 21, 2016 at 7:42 AM, Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote: > I see the fix for write back cache not getting turned on after flush has made into Jewel 10.2.3 ( http://tracker.ceph.com/issue

Re: [ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
objects) block_name_prefix: rbd_data.5ebf12d1934e format: 2 features: layering, striping flags: stripe unit: 4096 kB stripe count: 1 Thanks! On 10/21/16, 7:26 PM, "ceph-users on behalf of Pavan Rallabhandi" <ceph-users-boun...@l

[ceph-users] rbd cache writethrough until flush

2016-10-21 Thread Pavan Rallabhandi
I see the fix for write-back cache not getting turned on after flush has made it into Jewel 10.2.3 ( http://tracker.ceph.com/issues/17080 ) but our testing says otherwise. The cache is still behaving as if it's writethrough, though the setting is set to true. Wanted to check if it’s still broken

Re: [ceph-users] RadosGW and Openstack Keystone revoked tokens

2017-04-21 Thread Pavan Rallabhandi
You may want to look here http://tracker.ceph.com/issues/19499 and http://tracker.ceph.com/issues/9493 Thanks, From: ceph-users on behalf of "magicb...@gmail.com" Date: Friday, 21 April 2017 1:11 pm To: ceph-users

Re: [ceph-users] Speeding up garbage collection in RGW

2017-07-25 Thread Pavan Rallabhandi
If your Ceph version is >=Jewel, you can try the `--bypass-gc` option in radosgw-admin, which would remove the tail objects as well, without marking them to be GCed. Thanks, On 25/07/17, 1:34 AM, "ceph-users on behalf of Bryan Stillwell"
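
A minimal sketch of the invocation (the bucket name is hypothetical):

    # delete the bucket and its objects, removing tail objects inline instead of queueing them for gc
    radosgw-admin bucket rm --bucket=mybucket --purge-objects --bypass-gc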

Re: [ceph-users] Speeding up garbage collection in RGW

2017-07-25 Thread Pavan Rallabhandi
I’ve just realized that the option is present in Hammer (0.94.10) as well; you should try that. From: Bryan Stillwell <bstillw...@godaddy.com> Date: Tuesday, 25 July 2017 at 9:45 PM To: Pavan Rallabhandi <prallabha...@walmartlabs.com>, "ceph-users@lists.ceph.com" <

Re: [ceph-users] FW: radosgw: stale/leaked bucket index entries

2017-06-20 Thread Pavan Rallabhandi
Hi Orit, No, we do not use multi-site. Thanks, -Pavan. From: Orit Wasserman <owass...@redhat.com> Date: Tuesday, 20 June 2017 at 12:49 PM To: Pavan Rallabhandi <prallabha...@walmartlabs.com> Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com> Subject: EXT:

[ceph-users] FW: radosgw: stale/leaked bucket index entries

2017-06-19 Thread Pavan Rallabhandi
Trying one more time with ceph-users On 19/06/17, 11:07 PM, "Pavan Rallabhandi" <prallabha...@walmartlabs.com> wrote: On many of our clusters running Jewel (10.2.5+), I am running into a strange problem of having stale bucket index entries left over for (some of the) objects
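
One way to surface such stale entries is to compare the bucket index against the actual objects, e.g. (bucket name hypothetical):

    # report index entries whose backing objects are missing or inconsistent
    radosgw-admin bucket check --bucket=mybucket --check-objects
    # optionally attempt a repair
    radosgw-admin bucket check --bucket=mybucket --check-objects --fix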

Re: [ceph-users] FW: radosgw: stale/leaked bucket index entries

2017-06-22 Thread Pavan Rallabhandi
Looks like I’ve now got a consistent repro scenario, please find the gory details here http://tracker.ceph.com/issues/20380 Thanks! On 20/06/17, 2:04 PM, "Pavan Rallabhandi" <prallabha...@walmartlabs.com> wrote: Hi Orit, No, we do not use multi-site. Th

Re: [ceph-users] Bucket reporting content inconsistently

2018-05-21 Thread Pavan Rallabhandi
Can possibly be due to these http://tracker.ceph.com/issues/20380, http://tracker.ceph.com/issues/22555 Thanks, From: ceph-users on behalf of Tom W Date: Saturday, May 12, 2018 at 10:57 AM To: ceph-users

Re: [ceph-users] Jewel PG stuck inconsistent with 3 0-size objects

2018-07-16 Thread Pavan Rallabhandi
Yes, that suggestion worked for us, although we hit this when we upgraded to 10.2.10 from 10.2.7. I guess this was fixed via http://tracker.ceph.com/issues/21440 and http://tracker.ceph.com/issues/19404 Thanks, -Pavan. On 7/16/18, 5:07 AM, "ceph-users on behalf of Matthew Vernon"

Re: [ceph-users] Backfilling on Luminous

2018-03-30 Thread Pavan Rallabhandi
‘expected_num_objects’ at the time of pool creation, be aware of this fix http://tracker.ceph.com/issues/22530. Thanks, -Pavan. From: David Turner <drakonst...@gmail.com> Date: Tuesday, March 20, 2018 at 1:50 PM To: Pavan Rallabhandi <prallabha...@walmartlabs.com> Cc: ceph-users <ceph-users
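
For illustration, expected_num_objects is passed at pool-creation time; a rough sketch (pool name, PG count and object count are hypothetical, and the exact argument order should be checked against your release):

    # pre-split filestore collections for a pool expected to hold ~100M objects
    ceph osd pool create default.rgw.buckets.data 2048 2048 replicated replicated_rule 100000000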

Re: [ceph-users] Does jewel 10.2.10 support filestore_split_rand_factor?

2018-04-01 Thread Pavan Rallabhandi
No, it is supported in the next version of Jewel http://tracker.ceph.com/issues/22658 From: ceph-users on behalf of shadow_lin Date: Sunday, April 1, 2018 at 3:53 AM To: ceph-users Subject: EXT: [ceph-users]

Re: [ceph-users] Backfilling on Luminous

2018-03-19 Thread Pavan Rallabhandi
David, Pretty sure you must be aware of the filestore random split on existing OSD PGs, `filestore split rand factor`; maybe you could try that too. Thanks, -Pavan. From: ceph-users on behalf of David Turner Date: Monday, March 19,
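
A sketch of how that would be set in ceph.conf (the value is only an example):

    [osd]
    # randomize the per-PG split threshold so collections do not all split at the same time
    filestore split rand factor = 20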

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-27 Thread Pavan Rallabhandi
“compression=kNoCompression” from the filestore_rocksdb_options And/Or check if rocksdb is expecting snappy to be enabled. Thanks, -Pavan. From: David Turner Date: Thursday, September 27, 2018 at 1:18 PM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph-users] Any backfill in our

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-11-03 Thread Pavan Rallabhandi
Not exactly, this feature was supported in Jewel starting 10.2.11, ref https://github.com/ceph/ceph/pull/18010 I thought you mentioned you were using Luminous 12.2.4. From: David Turner Date: Friday, November 2, 2018 at 5:21 PM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-11-02 Thread Pavan Rallabhandi
It was Redhat-versioned Jewel. But maybe more relevantly, we are on Ubuntu, unlike your case. From: David Turner Date: Friday, November 2, 2018 at 10:24 AM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-11-05 Thread Pavan Rallabhandi
Not sure I understand that, but starting Luminous, the filestore omap backend is rocksdb by default. From: David Turner Date: Monday, November 5, 2018 at 3:25 PM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-29 Thread Pavan Rallabhandi
supported: 0 2018-09-29 17:38:38.629114 7fbd318d4b00 4 rocksdb: LZ4 supported: 0 2018-09-29 17:38:38.629114 7fbd318d4b00 4 rocksdb: ZSTD supported: 0 2018-09-29 17:38:38.629115 7fbd318d4b00 4 rocksdb: Fast CRC32 supported: 0 On 9/27/18, 2:56 PM, "Pavan Rallabhandi" wrote:

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-10-01 Thread Pavan Rallabhandi
Yeah, I think this is something to do with the CentOS binaries, sorry that I couldn’t be of much help here. Thanks, -Pavan. From: David Turner Date: Monday, October 1, 2018 at 1:37 PM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-19 Thread Pavan Rallabhandi
if rocksdb is expecting snappy to be enabled. From: David Turner Date: Tuesday, September 18, 2018 at 6:01 PM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever Here's the [1] full log from the time the OSD

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread Pavan Rallabhandi
The steps that were outlined for conversion are correct; have you tried setting some of the relevant ceph conf values too: filestore_rocksdb_options = "max_background_compactions=8;compaction_readahead_size=2097152;compression=kNoCompression" filestore_omap_backend = rocksdb Thanks, -Pavan.
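
Spelled out as a ceph.conf fragment, the same values would look like this (placement under [osd] is an assumption here):

    [osd]
    filestore omap backend = rocksdb
    filestore rocksdb options = max_background_compactions=8;compaction_readahead_size=2097152;compression=kNoCompression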

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread Pavan Rallabhandi
the omap backend to be leveldb and to do with the compression. Thanks, -Pavan. From: David Turner Date: Tuesday, September 18, 2018 at 5:07 PM To: Pavan Rallabhandi Cc: ceph-users Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever Are those

Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever

2018-09-18 Thread Pavan Rallabhandi
I meant the stack trace hints that the superblock still has leveldb in it, have you verified that already? On 9/18/18, 5:27 PM, "Pavan Rallabhandi" wrote: You should be able to set them under the global section and that reminds me, since you are on Luminous already, I guess th

Re: [ceph-users] Large OMAP Objects in default.rgw.log pool

2019-03-09 Thread Pavan Rallabhandi
That can happen if you have a lot of objects with swift object expiry (TTL) enabled. You can 'listomapkeys' on these log pool objects and check for the objects that have registered for TTL as omap entries. I know this is the case with at least the Jewel version. Thanks, -Pavan. On 3/7/19, 10:09
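
A sketch of that check (the hint object name shown is only an example of the naming pattern):

    # list objects in the log pool, then dump the omap keys of one of the large ones
    rados -p default.rgw.log ls | head
    rados -p default.rgw.log listomapkeys obj_delete_at_hint.0000000001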

Re: [ceph-users] Try to log the IP in the header X-Forwarded-For with radosgw behind haproxy

2019-04-08 Thread Pavan Rallabhandi
Refer "rgw log http headers" under http://docs.ceph.com/docs/nautilus/radosgw/config-ref/ Or even better in the code https://github.com/ceph/ceph/pull/7639 Thanks, -Pavan. On 4/8/19, 8:32 PM, "ceph-users on behalf of Francois Lafont" wrote: Hi @all, I'm using Ceph rados