Re: [ceph-users] Stale pg_upmap_items entries after pg increase

2018-11-20 Thread Rene Diepstraten
Thanks very much, I can use this. It would be nice if the balancer module had functionality to check/cleanup these stale entries. I may create an issue for this. On 20/11/2018 17:37, Dan van der Ster wrote: I've noticed the same and have a script to help find these:
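A minimal manual sketch of the cleanup that Dan's script (linked further down in this digest) automates; jq is assumed to be installed and the PG id is a made-up example:

```
# List the current upmap exceptions recorded in the osdmap.
ceph osd dump -f json | jq '.pg_upmap_items'

# Remove an entry that no longer matches the current PG set after the
# pg_num change; "1.2f" is a hypothetical PG id.
ceph osd rm-pg-upmap-items 1.2f
```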

[ceph-users] s3 bucket policies and account suspension

2018-11-20 Thread Graham Allan
While tinkering with bucket policies, I noticed that where a bucket policy grants access to additional users, if the owner account for that bucket is suspended, then the bucket also becomes inaccessible to those other users. I'm not sure if this is an unexpected bug or just the way bucket policies
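For context, a hedged example of the kind of policy under discussion, granting a second RGW user access to a bucket; the bucket name, user name and endpoint are placeholders, not taken from the thread:

```
# Hypothetical policy allowing "otheruser" to read from "mybucket".
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/otheruser"]},
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
  }]
}
EOF
aws --endpoint-url=http://rgw.example.com:8080 \
    s3api put-bucket-policy --bucket mybucket --policy file://policy.json
```

The behaviour reported above is that once the bucket owner is suspended, a grant like this stops working for the other user as well.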

Re: [ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-11-20 Thread Yan, Zheng
You can run a 13.2.1 MDS on another machine. Kill all client sessions and wait until the purge queue is empty; then it's safe to run the 13.2.2 MDS. Run the command "cephfs-journal-tool --rank=cephfs_name:rank --journal=purge_queue header get"; the purge queue is empty when write_pos == expire_pos. On Wed, Nov
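A short sketch of the check Zheng describes, assuming a filesystem named cephfs and rank 0 (substitute your own names):

```
# Dump the purge queue journal header for rank 0 of filesystem "cephfs".
cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue header get

# The queue is drained once these two values are equal.
cephfs-journal-tool --rank=cephfs:0 --journal=purge_queue header get \
  | grep -E 'write_pos|expire_pos'
```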

[ceph-users] MDS damaged after mimic 13.2.1 to 13.2.2 upgrade

2018-11-20 Thread Chris Martin
I am also having this problem. Zheng (or anyone else), any idea how to perform this downgrade on a node that is also a monitor and an OSD node? dpkg complains of a dependency conflict when I try to install ceph-mds_13.2.1-1xenial_amd64.deb: ``` dpkg: dependency problems prevent configuration of

Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-20 Thread Patrick Donnelly
You either need to accept that reads/writes will land on different data centers, ensure the primary OSD for a given pool is always in the desired data center, or use some other non-Ceph solution, which will have either expensive, eventual, or false consistency. On Fri, Nov 16, 2018, 10:07 AM Vlad Kopylov This
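One possible way to keep the primary (and therefore reads) in the preferred data centre is primary affinity; a hedged sketch with made-up OSD ids, and note this is per-OSD rather than per-pool:

```
# De-prefer OSDs in the remote data centre as primaries so reads are served
# locally; osd ids are hypothetical and this affects every pool using them.
for osd in 12 13 14; do
  ceph osd primary-affinity osd.$osd 0
done
```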

Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-20 Thread Vlad Kopylov
I see the point, but not for the read case: there is no overhead in just choosing, or letting a mount option choose, the read replica. This is a simple feature that could be implemented and would save many people bandwidth in truly distributed cases. The main issue this surfaces is that RADOS maps ignore clients - they

[ceph-users] Reply: Re: Stale pg_upmap_items entries after pg increase

2018-11-20 Thread xie.xingguo
I've sent a PR (https://github.com/ceph/ceph/pull/25196) for the issue below, which might help. Original message - From: Rene Diepstraten; To: Dan van der Ster; Cc: ceph-users; Date: 2018-11-21 05:26; Subject: Re: [ceph-users] Stale pg_upmap_items entries after pg increase - Thanks very much, I

Re: [ceph-users] how to mount one of the cephfs namespace using ceph-fuse?

2018-11-20 Thread Yan, Zheng
ceph-fuse --client_mds_namespace=xxx On Tue, Nov 20, 2018 at 7:33 PM ST Wong (ITSC) wrote: Hi all, We're using mimic with the multiple-fs flag enabled. We can do a kernel mount of a particular fs (e.g. fs1) with the mount option mds_namespace=fs1. However, this is not working for ceph-fuse:
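Side by side, for reference; the monitor address and secret file are placeholders:

```
# Kernel client, as already working in the thread:
mount -t ceph mon1.example.com:6789:/ /mnt/fs1 \
  -o name=acapp3,secretfile=/etc/ceph/acapp3.secret,mds_namespace=fs1

# ceph-fuse uses its own option name for the same selection:
ceph-fuse -n client.acapp3 --client_mds_namespace=fs1 /tmp/ceph
```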

Re: [ceph-users] Huge latency spikes

2018-11-20 Thread Alex Litvak
John, If I go with write through, shouldn't disk cache be enabled? On 11/20/2018 6:12 AM, John Petrini wrote: I would disable cache on the controller for your journals. Use write through and no read ahead. Did you make sure the disk cache is disabled? On Tuesday, November 20, 2018, Alex

Re: [ceph-users] Huge latency spikes

2018-11-20 Thread John Petrini
I would disable cache on the controller for your journals. Use write through and no read ahead. Did you make sure the disk cache is disabled? On Tuesday, November 20, 2018, Alex Litvak wrote: > I went through a RAID controller firmware update. I replaced a pair of SSDs with new ones. Nothing
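A hedged sketch of checking and disabling the on-disk volatile write cache for SATA drives exposed directly to the OS; RAID controllers and SAS drives need their vendor tools instead, and the device name is an example:

```
# Query the drive's volatile write cache state.
hdparm -W /dev/sda

# Disable it (on many drives this does not survive a power cycle).
hdparm -W 0 /dev/sda
```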

[ceph-users] how to mount one of the cephfs namespace using ceph-fuse?

2018-11-20 Thread ST Wong (ITSC)
Hi all, We're using mimic with the multiple-fs flag enabled. We can do a kernel mount of a particular fs (e.g. fs1) with the mount option mds_namespace=fs1. However, this is not working for ceph-fuse: #ceph-fuse -n client.acapp3 -o mds_namespace=fs1 /tmp/ceph 2018-11-20 19:30:35.246 7ff5653edcc0 -1

Re: [ceph-users] Ceph pure ssd strange performance.

2018-11-20 Thread Darius Kasparavičius
Update: I rebuilt the OSD with a separate DB partition on the SSD drive, and the disk I/O is now what I expected, about ~3x the client I/O. On Tue, Nov 20, 2018 at 11:30 AM Darius Kasparavičius wrote: > > Hello, > > > I'm running some tests on pure SSD pool with mimic and bluestore. > Strange

[ceph-users] Ceph pure ssd strange performance.

2018-11-20 Thread Darius Kasparavičius
Hello, I'm running some tests on a pure SSD pool with mimic and bluestore. The strange thing is that when running fio against rbd images I'm seeing a huge difference between client and disk I/O. For pure write performance I'm seeing about ~20k IOPS on the client side and about ~300k on the SSD side. I have
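For reference, a hedged example of the kind of fio job used for such a test with the librbd engine; pool and image names are placeholders:

```
# Random 4k writes against an existing RBD image via librbd.
fio --name=rbd-randwrite --ioengine=rbd --clientname=admin \
    --pool=ssd-pool --rbdname=testimg \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
    --time_based --runtime=60 --group_reporting
```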

[ceph-users] bucket indices: ssd-only or is a large fast block.db sufficient?

2018-11-20 Thread Dan van der Ster
Hi ceph-users, Most of our servers have 24 HDDs plus 4 SSDs. Any experience with how these should be configured to get the best RGW performance? We have two options: 1) All OSDs the same, with data on the HDD and block.db on a 40GB SSD partition; 2) Two OSD device types: hdd-only for the rgw
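A hedged sketch of what the two options look like in practice; device paths, rule and pool names are examples:

```
# Option 1: every OSD identical, data on HDD, block.db on a 40GB SSD partition.
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/sdq1

# Option 2: dedicated SSD OSDs plus a CRUSH rule pinning the index pool to them.
ceph osd crush rule create-replicated ssd-rule default host ssd
ceph osd pool set default.rgw.buckets.index crush_rule ssd-rule
```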

Re: [ceph-users] Huge latency spikes

2018-11-20 Thread Ashley Merrick
Quite a few others and I have had high random-latency issues with disk cache enabled. - Ash On Tue, 20 Nov 2018 at 9:09 PM, Alex Litvak wrote: > John, > > If I go with write through, shouldn't disk cache be enabled? > > On 11/20/2018 6:12 AM, John Petrini wrote: > > I would disable cache on

Re: [ceph-users] bucket indices: ssd-only or is a large fast block.db sufficient?

2018-11-20 Thread Gregory Farnum
Looks like you’ve considered the essential points for bluestore OSDs, yep. :) My concern would just be the surprisingly-large block.db requirements for rgw workloads that have been brought up. (300+GB per OSD, I think someone saw/worked out?). -Greg On Tue, Nov 20, 2018 at 1:35 AM Dan van der

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Igor Fedotov
Hi Florian, what's your Ceph version? Can you also check the output for ceph-bluestore-tool show-label -p It should report 'size' labels for every volume, please check they contain new values. Thanks, Igor On 11/20/2018 5:29 PM, Florian Engelmann wrote: Hi, today we migrated all of
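A small sketch of the check Igor asks for, assuming jq is available and OSD 0 is the one that was migrated:

```
# Print just the 'size' label of every volume belonging to the OSD.
ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0 \
  | jq 'with_entries(.value |= .size)'
```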

[ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Florian Engelmann
Hi, today we migrated all of our RocksDB and WAL devices to new ones. The new ones are much bigger (500MB for wal/db -> 60GB db and 2GB WAL) and LVM-based. We migrated like: export OSD=x systemctl stop ceph-osd@$OSD lvcreate -n db-osd$OSD -L60g data || exit 1 lvcreate -n
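The preview is cut off; a hedged sketch of how such a migration is commonly done, with the caveat from the rest of this thread that 12.2.8 does not update the size labels on its own (see PR 25187 at the end of this digest). VG/LV and path names are examples, and the copy step depends on how the old db/wal were laid out:

```
export OSD=x
systemctl stop ceph-osd@$OSD
lvcreate -n db-osd$OSD  -L60g data
lvcreate -n wal-osd$OSD -L2g  data
# Copy the old (small) db/wal onto the new LVs, then re-point the symlinks.
dd if=/var/lib/ceph/osd/ceph-$OSD/block.db  of=/dev/data/db-osd$OSD  bs=1M
dd if=/var/lib/ceph/osd/ceph-$OSD/block.wal of=/dev/data/wal-osd$OSD bs=1M
ln -sf /dev/data/db-osd$OSD  /var/lib/ceph/osd/ceph-$OSD/block.db
ln -sf /dev/data/wal-osd$OSD /var/lib/ceph/osd/ceph-$OSD/block.wal
# Let BlueFS grow into the new space.
ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-$OSD
systemctl start ceph-osd@$OSD
```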

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Florian Engelmann
Hi Igor, what's your Ceph version? 12.2.8 (SES 5.5 - patched to the latest version) Can you also check the output for ceph-bluestore-tool show-label -p ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0/ infering bluefs devices from bluestore path {

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Florian Engelmann
Am 11/20/18 um 4:59 PM schrieb Igor Fedotov: On 11/20/2018 6:42 PM, Florian Engelmann wrote: Hi Igor, what's your Ceph version? 12.2.8 (SES 5.5 - patched to the latest version) Can you also check the output for ceph-bluestore-tool show-label -p ceph-bluestore-tool show-label

Re: [ceph-users] bucket indices: ssd-only or is a large fast block.db sufficient?

2018-11-20 Thread Mark Nelson
One consideration is that you may not be able to fit higher DB levels on the db partition and end up with a lot of waste (Nick Fisk recently saw this on his test cluster).  We've talked about potentially trying to pre-compute the hierarchy sizing so that we can align a level boundary to fit
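The usual back-of-the-envelope arithmetic behind that, assuming BlueStore's default RocksDB options (max_bytes_for_level_base = 256MB, multiplier = 10):

```
# Approximate level sizes with the defaults above (bash arithmetic).
for level in 1 2 3 4; do
  echo "L$level ~ $((256 * 10 ** (level - 1))) MB"
done
# => ~256MB, ~2.5GB, ~25GB, ~250GB. A level only lives on the db partition if
# the whole level fits, so a partition sized just under a boundary leaves most
# of it unused -- hence the 300+GB per-OSD figure mentioned above.
```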

Re: [ceph-users] Stale pg_upmap_items entries after pg increase

2018-11-20 Thread Dan van der Ster
I've noticed the same and have a script to help find these: https://github.com/cernceph/ceph-scripts/blob/master/tools/clean-upmaps.py -- dan On Tue, Nov 20, 2018 at 5:26 PM Rene Diepstraten wrote: > > Hi. > > Today I've been looking at upmap and the balancer in upmap mode. > The balancer has

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Igor Fedotov
On 11/20/2018 6:42 PM, Florian Engelmann wrote: Hi Igor, what's your Ceph version? 12.2.8 (SES 5.5 - patched to the latest version) Can you also check the output for ceph-bluestore-tool show-label -p ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-0/ infering bluefs

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Igor Fedotov
On 11/20/2018 7:05 PM, Florian Engelmann wrote: Am 11/20/18 um 4:59 PM schrieb Igor Fedotov: On 11/20/2018 6:42 PM, Florian Engelmann wrote: Hi Igor, what's your Ceph version? 12.2.8 (SES 5.5 - patched to the latest version) Can you also check the output for ceph-bluestore-tool

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Igor Fedotov
FYI: https://github.com/ceph/ceph/pull/25187 On 11/20/2018 8:13 PM, Igor Fedotov wrote: On 11/20/2018 7:05 PM, Florian Engelmann wrote: Am 11/20/18 um 4:59 PM schrieb Igor Fedotov: On 11/20/2018 6:42 PM, Florian Engelmann wrote: Hi Igor, what's your Ceph version? 12.2.8 (SES 5.5 -