Re: [ceph-users] Ceph assimilated configuration - unable to remove item

2019-12-13 Thread David Herselman
rbd_default_features 31; ceph config dump | grep -e WHO -e rbd_default_features; WHO MASK LEVEL OPTION VALUE RO global advanced rbd_default_features 31 Regards David Herselman -Original Message- From: Stefan Kooman Sent
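
For context, this is a minimal sketch of how one would normally inspect and remove a centralized (assimilated) config option with the Mimic/Nautilus `ceph config` CLI; the thread discusses a case where the removal did not stick:

    # show any stored values for the option
    ceph config dump | grep -e WHO -e rbd_default_features
    # remove the global entry (the step that failed to take effect in this thread)
    ceph config rm global rbd_default_features
    # verify
    ceph config dump | grep rbd_default_features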

[ceph-users] Ceph assimilated configuration - unable to remove item

2019-12-11 Thread David Herselman
OPTION VALUE RO global advanced rbd_default_features 7 global advanced rbd_default_features 31 Regards David Herselman

[ceph-users] Pool Max Avail and Ceph Dashboard Pool Useage on Nautilus giving different percentages

2019-12-10 Thread David Majchrzak, ODERLAND Webbhotell AB
from in dashboard? My guess is that it comes from calculating: 1 - Max Avail / (Used + Max Avail) = 0.93 Kind Regards, David Majchrzak
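
As a worked example of that formula (the pool numbers below are hypothetical, not taken from the original post):

    # USED = 93 TiB, MAX AVAIL = 7 TiB  ->  1 - 7/(93+7) = 0.93, i.e. ~93% used
    echo "scale=2; 1 - 7 / (93 + 7)" | bc
    # .93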

[ceph-users] RGW bucket stats - strange behavior & slow performance requiring RGW restarts

2019-12-03 Thread David Monschein
Hi all, I've been observing some strange behavior with my object storage cluster running Nautilus 14.2.4. We currently have around 1800 buckets (A small percentage of those buckets are actively used), with a total of 13.86M objects. We have 20 RGWs right now, 10 for regular S3 access, and 10 for

Re: [ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
or sysctl or things like Wido suggested with c-states would make any differences. (Thank you Wido!) Yes, running benchmarks is great, and we're already doing that ourselves. Cheers and have a nice evening! -- David Majchrzak On tor, 2019-11-28 at 17:46 +0100, Paul Emmerich wrote: > Please don't

[ceph-users] Tuning Nautilus for flash only

2019-11-28 Thread David Majchrzak, ODERLAND Webbhotell AB
? We have 256GB of RAM on each OSD host, 8 OSD hosts with 10 SSDs on each. 2 osd daemons on each SSD. Raise ssd bluestore cache to 8GB? Workload is about 50/50 r/w ops running qemu VMs through librbd. So mixed block size. 3 replicas. Appreciate any advice! Kind Regards, -- David Majchrzak

[ceph-users] Dynamic bucket index resharding bug? - rgw.none showing unreal number of objects

2019-11-22 Thread David Monschein
Hi all. Running an Object Storage cluster with Ceph Nautilus 14.2.4. We are running into what appears to be a serious bug that is affecting our fairly new object storage cluster. While investigating some performance issues -- seeing abnormally high IOPS, extremely slow bucket stat listings (over

Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-05 Thread J David
On Tue, Nov 5, 2019 at 2:21 PM Janne Johansson wrote: > I seem to recall some ticket where zap would "only" clear 100M of the drive, > but lvm and all partition info needed more to be cleared, so using dd > bs=1M count=1024 (or more!) would be needed to make sure no part of the OS > picks
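
A minimal sketch of wiping a drive more thoroughly before re-use, along the lines discussed above; the device name is a placeholder and every command here is destructive:

    DEV=/dev/sdX                            # placeholder, double-check before running
    ceph-volume lvm zap "$DEV" --destroy    # clears LVM metadata and partitions
    wipefs -a "$DEV"                        # removes remaining filesystem signatures
    dd if=/dev/zero of="$DEV" bs=1M count=1024 oflag=direct
    partprobe "$DEV"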

Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-05 Thread J David
On Tue, Nov 5, 2019 at 3:18 AM Paul Emmerich wrote: > could be a new feature, I've only realized this exists/works since Nautilus. > You seem to be a relatively old version since you still have ceph-disk > installed None of this is using ceph-disk? It's all done with ceph-volume. The ceph

Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-04 Thread J David
On Mon, Nov 4, 2019 at 1:32 PM Paul Emmerich wrote: > BTW: you can run destroy before stopping the OSD, you won't need the > --yes-i-really-mean-it if it's drained in this case This actually does not seem to work: $ sudo ceph osd safe-to-destroy 42 OSD(s) 42 are safe to destroy without reducing

Re: [ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-04 Thread J David
On Mon, Nov 4, 2019 at 1:32 PM Paul Emmerich wrote: > That's probably the ceph-disk udev script being triggered from > something somewhere (and a lot of things can trigger that script...) That makes total sense. > Work-around: convert everything to ceph-volume simple first by running >

[ceph-users] Zombie OSD filesystems rise from the grave during bluestore conversion

2019-11-04 Thread J David
While converting a luminous cluster from filestore to bluestore, we are running into a weird race condition on a fairly regular basis. We have a master script that writes upgrade scripts for each OSD server. The script for an OSD looks like this: ceph osd out 68 while ! ceph osd safe-to-destroy
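
A sketch of what such a per-OSD filestore-to-bluestore conversion script typically looks like (OSD id and device are placeholders; this is not the poster's exact script):

    ID=68
    DEV=/dev/sdX                                  # placeholder data device
    ceph osd out "$ID"
    while ! ceph osd safe-to-destroy "$ID"; do sleep 60; done
    systemctl stop ceph-osd@"$ID"
    ceph osd destroy "$ID" --yes-i-really-mean-it
    ceph-volume lvm zap "$DEV" --destroy
    ceph-volume lvm create --bluestore --data "$DEV" --osd-id "$ID"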

[ceph-users] Using multisite to migrate data between bucket data pools.

2019-10-30 Thread David Turner
This is a tangent on Paul Emmerich's response to "[ceph-users] Correct Migration Workflow Replicated -> Erasure Code". I've tried Paul's method before to migrate between 2 data pools. However I ran into some issues. The first issue seems like a bug in RGW where the RGW for the new zone was able

Re: [ceph-users] ceph balancer do not start

2019-10-22 Thread David Turner
Off the top of my head, I'd say your cluster might have the wrong tunables for crush-compat. I know I ran into that when I first set up the balancer and nothing obviously said that was the problem. Only researching found it for me. My real question, though, is why aren't you using upmap? It is

Re: [ceph-users] Decreasing the impact of reweighting osds

2019-10-22 Thread David Turner
Most times you are better served with simpler settings like osd_recovery_sleep, which has 3 variants if you have multiple types of OSDs in your cluster (osd_recovery_sleep_hdd, osd_recovery_sleep_ssd, osd_recovery_sleep_hybrid). Using those you can tweak a specific type of OSD that might be having
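
For illustration, a hedged example of adjusting the per-device-class recovery sleep at runtime (the value is arbitrary and should be tuned for your cluster):

    # inspect current values on an OSD host
    ceph daemon osd.0 config show | grep osd_recovery_sleep
    # slow down recovery/backfill on HDD OSDs only
    ceph tell osd.* injectargs '--osd_recovery_sleep_hdd 0.2'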

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-10-10 Thread David C
Patrick Donnelly wrote: > Looks like this bug: https://tracker.ceph.com/issues/41148 > > On Wed, Oct 9, 2019 at 1:15 PM David C wrote: > > > > Hi Daniel > > > > Thanks for looking into this. I hadn't installed ceph-debuginfo, here's > the bt with line nu

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-10-09 Thread David C
rash. Can you get line numbers from your backtrace? > > Daniel > > On 10/7/19 9:59 AM, David C wrote: > > Hi All > > > > Further to my previous messages, I upgraded > > to libcephfs2-14.2.2-0.el7.x86_64 as suggested and things certainly seem > > a lot more

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-10-07 Thread David C
: rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=removed,local_lock=none,addr=removed) On Fri, Jul 19, 2019 at 5:47 PM David C wrote: > Thanks, Jeff. I'll give 14.2.2 a go when it's released. > > On Wed, 17 Jul 2019, 2

Re: [ceph-users] eu.ceph.com mirror out of sync?

2019-09-23 Thread David Majchrzak, ODERLAND Webbhotell AB
Hi, I'll have a look at the status of se.ceph.com tomorrow morning, it's maintained by us. Kind Regards, David On mån, 2019-09-23 at 22:41 +0200, Oliver Freyermuth wrote: > Hi together, > > the EU mirror still seems to be out-of-sync - does somebody on this > list happen

[ceph-users] Problem formatting erasure coded image

2019-09-22 Thread David Herselman
MiB 0 666 GiB Regards David Herselman

[ceph-users] FYI: Mailing list domain change

2019-08-07 Thread David Galloway
scribe you to the new list. No other action should be required on your part. -- David Galloway Systems Administrator, RDU Ceph Engineering IRC: dgalloway

Re: [ceph-users] Ceph Nautilus - can't balance due to degraded state

2019-08-03 Thread David Herselman
403] pg_upmap_items 8.409 [404,403] pg_upmap_items 8.40b [103,102,404,405] pg_upmap_items 8.40c [404,400] pg_upmap_items 8.410 [404,403] pg_upmap_items 8.411 [404,405] pg_upmap_items 8.417 [404,403] pg_upmap_items 8.418 [404,403] pg_upmap_items 9.2 [10401,10400] pg_upmap

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-07-19 Thread David C
p in. > > > > Cheers, > > Jeff > > > > On Wed, 2019-07-17 at 10:36 +0100, David C wrote: > > > Thanks for taking a look at this, Daniel. Below is the only > interesting bit from the Ceph MDS log at the time of the crash but I > suspect the slow requests are

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-07-17 Thread David C
nd my expertise, at this point. Maybe some ceph > logs would help? > > Daniel > > On 7/15/19 10:54 AM, David C wrote: > > This list has been deprecated. Please subscribe to the new devel list at > lists.nfs-ganesha.org. > > > > > > Hi All > > > >

[ceph-users] What's the best practice for Erasure Coding

2019-07-07 Thread David
d I adopt, and how to choose the combinations of (k,m) (e.g. (k=3,m=2), (k=6,m=3) ). Can anyone share some experience? Thanks for any help. Regards, David
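
For reference, a minimal sketch of defining one of the mentioned (k,m) combinations and creating a pool with it; the profile name, PG count and failure domain are assumptions:

    # k=6,m=3 with a host failure domain needs at least 9 hosts
    ceph osd erasure-code-profile set k6m3 k=6 m=3 crush-failure-domain=host
    ceph osd erasure-code-profile get k6m3
    ceph osd pool create ecpool 128 128 erasure k6m3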

Re: [ceph-users] Cannot delete bucket

2019-06-27 Thread David Turner
deletion. On Thu, Jun 27, 2019 at 2:58 PM Sergei Genchev wrote: > @David Turner > Did your bucket delete ever finish? I am up to 35M incomplete uploads, > and I doubt that I actually had that many upload attempts. I could be > wrong though. > Is there a way to force bucket

Re: [ceph-users] Cannot delete bucket

2019-06-24 Thread David Turner
It's aborting incomplete multipart uploads that were left around. First it will clean up the cruft like that and then it should start actually deleting the objects visible in stats. That's my understanding of it anyway. I'm in the middle of cleaning up some buckets right now doing this same thing.
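
The deletion being discussed is typically driven by something like the following (the bucket name is a placeholder):

    # removes the bucket, purging its objects and aborting stale multipart uploads first
    radosgw-admin bucket rm --bucket=<bucket-name> --purge-objects
    # progress can be watched from another shell
    radosgw-admin bucket stats --bucket=<bucket-name>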

Re: [ceph-users] Changing the release cadence

2019-06-17 Thread David Turner
This was a little long to respond with on Twitter, so I thought I'd share my thoughts here. I love the idea of a 12 month cadence. I like October because admins aren't upgrading production within the first few months of a new release. It gives it plenty of time to be stable for the OS distros as

Re: [ceph-users] Erasure Coding - FPGA / Hardware Acceleration

2019-06-14 Thread David Byte
I can't speak to the SoftIron solution, but I have done some testing on an all-SSD environment comparing latency, CPU, etc between using the Intel ISA plugin and using Jerasure. Very little difference is seen in CPU and capability in my tests, so I am not sure of the benefit. David Byte Sr

Re: [ceph-users] NFS-Ganesha CEPH_FSAL | potential locking issue

2019-05-17 Thread David C
PM Jeff Layton wrote: > On Tue, Apr 16, 2019 at 10:36 AM David C wrote: > > > > Hi All > > > > I have a single export of my cephfs using the ceph_fsal [1]. A CentOS 7 > machine mounts a sub-directory of the export [2] and is using it for the > home director

Re: [ceph-users] Samba vfs_ceph or kernel client

2019-05-16 Thread David Disseldorp
e-mode locks and leases can be supported without the requirement for a kernel interface. Cheers, David

Re: [ceph-users] IMPORTANT : NEED HELP : Low IOPS on hdd : MAX AVAIL Draining fast

2019-04-27 Thread David C
On Sat, 27 Apr 2019, 18:50 Nikhil R, wrote: > Guys, > We now have a total of 105 osd’s on 5 baremetal nodes each hosting 21 > osd’s on HDD which are 7Tb with journals on HDD too. Each journal is about > 5GB > This would imply you've got a separate hdd partition for journals, I don't think

Re: [ceph-users] Default Pools

2019-04-23 Thread David Turner
You should be able to see all pools in use in a RGW zone from the radosgw-admin command. This [1] is probably overkill for most, but I deal with multi-realm clusters so I generally think like this when dealing with RGW. Running this as is will create a file in your current directory for each zone
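
A simplified sketch of that idea for a single-realm setup (assumes jq is installed; multi-realm clusters would also need --rgw-realm/--rgw-zonegroup on each call):

    for zone in $(radosgw-admin zone list | jq -r '.zones[]'); do
        radosgw-admin zone get --rgw-zone="$zone" > "rgw-zone-$zone.json"
    done
    grep -h '_pool' rgw-zone-*.json | sort -u    # every pool referenced by a zone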

Re: [ceph-users] Osd update from 12.2.11 to 12.2.12

2019-04-22 Thread David Turner
Do you perhaps have anything in the ceph.conf files on the servers with those OSDs that would attempt to tell the daemon that they are filestore osds instead of bluestore? I'm sure you know that the second part [1] of the output in both cases only shows up after an OSD has been rebooted. I'm

[ceph-users] NFS-Ganesha CEPH_FSAL | potential locking issue

2019-04-16 Thread David C
-0.1.el7.x86_64 Ceph.conf on nfs-ganesha server: [client] mon host = 10.10.10.210:6789, 10.10.10.211:6789, 10.10.10.212:6789 client_oc_size = 8388608000 client_acl_type=posix_acl client_quota = true client_quota_df = true Thanks, David

Re: [ceph-users] Looking up buckets in multi-site radosgw configuration

2019-03-20 Thread David Coles
On Tue, Mar 19, 2019 at 7:51 AM Casey Bodley wrote: > Yeah, correct on both points. The zonegroup redirects would be the only > way to guide clients between clusters. Awesome. Thank you for the clarification.

[ceph-users] Looking up buckets in multi-site radosgw configuration

2019-03-18 Thread David Coles
/affb7d396f76273e885cfdbcd363c1882496726c/src/rgw/rgw_op.cc#L653-L669 2. https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/object_gateway_guide_for_red_hat_enterprise_linux/multi_site#configuring_multiple_zones_without_replication -- David Coles

Re: [ceph-users] ceph osd pg-upmap-items not working

2019-03-15 Thread David Turner
Why do you think that it can't resolve this by itself? You just said that the balancer was able to provide an optimization, but then that the distribution isn't perfect. When there are no further optimizations, running `ceph balancer optimize plan` won't create a plan with any changes. Possibly

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread David C
Out of curiosity, are you guys re-exporting the fs to clients over something like nfs or running applications directly on the OSD nodes? On Tue, 12 Mar 2019, 18:28 Paul Emmerich, wrote: > Mounting kernel CephFS on an OSD node works fine with recent kernels > (4.14+) and enough RAM in the

Re: [ceph-users] 3-node cluster with 3 x Intel Optane 900P - very low benchmarked performance (200 IOPS)?

2019-03-11 Thread David Clarke
is appropriate for your CPUs and kernel, in the boot cmdline. -- David Clarke Systems Architect Catalyst IT

Re: [ceph-users] OpenStack with Ceph RDMA

2019-03-11 Thread David Turner
I can't speak to the rdma portion. But to clear up what each of these does... the cluster network is only traffic between the osds for replicating writes, reading EC data, as well as backfilling and recovery io. Mons, mds, rgw, and osds talking with clients all happen on the public network. The
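
A hedged ceph.conf sketch of that split (the subnets are hypothetical):

    [global]
        public network  = 192.168.10.0/24   # clients, mons, mds, rgw
        cluster network = 192.168.20.0/24   # OSD replication, backfill, recovery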

Re: [ceph-users] priorize degraged objects than misplaced

2019-03-11 Thread David Turner
Ceph has been getting better and better about prioritizing this sort of recovery, but few of those optimizations are in Jewel, which has been out of the support cycle for about a year. You should look into upgrading to Mimic, where you should see a pretty good improvement on this sort of

Re: [ceph-users] CEPH ISCSI Gateway

2019-03-11 Thread David Turner
The problem with clients on osd nodes is for kernel clients only. That's true of krbd and the kernel client for cephfs. The only other reason not to run any other Ceph daemon in the same node as osds is resource contention if you're running at higher CPU and memory utilizations. On Sat, Mar 9,

Re: [ceph-users] Failed to repair pg

2019-03-07 Thread David Zafman
"snapid":326022,"hash":#,"max":0,"pool":2,"namespace":"","max":0}] Use the json for snapid 326022 to remove it. # ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12/ --journal-path /dev/sdXX '["2.2bb",

Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread David C
The general advice has been to not use the kernel client on an osd node as you may see a deadlock under certain conditions. Using the fuse client should be fine or use the kernel client inside a VM. On Wed, 6 Mar 2019, 03:07 Zhenshi Zhou, wrote: > Hi, > > I'm gonna mount cephfs from my ceph

Re: [ceph-users] [Nfs-ganesha-devel] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-04 Thread David C
On Mon, Mar 4, 2019 at 5:53 PM Jeff Layton wrote: > > On Mon, 2019-03-04 at 17:26 +, David C wrote: > > Looks like you're right, Jeff. Just tried to write into the dir and am > > now getting the quota warning. So I guess it was the libcephfs cache > > as you say. Tha

Re: [ceph-users] [Nfs-ganesha-devel] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-04 Thread David C
} Thanks, On Mon, Mar 4, 2019 at 2:50 PM Jeff Layton wrote: > On Mon, 2019-03-04 at 09:11 -0500, Jeff Layton wrote: > > This list has been deprecated. Please subscribe to the new devel list at > lists.nfs-ganesha.org. > > On Fri, 2019-03-01 at 15:49 +, David C wrote: > &g

Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-03-01 Thread David Turner
, Mar 1, 2019, 6:28 PM solarflow99 wrote: > It has to be mounted from somewhere, if that server goes offline, you need > to mount it from somewhere else right? > > > On Thu, Feb 28, 2019 at 11:15 PM David Turner > wrote: > >> Why are you mapping the same rbd to multiple se

[ceph-users] NFS-Ganesha CEPH_FSAL ceph.quota.max_bytes not enforced

2019-03-01 Thread David C
client_quota_df = true Related links: [1] http://tracker.ceph.com/issues/16526 [2] https://github.com/nfs-ganesha/nfs-ganesha/issues/100 Thanks David

Re: [ceph-users] Mimic 13.2.4 rbd du slowness

2019-02-28 Thread David Turner
Have you used strace on the du command to see what it's spending its time doing? On Thu, Feb 28, 2019, 8:45 PM Glen Baars wrote: > Hello Wido, > > The cluster layout is as follows: > > 3 x Monitor hosts ( 2 x 10Gbit bonded ) > 9 x OSD hosts ( > 2 x 10Gbit bonded, > LSI cachecade and write cache

Re: [ceph-users] rbd unmap fails with error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy

2019-02-28 Thread David Turner
Why are you mapping the same rbd to multiple servers? On Wed, Feb 27, 2019, 9:50 AM Ilya Dryomov wrote: > On Wed, Feb 27, 2019 at 12:00 PM Thomas <74cmo...@gmail.com> wrote: > > > > Hi, > > I have noticed an error when writing to a mapped RBD. > > Therefore I unmounted the block device. > > Then

Re: [ceph-users] PG Calculations Issue

2019-02-28 Thread David Turner
Those numbers look right for a pool only containing 10% of your data. Now continue to calculate the pg counts for the remaining 90% of your data. On Wed, Feb 27, 2019, 12:17 PM Krishna Venkata wrote: > Greetings, > > > I am having issues in the way PGs are calculated in >

Re: [ceph-users] redirect log to syslog and disable log to stderr

2019-02-28 Thread David Turner
You can always set it in your ceph.conf file and restart the mgr daemon. On Tue, Feb 26, 2019, 1:30 PM Alex Litvak wrote: > Dear Cephers, > > In mimic 13.2.2 > ceph tell mgr.* injectargs --log-to-stderr=false > Returns an error (no valid command found ...). What is the correct way to > inject
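
A sketch of the ceph.conf approach being suggested (section placement and the restart target are assumptions):

    # /etc/ceph/ceph.conf on the mgr host
    [mgr]
        log to stderr = false

    # then restart the mgr
    systemctl restart ceph-mgr.target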

Re: [ceph-users] Right way to delete OSD from cluster?

2019-02-28 Thread David Turner
The reason is that an osd still contributes to the host weight in the crush map even while it is marked out. When you out and then purge, the purging operation removed the osd from the map and changes the weight of the host which changes the crush map and data moves. By weighting the osd to 0.0,
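
Put together, the drain-first removal sequence looks roughly like this (the OSD id is a placeholder; wait for backfill to finish between steps):

    ID=12
    ceph osd crush reweight "osd.$ID" 0.0    # triggers the data movement now
    # wait until 'ceph -s' reports the cluster healthy again, then:
    ceph osd out "$ID"
    systemctl stop ceph-osd@"$ID"
    ceph osd purge "$ID" --yes-i-really-mean-it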

Re: [ceph-users] Cephfs recursive stats | rctime in the future

2019-02-28 Thread David C
On Wed, Feb 27, 2019 at 11:35 AM Hector Martin wrote: > On 27/02/2019 19:22, David C wrote: > > Hi All > > > > I'm seeing quite a few directories in my filesystem with rctime years in > > the future. E.g > > > > ]# getfattr -d -m ceph.dir.* /path/to/dir

[ceph-users] Cephfs recursive stats | rctime in the future

2019-02-27 Thread David C
data pool. I have just received a scrub error this morning with 1 inconsistent pg but I've been noticing the incorrect rctimes for a while now so not sure if that's related. Any help much appreciated Thanks David

[ceph-users] Usenix Vault 2019

2019-02-24 Thread David Turner
There is a scheduled birds of a feather for Ceph tomorrow night, but I also noticed that there are only trainings tomorrow. Unless you are paying more for those, you likely don't have much to do on Monday. That's the boat I'm in. Is anyone interested in getting together tomorrow in Boston during

Re: [ceph-users] Configuration about using nvme SSD

2019-02-24 Thread David Turner
One thing that's worked for me to get more out of nvmes with Ceph is to create multiple partitions on the nvme with an osd on each partition. That way you get more osd processes and CPU per nvme device. I've heard of people using up to 4 partitions like this. On Sun, Feb 24, 2019, 10:25 AM
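
A minimal sketch of the partitioning approach (the device name is a placeholder; newer ceph-volume releases can do something similar with `lvm batch --osds-per-device`):

    DEV=/dev/nvme0n1                     # placeholder device
    parted -s "$DEV" mklabel gpt \
        mkpart osd0 0% 50% \
        mkpart osd1 50% 100%
    ceph-volume lvm create --bluestore --data "${DEV}p1"
    ceph-volume lvm create --bluestore --data "${DEV}p2"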

Re: [ceph-users] Doubts about backfilling performance

2019-02-23 Thread David Turner
Jewel is really limited on the settings you can tweak for backfilling [1]. Luminous and Mimic have a few more knobs. An option you can do, though, is to use osd_crush_initial_weight found [2] here. With this setting you set your initial crush weight for new osds to 0.0 and gradually increase them
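
In practice that looks something like the following (the config section and the reweight steps are illustrative, not prescriptive):

    # ceph.conf on the OSD hosts
    [osd]
        osd crush initial weight = 0

    # after the new OSD is up and in, raise its weight gradually
    ceph osd crush reweight osd.42 0.5
    ceph osd crush reweight osd.42 1.0   # repeat until the target weight is reached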

Re: [ceph-users] Ceph cluster stability

2019-02-22 Thread David Turner
Mon disks don't have journals, they're just a folder on a filesystem on a disk. On Fri, Feb 22, 2019, 6:40 AM M Ranga Swami Reddy wrote: > ceph mons looks fine during the recovery. Using HDD with SSD > journals. with recommeded CPU and RAM numbers. > > On Fri, Feb 22, 2019 at 4

Re: [ceph-users] REQUEST_SLOW across many OSDs at the same time

2019-02-22 Thread David Turner
Can you correlate the times to scheduled tasks inside of any VMs? For instance if you have several Linux VMs with the updatedb command installed that by default they will all be scanning their disks at the same time each day to see where files are. Other common culprits could be scheduled backups,

Re: [ceph-users] Ceph cluster stability

2019-02-22 Thread David Turner
uestore + rocksdb > > compared to filestore + leveldb . > > > > > > On Wed, Feb 20, 2019 at 4:27 PM M Ranga Swami Reddy > > wrote: > > > > > > Thats expected from Ceph by design. But in our case, we are using all > > > recommendation like rack fa

Re: [ceph-users] faster switch to another mds

2019-02-20 Thread David Turner
If I'm not mistaken, when you stop them at the same time during a reboot on a node with both mds and mon, the mons might receive it, but wait to finish their own election vote before doing anything about it. If you're trying to keep optimal uptime for your mds, then stopping it first and on its own

Re: [ceph-users] faster switch to another mds

2019-02-19 Thread David Turner
It's also been mentioned a few times that when MDS and MON are on the same host that the downtime for MDS is longer when both daemons stop at about the same time. It's been suggested to stop the MDS daemon, wait for `ceph mds stat` to reflect the change, and then restart the rest of the server.

Re: [ceph-users] CephFS overwrite/truncate performance hit

2019-02-19 Thread David Turner
If your client needs to be able to handle the writes like that on its own, RBDs might be the more appropriate use case. You lose the ability to have multiple clients accessing the data as easily as with CephFS, but you would gain the features you're looking for. On Tue, Feb 12, 2019 at 1:43 PM

Re: [ceph-users] CephFS: client hangs

2019-02-19 Thread David Turner
You're attempting to use mismatching client name and keyring. You want to use matching name and keyring. For your example, you would want to either use `--keyring /etc/ceph/ceph.client.admin.keyring --name client.admin` or `--keyring /etc/ceph/ceph.client.cephfs.keyring --name client.cephfs`.
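
For example, a ceph-fuse invocation with a consistent name/keyring pair (the mount point is a placeholder):

    ceph-fuse --name client.cephfs \
              --keyring /etc/ceph/ceph.client.cephfs.keyring \
              /mnt/cephfs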

Re: [ceph-users] crush map has straw_calc_version=0 and legacy tunables on luminous

2019-02-19 Thread David Turner
[1] Here is a really cool set of slides from Ceph Day Berlin where Dan van der Ster uses the mgr balancer module with upmap to gradually change the tunables of a cluster without causing major client impact. The down side for you is that upmap requires all luminous or newer clients, but if you

Re: [ceph-users] Ceph cluster stability

2019-02-19 Thread David Turner
With a RACK failure domain, you should be able to have an entire rack powered down without noticing any major impact on the clients. I regularly take down OSDs and nodes for maintenance and upgrades without seeing any problems with client IO. On Tue, Feb 12, 2019 at 5:01 AM M Ranga Swami Reddy

Re: [ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-19 Thread David Turner
ng aswell. (But keep in > mind that the help on their mailing list is not so good as here ;)) > > > > -Original Message- > From: David Turner [mailto:drakonst...@gmail.com] > Sent: 18 February 2019 17:31 > To: ceph-users > Subject: [ceph-users] Migrating a baremet

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-19 Thread David Turner
their DB back off of their spinner which is what's happening to you. I don't believe that sort of tooling exists yet, though, without compiling the Nautilus Beta tooling for yourself. On Tue, Feb 19, 2019 at 12:03 AM Konstantin Shalygin wrote: > On 2/18/19 9:43 PM, David Turner wrote: > &g

Re: [ceph-users] Upgrade Luminous to mimic on Ubuntu 18.04

2019-02-18 Thread David Turner
Everybody is just confused that you don't have a newer version of Ceph available. Are you running `apt-get dist-upgrade` to upgrade ceph? Do you have any packages being held back? There is no reason that Ubuntu 18.04 shouldn't be able to upgrade to 12.2.11. On Mon, Feb 18, 2019, 4:38 PM Hello

Re: [ceph-users] IRC channels now require registered and identified users

2019-02-18 Thread David Turner
Is this still broken in the 1-way direction where Slack users' comments do not show up in IRC? That would explain why nothing I ever type (as either helping someone or asking a question) ever have anyone respond to them. On Tue, Dec 18, 2018 at 6:50 AM Joao Eduardo Luis wrote: > On 12/18/2018

[ceph-users] Migrating a baremetal Ceph cluster into K8s + Rook

2019-02-18 Thread David Turner
I'm getting some "new" (to me) hardware that I'm going to upgrade my home Ceph cluster with. Currently it's running a Proxmox cluster (Debian) which precludes me from upgrading to Mimic. I am thinking about taking the opportunity to convert most of my VMs into containers and migrate my cluster

[ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-18 Thread David Turner
We have 2 clusters of [1] these disks that have 2 Bluestore OSDs per disk (partitioned), 3 disks per node, 5 nodes per cluster. The clusters are 12.2.4 running CephFS and RBDs. So in total we have 15 NVMe's per cluster and 30 NVMe's in total. They were all built at the same time and were

Re: [ceph-users] Placing replaced disks to correct buckets.

2019-02-18 Thread David Turner
Also what commands did you run to remove the failed HDDs and the commands you have so far run to add their replacements back in? On Sat, Feb 16, 2019 at 9:55 PM Konstantin Shalygin wrote: > I recently replaced failed HDDs and removed them from their respective > buckets as per procedure. > >

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-18 Thread David Turner
been there ever since. On Sat, Feb 16, 2019 at 1:50 AM Konstantin Shalygin wrote: > On 2/16/19 12:33 AM, David Turner wrote: > > The answer is probably going to be in how big your DB partition is vs > > how big your HDD disk is. From your output it looks like you have a > >

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-15 Thread David Turner
The answer is probably going to be in how big your DB partition is vs how big your HDD is. From your output it looks like you have a 6TB HDD with a 28GB block.db partition. Even though the DB used size isn't currently full, I would guess that at some point since this OSD was created that
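
A quick way to check whether an OSD's RocksDB has spilled onto the slow device is the BlueFS perf counters (run on the OSD host; osd.0 is a placeholder):

    ceph daemon osd.0 perf dump | grep -E '"(db|slow)_(total|used)_bytes"'
    # a non-zero slow_used_bytes means part of the DB lives on the HDD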

Re: [ceph-users] jewel10.2.11 EC pool out a osd, its PGs remap to the osds in the same host

2019-02-15 Thread David Turner
I'm leaving the response on the CRUSH rule for Gregory, but you have another problem you're running into that is causing more of this data to stay on this node than you intend. While you `out` the OSD it is still contributing to the Host's weight. So the host is still set to receive that amount

Re: [ceph-users] Problems with osd creation in Ubuntu 18.04, ceph 13.2.4-1bionic

2019-02-15 Thread David Turner
I have found that running a zap before all prepare/create commands with ceph-volume helps things run smoother. Zap is specifically there to clear everything on a disk away to make the disk ready to be used as an OSD. Your wipefs command is still fine, but then I would lvm zap the disk before

Re: [ceph-users] [Ceph-community] Deploy and destroy monitors

2019-02-13 Thread David Turner
Ceph-users is the proper ML to post questions like this. On Thu, Dec 20, 2018 at 2:30 PM Joao Eduardo Luis wrote: > On 12/20/2018 04:55 PM, João Aguiar wrote: > > I am having an issue with "ceph-deploy mon” > > > > I started by creating a cluster with one monitor with "ceph-deploy > new"…

Re: [ceph-users] [Ceph-community] Ceph SSE-KMS integration to use Safenet as Key Manager service

2019-02-13 Thread David Turner
Ceph-users is the correct ML to post questions like this. On Wed, Jan 2, 2019 at 5:40 PM Rishabh S wrote: > Dear Members, > > Please let me know if you have any link with examples/detailed steps of > Ceph-Safenet(KMS) integration. > > Thanks & Regards, > Rishabh > >

Re: [ceph-users] [Ceph-community] Error during playbook deployment: TASK [ceph-mon : test if rbd exists]

2019-02-13 Thread David Turner
Ceph-users ML is the proper mailing list for questions like this. On Sat, Jan 26, 2019 at 12:31 PM Meysam Kamali wrote: > Hi Ceph Community, > > I am using ansible 2.2 and ceph branch stable-2.2, on centos7, to deploy > the playbook. But the deployment get hangs in this step "TASK [ceph-mon : >

Re: [ceph-users] [Ceph-community] Need help related to ceph client authentication

2019-02-13 Thread David Turner
The Ceph-users ML is the correct list to ask questions like this. Did you figure out the problems/questions you had? On Tue, Dec 4, 2018 at 11:39 PM Rishabh S wrote: > Hi Gaurav, > > Thank You. > > Yes, I am using boto, though I was looking for suggestions on how my ceph > client should get

Re: [ceph-users] all vms can not start up when boot all the ceph hosts.

2019-02-13 Thread David Turner
This might not be a Ceph issue at all depending on if you're using any sort of caching. If you have caching on your disk controllers at all, then the write might have happened to the cache but never made it to the OSD disks which would show up as problems on the VM RBDs. Make sure you have

Re: [ceph-users] how to mount one of the cephfs namespace using ceph-fuse?

2019-02-13 Thread David Turner
Note that this format in fstab does require a certain version of util-linux because of the funky format of the line. Pretty much it maps all command line options at the beginning of the line separated with commas. On Wed, Feb 13, 2019 at 2:10 PM David Turner wrote: > I believe the fstab l

Re: [ceph-users] how to mount one of the cephfs namespace using ceph-fuse?

2019-02-13 Thread David Turner
I believe the fstab line for ceph-fuse in this case would look something like [1] this. We use a line very similar to that to mount cephfs at a specific client_mountpoint that the specific cephx user only has access to. [1] id=acapp3,client_mds_namespace=fs1 /tmp/ceph fuse.ceph
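
A complete hedged example of such an fstab entry (everything after `fuse.ceph` is an assumption, not from the original message):

    # /etc/fstab
    id=acapp3,client_mds_namespace=fs1  /tmp/ceph  fuse.ceph  defaults,_netdev  0  0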

Re: [ceph-users] compacting omap doubles its size

2019-02-13 Thread David Turner
Sorry for the late response on this, but life has been really busy over the holidays. We compact our omaps offline with the ceph-kvstore-tool. Here [1] is a copy of the script that we use for our clusters. You might need to modify things a bit for your environment. I don't remember which

Re: [ceph-users] backfill_toofull while OSDs are not full

2019-01-30 Thread David Zafman
above 90% (backfillfull_ratio) usage. David On 1/27/19 11:34 PM, Wido den Hollander wrote: On 1/25/19 8:33 AM, Gregory Farnum wrote: This doesn’t look familiar to me. Is the cluster still doing recovery so we can at least expect them to make progress when the “out” OSDs get removed from t

Re: [ceph-users] CEPH_FSAL Nfs-ganesha

2019-01-30 Thread David C
Patrick Donnelly wrote: > On Mon, Jan 14, 2019 at 7:11 AM Daniel Gryniewicz wrote: > > > > Hi. Welcome to the community. > > > > On 01/14/2019 07:56 AM, David C wrote: > > > Hi All > > > > > > I've been playing around with the nfs-gan

Re: [ceph-users] How To Properly Failover a HA Setup

2019-01-21 Thread David C
It could also be the kernel client versions, what are you running? I remember older kernel clients didn't always deal with recovery scenarios very well. On Mon, Jan 21, 2019 at 9:18 AM Marc Roos wrote: > > > I think his downtime is coming from the mds failover, that takes a while > in my case

Re: [ceph-users] CephFS - Small file - single thread - read performance.

2019-01-18 Thread David C
On Fri, 18 Jan 2019, 14:46 Marc Roos > > [@test]# time cat 50b.img > /dev/null > > real 0m0.004s > user 0m0.000s > sys 0m0.002s > [@test]# time cat 50b.img > /dev/null > > real 0m0.002s > user 0m0.000s > sys 0m0.002s > [@test]# time cat 50b.img > /dev/null > > real 0m0.002s

Re: [ceph-users] CephFS - Small file - single thread - read performance.

2019-01-18 Thread David C
On Fri, Jan 18, 2019 at 2:12 PM wrote: > Hi. > > We have the intention of using CephFS for some of our shares, which we'd > like to spool to tape as a part normal backup schedule. CephFS works nice > for large files but for "small" .. < 0.1MB .. there seem to be a > "overhead" on 20-40ms per

[ceph-users] Fw: Re: Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David Young
Forgot to reply to the list! ‐‐‐ Original Message ‐‐‐ On Thursday, January 17, 2019 8:32 AM, David Young wrote: > Thanks David, > > "ceph osd df" looks like this: > > - > root@node1:~# ceph osd df > ID CLASS WEIGHT REWEIGHT SIZE USE AV

Re: [ceph-users] Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-16 Thread David C
On Wed, 16 Jan 2019, 02:20 David Young Hi folks, > > My ceph cluster is used exclusively for cephfs, as follows: > > --- > root@node1:~# grep ceph /etc/fstab > node2:6789:/ /ceph ceph > auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret > root@node1:~# >

[ceph-users] Why does "df" on a cephfs not report same free space as "rados df" ?

2019-01-15 Thread David Young
Hi folks, My ceph cluster is used exclusively for cephfs, as follows: --- root@node1:~# grep ceph /etc/fstab node2:6789:/ /ceph ceph auto,_netdev,name=admin,secretfile=/root/ceph.admin.secret root@node1:~# --- "rados df" shows me the following: --- root@node1:~# rados df POOL_NAME

[ceph-users] CEPH_FSAL Nfs-ganesha

2019-01-14 Thread David C
size which seemed quite conservative. Thanks David [1] http://docs.ceph.com/docs/mimic/cephfs/nfs/

Re: [ceph-users] cephfs free space issue

2019-01-10 Thread David C
On Thu, Jan 10, 2019 at 4:07 PM Scottix wrote: > I just had this question as well. > > I am interested in what you mean by fullest, is it percentage wise or raw > space. If I have an uneven distribution and adjusted it, would it make more > space available potentially. > Yes - I'd recommend

Re: [ceph-users] Mimic 13.2.3?

2019-01-08 Thread David Galloway
On 1/8/19 9:05 AM, Matthew Vernon wrote: > Dear Greg, > > On 04/01/2019 19:22, Gregory Farnum wrote: > >> Regarding Ceph releases more generally: > > [snip] > >> I imagine we will discuss all this in more detail after the release, >> but everybody's patience is appreciated as we work through

[ceph-users] OSDs crashing in EC pool (whack-a-mole)

2019-01-08 Thread David Young
Hi all, One of my OSD hosts recently ran into RAM contention (was swapping heavily), and after rebooting, I'm seeing this error on random OSDs in the cluster: --- Jan 08 03:34:36 prod1 ceph-osd[3357939]: ceph version 13.2.4 (b10be4d44915a4d78a8e06aa31919e74927b142e) mimic (stable) Jan 08

Re: [ceph-users] Balancer=on with crush-compat mode

2019-01-05 Thread David C
On Sat, 5 Jan 2019, 13:38 Marc Roos > I have straw2, balancer=on, crush-compat and it gives worst spread over > my ssd drives (4 only) being used by only 2 pools. One of these pools > has pg 8. Should I increase this to 16 to create a better result, or > will it never be any better. > > For now I
