Re: [ceph-users] backfill_toofull while OSDs are not full

2019-01-27 Thread Wido den Hollander
On 1/25/19 8:33 AM, Gregory Farnum wrote:
> This doesn’t look familiar to me. Is the cluster still doing recovery so
> we can at least expect them to make progress when the “out” OSDs get
> removed from the set?

The recovery has already finished. It resolves itself, but in the meantime I saw
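
For anyone who wants to rule out the thresholds themselves, a minimal sketch of checking and (carefully) adjusting the backfill ratio; the 0.90 below is only an example value:

ceph osd dump | grep -i ratio          # full_ratio, backfillfull_ratio, nearfull_ratio
ceph health detail | grep -i backfill  # which PGs/OSDs are flagged
ceph osd set-backfillfull-ratio 0.90   # example value; raise with care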

Re: [ceph-users] Mix hardware on object storage cluster

2019-01-27 Thread Ashley Merrick
What is different with the new hosts? Better / larger disks? More RAM? Higher GHz on the CPUs? Redundant PSUs, etc.? Knowing what is different about the hardware will help pinpoint how you may be able to make better use of them.

On Mon, Jan 28, 2019 at 3:26 PM Félix Barbeira wrote:
> Hi

[ceph-users] Mix hardware on object storage cluster

2019-01-27 Thread Félix Barbeira
Hi Cephers, We are managing a cluster where all machines have the same hardware. The cluster is used only for object storage. We are planning to increase the number of nodes. Those new nodes have better hardware than the old ones. If we only add those nodes as regular nodes to the cluster, we are not using the

Re: [ceph-users] RBD client hangs

2019-01-27 Thread ST Wong (ITSC)
> That doesn't appear to be an error -- that's just stating that it found a
> dead client that was holding the exclusive-lock, so it broke the dead
> client's lock on the image (by blacklisting the client).

As there is only 1 RBD client in this testing, does it mean the RBD client process
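
A minimal sketch of inspecting that state from the cluster side; the pool, image, and address below are only examples:

rbd lock ls rbd/myimage                          # who currently holds a lock on the image
ceph osd blacklist ls                            # currently blacklisted clients
ceph osd blacklist rm 192.168.0.10:0/123456789   # clear an entry once the client is truly gone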

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-27 Thread Alexandre DERUMIER
Hi, currently I'm using telegraf + influxdb to monitor. Note that this bug seems to occur only on writes; I don't see a latency increase on reads. The counters are op_latency, op_w_latency, op_w_process_latency. SELECT non_negative_derivative(first("op_latency.sum"),
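
Since each perf counter exposes a running sum and avgcount, the average latency over an interval is the derivative of the sum divided by the derivative of the count. A sketch of the full query pattern; the database, measurement, and tag names depend on how the telegraf ceph input is configured, so treat them as examples:

influx -database 'telegraf' -execute '
  SELECT non_negative_derivative(first("op_w_latency.sum"), 1s)
       / non_negative_derivative(first("op_w_latency.avgcount"), 1s)
  FROM "ceph" WHERE time > now() - 1h GROUP BY time(1m), "id"'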

Re: [ceph-users] how to debug a stuck cephfs?

2019-01-27 Thread Sang, Oliver
Thanks! So we should not evict clients? We ran into slow requests, so we guessed that evicting all clients could help. In which cases should or shouldn't we use evict?

-----Original Message-----
From: Yan, Zheng [mailto:uker...@gmail.com]
Sent: Monday, January 28, 2019 1:07 PM
To: Sang, Oliver
Cc:
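
Before evicting anything it helps to see which session is actually behind the slow requests. A sketch along the lines of the CephFS eviction docs; the rank and client id are only examples:

ceph tell mds.0 client ls              # list sessions, find the misbehaving client
ceph tell mds.0 client evict id=4305   # evict one client by id rather than all of them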

Re: [ceph-users] how to debug a stuck cephfs?

2019-01-27 Thread Yan, Zheng
http://docs.ceph.com/docs/master/cephfs/troubleshooting/ For your case, it's likely the client got evicted by the MDS.

On Mon, Jan 28, 2019 at 9:50 AM Sang, Oliver wrote:
>
> Hello,
>
> Our cephfs looks just stuck. If I run some command such as 'mkdir' or
> 'touch' a new file, it just gets stuck
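
A sketch for narrowing down whether a client was evicted and what is blocked; mds.a and the debugfs path assume a kernel client and are only examples:

ceph daemon mds.a session ls          # per-client session state
ceph daemon mds.a dump_ops_in_flight  # requests the MDS is still working on
dmesg | grep -i ceph                  # on a kernel client, eviction shows up here
cat /sys/kernel/debug/ceph/*/mdsc     # requests the kernel client has in flight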

Re: [ceph-users] MDS performance issue

2019-01-27 Thread Yan, Zheng
On Mon, Jan 28, 2019 at 10:34 AM Albert Yue wrote:
>
> Hi Yan Zheng,
>
> Our clients are also complaining about operations like 'du' or 'ncdu' being
> very slow. Is there any alternative tool for such kind of operation on
> CephFS? Thanks!
>

'du' traverses the whole directory tree to calculate
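
CephFS maintains recursive statistics per directory, so the tree walk can be skipped entirely via the virtual xattrs. A sketch; the mount path is only an example:

getfattr -n ceph.dir.rbytes /mnt/cephfs/some/dir    # recursive size in bytes
getfattr -n ceph.dir.rentries /mnt/cephfs/some/dir  # recursive file/dir count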

Re: [ceph-users] MDS performance issue

2019-01-27 Thread Albert Yue
Hi Yan Zheng, Our clients are also complaining about operations like 'du' or 'ncdu' being very slow. Is there any alternative tool for this kind of operation on CephFS? Thanks! Best regards, Albert

On Wed, Jan 23, 2019 at 11:04 AM Yan, Zheng wrote:
> On Wed, Jan 23, 2019 at 10:02 AM Albert

Re: [ceph-users] Questions about using existing HW for PoC cluster

2019-01-27 Thread Will Dennis
Thanks for contributing your knowledge to that book, Anthony - really enjoying it :) I didn't mean to use the OS SSD for Ceph use - I would buy a second SSD per server for that... I will take a look at SATA SSD prices; hopefully the smaller ones (<500GB) will be at an acceptable price so that I

[ceph-users] how to debug a stuck cephfs?

2019-01-27 Thread Sang, Oliver
Hello, Our cephfs looks just stuck. If I run some command such as 'mkdir' or 'touch' a new file, it just gets stuck there. Any suggestion about how to debug this issue will be very appreciated.

Re: [ceph-users] Questions about using existing HW for PoC cluster

2019-01-27 Thread Anthony D'Atri
> Been reading "Learning Ceph - Second Edition"

An outstanding book, I must say ;)

> So can I get by with using a single SATA SSD (size?) per server for RocksDB /
> WAL if I'm using BlueStore?

Depends on the rest of your setup and use-case, but I think this would be a bottleneck. Some
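
For reference, splitting the DB/WAL onto a separate device happens at OSD creation time. A minimal ceph-volume sketch; the device names are only examples, and without an explicit --block.wal the WAL is co-located with the DB:

ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1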

Re: [ceph-users] Ceph rbd.ko compatibility

2019-01-27 Thread Paul Emmerich
(Copying from our Ceph training material https://croit.io/training )

Feature         Kernel version
layering        3.8
striping        3.10
exclusive-lock  4.9
object-map      not supported
fast-diff       not supported
deep-flatten    not supported
data-pool       4.11
journaling      WIP, might be in 5.0 or later
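
If an image carries features the running kernel lacks, mapping fails; the unsupported ones can be disabled per image. A sketch, with pool and image names as examples:

rbd feature disable rbd/myimage object-map fast-diff deep-flatten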

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-27 Thread Chris
When your node went down, you lost the copies of every object that was stored on that node, so the cluster had to re-create those copies elsewhere. When the node came back online (and particularly since your usage was near-zero), the cluster discovered that many objects did not require
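
For planned downtime this churn can be avoided by telling the cluster not to mark OSDs out while the node is away. A sketch:

ceph osd set noout     # before taking the node down
# ... reboot / maintenance ...
ceph osd unset noout   # once all OSDs are back up and in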

Re: [ceph-users] Questions about using existing HW for PoC cluster

2019-01-27 Thread Will Dennis
Been reading "Learning Ceph - Second Edition" (https://learning.oreilly.com/library/view/learning-ceph-/9781787127913/8f98bac7-44d4-45dc-b672-447d162ea604.xhtml) and in Ch. 4 I read this: "We've noted that Ceph OSDs built with the new BlueStore back end do not require journals. One might

Re: [ceph-users] One host with 24 OSDs is offline - best way to get it back online

2019-01-27 Thread Götz Reinicke
Dear all, thanks for your feedback; I'll try to take every suggestion into consideration. I've rebooted the node in question and all 24 OSDs came back online without any complaints. But what makes me wonder is: during the downtime the objects got rebalanced and placed on the remaining nodes. With the

[ceph-users] cephfs constantly strays ( num_strays)

2019-01-27 Thread Marc Roos
I constantly have strays. What are strays? Why do I have them? Is this bad?

[@~]# ceph daemon mds.c perf dump | grep num_stray
    "num_strays": 25823,
    "num_strays_delayed": 0,
    "num_strays_enqueuing": 0,
[@~]#

[ceph-users] Bug in application of bucket policy s3:PutObject?

2019-01-27 Thread Marc Roos
If I want a user to only be able to put objects, and not download or delete them, I have to apply a secondary statement denying GetObject, even though I never granted GetObject in the first place. This works:

{
  "Sid": "put-only-objects-s2",
  "Effect": "Deny",
  "Principal": {
    "AWS": [
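
A sketch of the full shape such a policy would take, written to a file and attached with s3cmd; the bucket and user names are only examples:

cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "put-only-objects",
      "Effect": "Allow",
      "Principal": {"AWS": ["arn:aws:iam:::user/testuser"]},
      "Action": ["s3:PutObject"],
      "Resource": ["arn:aws:s3:::mybucket/*"]
    },
    {
      "Sid": "deny-get-objects",
      "Effect": "Deny",
      "Principal": {"AWS": ["arn:aws:iam:::user/testuser"]},
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::mybucket/*"]
    }
  ]
}
EOF
s3cmd setpolicy policy.json s3://mybucket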

Re: [ceph-users] Radosgw s3 subuser permissions

2019-01-27 Thread Marc Roos
I tried with these, but didn't get any results:

"arn:aws:iam::Company:user/testuser:testsubuser"
"arn:aws:iam::Company:subuser/testuser:testsubuser"

-----Original Message-----
From: Adam C. Emerson [mailto:aemer...@redhat.com]
Sent: Friday, 25 January 2019 16:40
To: The Exoteric Order of the

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-01-27 Thread Marc Roos
Hi Alexandre, I was curious whether I have a similar issue; which value are you monitoring? I have quite a lot to choose from:

Bluestore.commitLat
Bluestore.kvLat
Bluestore.readLat
Bluestore.readOnodeMetaLat
Bluestore.readWaitAioLat
Bluestore.stateAioWaitLat
Bluestore.stateDoneLat
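
The raw values can also be read straight from the admin socket; each latency counter carries an avgcount and a sum, and the average over an interval is delta(sum) / delta(avgcount) between two samples. A sketch, with osd.0 as an example:

ceph daemon osd.0 perf dump | grep -A 3 '"op_w_latency"'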

[ceph-users] Ceph rbd.ko compatibility

2019-01-27 Thread Marc Schöchlin
Hello ceph-users, we are using a low number of rbd.ko clients with our luminous cluster. Where can I get information about the following questions:
* Which features and cluster compatibility are provided by the rbd.ko module of my system? (/sys/module/rbd/**, "modinfo rbd" does not seem to
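
One blunt but practical way to find out is to attempt a map and read the kernel log; the pool and image names are only examples:

uname -r              # kernel version, to compare against feature tables
rbd map rbd/myimage   # fails if the image uses features this kernel lacks
dmesg | tail          # e.g. "rbd: image myimage: image uses unsupported features: 0x38"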