Re: [ceph-users] units of metrics

2020-01-14 Thread Robert LeBlanc
On Tue, Jan 14, 2020 at 12:30 AM Stefan Kooman wrote: > Quoting Robert LeBlanc (rob...@leblancnet.us): > > The link that you referenced above is no longer available, do you have a > > new link? We upgraded from 12.2.8 to 12.2.12 and the MDS metrics all > > changed, so I'm

Re: [ceph-users] units of metrics

2020-01-13 Thread Robert LeBlanc
The link that you referenced above is no longer available, do you have a new link? We upgraded from 12.2.8 to 12.2.12 and the MDS metrics all changed, so I'm trying to map the old values to the new values. Might just have to look in the code. :( Thanks! Robert LeBlanc PGP

Re: [ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Robert LeBlanc
> The solution for you is to simply put the option under global and restart > ceph-mgr (or use daemon config set; it doesn't support changing config via > ceph tell for some reason) > > > Paul > > On Mon, Dec 9, 2019 at 8:32 PM Paul Emmerich > wrote: > >> >> &
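
The option being discussed isn't quoted in the snippet; assuming it is the deep-scrub interval (which matches the thread subject), a minimal sketch of "put it under global and restart ceph-mgr" looks like this. The warning is evaluated by the mgr/mon, so raising the interval only in the [osd] section does not silence it:

  # /etc/ceph/ceph.conf
  [global]
  osd deep scrub interval = 1209600   # e.g. 14 days; the default is 7 days (604800)

  # then restart the manager so it re-evaluates the warning
  systemctl restart ceph-mgr.target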

[ceph-users] Annoying PGs not deep-scrubbed in time messages in Nautilus.

2019-12-09 Thread Robert LeBlanc
0 1.8 PiB default.rgw.buckets.non-ec 8 8.1 MiB 22 8.1 MiB 0 1.8 PiB Please help me figure out what I'm doing wrong with these settings. Thanks, Robert LeBlanc Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654

[ceph-users] Cephfs metadata fix tool

2019-12-07 Thread Robert LeBlanc
Our Jewel cluster is exhibiting some similar issues to the one in this thread [0] and it was indicated that a tool would need to be written to fix that kind of corruption. Has the tool been written? How would I go about repairing these 16EB directories that won't delete? Thank you, Robert LeBlanc [0

Re: [ceph-users] RGW performance with low object sizes

2019-12-03 Thread Robert LeBlanc
On Tue, Dec 3, 2019 at 9:11 AM Ed Fisher wrote: > > > On Dec 3, 2019, at 10:28 AM, Robert LeBlanc wrote: > > Did you make progress on this? We have a ton of < 64K objects as well and > are struggling to get good performance out of our RGW. Sometimes we have > RGW

Re: [ceph-users] RGW performance with low object sizes

2019-12-03 Thread Robert LeBlanc
[quoted benchmark table fragment: a row reporting 196.3 MB/s; the CLEANUP section follows]

Re: [ceph-users] Revert a CephFS snapshot?

2019-12-03 Thread Robert LeBlanc
back each file's content. The MDS could do this more > efficiently than rsync give what it knows about the snapped inodes > (skipping untouched inodes or, eventually, entire subtrees) but it's a > non-trivial amount of work to implement. > > Would it make sense to extend CephF

Re: [ceph-users] Decreasing the impact of reweighting osds

2019-10-25 Thread Robert LeBlanc
You can try adding osd op queue = wpq and osd op queue cut off = high to all the OSD ceph configs and restarting. That has made reweighting pretty painless for us. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Oct 22, 2019 at 8:36 PM
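
For reference, the two settings quoted above are ordinary ceph.conf options applied on the OSD hosts; a minimal sketch (restart one host or failure domain at a time):

  # /etc/ceph/ceph.conf
  [osd]
  osd op queue = wpq            # weighted priority queue op scheduler
  osd op queue cut off = high   # weight replication/backfill ops too, not just client ops

  systemctl restart ceph-osd.target   # after the config is in place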

Re: [ceph-users] Decreasing the impact of reweighting osds

2019-10-25 Thread Robert LeBlanc
You can try adding Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Oct 22, 2019 at 8:36 PM David Turner wrote: > > Most times you are better served with simpler settings like > osd_recovery_sleep, which has 3 variants if

Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-17 Thread Robert LeBlanc
mpact for client traffic. Those would need to be set on all OSDs to be completely effective. Maybe go back to the defaults? Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-17 Thread Robert LeBlanc
arameter? Wow! Dusting off the cobwebs here. I think this is what lead me to dig into the code and write the WPQ scheduler. I can't remember doing anything specific. I'm sorry I'm not much help in this regard. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A90

Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-16 Thread Robert LeBlanc
settings on and it really helped both of them. ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Openstack VM IOPS drops dramatically during Ceph recovery

2019-10-14 Thread Robert LeBlanc
ontrol? > > best regards, > > Samuel Not sure which version of Ceph you are on, but add these to your /etc/ceph/ceph.conf on all your OSDs and restart them. osd op queue = wpq osd op queue cut off = high That should really help and make backfills and recovery be non-impactful. This wi

Re: [ceph-users] Commit and Apply latency on nautilus

2019-10-01 Thread Robert LeBlanc
On Tue, Oct 1, 2019 at 7:54 AM Robert LeBlanc wrote: > > On Mon, Sep 30, 2019 at 5:12 PM Sasha Litvak > wrote: > > > > At this point, I ran out of ideas. I changed nr_requests and readahead > > parameters to 128->1024 and 128->4096, tuned nodes to > > pe

Re: [ceph-users] Commit and Apply latency on nautilus

2019-10-01 Thread Robert LeBlanc
hat else I can > try. > > Any suggestions? If you haven't already tried this, add this to your ceph.conf and restart your OSDs, this should help bring down the variance in latency (It will be the default in Octopus): osd op queue = wpq osd op queue cut off = high Robert LeB

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-24 Thread Robert LeBlanc
ig in a single file, so I put it in my inventory file, but it looks like you have the right idea. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] hanging slow requests: failed to authpin, subtree is being exported

2019-09-23 Thread Robert LeBlanc
t: 141 KiB/s rd, 54 MiB/s wr, 62 op/s rd, 577 op/s wr > > > > > [root@mds02 ~]# ceph health detail > > HEALTH_WARN 1 MDSs report slow requests; 2 MDSs behind on trimming > > MDS_SLOW_REQUEST 1 MDSs report slow requests > > mdsmds02(mds.1): 2 slow requests a

Re: [ceph-users] ceph; pg scrub errors

2019-09-23 Thread Robert LeBlanc
m is repaired and when it deep-scrubs to check it, the problem has reappeared or another problem was found and the disk needs to be replaced. Try running: rados list-inconsistent-obj ${PG} --format=json and see what the exact problems are. -------- Robert LeBlanc PGP Fingerprint 79A2 9CA
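
A sketch of that workflow, using a made-up PG id:

  ceph health detail                                        # lists the inconsistent PGs
  rados list-inconsistent-obj 2.15f --format=json-pretty    # which object/shard and which error
  ceph pg deep-scrub 2.15f                                  # re-check after a repair or disk swap
  ceph pg repair 2.15f                                      # only once you trust the surviving copies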

Re: [ceph-users] cephfs performance issue MDSs report slow requests and osd memory usage

2019-09-23 Thread Robert LeBlanc
> included below; oldest blocked for > 62497.675728 secs > > 2019-09-19 08:53:47.528891 mds.icadmin007 [WRN] 3 slow requests, 0 included > > below; oldest blocked for > 62501.243214 secs > > 2019-09-19 08:53:52.529021 mds.icadmin007 [WRN] 3 slow requests, 0 included > >

Re: [ceph-users] Failure to start ceph-mon in docker

2019-08-29 Thread Robert LeBlanc
to the Ceph distributed packages didn't change the UID. Thanks, Robert LeBlanc Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Aug 29, 2019 at 12:33 AM Frank Schilder wrote: > Hi Robert, > > this is a bit less trivial than it m

[ceph-users] Specify OSD size and OSD journal size with ceph-ansible

2019-08-28 Thread Robert LeBlanc
ible and no LVs were created. Please help me understand how to configure what I would like to do. Thank you, Robert LeBlanc Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users m

Re: [ceph-users] Failure to start ceph-mon in docker

2019-08-28 Thread Robert LeBlanc
Turns out /var/lib/ceph was owned ceph.ceph and not 167.167; chowning it made things work. I guess only the monitor needs that ownership; rgw, mgr, and osd are all happy without it being 167.167. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed
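
In other words, the fix is to chown the bind-mounted state directory on the host to the numeric UID/GID the containerized daemons run as; a sketch, assuming the default /var/lib/ceph bind mount:

  chown -R 167:167 /var/lib/ceph/mon   # enough for the monitor, per the observation above
  # or, as done here, the whole tree:
  chown -R 167:167 /var/lib/ceph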

[ceph-users] Failure to start ceph-mon in docker

2019-08-28 Thread Robert LeBlanc
-rw-r--r-- 1 167 167 1.3M Aug 28 19:16 MANIFEST-027846 -rw-r--r-- 1 167 167 4.7K Aug 1 23:38 OPTIONS-002825 -rw-r--r-- 1 167 167 4.7K Aug 16 07:40 OPTIONS-027849 Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] MDSs report damaged metadata

2019-08-22 Thread Robert LeBlanc
one. When I deleted the directories with the damage the active MDS crashed, but the replay took over just fine. I haven't had the messages now for almost a week. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Aug 19, 2019 at 10:30 PM Lars

Re: [ceph-users] How does CephFS find a file?

2019-08-19 Thread Robert LeBlanc
to know how many objects to fetch for the whole object. The file is stored by the inode (in hex) appended by the object offset. The inode corresponds to the same value in `ls -li` in CephFS converted to hex. I hope that is correct and useful as a starting point for you. Robert LeBlanc
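
A worked sketch of that mapping, with a hypothetical file and a default-style data pool name:

  ls -li /mnt/cephfs/some/file          # first column is the inode number, in decimal
  printf '%x.%08x\n' 1099511627776 0    # -> 10000000000.00000000, the file's first object
  rados -p cephfs_data stat 10000000000.00000000    # confirm the object exists
  ceph osd map cephfs_data 10000000000.00000000     # which PG and OSDs hold it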

Re: [ceph-users] New CRUSH device class questions

2019-08-12 Thread Robert LeBlanc
I'm looking for. Thank you, Robert LeBlanc Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] Replay MDS server stuck

2019-08-09 Thread Robert LeBlanc
would be appreciated. Thank you, Robert LeBlanc ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Robert LeBlanc
On Wed, Aug 7, 2019 at 12:08 AM Konstantin Shalygin wrote: > On 8/7/19 1:40 PM, Robert LeBlanc wrote: > > > Maybe it's the lateness of the day, but I'm not sure how to do that. > > Do you have an example where all the OSDs are of class ssd? > Can't parse what you mean. Yo

Re: [ceph-users] New CRUSH device class questions

2019-08-07 Thread Robert LeBlanc
t of balance. This is run by cron every 5 minutes. If there is a way to reserve some capacity for a pool that no other pool can use, please provide an example. Think of reserved inode space in ext4/XFS/etc. Thank you. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C7

Re: [ceph-users] New CRUSH device class questions

2019-08-06 Thread Robert LeBlanc
On Tue, Aug 6, 2019 at 11:11 AM Paul Emmerich wrote: > On Tue, Aug 6, 2019 at 7:45 PM Robert LeBlanc > wrote: > > We have a 12.2.8 luminous cluster with all NVMe and we want to take some > of the NVMe OSDs and allocate them strictly to metadata pools (we have a > probl

[ceph-users] New CRUSH device class questions

2019-08-06 Thread Robert LeBlanc
(potentially using a file with a list of partition UUIDs that should be in the metadata pool)? Any other options I may not be considering? Thank you, Robert LeBlanc Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] Built-in HA?

2019-08-05 Thread Robert LeBlanc
Routing and bind the source port on the connection (not the easiest, but allows you to have multiple NICs in the same broadcast domain). I don't have experience with Ceph in this type of configuration. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1

Re: [ceph-users] How to add 100 new OSDs...

2019-08-03 Thread Robert LeBlanc
Gorbachev wrote: > On Fri, Aug 2, 2019 at 6:57 PM Robert LeBlanc > wrote: > > > > On Fri, Jul 26, 2019 at 1:02 PM Peter Sabaini wrote: > >> > >> On 26.07.19 15:03, Stefan Kooman wrote: > >> > Quoting Peter Sabaini (pe...@sabaini.at): > >>

Re: [ceph-users] Problems understanding 'ceph-features' output

2019-08-02 Thread Robert LeBlanc
On Tue, Jul 30, 2019 at 2:06 AM Janne Johansson wrote: > Someone should make a webpage where you can enter that hex-string and get > a list back. > Providing a minimum bitmap would allow someone to do so, and someone like me to do it manually until then. ---- Robert Le

Re: [ceph-users] How to add 100 new OSDs...

2019-08-02 Thread Robert LeBlanc
s of client I/O, but the clients haven't noticed that huge backfills have been going on. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Mark CephFS inode as lost

2019-07-23 Thread Robert LeBlanc
Thanks, I created a ticket. http://tracker.ceph.com/issues/40906 Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jul 22, 2019 at 11:45 PM Yan, Zheng wrote: > please create a ticket at http://tracker.ceph.com/projects/cep

[ceph-users] Mark CephFS inode as lost

2019-07-22 Thread Robert LeBlanc
, how can we tell MDS that the inode is lost and to forget about it without trying to do any checks on it (checking the RADOS objects may be part of the problem)? Once the inode is out of CephFS, we can clean up the RADOS objects manually or leave them there to rot. Thanks, Robert LeBlanc

Re: [ceph-users] Investigating Config Error, 300x reduction in IOPs performance on RGW layer

2019-07-17 Thread Robert LeBlanc
I'm pretty new to RGW, but I'm needing to get max performance as well. Have you tried moving your RGW metadata pools to nvme? Carve out a bit of NVMe space and then pin the pool to the SSD class in CRUSH, that way the small metadata ops aren't on slow media. Robert LeBlanc PGP
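
A sketch of that pinning on Luminous or later, with an example rule name and the default RGW pool names (use whatever device class your fast OSDs actually report, e.g. ssd or nvme):

  ceph osd crush rule create-replicated rgw-meta-fast default host ssd
  ceph osd pool set default.rgw.buckets.index crush_rule rgw-meta-fast
  ceph osd pool set default.rgw.meta crush_rule rgw-meta-fast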

[ceph-users] Allocation recommendations for separate blocks.db and WAL

2019-07-17 Thread Robert LeBlanc
/Leveled-Compaction Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] enterprise support

2019-07-15 Thread Robert LeBlanc
We recently used Croit (https://croit.io/) and they were really good. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jul 15, 2019 at 12:53 PM Void Star Nill wrote: > Hello, > > Other than Redhat and SUSE, are there other

Re: [ceph-users] To backport or not to backport

2019-07-05 Thread Robert LeBlanc
"Just Works". By not backporting new features, I think it gives more time to bake the features into the new version and frees up the developers to focus on the forward direction of the product. If I want a new feature, then the burden is on me to t

Re: [ceph-users] cannot add fuse options to ceph-fuse command

2019-07-05 Thread Robert LeBlanc
Is this a Ceph specific option? If so, you may need to prefix it with "ceph.", at least I had to for FUSE to pass it to the Ceph module/code portion. -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jul 4, 2019 at 7:35 AM s

[ceph-users] ceph-ansible with docker

2019-07-01 Thread Robert LeBlanc
their own IP address and are bridges created like LXD or does it share the host IP? Thank you, Robert LeBlanc Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users

Re: [ceph-users] increase pg_num error

2019-07-01 Thread Robert LeBlanc
On Mon, Jul 1, 2019 at 11:57 AM Brett Chancellor wrote: > In Nautilus just pg_num is sufficient for both increases and decreases. > > Good to know, I haven't gotten to Nautilus yet. ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2
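
For reference, the commands involved, with a hypothetical pool name (pgp_num can never be set higher than pg_num, and Nautilus and later adjust pgp_num for you):

  ceph osd pool set mypool pg_num 256
  ceph osd pool set mypool pgp_num 256   # pre-Nautilus only; must end up equal to pg_num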

Re: [ceph-users] increase pg_num error

2019-07-01 Thread Robert LeBlanc
I believe he needs to increase the pgp_num first, then pg_num. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jul 1, 2019 at 7:21 AM Nathan Fish wrote: > I ran into this recently. Try running "ceph osd require-osd-release &g

Re: [ceph-users] How does monitor know OSD is dead?

2019-06-29 Thread Robert LeBlanc
conds pass with the monitor not hearing from the OSD, it will mark it down. It 'should' only take 20 seconds to detect a downed OSD. Usually, the problem is that an OSD gets too busy and misses heartbeats so other OSDs wrongly mark them down. If 'nodown' is set,
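
The timers being described map onto these options; defaults shown as a reference sketch, not as tuning advice:

  [global]
  osd heartbeat interval = 6         # how often OSDs ping their peers
  osd heartbeat grace = 20           # peers report an OSD down after this much silence
  mon osd min down reporters = 2     # reports the mon wants before marking the OSD down
  mon osd down out interval = 600    # down this long -> marked out, backfill starts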

Re: [ceph-users] How does monitor know OSD is dead?

2019-06-29 Thread Robert LeBlanc
he "mon osd down out interval". The rest of what I wrote is correct. Just to make sure I don't confuse anyone else. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Migrating a cephfs data pool

2019-06-28 Thread Robert LeBlanc
, then you can remove the pool from cephfs and the overlay. That way the OSDs are the one doing the data movement. I don't know that part of the code, so I can't quickly propose any patches. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri

Re: [ceph-users] Migrating a cephfs data pool

2019-06-28 Thread Robert LeBlanc
hundreds of Terabytes, we need something that can be done online, and if it has a minute or two of downtime would be okay. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Jun 28, 2019 at 9:02 AM Marc Roos wrote: > > > 1. >

Re: [ceph-users] How does monitor know OSD is dead?

2019-06-28 Thread Robert LeBlanc
to continue. Then when the down timeout expires it will start backfilling and recovering the PGs that were affected. Double check that size != min_size for your pools. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jun 27, 2019 at 5:26 PM Bryan

Re: [ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-25 Thread Robert LeBlanc
There may also be more memory copying involved instead of just passing pointers around as well, but I'm not 100% sure. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jun 24, 2019 at 10:28 AM Jeff Layton wrote: > On Mon, 2019-06

Re: [ceph-users] rebalancing ceph cluster

2019-06-25 Thread Robert LeBlanc
} {weight}``` Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Jun 24, 2019 at 2:25 AM jinguk.k...@ungleich.ch < jinguk.k...@ungleich.ch> wrote: > Hello everyone, > > We have some osd on the ceph. > Some osd's usa
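
The command being quoted, shown with illustrative values:

  ceph osd df tree                       # spot the over- and under-full OSDs
  ceph osd crush reweight osd.12 1.70    # permanent CRUSH weight, usually ~ device size in TiB
  ceph osd reweight osd.12 0.95          # or the temporary 0..1 override instead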

Re: [ceph-users] slow requests are blocked > 32 sec. Implicated osds 0, 2, 3, 4, 5 (REQUEST_SLOW)

2019-06-10 Thread Robert LeBlanc
W) warning, even if my OSD disk usage goes above 95% (fio >> ran from 4 different hosts) >> >> On my prod cluster, release 12.2.9, as soon as I run fio on a single >> host, I see a lot of REQUEST_SLOW warning messages, but "iostat -xd 1" >> does not sho

Re: [ceph-users] slow requests are blocked > 32 sec. Implicated osds 0, 2, 3, 4, 5 (REQUEST_SLOW)

2019-06-10 Thread Robert LeBlanc
; help in this case ? > Regards > Your disk times look okay, just a lot more unbalanced than I would expect. I'd give wpq a try, I use it all the time, just be sure to also include the op_cutoff setting too or it doesn't have much effect. Let me know how it goes.

Re: [ceph-users] slow requests are blocked > 32 sec. Implicated osds 0, 2, 3, 4, 5 (REQUEST_SLOW)

2019-06-07 Thread Robert LeBlanc
backfills not so disruptive. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Jun 6, 2019 at 1:43 AM BASSAGET Cédric wrote: > Hello, > > I see messages related to REQUEST_SLOW a few times per day. > > here's my ceph -s : > >

Re: [ceph-users] performance in a small cluster

2019-05-24 Thread Robert LeBlanc
sts are small amounts of data, but once the drive started getting full, the performance dropped off a cliff. Considering that Ceph is really hard on drives, it's good to test the extreme. Robert LeBlanc ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] CephFS object mapping.

2019-05-24 Thread Robert LeBlanc
On Fri, May 24, 2019 at 2:14 AM Burkhard Linke < burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > On 5/22/19 5:53 PM, Robert LeBlanc wrote: > > When you say 'some' is it a fixed offset that the file data starts? Is the > first stripe just metadata? > > No

Re: [ceph-users] Major ceph disaster

2019-05-24 Thread Robert LeBlanc
> Does this mean that the lost object isn't even a file that appears in the > ceph directory. Maybe a leftover of a file that has not been deleted > properly? It wouldn't be an issue to mark the object as lost in that case. > On 24.05.19 5:08 PM, Robert LeBlanc wrote: > > You need to u

Re: [ceph-users] Major ceph disaster

2019-05-24 Thread Robert LeBlanc
You need to use the first stripe of the object as that is the only one with the metadata. Try "rados -p ec31 getxattr 10004dfce92. parent" instead. Robert LeBlanc Sent from a mobile device, please excuse any typos. On Fri, May 24, 2019, 4:42 AM Kevin Flöh wrote: > Hi, &
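
A sketch of that lookup; the .00000000 suffix for the first stripe and the backtrace-decoding step are assumptions, not quoted from the thread:

  rados -p ec31 getxattr 10004dfce92.00000000 parent > parent.bin
  ceph-dencoder type inode_backtrace_t import parent.bin decode dump_json   # human-readable path back to the file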

Re: [ceph-users] Major ceph disaster

2019-05-22 Thread Robert LeBlanc
how to proceed. 1. Do a deep-scrub on each PG that is inconsistent. (This may fix some of them) 2. Print out the inconsistent report for each inconsistent PG. `rados list-inconsistent-obj --format=json-pretty` 3. You will want to look at the error messages and see if all

Re: [ceph-users] CephFS object mapping.

2019-05-22 Thread Robert LeBlanc
On Wed, May 22, 2019 at 12:22 AM Burkhard Linke < burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > > On 5/21/19 9:46 PM, Robert LeBlanc wrote: > > I'm at a new job working with Ceph again and am excited to back in the > > community! > > > > I

[ceph-users] CephFS object mapping.

2019-05-21 Thread Robert LeBlanc
ert LeBlanc ---- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] deep-scrubbing has large impact on performance

2016-11-22 Thread Robert LeBlanc
If you use wpq, I recommend also setting "osd_op_queue_cut_off = high" as well, otherwise replication OPs are not weighted and really reduces the benefit of wpq. -------- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Nov 22, 2016

Re: [ceph-users] Blocked ops, OSD consuming memory, hammer

2016-05-26 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, May 24, 2016 at 3:16 PM, Heath Albritton <halbr...@harm.org> wro

Re: [ceph-users] mark out vs crush weight 0

2016-05-23 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4

Re: [ceph-users] Ceph InfiniBand Cluster - Jewel - Performance

2016-04-07 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Apr 7, 2016

Re: [ceph-users] data corruption with hammer

2016-03-20 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Mar 17, 2016 at 8:19

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Robert LeBlanc
Yep, let me pull and build that branch. I tried installing the dbg packages and running it in gdb, but it didn't load the symbols. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Mar 17, 2016 at 11:36 AM, Sage Weil <sw...@redhat.

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Mar 16, 2016 at 1:40 PM, Gregory Farnum <gfar...@redhat.com> wrote: > This tracker ticket happened to

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Robert LeBlanc
could reproduce this with a > ceph_test_rados workload (from ceph-tests). I.e., get ceph_test_rados > running, and then find the sequence of operations that are sufficient to > trigger a failure. > > sage

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Robert LeBlanc
Cherry-picking that commit onto v0.94.6 wasn't clean so I'm just building your branch. I'm not sure what the difference between your branch and 0.94.6 is, I don't see any commits against osd/ReplicatedPG.cc in the last 5 months other than the one you did today. Robert LeBlanc PGP

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Mar 17, 2016 at 10:39 AM, Sage Weil <sw...@redhat.com> wrote: > On Thu, 17 Mar 2016, Robert LeBlanc wrote:

Re: [ceph-users] data corruption with hammer

2016-03-19 Thread Robert LeBlanc
Also, is this ceph_test_rados rewriting objects quickly? I think that the issue is with rewriting objects so if we can tailor the ceph_test_rados to do that, it might be easier to reproduce. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu

Re: [ceph-users] data corruption with hammer

2016-03-15 Thread Robert LeBlanc
Robert LeBlanc

Re: [ceph-users] how to downgrade when upgrade from firefly to hammer fail

2016-03-07 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Sun

Re: [ceph-users] Cache Pool and EC: objects didn't flush to a cold EC storage

2016-03-07 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On

Re: [ceph-users] Replacing OSD drive without rempaping pg's

2016-03-01 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Feb 29, 2016 at 10:29 PM, Lindsay Mathieson <lindsay.mathie...@gmail.com> wrote: > I was looking at replacing an osd drive in place as per the procedure here:

Re: [ceph-users] List of SSDs

2016-02-26 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Feb 26, 2016 at 5:41 PM, Shinobu Kinjo <ski...@redhat.

Re: [ceph-users] List of SSDs

2016-02-26 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Fri, Feb 26, 2016 at 4:05 PM, Shinobu Kinjo <

Re: [ceph-users] List of SSDs

2016-02-25 Thread Robert LeBlanc
benchmarks. Some of the data about the S3500s is from my test cluster that has them. Sent from a mobile device, please excuse any typos. On Feb 25, 2016 9:20 PM, "Christian Balzer" <ch...@gol.com> wrote: > > Hello, > > On Wed, 24 Feb 2016 22:56:15 -0700 Robert LeBlan

Re: [ceph-users] Observations with a SSD based pool under Hammer

2016-02-25 Thread Robert LeBlanc
, but there was some. Sent from a mobile device, please excuse any typos. On Feb 25, 2016 9:15 PM, "Christian Balzer" <ch...@gol.com> wrote: > > Hello, > > On Wed, 24 Feb 2016 23:01:43 -0700 Robert LeBlanc wrote: > > > With my S3500 drives in my test clus

Re: [ceph-users] Can not disable rbd cache

2016-02-25 Thread Robert LeBlanc
My guess would be that if you are already running hammer on the client it is already using the new watcher API. This would be a fix on the OSDs to allow the object to be moved because the current client is smart enough to try again. It would be watchers per object. Sent from a mobile device,

Re: [ceph-users] Observations with a SSD based pool under Hammer

2016-02-24 Thread Robert LeBlanc
With my S3500 drives in my test cluster, the latest master branch gave me an almost 2x increase in performance compare to just a month or two ago. There looks to be some really nice things coming in Jewel around SSD performance. My drives are now 80-85% busy doing about 10-12K IOPS when doing 4K

Re: [ceph-users] List of SSDs

2016-02-24 Thread Robert LeBlanc
We are moving to the Intel S3610, from our testing it is a good balance between price, performance and longevity. But as with all things, do your testing ahead of time. This will be our third model of SSDs for our cluster. The S3500s didn't have enough life and performance tapers off as it gets
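
A common way to do that up-front testing is fio with direct, synchronous 4K writes, which is far closer to Ceph journal behaviour than a cached sequential benchmark; a sketch (this writes straight to the device and destroys its data):

  fio --name=sync-write-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --iodepth=1 --numjobs=1 --runtime=300 --time_based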

Re: [ceph-users] ceph hammer : rbd info/Status : operation not supported (95) (EC+RBD tier pools)

2016-02-24 Thread Robert LeBlanc
We have not seen this issue, but we don't run EC pools yet (we are waiting for multiple layers to be available). We are not running 0.94.6 in production yet either. We have adopted the policy to only run released versions in production unless there is a really pressing need to have a patch. We are

Re: [ceph-users] Can not disable rbd cache

2016-02-24 Thread Robert LeBlanc
- Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Feb 24, 2016 at 4:29 AM, Oliver Dzombic <i...@ip-interactive.de> wrote: > Hi Esta, > > how do you know, that its still active ? > > -- > Mit freundlichen Gruessen / Best regards

Re: [ceph-users] Crush map customization for production use

2016-02-24 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Feb 24, 2016 at 4:09 AM, Vickey Singh <vickey.singh22...@gmail.com> wrote: > He

Re: [ceph-users] Incorrect output from ceph osd map command

2016-02-23 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Feb 23, 2016 at 3:33 PM, Vickey Singh

Re: [ceph-users] Ceph and its failures

2016-02-23 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Feb

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't uptosnuff)

2016-02-13 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Sat, Feb 13, 2016 at 8:51 PM, Tom Christensen <pav...@gmail.com> wrote: >> > Next this : > --- > 2016-02-12 01:35:33.915981 7f75be4d57c0 0 osd.

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't uptosnuff)

2016-02-12 Thread Robert LeBlanc
few weeks time I can have a report on what I find. Hopefully we can have it fixed for Jewel and Hammer. Fingers crossed. Robert LeBlanc Sent from a mobile device please excuse any typos. On Feb 12, 2016 10:32 PM, "Christian Balzer" <ch...@gol.com> wrote: > > Hello, > >

Re: [ceph-users] Reducing the impact of OSD restarts (noout ain't uptosnuff)

2016-02-12 Thread Robert LeBlanc
o resolve this by the time I'm finished with the queue optimizations I'm doing (hopefully in a week or two), I plan on looking into this to see if there is something that can be done to prevent the OPs from being accepted until the OSD is ready for them. - ---- Robert LeBlanc PGP Fingerprint

Re: [ceph-users] cls_rbd ops on rbd_id.$name objects in EC pool

2016-02-11 Thread Robert LeBlanc
Is this only a problem with EC base tiers or would replicated base tiers see this too? - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Feb 11, 2016 at 6:09 PM, Sage Weil wrote: > On

Re: [ceph-users] K is for Kraken

2016-02-08 Thread Robert LeBlanc
Too bad K isn't an LTS. It would be fun to release the Kraken many times. I like liliput - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Mon, Feb 8, 2016 at 11:36 AM, Sage Weil wrote: >

Re: [ceph-users] Unified queue in Infernalis

2016-02-05 Thread Robert LeBlanc
I believe this is referring to combining the previously separate queues into a single queue (PrioritizedQueue and soon to be WeightedPriorityQueue) in ceph. That way client IO and recovery IO can be better prioritized in the Ceph code. This is all before the disk queue. Robert LeBlanc Sent from

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-04 Thread Robert LeBlanc
On Thu, Feb 4, 2016 at 8:32 PM, Christian Balzer wrote: > On Wed, 3 Feb 2016 22:42:32 -0700 Robert LeBlanc wrote: > I just finished downgrading my test cluster from testing to Jessie and > then upgrading Ceph from Firefly to Hammer (tha

Re: [ceph-users] Upgrading with mon & osd on same host

2016-02-04 Thread Robert LeBlanc
Just make sure that your monitors and OSDs are on the very latest of Hammer or else your Infernalis OSDs won't activate. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Thu, Feb 4, 2016 at 12

Re: [ceph-users] Set cache tier pool forward state automatically!

2016-02-03 Thread Robert LeBlanc
Robert LeBlanc PGP Fingerprint 79A2

Re: [ceph-users] Optimal OSD count for SSDs / NVMe disks

2016-02-03 Thread Robert LeBlanc
in the cache, there is still a performance penalty for having the caching tier vs. a native SSD pool. So if you are not using the tiering, move to a straight SSD pool. - Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Wed, Feb 3, 2016 at 5:01 AM
