Re: [ceph-users] fio test rbd - single thread - qd1

2019-03-19 Thread Piotr Dałek
48.16 50.574872 24.01572 Same here - should be cached in the BlueStore cache, as it is 16 GB x 84 OSDs, with a 1 GB test file. Any thoughts - suggestions - insights? Jesper
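For reference, a single-threaded QD1 random-read run of the kind discussed above might look like the sketch below (assuming the 1 GB test file lives on a mounted RBD at /mnt/rbd/testfile - path and runtime are only examples):

    # 4k random reads, queue depth 1, single job, against a 1 GB test file
    fio --name=qd1-randread --filename=/mnt/rbd/testfile --size=1g \
        --rw=randread --bs=4k --iodepth=1 --numjobs=1 --direct=1 \
        --runtime=60 --time_based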

Re: [ceph-users] ceph block - volume with RAID#0

2019-01-31 Thread Piotr Dałek
Exclusive lock on RBD images will kill any (theoretical) performance gains. Without exclusive lock, you lose some RBD features. Plus, using 2+ clients with a single image doesn't sound like a good idea.
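If you do decide to drop the exclusive-lock feature (and the features that depend on it) on a shared image, the commands look roughly like this - pool and image names are only examples:

    # show which features are currently enabled
    rbd info rbd/shared-image
    # object-map and fast-diff depend on exclusive-lock, so disable them together
    rbd feature disable rbd/shared-image object-map fast-diff exclusive-lock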

Re: [ceph-users] Fwd: what are the potential risks of mixed cluster and client ms_type

2018-11-18 Thread Piotr Dałek
or network hardware issues.

Re: [ceph-users] Fwd: what are the potential risks of mixed cluster and client ms_type

2018-11-18 Thread Piotr Dałek
messengers use the same protocol.

[ceph-users] RBD image "lightweight snapshots"

2018-08-09 Thread Piotr Dałek
snapshots. Removal of these "lightweight" snapshots would be instant (or near instant). So what do others think about this?

Re: [ceph-users] Safe to use rados -p rbd cleanup?

2018-07-16 Thread Piotr Dałek
an object and remove only objects indexed by this metadata. "--prefix" is used when this metadata is lost or overwritten.
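A rough sketch of both cleanup modes ("rados bench" writes its objects with a "benchmark_data" prefix; the pool name here is only an example):

    # normal cleanup: uses the metadata object left behind by "rados bench"
    rados -p rbd cleanup
    # fallback when that metadata is lost or overwritten: remove by prefix
    rados -p rbd ls | grep '^benchmark_data' | head
    rados -p rbd cleanup --prefix benchmark_data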

Re: [ceph-users] Safe to use rados -p rbd cleanup?

2018-07-16 Thread Piotr Dałek
or rbd images are named differently.

Re: [ceph-users] SSDs for data drives

2018-07-11 Thread Piotr Dałek
was running and how heavy it was.

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-07 Thread Piotr Dałek
scrubbing is fine). >Shutdown all activity to the ceph cluster before that moment? Depends on whether it's actually possible in your case and what load your users generate - you have to decide.

Re: [ceph-users] Prioritize recovery over backfilling

2018-06-07 Thread Piotr Dałek
If there are other pgs to backfill and/or recover.

Re: [ceph-users] Reduced productivity because of slow requests

2018-06-06 Thread Piotr Dałek
to find something out in the OSD logs, but there is nothing about it. Any thoughts on how to avoid it? Have you tried disabling scrub and deep scrub?
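Disabling scrubs cluster-wide is a quick, reversible way to check whether they are behind the slow requests:

    # temporarily stop (deep-)scrubbing
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # ...watch whether the slow request warnings go away...
    ceph osd unset noscrub
    ceph osd unset nodeep-scrub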

Re: [ceph-users] a big cluster or several small

2018-05-15 Thread Piotr Dałek
For us this already proved useful in the past.

Re: [ceph-users] Integrating XEN Server : Long query time for "rbd ls -l" queries

2018-04-25 Thread Piotr Dałek
in the "rbd" utility. So what can I do to make "rbd ls -l" faster, or to get comparable information regarding the snapshot hierarchy? Can you run this command with the extra argument "--rbd_concurrent_management_ops=1" and share the timing of that?
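A sketch of that comparison (pool name is only an example):

    # default management-op concurrency
    time rbd ls -l mypool
    # single management op at a time, as asked above
    time rbd ls -l --rbd_concurrent_management_ops=1 mypool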

Re: [ceph-users] High apply latency

2018-02-02 Thread Piotr Dałek
may want to try the above as well.

Re: [ceph-users] formatting bytes and object counts in ceph status output

2018-01-02 Thread Piotr Dałek
expect that non-size counters - like object counts - use base-10, and size counters use base-2 units. Ceph's "standard" of using base-2 everywhere was confusing for me as well initially, but I got used to that... Still, I wouldn't mind if that got sorted out once and for all.

Re: [ceph-users] Snap trim queue length issues

2017-12-18 Thread Piotr Dałek
On 17-12-15 03:58 PM, Sage Weil wrote: On Fri, 15 Dec 2017, Piotr Dałek wrote: On 17-12-14 05:31 PM, David Turner wrote: I've tracked this in a much more manual way.  I would grab a random subset [..] This was all on a Hammer cluster.  The changes to the snap trimming queues going

Re: [ceph-users] Snap trim queue length issues

2017-12-15 Thread Piotr Dałek
once disk space is all used up. Hopefully it'll be convincing enough for devs. ;)

[ceph-users] Snap trim queue length issues

2017-12-14 Thread Piotr Dałek
be helpful in pushing this into the next Jewel release. Thanks! [1] One of our guys hacked a bash one-liner that printed out snap trim queue lengths for all pgs, but a full run takes over an hour to complete on a cluster with over 20k pgs... [2] https://github.com/ceph/ceph/pull/19520

Re: [ceph-users] ceph.conf tuning ... please comment

2017-12-06 Thread Piotr Dałek
t 3 lowest, or if that's not acceptable then at least set "osd heartbeat min size" to 0.
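For reference, that setting can be made persistent in ceph.conf and/or injected at runtime - a sketch, assuming you want it on all OSDs:

    # ceph.conf, [osd] section:
    #   osd heartbeat min size = 0
    # runtime injection on all OSDs:
    ceph tell osd.* injectargs '--osd_heartbeat_min_size 0'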

Re: [ceph-users] Single disk per OSD ?

2017-12-01 Thread Piotr Dałek
area once such an OSD fails).

Re: [ceph-users] ceph-disk is now deprecated

2017-11-28 Thread Piotr Dałek
a *big* problem with this (we haven't upgraded to Luminous yet, so we can skip to the next point release and move to ceph-volume together with Luminous). It's still a problem, though - now we have more of our infrastructure to migrate and test, meaning even more delays in production upgrades.

Re: [ceph-users] Restart is required?

2017-11-16 Thread Piotr Dałek
files in a subdir before merging into parent. NOTE: A negative value means to disable subdir merging" - will a variable definition like "filestore_merge_treshold = -50" (negative value) work? (In Jewel it worked like a charm.) Yes, I don't see any changes to that.

Re: [ceph-users] Restart is required?

2017-11-16 Thread Piotr Dałek
tiple" is not observed for runtime changes, meaning that the new value will be stored in osd.0 process memory but not used at all. Do I really need to restart the OSD to make the changes take effect? (ceph version 12.2.1 luminous (stable)) Yes.
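In practice that means: put the value in ceph.conf, restart the OSD, then verify what the daemon actually holds in memory - a sketch for a single systemd-managed OSD (id 0 is only an example):

    # ceph.conf, [osd] section:
    #   filestore merge threshold = -50
    systemctl restart ceph-osd@0
    # confirm the in-memory value via the admin socket (run on the OSD's host)
    ceph daemon osd.0 config get filestore_merge_threshold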

Re: [ceph-users] Libvirt hosts freeze after ceph osd+mon problem

2017-11-07 Thread Piotr Dałek
want to *stop* (as in, freeze) a process instead of killing it? Anyway, with the processes still there, it may take a few minutes before the cluster realizes that the daemons are stopped and kicks them out of the cluster, restoring normal behavior (assuming correctly set CRUSH rules).

Re: [ceph-users] rbd rm snap on image with exclusive lock

2017-10-25 Thread Piotr Dałek
g properly refreshed. I'd love to, but that would require us to restart that client - not an option. We'll try to reproduce this somehow anyway and let you know if something interesting shows up.

Re: [ceph-users] rbd rm snap on image with exclusive lock

2017-10-25 Thread Piotr Dałek
That makes things clear. Seems like we have some Cinder instances utilizing Infernalis (9.2.1) librbd. Are you aware of any bugs in 9.2.x that could cause such behavior? We've seen that for the first time...

[ceph-users] rbd rm snap on image with exclusive lock

2017-10-25 Thread Piotr Dałek
ots but not remove them when an exclusive lock on the image is taken? (Jewel bug?) 2. Why is the error transformed and then ignored?

Re: [ceph-users] A new SSD for journals - everything sucks?

2017-10-11 Thread Piotr Dałek
You may want to look at this: https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
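The linked post boils down to a synchronous 4k write test at queue depth 1 against the raw device - roughly along these lines (destructive to any data on the device; the device name is only a placeholder):

    # synchronous 4k sequential writes, QD1 - approximates journal write behavior
    fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=1 --iodepth=1 --runtime=60 --time_based --name=journal-test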

Re: [ceph-users] why sudden (and brief) HEALTH_ERR

2017-10-04 Thread Piotr Dałek
:-) Since Jewel (AFAIR), when (re)starting OSDs, pg status is reset to "never contacted", resulting in "pgs are stuck inactive for more than 300 seconds" being reported until OSDs regain connections between themselves.

Re: [ceph-users] Note about rbd_aio_write usage

2017-07-07 Thread Piotr Dałek
On 17-07-06 09:39 PM, Jason Dillaman wrote: On Thu, Jul 6, 2017 at 3:25 PM, Piotr Dałek <bra...@predictor.org.pl> wrote: Is that deep copy an equivalent of what Jewel librbd did at an unspecified point in time, or an extra one? It's equivalent / replacement -- not an additional

Re: [ceph-users] Note about rbd_aio_write usage

2017-07-06 Thread Piotr Dałek
On 17-07-06 04:40 PM, Jason Dillaman wrote: On Thu, Jul 6, 2017 at 10:22 AM, Piotr Dałek <piotr.da...@corp.ovh.com> wrote: So I really see two problems here: lack of API docs and backwards-incompatible change in API behavior. Docs are always in need of update, so any pull requests

Re: [ceph-users] Note about rbd_aio_write usage

2017-07-06 Thread Piotr Dałek
memory bus? So I really see two problems here: lack of API docs and backwards-incompatible change in API behavior.

Re: [ceph-users] Note about rbd_aio_write usage

2017-07-06 Thread Piotr Dałek
On 17-07-06 03:03 PM, Jason Dillaman wrote: On Thu, Jul 6, 2017 at 8:26 AM, Piotr Dałek <piotr.da...@corp.ovh.com> wrote: Hi, If you're using "rbd_aio_write()" in your code, be aware of the fact that before the Luminous release, this function expects the buffer to remain unchanged until

[ceph-users] Note about rbd_aio_write usage

2017-07-06 Thread Piotr Dałek
ary memory allocation and copy on your side (though it's probably unavoidable with the current state of Luminous).

Re: [ceph-users] Sparse file info in filestore not propagated to other OSDs

2017-06-21 Thread Piotr Dałek
On 17-06-21 03:24 PM, Sage Weil wrote: On Wed, 21 Jun 2017, Piotr Dałek wrote: On 17-06-14 03:44 PM, Sage Weil wrote: On Wed, 14 Jun 2017, Paweł Sadowski wrote: On 04/13/2017 04:23 PM, Piotr Dałek wrote: On 04/06/2017 03:25 PM, Sage Weil wrote: On Thu, 6 Apr 2017, Piotr Dałek wrote: [snip

Re: [ceph-users] Prioritise recovery on specific PGs/OSDs?

2017-06-21 Thread Piotr Dałek
if that would work for you (as others wrote), or +1 this PR: https://github.com/ceph/ceph/pull/13723 (it's a bit outdated as I'm constantly low on time, but I promise to push it forward!).
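On releases where per-PG prioritisation is available (Luminous and later), the commands look along these lines - the PG id is only an example:

    # bump recovery/backfill priority for a specific PG
    ceph pg force-recovery 1.2f
    ceph pg force-backfill 1.2f
    # and to undo it
    ceph pg cancel-force-recovery 1.2f
    ceph pg cancel-force-backfill 1.2f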

Re: [ceph-users] Sparse file info in filestore not propagated to other OSDs

2017-06-21 Thread Piotr Dałek
On 17-06-14 03:44 PM, Sage Weil wrote: On Wed, 14 Jun 2017, Paweł Sadowski wrote: On 04/13/2017 04:23 PM, Piotr Dałek wrote: On 04/06/2017 03:25 PM, Sage Weil wrote: On Thu, 6 Apr 2017, Piotr Dałek wrote: [snip] I think the solution here is to use sparse_read during recovery. The PushOp

Re: [ceph-users] Socket errors, CRC, lossy con messages

2017-04-11 Thread Piotr Dałek
by the sending side. Try gathering some more examples of such CRC errors and isolate the OSD/host that sends malformed data, then do the usual diagnostics, like a memory test, on that machine.

Re: [ceph-users] slow perfomance: sanity check

2017-04-06 Thread Piotr Dałek
drives, because Ceph is not optimized for those.

Re: [ceph-users] Recompiling source code - to find exact RPM

2017-03-24 Thread Piotr Dałek
Restart of the Ceph daemons is still required.

Re: [ceph-users] Recompiling source code - to find exact RPM

2017-03-23 Thread Piotr Dałek
to figure out. Yes, I understand that. But wouldn't it be faster and/or more convenient if you just recompiled the binaries in place (or used network symlinks) instead of packaging all of Ceph and (re)installing its packages each time you make a change? Generating RPMs takes a while.
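A rough sketch of that workflow against a cmake-based build tree (target, paths and host name are only examples):

    # rebuild just the daemon you touched, then drop it onto the test node
    cd ceph/build
    cmake --build . --target ceph-osd -- -j"$(nproc)"
    scp bin/ceph-osd root@test-osd-node:/usr/bin/ceph-osd
    ssh root@test-osd-node systemctl restart ceph-osd.target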

Re: [ceph-users] Recompiling source code - to find exact RPM

2017-03-23 Thread Piotr Dałek
them via NFS (or whatever) to the build machine and build once there.

Re: [ceph-users] Upgrading 2K OSDs from Hammer to Jewel. Our experience

2017-03-13 Thread Piotr Dałek
uldn't be a problem (at least we don't see it anymore).

Re: [ceph-users] Issue with upgrade from 0.94.9 to 10.2.5

2017-01-26 Thread Piotr Dałek
in ceph -w. I haven't dug into it much, but just wanted to second that I've seen this happen on a recent Hammer to recent Jewel upgrade. Thanks for the confirmation. We've prepared a patch which fixes the issue for us: https://github.com/ceph/ceph/pull/13131

Re: [ceph-users] Issue with upgrade from 0.94.9 to 10.2.5

2017-01-18 Thread Piotr Dałek
On 01/17/2017 12:52 PM, Piotr Dałek wrote: During our testing we found out that during the upgrade from 0.94.9 to 10.2.5 we're hitting issue http://tracker.ceph.com/issues/17386 ("Upgrading 0.94.6 -> 0.94.9 saturating mon node networking"). Apparently, there are a few commits for both ham

[ceph-users] Issue with upgrade from 0.94.9 to 10.2.5

2017-01-17 Thread Piotr Dałek
supposed to fix this issue for upgrades from 0.94.6 to 0.94.9 (and possibly for others), but we're still seeing this when upgrading to Jewel, and the symptoms are exactly the same - after upgrading MONs, each not-yet-upgraded OSD takes a full OSDMap from the monitors after failing the CRC check. Anyone else encountered this?

[ceph-users] Any librados C API users out there?

2017-01-11 Thread Piotr Dałek
intermediate data copy, which will reduce CPU and memory load on clients. If you're using the librados C API for object writes, feel free to comment here or in the pull request.