[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Bradley Kite
Hi Igor, This has been very helpful. I have identified (when numjobs=1, the least-worst case) that there are approximately just as many bluestore_write_small_pre_read per second as there are sequential-write IOPS per second: Tue 4 Feb 22:44:34 GMT 2020 "bluestore_write_small_pre_read":
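A minimal sketch of how such per-second counter deltas can be sampled (osd.0 and the jq path are assumptions, not taken from the thread):

  # poll the counter once a second; successive differences give the per-second rate
  while true; do
    ceph daemon osd.0 perf dump | jq '.bluestore.bluestore_write_small_pre_read'
    sleep 1
  done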

[ceph-users] Migrate journal to Nvme from old SSD journal drive?

2020-02-04 Thread Alex L
Hi, I finally got my Samsung PM983 [1] to use as a journal for about 6 drives, plus drive cache, replacing a consumer SSD (Kingston SV300). But I can't for the life of me figure out how to move an existing journal to this NVMe on my Nautilus cluster. # Created a new big partition on the NVMe
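For a FileStore journal, the usual sequence is roughly the following sketch (OSD id and partition are placeholders; for a BlueStore block.db the equivalent tool would be ceph-bluestore-tool with bluefs-bdev-new-db / bluefs-bdev-migrate):

  systemctl stop ceph-osd@12
  ceph-osd -i 12 --flush-journal          # write out the old journal
  # repoint the journal symlink (or journal_uuid) at the new NVMe partition, then:
  ceph-osd -i 12 --mkjournal
  systemctl start ceph-osd@12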

[ceph-users] Re: osd_memory_target ignored

2020-02-04 Thread Stefan Kooman
Quoting Frank Schilder (fr...@dtu.dk): > Dear Stefan, > > I checked the total allocation with top. ps -aux gives: > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > ceph 784155 15.8 3.1 6014276 4215008 ? Sl Jan31 932:13 > /usr/bin/ceph-osd --cluster ceph

[ceph-users] Re: All pgs peering indefinitely

2020-02-04 Thread Rodrigo Severo - Fábrica
On Tue, 4 Feb 2020 at 15:19, wrote: > > Rodrigo; > > Best bet would be to check logs. Check the OSD logs on the affected server. > Check cluster logs on the MONs. Check OSD logs on other servers. > > Your Ceph version(s) and your OS distribution and version would also be > useful

[ceph-users] Re: All pgs peering indefinitely

2020-02-04 Thread DHilsbos
Rodrigo; Best bet would be to check logs. Check the OSD logs on the affected server. Check cluster logs on the MONs. Check OSD logs on other servers. Your Ceph version(s) and your OS distribution and version would also be useful to help you troubleshoot this OSD flapping issue. Thank you,
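A minimal sketch of where to look (paths, unit names and the OSD id assume a default package install and are not from the thread):

  journalctl -u ceph-osd@12 --since "1 hour ago"   # on the affected host
  less /var/log/ceph/ceph-osd.12.log
  less /var/log/ceph/ceph.log                      # cluster log, on a MON host
  ceph log last 100                                # recent cluster log via the CLI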

[ceph-users] Re: All pgs peering indefinitely

2020-02-04 Thread Wesley Dillingham
I would guess that you have something preventing osd to osd communication on ports 6800-7300 or osd to mon communication on port 6789 and/or 3300. Respectfully, *Wes Dillingham* w...@wesdillingham.com LinkedIn On Tue, Feb 4, 2020 at 12:44 PM
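A quick way to test those paths from one host to another (the host name is a placeholder):

  nc -zv other-osd-host 6789    # MON msgr v1
  nc -zv other-osd-host 3300    # MON msgr v2
  nc -zv other-osd-host 6800    # first port of the OSD 6800-7300 range
  ss -tlnp | grep ceph          # confirm the daemons are actually listening locally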

[ceph-users] Re: All pgs peering indefinitely

2020-02-04 Thread Rodrigo Severo - Fábrica
On Tue, 4 Feb 2020 at 12:39, Rodrigo Severo - Fábrica wrote: > > Hi, > > > I have a rather small cephfs cluster with 3 machines right now: all of > them sharing MDS, MON, MGS and OSD roles. > > I had to move all machines to a new physical location and, > unfortunately, I had to move

[ceph-users] Re: All pgs peering indefinitely

2020-02-04 Thread Rodrigo Severo - Fábrica
On Tue, 4 Feb 2020 at 13:11, wrote: > > Rodrigo; > > Are all your hosts using the same IP addresses as before the move? Is the > new network structured the same? Yes for both questions. Rodrigo

[ceph-users] Bucket rename with

2020-02-04 Thread EDH - Manuel Rios
Hi, a customer asked us about what seemed a simple request: they want to rename a bucket. Checking the Nautilus documentation it looks like this is not possible at the moment, but I checked the master documentation and a CLI command should apparently accomplish it. $ radosgw-admin bucket link --bucket=foo --bucket-new-name=bar
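For reference, the master documentation suggests a sequence along these lines (the uid is a placeholder, and availability depends on the RGW version actually running):

  radosgw-admin bucket link --bucket=foo --bucket-new-name=bar --uid=johndoe
  radosgw-admin bucket stats --bucket=bar    # verify the bucket answers to its new name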

[ceph-users] Cephalocon Seoul is canceled

2020-02-04 Thread Sage Weil
Hi everyone, We are sorry to announce that, due to the recent coronavirus outbreak, we are canceling Cephalocon for March 3-5 in Seoul. More details will follow about how to best handle cancellation of hotel reservations and so forth. Registrations will of course be refunded--expect an email

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread vitalif
SSD (block.db) partition contains object metadata in RocksDB so it probably loads the metadata before modifying objects (if it's not in cache yet). Also it sometimes performs compaction which also results in disk reads and writes. There are other things going on that I'm not completely aware

[ceph-users] Re: Bluestore cache parameter precedence

2020-02-04 Thread Igor Fedotov
Hi Boris, general settings (unless they are set to zero) override disk-specific settings. I.e. bluestore_cache_size overrides both bluestore_cache_size_hdd and bluestore_cache_size_ssd. Here is the code snippet, in case you know C++: if (cct->_conf->bluestore_cache_size) { cache_size
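A quick way to see the effective values on a live OSD (osd.0 is an example id, not from the thread):

  # a non-zero bluestore_cache_size wins; zero falls back to the hdd/ssd-specific value
  ceph daemon osd.0 config get bluestore_cache_size
  ceph daemon osd.0 config get bluestore_cache_size_hdd
  ceph daemon osd.0 config get bluestore_cache_size_ssd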

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Igor Fedotov
Hi Bradley, you might want to check performance counters for this specific OSD. Available via 'ceph daemon osd.0 perf dump'  command in Nautilus. A bit different command for Luminous AFAIR. Then look for 'read' substring in the dump and try to find unexpectedly high read-related counter
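For example (osd.0 is a placeholder; resetting the counters first is optional but makes a single test run easier to read):

  ceph daemon osd.0 perf reset all        # zero the counters before the test
  # run the fio test, then:
  ceph daemon osd.0 perf dump | grep -i read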

[ceph-users] Re: recovery_unfound

2020-02-04 Thread Chad William Seys
Hi Jake and all, We're having what looks to be the exact same problem. In our case it happened when I was "draining" an OSD for removal. (ceph crush remove...) Adding the OSD back doesn't help work around the bug. Everything is either triply replicated or EC k3m2, either of which should

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Bradley Kite
Hi Vitaliy Yes - I tried this and I can still see a number of reads (~110 iops, 440KB/sec) on the SSD, so it is significantly better, but the result is still puzzling - I'm trying to understand what is causing the reads. The problem is amplified with numjobs >= 2 but it looks like it is still

[ceph-users] More OMAP Issues

2020-02-04 Thread DHilsbos
All; We're back to having large OMAP object warnings regarding our RGW index pool. This cluster is now in production, so I can't simply dump the buckets / pools and hope everything works out. I did some additional research on this issue, and it looks like I need to (re)shard the bucket
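A sketch of the manual reshard path (the bucket name and shard count are examples; dynamic resharding may already cover this on recent releases):

  radosgw-admin bucket limit check               # per-bucket object counts and shard fill status
  radosgw-admin reshard add --bucket=mybucket --num-shards=128
  radosgw-admin reshard process
  radosgw-admin reshard status --bucket=mybucket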

[ceph-users] Re: All pgs peering indefinitely

2020-02-04 Thread DHilsbos
Rodrigo; Are all your hosts using the same IP addresses as before the move? Is the new network structured the same? Thank you, Dominic L. Hilsbos, MBA Director - Information Technology Perform Air International Inc. dhils...@performair.com www.PerformAir.com -Original Message-

[ceph-users] Re: osd_memory_target ignored

2020-02-04 Thread Frank Schilder
Dear Stefan, I checked the total allocation with top. ps -aux gives: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND ceph 784155 15.8 3.1 6014276 4215008 ? Sl Jan31 932:13 /usr/bin/ceph-osd --cluster ceph -f -i 243 ... ceph 784732 16.6 3.0 6058736

[ceph-users] All pgs peering indefinitely

2020-02-04 Thread Rodrigo Severo - Fábrica
Hi, I have a rather small cephfs cluster with 3 machines right now: all of them sharing MDS, MON, MGS and OSD roles. I had to move all machines to a new physical location and, unfortunately, I had to move all of them at the same time. They are already on again but ceph won't be accessible as

[ceph-users] Re: osd_memory_target ignored

2020-02-04 Thread Stefan Kooman
Hi, Quoting Frank Schilder (fr...@dtu.dk): > I recently upgraded from 13.2.2 to 13.2.8 and observe two changes that > I struggle with: > > - from release notes: The bluestore_cache_* options are no longer > needed. They are replaced by osd_memory_target, defaulting to 4GB. - > the default for

[ceph-users] Re: Write i/o in CephFS metadata pool

2020-02-04 Thread Samy Ascha
> On 2 Feb 2020, at 12:45, Patrick Donnelly wrote: > > On Wed, Jan 29, 2020 at 1:25 AM Samy Ascha wrote: >> >> Hi! >> >> I've been running CephFS for a while now and ever since setting it up, I've >> seen unexpectedly large write i/o on the CephFS metadata pool. >> >> The filesystem is
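One way to confirm where the writes land and which daemon issues them (pool and MDS names below are guesses, not from the thread):

  ceph osd pool stats cephfs_metadata            # per-pool client i/o rates
  ceph daemon mds.a perf dump | jq '.objecter'   # MDS-side ops heading to RADOS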

[ceph-users] Re: Doubt about AVAIL space on df

2020-02-04 Thread Wido den Hollander
On 2/4/20 2:00 PM, German Anders wrote: > Hello Everyone, > > I would like to understand if this output is right: > > *# ceph df* > GLOBAL: > SIZE AVAIL RAW USED %RAW USED > 85.1TiB 43.7TiB 41.4TiB 48.68 > POOLS: > NAME ID USED
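As a rough back-of-the-envelope check, assuming 3x replication (a guess, not stated in the thread):

  43.7 TiB raw AVAIL / 3 replicas ≈ 14.6 TiB usable ceiling
  # MAX AVAIL is further reduced by the fullest OSD and the full ratio, so an
  # uneven %USE spread (here roughly 44-65%) pushes it below that ceiling.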

[ceph-users] Re: Doubt about AVAIL space on df

2020-02-04 Thread German Anders
Manuel, find the output of the ceph osd df tree command below: # ceph osd df tree ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS TYPE NAME -7 84.00099 - 85.1TiB 41.6TiB 43.6TiB 48.82 1.00 - root root -5 12.0 - 13.1TiB 5.81TiB 7.29TiB 44.38 0.91 -

[ceph-users] Re: Doubt about AVAIL space on df

2020-02-04 Thread EDH - Manuel Rios
With “ceph osd df tree” it will be clearer, but right now I can see that some OSDs have a %USE between 44% and 65%. “ceph osd df tree” also gives the balance at host level. Do you have the balancer enabled? A distribution that is not “perfect” means you can't use the full space. In our case we gain space manually
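For reference, the balancer is driven with commands like these (upmap mode assumes all clients are Luminous or newer):

  ceph balancer status
  ceph balancer mode upmap
  ceph balancer on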

[ceph-users] osd_memory_target ignored

2020-02-04 Thread Frank Schilder
I recently upgraded from 13.2.2 to 13.2.8 and observe two changes that I struggle with: - from release notes: The bluestore_cache_* options are no longer needed. They are replaced by osd_memory_target, defaulting to 4GB. - the default for bluestore_allocator has changed from stupid to bitmap,
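A minimal sketch of how to check what one of those OSDs actually runs with (osd.243 is taken from the ps output quoted elsewhere in the thread; the jq path is an assumption):

  ceph daemon osd.243 config show | grep -E 'osd_memory_target|bluestore_cache_size|bluestore_allocator'
  ceph daemon osd.243 dump_mempools | jq '.mempool.by_pool'   # where the memory actually goes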

[ceph-users] Re: Doubt about AVAIL space on df

2020-02-04 Thread German Anders
Hi Manuel, Sure thing: # ceph osd df ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS 0 nvme 1.0 1.0 1.09TiB 496GiB 622GiB 44.35 0.91 143 1 nvme 1.0 1.0 1.09TiB 488GiB 630GiB 43.63 0.89 141 2 nvme 1.0 1.0 1.09TiB 537GiB 581GiB 48.05 0.99 155

[ceph-users] Re: Doubt about AVAIL space on df

2020-02-04 Thread EDH - Manuel Rios
Hi German, Can you post ceph osd df tree? It looks like your usage distribution is not even, and that's why you get less usable space than the raw total. Regards -----Original Message----- From: German Anders Sent: Tuesday, 4 February 2020 14:00 To: ceph-us...@ceph.com Subject: [ceph-users]

[ceph-users] Doubt about AVAIL space on df

2020-02-04 Thread German Anders
Hello Everyone, I would like to understand if this output is right: *# ceph df* GLOBAL: SIZE AVAIL RAW USED %RAW USED 85.1TiB 43.7TiB 41.4TiB 48.68 POOLS: NAME ID USED %USED MAX AVAIL OBJECTS volumes 13 13.8TiB

[ceph-users] OSDs crashing

2020-02-04 Thread Raymond Clotfelter
I have 30 or so OSDs, on a cluster with 240, that just keep crashing. Below is the last part of one of the log files showing the crash; can anyone please help me read it to figure out what is going on and how to correct it? When I start the OSDs they generally seem to work for 5-30 minutes, and
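If the cluster is on Nautilus or newer (an assumption), the crash module gives a more digestible view than the raw log; the OSD id below is a placeholder:

  ceph crash ls
  ceph crash info <crash-id>
  ceph tell osd.17 injectargs '--debug_osd 10'   # temporarily raise logging around the crash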

[ceph-users] Re: recovery_unfound

2020-02-04 Thread Jake Grimmett
Hi Paul, Many thanks for your helpful suggestions. Yes, we have 13 pgs with "might_have_unfound" entries. (also 1 pg without "might_have_unfound" stuck in active+recovery_unfound+degraded+repair state) Taking one pg with unfound objects: [root@ceph1 ~]# ceph health detail | grep 5.5c9
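The usual commands for digging into a PG in that state look roughly like this (pg 5.5c9 is taken from the thread; the last command is destructive and only a last resort, hence commented out):

  ceph pg 5.5c9 query | grep -A5 might_have_unfound   # which OSDs it still wants to probe
  ceph pg 5.5c9 list_unfound                          # the objects themselves
  # ceph pg 5.5c9 mark_unfound_lost revert            # or 'delete'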

[ceph-users] Bluestore cache parameter precedence

2020-02-04 Thread Boris Epstein
Hello list, As stated in this document: https://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/ there are multiple parameters defining cache limits for BlueStore. You have bluestore_cache_size (presumably controlling the cache size), bluestore_cache_size_hdd (presumably

[ceph-users] Re: Understanding Bluestore performance characteristics

2020-02-04 Thread Vitaliy Filippov
Hi, Try to repeat your test with numjobs=1, I've already seen strange behaviour with parallel jobs to one RBD image. Also as usual: https://yourcmc.ru/wiki/Ceph_performance :-) Hi, We have a production cluster of 27 OSD's across 5 servers (all SSD's running bluestore), and have started to
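An example of such a single-job run against an RBD image (pool, image and client names are placeholders, not from the thread):

  fio --name=seqwrite --ioengine=rbd --clientname=admin --pool=rbd --rbdname=testimg \
      --rw=write --bs=4k --iodepth=32 --numjobs=1 --runtime=60 --time_based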