Re: [ceph-users] OSD crash after change of osd_memory_target

2020-01-22 Thread Igor Fedotov
Hi Martin, looks like a bug to me. You might want to remove all custom settings from config database and try to set osd-memory-target only. Would it help? Thanks, Igor On 1/22/2020 3:43 PM, Martin Mlynář wrote: Dne 21. 01. 20 v 21:12 Stefan Kooman napsal(a): Quoting Martin Mlynář

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-20 Thread Igor Fedotov
reets, Stefan Am 19.01.2020 um 11:53 schrieb Igor Fedotov : So the intermediate summary is: Any OSD in the cluster can experience interim RocksDB checksum failure. Which isn't present after OSD restart. No HW issues observed, no persistent artifacts (except OSD log) afterwards. And looks l

Re: [ceph-users] OSD up takes 15 minutes after machine restarts

2020-01-20 Thread Igor Fedotov
? huxia...@horebdata.cn *From:* Igor Fedotov <mailto:ifedo...@suse.de> *Date:* 2020-01-19 11:41 *To:* huxia...@horebdata.cn <mailto:huxia...@horebdata.cn>; ceph-users <mailto:ceph-users@lists.ceph.com> *Subject:* Re: [ceph-u

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-19 Thread Igor Fedotov
and/or prior restarts, etc) sometimes this might provide some hints. Thanks, Igor On 1/17/2020 2:30 PM, Stefan Priebe - Profihost AG wrote: HI Igor, Am 17.01.20 um 12:10 schrieb Igor Fedotov: hmmm.. Just in case - suggest to check H/W errors with dmesg. this happens on around 80 nodes - i don't

Re: [ceph-users] OSD up takes 15 minutes after machine restarts

2020-01-19 Thread Igor Fedotov
Hi Samuel, wondering if you have bluestore_fsck_on_mount option set to true? Can you see high read load over OSD device(s) during the startup? If so it might be fsck running which takes that long. Thanks, Igor On 1/19/2020 11:53 AM, huxia...@horebdata.cn wrote: Dear folks, I had a
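For reference, checking whether that option is in effect on a live OSD might look like this (the OSD id and device name below are placeholders):

  # query the running daemon via its admin socket
  ceph daemon osd.12 config get bluestore_fsck_on_mount
  # watch for a sustained read burst on the OSD device during startup
  iostat -x 1 sdc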

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-17 Thread Igor Fedotov
ot limited to failure logs, perf counter dumps, system resource reports etc) for future analysis. On 1/16/2020 11:58 PM, Stefan Priebe - Profihost AG wrote: Hi Igor, answers inline. Am 16.01.20 um 21:34 schrieb Igor Fedotov: you may want to run fsck against failing OSDs. Hopefully it will shed

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-16 Thread Igor Fedotov
https://github.com/ceph/ceph/pull/28644 Greets, Stefan Am 16.01.20 um 17:00 schrieb Igor Fedotov: Hi Stefan, would you please share log snippet prior the assertions? Looks like RocksDB is failing during transaction submission... Thanks, Igor On 1/16/2020 11:56 AM, Stefan Priebe - Profihost AG

Re: [ceph-users] Luminous Bluestore OSDs crashing with ASSERT

2020-01-16 Thread Igor Fedotov
Hi Stefan, would you please share a log snippet prior to the assertions? Looks like RocksDB is failing during transaction submission... Thanks, Igor On 1/16/2020 11:56 AM, Stefan Priebe - Profihost AG wrote: Hello, does anybody know a fix for this ASSERT / crash? 2020-01-16 02:02:31.316394

Re: [ceph-users] Impact of a small DB size with Bluestore

2019-12-02 Thread Igor Fedotov
Hi Lars, I've also seen interim space usage bursts during my experiments. Up to 2x the max level size when the topmost RocksDB level is L3 (i.e. 25GB max). So I think 2x (which results in 60-64 GB for DB) is a good choice when your DB is expected to be small or medium sized. Not sure this
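A rough sketch of where the 60-64 GB figure comes from, assuming the commonly cited RocksDB defaults (256 MB base level, 10x level multiplier):

  L1 ~ 0.25 GB, L2 ~ 2.5 GB, L3 ~ 25 GB
  L1 + L2 + L3 ~ 28 GB of steady-state DB data
  x2 headroom for compaction bursts ~ 56-64 GB of DB device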

Re: [ceph-users] ceph-objectstore-tool crash when trying to recover pg from OSD

2019-11-07 Thread Igor Fedotov
Hi Eugene, this looks like https://tracker.ceph.com/issues/42223 indeed. Would you please find the first crash for these OSDs and share corresponding logs in the ticket. Unfortunately I don't know reliable enough ways to recover OSD after such a failure. If they exist at all... :( I've

Re: [ceph-users] Ceph Negative Objects Number

2019-10-14 Thread Igor Fedotov
Hi Lazuardi, never seen that. Just wondering what Ceph version are you running? Thanks, Igor On 10/8/2019 3:52 PM, Lazuardi Nasution wrote: Hi, I get following weird negative objects number on tiering. Why is this happening? How to get back to normal? Best regards, [root@management-a

Re: [ceph-users] ceph version 14.2.3-OSD fails

2019-10-11 Thread Igor Fedotov
eta  to this email can something dose this logs can shed some light ? On Thu, Sep 12, 2019 at 7:20 PM Igor Fedotov mailto:ifedo...@suse.de>> wrote: Hi, this line:     -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1 bluestore(/var/lib/ceph/osd

Re: [ceph-users] 3,30,300 GB constraint of block.db size on SSD

2019-09-30 Thread Igor Fedotov
Hi Massimo, On 9/29/2019 9:13 AM, Massimo Sgaravatto wrote: In my ceph cluster I am use spinning disks for bluestore OSDs and SSDs just for the  block.db. If I have got it right, right now: a) only 3,30,300GB can be used on the SSD rocksdb spillover to slow device, so you don't have any

Re: [ceph-users] Bluestore OSDs keep crashing in BlueStore.cc: 8808: FAILED assert(r == 0)

2019-09-12 Thread Igor Fedotov
to prevent rebooting all ceph nodes. Greets, Stefan Am 27.08.19 um 16:20 schrieb Igor Fedotov: It sounds like OSD is "recovering" after checksum error. I.e. just failed OSD shows no errors in fsck and is able to restart and process new write requests for long enough period (longer than jus

Re: [ceph-users] ceph version 14.2.3-OSD fails

2019-09-12 Thread Igor Fedotov
Hi, this line:     -2> 2019-09-12 16:38:15.101 7fcd02fd1f80  1 bluestore(/var/lib/ceph/osd/ceph-71) _open_alloc loaded 0 B in 0 extents tells me that the OSD is unable to load the free list manager properly, i.e. the list of free/allocated blocks is unavailable. You might want to set 'debug bluestore

Re: [ceph-users] Bluestore OSDs keep crashing in BlueStore.cc: 8808: FAILED assert(r == 0)

2019-08-27 Thread Igor Fedotov
ng as well... Igor On 8/27/2019 4:52 PM, Stefan Priebe - Profihost AG wrote: see inline Am 27.08.19 um 15:43 schrieb Igor Fedotov: see inline On 8/27/2019 4:41 PM, Stefan Priebe - Profihost AG wrote: Hi Igor, Am 27.08.19 um 14:11 schrieb Igor Fedotov: Hi Stefan, this looks like a duplicate

Re: [ceph-users] Bluestore OSDs keep crashing in BlueStore.cc: 8808: FAILED assert(r == 0)

2019-08-27 Thread Igor Fedotov
see inline On 8/27/2019 4:41 PM, Stefan Priebe - Profihost AG wrote: Hi Igor, Am 27.08.19 um 14:11 schrieb Igor Fedotov: Hi Stefan, this looks like a duplicate for https://tracker.ceph.com/issues/37282 Actually the root cause selection might be quite wide. From HW issues to broken logic

Re: [ceph-users] Bluestore OSDs keep crashing in BlueStore.cc: 8808: FAILED assert(r == 0)

2019-08-27 Thread Igor Fedotov
Hi Stefan, this looks like a duplicate for https://tracker.ceph.com/issues/37282 Actually the root cause selection might be quite wide. From HW issues to broken logic in RocksDB/BlueStore/BlueFS etc. As far as I understand you have different OSDs which are failing, right? Is the set of

Re: [ceph-users] WAL/DB size

2019-08-14 Thread Igor Fedotov
Hi Wido & Hermant. On 8/14/2019 11:36 AM, Wido den Hollander wrote: On 8/14/19 9:33 AM, Hemant Sonawane wrote: Hello guys, Thank you so much for your responses really appreciate it. But I would like to mention one more thing which I forgot in my last email is that I am going to use this

Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-07 Thread Igor Fedotov
for l_bluestore_commit_lat, latency = 87.7928s, txc = 0x55eaa7a40600 Maybe moving OMAP+META from all OSDs to a 480GB NVMe per node would help in this situation, but not sure. Manuel *From:* Igor Fedotov *Sent:* Wednesday, 7 August 2019 13:10 *To:* EDH - Manuel Rios Fernandez ; 'Ceph

Re: [ceph-users] 14.2.2 - OSD Crash

2019-08-07 Thread Igor Fedotov
Hi Manuel, as Brad pointed out timeouts and suicides are rather consequences of some other issues with OSDs. I recall at least two recent relevant tickets: https://tracker.ceph.com/issues/36482 https://tracker.ceph.com/issues/40741 (see last comments) Both had massive and slow reads from

Re: [ceph-users] Wrong ceph df result

2019-07-30 Thread Igor Fedotov
Hi Sylvain, have you upgraded to Nautilus recently? Have you added/repaired any OSDs since then? If so then you're facing a known issue caused by a mixture of legacy and new approaches to collect pool statistics. Sage shared detailed information on the issue in this mailing list under

Re: [ceph-users] Adding block.db afterwards

2019-07-26 Thread Igor Fedotov
Hi Frank, you can specify new db size in the following way: CEPH_ARGS="--bluestore-block-db-size 107374182400" ceph-bluestore-tool bluefs-bdev-new-db Thanks, Igor On 7/26/2019 2:49 PM, Frank Rothenstein wrote: Hi, I'm running a small (3 hosts) ceph cluster. ATM I want to speed up
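Spelled out with an explicit OSD path and target device (both are examples only, not taken from the thread):

  CEPH_ARGS="--bluestore-block-db-size 107374182400" \
    ceph-bluestore-tool bluefs-bdev-new-db \
      --path /var/lib/ceph/osd/ceph-0 \
      --dev-target /dev/nvme0n1p1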

Re: [ceph-users] Repair statsfs fail some osd 14.2.1 to 14.2.2

2019-07-23 Thread Igor Fedotov
Hi Manuel, this looks like either corrupted data in the BlueStore database or a memory related (some leakage?) issue. This is reproducible, right? Could you please make a ticket in the upstream tracker, rerun repair with debug bluestore set to 5/20 and upload the corresponding log. Please observe
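A sketch of such a repair run with elevated logging, assuming the OSD is stopped (the OSD path and log file are placeholders):

  CEPH_ARGS="--debug_bluestore 5/20 --log-file /var/log/ceph/osd.7-repair.log" \
    ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-7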

Re: [ceph-users] disk usage reported incorrectly

2019-07-17 Thread Igor Fedotov
contact the sender and destroy any copies of this information. ____ From: Igor Fedotov Sent: Wednesday, July 17, 2019 11:33 AM To: Paul Mezzanini; ceph-users@lists.ceph.com Subject: Re: [ceph-users] disk usage reported incorrectly Forgot to

Re: [ceph-users] disk usage reported incorrectly

2019-07-17 Thread Igor Fedotov
Forgot to provide a workaround... If that's the case then you need to repair each OSD with corresponding command in ceph-objectstore-tool... Thanks, Igor. On 7/17/2019 6:29 PM, Paul Mezzanini wrote: Sometime after our upgrade to Nautilus our disk usage statistics went off the rails

Re: [ceph-users] disk usage reported incorrectly

2019-07-17 Thread Igor Fedotov
Hi Paul, there was a post from Sage named "Pool stats issue with upgrades to nautilus" recently. Perhaps that's the case if you added a new OSD or repaired an existing one... Thanks, Igor On 7/17/2019 6:29 PM, Paul Mezzanini wrote: Sometime after our upgrade to Nautilus our disk usage statistics

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-12 Thread Igor Fedotov
o try manual rocksdb compaction using ceph-kvstore-tool.. Sent from my Huawei tablet Original Message Subject: Re: [ceph-users] 3 OSDs stopped and unable to restart From: Brett Chancellor To: Igor Fedotov CC: Ceph Users Once backfill
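The manual RocksDB compaction mentioned above is typically run against a stopped OSD, roughly like this (the OSD id is an example):

  systemctl stop ceph-osd@11
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-11 compact
  systemctl start ceph-osd@11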

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
?  I can't find any documentation on it.  Also do you think this could be related to the .rgw.meta pool having too many objects per PG? The disks that die always seem to be backfilling a pg from that pool, and they have ~550k objects per PG. -Brett On Tue, Jul 9, 2019 at 1:03 PM Igor Fedotov

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
Please try to set bluestore_bluefs_gift_ratio to 0.0002 On 7/9/2019 7:39 PM, Brett Chancellor wrote: Too large for pastebin.. The problem is continually crashing new OSDs. Here is the latest one. On Tue, Jul 9, 2019 at 11:46 AM Igor Fedotov <mailto:ifedo...@suse.de>> wrote:
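One way to push that value to running OSDs without a restart (shown with injectargs; the exact delivery mechanism is a matter of preference):

  ceph tell 'osd.*' injectargs '--bluestore_bluefs_gift_ratio 0.0002'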

Re: [ceph-users] DR practice: "uuid != super.uuid" and csum error at blob offset 0x0

2019-07-09 Thread Igor Fedotov
Hi Mark, I doubt read-only mode would help here. Log replay  is required to build a consistent store state and one can't bypass it. And looks like your drive/controller still detect some errors while reading. For the second issue this PR might help (you'll be able to disable csum

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
with OSDs crashing. Interestingly it seems that the dying OSDs are always working on a pg from the .rgw.meta pool when they crash. Log : https://pastebin.com/yuJKcPvX On Tue, Jul 9, 2019 at 5:14 AM Igor Fedotov <mailto:ifedo...@suse.de>> wrote: Hi Brett, in Nautilus you can do

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-09 Thread Igor Fedotov
that a try.  Is it something like... ceph tell 'osd.*' bluestore_allocator stupid ceph tell 'osd.*' bluefs_allocator stupid And should I expect any issues doing this? On Mon, Jul 8, 2019 at 1:04 PM Igor Fedotov <mailto:ifedo...@suse.de>> wrote: I should read call stack more carefull

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-09 Thread Igor Fedotov
Hi Lukasz, if this is filestore then most probably my comments are irrelevant. The issue I expected is BlueStore specific. Unfortunately I'm not an expert in filestore, hence unable to help in further investigation. Sorry... Thanks, Igor On 7/9/2019 11:39 AM, Luk wrote: We have

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-08 Thread Igor Fedotov
/8/2019 8:00 PM, Igor Fedotov wrote: Hi Brett, looks like BlueStore is unable to allocate additional space for BlueFS at main device. It's either lacking free space or it's too fragmented... Would you share osd log, please? Also please run "ceph-bluestore-tool --path path-to-osd!!!>

Re: [ceph-users] 3 OSDs stopped and unable to restart

2019-07-08 Thread Igor Fedotov
Hi Brett, looks like BlueStore is unable to allocate additional space for BlueFS at main device. It's either lacking free space or it's too fragmented... Would you share osd log, please? Also please run "ceph-bluestore-tool --path path-to-osd!!!> bluefs-bdev-sizes" and share the output.
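With a concrete (placeholder) OSD path, and run while the OSD is stopped, the requested command would be:

  ceph-bluestore-tool bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-23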

Re: [ceph-users] slow requests due to scrubbing of very small pg

2019-07-04 Thread Igor Fedotov
Hi Lukasz, I've seen something like that - slow requests and relevant OSD reboots on suicide timeout at least twice with two different clusters. The root cause was slow omap listing for some objects which had started to happen after massive removals from RocksDB. To verify if this is the
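One rough way to check for the slow-omap symptom is to time an omap key listing on an object from the affected PG (pool and object names below are hypothetical):

  time rados -p default.rgw.buckets.index listomapkeys .dir.<bucket-id>.0 | wc -l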

Re: [ceph-users] troubleshooting space usage

2019-07-04 Thread Igor Fedotov
. The numbers are identical it seems:     .rgw.buckets   19      15 TiB     78.22       4.3 TiB *8786934* # cat /root/ceph-rgw.buckets-rados-ls-all |wc -l *8786934* Cheers *From: *"Igor Fedotov" *To

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Igor Fedotov
*From: *"Igor Fedotov" *To: *"andrei" *Cc: *"ceph-users" *Sent: *Wednesday, 3 July, 2019 12:29:33 *Subject: *Re: [ceph-users] troubleshooting space usage Hi Andrei, Additionally I'd like to see performance counters dum

Re: [ceph-users] troubleshooting space usage

2019-07-03 Thread Igor Fedotov
. The issues seems to be only with the .rgw-buckets pool where the "ceph df " output shows 15TB of usage and the sum of all buckets in that pool shows just over 6.5TB. Cheers Andrei -------- *From: *"Igor Fedotov"

Re: [ceph-users] troubleshooting space usage

2019-07-02 Thread Igor Fedotov
Hi Andrei, The most obvious reason is space usage overhead caused by BlueStore allocation granularity, e.g. if bluestore_min_alloc_size is 64K and average object size is 16K one will waste 48K per object on average. This is rather speculation so far, as we lack the key information about
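A worked example of that overhead, using the numbers from the paragraph above:

  min_alloc_size = 64K, average object size = 16K
  allocated per object = 64K   (one allocation unit minimum)
  wasted per object    = 64K - 16K = 48K  (75% of the space consumed)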

Re: [ceph-users] OSD bluestore initialization failed

2019-06-21 Thread Igor Fedotov
bluefs _read got 4096 2019-06-21 10:50:56.475440 7f462db84d00 10 bluefs _replay 0x104000: txn(seq 332735 len 0xca5 crc 0x4715a5c6) The entire file as 17M and I can send with necessary , Saulo Augusto Silva Em sex, 21 de jun de 2019 às 06:42, Igor Fedotov <mailto:ifedo...@suse.de>>

Re: [ceph-users] understanding the bluestore blob, chunk and compression params

2019-06-21 Thread Igor Fedotov
daemon osd.130 config set bluestore_compression_mode force', where it restarted immediately. FTR, it *should* compress with osd bluestore_compression_mode=none and the pool's compression_mode=force, right? -- dan -- Dan On Thu, Jun 20, 2019 at 6:57 PM Igor Fedotov wrote: I'd like to see more d

Re: [ceph-users] understanding the bluestore blob, chunk and compression params

2019-06-21 Thread Igor Fedotov
compression algorithm isn't applied when the osd compression mode is set to none. Hence no compression if the pool lacks an explicit algorithm specification. -- dan -- Dan On Thu, Jun 20, 2019 at 6:57 PM Igor Fedotov wrote: I'd like to see more details (preferably backed with logs) on this... On 6/20

Re: [ceph-users] OSD bluestore initialization failed

2019-06-21 Thread Igor Fedotov
Hi Saulo, looks like disk I/O error. Will you set debug_bluefs to 20 and collect the log, then share a few lines prior to the assertion? Checking smartctl output might be a good idea too. Thanks, Igor On 6/21/2019 11:30 AM, Saulo Silva wrote: Hi, After a power failure all OSD´s from a

Re: [ceph-users] understanding the bluestore blob, chunk and compression params

2019-06-20 Thread Igor Fedotov
On 6/20/2019 8:55 PM, Dan van der Ster wrote: On Thu, Jun 20, 2019 at 6:55 PM Igor Fedotov wrote: Hi Dan, bluestore_compression_max_blob_size is applied for objects marked with some additional hints only: if ((alloc_hints & CEPH_OSD_ALLOC_HINT_FLAG_SEQUENTIAL_

Re: [ceph-users] understanding the bluestore blob, chunk and compression params

2019-06-20 Thread Igor Fedotov
I'd like to see more details (preferably backed with logs) on this... On 6/20/2019 6:23 PM, Dan van der Ster wrote: P.S. I know this has been discussed before, but the compression_(mode|algorithm) pool options [1] seem completely broken -- With the pool mode set to force, we see that sometimes

Re: [ceph-users] understanding the bluestore blob, chunk and compression params

2019-06-20 Thread Igor Fedotov
Hi Dan, bluestore_compression_max_blob_size is applied for objects marked with some additional hints only:   if ((alloc_hints & CEPH_OSD_ALLOC_HINT_FLAG_SEQUENTIAL_READ) &&   (alloc_hints & CEPH_OSD_ALLOC_HINT_FLAG_RANDOM_READ) == 0 &&   (alloc_hints &

Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-18 Thread Igor Fedotov
fine turning the warnings off, but it's curious that only this cluster is showing the alerts.  Is there any value in rebuilding the with smaller SSD meta data volumes? Say 60GB or 30GB? -Brett On Tue, Jun 18, 2019 at 1:55 PM Igor Fedotov <mailto:ifedo...@suse.de>> wrote:

Re: [ceph-users] BlueFS spillover detected - 14.2.1

2019-06-18 Thread Igor Fedotov
Hi Brett, this issue has been with you since long before the upgrade to 14.2.1. The upgrade just made the corresponding alert visible. You can turn the alert off by setting bluestore_warn_on_bluefs_spillover=false. But generally this warning shows DB data layout inefficiency - some data is kept at
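To see how much DB data actually sits on the slow device, the BlueFS perf counters can be sampled per OSD (osd.5 is a placeholder):

  ceph daemon osd.5 perf dump bluefs | grep -E 'db_used_bytes|slow_used_bytes'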

Re: [ceph-users] bluestore_allocated vs bluestore_stored

2019-06-17 Thread Igor Fedotov
Hi Maged, min_alloc_size determines allocation granularity hence if object size isn't aligned with its value allocation overhead still takes place. E.g. with min_alloc_size = 16K and object size = 24K total allocation (i.e. bluestore_allocated) would be 32K. And yes, this overhead is
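The two counters discussed in this thread can be read from a running OSD (osd.3 is a placeholder); the gap between them is exactly this allocation overhead:

  ceph daemon osd.3 perf dump bluestore | grep -E 'bluestore_allocated|bluestore_stored'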

Re: [ceph-users] OSD hanging on 12.2.12 by message worker

2019-06-07 Thread Igor Fedotov
Hi Max, I don't think this is allocator related issue. The symptoms that triggered us to start using bitmap allocator over stupid one were: - write op latency gradually increasing over time (days not hours) - perf showing significant amount of time spent in allocator related function -

Re: [ceph-users] SSD Sizing for DB/WAL: 4% for large drives?

2019-05-28 Thread Igor Fedotov
Hi Jake, just my 2 cents - I'd suggest using LVM for DB/WAL to be able to seamlessly extend their sizes if needed. Once you've configured things this way, and if you're able to add more NVMe later, you're almost free to select any size at the initial stage. Thanks, Igor On 5/28/2019 4:13 PM, Jake
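A minimal sketch of that later expansion, assuming the DB LV sits in a volume group with free extents (VG/LV names and the size are hypothetical):

  lvextend -L +30G /dev/ceph-db-vg/db-osd0
  systemctl stop ceph-osd@0
  ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/osd/ceph-0
  systemctl start ceph-osd@0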

Re: [ceph-users] Luminous OSD: replace block.db partition

2019-05-28 Thread Igor Fedotov
Konstantin, one should resize the device before using the bluefs-bdev-expand command. So the first question should be: what's the backend for block.db - a simple device partition, an LVM volume, or a raw file? LVM volume and raw file resizing is quite simple, while a partition might need manual data

Re: [ceph-users] Lost OSD - 1000: FAILED assert(r == 0)

2019-05-24 Thread Igor Fedotov
Hi Guillaume, Could you please set debug-bluefs to 20, restart OSD and collect the whole log. Thanks, Igor On 5/24/2019 4:50 PM, Guillaume Chenuet wrote: Hi, We are running a Ceph cluster with 36 OSD splitted on 3 servers (12 OSD per server) and Ceph version 12.2.11 

Re: [ceph-users] Scrub Crash OSD 14.2.1

2019-05-17 Thread Igor Fedotov
Hi Manuel, Just in case - have you done any manipulation of the underlying disk/partition/volume - resize, replacement, etc.? Thanks, Igor On 5/17/2019 3:00 PM, EDH - Manuel Rios Fernandez wrote: Hi , Today we got some osd that crash after scrub. Version 14.2.1 2019-05-17 12:49:40.955

Re: [ceph-users] OSDs failing to boot

2019-05-09 Thread Igor Fedotov
Hi Paul, could you please set both "debug bluestore" and "debug bluefs" to 20, run again and share the resulting log. Thanks, Igor On 5/9/2019 2:34 AM, Rawson, Paul L. wrote: Hi Folks, I'm having trouble getting some of my OSDs to boot. At some point, these disks got very full. I fixed
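Since the OSDs fail while booting, the debug levels are easiest to raise in ceph.conf (or via CEPH_ARGS) before the next start attempt, e.g.:

  [osd]
    debug bluestore = 20
    debug bluefs = 20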

Re: [ceph-users] Unexplainable high memory usage OSD with BlueStore

2019-05-06 Thread Igor Fedotov
[0x55b2136b3ab0]  NOTE: a copy of the executable, or `objdump -rdS ` is needed to interpret this. --- logging levels ---    0/ 5 none    0/ 1 lockdep    0/ 1 context    1/ 1 crush    1/ 5 mds    1/ 5 mds_balancer    1/ 5 mds_locker    1/ 5 mds_log    1/ 5 mds_log_expire    1/ 5 mds_migrator   

Re: [ceph-users] Bluestore Compression

2019-05-02 Thread Igor Fedotov
Hi Ashley, the general rule is that the compression switch does not affect existing data but controls future write request processing. You can enable/disable compression at any time. Once disabled - no more compression happens. And data that has been compressed remains in this state until
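The switches in question are per-pool (or global via OSD config); a typical toggle looks like this, with a placeholder pool name:

  ceph osd pool set mypool compression_mode aggressive
  ceph osd pool set mypool compression_algorithm snappy
  # later, to stop compressing new writes:
  ceph osd pool set mypool compression_mode none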

Re: [ceph-users] Unexplainable high memory usage OSD with BlueStore

2019-05-01 Thread Igor Fedotov
? -- Try different allocator. Ah, BTW, except memory allocator there's another option: recently backported bitmap allocator. Igor Fedotov wrote about it's expected to have lesser memory footprint with time: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-April/034299.html Also I'm

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-17 Thread Igor Fedotov
Or try full rebuild? On 4/17/2019 5:37 PM, Igor Fedotov wrote: Could you please check if libfio_ceph_objectstore.so has been rebuilt with your last build? On 4/17/2019 6:37 AM, Can Zhang wrote: Thanks for your suggestions. I tried to build libfio_ceph_objectstore.so, but it fails to load

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-17 Thread Igor Fedotov
(https://tracker.ceph.com/issues/38360), the error seems to be caused by mixed versions. My build environment is CentOS 7.5.1804 with SCL devtoolset-7, and ceph is latest master branch. Does someone know about the symbol? Best, Can Zhang Best, Can Zhang On Tue, Apr 16, 2019 at 8:37 PM Igor

Re: [ceph-users] BlueStore bitmap allocator under Luminous and Mimic

2019-04-16 Thread Igor Fedotov
On 4/15/2019 4:17 PM, Wido den Hollander wrote: On 4/15/19 2:55 PM, Igor Fedotov wrote: Hi Wido, the main driver for this backport were multiple complains on write ops latency increasing over time. E.g. see thread named:  "ceph osd commit latency increase over time, until restart&

Re: [ceph-users] Is it possible to run a standalone Bluestore instance?

2019-04-16 Thread Igor Fedotov
Besides the already mentioned store_test.cc, one can also use the ceph objectstore fio plugin (https://github.com/ceph/ceph/tree/master/src/test/fio) to access a standalone BlueStore instance from the fio benchmarking tool. Thanks, Igor On 4/16/2019 7:58 AM, Can ZHANG wrote: Hi, I'd like to run a
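The plugin is driven by an ordinary fio job file pointing at the built library and at a ceph.conf describing the standalone BlueStore instance; roughly (option names as recalled from src/test/fio/README - check there; paths are hypothetical):

  [global]
  ioengine=external:./lib/libfio_ceph_objectstore.so
  conf=ceph-bluestore.conf

  [bluestore-randwrite]
  rw=randwrite
  bs=64k
  size=256m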

Re: [ceph-users] BlueStore bitmap allocator under Luminous and Mimic

2019-04-15 Thread Igor Fedotov
Hi Wido, the main driver for this backport were multiple complains on write ops latency increasing over time. E.g. see thread named:  "ceph osd commit latency increase over time, until restart" here. Or http://tracker.ceph.com/issues/38738 Most symptoms showed Stupid Allocator as a root

Re: [ceph-users] bluefs-bdev-expand experience

2019-04-12 Thread Igor Fedotov
.4 GiB 644 GiB 35.41 MIN/MAX VAR: 0.91/1.10 STDDEV: 3.37 It worked: AVAIL = 594+50 = 644. Great! Thanks a lot for your help. And one more question regarding your last remark is inline below. On Wed, Apr 10, 2019 at 09:54:35PM +0300, Igor Fedotov wrote: On 4/9/2019 1:59 PM, Yury Shevchuk wr

Re: [ceph-users] bluefs-bdev-expand experience

2019-04-10 Thread Igor Fedotov
here for now. You can also note that reported SIZE for osd.2 is 400GiB in your case which is absolutely inline with slow device capacity.  Hence no DB involved. Thanks for your help, -- Yury On Mon, Apr 08, 2019 at 10:17:24PM +0300, Igor Fedotov wrote: Hi Yuri, both issues from R

Re: [ceph-users] How to reduce HDD OSD flapping due to rocksdb compacting event?

2019-04-10 Thread Igor Fedotov
It's ceph-bluestore-tool. On 4/10/2019 10:27 AM, Wido den Hollander wrote: On 4/10/19 9:25 AM, jes...@krogh.cc wrote: On 4/10/19 9:07 AM, Charles Alva wrote: Hi Ceph Users, Is there a way around to minimize rocksdb compacting event so that it won't use all the spinning disk IO utilization

Re: [ceph-users] bluefs-bdev-expand experience

2019-04-08 Thread Igor Fedotov
mimic patch is approved. See https://github.com/ceph/ceph/pull/27447 Thanks, Igor On 4/5/2019 4:07 PM, Yury Shevchuk wrote: On Fri, Apr 05, 2019 at 02:42:53PM +0300, Igor Fedotov wrote: wrt Round 1 - an ability to expand block(main) device has been added to Nautilus, see: https://github.com

Re: [ceph-users] bluefs-bdev-expand experience

2019-04-05 Thread Igor Fedotov
Hi Yuri, wrt Round 1 - an ability to expand block(main) device has been added to Nautilus, see: https://github.com/ceph/ceph/pull/25308 wrt Round 2: - not setting 'size' label looks like a bug although I recall I fixed it... Will double check. - wrong stats output is probably related to

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Igor Fedotov
the case. Thanks, Igor Forwarded Message Subject:     High CPU in StupidAllocator Date:     Tue, 12 Feb 2019 10:24:37 +0100 From:     Adam Kupczyk To:     IGOR FEDOTOV Hi Igor, I have observed that StupidAllocator can burn a lot of CPU in StupidAllocator::allocate_int

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-03-01 Thread Igor Fedotov
Subject:High CPU in StupidAllocator Date: Tue, 12 Feb 2019 10:24:37 +0100 From: Adam Kupczyk To: IGOR FEDOTOV Hi Igor, I have observed that StupidAllocator can burn a lot of CPU in StupidAllocator::allocate_int(). This comes from loops: while (p != free[bin].end

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Igor Fedotov
Also I think it makes sense to create a ticket at this point. Any volunteers? On 3/1/2019 1:00 AM, Igor Fedotov wrote: Wondering if somebody would be able to apply simple patch that periodically resets StupidAllocator? Just to verify/disprove the hypothesis it's allocator relateted On 2/28

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-28 Thread Igor Fedotov
Wondering if somebody would be able to apply a simple patch that periodically resets StupidAllocator? Just to verify/disprove the hypothesis that it's allocator related. On 2/28/2019 11:57 PM, Stefan Kooman wrote: Quoting Wido den Hollander (w...@42on.com): Just wanted to chime in, I've seen

Re: [ceph-users] Blocked ops after change from filestore on HDD to bluestore on SDD

2019-02-27 Thread Igor Fedotov
Hi Uwe, AFAIR the Samsung 860 Pro isn't for the enterprise market; you shouldn't use consumer SSDs for Ceph. I had some experience with the Samsung 960 Pro a while ago and it turned out that it handled fsync-ed writes very slowly (compared to the original/advertised performance). Which one can
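A common way to expose this behaviour is a single-job sync-write fio run against the SSD (the target below is a placeholder and will be overwritten - use a scratch device or file):

  fio --name=fsync-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based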

Re: [ceph-users] [Bluestore] Some of my osd's uses BlueFS slow storage for db - why?

2019-02-20 Thread Igor Fedotov
You're right - WAL/DB expansion capability is present in Luminous+ releases. But David meant volume migration stuff which appeared in Nautilus, see: https://github.com/ceph/ceph/pull/23103 Thanks, Igor On 2/20/2019 9:22 AM, Konstantin Shalygin wrote: On 2/19/19 11:46 PM, David Turner

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-19 Thread Igor Fedotov
eplication time ? - Original Mail - From: "Wido den Hollander" To: "aderumier" Cc: "Igor Fedotov" , "ceph-users" , "ceph-devel" Sent: Friday, 15 February 2019 14:59:30 Subject: Re: [ceph-users] ceph osd commit latency increase over time,

Re: [ceph-users] single OSDs cause cluster hickups

2019-02-15 Thread Igor Fedotov
daemon osd.417 config show | grep discard "bdev_async_discard": "false", "bdev_enable_discard": "false", [...] So there must be something else causing the problems. Thanks, Denny Am 15.02.2019 um 12:41 schrieb Igor Fedotov : Hi Denny, Do not

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Igor Fedotov
18.13:30.dump_mempools.txt > Then it is decreasing over time (around 3.7G this morning), but RSS is still at 8G > > > I'm graphing mempools counters too since yesterday, so I'll be able to track them over time. > > - Original Mail - > From: "Igor Fedotov" >

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-15 Thread Igor Fedotov
00 }, "buffer_anon": { "items": 19664, "bytes": 25486050 }, "buffer_meta": { "items": 46189, "bytes": 2956096 }, "osd": { "items": 243, "bytes": 3089016 }, "osd_mapbl": { "items": 17, "

Re: [ceph-users] single OSDs cause cluster hickups

2019-02-15 Thread Igor Fedotov
Hi Denny, I don't remember exactly when discards appeared in BlueStore, but they are disabled by default: see the bdev_enable_discard option. Thanks, Igor On 2/15/2019 2:12 PM, Denny Kreische wrote: Hi, two weeks ago we upgraded one of our ceph clusters from luminous 12.2.8 to mimic 13.2.4,

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-11 Thread Igor Fedotov
"osdmap": { "items": 3803, "bytes": 224552 }, "osdmap_mapping": { "items": 0, "bytes": 0 }, "pgmap": {

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Igor Fedotov
it directly to you or what is the best procedure for you? Reasonable (up to 10MB?) email attachment is OK, for larger ones - whatever publicly available site is fine. Thanks for your support! Eugen Zitat von Igor Fedotov : Eugen, At first - you should upgrade to 12.2.11 (or bring

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Igor Fedotov
on the result we'll decide how to continue, right? Is there anything else to be enabled for that command or can I simply run 'ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-'? Any other obstacles I should be aware of when running fsck? Thanks! Eugen Zitat von Igor Fedotov : Hi

Re: [ceph-users] SSD OSD crashing after upgrade to 12.2.10

2019-02-07 Thread Igor Fedotov
Hi Eugen, looks like this isn't [1] but rather https://tracker.ceph.com/issues/38049 and https://tracker.ceph.com/issues/36541 (= https://tracker.ceph.com/issues/36638 for luminous) Hence it's not fixed in 12.2.10, target release is 12.2.11 Also please note the patch allows to avoid new

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-05 Thread Igor Fedotov
les with an 8-12 hour interval? Wrt backporting the bitmap allocator to mimic - we haven't had such plans before that, but I'll discuss this at the BlueStore meeting shortly. Thanks, Igor - Original Mail - From: "Alexandre Derumier" To: "Igor Fedotov" Cc: "Stefan Prieb

Re: [ceph-users] ceph osd commit latency increase over time, until restart

2019-02-04 Thread Igor Fedotov
Hi Alexandre, looks like a bug in StupidAllocator. Could you please collect BlueStore performance counters right after OSD startup and once you get high latency. Specifically 'l_bluestore_fragmentation' parameter is of interest. Also if you're able to rebuild the code I can probably make a
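The counter can be sampled from the admin socket without any rebuild (osd.0 is a placeholder; the exact counter name may differ slightly between releases):

  ceph daemon osd.0 perf dump | grep -i fragmentation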

Re: [ceph-users] bluestore block.db

2019-01-25 Thread Igor Fedotov
Hi Frank, you might want to use ceph-kvstore-tool, e.g. ceph-kvstore-tool bluestore-kv dump Thanks, Igor On 1/25/2019 10:49 PM, F Ritchie wrote: Hi all, Is there a way to dump the contents of block.db to a text file? I am not trying to fix a problem just curious and want to poke around.
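For example, against a stopped OSD (the path is a placeholder; the dump can be huge, so redirect it to a file):

  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-3 dump > osd3-kv-dump.txt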

Re: [ceph-users] Bluestore 32bit max_object_size limit

2019-01-21 Thread Igor Fedotov
On 1/18/2019 6:33 PM, KEVIN MICHAEL HRPCEK wrote: On 1/18/19 7:26 AM, Igor Fedotov wrote: Hi Kevin, On 1/17/2019 10:50 PM, KEVIN MICHAEL HRPCEK wrote: Hey, I recall reading about this somewhere but I can't find it in the docs or list archive and confirmation from a dev or someone who

Re: [ceph-users] Bluestore 32bit max_object_size limit

2019-01-18 Thread Igor Fedotov
Hi Kevin, On 1/17/2019 10:50 PM, KEVIN MICHAEL HRPCEK wrote: Hey, I recall reading about this somewhere but I can't find it in the docs or list archive and confirmation from a dev or someone who knows for sure would be nice. What I recall is that bluestore has a max 4GB file size limit

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2019-01-11 Thread Igor Fedotov
019-01-11 18:56:00.135 7fb74a8272c0 10 bluestore(/var/lib/ceph/osd/ceph-1) _flush_cache And that is where the -EIO is coming from: https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L5305 So I guess there is an inconsistency between some metadata here? On 27/12/20

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2018-12-27 Thread Igor Fedotov
Hector, One more thing to mention - after expansion please run fsck using ceph-bluestore-tool prior to running osd daemon and collect another log using CEPH_ARGS variable. Thanks, Igor On 12/27/2018 2:41 PM, Igor Fedotov wrote: Hi Hector, I've never tried bluefs-bdev-expand over

Re: [ceph-users] `ceph-bluestore-tool bluefs-bdev-expand` corrupts OSDs

2018-12-27 Thread Igor Fedotov
Hi Hector, I've never tried bluefs-bdev-expand over encrypted volumes but it works absolutely fine for me in other cases. So it would be nice to troubleshoot this a bit. Suggest to do the following: 1) Backup first 8K for all OSD.1 devices (block, db and wal) using dd. This will probably
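Step 1 might look like this (device paths are examples; 8K is two 4096-byte blocks from the start of each device):

  dd if=/var/lib/ceph/osd/ceph-1/block     of=osd1-block.label     bs=4096 count=2
  dd if=/var/lib/ceph/osd/ceph-1/block.db  of=osd1-block.db.label  bs=4096 count=2
  dd if=/var/lib/ceph/osd/ceph-1/block.wal of=osd1-block.wal.label bs=4096 count=2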

Re: [ceph-users] SLOW SSD's after moving to Bluestore

2018-12-11 Thread Igor Fedotov
Hi Tyler, I suspect you have BlueStore DB/WAL at these drives as well, don't you? Then perhaps you have performance issues with f[data]sync requests which DB/WAL invoke pretty frequently. See the following links for details:

Re: [ceph-users] How to recover from corrupted RocksDb

2018-11-29 Thread Igor Fedotov
Yeah, that may be the way. Preferably disable compaction during this procedure though. To do that please set bluestore rocksdb options = "disable_auto_compactions=true" in the [osd] section of ceph.conf. Thanks, Igor On 11/29/2018 4:54 PM, Paul Emmerich wrote: does objectstore-tool still
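In ceph.conf form the suggested setting would be:

  [osd]
    bluestore rocksdb options = "disable_auto_compactions=true"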

Re: [ceph-users] How to recover from corrupted RocksDb

2018-11-29 Thread Igor Fedotov
'ceph-bluestore-tool repair' checks and repairs BlueStore metadata consistency, not RocksDB's. It looks like you're observing a CRC mismatch during DB compaction which is probably not triggered during the repair. The good news is that it looks like BlueStore's metadata are consistent and hence

Re: [ceph-users] Raw space usage in Ceph with Bluestore

2018-11-28 Thread Igor Fedotov
Hi Jody, yes, this is a known issue. Indeed, currently 'ceph df detail' reports raw space usage in GLOBAL section and 'logical' in the POOLS one. While logical one has some flaws. There is a pending PR targeted to Nautilus to fix that: https://github.com/ceph/ceph/pull/19454 If you want to

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-22 Thread Igor Fedotov
the bluestore-tool standalone and static? Unfortunately I don't know such a method. Maybe try hex editing instead? All the best, Florian Am 11/21/18 um 9:34 AM schrieb Igor Fedotov: Actually (given that your devices are already expanded) you don't need to expand them once again - one can just

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-21 Thread Igor Fedotov
, Florian Engelmann wrote: Great support Igor Both thumbs up! We will try to build the tool today and expand those bluefs devices once again. Am 11/20/18 um 6:54 PM schrieb Igor Fedotov: FYI: https://github.com/ceph/ceph/pull/25187 On 11/20/2018 8:13 PM, Igor Fedotov wrote: On 11/20/2018 7

Re: [ceph-users] RocksDB and WAL migration to new block device

2018-11-20 Thread Igor Fedotov
FYI: https://github.com/ceph/ceph/pull/25187 On 11/20/2018 8:13 PM, Igor Fedotov wrote: On 11/20/2018 7:05 PM, Florian Engelmann wrote: Am 11/20/18 um 4:59 PM schrieb Igor Fedotov: On 11/20/2018 6:42 PM, Florian Engelmann wrote: Hi Igor, what's your Ceph version? 12.2.8 (SES 5.5
