[ceph-users] Re: removing/flattening a bucket without data movement?

2019-08-30 Thread Konstantin Shalygin
On 8/31/19 3:42 AM, Zoltan Arnold Nagy wrote: Originally our osd tree looked like this: ID  CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF  -1   2073.15186 root default -14    176.63100 rack s01-rack -19    176.63100 host s01 -15   

[ceph-users] Re: Out of memory

2019-08-30 Thread Konstantin Shalygin
On 8/30/19 9:20 PM, Sylvain PORTIER wrote: On my ceph osd servers I have lot of "out of memory messages". My servers are configured with :     - 32 G of memory     - 11 HDD (3,5 T each) (+ 2 HDD for the system) And the error messages are : /[101292.017968] Out of memory: Kill process 2597

[ceph-users] Re: Safe to reboot host?

2019-08-30 Thread Konstantin Shalygin
On 8/31/19 12:33 AM, Brett Chancellor wrote: Before I write something that's already been done, are there any built in utilities or tools that can tell me if it's safe to reboot a host? I'm looking for something better than just checking the health status, but rather checking pg status and

[ceph-users] Re: Howto define OSD weight in Crush map

2019-08-30 Thread Konstantin Shalygin
On 8/30/19 5:01 PM, 74cmo...@gmail.com wrote: Hi, after adding an OSD to Ceph it is advisable to create a relevant entry in the Crush map using a weight depending on disk size. Example: ceph osd crush set osd.<ID> <weight> root=default host=<hostname> Question: How is the weight defined depending on disk size?

[ceph-users] removing/flattening a bucket without data movement?

2019-08-30 Thread Zoltan Arnold Nagy
Hi folks, Originally our osd tree looked like this: ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 2073.15186 root default -14 176.63100 rack s01-rack -19 176.63100 host s01 -15 171.29900 rack s02-rack -20
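A common way to check whether such a CRUSH change would move data is to edit a decompiled copy of the map and compare the test mappings before injecting it; a rough sketch (rule id and replica count are examples):

    # export and decompile the current CRUSH map
    ceph osd getcrushmap -o crush.bin
    crushtool -d crush.bin -o crush.txt
    # edit crush.txt (e.g. move the host items out of the rack buckets), then recompile
    crushtool -c crush.txt -o crush-new.bin
    # compare the computed mappings of the old and new map before injecting
    crushtool -i crush.bin --test --show-mappings --rule 0 --num-rep 3 > old.map
    crushtool -i crush-new.bin --test --show-mappings --rule 0 --num-rep 3 > new.map
    diff old.map new.map        # identical output means no data movement
    ceph osd setcrushmap -i crush-new.bin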

[ceph-users] Re: Out of memory

2019-08-30 Thread Paul Emmerich
You are looking for the config option osd_memory_target Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, Aug 30, 2019 at 4:21 PM Sylvain PORTIER wrote: > Hi,

[ceph-users] Safe to reboot host?

2019-08-30 Thread Brett Chancellor
Before I write something that's already been done, are there any built in utilities or tools that can tell me if it's safe to reboot a host? I'm looking for something better than just checking the health status, but rather checking pg status and ensuring that a reboot wouldn't take any undersized
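For a single planned reboot the usual manual sequence is noout plus a PG check; newer releases also offer an ok-to-stop query (treat its availability on your version as an assumption; OSD ids are placeholders):

    ceph osd set noout                   # don't mark OSDs out while the host is down
    ceph osd ok-to-stop 12 13 14         # if available: would stopping these OSDs degrade any PG below min_size?
    ceph pg dump pgs_brief | grep -v 'active+clean'   # anything still undersized/backfilling?
    # ... reboot the host and wait for its OSDs to rejoin ...
    ceph osd unset noout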

[ceph-users] Error: err=rados: File too large caller=cephstorage_linux.go:231

2019-08-30 Thread tapas
Hi, I am trying to upload an image of about 190 MB and got the error "err=rados: File too large caller=cephstorage_linux.go:231". Please help. Thanks, Tapas
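RADOS refuses single objects larger than osd_max_object_size (128 MiB by default since Luminous), which matches a ~190 MB upload failing; the cleaner fix is usually to chunk/stripe the upload on the client side, but as a sketch the limit itself can be inspected and raised (values are examples):

    # on the OSD host: check the current limit
    ceph daemon osd.0 config get osd_max_object_size
    # raise it to 256 MiB cluster-wide (use with care, very large objects hurt recovery)
    ceph config set osd osd_max_object_size 268435456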

[ceph-users] Re: backfill_toofull after adding new OSDs

2019-08-30 Thread Frank Schilder
I observe the same issue after adding two new OSD hosts to an almost empty mimic cluster. > Let's try to restrict discussion to the original thread > "backfill_toofull while OSDs are not full" and get a tracker opened up > for this issue. Is this the issue you are referring to:

[ceph-users] Re: Out of memory

2019-08-30 Thread Brett Chancellor
You can set osd_memory_target to a lower value like 2-2.5 GB, depending on the version of Ceph you are using. On Fri, Aug 30, 2019, 10:21 AM Sylvain PORTIER wrote: > Hi, > > On my ceph osd servers I have lot of "out of memory messages". > > My servers are configured with : > > - 32 G of
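For context, the default osd_memory_target is 4 GiB, so 11 OSDs already ask for roughly 44 GiB on a 32 GiB host; lowering it as suggested might look like this (Nautilus-style central config shown, ceph.conf works as well):

    # roughly (32 GiB - OS/overhead) / 11 OSDs ~= 2-2.5 GiB per OSD
    ceph config set osd osd_memory_target 2147483648    # 2 GiB
    # or in /etc/ceph/ceph.conf under [osd]:
    #   osd_memory_target = 2147483648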

[ceph-users] Re: FileStore OSD, journal direct symlinked, permission troubles.

2019-08-30 Thread Marco Gaiarin
> But is the 'code' that identifies (and changes permissions on) the journal dev > PVE-specific or Ceph-generic? I suppose the latter... OK, trying to identify how OSDs get initialized. If I understood correctly: 0) a systemd unit for every OSD gets created following a template:

[ceph-users] Out of memory

2019-08-30 Thread Sylvain PORTIER
Hi, On my ceph osd servers I have a lot of "out of memory" messages. My servers are configured with: - 32 GB of memory - 11 HDDs (3.5 TB each) (+ 2 HDDs for the system) And the error messages are: [101292.017968] Out of memory: Kill process 2597 (ceph-osd) score 102 or sacrifice

[ceph-users] Re: Heavily-linked lists.ceph.com pipermail archive now appears to lead to 404s

2019-08-30 Thread Danny Abukalam
Yes I’m having the same problem - resorting to cached pages by Google for the time being! br, Danny > On 29 Aug 2019, at 14:00, Florian Haas wrote: > > Hi, > > is there any chance the list admins could copy the pipermail archive > from lists.ceph.com over to lists.ceph.io? It seems to

[ceph-users] Re: FileStore OSD, journal direct symlinked, permission troubles.

2019-08-30 Thread Alwin Antreich
On Thu, Aug 29, 2019 at 05:02:11PM +0200, Marco Gaiarin wrote: > Picking up on what was written in your message of 29/08/2019... > > > Another possibility is to convert the MBR to GPT (sgdisk --mbrtogpt) and > > give the partition its UUID (also sgdisk). Then it could be linked by > > its uuid. > and,
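For reference, the udev rules shipped with ceph-disk chown journal partitions by their GPT partition type GUID, so after the MBR-to-GPT conversion the partition usually also needs the Ceph journal typecode; a sketch (device and partition number are placeholders, verify the GUID against the udev rules of your Ceph version):

    sgdisk --mbrtogpt /dev/sdX
    # tag partition 1 as a Ceph journal so the udev rules chown it to ceph:ceph
    sgdisk --typecode=1:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdX
    partprobe /dev/sdX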

[ceph-users] Re: ceph fs crashes on simple fio test

2019-08-30 Thread Frank Schilder
Hi Robert and Paul, a quick update. I restarted all OSDs today to activate osd_op_queue_cut_off=high. I run into a serious problem right after that. The standby-replay MDS daemons started missing mon beacons and were killed by the mons: ceph-01 journal: debug [...] log [INF] Standby daemon
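For context, osd_op_queue_cut_off only takes effect after an OSD restart, and a restart storm can make MDS beacons arrive late; a sketch of the knobs involved (the grace value is just an example):

    ceph config set osd osd_op_queue_cut_off high    # needs an OSD restart to apply
    ceph config set global mds_beacon_grace 60       # give MDS daemons more slack while OSDs restart
    ceph fs status                                   # confirm the MDS daemons stay active/standby-replay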

[ceph-users] Re: Howto add DB (aka RockDB) device to existing OSD on HDD

2019-08-30 Thread Eugen Block
Hi, did you not read my answers? ceph-2:~ # CEPH_ARGS="--bluestore-block-db-size 1048576" ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-1 bluefs-bdev-new-db --dev-target /dev/sdb inferring bluefs devices from bluestore path DB device added /dev/sdb ceph-2:~ # ll

[ceph-users] Re: help

2019-08-30 Thread Amudhan P
My cluster health status went to warning mode only after running mkdir of thousands of folders with multiple subdirectories. If this made an OSD crash, does it really take that long to heal empty directories? On Fri, Aug 30, 2019 at 3:12 PM Janne Johansson wrote: > On Fri 30 Aug 2019 at 10:49

[ceph-users] Howto add DB (aka RockDB) device to existing OSD on HDD

2019-08-30 Thread 74cmonty
Hi, I have created OSDs on HDD w/o putting the DB on a faster drive. In order to improve performance I now have a single SSD drive with 3.8 TB. I modified /etc/ceph/ceph.conf by adding this in [global]: bluestore_block_db_size = 53687091200 This should create a RocksDB with a size of 50 GB. Then I tried to
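bluestore_block_db_size in ceph.conf only applies to newly created OSDs; for an existing OSD the DB has to be attached with ceph-bluestore-tool, typically onto a per-OSD LV carved from the SSD. A rough sketch, assuming LVM and placeholder VG/LV names (Eugen's reply above shows the same tool driven via CEPH_ARGS for the size):

    systemctl stop ceph-osd@1
    lvcreate -L 50G -n db-osd1 vg-ssd          # one LV per OSD on the 3.8 TB SSD
    ceph-bluestore-tool bluefs-bdev-new-db \
        --path /var/lib/ceph/osd/ceph-1 \
        --dev-target /dev/vg-ssd/db-osd1
    chown -h ceph:ceph /var/lib/ceph/osd/ceph-1/block.db
    systemctl start ceph-osd@1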

[ceph-users] Howto define OSD weight in Crush map

2019-08-30 Thread 74cmonty
Hi, after adding an OSD to Ceph it is advisable to create a relevant entry in the Crush map using a weight depending on disk size. Example: ceph osd crush set osd.<ID> <weight> root=default host=<hostname> Question: How is the weight defined depending on disk size? Which algorithm can be used to calculate the
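By convention the CRUSH weight is simply the device capacity expressed in TiB (this is what ceph-volume/ceph-disk assign automatically), so it can be computed from the size in bytes; OSD id and hostname below are placeholders:

    # weight = capacity_in_bytes / 2^40, e.g. for a 4 TB disk:
    # 4000000000000 / 1099511627776 ~= 3.63798
    ceph osd crush set osd.11 3.63798 root=default host=ceph-node1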

[ceph-users] Re: Which CephFS clients send a compressible hint?

2019-08-30 Thread Paul Emmerich
None -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 München www.croit.io Tel: +49 89 1896585 90 On Fri, Aug 30, 2019 at 9:42 AM Erwin Bogaard wrote: > > We're mainly using CephFS using the Centos/Rhel 7 kernel

[ceph-users] Re: help

2019-08-30 Thread Janne Johansson
On Fri, 30 Aug 2019 at 10:49, Amudhan P wrote: > After leaving 12 hours time now cluster status is healthy, but why did it > take such a long time for backfill? > How do I fine-tune? if in case of same kind error pop-out again. > > The backfilling is taking a while because max_backfills = 1 and
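If extra recovery load is acceptable, the usual knobs can be raised at runtime and reverted afterwards (values are examples):

    ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 4'
    # back to the defaults once the cluster is healthy again
    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'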

[ceph-users] Deleted snapshot still having error, how to fix (pg repair is not working)

2019-08-30 Thread Marc Roos
I was a little bit afraid I would be deleting this snapshot without result. How do I fix this error (pg repair is not working) pg 17.36 is active+clean+inconsistent, acting [7,29,12] 2019-08-30 10:40:04.580470 7f9b3f061700 -1 log_channel(cluster) log [ERR] : repair 17.36
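To see which object and which replica is actually inconsistent before retrying the repair, the usual sequence is (pg id taken from the log line above):

    rados list-inconsistent-obj 17.36 --format=json-pretty   # shows the object, the bad shard and the error type
    ceph pg deep-scrub 17.36                                  # refresh the inconsistency data
    ceph pg repair 17.36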

[ceph-users] 645% Clean PG's in Dashboard

2019-08-30 Thread c . lilja
Hi, I've upgraded to Nautilus from Mimic a while ago and enabled the pg_autoscaler. When pg_autoscaler was activated I got a HEALTH_WARN regarding: POOL_TARGET_SIZE_BYTES_OVERCOMMITTED 1 subtrees have overcommitted pool target_size_bytes Pools ['cephfs_data_reduced', 'cephfs_data',
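The warning means the sum of the pools' target_size_bytes exceeds the capacity of the subtree they share; clearing or rebalancing those hints resolves it (pool name and values below are examples):

    ceph osd pool autoscale-status                        # shows TARGET SIZE and RATIO per pool
    ceph osd pool set cephfs_data target_size_bytes 0     # drop the absolute hint ...
    ceph osd pool set cephfs_data target_size_ratio 0.2   # ... or use ratios that sum to <= 1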

[ceph-users] Re: help

2019-08-30 Thread Amudhan P
After leaving it for 12 hours the cluster status is now healthy, but why did it take such a long time to backfill? How do I fine-tune this in case the same kind of error pops up again? On Thu, Aug 29, 2019 at 6:52 PM Caspar Smit wrote: > Hi, > > This output doesn't show anything 'wrong' with the

[ceph-users] Re: active+remapped+backfilling with objects misplaced

2019-08-30 Thread Arash Shams
Thanks David, I will dig for pg-upmap From: David Casier Sent: Tuesday, August 27, 2019 12:26 PM To: ceph-users@ceph.io Subject: [ceph-users] Re: active+remapped+backfilling with objects misplaced Hi, First, do not panic :) Secondly, verify that the number
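For reference, the upmap balancer requires all clients to be at least Luminous and is then enabled through the balancer module; a minimal sketch:

    ceph features                                    # check that no pre-luminous clients are connected
    ceph osd set-require-min-compat-client luminous
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status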

[ceph-users] Re: Identify rbd snapshot

2019-08-30 Thread Marc Roos
Oh ok, that is easy. So the :4 is the snapshot id. rbd_data.1f114174b0dc51.0974:4 ^ -Original Message- From: Ilya Dryomov [mailto:idryo...@gmail.com] Sent: Friday, 30 August 2019 10:27 To: Marc Roos
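The numeric snapshot id can be mapped back to a snapshot name with rbd snap ls, which prints a SNAPID column (pool/image name and output below are illustrative):

    rbd snap ls rbd/myimage
    # SNAPID  NAME           SIZE    ...
    #      4  before-resize  10 GiB  ...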

[ceph-users] Re: Identify rbd snapshot

2019-08-30 Thread Ilya Dryomov
On Thu, Aug 29, 2019 at 11:20 PM Marc Roos wrote: > > > I have this error. I have found the rbd image with the > block_name_prefix:1f114174b0dc51, how can identify what snapshot this > is? (Is it a snapshot?) > > 2019-08-29 16:16:49.255183 7f9b3f061700 -1 log_channel(cluster) log > [ERR] :

[ceph-users] admin socket for OpenStack client vanishes

2019-08-30 Thread Georg Fleig
Hi, I am trying to get detailed information about the RBD images used by OpenStack (r/w operations, throughput, ..). On the mailing list I found instructions that this is possible using an admin socket of the client [1]. So I enabled the socket on one of my hosts according to [2]. The
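A minimal client-side setup for this looks roughly as follows; the socket then exposes per-image perf counters via perf dump (paths and the client id are examples, and the socket directory must be writable by the qemu/libvirt user):

    # /etc/ceph/ceph.conf on the hypervisor
    [client]
        admin socket = /var/run/ceph/$cluster-$type.$id.$pid.$cctid.asok
        log file = /var/log/ceph/qemu-guest-$pid.log

    # query a socket once a VM has (re)attached its volume
    ceph --admin-daemon /var/run/ceph/ceph-client.cinder.12345.94113407356176.asok perf dump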

[ceph-users] Which CephFS clients send a compressible hint?

2019-08-30 Thread Erwin Bogaard
We're mainly using CephFS with the CentOS/RHEL 7 kernel client and I'm pondering whether I should go for "bluestore compression mode" = passive or aggressive with this client to get compression on (preferably) only compressible objects. Is there any list of CephFS clients that send compressible
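Since (per Paul's reply above) no CephFS client currently sends the compressible hint, passive mode would effectively compress nothing; aggressive compresses everything unless a client hints incompressible. A sketch of the per-pool settings (pool name, algorithm and ratio are examples):

    ceph osd pool set cephfs_data compression_mode aggressive
    ceph osd pool set cephfs_data compression_algorithm zstd
    ceph osd pool set cephfs_data compression_required_ratio 0.875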

[ceph-users] Re: [Ceph-users] Re: MDS failing under load with large cache sizes

2019-08-30 Thread Janek Bevendorff
The fix has been merged into master and will be backported soon. Amazing, thanks! I've also done testing in a large cluster to confirm the issue you found. Using multiple processes to create files as fast as possible in a single client reliably reproduced the issue. The MDS cannot recall