Re: [ceph-users] hanging slow requests: failed to authpin, subtree is being exported

2019-10-21 Thread Kenneth Waegeman
I've made a ticket for this issue: https://tracker.ceph.com/issues/42338 Thanks again! K On 15/10/2019 18:00, Kenneth Waegeman wrote: Hi Robert, all, On 23/09/2019 17:37, Robert LeBlanc wrote: On Mon, Sep 23, 2019 at 4:14 AM Kenneth Waegeman wrote: Hi all, When syncing data with rsync

Re: [ceph-users] hanging slow requests: failed to authpin, subtree is being exported

2019-10-15 Thread Kenneth Waegeman
Hi Robert, all, On 23/09/2019 17:37, Robert LeBlanc wrote: On Mon, Sep 23, 2019 at 4:14 AM Kenneth Waegeman wrote: Hi all, When syncing data with rsync, I'm often getting blocked slow requests, which also block access to this path. 2019-09-23 11:25:49.477 7f4f401e8700 0 log_channel

Re: [ceph-users] mds failing to start 14.2.2

2019-10-15 Thread Kenneth Waegeman
Hi Zheng, Thanks, that made me think I had forgotten to remove some 'temporary-key' from the inconsistency issue I had. Once those were removed, the mds started again. Thanks again! Kenneth On 12/10/2019 04:26, Yan, Zheng wrote: On Sat, Oct 12, 2019 at 1:10 AM Kenneth Waegeman

[ceph-users] mds failing to start 14.2.2

2019-10-11 Thread Kenneth Waegeman
Hi all, After solving some pg inconsistency problems, my fs is still in trouble. My MDSs are crashing with this error:     -5> 2019-10-11 19:02:55.375 7f2d39f10700  1 mds.1.564276 rejoin_start     -4> 2019-10-11 19:02:55.385 7f2d3d717700  5 mds.beacon.mds01 received beacon reply

Re: [ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-11 Thread Kenneth Waegeman
On 11/10/2019 01:21, Brad Hubbard wrote: On Fri, Oct 11, 2019 at 12:27 AM Kenneth Waegeman wrote: Hi Brad, all, Pool 6 has min_size 2: pool 6 'metadata' replicated size 3 min_size 2 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 autoscale_mode warn last_change 172476 flags

Re: [ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-10 Thread Kenneth Waegeman
poch": 207196,     "inconsistents": [] } {     "epoch": 207626,     "inconsistents": [] } {     "epoch": 207432,     "inconsistents": [] } {     "epoch": 207652,     "inconsistents": [] } {     "epoch": 207427,   

[ceph-users] lot of inconsistent+failed_repair - failed to pick suitable auth object (14.2.3)

2019-10-10 Thread Kenneth Waegeman
Hi all, After some node failure and rebalancing, we have a lot of pg's in an inconsistent state. I tried to repair, but it didn't work. This is also in the logs: 2019-10-10 11:23:27.221 7ff54c9b0700  0 log_channel(cluster) log [DBG] : 6.327 repair starts 2019-10-10 11:23:27.431 7ff5509b8700 -1
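
For anyone triaging the same state, a minimal sketch of the usual first steps (the PG id 6.327 comes from the log line above; the pool name is taken from the follow-up message and is otherwise an assumption):

    rados list-inconsistent-pg metadata
    rados list-inconsistent-obj 6.327 --format=json-pretty
    ceph pg repair 6.327    # only once a usable authoritative copy exists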

[ceph-users] ssd requirements for wal/db

2019-10-04 Thread Kenneth Waegeman
Hi all, We are thinking about putting the WAL/DB of our HDDs on SSDs. If we put the WAL of 4 HDDs on 1 SSD as recommended, what type of SSD would suffice? We were thinking of using SATA Read Intensive 6Gbps 1DWPD SSDs. Does anyone have experience with this configuration? Would we
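
For context, a hedged sketch of how a shared DB/WAL device is attached at OSD creation time with ceph-volume (device names are placeholders; when only --block.db is given, the WAL is placed inside the DB partition):

    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1

Whether a 1 DWPD SATA SSD holds up then depends mainly on the write traffic of the 4 HDDs sharing it.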

[ceph-users] hanging slow requests: failed to authpin, subtree is being exported

2019-09-23 Thread Kenneth Waegeman
Hi all, When syncing data with rsync, I'm often getting blocked slow requests, which also block access to this path. 2019-09-23 11:25:49.477 7f4f401e8700  0 log_channel(cluster) log [WRN] : slow request 31.895478 seconds old, received at 2019-09-23 11:25:17.598152:

Re: [ceph-users] ceph mdss keep on crashing after update to 14.2.3

2019-09-23 Thread Kenneth Waegeman
(on the MDS's only) ?? -- dan On Thu, Sep 19, 2019 at 4:59 PM Kenneth Waegeman wrote: Hi all, I updated our ceph cluster to 14.2.3 yesterday, and today the mds are crashing one after another. I'm using two active mds. I've made a tracker ticket, but I was wondering if someone else also has

Re: [ceph-users] ceph mdss keep on crashing after update to 14.2.3

2019-09-19 Thread Kenneth Waegeman
I forgot to mention the tracker issue: https://tracker.ceph.com/issues/41935 On 19/09/2019 16:59, Kenneth Waegeman wrote: Hi all, I updated our ceph cluster to 14.2.3 yesterday, and today the mds are crashing one after another. I'm using two active mds. I've made a tracker ticket, but I

[ceph-users] ceph mdss keep on crashing after update to 14.2.3

2019-09-19 Thread Kenneth Waegeman
Hi all, I updated our ceph cluster to 14.2.3 yesterday, and today the mds are crashing one after another. I'm using two active mds. I've made a tracker ticket, but I was wondering if someone else also has seen this issue yet? -27> 2019-09-19 15:42:00.196 7f036c2f0700 4 mds.1.server

Re: [ceph-users] regularly 'no space left on device' when deleting on cephfs

2019-09-10 Thread Kenneth Waegeman
Hi Paul, all, Thanks! But I don't seem to find how to debug the purge queue. When I check the purge queue, I get these numbers: [root@mds02 ~]# ceph daemon mds.mds02 perf dump | grep -E 'purge|pq'     "purge_queue": {     "pq_executing_ops": 0,     "pq_executing": 0,    
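
A sketch of the checks that usually come next (the mds name is taken from the message above; jq is only used here for readability):

    ceph daemon mds.mds02 perf dump | jq '.purge_queue'   # full purge queue section, not just the grepped counters
    ceph daemon mds.mds02 objecter_requests               # in-flight OSD ops from the MDS; stuck purges tend to show up here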

[ceph-users] regularly 'no space left on device' when deleting on cephfs

2019-09-06 Thread Kenneth Waegeman
Hi all, We are using cephfs to make a copy of another fs via rsync, and we also use snapshots. I'm seeing this issue now and then when I try to delete files on cephFS: [root@osd001 ~]# rm -f /mnt/ceph/backups/osd00* rm: cannot remove

[ceph-users] cephfs deleting files No space left on device

2019-05-10 Thread Kenneth Waegeman
Hi all, I am seeing issues on cephfs running 13.2.5 when deleting files: [root@osd006 ~]# rm /mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700 rm: remove regular empty file ‘/mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700’? y rm: cannot remove

Re: [ceph-users] Unexplainable high memory usage OSD with BlueStore

2019-05-06 Thread Kenneth Waegeman
Hi all, I am also switching OSDs to the new bitmap allocator on 13.2.5. That went quite fluently so far, except for one OSD that keeps segfaulting when I enable the bitmap allocator. Each time I disable the bitmap allocator on it again, the OSD is OK again. Segfault error of the OSD: --- begin

[ceph-users] slow ops after cephfs snapshot removal

2018-11-09 Thread Kenneth Waegeman
Hi all, On Mimic 13.2.1, we are seeing blocked ops on cephfs after removing some snapshots: [root@osd001 ~]# ceph -s   cluster:     id: 92bfcf0a-1d39-43b3-b60f-44f01b630e47     health: HEALTH_WARN     5 slow ops, oldest one blocked for 1162 sec, mon.mds03 has slow ops  

Re: [ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread Kenneth Waegeman
immediately. Thanks, it works now! K On 31/07/18 17:06, John Spray wrote: On Tue, Jul 31, 2018 at 3:45 PM Kenneth Waegeman wrote: Hi all, I updated an existing Luminous cluster to Mimic 13.2.1. All daemons were updated, so I did ceph osd require-osd-release mimic, so everything seems up to date. I w

[ceph-users] CephFS Snapshots in Mimic

2018-07-31 Thread Kenneth Waegeman
Hi all, I updated an existing Luminous cluster to Mimic 13.2.1. All daemons were updated, so I did ceph osd require-osd-release mimic, so everything seems up to date. I want to try the snapshots in Mimic, since this should be stable, so i ran: [root@osd2801 alleee]# ceph fs set cephfs

Re: [ceph-users] HELP! --> CLUSER DOWN (was "v13.2.1 Mimic released")

2018-07-30 Thread Kenneth Waegeman
I'll give it a test then :) On 30/07/18 10:54, Nathan Cutler wrote: for all others on this list, it might also be helpful to know which setups are likely affected. Does this only occur for Filestore disks, i.e. if ceph-volume has taken over taking care of these? Does it happen on every RHEL

[ceph-users] ceph mount nofail option

2018-03-12 Thread Kenneth Waegeman
Hi all, Is there a way to mount the ceph kernel client with the nofail option? I get an 'invalid argument' error when trying to mount ceph with the nofail option, via fstab or mount: mon01,mon02,mon03:/ /mnt/ceph ceph name=cephfs,secretfile=/etc/ceph/secret,noatime,nofail 0 0 or [root@osd003 ~]# mount -t ceph
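
For reference, a sketch of the fstab entry being attempted; nofail (and _netdev) are meant for mount(8)/systemd rather than the filesystem, and whether they get passed through to the kernel client depends on the mount.ceph helper version:

    mon01,mon02,mon03:/  /mnt/ceph  ceph  name=cephfs,secretfile=/etc/ceph/secret,noatime,nofail,_netdev  0 0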

[ceph-users] CephFS Client Capabilities questions

2018-03-07 Thread Kenneth Waegeman
Hi all, I am playing with limiting client access to certain subdirectories of cephfs, running the latest 12.2.4 and the latest CentOS 7.4 kernel, using both the kernel client and fuse. I am following http://docs.ceph.com/docs/luminous/cephfs/client-auth/: To completely restrict the client to the
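
A minimal sketch of the restriction described in the linked client-auth doc (client name, path and data pool below are assumptions, not taken from the message):

    # Luminous shorthand
    ceph fs authorize cephfs client.restricted /subdir rw
    # roughly equivalent explicit caps
    ceph auth get-or-create client.restricted \
        mds 'allow rw path=/subdir' \
        mon 'allow r' \
        osd 'allow rw pool=cephfs_data'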

Re: [ceph-users] force scrubbing

2018-03-01 Thread Kenneth Waegeman
changed. Any clues? Thanks!! K On 13/11/17 10:01, Kenneth Waegeman wrote: Hi all, Is there a way to force scrub a pg of an erasure coded pool? I tried  ceph pg deep-scrub 5.4c7, but after a week it still hasn't scrubbed the pg (last scrub timestamp not changed) Thanks! Kenneth

[ceph-users] removing cache of ec pool (bluestore) with ec_overwrites enabled

2018-02-14 Thread Kenneth Waegeman
Hi all, I'm trying to remove the cache from an erasure coded pool where all OSDs are bluestore OSDs and allow_ec_overwrites is true. I followed the steps on http://docs.ceph.com/docs/master/rados/operations/cache-tiering/, but with the remove-overlay step I'm getting an EBUSY error:
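
For reference, the sequence from the linked cache-tiering doc that ends in the failing step (the ecdata/cache pool names are assumed from the usual example, not from the message):

    ceph osd tier cache-mode cache proxy          # stop caching new writes
    rados -p cache cache-flush-evict-all          # drain remaining objects
    ceph osd tier remove-overlay ecdata           # the step that returns EBUSY above
    ceph osd tier remove ecdata cache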

[ceph-users] mons segmentation faults New 12.2.2 cluster

2018-01-12 Thread Kenneth Waegeman
Hi all, I installed a new Luminous 12.2.2 cluster. The monitors were up at first, but quickly started failing and segfaulting. I only installed some mons, mgr and mds with ceph-deploy, and OSDs with ceph-volume. No pools or fs were created yet. When I start all mons again, there is a short window

[ceph-users] force scrubbing

2017-11-13 Thread Kenneth Waegeman
Hi all, Is there a way to force scrub a pg of an erasure coded pool? I tried  ceph pg deep-scrub 5.4c7, but after a week it still hasn't scrubbed the pg (last scrub timestamp not changed) Thanks! Kenneth ___ ceph-users mailing list
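
A sketch of how the request and its effect are usually checked (PG id from the message; note that deep-scrubs queue behind osd_max_scrubs on the acting OSDs, which is a common reason a manual request seems to be ignored):

    ceph pg deep-scrub 5.4c7
    ceph pg 5.4c7 query | grep -E 'last_(deep_)?scrub_stamp'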

Re: [ceph-users] ceph inconsistent pg missing ec object

2017-11-09 Thread Kenneth Waegeman
Hi Greg, Thanks! This seems to have worked for at least 1 of 2 inconsistent pgs: The inconsistency disappeared after a new scrub. Still waiting for the result of the second pg. I tried to force deep-scrub with `ceph pg deep-scrub ` yesterday, but today the last deep scrub is still from a

[ceph-users] inconsistent pg on erasure coded pool

2017-10-04 Thread Kenneth Waegeman
Hi, We have some inconsistency / scrub error on an erasure coded pool that I can't seem to solve. [root@osd008 ~]# ceph health detail HEALTH_ERR 1 pgs inconsistent; 1 scrub errors pg 5.144 is active+clean+inconsistent, acting [81,119,148,115,142,100,25,63,48,11,43] 1 scrub errors In the

[ceph-users] bluestore-osd and block.dbs of other osds on ssd

2017-07-26 Thread Kenneth Waegeman
Hi all, Using filestore, we have some clusters where we put the journals of regular OSDs (HDD) together with e.g. a cache or metadata OSD on one SSD. Even with the OS on there as well, this gave us better performance than with journals on disk. Now using bluestore, I was thinking whether it is

Re: [ceph-users] ceph-disk --osd-id param

2017-07-25 Thread Kenneth Waegeman
noticed that in 12.1.0 the ceph command was missing options that were supposed to be there, but 12.1.1 had them. Maybe you're seeing a similar issue? -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kenneth Waegeman Sent: Tuesday, July 25, 2017 7

[ceph-users] ceph-disk --osd-id param

2017-07-25 Thread Kenneth Waegeman
Hi all, From the release notes of the Luminous RC, I read: 'There is a simplified OSD replacement process that is more robust.' , linked to http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd If I try running 'ceph-disk prepare --bluestore /dev/sdX --osd-id
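
The replacement flow from the linked page, as a hedged sketch (OSD id and device are placeholders; whether ceph-disk accepts --osd-id depends on the exact 12.1.x build, which is what the reply above suggests):

    ceph osd destroy 23 --yes-i-really-mean-it
    ceph-disk prepare --bluestore /dev/sdX --osd-id 23 --osd-uuid $(uuidgen)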

Re: [ceph-users] CEPHFS file or directories disappear when ls (metadata problem)

2016-09-29 Thread Kenneth Waegeman
On 29/09/16 14:29, Yan, Zheng wrote: On Thu, Sep 29, 2016 at 8:13 PM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all, Following up on this thread: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008537.html we still see files missing when doing ls on

[ceph-users] CEPHFS file or directories disappear when ls (metadata problem)

2016-09-29 Thread Kenneth Waegeman
Hi all, Following up on this thread: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-March/008537.html we still see files missing when doing ls on cephfs with 3.10.0-327.18.2.el7.ug.x86_64. Is there already a solution for this? I don't see anything ceph-related popping up in the

Re: [ceph-users] how to debug pg inconsistent state - no ioerrors seen

2016-08-09 Thread Kenneth Waegeman
for the pg and see if we spot something wrong? Cheers G. On 08/08/2016 09:40 PM, Kenneth Waegeman wrote: Hi all, Since last week, some pg's are going into the inconsistent state after a scrub error. Last week we had 4 pgs in that state; they were on different OSDs, but all in the metadata pool

[ceph-users] how to debug pg inconsistent state - no ioerrors seen

2016-08-08 Thread Kenneth Waegeman
Hi all, Since last week, some pg's are going into the inconsistent state after a scrub error. Last week we had 4 pgs in that state; they were on different OSDs, but all in the metadata pool. I did a pg repair on them, and all were healthy again. But now one pg is inconsistent again, with

Re: [ceph-users] OSD host swap usage

2016-07-27 Thread Kenneth Waegeman
On 27/07/16 10:59, Christian Balzer wrote: Hello, On Wed, 27 Jul 2016 10:21:34 +0200 Kenneth Waegeman wrote: Hi all, When our OSD hosts have been running for some time, we start to see increased usage of swap on a number of them. Some OSDs don't use swap for weeks, while others have a full (4G) swap

[ceph-users] OSD host swap usage

2016-07-27 Thread Kenneth Waegeman
Hi all, When our OSD hosts have been running for some time, we start to see increased usage of swap on a number of them. Some OSDs don't use swap for weeks, while others have a full (4G) swap and start filling swap again after we do a swapoff/swapon. We have 8 8TB OSDs and 2 cache SSDs on each host,

Re: [ceph-users] mds0: Behind on trimming (58621/30)

2016-07-05 Thread Kenneth Waegeman
On 04/07/16 11:22, Kenneth Waegeman wrote: On 01/07/16 16:01, Yan, Zheng wrote: On Fri, Jul 1, 2016 at 6:59 PM, John Spray <jsp...@redhat.com> wrote: On Fri, Jul 1, 2016 at 11:35 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all, While syncing a lot of files to

Re: [ceph-users] mds0: Behind on trimming (58621/30)

2016-07-04 Thread Kenneth Waegeman
On 01/07/16 16:01, Yan, Zheng wrote: On Fri, Jul 1, 2016 at 6:59 PM, John Spray <jsp...@redhat.com> wrote: On Fri, Jul 1, 2016 at 11:35 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all, While syncing a lot of files to cephfs, our mds cluster got haywire: the mds

Re: [ceph-users] mds0: Behind on trimming (58621/30)

2016-07-01 Thread Kenneth Waegeman
On 01/07/16 12:59, John Spray wrote: On Fri, Jul 1, 2016 at 11:35 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all, While syncing a lot of files to cephfs, our mds cluster got haywire: the mdss have a lot of segments behind on trimming: (58621/30) Because of this t

[ceph-users] mds0: Behind on trimming (58621/30)

2016-07-01 Thread Kenneth Waegeman
Hi all, While syncing a lot of files to cephfs, our mds cluster went haywire: the mdss have a lot of segments behind on trimming: (58621/30). Because of this the mds cluster gets degraded. RAM usage is about 50GB. The mdses were respawning and replaying continuously, and I had to stop all

[ceph-users] cephfs snapshots

2016-06-22 Thread Kenneth Waegeman
Hi all, In Jewel, ceph fs snapshots are still experimental. Does someone have a clue when this will become stable, or how experimental it is? ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] cephfs rm -rf on directory of 160TB /40M files

2016-04-05 Thread Kenneth Waegeman
-Greg On Monday, April 4, 2016, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all, I want to remove a large directory containing ±40M files / 160TB of data in CephFS by running rm -rf on the directory via the ceph kernel client. After 7h, the rm command is still running. I checked

[ceph-users] cephfs rm -rf on directory of 160TB /40M files

2016-04-04 Thread Kenneth Waegeman
Hi all, I want to remove a large directory containing ±40M files / 160TB of data in CephFS by running rm -rf on the directory via the ceph kernel client. After 7h, the rm command is still running. I checked the rados df output and saw that only about 2TB and 2M files are gone. I know this

Re: [ceph-users] DONTNEED fadvise flag

2016-03-21 Thread Kenneth Waegeman
Thanks! As we are using the kernel client of EL7, does someone know whether that client supports it? On 16/03/16 20:29, Gregory Farnum wrote: On Wed, Mar 16, 2016 at 9:46 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all, Quick question: Does cephFS pass the fadvise DONTNEE

[ceph-users] DONTNEED fadvise flag

2016-03-20 Thread Kenneth Waegeman
Hi all, Quick question: Does cephFS pass the fadvise DONTNEED flag and take it into account? I want to use the --drop-cache option of rsync 3.1.1 to not fill the cache when rsyncing to cephFS Thanks! Kenneth ___ ceph-users mailing list
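
For context, the invocation in question, as a sketch (--drop-cache is not in stock rsync; it comes from the rsync patch set the 3.1.1 build mentioned above was built with):

    rsync -a --drop-cache /source/dir/ /mnt/cephfs/dest/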

Re: [ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-17 Thread Kenneth Waegeman
On 05/02/16 11:43, John Spray wrote: On Fri, Feb 5, 2016 at 9:36 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: On 04/02/16 16:17, Gregory Farnum wrote: On Thu, Feb 4, 2016 at 1:42 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi, Hi, we are runnin

Re: [ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-05 Thread Kenneth Waegeman
this. I don't know if it is related, or if the 'new fs' has broken it..? I added the full log with debug level 20 in an issue ticket: http://tracker.ceph.com/issues/14672 On 05/02/16 11:43, John Spray wrote: On Fri, Feb 5, 2016 at 9:36 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote:

Re: [ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-05 Thread Kenneth Waegeman
On 04/02/16 16:17, Gregory Farnum wrote: On Thu, Feb 4, 2016 at 1:42 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi, Hi, we are running ceph 9.2.0. Overnight, our ceph state went to 'mds mds03 is laggy' . When I checked the logs, I saw this mds crashed with a stacktr

[ceph-users] ceph 9.2.0 mds cluster went down and now constantly crashes with Floating point exception

2016-02-04 Thread Kenneth Waegeman
Hi, Hi, we are running ceph 9.2.0. Overnight, our ceph state went to 'mds mds03 is laggy' . When I checked the logs, I saw this mds crashed with a stacktrace. I checked the other mdss, and I saw the same there. When I try to start the mds again, I get again a stacktrace and it won't come up:

[ceph-users] very high OSD RAM usage values

2016-01-06 Thread Kenneth Waegeman
Hi all, We experienced some serious trouble with our cluster: a running cluster started failing and started a chain reaction until the ceph cluster was down, as about half the OSDs were down (in an EC pool). Each host has 8 OSDs of 8 TB (i.e. RAID 0 of two 4TB disks) for an EC pool (10+3, 14

[ceph-users] python-flask not in repo's for infernalis

2015-12-14 Thread Kenneth Waegeman
Hi, Is there a reason python-flask is no longer in the repo for infernalis? In CentOS 7 it is still not in the standard repos or EPEL. Thanks! Kenneth ___ ceph-users mailing list ceph-users@lists.ceph.com

[ceph-users] ceph new installation of ceph 0.9.2 issue and crashing osds

2015-12-08 Thread Kenneth Waegeman
Hi, I installed ceph 0.9.2 on a new cluster of 3 nodes, with 50 OSDs on each node (300GB disks, 96GB RAM). While installing, I ran into an issue where I could not even log in as the ceph user. So I increased some limits in security/limits.conf: ceph - nproc 1048576 ceph
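
The limits fragment quoted above, reconstructed as a sketch (the nproc value is from the message; the file path and the nofile line are assumptions):

    # /etc/security/limits.d/90-ceph.conf
    ceph  -  nproc   1048576
    ceph  -  nofile  1048576    # assumed addition, not in the original message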

[ceph-users] upgrading 0.94.5 to 9.2.0 notes

2015-11-20 Thread Kenneth Waegeman
Hi, I recently started a test to upgrade ceph from 0.94.5 to 9.2.0 on Centos7. I had some issues not mentioned in the release notes. Hereby some notes: * Upgrading instructions are only in the release notes, not updated on the upgrade page in the docs:

[ceph-users] all pgs of erasure coded pool stuck stale

2015-11-13 Thread Kenneth Waegeman
Hi all, What could be the reason that all pgs of a whole erasure coded pool are stuck stale? All OSDs have been restarted and are up. The details: We have a setup with 14 OSD hosts with specific OSDs for an erasure coded pool and 2 SSDs for a cache pool, and 3 separate monitor/metadata nodes with

Re: [ceph-users] Problem with infernalis el7 package

2015-11-10 Thread Kenneth Waegeman
On 10/11/15 02:07, c...@dolphin-it.de wrote: Hello, I filed a new ticket: http://tracker.ceph.com/issues/13739 Regards, Kevin [ceph-users] Problem with infernalis el7 package (10-Nov-2015 1:57) From: Bob R To:ceph-users@lists.ceph.com Hello, We've got two problems trying to update

Re: [ceph-users] Problem with infernalis el7 package

2015-11-10 Thread Kenneth Waegeman
Because our problem was not related to ceph-deploy, I created a new ticket: http://tracker.ceph.com/issues/13746 On 10/11/15 16:53, Kenneth Waegeman wrote: On 10/11/15 02:07, c...@dolphin-it.de wrote: Hello, I filed a new ticket: http://tracker.ceph.com/issues/13739 Regards, Kevin [ceph

[ceph-users] ceph mds operations

2015-11-10 Thread Kenneth Waegeman
Hi all, Is there a way to see what an MDS is actually doing? We are testing metadata operations, but in the ceph status output we only see about 50 op/s: client io 90791 kB/s rd, 54 op/s. Our active ceph-mds is using a lot of CPU and 25GB of memory, so I guess it is doing a lot of operations

[ceph-users] upgrading from 0.9.3 to 9.1.0 and systemd

2015-10-19 Thread Kenneth Waegeman
Hi all, I tried upgrading ceph from 0.9.3 to 9.1.0, but ran into some trouble. I chowned the /var/lib/ceph folder as described in the release notes, but my journal is on a separate partition, so I get: Oct 19 11:58:59 ceph001.cubone.os systemd[1]: Started Ceph object storage daemon. Oct 19

[ceph-users] mds0: Client client008 failing to respond to capability release

2015-09-21 Thread Kenneth Waegeman
Hi all! A quick question: We are syncing data over cephfs , and we are seeing messages in our output like: mds0: Client client008 failing to respond to capability release What does this mean? I don't find information about this somewhere else. We are running ceph 9.0.3 On earlier versions,

Re: [ceph-users] mds0: Client client008 failing to respond to capability release

2015-09-21 Thread Kenneth Waegeman
On 21/09/15 16:32, John Spray wrote: On Mon, Sep 21, 2015 at 2:33 PM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi all! A quick question: We are syncing data over cephfs , and we are seeing messages in our output like: mds0: Client client008 failing to respond to capa

[ceph-users] Questions about erasure code pools

2015-08-03 Thread Kenneth Waegeman
Hi, I read here in the documentation: http://docs.ceph.com/docs/master/architecture/#erasure-coding In an erasure coded pool, the primary OSD in the up set receives all write operations. I don't find what happens with read operations. Does the client contact the primary and does this OSD

[ceph-users] rados bench multiple clients error

2015-07-31 Thread Kenneth Waegeman
Hi, I was trying rados bench, and first wrote 250 objects from 14 hosts with --no-cleanup. Then I ran the read tests from the same 14 hosts and ran into this: [root@osd007 test]# /usr/bin/rados -p ectest bench 100 seq 2015-07-31 17:52:51.027872 7f6c40de17c0 -1 WARNING: the following
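
One hedged guess that often applies to multi-client seq runs: give every host its own --run-name so the clients do not collide on the shared default benchmark metadata object (pool name from the message; the option must match between write and read):

    rados -p ectest bench 100 write --no-cleanup --run-name "$(hostname -s)"
    rados -p ectest bench 100 seq --run-name "$(hostname -s)"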

Re: [ceph-users] A cache tier issue with rate only at 20MB/s when data move from cold pool to hot pool

2015-07-30 Thread Kenneth Waegeman
On 06/16/2015 01:17 PM, Kenneth Waegeman wrote: Hi! We also see this at our site: when we cat a large file from cephfs to /dev/null, we get about 10MB/s data transfer. I also do not see a system resource bottleneck. Our cluster consists of 14 servers with 16 disks each, together forming

Re: [ceph-users] OSD RAM usage values

2015-07-29 Thread Kenneth Waegeman
On 07/28/2015 04:04 PM, Dan van der Ster wrote: On Tue, Jul 28, 2015 at 12:07 PM, Gregory Farnum g...@gregs42.com wrote: On Tue, Jul 28, 2015 at 11:00 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: On 07/17/2015 02:50 PM, Gregory Farnum wrote: On Fri, Jul 17, 2015 at 1:13 PM

Re: [ceph-users] OSD RAM usage values

2015-07-29 Thread Kenneth Waegeman
On 07/28/2015 04:21 PM, Mark Nelson wrote: On 07/17/2015 07:50 AM, Gregory Farnum wrote: On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi all, I've read in the documentation that OSDs use around 512MB on a healthy cluster.(http://ceph.com/docs/master

[ceph-users] Migrate OSDs to different backend

2015-07-29 Thread Kenneth Waegeman
Hi all, We are considering migrating all the OSDs of our EC pool from KeyValue to Filestore. Does anyone have experience with this? What would be a good procedure? We have Erasure Code using k+m: 10+3, with host-level failure domain on 14 servers. Our pool is 30% filled. I was thinking:

Re: [ceph-users] OSD RAM usage values

2015-07-28 Thread Kenneth Waegeman
On 07/17/2015 02:50 PM, Gregory Farnum wrote: On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi all, I've read in the documentation that OSDs use around 512MB on a healthy cluster.(http://ceph.com/docs/master/start/hardware-recommendations/#ram) Now, our

[ceph-users] OSD RAM usage values

2015-07-17 Thread Kenneth Waegeman
Hi all, I've read in the documentation that OSDs use around 512MB on a healthy cluster (http://ceph.com/docs/master/start/hardware-recommendations/#ram). Now, our OSDs are all using around 2GB of RAM while the cluster is healthy. PID USER PR NI VIRT RES SHR S %CPU

[ceph-users] fuse mount in fstab

2015-07-09 Thread Kenneth Waegeman
Hi all, we are trying to mount ceph-fuse in fstab, following this: http://ceph.com/docs/master/cephfs/fstab/ When we add this: id=cephfs,conf=/etc/ceph/ceph.conf /mnt/ceph fuse.ceph defaults 0 0 to fstab, we get an error message running mount: mount: can't find
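
For comparison, the fstab form from the linked doc next to the direct invocation it corresponds to (the client id 'cephfs' is the one used above; the fstab form needs a mount/ceph-fuse combination that understands fuse.ceph, which the follow-up message below tracks down to a version issue):

    id=cephfs,conf=/etc/ceph/ceph.conf  /mnt/ceph  fuse.ceph  defaults,_netdev  0 0
    ceph-fuse --id cephfs -c /etc/ceph/ceph.conf /mnt/ceph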

Re: [ceph-users] fuse mount in fstab

2015-07-09 Thread Kenneth Waegeman
Hmm, it looks like a version issue. I am testing with these versions on CentOS 7: ~]# mount -V mount from util-linux 2.23.2 (libmount 2.23.0: selinux, debug, assert) ~]# ceph-fuse -v ceph version 0.94.1 (e4bfad3a3c51054df7e537a724c8d0bf9be972ff) This does not work. On my fedora box, with

Re: [ceph-users] A cache tier issue with rate only at 20MB/s when data move from cold pool to hot pool

2015-06-16 Thread Kenneth Waegeman
Hi! We also see this at our site: when we cat a large file from cephfs to /dev/null, we get about 10MB/s data transfer. I also do not see a system resource bottleneck. Our cluster consists of 14 servers with 16 disks each, together forming an EC pool. We also have 2 SSDs per server for

Re: [ceph-users] bursty IO, ceph cache pool can not follow evictions

2015-06-03 Thread Kenneth Waegeman
storage disks aren't seeing a lot of usage (see iostat output in the other email), so is there a way to increase the flushing speed by tuning the cache agent, e.g. for parallelism? Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Kenneth Waegeman

Re: [ceph-users] bursty IO, ceph cache pool can not follow evictions

2015-06-03 Thread Kenneth Waegeman
flushing, and therefore not getting a full cache. -- Paul On Jun 2, 2015, at 9:53 AM, Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: Hi, we were rsync-streaming with 4 cephfs clients to a ceph cluster with a cache layer upon an erasure coded pool

[ceph-users] bursty IO, ceph cache pool can not follow evictions

2015-06-02 Thread Kenneth Waegeman
Hi, we were rsync-streaming with 4 cephfs clients to a ceph cluster with a cache layer upon an erasure coded pool. This had been going on for some time without real problems. Today we added 2 more streams, and very soon we saw some strange behaviour: - We are getting blocked requests on

Re: [ceph-users] Ceph MDS continually respawning (hammer)

2015-05-28 Thread Kenneth Waegeman
On 05/27/2015 10:30 PM, Gregory Farnum wrote: On Wed, May 27, 2015 at 6:49 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: We are also running a full backup sync to cephfs, using multiple distributed rsync streams (with zkrsync), and also ran into this issue today on Hammer 0.94.1

Re: [ceph-users] Ceph MDS continually respawning (hammer)

2015-05-27 Thread Kenneth Waegeman
We are also running a full backup sync to cephfs, using multiple distributed rsync streams (with zkrsync), and also ran into this issue today on Hammer 0.94.1. After setting the beacon higher, and eventually clearing the journal, it stabilized again. We were using ceph-fuse to mount the

Re: [ceph-users] ceph.conf boolean value for mon_cluster_log_to_syslog

2015-05-27 Thread Kenneth Waegeman
On 05/23/2015 08:26 AM, Abhishek L wrote: Gregory Farnum writes: On Thu, May 21, 2015 at 8:24 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi, Some strange issue wrt boolean values in the config: this works: osd_crush_update_on_start = 0 - osd not updated

[ceph-users] ceph.conf boolean value for mon_cluster_log_to_syslog

2015-05-21 Thread Kenneth Waegeman
Hi, Some strange issue wrt boolean values in the config: this works: osd_crush_update_on_start = 0 - osd not updated osd_crush_update_on_start = 1 - osd updated In a previous version we could set boolean values in the ceph.conf file with the integers 1 (true) and 0 (false), also for
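
A sketch of the unambiguous spellings (option names are the ones discussed above; bare 1/0 historically worked for many options, which is exactly the behaviour in question here):

    # ceph.conf
    [osd]
    osd_crush_update_on_start = false
    [mon]
    mon_cluster_log_to_syslog = true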

Re: [ceph-users] ceph tell changed?

2015-05-21 Thread Kenneth Waegeman
the command you run and its output ? Cheers On 21/05/2015 17:34, Kenneth Waegeman wrote: Hi, We're using ceph tell in our configuration system since emperor, and before we could run 'ceph tell *.$host injectargs -- ...' , and while I'm honestly not completely sure anymore this did all what I think

Re: [ceph-users] cache pool parameters and pressure

2015-05-12 Thread Kenneth Waegeman
On 04/30/2015 07:50 PM, Gregory Farnum wrote: On Thu, Apr 30, 2015 at 2:03 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: So the cache is empty, but I get a warning when I check the health: health HEALTH_WARN mds0: Client cephtst.cubone.os failing to respond to cache

[ceph-users] ceph-fuse options: writeback cache

2015-05-11 Thread Kenneth Waegeman
Hi all, I have a few questions about ceph-fuse options: - Is the fuse writeback cache being used? How can we see this? Can it be turned on with allow_wbcache somehow? - What is the default of the big_writes option (as seen in /usr/bin/ceph-fuse --help)? Where can we see this? If we run

[ceph-users] cache pool parameters and pressure

2015-04-30 Thread Kenneth Waegeman
Hi all, I have some questions related to the caching layer. I am using the latest version of ceph: 0.94.1. I created the ceph pool with these options: ceph osd tier add ecdata cache ceph osd tier cache-mode cache writeback ceph osd tier set-overlay ecdata cache ceph osd pool set cache
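
For context, the typical continuation of a writeback-tier setup where the quoted command list breaks off (values are purely illustrative, not recommendations):

    ceph osd pool set cache hit_set_type bloom
    ceph osd pool set cache target_max_bytes 500000000000
    ceph osd pool set cache cache_target_dirty_ratio 0.4
    ceph osd pool set cache cache_target_full_ratio 0.8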

Re: [ceph-users] cluster not coming up after reboot

2015-04-27 Thread Kenneth Waegeman
On 04/23/2015 06:58 PM, Craig Lewis wrote: Yes, unless you've adjusted: [global] mon osd min down reporters = 9 mon osd min down reports = 12 OSDs talk to the MONs on the public network. The cluster network is only used for OSD to OSD communication. If one OSD node can't talk on that

Re: [ceph-users] removing a ceph fs

2015-04-23 Thread Kenneth Waegeman
mds.0.4 suicide. wanted down:dne, now up:active -Greg On Wed, Apr 22, 2015 at 6:46 AM Kenneth Waegeman <kenneth.waege...@ugent.be> wrote: forgot to mention I'm running 0.94.1 On 04/22/2015 03:02 PM, Kenneth Waegeman wrote: Hi, I tried

Re: [ceph-users] cluster not coming up after reboot

2015-04-23 Thread Kenneth Waegeman
On 04/22/2015 07:35 PM, Gregory Farnum wrote: On Wed, Apr 22, 2015 at 8:17 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi, I changed the cluster network parameter in the config files, restarted the monitors , and then restarted all the OSDs (shouldn't have done that). Do you mean

[ceph-users] removing a ceph fs

2015-04-22 Thread Kenneth Waegeman
Hi, I tried to recreate a ceph fs (well, actually an underlying pool, but for that I first need to remove the fs), but this seems not that easy to achieve. When I run `ceph fs rm ceph_fs` I get: `Error EINVAL: all MDS daemons must be inactive before removing filesystem` I stopped the 3

Re: [ceph-users] removing a ceph fs

2015-04-22 Thread Kenneth Waegeman
forgot to mention I'm running 0.94.1 On 04/22/2015 03:02 PM, Kenneth Waegeman wrote: Hi, I tried to recreate a ceph fs ( well actually an underlying pool, but for that I need to first remove the fs) , but this seems not that easy to achieve. When I run `ceph fs rm ceph_fs` I get: `Error

[ceph-users] cluster not coming up after reboot

2015-04-22 Thread Kenneth Waegeman
Hi, I changed the cluster network parameter in the config files, restarted the monitors, and then restarted all the OSDs (shouldn't have done that). Now the OSDs keep on crashing, and the cluster is not able to recover. I eventually rebooted the whole cluster, but the problem remains: For

Re: [ceph-users] where to download 0.87 RPMS?

2014-10-31 Thread Kenneth Waegeman
:37:34 +0400 From: Irek Fasikhov malm...@gmail.com Subject: Re: [ceph-users] where to download 0.87 RPMS? To: Kenneth Waegeman kenneth.waege...@ugent.be Cc: Patrick McGarry patr...@inktank.com, ceph-users ceph-users@lists.ceph.com Hi. Use http://ceph.com/rpm-giant/ 2014-10-30 12

Re: [ceph-users] where to download 0.87 RPMS?

2014-10-30 Thread Kenneth Waegeman
list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com - End message from Patrick McGarry patr...@inktank.com - -- Kind regards, Kenneth Waegeman ___ ceph-users mailing list ceph-users@lists.ceph.com

[ceph-users] use ZFS for OSDs

2014-10-29 Thread Kenneth Waegeman
Hi, We are looking to use ZFS for our OSD backend, but I have some questions. My main question is: does Ceph already support the writeparallel mode for ZFS? (as described here: http://www.sebastien-han.fr/blog/2013/12/02/ceph-performance-interesting-things-going-on/) I've found this, but

Re: [ceph-users] OSD troubles on FS+Tiering

2014-09-16 Thread Kenneth Waegeman
- Message from Gregory Farnum g...@inktank.com - Date: Mon, 15 Sep 2014 10:37:07 -0700 From: Gregory Farnum g...@inktank.com Subject: Re: [ceph-users] OSD troubles on FS+Tiering To: Kenneth Waegeman kenneth.waege...@ugent.be Cc: ceph-users ceph-users@lists.ceph.com

[ceph-users] OSDs crashing on CephFS and Tiering

2014-09-15 Thread Kenneth Waegeman
Hi, I have some strange OSD problems. Before the weekend I started some rsync tests over CephFS, on a cache pool with underlying EC KV pool. Today the cluster is completely degraded: [root@ceph003 ~]# ceph status cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d health HEALTH_WARN

[ceph-users] OSD troubles on FS+Tiering

2014-09-15 Thread Kenneth Waegeman
Hi, I have some strange OSD problems. Before the weekend I started some rsync tests over CephFS, on a cache pool with underlying EC KV pool. Today the cluster is completely degraded: [root@ceph003 ~]# ceph status cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d health HEALTH_WARN 19

Re: [ceph-users] Cephfs upon Tiering

2014-09-12 Thread Kenneth Waegeman
- Message from Sage Weil sw...@redhat.com - Date: Thu, 11 Sep 2014 14:10:46 -0700 (PDT) From: Sage Weil sw...@redhat.com Subject: Re: [ceph-users] Cephfs upon Tiering To: Gregory Farnum g...@inktank.com Cc: Kenneth Waegeman kenneth.waege...@ugent.be, ceph-users ceph

[ceph-users] Cephfs upon Tiering

2014-09-11 Thread Kenneth Waegeman
Hi all, I am testing the tiering functionality with cephfs. I used a replicated cache with an EC data pool, and a replicated metadata pool like this: ceph osd pool create cache 1024 1024 ceph osd pool set cache size 2 ceph osd pool set cache min_size 1 ceph osd erasure-code-profile set
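
A sketch of how such a setup is usually completed after the commands quoted above (profile values and the extra pool names are assumptions; pre-Luminous releases spell the failure-domain key as ruleset-failure-domain):

    ceph osd erasure-code-profile set ecprofile k=10 m=3 crush-failure-domain=host
    ceph osd pool create ecdata 1024 1024 erasure ecprofile
    ceph osd pool create metadata 512 512
    ceph osd tier add ecdata cache
    ceph osd tier cache-mode cache writeback
    ceph osd tier set-overlay ecdata cache
    ceph fs new ceph_fs metadata cache    # whether fs new should get the cache or the base pool is the question this thread discusses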

Re: [ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-08 Thread Kenneth Waegeman
To: Kenneth Waegeman kenneth.waege...@ugent.be Cc: ceph-users@lists.ceph.com I have found the root cause. It's a bug. When a chunky scrub happens, it will iterate over the whole pg's objects, and each iteration only scans a few objects. osd/PG.cc:3758 ret = get_pgbackend

[ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-01 Thread Kenneth Waegeman
again!! Kenneth - Message from Haomai Wang haomaiw...@gmail.com - Date: Tue, 26 Aug 2014 17:11:43 +0800 From: Haomai Wang haomaiw...@gmail.com Subject: Re: [ceph-users] ceph cluster inconsistency? To: Kenneth Waegeman kenneth.waege...@ugent.be Cc: ceph-users

Re: [ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-01 Thread Kenneth Waegeman
let me know if you need more information, and thanks ! Kenneth - Message from Haomai Wang haomaiw...@gmail.com - Date: Mon, 1 Sep 2014 21:30:16 +0800 From: Haomai Wang haomaiw...@gmail.com Subject: Re: ceph cluster inconsistency keyvaluestore To: Kenneth Waegeman
