Re: [ceph-users] Complete freeze of a cephfs client (unavoidable hard reboot)

2015-05-27 Thread Gregory Farnum
Sorry for the delay; I've been traveling. On Sun, May 17, 2015 at 3:49 PM, Francois Lafont flafdiv...@free.fr wrote: Hi, Sorry for my late answer. Gregory Farnum wrote: 1. Is this kind of freeze normal? Can I avoid these freezes with a more recent version of the kernel in the client

Re: [ceph-users] Cache Pool Flush/Eviction Limits - Hard of Soft?

2015-05-27 Thread Gregory Farnum
The max target limit is a hard limit: the OSDs won't let more than that amount of data in the cache tier. They will start flushing and evicting based on the percentage ratios you can set (I don't remember the exact parameter names) and you may need to set these more aggressively for your given
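[For reference, the hard cap and the flush/evict thresholds described above are per-pool settings; a rough sketch, with a made-up pool name and illustrative values:
  # hard cap on how much data the cache tier may hold
  ceph osd pool set hot-cache target_max_bytes 100000000000
  # start flushing dirty objects at 40% of the target, evict clean objects at 80%
  ceph osd pool set hot-cache cache_target_dirty_ratio 0.4
  ceph osd pool set hot-cache cache_target_full_ratio 0.8 ]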

Re: [ceph-users] replication over slow uplink

2015-05-27 Thread Gregory Farnum
On Tue, May 19, 2015 at 7:35 PM, John Peebles johnp...@gmail.com wrote: Hi, I'm hoping for advice on whether Ceph could be used in an atypical use case. Specifically, I have about ~20TB of files that need replicated to 2 different sites. Each site has its own internal gigabit ethernet

Re: [ceph-users] Discuss: New default recovery config settings

2015-05-29 Thread Gregory Farnum
On Fri, May 29, 2015 at 2:47 PM, Samuel Just sj...@redhat.com wrote: Many people have reported that they need to lower the osd recovery config options to minimize the impact of recovery on client io. We are talking about changing the defaults as follows: osd_max_backfills to 1 (from 10)
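[For context, the options under discussion can be tried at runtime before being written to ceph.conf; a sketch with illustrative values:
  # lower recovery/backfill pressure on all OSDs without a restart
  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 3'
  # equivalent persistent settings
  [osd]
  osd max backfills = 1
  osd recovery max active = 3 ]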

Re: [ceph-users] replication over slow uplink

2015-05-27 Thread Gregory Farnum
On Wed, May 27, 2015 at 6:57 PM, Christian Balzer ch...@gol.com wrote: On Wed, 27 May 2015 14:06:43 -0700 Gregory Farnum wrote: On Tue, May 19, 2015 at 7:35 PM, John Peebles johnp...@gmail.com wrote: Hi, I'm hoping for advice on whether Ceph could be used in an atypical use case

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-05-27 Thread Gregory Farnum
details, speak up now. I'd open an issue, but I don't have a reliable way to reproduce this and even less desire to do so on this production cluster. ^_- Christian On Sat, 6 Dec 2014 12:48:25 +0900 Christian Balzer wrote: On Fri, 5 Dec 2014 11:23:19 -0800 Gregory Farnum wrote: On Thu, Dec 4

Re: [ceph-users] Hammer cache behavior

2015-05-27 Thread Gregory Farnum
On Mon, May 18, 2015 at 9:34 AM, Brian Rak b...@gameservers.com wrote: We just enabled a small cache pool on one of our clusters (v 0.94.1) and have run into some issues: 1) Cache population appears to happen via the public network (not the cluster network). We're seeing basically no traffic

Re: [ceph-users] OSD trashed by simple reboot (Debian Jessie, systemd?)

2015-05-28 Thread Gregory Farnum
On Thu, May 28, 2015 at 12:22 AM, Christian Balzer ch...@gol.com wrote: Hello Greg, On Wed, 27 May 2015 22:53:43 -0700 Gregory Farnum wrote: The description of the logging abruptly ending and the journal being bad really sounds like part of the disk is going back in time. I'm not sure

Re: [ceph-users] What do internal_safe_to_start_threads and leveldb_compression do?

2015-06-02 Thread Gregory Farnum
On Tue, Jun 2, 2015 at 6:47 AM, Erik Logtenberg e...@logtenberg.eu wrote: What does this do? - leveldb_compression: false (default: true) - leveldb_block/cache/write_buffer_size (all bigger than default) I take it you're running these commands on a monitor (from I think the Dumpling

Re: [ceph-users] Read Errors and OSD Flapping

2015-06-02 Thread Gregory Farnum
On Sat, May 30, 2015 at 2:23 PM, Nick Fisk n...@fisk.me.uk wrote: Hi All, I was noticing poor performance on my cluster and when I went to investigate I noticed OSD 29 was flapping up and down. On investigation it looks like it has 2 pending sectors, kernel log is filled with the

Re: [ceph-users] Discuss: New default recovery config settings

2015-06-01 Thread Gregory Farnum
On Mon, Jun 1, 2015 at 6:39 PM, Paul Von-Stamwitz pvonstamw...@us.fujitsu.com wrote: On Fri, May 29, 2015 at 4:18 PM, Gregory Farnum g...@gregs42.com wrote: On Fri, May 29, 2015 at 2:47 PM, Samuel Just sj...@redhat.com wrote: Many people have reported that they need to lower the osd recovery

Re: [ceph-users] Node reboot -- OSDs not logging off from cluster

2015-07-01 Thread Gregory Farnum
On Tue, Jun 30, 2015 at 10:36 AM, Daniel Schneller daniel.schnel...@centerdevice.com wrote: Hi! We are seeing a strange - and problematic - behavior in our 0.94.1 cluster on Ubuntu 14.04.1. We have 5 nodes, 4 OSDs each. When rebooting one of the nodes (e. g. for a kernel upgrade) the OSDs

Re: [ceph-users] Round-trip time for monitors

2015-07-01 Thread Gregory Farnum
On Wed, Jul 1, 2015 at 8:38 AM, - - francois.pe...@san-services.com wrote: Hi everybody, We have 3 monitors in our ceph cluster: 2 in one local site (2 data centers a few km away from each other), and the 3rd one on a remote site, with a maximum round-trip time (RTT) of 30ms between the local

Re: [ceph-users] file/directory invisible through ceph-fuse

2015-07-01 Thread Gregory Farnum
On Wed, Jul 1, 2015 at 9:02 AM, flisky yinjif...@lianjia.com wrote: Hi list, I meet a strange problem: sometimes I cannot see the file/directory created by another ceph-fuse client. It comes into visible after I touch/mkdir the same name. Any thoughts? What version are you running? We've

Re: [ceph-users] Ceph MDS continually respawning (hammer)

2015-05-22 Thread Gregory Farnum
On Fri, May 22, 2015 at 12:45 PM, Adam Tygart mo...@ksu.edu wrote: Fair enough. Anyway, is it safe to now increase the 'mds beacon grace' to try and get the mds server functional again? Yep! Let us know how it goes... I realize there is nothing simple about the things that are being

Re: [ceph-users] Ceph MDS continually respawning (hammer)

2015-05-22 Thread Gregory Farnum
On Fri, May 22, 2015 at 11:34 AM, Adam Tygart mo...@ksu.edu wrote: On Fri, May 22, 2015 at 11:47 AM, John Spray john.sp...@redhat.com wrote: On 22/05/2015 15:33, Adam Tygart wrote: Hello all, The ceph-mds servers in our cluster are performing a constant boot-replay-crash in our systems.

Re: [ceph-users] rados_clone_range

2015-05-22 Thread Gregory Farnum
On Thu, May 21, 2015 at 3:09 AM, Michel Hollands mholla...@velocix.com wrote: Hello, Is it possible to use the rados_clone_range() librados API call with an erasure coded pool ? The documentation doesn’t mention it’s not possible. However running the clonedata command from the rados utility

Re: [ceph-users] Ceph MDS continually respawning (hammer)

2015-05-22 Thread Gregory Farnum
with the kernel dcache, unfortunately. We improved it a bit just last week but we'll have to try and diagnose what happened in this case more before we can say if it was that issue or something else. -Greg -- Adam On Fri, May 22, 2015 at 2:06 PM, Gregory Farnum g...@gregs42.com wrote: On Fri

Re: [ceph-users] HDFS on Ceph (RBD)

2015-05-22 Thread Gregory Farnum
If you guys have stuff running on Hadoop, you might consider testing out CephFS too. Hadoop is a predictable workload that we haven't seen break at all in several years and the bindings handle data locality and such properly. :) -Greg On Thu, May 21, 2015 at 11:24 PM, Wang, Warren

Re: [ceph-users] ceph.conf boolean value for mon_cluster_log_to_syslog

2015-05-22 Thread Gregory Farnum
On Thu, May 21, 2015 at 8:24 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi, Some strange issue wrt boolean values in the config: this works: osd_crush_update_on_start = 0 - osd not updated osd_crush_update_on_start = 1 - osd updated In a previous version we could set boolean

Re: [ceph-users] CephFS archive use case

2015-07-07 Thread Gregory Farnum
That's not something that CephFS supports yet; raw RADOS doesn't have any kind of immutability support either. :( -Greg On Tue, Jul 7, 2015 at 5:28 PM Peter Tiernan ptier...@tchpc.tcd.ie wrote: Hi, i have a use case for CephFS whereby files can be added but not modified or deleted. Is this

Re: [ceph-users] Memory-Usage

2015-08-18 Thread Gregory Farnum
On Mon, Aug 17, 2015 at 8:21 PM, Patrik Plank pat...@plank.me wrote: Hi, have a ceph cluster with three nodes and 32 osds. The three nodes have 16GB memory but only 5GB is in use. Nodes are Dell Poweredge R510. my ceph.conf: [global] mon_initial_members = ceph01 mon_host =

Re: [ceph-users] Ceph File System ACL Support

2015-08-18 Thread Gregory Farnum
On Mon, Aug 17, 2015 at 4:12 AM, Yan, Zheng uker...@gmail.com wrote: On Mon, Aug 17, 2015 at 9:38 AM, Eric Eastman eric.east...@keepertech.com wrote: Hi, I need to verify in Ceph v9.0.2 if the kernel version of Ceph file system supports ACLs and the libcephfs file system interface does not.

Re: [ceph-users] Repair inconsistent pgs..

2015-08-18 Thread Gregory Farnum
From a quick peek it looks like some of the OSDs are missing clones of objects. I'm not sure how that could happen, and I'd expect the pg repair to handle it, but if it's not, there's probably something wrong; what version of Ceph are you running? Sam, is this something you've seen, a new bug, or

Re: [ceph-users] Testing CephFS

2015-08-21 Thread Gregory Farnum
On Thu, Aug 20, 2015 at 11:07 AM, Simon Hallam s...@pml.ac.uk wrote: Hey all, We are currently testing CephFS on a small (3 node) cluster. The setup is currently: Each server has 12 OSDs, 1 Monitor and 1 MDS running on it: The servers are running: 0.94.2-0.el7 The clients are

Re: [ceph-users] Object Storage and POSIX Mix

2015-08-21 Thread Gregory Farnum
On Fri, Aug 21, 2015 at 10:27 PM, Scottix scot...@gmail.com wrote: I saw this article on Linux Today and immediately thought of Ceph. http://www.enterprisestorageforum.com/storage-management/object-storage-vs.-posix-storage-something-in-the-middle-please-1.html I was thinking would it

Re: [ceph-users] Testing CephFS

2015-08-24 Thread Gregory Farnum
. Cheers, Simon -Original Message- From: Gregory Farnum [mailto:gfar...@redhat.com] Sent: 21 August 2015 12:16 To: Simon Hallam Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Testing CephFS On Thu, Aug 20, 2015 at 11:07 AM, Simon Hallam s...@pml.ac.uk wrote: Hey all

Re: [ceph-users] Weird behaviour of cephfs with samba

2015-07-28 Thread Gregory Farnum
On Mon, Jul 27, 2015 at 6:25 PM, Jörg Henne henn...@gmail.com wrote: Gregory Farnum greg@... writes: Yeah, I think there were some directory listing bugs in that version that Samba is probably running into. They're fixed in a newer kernel release (I'm not sure which one exactly, sorry). Ok

Re: [ceph-users] hadoop on ceph

2015-07-28 Thread Gregory Farnum
On Mon, Jul 27, 2015 at 6:34 PM, Patrick McGarry pmcga...@redhat.com wrote: Moving this to the ceph-user list where it has a better chance of being answered. On Mon, Jul 27, 2015 at 5:35 AM, jingxia@baifendian.com jingxia@baifendian.com wrote: Dear , I have questions to ask. The

Re: [ceph-users] OSD RAM usage values

2015-07-28 Thread Gregory Farnum
On Tue, Jul 28, 2015 at 11:00 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: On 07/17/2015 02:50 PM, Gregory Farnum wrote: On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi all, I've read in the documentation that OSDs use around 512MB

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-07-28 Thread Gregory Farnum
On Tue, Jul 28, 2015 at 8:01 AM, Burkhard Linke burkhard.li...@computational.bio.uni-giessen.de wrote: Hi, On 07/27/2015 05:42 PM, Gregory Farnum wrote: On Mon, Jul 27, 2015 at 4:33 PM, Burkhard Linke burkhard.li...@computational.bio.uni-giessen.de wrote: Hi, the nfs-ganesha

Re: [ceph-users] Recovery question

2015-07-29 Thread Gregory Farnum
of your solution, but I'm not sure how much of can do on its own. What's your end goal? -- Peter Hinman On 7/29/2015 1:57 PM, Gregory Farnum wrote: This sounds like you're trying to reconstruct a cluster after destroying the monitors. That is...not going to work well. The monitors define

Re: [ceph-users] Recovery question

2015-07-29 Thread Gregory Farnum
This sounds like you're trying to reconstruct a cluster after destroying the monitors. That is...not going to work well. The monitors define the cluster and you can't move OSDs into different clusters. We have ideas for how to reconstruct monitors and it can be done manually with a lot of hassle,

Re: [ceph-users] Recovery question

2015-07-29 Thread Gregory Farnum
This sounds odd. Can you create a ticket in the tracker with all the details you can remember or reconstruct? -Greg On Wed, Jul 29, 2015 at 8:34 PM Steve Taylor steve.tay...@storagecraft.com wrote: I recently migrated 240 OSDs to new servers this way in a single cluster, and it worked great.

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-07-27 Thread Gregory Farnum
On Mon, Jul 27, 2015 at 4:33 PM, Burkhard Linke burkhard.li...@computational.bio.uni-giessen.de wrote: Hi, the nfs-ganesha documentation states: ... This FSAL links to a modified version of the CEPH library that has been extended to expose its distributed cluster and replication facilities

Re: [ceph-users] Weird behaviour of cephfs with samba

2015-07-27 Thread Gregory Farnum
What's the full stack you're using to run this with? If you're using the kernel client, try updating it or switching to the userspace (ceph-fuse, or Samba built-in) client. If using userspace, please make sure you've got the latest one. -Greg On Mon, Jul 27, 2015 at 3:16 PM, Jörg Henne

Re: [ceph-users] Weird behaviour of cephfs with samba

2015-07-27 Thread Gregory Farnum
On Mon, Jul 27, 2015 at 5:46 PM, Jörg Henne henn...@gmail.com wrote: Gregory Farnum greg@... writes: What's the full stack you're using to run this with? If you're using the kernel client, try updating it or switching to the userspace (ceph-fuse, or Samba built-in) client. If using userspace

Re: [ceph-users] Cephfs and ERESTARTSYS on writes

2015-07-23 Thread Gregory Farnum
On Thu, Jul 23, 2015 at 1:17 PM, Vedran Furač vedran.fu...@gmail.com wrote: Hello, I'm having an issue with nginx writing to cephfs. Often I'm getting: writev() /home/ceph/temp/44/94/1/119444 failed (4: Interrupted system call) while reading upstream looking with strace, this happens:

Re: [ceph-users] Ceph Tech Talk next week

2015-07-21 Thread Gregory Farnum
On Tue, Jul 21, 2015 at 6:09 PM, Patrick McGarry pmcga...@redhat.com wrote: Hey cephers, Just a reminder that the Ceph Tech Talk on CephFS that was scheduled for last month (and cancelled due to technical difficulties) has been rescheduled for this month's talk. It will be happening next

Re: [ceph-users] Ceph 0.94 (and lower) performance on 1 hosts ??

2015-07-22 Thread Gregory Farnum
We might also be able to help you improve or better understand your results if you can tell us exactly what tests you're conducting that are giving you these numbers. -Greg On Wed, Jul 22, 2015 at 4:44 AM, Florent MONTHEL fmont...@flox-arts.net wrote: Hi Frederic, When you have Ceph cluster

Re: [ceph-users] Clients' connection for concurrent access to ceph

2015-07-23 Thread Gregory Farnum
On Wed, Jul 22, 2015 at 8:39 PM, Shneur Zalman Mattern shz...@eimsys.co.il wrote: Workaround... We're now building a huge computing cluster of 140 diskless compute nodes, and they pull a lot of computing data from storage concurrently. Users that put jobs on the cluster also need access

Re: [ceph-users] ceph-mon cpu usage

2015-07-23 Thread Gregory Farnum
On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote: The ceph-mon is already taking a lot of memory, and I ran a heap stats MALLOC: 32391696 ( 30.9 MiB) Bytes in use by application MALLOC: + 27597135872 (26318.7
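[The numbers above come from tcmalloc's heap profiler; if most of that memory turns out to be sitting in the allocator's freelist, asking tcmalloc to hand it back is a common first step (monitor id is a placeholder):
  ceph tell mon.ceph01 heap stats     # show allocator statistics
  ceph tell mon.ceph01 heap release   # return freed memory to the OS ]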

Re: [ceph-users] Ceph 0.94 (and lower) performance on 1 hosts ??

2015-07-23 Thread Gregory Farnum
I'm not sure. It looks like Ceph and your disk controllers are doing basically the right thing since you're going from 1GB/s to 420MB/s when moving from dd to Ceph (the full data journaling cuts it in half), but just fyi that dd task is not doing nearly the same thing as Ceph does — you'd need to
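[As a rough illustration of why the dd number isn't directly comparable: a run that waits for the data to reach the disk is much closer to what Ceph's journaling pays for than a plain buffered write (path and sizes are placeholders):
  # buffered write: largely measures the page cache
  dd if=/dev/zero of=/data/ddtest bs=4M count=1024
  # flushed write: reports only after the data is on the device
  dd if=/dev/zero of=/data/ddtest bs=4M count=1024 conv=fdatasync ]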

Re: [ceph-users] osd_agent_max_ops relating to number of OSDs in the cache pool

2015-07-22 Thread Gregory Farnum
On Sat, Jul 18, 2015 at 10:25 PM, Nick Fisk n...@fisk.me.uk wrote: Hi All, I’m doing some testing on the new High/Low speed cache tiering flushing and I’m trying to get my head round the effect that changing these 2 settings have on the flushing speed. When setting the osd_agent_max_ops to

Re: [ceph-users] OSD RAM usage values

2015-07-17 Thread Gregory Farnum
On Fri, Jul 17, 2015 at 1:13 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi all, I've read in the documentation that OSDs use around 512MB on a healthy cluster.(http://ceph.com/docs/master/start/hardware-recommendations/#ram) Now, our OSD's are all using around 2GB of RAM memory

Re: [ceph-users] 10d

2015-07-17 Thread Gregory Farnum
at 11:09 AM, Dan van der Ster d...@vanderster.com wrote: On Wed, Jun 17, 2015 at 10:52 AM, Gregory Farnum g...@gregs42.com wrote: On Wed, Jun 17, 2015 at 8:56 AM, Dan van der Ster d...@vanderster.com wrote: Hi, After upgrading to 0.94.2 yesterday on our test cluster, we've had 3 PGs go

Re: [ceph-users] Firefly 0.80.10 ready to upgrade to?

2015-07-13 Thread Gregory Farnum
On Mon, Jul 13, 2015 at 11:25 AM, Kostis Fardelas dante1...@gmail.com wrote: Hello, it seems that new packages for firefly have been uploaded to repo. However, I can't find any details in Ceph Release notes. There is only one thread in ceph-devel [1], but it is not clear what this new version

Re: [ceph-users] CephFS kernel client reboots on write

2015-07-13 Thread Gregory Farnum
On Mon, Jul 13, 2015 at 9:49 AM, Ilya Dryomov idryo...@gmail.com wrote: On Fri, Jul 10, 2015 at 9:36 PM, Jan Pekař jan.pe...@imatic.cz wrote: Hi all, I think I found a bug in cephfs kernel client. When I create directory in cephfs and set layout to ceph.dir.layout=stripe_unit=1073741824

Re: [ceph-users] xattrs vs omap

2015-07-14 Thread Gregory Farnum
On Tue, Jul 14, 2015 at 10:53 AM, Jan Schermer j...@schermer.cz wrote: Thank you for your reply. Comments inline. I’m still hoping to get some more input, but there are many people running ceph on ext4, and it sounds like it works pretty good out of the box. Maybe I’m overthinking this,

Re: [ceph-users] ceph daemons stucked in FUTEX_WAIT syscall

2015-07-14 Thread Gregory Farnum
with the journal ops. Simion Rad. From: Gregory Farnum [g...@gregs42.com] Sent: Tuesday, July 14, 2015 12:38 To: Simion Rad Cc: ceph-us...@ceph.com Subject: Re: [ceph-users] ceph daemons stucked in FUTEX_WAIT syscall On Mon, Jul 13, 2015 at 11:00 PM

Re: [ceph-users] ceph daemons stucked in FUTEX_WAIT syscall

2015-07-14 Thread Gregory Farnum
On Mon, Jul 13, 2015 at 11:00 PM, Simion Rad simion@yardi.com wrote: Hi , I'm running a small cephFS ( 21 TB , 16 OSDs having different sizes between 400G and 3.5 TB ) cluster that is used as a file warehouse (both small and big files). Every day there are times when a lot of processes

Re: [ceph-users] Failures with Ceph without redundancy/replication

2015-07-16 Thread Gregory Farnum
On Thu, Jul 16, 2015 at 11:58 AM, Vedran Furač vedran.fu...@gmail.com wrote: Hello, I'm experimenting with ceph for caching, it's configured with size=1 (so no redundancy/replication) and exported via cephfs to clients, now I'm wondering what happens is an SSD dies and all of its data is

Re: [ceph-users] backing Hadoop with Ceph ??

2015-07-16 Thread Gregory Farnum
On Wed, Jul 15, 2015 at 10:50 PM, John Spray john.sp...@redhat.com wrote: On 15/07/15 16:57, Shane Gibson wrote: We are in the (very) early stages of considering testing backing Hadoop via Ceph - as opposed to HDFS. I've seen a few very vague references to doing that, but haven't found

Re: [ceph-users] ceph failure on sf.net?

2015-07-20 Thread Gregory Farnum
We responded immediately and confirmed the issue was related to filesystem corruption on our storage platform. This incident impacted all block devices on our Ceph cluster. Just guessing from that, I bet they lost power and discovered their local filesystems/disks were misconfigured to not be

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-23 Thread Gregory Farnum
On Fri, Oct 23, 2015 at 7:08 AM, Burkhard Linke <burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > > On 10/14/2015 06:32 AM, Gregory Farnum wrote: >> >> On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke >> <burkhard.li...@computational.bio.uni-gi

Re: [ceph-users] pg incomplete state

2015-10-21 Thread Gregory Farnum
On Tue, Oct 20, 2015 at 7:22 AM, John-Paul Robinson wrote: > Hi folks > > I've been rebuilding drives in my cluster to add space. This has gone > well so far. > > After the last batch of rebuilds, I'm left with one placement group in > an incomplete state. > > [sudo] password for

Re: [ceph-users] pg incomplete state

2015-10-21 Thread Gregory Farnum
alternative idea I had was to take osd.30 back out of the cluster so > that pg 3.ae [30,11] would get mapped to some other osd to maintain > replication. This seems a bit heavy handed though, given that only this > one pg is affected. > > Thanks for any follow up. > > ~jpr

Re: [ceph-users] Benchmark individual OSD's

2015-10-29 Thread Gregory Farnum
You can also extend that command line to specify specific block and total sizes. Check the help text. :) -Greg On Thursday, October 29, 2015, Lindsay Mathieson < lindsay.mathie...@gmail.com> wrote: > > On 29 October 2015 at 19:24, Burkhard Linke < >

Re: [ceph-users] Our 0.94.2 OSD are not restarting : osd/PG.cc: 2856: FAILED assert(values.size() == 1)

2015-10-27 Thread Gregory Farnum
You might see if http://tracker.ceph.com/issues/13060 could apply to your cluster. If so upgrading to .94.4 should fix it. *Don't* reset your OSD journal. That is never the answer and is basically the same as trashing the OSD in question. -Greg On Tue, Oct 27, 2015 at 9:59 AM, Laurent GUERBY

Re: [ceph-users] PGs stuck in active+clean+replay

2015-10-27 Thread Gregory Farnum
On Tue, Oct 27, 2015 at 11:03 AM, Gregory Farnum <gfar...@redhat.com> wrote: > On Thu, Oct 22, 2015 at 3:58 PM, Andras Pataki > <apat...@simonsfoundation.org> wrote: >> Hi ceph users, >> >> We’ve upgraded to 0.94.4 (all ceph daemons got restarted) – an

Re: [ceph-users] PGs stuck in active+clean+replay

2015-10-27 Thread Gregory Farnum
On Tue, Oct 27, 2015 at 11:22 AM, Andras Pataki wrote: > Hi Greg, > > No, unfortunately I haven¹t found any resolution to it. We are using > cephfs, the whole installation is on 0.94.4. What I did notice is that > performance is extremely poor when backfilling is

Re: [ceph-users] PGs stuck in active+clean+replay

2015-10-27 Thread Gregory Farnum
On Thu, Oct 22, 2015 at 3:58 PM, Andras Pataki wrote: > Hi ceph users, > > We’ve upgraded to 0.94.4 (all ceph daemons got restarted) – and are in the > middle of doing some rebalancing due to crush changes (removing some disks). > During the rebalance, I see that

Re: [ceph-users] CephFS and page cache

2015-10-29 Thread Gregory Farnum
On Wed, Oct 28, 2015 at 8:38 PM, Yan, Zheng wrote: > On Thu, Oct 29, 2015 at 1:10 AM, Burkhard Linke >> I tried to dig into the ceph-fuse code, but I was unable to find the >> fragment that is responsible for flushing the data from the page cache. >> > > fuse kernel code

Re: [ceph-users] values of "ceph daemon osd.x perf dump objecters " are zero

2015-10-28 Thread Gregory Farnum
[ Removed ceph-devel ] On Wednesday, October 28, 2015, Libin Wu wrote: > Hi, all > > As I understand it, command "ceph daemon osd.x perf dump objecters" should > output the perf data of osdc (librados). But when I use this command, > why are all those values zero except
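[For anyone following along, those counters are read from the OSD admin socket; the osd id here is a placeholder:
  ceph daemon osd.0 perf dump     # all counter sections, including the objecter(s)
  ceph daemon osd.0 perf schema   # names and descriptions of every counter ]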

Re: [ceph-users] ceph-fuse and its memory usage

2015-10-21 Thread Gregory Farnum
On Tue, Oct 13, 2015 at 10:09 PM, Goncalo Borges wrote: > Hi all... > > Thank you for the feedback, and I am sorry for my delay in replying. > > 1./ Just to recall the problem, I was testing cephfs using fio in two > ceph-fuse clients: > > - Client A is in the same

Re: [ceph-users] CephFS file to rados object mapping

2015-10-21 Thread Gregory Farnum
On Wed, Oct 14, 2015 at 7:20 PM, Francois Lafont <flafdiv...@free.fr> wrote: > Hi, > > On 14/10/2015 06:45, Gregory Farnum wrote: > >>> Ok, however during my tests I had been careful to replace the correct >>> file by a bad file with *exactly* the same size (

Re: [ceph-users] ceph-fuse and its memory usage

2015-10-22 Thread Gregory Farnum
On Thu, Oct 22, 2015 at 1:59 AM, Yan, Zheng wrote: > direct IO only bypass kernel page cache. data still can be cached in > ceph-fuse. If I'm correct, the test repeatedly writes data to 8M > files. The cache make multiple write assimilate into single OSD > write Ugh, of

Re: [ceph-users] why was osd pool default size changed from 2 to 3.

2015-10-23 Thread Gregory Farnum
On Fri, Oct 23, 2015 at 8:17 AM, Stefan Eriksson wrote: > Hi > > I have been looking for info about "osd pool default size" and the reason > it's 3 by default. > > I see it got changed in v0.82 from 2 to 3, > > Here it's 2. >
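[For reference, the default in question and its per-pool override look roughly like this (pool name is illustrative):
  [global]
  osd pool default size = 3       # replicas kept per object
  osd pool default min size = 2   # minimum replicas needed to serve I/O
  # override on an existing pool
  ceph osd pool set rbd size 3 ]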

Re: [ceph-users] cephfs best practice

2015-10-21 Thread Gregory Farnum
On Wed, Oct 21, 2015 at 3:12 PM, Erming Pei wrote: > Hi, > > I am just wondering which use case is better: (within one single file > system) set up one data pool for each project, or let project to share a big > pool? I don't think anybody has that kind of operational

Re: [ceph-users] ceph-fuse crush

2015-10-21 Thread Gregory Farnum
On Thu, Oct 15, 2015 at 10:41 PM, 黑铁柱 wrote: > > cluster info: >cluster b23b48bf-373a-489c-821a-31b60b5b5af0 > health HEALTH_OK > monmap e1: 3 mons at > {node1=192.168.0.207:6789/0,node2=192.168.0.208:6789/0,node3=192.168.0.209:6789/0}, > election epoch 24,

Re: [ceph-users] CephFS and page cache

2015-10-21 Thread Gregory Farnum
On Sun, Oct 18, 2015 at 8:27 PM, Yan, Zheng wrote: > On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke > wrote: >> Hi, >> >> I've noticed that CephFS (both ceph-fuse and kernel client in version 4.2.3) >> remove files from page

Re: [ceph-users] cephfs: Client hp-s3-r4-compute failing to respondtocapabilityrelease

2015-11-09 Thread Gregory Farnum
On Mon, Nov 9, 2015 at 6:57 AM, Burkhard Linke wrote: > Hi, > > On 11/09/2015 02:07 PM, Burkhard Linke wrote: >> >> Hi, > > *snipsnap* > >> >> >> Cluster is running Hammer 0.94.5 on top of Ubuntu 14.04. Clients use >> ceph-fuse with patches for

Re: [ceph-users] Permanent MDS restarting under load

2015-11-10 Thread Gregory Farnum
On Tue, Nov 10, 2015 at 6:32 AM, Oleksandr Natalenko wrote: > Hello. > > We have CephFS deployed over Ceph cluster (0.94.5). > > We experience constant MDS restarting under high IOPS workload (e.g. > rsyncing lots of small mailboxes from another storage to CephFS using >

Re: [ceph-users] cephfs: Client hp-s3-r4-compute failing torespondtocapabilityrelease

2015-11-10 Thread Gregory Farnum
Can you dump the metadata ops in flight on each ceph-fuse when it hangs? ceph daemon mds_requests -Greg On Mon, Nov 9, 2015 at 8:06 AM, Burkhard Linke <burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > > On 11/09/2015 04:03 PM, Gregory Farnum wrote: >>
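[The command is issued against each client's admin socket; the socket path below is only an example and depends on how the client was started:
  ceph daemon /var/run/ceph/ceph-client.admin.asok mds_requests ]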

Re: [ceph-users] crush rule with two parts

2015-11-09 Thread Gregory Farnum
On Mon, Nov 9, 2015 at 9:42 AM, Deneau, Tom wrote: > I don't have much experience with crush rules but wanted one that does the > following: > > On a 3-node cluster, I wanted a rule where I could have an erasure-coded pool > of k=3,m=2 > and where the first 3 chunks (the

Re: [ceph-users] Seeing which Ceph version OSD/MON data is

2015-11-09 Thread Gregory Farnum
The daemons print this in their debug logs on every boot. (There might be a minimum debug level required, but I think it's at 0!) -Greg On Mon, Nov 9, 2015 at 7:23 AM, Wido den Hollander wrote: > Hi, > > Recently I got my hands on a Ceph cluster which was pretty damaged due > to a

Re: [ceph-users] Erasure coded pools and 'feature set mismatch' issue

2015-11-08 Thread Gregory Farnum
With that release it shouldn't be the EC pool causing trouble; it's the CRUSH tunables also mentioned in that thread. Instructions should be available in the docs for using older tunables that are compatible with kernel 3.13. -Greg On Saturday, November 7, 2015, Bogdan SOLGA
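[A sketch of what adjusting the tunables looks like; which profile is safe depends on the exact kernel, so check the compatibility notes first (the bobtail profile is assumed here for a 3.13-era kernel):
  ceph osd crush show-tunables      # what the cluster currently uses
  ceph osd crush tunables bobtail   # switch to an older, more widely supported profile ]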

Re: [ceph-users] all pgs of erasure coded pool stuck stale

2015-11-13 Thread Gregory Farnum
Somebody else will need to do the diagnosis, but it'll help them if you can get logs with "debug ms = 1", "debug osd = 20" in the log. Based on the required features update in the crush map, it looks like maybe you've upgraded some of your OSDs — is that a thing happening right now? Perhaps you
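[Those debug levels can be raised on the running daemons without a restart, for example:
  ceph tell osd.* injectargs '--debug-ms 1 --debug-osd 20'
The same settings can instead be placed in the [osd] section of ceph.conf ("debug ms = 1", "debug osd = 20") before restarting the daemons.]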

Re: [ceph-users] Ceph object mining

2015-11-13 Thread Gregory Farnum
I think I saw somebody working on a RADOS interface to Apache Hadoop once, maybe search for that? Your other option is to try and make use of object classes directly, but that's a bit primitive to build full map-reduce on top of without a lot of effort. -Greg On Friday, November 13, 2015, min

Re: [ceph-users] Using straw2 crush also with Hammer

2015-11-11 Thread Gregory Farnum
On Wednesday, November 11, 2015, Wido den Hollander wrote: > On 11/10/2015 09:49 PM, Vickey Singh wrote: > > On Mon, Nov 9, 2015 at 8:16 PM, Wido den Hollander > wrote: > > > >> On 11/09/2015 05:27 PM, Vickey Singh wrote: > >>> Hello Ceph Geeks > >>>

Re: [ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Gregory Farnum
On Wed, Nov 11, 2015 at 2:28 PM, Eric Eastman wrote: > On Wed, Nov 11, 2015 at 11:09 AM, John Spray wrote: >> On Wed, Nov 11, 2015 at 5:39 PM, Eric Eastman >> wrote: >>> I am trying to figure out why my Ceph file

Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Gregory Farnum
Regardless of what the crush tool does, I wouldn't muck around with the IDs of the OSDs. The rest of Ceph will probably not handle it well if the crush IDs don't match the OSD numbers. -Greg On Monday, November 2, 2015, Loris Cuoghi wrote: > Le 02/11/2015 12:47, Wido den

Re: [ceph-users] Changing CRUSH map ids

2015-11-02 Thread Gregory Farnum
t's id, and testing the two > maps with : > > crushtool -i crush.map --test --show-statistics --rule 0 --num-rep 3 --min-x > 1 --max-x $N --show-mappings > > (with $N varying from as little as 32 to "big numbers"TM) shows that nearly > the 50% of the mappings chang

Re: [ceph-users] SHA1 wrt hammer release and tag v0.94.3

2015-10-30 Thread Gregory Farnum
On Fri, Oct 30, 2015 at 6:20 PM, Artie Ziff wrote: > Hello, > > In the RELEASE INFORMATION section of the hammer v0.94.3 issue tracker [1] > the git commit SHA1 is: b2503b0e15c0b13f480f0835060479717b9cf935 > > On the github page for Ceph Release v0.94.3 [2], when I click on

Re: [ceph-users] Understanding the number of TCP connections between clients and OSDs

2015-11-04 Thread Gregory Farnum
On Wed, Nov 4, 2015 at 12:27 PM, Rick Balsano wrote: > Just following up since this thread went silent after a few comments showing > similar concerns, but no explanation of the behavior. Can anyone point to > some code or documentation which explains how to estimate the expected

Re: [ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Gregory Farnum
Blanc <rob...@leblancnet.us> wrote: > > > > -BEGIN PGP SIGNED MESSAGE- > > Hash: SHA256 > > > > Try: > > > > rados -p {cachepool} cache-flush-evict-all > > > > and see if the objects clean up. > > - --

Re: [ceph-users] Increased pg_num and pgp_num

2015-11-04 Thread Gregory Farnum
It shouldn't be -- if you changed pg_num then a bunch of PGs will need to move and will report in this state. We can check more thoroughly if you provide the full "Ceph -s" output. (Stuff to check for: that all PGs are active, none are degraded, etc) -Greg On Wednesday, November 4, 2015, Erming

Re: [ceph-users] data size less than 4 mb

2015-10-31 Thread Gregory Farnum
On Friday, October 30, 2015, mad Engineer wrote: > I am learning Ceph block storage and read that each object size is 4 MB. I > am not clear about the concepts of object storage; still, what will happen if > the actual size of data written to a block is less than 4 MB? Lets

Re: [ceph-users] Soft removal of RBD images

2015-11-06 Thread Gregory Farnum
On Fri, Nov 6, 2015 at 2:03 AM, Wido den Hollander wrote: > Hi, > > Since Ceph Hammer we can protect pools from being removed from the > cluster, but we can't protect against this: > > $ rbd ls|xargs -n 1 rbd rm > > That would remove all not opened RBD images from the cluster. > >

Re: [ceph-users] osd fails to start, rbd hangs

2015-11-06 Thread Gregory Farnum
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/ :) On Friday, November 6, 2015, Philipp Schwaha wrote: > Hi, > > I have an issue with my (small) ceph cluster after an osd failed. > ceph -s reports the following: > cluster

Re: [ceph-users] rados bench leaves objects in tiered pool

2015-11-03 Thread Gregory Farnum
When you have a caching pool in writeback mode, updates to objects (including deletes) are handled by writeback rather than writethrough. Since there's no other activity against these pools, there is nothing prompting the cache pool to flush updates out to the backing pool, so the backing pool
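[If the point is to see those updates reach the backing pool right away, the cache tier can be flushed by hand or given more aggressive thresholds; pool name and value here are placeholders:
  rados -p hot-cache cache-flush-evict-all                    # flush and evict everything now
  ceph osd pool set hot-cache cache_target_dirty_ratio 0.1    # flush dirty objects sooner ]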

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-13 Thread Gregory Farnum
On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke <burkhard.li...@computational.bio.uni-giessen.de> wrote: > Hi, > > On 10/08/2015 09:14 PM, John Spray wrote: >> >> On Thu, Oct 8, 2015 at 7:23 PM, Gregory Farnum <gfar...@redhat.com> wrote: >>> >>

Re: [ceph-users] CephFS file to rados object mapping

2015-10-13 Thread Gregory Farnum
On Fri, Oct 9, 2015 at 5:49 PM, Francois Lafont <flafdiv...@free.fr> wrote: > Hi, > > Thanks for your answer Greg. > > On 09/10/2015 04:11, Gregory Farnum wrote: > >> The size of the on-disk file didn't match the OSD's record of the >> object size, so it r

Re: [ceph-users] error while upgrading to infernalis last release on OSD serv

2015-10-19 Thread Gregory Farnum
As the infernalis release notes state, if you're upgrading you first need to step through the current hammer development branch or the (not-quite-released) 0.94.4. -Greg On Thu, Oct 15, 2015 at 7:27 AM, German Anders wrote: > Hi all, > > I'm trying to upgrade a ceph cluster

Re: [ceph-users] Ceph journal - isn't it a bit redundant sometimes?

2015-10-19 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 11:18 AM, Jan Schermer wrote: > I'm sorry for appearing a bit dull (on purpose), I was hoping I'd hear what > other people using Ceph think. > > If I were to use RADOS directly in my app I'd probably rejoice at its > capabilities and how useful and

Re: [ceph-users] CephFS namespace

2015-10-19 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 3:06 PM, Erming Pei wrote: > Hi, > >Is there a way to list the namespaces in cephfs? How to set it up? > >From man page of ceph.mount, I see this: > > To mount only part of the namespace: > > mount.ceph monhost1:/some/small/thing
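[A fuller version of that mount line, with placeholder monitor address, subtree and credentials:
  mount -t ceph mon1:6789:/some/small/thing /mnt/thing -o name=admin,secretfile=/etc/ceph/admin.secret ]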

Re: [ceph-users] CephFS namespace

2015-10-19 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 3:26 PM, Erming Pei wrote: > I see. That's also what I needed. > Thanks. > > Can we only allow a part of the 'namespace' or directory tree to be mounted > from server end? Just like NFS exporting? > And even setting of permissions as well? This just

Re: [ceph-users] CephFS file to rados object mapping

2015-10-08 Thread Gregory Farnum
e right copy of an object when scrubbing. If you have 3+ copies I'd recommend checking each of them and picking the one that's duplicated... -Greg > > Andras > > > On 9/29/15, 9:58 AM, "Gregory Farnum" <gfar...@redhat.com> wrote: > >>The formula for objec
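[For anyone checking this by hand: CephFS data objects are named from the file's inode number (in hex) plus a stripe index, and their placement can be queried per object; pool name and inode below are placeholders:
  rados -p cephfs_data stat 10000000000.00000000     # first object of the file
  ceph osd map cephfs_data 10000000000.00000000      # which PG/OSDs hold it ]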

Re: [ceph-users] Peering algorithm questions

2015-10-08 Thread Gregory Farnum
On Tue, Sep 29, 2015 at 12:08 AM, Balázs Kossovics wrote: > Hey! > > I'm trying to understand the peering algorithm based on [1] and [2]. There > are things that aren't really clear or I'm not entirely sure if I understood > them correctly, so I'd like to ask some

Re: [ceph-users] OSD reaching file open limit - known issues?

2015-10-08 Thread Gregory Farnum
On Fri, Sep 25, 2015 at 10:04 AM, Jan Schermer wrote: > I get that, even though I think it should be handled more gracefuly. > But is it expected to also lead to consistency issues like this? I don't think it's expected, but obviously we never reproduced it in the lab. Given
