Re: [ceph-users] Two osds are spamming dmesg every 900 seconds

2014-08-26 Thread Gregory Farnum
This is being output by one of the kernel clients, and it's just saying that the connections to those two OSDs have died from inactivity. Either the other OSD connections are used a lot more, or aren't used at all. In any case, it's not a problem; just a noisy notification. There's not much you
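The 900-second cadence matches the messenger's idle-connection timeout, which defaults to 900 seconds. As a sketch of how the interval could be tuned (option name per Ceph's documented defaults; verify against your release before relying on it):

```shell
# Shorten (or lengthen) idle-connection teardown; value is in seconds,
# 900 is the default. Applied live to all OSDs here:
ceph tell osd.* injectargs '--ms-tcp-read-timeout 600'
# or persist it in ceph.conf under [global]:
#   ms tcp read timeout = 600
```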

Re: [ceph-users] Ceph-fuse fails to mount

2014-08-26 Thread Gregory Farnum
In particular, we changed things post-Firefly so that the filesystem isn't created automatically. You'll need to set it up (and its pools, etc) explicitly to use it. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Aug 25, 2014 at 2:40 PM, Sean Crosby
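The explicit setup Greg refers to looks roughly like this on post-Firefly code (a sketch: pool names and PG counts are placeholders, and `ceph fs new` only exists in the newer development releases):

```shell
# Create the data and metadata pools, then the filesystem itself:
ceph osd pool create cephfs_data 64
ceph osd pool create cephfs_metadata 64
ceph fs new cephfs cephfs_metadata cephfs_data
# only after this will a ceph-fuse mount succeed:
ceph-fuse /mnt/cephfs
```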

Re: [ceph-users] MDS dying on Ceph 0.67.10

2014-08-26 Thread Gregory Farnum
I don't think the log messages you're showing are the actual cause of the failure. The log file should have a proper stack trace (with specific function references and probably a listed assert failure), can you find that? -Greg On Tue,

Re: [ceph-users] Fresh Firefly install degraded without modified default tunables

2014-08-26 Thread Gregory Farnum
e34: 12 osds: 12 up, 12 in pgmap v61: 832 pgs, 8 pools, 840 bytes data, 43 objects 403 MB used, 10343 MB / 10747 MB avail 43/86 objects degraded (50.000%) 832 active+degraded Thanks, Ripal On Aug 25, 2014, at 12:45 PM, Gregory Farnum g

Re: [ceph-users] Ceph-fuse fails to mount

2014-08-26 Thread Gregory Farnum
for documentation on the newer versions? (we're doing evaluations at present, so I had wanted to work with newer versions, since it would be closer to what we would end up using). -Original Message- From: Gregory Farnum [mailto:g...@inktank.com] Sent: Tuesday, August 26, 2014 4:05

Re: [ceph-users] error ioctl(BTRFS_IOC_SNAP_CREATE) failed: (17) File exists

2014-08-27 Thread Gregory Farnum
This looks new to me. Can you try and start up the OSD with debug osd = 20 and debug filestore = 20 in your conf, then put the log somewhere accessible? (You can also use ceph-post-file if it's too large for pastebin or something.) Also, check dmesg and see if btrfs is complaining, and see what
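The debug settings mentioned can be applied without a restart; a sketch (the osd id and log path are placeholders):

```shell
# Raise logging on the affected OSD, reproduce the failure, share the log:
ceph tell osd.0 injectargs '--debug-osd 20 --debug-filestore 20'
# equivalently, in ceph.conf under [osd] before starting the daemon:
#   debug osd = 20
#   debug filestore = 20
# for logs too large to pastebin:
ceph-post-file /var/log/ceph/ceph-osd.0.log
```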

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-08-27 Thread Gregory Farnum
On Tue, Aug 26, 2014 at 10:46 PM, John Morris j...@zultron.com wrote: In the docs [1], 'incomplete' is defined thusly: Ceph detects that a placement group is missing a necessary period of history from its log. If you see this state, report a bug, and try to start any failed OSDs that

Re: [ceph-users] RAID underlying a Ceph config

2014-08-28 Thread Gregory Farnum
There aren't too many people running RAID under Ceph, as it's a second layer of redundancy that in normal circumstances is a bit pointless. But there are scenarios where it might be useful. You might check the list archives for the anti-cephalopod question thread. -Greg

Re: [ceph-users] Ceph Filesystem - Production?

2014-08-28 Thread Gregory Farnum
On Thu, Aug 28, 2014 at 10:36 AM, Brian C. Huffman bhuff...@etinternational.com wrote: Is Ceph Filesystem ready for production servers? The documentation says it's not, but I don't see that mentioned anywhere else. http://ceph.com/docs/master/cephfs/ Everybody has their own standards, but

Re: [ceph-users] MSWin CephFS

2014-08-28 Thread Gregory Farnum
On Thu, Aug 28, 2014 at 10:41 AM, LaBarre, James (CTR) A6IT james.laba...@cigna.com wrote: Just out of curiosity, is there a way to mount a Ceph filesystem directly on a MSWindows system (2008 R2 server)? Just wanted to try something out from a VM. Nope, sorry. -Greg

Re: [ceph-users] 'incomplete' PGs: what does it mean?

2014-08-29 Thread Gregory Farnum
, 5, 7], down_osds_we_would_probe: [], peering_blocked_by: []}, { name: Started, enter_time: 2014-08-29 01:22:50.132784}]} On Wed, Aug 27, 2014 at 12:40 PM, Gregory Farnum g...@inktank.com wrote: On Tue, Aug 26, 2014

Re: [ceph-users] question about monitor and paxos relationship

2014-08-29 Thread Gregory Farnum
On Thu, Aug 28, 2014 at 9:52 PM, pragya jain prag_2...@yahoo.co.in wrote: I have some basic question about monitor and paxos relationship: As the documents says, Ceph monitor contains cluster map, if there is any change in the state of the cluster, the change is updated in the cluster map.

Re: [ceph-users] Misdirected client messages

2014-09-03 Thread Gregory Farnum
The clients are sending messages to OSDs which are not the primary for the data. That shouldn't happen — clients which don't understand the whole osdmap ought to be gated and prevented from accessing the cluster at all. What version of Ceph are you running, and what clients? (We've seen this in

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 10:08 AM, JIten Shah jshah2...@me.com wrote: While checking the health of the cluster, I ran into the following error: warning: health HEALTH_WARN too few pgs per osd (1 min 20) When I checked the pg and pgp numbers, I saw the value was the default value of 64 ceph osd
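Raising the counts goes through `ceph osd pool set`; a sketch with a placeholder pool name and target (pg_num can only be increased, and pgp_num should follow it):

```shell
ceph osd pool set rbd pg_num 256
ceph osd pool set rbd pgp_num 256   # rebalances data onto the new PGs
```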

Re: [ceph-users] Updating the pg and pgp values

2014-09-08 Thread Gregory Farnum
, at 10:31 AM, Gregory Farnum g...@inktank.com wrote: On Mon, Sep 8, 2014 at 10:08 AM, JIten Shah jshah2...@me.com wrote: While checking the health of the cluster, I ran into the following error: warning: health HEALTH_WARN too few pgs per osd (1 min 20) When I checked the pg and pgp numbers, I

Re: [ceph-users] Delays while waiting_for_osdmap according to dump_historic_ops

2014-09-08 Thread Gregory Farnum
On Sun, Sep 7, 2014 at 4:28 PM, Alex Moore a...@lspeed.org wrote: I recently found out about the ceph --admin-daemon /var/run/ceph/ceph-osd.id.asok dump_historic_ops command, and noticed something unexpected in the output on my cluster, after checking numerous output samples... It looks to

Re: [ceph-users] osd crash: trim_objectcould not find coid

2014-09-08 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 1:42 AM, Francois Deppierraz franc...@ctrlaltdel.ch wrote: Hi, This issue is on a small 2 servers (44 osds) ceph cluster running 0.72.2 under Ubuntu 12.04. The cluster was filling up (a few osds near full) and I tried to increase the number of pg per pool to 1024 for

Re: [ceph-users] osd crash: trim_objectcould not find coid

2014-09-08 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 2:53 PM, Francois Deppierraz franc...@ctrlaltdel.ch wrote: Hi Greg, Thanks for your support! On 08. 09. 14 20:20, Gregory Farnum wrote: The first one is not caused by the same thing as the ticket you reference (it was fixed well before emperor), so it appears

Re: [ceph-users] Remaped osd at remote restart

2014-09-09 Thread Gregory Farnum
On Mon, Sep 8, 2014 at 6:33 AM, Eduard Kormann ekorm...@dunkel.de wrote: Hello, have I missed something or is it a feature: When I restart a osd on the belonging server so it restarts normally: root@cephosd10:~# service ceph restart osd.76 === osd.76 === === osd.76 === Stopping Ceph

Re: [ceph-users] max_bucket limit -- safe to disable?

2014-09-09 Thread Gregory Farnum
On Tue, Sep 9, 2014 at 9:11 AM, Daniel Schneller daniel.schnel...@centerdevice.com wrote: Hi list! Under http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-September/033670.html I found a situation not unlike ours, but unfortunately either the list archive fails me or the discussion

Re: [ceph-users] max_bucket limit -- safe to disable?

2014-09-10 Thread Gregory Farnum
On Wednesday, September 10, 2014, Daniel Schneller daniel.schnel...@centerdevice.com wrote: On 09 Sep 2014, at 21:43, Gregory Farnum g...@inktank.com wrote: Yehuda can talk about this with more expertise than I can, but I think it should

Re: [ceph-users] CephFS roadmap (was Re: NAS on RBD)

2014-09-10 Thread Gregory Farnum
On Tue, Sep 9, 2014 at 6:10 PM, Blair Bethwaite blair.bethwa...@gmail.com wrote: Hi Sage, Thanks for weighing into this directly and allaying some concerns. It would be good to get a better understanding about where the rough edges are - if deployers have some knowledge of those then they

Re: [ceph-users] why one osd-op from client can get two osd-op-reply?

2014-09-10 Thread Gregory Farnum
any processing of the PG, it requires all participating members to respond before it sends any messages back to the client. -Greg greg, thanks very much On 2014-09-11 01:36:39, Gregory Farnum g...@inktank.com wrote: The important bit

Re: [ceph-users] osd cpu usage is bigger than 100%

2014-09-11 Thread Gregory Farnum
Presumably it's going faster when you have a deeper iodepth? So the reason it's using more CPU is because it's doing more work. That's all there is to it. (And the OSD uses a lot more CPU than some storage systems do, because it does a lot more work than them.) -Greg On Thursday, September 11,

Re: [ceph-users] why one osd-op from client can get two osd-op-reply?

2014-09-11 Thread Gregory Farnum
At 2014-09-11 12:19:18, Gregory Farnum g...@inktank.com wrote: On Wed, Sep 10, 2014 at 8:29 PM, yuelongguang fasts...@163.com wrote: as for ack and ondisk, ceph has size and min_size to decide

Re: [ceph-users] Cephfs upon Tiering

2014-09-11 Thread Gregory Farnum
On Thu, Sep 11, 2014 at 4:13 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi all, I am testing the tiering functionality with cephfs. I used a replicated cache with an EC data pool, and a replicated metadata pool like this: ceph osd pool create cache 1024 1024 ceph osd pool set
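The tiering arrangement being tested can be wired up along these lines (a sketch with placeholder names; the PG counts echo the quoted setup, and the EC profile is whatever was created beforehand):

```shell
# EC base pool plus replicated cache pool, then tie them together:
ceph osd pool create ecdata 128 128 erasure
ceph osd pool create cache 1024 1024
ceph osd tier add ecdata cache
ceph osd tier cache-mode cache writeback
ceph osd tier set-overlay ecdata cache
```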

Re: [ceph-users] Cephfs upon Tiering

2014-09-11 Thread Gregory Farnum
On Thu, Sep 11, 2014 at 11:39 AM, Sage Weil sw...@redhat.com wrote: On Thu, 11 Sep 2014, Gregory Farnum wrote: On Thu, Sep 11, 2014 at 4:13 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: Hi all, I am testing the tiering functionality with cephfs. I used a replicated cache

Re: [ceph-users] Upgraded now MDS won't start

2014-09-11 Thread Gregory Farnum
On Wed, Sep 10, 2014 at 4:24 PM, McNamara, Bradley bradley.mcnam...@seattle.gov wrote: Hello, This is my first real issue since running Ceph for several months. Here's the situation: I've been running an Emperor cluster for several months. All was good. I decided to upgrade since I'm

[ceph-users] Cephfs upon Tiering

2014-09-12 Thread Gregory Farnum
upon Tiering To: Gregory Farnum g...@inktank.com Cc: Kenneth Waegeman kenneth.waege...@ugent.be, ceph-users ceph-users@lists.ceph.com On Thu, 11 Sep 2014, Gregory Farnum wrote: On Thu, Sep 11, 2014 at 11:39 AM, Sage Weil sw...@redhat.com

Re: [ceph-users] Showing packet loss in ceph main log

2014-09-12 Thread Gregory Farnum
Ceph messages are transmitted over TCP, so the system isn't directly aware of packet loss at any level. I suppose we could try and export messenger reconnect counts via the admin socket, but that'd be a very noisy measure -- it seems simplest to just query the OS or hardware directly? -Greg On

Re: [ceph-users] a question regarding sparse file

2014-09-12 Thread Gregory Farnum
On Fri, Sep 12, 2014 at 9:26 AM, brandon li brandon.li@gmail.com wrote: Hi, I am new to ceph file system, and have got a newbie question: For a sparse file, how could ceph file system know the hole in the file was never created or some stripe was just simply lost? CephFS does not keep

Re: [ceph-users] CephFS : rm file does not remove object in rados

2014-09-12 Thread Gregory Farnum
On Fri, Sep 12, 2014 at 6:49 AM, Florent Bautista flor...@coppint.com wrote: Hi all, Today I have a problem using CephFS. I use firefly last release, with kernel 3.16 client (Debian experimental). I have a directory in CephFS, associated to a pool pool2 (with set_layout). All is working

Re: [ceph-users] osd crash: trim_objectcould not find coid

2014-09-12 Thread Gregory Farnum
00:23, Gregory Farnum wrote: On Mon, Sep 8, 2014 at 2:53 PM, Francois Deppierraz franc...@ctrlaltdel.ch wrote: Hi Greg, Thanks for your support! On 08. 09. 14 20:20, Gregory Farnum wrote: The first one is not caused by the same thing as the ticket you reference (it was fixed well before

Re: [ceph-users] Removing MDS

2014-09-12 Thread Gregory Farnum
You can turn off the MDS and create a new FS in new pools. The ability to shut down a filesystem more completely is coming in Giant. -Greg On Fri, Sep 12, 2014 at 1:16 PM, LaBarre, James (CTR) A6IT james.laba...@cigna.com wrote:

Re: [ceph-users] why no likely() and unlikely() used in Ceph's source code?

2014-09-15 Thread Gregory Farnum
I don't know where the file came from, but likely/unlikely markers are the kind of micro-optimization that isn't worth the cost in Ceph dev resources right now. -Greg On Monday, September 15, 2014, Tim Zhang cofol1...@gmail.com wrote: Hey guys, After reading ceph source code, I find that there

Re: [ceph-users] Dumpling cluster can't resolve peering failures, ceph pg query blocks, auth failures in logs

2014-09-15 Thread Gregory Farnum
Not sure, but have you checked the clocks on their nodes? Extreme clock drift often results in strange cephx errors. -Greg On Sun, Sep 14, 2014 at 11:03 PM, Florian Haas flor...@hastexo.com wrote: Hi everyone, [Keeping this on the

Re: [ceph-users] OSD troubles on FS+Tiering

2014-09-15 Thread Gregory Farnum
The pidfile bug is already fixed in master/giant branches. As for the crashing, I'd try killing all the osd processes and turning them back on again. It might just be some daemon restart failed, or your cluster could be sufficiently overloaded that the node disks are going unresponsive and

Re: [ceph-users] Cephfs upon Tiering

2014-09-15 Thread Gregory Farnum
of having a EC backed pool fronted by a replicated cache for use with cephfs. Thanks, Berant On Fri, Sep 12, 2014 at 12:37 PM, Gregory Farnum g...@inktank.com wrote: On Fri, Sep 12, 2014 at 1:53 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: - Message from Sage Weil sw

Re: [ceph-users] does CephFS still have no fsck utility?

2014-09-15 Thread Gregory Farnum
On Mon, Sep 15, 2014 at 3:23 PM, brandon li brandon.li@gmail.com wrote: If it's true, is there any other tools I can use to check and repair the file system? Not much, no. That said, you shouldn't really need an fsck unless the underlying RADOS store went through some catastrophic event. Is

Re: [ceph-users] does CephFS still have no fsck utility?

2014-09-15 Thread Gregory Farnum
15, 2014 at 3:49 PM, Gregory Farnum g...@inktank.com wrote: On Mon, Sep 15, 2014 at 3:23 PM, brandon li brandon.li@gmail.com wrote: If it's true, is there any other tools I can use to check and repair the file system? Not much, no. That said, you shouldn't really need an fsck unless

Re: [ceph-users] OSD troubles on FS+Tiering

2014-09-16 Thread Gregory Farnum
On Tue, Sep 16, 2014 at 5:28 AM, Kenneth Waegeman kenneth.waege...@ugent.be wrote: - Message from Gregory Farnum g...@inktank.com - Date: Mon, 15 Sep 2014 10:37:07 -0700 From: Gregory Farnum g...@inktank.com Subject: Re: [ceph-users

Re: [ceph-users] does CephFS still have no fsck utility?

2014-09-16 Thread Gregory Farnum
http://tracker.ceph.com/issues/4137 contains links to all the tasks we have so far. You can also search any of the ceph-devel list archives for forward scrub. On Mon, Sep 15, 2014 at 10:16 PM, brandon li brandon.li@gmail.com wrote: Great to know you are working on it! I am new to the

Re: [ceph-users] what are these files for mon?

2014-09-16 Thread Gregory Farnum
Greg, just picked up this one from the archive while researching a different issue and thought I'd follow up. On Tue, Aug 19, 2014 at 6:24 PM, Gregory Farnum g...@inktank.com wrote: The sst files are files used by leveldb to store its data; you cannot remove them. Are you

Re: [ceph-users] Mount ceph block device over specific NIC

2014-09-16 Thread Gregory Farnum
Assuming you're using the kernel? In any case, Ceph generally doesn't do anything to select between different NICs; it just asks for a connection to a given IP. So you should just be able to set up a route for that IP. -Greg On Tue,
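A host route is enough to pin that traffic to one NIC; a sketch with placeholder addresses, device, and image names:

```shell
# Send all traffic for the Ceph public network out eth1:
ip route add 10.0.42.0/24 dev eth1 src 10.0.42.5
# the kernel RBD client then follows the matching route automatically:
rbd map rbd/myimage
```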

Re: [ceph-users] Still seing scrub errors in .80.5

2014-09-16 Thread Gregory Farnum
On Tue, Sep 16, 2014 at 12:03 AM, Marc m...@shoowin.de wrote: Hello fellow cephalopods, every deep scrub seems to dig up inconsistencies (i.e. scrub errors) that we could use some help with diagnosing. I understand there used to be a data corruption issue before .80.3 so we made sure that

Re: [ceph-users] Still seing scrub errors in .80.5

2014-09-16 Thread Gregory Farnum
... See the thread firefly scrub error. Cheers, Dan From: Gregory Farnum g...@inktank.com Sent: Sep 16, 2014 8:15 PM To: Marc Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Still seing scrub errors in .80.5 On Tue, Sep 16, 2014 at 12:03 AM, Marc m...@shoowin.de wrote: Hello

Re: [ceph-users] Packages for 0.85?

2014-09-16 Thread Gregory Farnum
Thanks for the poke; looks like something went wrong during the release build last week. We're investigating now. -Greg On Tue, Sep 16, 2014 at 11:08 AM, Daniel Swarbrick daniel.swarbr...@profitbricks.com wrote: Hi, I saw that the development snapshot 0.85 was released last week, and have

Re: [ceph-users] Replication factor of 50 on a 1000 OSD node cluster

2014-09-16 Thread Gregory Farnum
On Tue, Sep 16, 2014 at 5:10 PM, JIten Shah jshah2...@me.com wrote: Hi Guys, We have a cluster with 1000 OSD nodes and 5 MON nodes and 1 MDS node. In order to be able to lose quite a few OSDs and still survive the load, we were thinking of making the replication factor to 50. Is that
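Whatever the failure-tolerance goal, the replica count divides raw capacity directly, which is what makes size=50 hard to justify; a quick sketch with hypothetical numbers:

```shell
raw_tb=1000   # e.g. 1000 OSDs of 1 TB each (hypothetical)
size=50       # proposed replication factor
echo "usable: $(( raw_tb / size )) TB"   # only 2% of raw capacity remains
```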

Re: [ceph-users] Replication factor of 50 on a 1000 OSD node cluster

2014-09-16 Thread Gregory Farnum
On Sep 16, 2014, at 5:35 PM, Gregory Farnum g...@inktank.com wrote: On Tue, Sep 16, 2014 at 5:10 PM, JIten Shah jshah2...@me.com wrote: Hi Guys, We have a cluster with 1000 OSD nodes and 5 MON nodes and 1 MDS node. In order to be able to loose quite a few OSD’s and still survive the load, we

Re: [ceph-users] [Ceph-community] Can't Start-up MDS

2014-09-17 Thread Gregory Farnum
That looks like the beginning of an mds creation to me. What's your problem in more detail, and what's the output of ceph -s? -Greg On Mon, Sep 15, 2014 at 5:34 PM, Shun-Fa Yang shu...@gmail.com wrote: Hi all, I installed ceph v

Re: [ceph-users] ceph mds unable to start with 0.85

2014-09-18 Thread Gregory Farnum
On Wed, Sep 17, 2014 at 9:59 PM, 廖建锋 de...@f-club.cn wrote: dear, my ceph cluster worked for about two weeks, mds crashed every 2-3 days, Now it stuck on replay , looks like replay crash and restart mds process again what can i do for this? 1015 = # ceph -s cluster

Re: [ceph-users] Still seing scrub errors in .80.5

2014-09-18 Thread Gregory Farnum
On Thu, Sep 18, 2014 at 3:09 AM, Marc m...@shoowin.de wrote: Hi, we did run a deep scrub on everything yesterday, and a repair afterwards. Then a new deep scrub today, which brought new scrub errors. I did check the osd config, they report filestore_xfs_extsize: false, as it should be if I

Re: [ceph-users] CephFS : rm file does not remove object in rados

2014-09-18 Thread Gregory Farnum
On Thu, Sep 18, 2014 at 10:39 AM, Florent B flor...@coppint.com wrote: On 09/12/2014 07:38 PM, Gregory Farnum wrote: On Fri, Sep 12, 2014 at 6:49 AM, Florent Bautista flor...@coppint.com wrote: Hi all, Today I have a problem using CephFS. I use firefly last release, with kernel 3.16 client

Re: [ceph-users] [Ceph-community] Can't Start-up MDS

2014-09-18 Thread Gregory Farnum
1:22 GMT+08:00 Gregory Farnum g...@inktank.com: That looks like the beginning of an mds creation to me. What's your problem in more detail, and what's the output of ceph -s? -Greg On Mon, Sep 15, 2014 at 5:34 PM, Shun-Fa Yang shu

Re: [ceph-users] ceph mds unable to start with 0.85

2014-09-18 Thread Gregory Farnum
would you like to log into the server to check? From: Gregory Farnum Date: 2014-09-19 02:33 To: 廖建锋 CC: ceph-users Subject: Re: [ceph-users] ceph mds unable to start with 0.85 On Wed, Sep 17, 2014 at 9:59 PM, 廖建锋 de...@f-club.cn wrote: dear, my ceph cluster worked for about two

Re: [ceph-users] Renaming pools used by CephFS

2014-09-19 Thread Gregory Farnum
On Fri, Sep 19, 2014 at 10:21 AM, Jeffrey Ollie j...@ocjtech.us wrote: I've got a Ceph system (running 0.80.5) at home that I've been messing around with, partly to learn Ceph, but also as reliable storage for all of my media. During the process I deleted the data and metadata pools used by

Re: [ceph-users] Reassigning admin server

2014-09-23 Thread Gregory Farnum
On Mon, Sep 22, 2014 at 1:22 PM, LaBarre, James (CTR) A6IT james.laba...@cigna.com wrote: If I have a machine/VM I am using as an Admin node for a ceph cluster, can I relocate that admin to another machine/VM after I’ve built a cluster? I would expect as the Admin isn’t an actual

Re: [ceph-users] pgs stuck in active+clean+replay state

2014-09-25 Thread Gregory Farnum
I imagine you aren't actually using the data/metadata pool that these PGs are in, but it's a previously-reported bug we haven't identified: http://tracker.ceph.com/issues/8758 They should go away if you restart the OSDs that host them (or just remove those pools), but it's not going to hurt

Re: [ceph-users] PG stuck creating

2014-09-30 Thread Gregory Farnum
rebuilt the primary OSD (29) in the hopes it would unblock whatever it was, but no luck. I'll check the admin socket and see if there is anything I can find there. On Tue, Sep 30, 2014 at 10:36 AM, Gregory Farnum g...@inktank.com wrote: On Tuesday, September 30, 2014, Robert LeBlanc rob

Re: [ceph-users] PG stuck creating

2014-09-30 Thread Gregory Farnum
On Tuesday, September 30, 2014, Robert LeBlanc rob...@leblancnet.us wrote: On our dev cluster, I've got a PG that won't create. We had a host fail with 10 OSDs that needed to be rebuilt. A number of other OSDs were down for a few days (did I mention this was a dev cluster?). The other OSDs

Re: [ceph-users] Why performance of benchmarks with small blocks is extremely small?

2014-10-01 Thread Gregory Farnum
On Wed, Oct 1, 2014 at 5:24 AM, Andrei Mikhailovsky and...@arhont.com wrote: Timur, As far as I know, the latest master has a number of improvements for ssd disks. If you check the mailing list discussion from a couple of weeks back, you can see that the latest stable firefly is not that well

Re: [ceph-users] Why performance of benchmarks with small blocks is extremely small?

2014-10-01 Thread Gregory Farnum
On Wed, Oct 1, 2014 at 7:07 AM, Andrei Mikhailovsky and...@arhont.com wrote: Greg, are they going to be a part of the next stable release? Cheers From: Gregory Farnum g...@inktank.com To: Andrei Mikhailovsky and...@arhont.com Cc: Timur

Re: [ceph-users] Why performance of benchmarks with small blocks is extremely small?

2014-10-01 Thread Gregory Farnum
On Wed, Oct 1, 2014 at 9:21 AM, Mark Nelson mark.nel...@inktank.com wrote: On 10/01/2014 11:18 AM, Gregory Farnum wrote: All the stuff I'm aware of is part of the testing we're doing for Giant. There is probably ongoing work in the pipeline, but the fast dispatch, sharded work queues

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-07 Thread Gregory Farnum
16:43 To: Gregory Farnum Subject: RE: [ceph-users] mds isn't working anymore after osd's running full I did restart it but you are right about the epoch number which has changed but the situation looks the same. 2014-08-21 16:33:06.032366 7f9b5f3cd700 1 mds.0.27 need osdmap epoch 1994

Re: [ceph-users] accept: got bad authorizer

2014-10-08 Thread Gregory Farnum
Check your clock sync on that node. That's the usual cause of this issue. -Greg On Wednesday, October 8, 2014, Nathan Stratton nat...@robotics.net wrote: I have one out of 16 of my OSDs doing something odd. The logs show some sort of authentication issue. If I restart the OSD things are fine,
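A quick way to confirm the drift before blaming cephx (hostnames are placeholders; cephx tickets tolerate only small skew between nodes):

```shell
# Compare wall clocks across the cluster in one pass:
for h in mon01 osd01 osd02; do
    echo -n "$h: "; ssh "$h" date +%s.%N
done
# if ntpd is running, peer offsets show the drift directly:
ntpq -p
```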

Re: [ceph-users] Regarding Primary affinity configuration

2014-10-09 Thread Gregory Farnum
On Thu, Oct 9, 2014 at 10:55 AM, Johnu George (johnugeo) johnu...@cisco.com wrote: Hi All, I have few questions regarding the Primary affinity. In the original blueprint (https://wiki.ceph.com/Planning/Blueprints/Firefly/osdmap%3A_primary_role_affinity ), one example has been

Re: [ceph-users] Regarding Primary affinity configuration

2014-10-09 Thread Gregory Farnum
On Thu, Oct 9, 2014 at 4:24 PM, Johnu George (johnugeo) johnu...@cisco.com wrote: Hi Greg, Thanks for your extremely informative post. My related questions are posted inline On 10/9/14, 2:21 PM, Gregory Farnum g...@inktank.com wrote: On Thu, Oct 9, 2014 at 10:55 AM, Johnu George

Re: [ceph-users] Blueprints

2014-10-09 Thread Gregory Farnum
On Thu, Oct 9, 2014 at 4:01 PM, Robert LeBlanc rob...@leblancnet.us wrote: I have a question regarding submitting blueprints. Should only people who intend to do the work of adding/changing features of Ceph submit blueprints? I'm not primarily a programmer (but can do programming if needed),

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-10 Thread Gregory Farnum
into the log. I attached the log to this email. I'm looking forward to the new release because it would be nice to have more possibilities to diagnose problems. Kind regards, Jasper Siero From: Gregory Farnum [g...@inktank.com] Sent: Tuesday 7

Re: [ceph-users] ceph tell osd.6 version : hang

2014-10-12 Thread Gregory Farnum
On Sun, Oct 12, 2014 at 7:46 AM, Loic Dachary l...@dachary.org wrote: Hi, On a 0.80.6 cluster the command ceph tell osd.6 version hangs forever. I checked that it establishes a TCP connection to the OSD, raised the OSD debug level to 20 and I do not see

Re: [ceph-users] ceph tell osd.6 version : hang

2014-10-12 Thread Gregory Farnum
On Sun, Oct 12, 2014 at 9:10 AM, Loic Dachary l...@dachary.org wrote: On 12/10/2014 17:48, Gregory Farnum wrote: On Sun, Oct 12, 2014 at 7:46 AM, Loic Dachary l...@dachary.org wrote: Hi, On a 0.80.6 cluster the command ceph tell osd.6 version hangs forever. I checked that it establishes

Re: [ceph-users] ceph tell osd.6 version : hang

2014-10-12 Thread Gregory Farnum
On Sun, Oct 12, 2014 at 9:29 AM, Loic Dachary l...@dachary.org wrote: On 12/10/2014 18:22, Gregory Farnum wrote: On Sun, Oct 12, 2014 at 9:10 AM, Loic Dachary l...@dachary.org wrote: On 12/10/2014 17:48, Gregory Farnum wrote: On Sun, Oct 12, 2014 at 7:46 AM, Loic Dachary l...@dachary.org

Re: [ceph-users] Handling of network failures in the cluster network

2014-10-13 Thread Gregory Farnum
On Mon, Oct 13, 2014 at 11:32 AM, Martin Mailand mar...@tuxadero.com wrote: Hi List, I have a ceph cluster setup with two networks, one for public traffic and one for cluster traffic. Network failures in the public network are handled quite well, but network failures in the cluster network

Re: [ceph-users] Misconfigured caps on client.admin key, anyway to recover from EAESS denied?

2014-10-13 Thread Gregory Farnum
On Mon, Oct 13, 2014 at 4:04 PM, Wido den Hollander w...@42on.com wrote: On 14-10-14 00:53, Anthony Alba wrote: Following the manual starter guide, I set up a Ceph cluster with HEALTH_OK, (1 mon, 2 osd). In testing out auth commands I misconfigured the client.admin key by accidentally deleting

Re: [ceph-users] Ceph OSD very slow startup

2014-10-14 Thread Gregory Farnum
On Monday, October 13, 2014, Lionel Bouton lionel+c...@bouton.name wrote: Hi, # First a short description of our Ceph setup You can skip to the next section (Main questions) to save time and come back to this one if you need more context. We are currently moving away from DRBD-based

Re: [ceph-users] Misconfigured caps on client.admin key, anyway to recover from EAESS denied?

2014-10-14 Thread Gregory Farnum
On Monday, October 13, 2014, Anthony Alba ascanio.al...@gmail.com wrote: You can disable cephx completely, fix the key and enable cephx again. auth_cluster_required, auth_service_required and auth_client_required That did not work: i.e disabling cephx in the cluster conf and restarting
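For reference, the sequence under discussion looks roughly like this (a sketch; as the thread notes, every daemon must be restarted with the settings in place, or the repair attempt fails exactly as described):

```shell
# in ceph.conf on all nodes, then restart every mon/osd/mds:
#   auth_cluster_required = none
#   auth_service_required = none
#   auth_client_required = none
# with auth disabled, repair the damaged key's caps:
ceph auth caps client.admin mon 'allow *' osd 'allow *' mds 'allow'
# revert the three settings and restart everything again
```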

Re: [ceph-users] Handling of network failures in the cluster network

2014-10-14 Thread Gregory Farnum
(97dcc0539dfa7dac3de74852305d51580b7b1f82). On 13.10.2014 21:45, Gregory Farnum wrote: How did you test taking down the connection? What config options have you specified on the OSDs and in the monitor? None of the scenarios you're describing make much sense on a semi-recent (post-dumpling-release) version of Ceph

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-14 Thread Gregory Farnum
: Gregory Farnum [g...@inktank.com] Sent: Friday 10 October 2014 23:45 To: Jasper Siero CC: ceph-users Subject: Re: [ceph-users] mds isn't working anymore after osd's running full Ugh, debug journaler, not debug journaled. That said, the filer output tells me that you're missing

Re: [ceph-users] Firefly maintenance release schedule

2014-10-15 Thread Gregory Farnum
On Wed, Oct 15, 2014 at 9:39 AM, Dmitry Borodaenko dborodae...@mirantis.com wrote: On Tue, Sep 30, 2014 at 6:49 PM, Dmitry Borodaenko dborodae...@mirantis.com wrote: Last stable Firefly release (v0.80.5) was tagged on July 29 (over 2 months ago). Since then, there were twice as many commits

Re: [ceph-users] Performance doesn't scale well on a full ssd cluster.

2014-10-16 Thread Gregory Farnum
can reach the peak. The client is fio and also running on osd nodes. But there're no bottlenecks on cpu or network. I also tried running client on two non osd servers, but the same result. On Oct 17, 2014 at 12:29 AM, Gregory Farnum g...@inktank.com wrote: If you're running a single client to drive

Re: [ceph-users] why the erasure code pool not support random write?

2014-10-20 Thread Gregory Farnum
This is a common constraint in many erasure-coded storage systems. It arises because random writes turn into a read-modify-write cycle (in order to redo the parity calculations). So we simply disallow them in EC pools, which works fine for the target use cases right now. -Greg On Monday, October
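The read-modify-write cost is easy to see with XOR parity in miniature (a sketch with toy values: a two-chunk stripe d0=5, d1=9, so parity = 5 XOR 9 = 12):

```shell
# Overwriting d0 (5 -> 9) forces reading the old data and parity chunks
# just to recompute the parity chunk:
old_d0=5; new_d0=9; parity=12
echo $(( parity ^ old_d0 ^ new_d0 ))   # new parity: 0 (= 9 XOR 9)
```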

Re: [ceph-users] Ceph OSD very slow startup

2014-10-20 Thread Gregory Farnum
On Mon, Oct 20, 2014 at 8:25 AM, Lionel Bouton lionel+c...@bouton.name wrote: Hi, More information on our Btrfs tests. Le 14/10/2014 19:53, Lionel Bouton a écrit : Current plan: wait at least a week to study 3.17.0 behavior and upgrade the 3.12.21 nodes to 3.17.0 if all goes well.

Re: [ceph-users] CRUSH depends on host + OSD?

2014-10-21 Thread Gregory Farnum
On Tuesday, October 21, 2014, Chad Seys cws...@physics.wisc.edu wrote: Hi Craig, It's part of the way the CRUSH hashing works. Any change to the CRUSH map causes the algorithm to change slightly. Dan@cern could not replicate my observations, so I plan to follow his procedure (fake

Re: [ceph-users] Question/idea about performance problems with a few overloaded OSDs

2014-10-21 Thread Gregory Farnum
On Tue, Oct 21, 2014 at 10:15 AM, Lionel Bouton lionel+c...@bouton.name wrote: Hi, I've yet to install 0.80.7 on one node to confirm its stability and use the new IO prirority tuning parameters enabling prioritized access to data from client requests. In the meantime, faced with large

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Gregory Farnum
Are these tests conducted using a local fs on RBD, or using CephFS? If CephFS, do you have multiple clients mounting the FS, and what are they doing? What client (kernel or ceph-fuse)? -Greg On Tue, Oct 21, 2014 at 9:05 AM, Sergey

Re: [ceph-users] Extremely slow small files rewrite performance

2014-10-21 Thread Gregory Farnum
, kernel 3.2.0-4-amd64. On Tue, Oct 21, 2014 at 1:44 PM, Gregory Farnum g...@inktank.com wrote: Are these tests conducted using a local fs on RBD, or using CephFS? If CephFS, do you have multiple clients mounting the FS, and what are they doing? What client (kernel or ceph-fuse)? -Greg

Re: [ceph-users] Fio rbd stalls during 4M reads

2014-10-24 Thread Gregory Farnum
There's temporarily an issue in the master branch that makes rbd reads larger than the cache size hang (when the cache is on). This might be that. (Jason is working on it: http://tracker.ceph.com/issues/9854) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Oct 23, 2014 at

Re: [ceph-users] error when executing ceph osd pool set foo-hot cache-mode writeback

2014-10-28 Thread Gregory Farnum
On Tue, Oct 28, 2014 at 3:24 AM, Cristian Falcas cristi.fal...@gmail.com wrote: Hello, In the documentation about creating a cache pool, you find this: Cache mode The most important policy is the cache mode: ceph osd pool set foo-hot cache-mode writeback But when trying to run the
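For context, a hedged sketch of the usual ordering when setting up a cache tier (pool names `foo` and `foo-hot` are taken from the question; exact flags may vary by release): the cache mode can generally only be set after the pool has been attached as a tier of the base pool, which is a common cause of errors when running the documented command in isolation.

```shell
# Sketch of typical cache-tier setup ordering (adjust names to your cluster):
# 1. Attach the hot pool as a tier of the base pool.
ceph osd tier add foo foo-hot
# 2. Only then set the cache mode on the tier pool.
ceph osd pool set foo-hot cache-mode writeback
# 3. Redirect client traffic through the cache tier.
ceph osd tier set-overlay foo foo-hot
```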

Re: [ceph-users] Adding a monitor to

2014-10-28 Thread Gregory Farnum
On Mon, Oct 27, 2014 at 11:37 AM, Patrick Darley patrick.dar...@codethink.co.uk wrote: Hi there Over the last week or so, I've been trying to get a ceph monitor node running on a Baserock system to connect to a simple 3-node Ubuntu ceph cluster. The 3-node Ubuntu cluster was created by

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-28 Thread Gregory Farnum
From: john.sp...@inktank.com [john.sp...@inktank.com] on behalf of John Spray [john.sp...@redhat.com] Sent: Thursday, October 16, 2014 12:23 To: Jasper Siero CC: Gregory Farnum; ceph-users Subject: Re: [ceph-users] mds isn't working anymore after osd's running full

Re: [ceph-users] Adding a monitor to

2014-10-28 Thread Gregory Farnum
don't need to run it from the new monitor, so if you're having trouble getting the keys to behave I'd just run it from an existing system. :) -Greg On Tue, Oct 28, 2014 at 10:11 AM, Patrick Darley patrick.dar...@codethink.co.uk wrote: On 2014-10-28 16:08, Gregory Farnum wrote: On Mon, Oct 27

Re: [ceph-users] Troubleshooting Incomplete PGs

2014-10-28 Thread Gregory Farnum
On Thu, Oct 23, 2014 at 6:41 AM, Chris Kitzmiller ckitzmil...@hampshire.edu wrote: On Oct 22, 2014, at 8:22 PM, Craig Lewis wrote: Shot in the dark: try manually deep-scrubbing the PG. You could also try marking various osd's OUT, in an attempt to get the acting set to include osd.25 again,

Re: [ceph-users] Adding a monitor to

2014-10-29 Thread Gregory Farnum
[Re-adding the list, so this is archived for future posterity.] On Wed, Oct 29, 2014 at 6:11 AM, Patrick Darley patrick.dar...@codethink.co.uk wrote: Thanks again for the reply Greg! On 2014-10-28 17:39, Gregory Farnum wrote: I'm sorry, you're right — I misread it. :( No worries, I had

Re: [ceph-users] mds isn't working anymore after osd's running full

2014-10-29 Thread Gregory Farnum
On Wed, Oct 29, 2014 at 7:51 AM, Jasper Siero jasper.si...@target-holding.nl wrote: Hello Greg, I added the debug options which you mentioned and started the process again: [root@th1-mon001 ~]# /usr/bin/ceph-mds -i th1-mon001 --pid-file /var/run/ceph/mds.th1-mon001.pid -c

Re: [ceph-users] Delete pools with low priority?

2014-10-29 Thread Gregory Farnum
Dan (who wrote that slide deck) is probably your best bet here, but I believe pool deletion is not very configurable and fairly expensive right now. I suspect that it will get better in Hammer or Infernalis, once we have a unified op work queue that we can independently prioritize all IO through

Re: [ceph-users] Crash with rados cppool and snapshots

2014-10-29 Thread Gregory Farnum
On Wed, Oct 29, 2014 at 7:49 AM, Daniel Schneller daniel.schnel...@centerdevice.com wrote: Hi! We are exploring options to regularly preserve (i.e. backup) the contents of the pools backing our rados gateways. For that we create nightly snapshots of all the relevant pools when there is no

Re: [ceph-users] Swift + radosgw: How do I find accounts/containers/objects limitation?

2014-10-31 Thread Gregory Farnum
On Fri, Oct 31, 2014 at 9:55 AM, Narendra Trivedi (natrived) natri...@cisco.com wrote: Hi All, I have been working with Openstack Swift + radosgw to stress the whole object storage from the Swift side (I have been creating containers and objects for days now) but can’t actually find the

Re: [ceph-users] Swift + radosgw: How do I find accounts/containers/objects limitation?

2014-10-31 Thread Gregory Farnum
user has been configured? --Narendra -Original Message- From: Gregory Farnum [mailto:g...@gregs42.com] Sent: Friday, October 31, 2014 11:58 AM To: Narendra Trivedi (natrived) Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Swift + radosgw: How do I find accounts/containers

Re: [ceph-users] giant release osd down

2014-11-02 Thread Gregory Farnum
What happened when you did the OSD prepare and activate steps? Since your OSDs are either not running or can't communicate with the monitors, there should be some indication from those steps. -Greg On Sun, Nov 2, 2014 at 6:44 AM Shiv Raj Singh virk.s...@gmail.com wrote: Hi All I am new to ceph

Re: [ceph-users] emperor - firefly 0.80.7 upgrade problem

2014-11-03 Thread Gregory Farnum
On Mon, Nov 3, 2014 at 7:46 AM, Chad Seys cws...@physics.wisc.edu wrote: Hi All, I upgraded from emperor to firefly. Initial upgrade went smoothly and all placement groups were active+clean . Next I executed 'ceph osd crush tunables optimal' to upgrade CRUSH mapping. Okay...you know
