Re: github pull requests, comments and rebase

2013-08-14 Thread Gregory Farnum
On Thu, Aug 8, 2013 at 1:46 PM, Sage Weil s...@inktank.com wrote: On Thu, 8 Aug 2013, Loic Dachary wrote: Hi Sage, During the discussions about continuous integration at the CDS this week ( http://youtu.be/cGosx5zD4FM?t=1h16m05s ) you mentioned that github was able to keep track of the

Re: cephfs set_layout - EINVAL - solved

2013-08-14 Thread Gregory Farnum
On Fri, Aug 9, 2013 at 2:03 AM, Kasper Dieter dieter.kas...@ts.fujitsu.com wrote: OK, I found this nice page: http://ceph.com/docs/next/dev/file-striping/ which explains --stripe_unit --stripe_count --object_size But still I'm not sure about (1) what is the equivalent command on cephfs to

Re: cephfs set_layout - tuning

2013-08-14 Thread Gregory Farnum
On Wed, Aug 14, 2013 at 1:38 PM, Kasper Dieter dieter.kas...@ts.fujitsu.com wrote: On Wed, Aug 14, 2013 at 10:17:24PM +0200, Gregory Farnum wrote: On Fri, Aug 9, 2013 at 2:03 AM, Kasper Dieter dieter.kas...@ts.fujitsu.com wrote: OK, I found this nice page: http://ceph.com/docs/next/dev/file

Re: [ceph-users] Flapping osd / continuously reported as failed

2013-08-19 Thread Gregory Farnum
On Fri, Aug 16, 2013 at 5:47 AM, Mostowiec Dominik dominik.mostow...@grupaonet.pl wrote: Hi, Thanks for your response. It's possible, as deep scrub in particular will add a bit of load (it goes through and compares the object contents). It is possible that the scrubbing blocks access(RW or

Re: [ceph-users] Flapping osd / continuously reported as failed

2013-08-19 Thread Gregory Farnum
On Mon, Aug 19, 2013 at 3:09 PM, Mostowiec Dominik dominik.mostow...@grupaonet.pl wrote: Hi, Yes, it definitely can as scrubbing takes locks on the PG, which will prevent reads or writes while the message is being processed (which will involve the rgw index being scanned). It is possible to

Re: [PATCH] enable mds rejoin with active inodes' old parent xattrs

2013-08-23 Thread Gregory Farnum
On Fri, Aug 23, 2013 at 4:00 AM, Alexandre Oliva ol...@gnu.org wrote: On Aug 22, 2013, Yan, Zheng uker...@gmail.com wrote: This is not bug. Only the tail entry of the path encoded in the parent xattrs need to be updated. (the entry for inode's parent directory) Why store the others, if

Re: [PATCH 2/2] client: trim deleted inode

2013-08-23 Thread Gregory Farnum
Looks like this patch hasn't been merged in yet, although its partner to make the MDS notify about deleted inodes was. Any particular reason, or just still waiting for review? :) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sat, Jul 20, 2013 at 7:21 PM, Yan, Zheng

Re: [PATCH 3/3] ceph: rework trim caps code

2013-08-23 Thread Gregory Farnum
Did this patch get dropped on purpose? I also don't see it in our testing branch. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Sun, Aug 4, 2013 at 11:10 PM, Yan, Zheng zheng.z@intel.com wrote: From: Yan, Zheng zheng.z@intel.com The trim caps code that handles

Re: [PATCH] mds: remove waiting lock before merging with neighbours

2013-08-23 Thread Gregory Farnum
Hi David, I'm really sorry it took us so long to get back to you on this. :( However, I've reviewed the patch and, apart from going over the code making me want to strangle myself for structuring it that way, everything looks good. I changed the last paragraph in the commit message very slightly

Re: Questions about mds locks

2013-08-29 Thread Gregory Farnum
On Wed, Aug 28, 2013 at 4:41 PM, 袁冬 yuandong1...@gmail.com wrote: Hello, everyone. I have some questions about mds locks. I search google and read almost all Sage's papers, but I found no details about mds locks. :( Unfortunately these encompass some of the most complicated and least

Re: Questions about mds locks

2013-08-29 Thread Gregory Farnum
On Thu, Aug 29, 2013 at 6:33 PM, Dong Yuan yuandong1...@gmail.com wrote: It seems that different lock item uses different class with different state machine for different MDRequest process. :) Maybe I should concentrate on a particular lock item first, Can you give me some suggest?

Re: How Might a Full-Text Searching Capability be Integrated with Ceph?

2013-08-29 Thread Gregory Farnum
On Wed, Aug 28, 2013 at 9:56 PM, Kevin Frey kevin.f...@internode.net.au wrote: Hello All, This is my first post to the list, and my question is very general to encourage discussion (perhaps derision). I am the team-leader involved with the development of an application of which one common

Re: ocfs2 for OSDs?

2013-09-11 Thread Gregory Farnum
On Wed, Sep 11, 2013 at 12:55 PM, David Disseldorp dd...@suse.de wrote: Hi Sage, On Wed, 11 Sep 2013 09:18:13 -0700 (PDT) Sage Weil s...@inktank.com wrote: REFLINKs (inode-based writeable snapshots) This is the one item on this list I see that the ceph-osds could take real advantage of;

Re: Paxos vs Raft

2013-09-14 Thread Gregory Farnum
On Fri, Sep 13, 2013 at 11:39 PM, Loic Dachary l...@dachary.org wrote: Hi, Ceph ( http://ceph.com/ ) relies on a custom implementation of Paxos to provide exabyte scale distributed storage. Like most people recently exposed to Paxos, I struggle to understand it ... but will keep studying

Re: [ceph-users] About ceph testing

2013-09-18 Thread Gregory Farnum
On Tue, Sep 17, 2013 at 10:07 PM, david zhang zhang.david2...@gmail.com wrote: Hi ceph-users, Previously I sent one mail to ask for help on ceph unit test and function test. Thanks to one of your guys, I got replied about unit test. Since we are planning to use ceph, but with strict quality

Re: Object Write Latency

2013-09-20 Thread Gregory Farnum
On Fri, Sep 20, 2013 at 5:27 AM, Andreas Joachim Peters andreas.joachim.pet...@cern.ch wrote: Hi, we made some benchmarks about object read/write latencies on the CERN ceph installation. The cluster has 44 nodes and ~1k disks, all on 10GE and the pool configuration has 3 copies. Client

Re: crc32 for erasure code

2013-09-23 Thread Gregory Farnum
On Mon, Sep 23, 2013 at 1:34 AM, Loic Dachary l...@dachary.org wrote: Hi, Unless I'm mistaken, ceph_crc32() is currently used in master via the crc32c() method of bufferlist to: * encode_with_checksum/decode_with_checksum a PGLog entry * Message::decode_message/Message::encode_message a

Re: bloom filter thoughts

2013-09-26 Thread Gregory Farnum
On Thu, Sep 26, 2013 at 8:52 AM, Sage Weil s...@inktank.com wrote: On Thu, 26 Sep 2013, Mark Nelson wrote: On 09/25/2013 07:34 PM, Sage Weil wrote: I spent some time on the plane playing with bloom filters. We're looking at using these on the OSD to (more) efficiently keep track of which

Re: thought on storing bloom (hit) info

2013-10-02 Thread Gregory Farnum
On Wed, Oct 2, 2013 at 5:02 PM, Sage Weil s...@inktank.com wrote: If we make this a special internal object we need to complicate recovery and namespacing to keep it separate from user data. We also need to implement a new API for retrieving, trimming, and so forth. Instead, we could just

Re: thought on storing bloom (hit) info

2013-10-02 Thread Gregory Farnum
On Wed, Oct 2, 2013 at 5:19 PM, Sage Weil s...@inktank.com wrote: On Wed, 2 Oct 2013, Gregory Farnum wrote: On Wed, Oct 2, 2013 at 5:02 PM, Sage Weil s...@inktank.com wrote: If we make this a special internal object we need to complicate recovery and namespacing to keep it separate from user

Re: rados_clone_range for different pgs

2013-10-08 Thread Gregory Farnum
On Tue, Oct 8, 2013 at 7:40 AM, Oleg Krasnianskiy oleg.krasnians...@gmail.com wrote: We use ceph to store huge files striped into small (4mb) objects. Due to the fact that files can be changed unpredictably (data insertion/modification/deletion in any part of a file), we have to copy parts of

Re: [PATCH] mds: update backtrace when old format inode is touched

2013-10-16 Thread Gregory Farnum
I came across this patch while going through my email backlog and it looks like we haven't pulled in this patch or anything like it. Did you do something about this problem in a different way? (The patch doesn't apply cleanly so I'll need to update it if this is still what we've got.) -Greg

Re: [RFC PATCH] ceph: add acl for cephfs

2013-10-18 Thread Gregory Farnum
Isn't the UID/GID mismatch a generic problem when using CephFS? ;) I've got this patch in my queue as well if nobody else beats me to it. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Oct 17, 2013 at 5:39 AM, Li Wang liw...@ubuntukylin.com wrote: Hi, I did not

Re: issues when bucket index deep-scrubbing

2013-10-18 Thread Gregory Farnum
On Fri, Oct 18, 2013 at 4:01 AM, Dominik Mostowiec dominikmostow...@gmail.com wrote: Hi, I plan to shard my largest bucket because of issues of deep-scrubbing (when PG which index for this bucket is stored on is deep-scrubbed, it appears many slow requests and OSD grows in memory - after

Re: Removing disks / OSDs

2013-10-21 Thread Gregory Farnum
I'm not quite sure what questions you're actually asking here... In general, the OSD is not removed from the system without explicit admin intervention. When it is removed, all traces of it should be zapped (including its key), so it can't reconnect. If it hasn't been removed, then indeed it will

Re: issues when bucket index deep-scrubbing

2013-10-21 Thread Gregory Farnum
the bucket index itself, rather than sharding across buckets in the application. :) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com Regards Dominik 2013/10/18 Gregory Farnum g...@inktank.com: On Fri, Oct 18, 2013 at 4:01 AM, Dominik Mostowiec dominikmostow...@gmail.com

Re: Removing disks / OSDs

2013-10-21 Thread Gregory Farnum
On Mon, Oct 21, 2013 at 9:57 AM, Loic Dachary l...@dachary.org wrote: On 21/10/2013 18:49, Gregory Farnum wrote: I'm not quite sure what questions you're actually asking here... I guess I was asking if my understanding was correct. In general, the OSD is not removed from the system without

Re: issues when bucket index deep-scrubbing

2013-10-21 Thread Gregory Farnum
limitations in ceph that can affect us? -- Regards Dominik 2013/10/21 Gregory Farnum g...@inktank.com: On Mon, Oct 21, 2013 at 2:26 AM, Dominik Mostowiec dominikmostow...@gmail.com wrote: Hi, Thanks for your response. That is definitely the obvious next step, but it's a non-trivial amount

Re: set mds message priority to MSG_PRIO_HIGH

2013-10-21 Thread Gregory Farnum
On Sat, Oct 19, 2013 at 7:14 AM, Yan, Zheng uker...@gmail.com wrote: On Sat, Oct 19, 2013 at 8:58 PM, hjwsm1989-gmail hjwsm1...@gmail.com wrote: Hi, I'm testing ceph with samba. I have 20 OSD nodes on 4 hosts dy01:1MON, 5 OSDs, 1 samba server dy02: 1MDS, 5OSDs, 1 samba server dy03: 5 OSDs,

Re: Removing disks / OSDs

2013-10-22 Thread Gregory Farnum
On Mon, Oct 21, 2013 at 11:13 PM, Loic Dachary l...@dachary.org wrote: On 21/10/2013 18:49, Gregory Farnum wrote: I'm not quite sure what questions you're actually asking here... In general, the OSD is not removed from the system without explicit admin intervention. When it is removed, all

Re: [PATCH] ceph: cleanup aborted requests when re-sending requests.

2013-10-23 Thread Gregory Farnum
A little delayed, but Sage just pushed this into our testing repo. Thanks! (Feel free to poke me in future if you know you have patches that have been hanging for a while.) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Wed, Sep 25, 2013 at 11:25 PM, Yan, Zheng

Re: [ceph-users] radosgw - complete_multipart errors

2013-10-31 Thread Gregory Farnum
On Thu, Oct 31, 2013 at 6:22 AM, Dominik Mostowiec dominikmostow...@gmail.com wrote: Hi, I have strange radosgw error: == 2013-10-26 21:18:29.844676 7f637beaf700 0 setting object tag=_ZPeVs7d6W8GjU8qKr4dsilbGeo6NOgw 2013-10-26 21:18:30.049588 7f637beaf700 0 WARNING: set_req_state_err

Re: cache tier blueprint (part 2)

2013-11-08 Thread Gregory Farnum
On Thu, Nov 7, 2013 at 6:56 AM, Sage Weil s...@inktank.com wrote: I typed up what I think is remaining for the cache tier work for firefly. Greg, can you take a look? I'm most likely missing a bunch of stuff here.

Re: cache tier blueprint (part 2)

2013-11-09 Thread Gregory Farnum
On Fri, Nov 8, 2013 at 10:25 PM, Sage Weil s...@inktank.com wrote: On Fri, 8 Nov 2013, Gregory Farnum wrote: On Thu, Nov 7, 2013 at 6:56 AM, Sage Weil s...@inktank.com wrote: I typed up what I think is remaining for the cache tier work for firefly. Greg, can you take a look? I'm most likely

Re: messenger refactor notes

2013-11-09 Thread Gregory Farnum
On Sat, Nov 9, 2013 at 10:13 AM, Samuel Just sam.j...@inktank.com wrote: Currently, the messenger delivers messages to the Dispatcher implementation from a single thread (See src/msg/DispatchQueue.h/cc). My take away from the performance work so far is that we probably need client IO related

Re: HSM

2013-11-11 Thread Gregory Farnum
On Mon, Nov 11, 2013 at 3:04 AM, John Spray john.sp...@inktank.com wrote: This is a really useful summary from Malcolm. In addition to the coordinator/copytool interface, there is the question of where the policy engine gets its data from. Lustre has the MDS changelog, which Robinhood uses

Re: messenger refactor notes

2013-11-11 Thread Gregory Farnum
On Mon, Nov 11, 2013 at 7:00 AM, Atchley, Scott atchle...@ornl.gov wrote: On Nov 9, 2013, at 4:18 AM, Sage Weil s...@inktank.com wrote: The SimpleMessenger implementation of the Messenger interface has grown organically over many years and is one of the cruftier bits of code in Ceph. The

Re: [RFC] Ceph encryption support

2013-11-12 Thread Gregory Farnum
On Tue, Nov 12, 2013 at 6:10 AM, Li Wang liw...@ubuntukylin.com wrote: Hi, We want to implement encryption support for Ceph. Currently, we have the draft design, 1 When user mount a ceph directory for the first time, he can specify a passphrase and the encryption algorithm and length of

Re: CDS blueprint: strong auth for cephfs

2013-11-13 Thread Gregory Farnum
On Wed, Nov 13, 2013 at 8:05 AM, Dan van der Ster d...@vanderster.com wrote: Hi all, This mail is just to let you know that we've prepared a draft blueprint related to adding strong(er) authn/authz to cephfs:

Re: Out-of-tree build of ceph

2013-11-14 Thread Gregory Farnum
Hrm, I don't think many of the developers do builds like that too often, so it's not surprising it got a little busted. :( Can you make a ticket in the tracker with whatever you've figured out about the cause and timing so we don't lose it? :) -Greg Software Engineer #42 @ http://inktank.com |

Re: CDS blueprint: strong auth for cephfs

2013-11-14 Thread Gregory Farnum
On Thu, Nov 14, 2013 at 2:00 AM, Dan van der Ster d...@vanderster.com wrote: Hi Greg, On Wed, Nov 13, 2013 at 6:45 PM, Gregory Farnum g...@inktank.com wrote: On Wed, Nov 13, 2013 at 8:05 AM, Dan van der Ster d...@vanderster.com wrote: Hi all, This mail is just to let you know that we've

Re: CDS blueprint: strong auth for cephfs

2013-11-14 Thread Gregory Farnum
On Thu, Nov 14, 2013 at 12:30 PM, Arne Wiebalck arne.wieba...@cern.ch wrote: On Nov 14, 2013, at 5:37 PM, Gregory Farnum g...@inktank.com wrote: On Thu, Nov 14, 2013 at 8:21 AM, Dan van der Ster d...@vanderster.com wrote: On Thu, Nov 14, 2013 at 4:55 PM, Gregory Farnum g...@inktank.com

Re: possible bug in init-ceph.in

2013-11-21 Thread Gregory Farnum
Can we take that diff you provided as coming with a signed-off-by, as in the pull request Loic generated? :) -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Nov 21, 2013 at 9:57 AM, Loic Dachary l...@dachary.org wrote: Hi, It turns out there was no pull request or

Re: possible bug in init-ceph.in

2013-11-24 Thread Gregory Farnum
Merged; thanks guys. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Nov 21, 2013 at 8:54 PM, Dietmar Maurer diet...@proxmox.com wrote: Can we take that diff you provided as coming with a signed-off-by, as in the pull request Loic generated? :) Sure. -- To

Re: How to use the class Filer in Ceph

2013-11-24 Thread Gregory Farnum
I haven't looked at the Filer code (or anything around it) in a while, but if I were to guess, in-snapid is set to something which doesn't exist. Are you actually using the Filer in some new code that includes inodes, or modifying the Client classes? Looking at how they initialize things should

Re: How to use the class Filer in Ceph

2013-11-27 Thread Gregory Farnum
=102450 size=10 mtime=2013-11-25 15:46:15.539420 caps=- objectset[134 ts 1/18446744073709551615 objects 0 dirty_or_tx 0] parents=0x31d4ba0 0x3253480) 2013-11-25 15:46:15.539563 7fb1b37a4780 10 client.5705 nothing to flush -Original Message- From: Gregory Farnum [mailto:g

Re: MDS can't join in

2013-12-03 Thread Gregory Farnum
Does the MDS have access to a keyring which contains its key, and does that match what's on the monitor? You're just referring to the client.admin one, which it won't use (it's not a client). It certainly looks like there's a mismatch based on the verification error. -Greg Software Engineer #42 @

Re: [PATCH 0/3] block I/O when cluster is full

2013-12-05 Thread Gregory Farnum
On Thu, Dec 5, 2013 at 5:47 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/03/2013 03:12 PM, Josh Durgin wrote: These patches allow rbd to block writes instead of returning errors when OSDs are full enough that the FULL flag is set in the osd map. This avoids filesystems on top of rbd

Re: [PATCH 0/3] block I/O when cluster is full

2013-12-06 Thread Gregory Farnum
On Fri, Dec 6, 2013 at 6:16 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/05/2013 08:58 PM, Gregory Farnum wrote: On Thu, Dec 5, 2013 at 5:47 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/03/2013 03:12 PM, Josh Durgin wrote: These patches allow rbd to block writes instead

Re: [PATCH 0/3] block I/O when cluster is full

2013-12-09 Thread Gregory Farnum
On Mon, Dec 9, 2013 at 4:11 PM, Josh Durgin josh.dur...@inktank.com wrote: On 12/06/2013 06:24 PM, Gregory Farnum wrote: On Fri, Dec 6, 2013 at 6:16 PM, Josh Durgin josh.dur...@inktank.com wrote: Don't bother trying to stop ENOSPC on the client side, since it'd need some restructuring

Re: Ceph Messaging on Accelio (libxio) RDMA

2013-12-11 Thread Gregory Farnum
On Wed, Dec 11, 2013 at 2:32 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi Ceph devs, For the last several weeks, we've been working with engineers at Mellanox on a prototype Ceph messaging implementation that runs on the Accelio RDMA messaging service (libxio). Very cool! An RDMA

Re: tcmalloc

2013-12-17 Thread Gregory Farnum
On Tue, Dec 17, 2013 at 4:15 PM, Milosz Tanski mil...@adfin.com wrote: I wanted to bring up an issue with Ceph's use of tcmalloc. I know that in Ubuntu (12.04) Ceph uses the distro version of tcmalloc which is older. I've personally run into issues with tcmalloc for our application where the

Re: [PATCH] reinstate ceph cluster_snap support

2013-12-18 Thread Gregory Farnum
On Tue, Dec 17, 2013 at 4:14 AM, Alexandre Oliva ol...@gnu.org wrote: On Aug 27, 2013, Sage Weil s...@inktank.com wrote: Hi, On Sat, 24 Aug 2013, Alexandre Oliva wrote: On Aug 23, 2013, Sage Weil s...@inktank.com wrote: FWIW Alexandre, this feature was never really complete. For it to

Re: enable old OSD snapshot to re-join a cluster

2013-12-18 Thread Gregory Farnum
On Tue, Dec 17, 2013 at 3:36 AM, Alexandre Oliva ol...@gnu.org wrote: On Feb 20, 2013, Gregory Farnum g...@inktank.com wrote: On Tue, Feb 19, 2013 at 2:52 PM, Alexandre Oliva ol...@gnu.org wrote: It recently occurred to me that I messed up an OSD's storage, and decided that the easiest way

Re: [PATCH] mds: handle setxattr ceph.parent

2013-12-18 Thread Gregory Farnum
On Wed, Dec 18, 2013 at 9:09 AM, Sage Weil s...@inktank.com wrote: On Wed, 18 Dec 2013, Alexandre Oliva wrote: On Dec 18, 2013, Yan, Zheng uker...@gmail.com wrote: On Tue, Dec 17, 2013 at 7:25 PM, Alexandre Oliva ol...@gnu.org wrote: # setfattr -n ceph.parent /cephfs/mount/path/name Can

Re: [PATCH] mds: drop unused find_ino_dir

2013-12-18 Thread Gregory Farnum
Sage applied this in commit f5d32a33d25a5f9ddccadb4c3ebbd5ccd211204f; thanks! -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Dec 17, 2013 at 3:00 AM, Alexandre Oliva ol...@gnu.org wrote: I was looking at inconsistencies in xattrs in my OSDs, and found out that only

Re: Ceph Messaging on Accelio (libxio) RDMA

2013-12-18 Thread Gregory Farnum
(Sorry for the delay getting back on this.) On Wed, Dec 11, 2013 at 5:13 PM, Matt W. Benjamin m...@cohortfs.com wrote: Hi Greg, I haven't fixed the decision to reify replies in the Messenger at this point, but it is what the current prototype code tries to do. The request/response model is

Re: enable old OSD snapshot to re-join a cluster

2013-12-19 Thread Gregory Farnum
On Wed, Dec 18, 2013 at 11:32 PM, Alexandre Oliva ol...@gnu.org wrote: On Dec 18, 2013, Gregory Farnum g...@inktank.com wrote: On Tue, Dec 17, 2013 at 3:36 AM, Alexandre Oliva ol...@gnu.org wrote: Here's an updated version of the patch, that makes it much faster than the earlier version

Re: CephFS standup

2014-01-06 Thread Gregory Farnum
A little late on this (I was on vacation and my phone doesn't do plain-text email!), but I prefer the morning slot to the later ones. :) -Greg On Thu, Jan 2, 2014 at 12:51 PM, Sage Weil s...@inktank.com wrote: 2014 will be the Year of the Linux Desktop^W^W^WCephFS! To that end, we should

Re: [PATCH] mds: handle setxattr ceph.parent

2014-01-06 Thread Gregory Farnum
On Fri, Dec 20, 2013 at 4:50 PM, Alexandre Oliva ol...@gnu.org wrote: On Dec 20, 2013, Alexandre Oliva ol...@gnu.org wrote: back many of the osds to recent snapshots thereof, from which I'd cleaned all traces of the user.ceph._parent. I intended to roll back Err, I meant user.ceph._path, of

Re: [PATCH] mds: handle setxattr ceph.parent

2014-01-07 Thread Gregory Farnum
On Mon, Jan 6, 2014 at 8:15 PM, Alexandre Oliva ol...@gnu.org wrote: On Jan 6, 2014, Gregory Farnum g...@inktank.com wrote: On Fri, Dec 20, 2013 at 4:50 PM, Alexandre Oliva ol...@gnu.org wrote: On Dec 20, 2013, Alexandre Oliva ol...@gnu.org wrote: back many of the osds to recent snapshots

Re: Proposal for adding disable FileJournal option

2014-01-09 Thread Gregory Farnum
The FileJournal is also for data safety whenever we're using write ahead. To disable it we need a backing store that we know can provide us consistent checkpoints (i.e., we can use parallel journaling mode — so for the FileJournal, we're using btrfs, or maybe zfs someday). But for those systems

Re: Proposal for adding disable FileJournal option

2014-01-09 Thread Gregory Farnum
Wang haomaiw...@gmail.com wrote: On Fri, Jan 10, 2014 at 1:28 AM, Gregory Farnum g...@inktank.com wrote: The FileJournal is also for data safety whenever we're using write ahead. To disable it we need a backing store that we know can provide us consistent checkpoints (i.e., we can use parallel

Re: [ceph-users] many meta files in osd

2014-01-27 Thread Gregory Farnum
Looks like you got lost over the Christmas holidays; sorry! I'm not an expert on running rgw but it sounds like garbage collection isn't running or something. What version are you on, and have you done anything to set it up? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On

Re: DISCARD support in kernel driver

2014-01-30 Thread Gregory Farnum
On Thu, Jan 30, 2014 at 1:31 AM, Jean-Tiare LE BIGOT jean-tiare.le-bi...@ovh.net wrote: Hi, I started to implement 'DISCARD' support in RBD kernel driver as described on http://tracker.ceph.com/issues/190 This first (easy) step was to add at the end of drivers/block/rbd.c:rbd_init_disk

Re: DISCARD support in kernel driver

2014-01-30 Thread Gregory Farnum
that $ fstrim /mnt # neither Maybe missing something there ? I expected '-o discard' to be enough ? On 01/30/14 16:24, Gregory Farnum wrote: On Thu, Jan 30, 2014 at 1:31 AM, Jean-Tiare LE BIGOT jean-tiare.le-bi...@ovh.net wrote: Hi, I started to implement 'DISCARD' support in RBD

Re: [ceph-users] Ceph GET latency

2014-02-20 Thread Gregory Farnum
On Tue, Feb 18, 2014 at 7:24 AM, Guang Yang yguan...@yahoo.com wrote: Hi ceph-users, We are using Ceph (radosgw) to store user generated images, as GET latency is critical for us, most recently I did some investigation over the GET path to understand where time spend. I first confirmed that

Re: Assertion error in librados

2014-02-25 Thread Gregory Farnum
Do you have logs? The assert indicates that the messenger got back something other than okay when trying to grab a local Mutex, which shouldn't be able to happen. It may be that some error-handling path didn't drop it (within the same thread that later tried to grab it again), but we'll need more

Re: Assertion error in librados

2014-02-25 Thread Gregory Farnum
, Unfortunately we don't keep any Ceph related logs on the client side. On the server side, we kept the default log settings to avoid overlogging. Do you think that there might be something useful on the OSD side? On Tue, Feb 25, 2014 at 07:28:30AM -0800, Gregory Farnum wrote: Do you have logs? The assert

Re: [ceph-users] PG folder hierarchy

2014-02-25 Thread Gregory Farnum
On Tue, Feb 25, 2014 at 7:13 PM, Guang yguan...@yahoo.com wrote: Hello, Most recently when looking at PG's folder splitting, I found that there was only one sub folder in the top 3 / 4 levels and start having 16 sub folders starting from level 6, what is the design consideration behind this?

Re: location-aware file placement in Ceph

2014-02-27 Thread Gregory Farnum
There are some options within CRUSH to let you decide where you want to place particular classes of data, but it's not really available on a per-object or per-file basis. You should look through the CRUSH stuff at ceph.com/docs to get an idea of what's possible. -Greg Software Engineer #42 @

Re: cache pool user interfaces

2014-02-28 Thread Gregory Farnum
On Fri, Feb 28, 2014 at 7:21 AM, Sage Weil s...@inktank.com wrote: On Wed, 26 Feb 2014, Gregory Farnum wrote: We/you/somebody need(s) to sit down and decide on what kind of interface we want to actually expose to users for working with caching pools. What we have right now is very flexible

Re: contraining crush placement possibilities

2014-03-07 Thread Gregory Farnum
On Fri, Mar 7, 2014 at 7:10 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Dan van der Ster wrote: On Thu, Mar 6, 2014 at 9:30 PM, Sage Weil s...@inktank.com wrote: Sheldon just pointed out a talk from ATC that discusses the basic problem:

Re: contraining crush placement possibilities

2014-03-07 Thread Gregory Farnum
On Fri, Mar 7, 2014 at 9:43 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Gregory Farnum wrote: On Fri, Mar 7, 2014 at 7:10 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Dan van der Ster wrote: On Thu, Mar 6, 2014 at 9:30 PM, Sage Weil s...@inktank.com wrote

Re: contraining crush placement possibilities

2014-03-10 Thread Gregory Farnum
, why not map object_id to OSD combinations directly, will it achieve a more uniform distribution? On 2014/3/8 1:43, Sage Weil wrote: On Fri, 7 Mar 2014, Gregory Farnum wrote: On Fri, Mar 7, 2014 at 7:10 AM, Sage Weil s...@inktank.com wrote: On Fri, 7 Mar 2014, Dan van der Ster wrote

Re: [PATCH v2] ceph: use fl-fl_file as owner identifier of flock and posix lock

2014-03-10 Thread Gregory Farnum
Okay, this problem makes sense to me and I think your basic approach is good. I've got no problem with it after all. :) Just throwing this out there, if we're worried about exposing kernel addresses to external processes, and don't want them to collide, should we just keep a mapping of

Re: help - i/o error when mounting with cephfs

2014-03-14 Thread Gregory Farnum
On Fri, Mar 14, 2014 at 6:06 PM, Shaun Keenan skee...@gmail.com wrote: When trying to mount off my ceph cluster I get this: mount error 5 = Input/output error cluster looks healthy: [root@ceph-mds2 ~]# ceph -s cluster 61b6dda1-5412-41f7-9769-3ae7e47241b7 health HEALTH_OK monmap

Re: XioMessenger (RDMA) Performance results

2014-03-18 Thread Gregory Farnum
On Tue, Mar 18, 2014 at 1:05 PM, Yaron Haviv yar...@mellanox.com wrote: I'm happy to share test results we run in the lab with Matt's latest XioMessenger code which implements Ceph messaging over Accelio RDMA library Results look pretty encouraging, demonstrating a * 20x * performance boost

Re: rbd client map error

2014-03-19 Thread Gregory Farnum
, 1129 GB / 1489 GB avail 6553604/9830406 objects degraded (66.667%) 1776 active+degraded Thanks Regards Somnath -Original Message- From: Gregory Farnum [mailto:g...@inktank.com] Sent: Wednesday, March 19, 2014 3:38 PM To: Somnath Roy Cc: Sage Weil

Re: Limiting specific to specific directory, client separation

2014-03-24 Thread Gregory Farnum
This is not currently a priority in Inktank's roadmap for the MDS. :( But we discussed client security in more detail than those tickets during the Dumpling Ceph Developer Summit: http://wiki.ceph.com/Planning/CDS/Dumpling (search for 1G: Client Security for CephFS -- there's a blueprint, an

Re: Question about librados notification

2014-03-25 Thread Gregory Farnum
On Tue, Mar 25, 2014 at 5:50 AM, Shinji Matsumoto shinji.matsum...@us.sios.com wrote: Hello all, I have a question about Ceph notification mechanism. http://ceph.com/docs/master/architecture/#object-watch-notify Scenario: (1) 3 clients (client1, client2, client3) have interests on a Ceph

Re: ceph-0.77-900.gce9bfb8 Testing Rados EC/Tiering CephFS ...

2014-03-25 Thread Gregory Farnum
On Thu, Mar 20, 2014 at 3:49 AM, Andreas Joachim Peters andreas.joachim.pet...@cern.ch wrote: Hi, I did some Firefly ceph-0.77-900.gce9bfb8 testing of EC/Tiering deploying 64 OSD with in-memory filesystems (RapidDisk with ext4) on a single 256 GB box. The raw write performance of this box

Re: Assertion error in librados

2014-03-31 Thread Gregory Farnum
Nope, I don't think anybody's looked into it. If you have core dumps you could get a backtrace and the return value referenced. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri, Mar 28, 2014 at 2:54 AM, Filippos Giannakos philipg...@grnet.gr wrote: Hello, We recently

Re: [ceph-users] Ceph User Committee monthly meeting #1 : executive summary

2014-04-04 Thread Gregory Farnum
On Fri, Apr 4, 2014 at 11:15 AM, Milosz Tanski mil...@adfin.com wrote: Loic, The writeup has been helpful. What I'm curious about (and hasn't been mentioned) is can we use erasure with CephFS? What steps have to be taken in order to setup erasure coding for CephFS? Lots. CephFS takes

Re: Multiple Posix namespaces?

2014-04-04 Thread Gregory Farnum
Not yet, no. There are a couple different approaches to this that a third-party contributor could work on without too much difficulty (I *think* there's a blueprint floating around somewhere), but nobody's done so yet. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Fri,

Re: Deterministic thrashing

2014-04-07 Thread Gregory Farnum
This would be really nice but there are unfortunately even more hiccups than you've noted here: 1) Thrashing is both time and disk access sensitive, and hardware differs 2) The teuthology thrashing is triggered largely based on PG state events (eg, all PGs are clean, so restart an OSD) 3) The

Re: Deterministic thrashing

2014-04-07 Thread Gregory Farnum
On Mon, Apr 7, 2014 at 10:13 AM, Loic Dachary l...@dachary.org wrote: On 07/04/2014 18:55, Gregory Farnum wrote: This would be really nice but there are unfortunately even more hiccups than you've noted here: 1) Thrashing is both time and disk access sensitive, and hardware differs 2

Re: [Share]Performance tunning on Ceph FileStore with SSD backend

2014-04-09 Thread Gregory Farnum
On Wed, Apr 9, 2014 at 3:05 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi all, I would like to share some ideas about how to improve performance on ceph with SSD. Not much preciseness. Our ssd is 500GB and each OSD owns an SSD (journal is on the same SSD). ceph version is 0.67.5 (Dumpling) At

Re: Ubuntu 12.04 MDS tcmalloc leaks

2014-04-11 Thread Gregory Farnum
On Fri, Apr 11, 2014 at 8:59 AM, Milosz Tanski mil...@adfin.com wrote: I'd like to restart this debate about tcmalloc slow leaks in MDS. This time around I have some charts. Looking at OSDs and MONs, it doesn't seem to affect those (as much). Here's the chart: http://i.imgur.com/xMCINAD.png

Re: Ceph daemon memory utilization: 'heap release' drops use by 50%

2014-04-14 Thread Gregory Farnum
What distro are you running on? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Apr 14, 2014 at 5:28 AM, David McBride dw...@cam.ac.uk wrote: Hello, I'm currently experimenting with a Ceph deployment, and am noting that some of my machines are having processes

Re: Ceph daemon memory utilization: 'heap release' drops use by 50%

2014-04-14 Thread Gregory Farnum
as well — it might be a tcmalloc issue they can resolve in their repo. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Mon, Apr 14, 2014 at 7:04 AM, David McBride dw...@cam.ac.uk wrote: On 14/04/14 14:53, Gregory Farnum wrote: What distro are you running on? -Greg Hi Greg

Re: xio-rados-firefly branch update

2014-04-22 Thread Gregory Farnum
Awesome! I'll try and take a preliminary look at this in the next day or two. What kind of feedback are you interested in right now? -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, Apr 22, 2014 at 12:44 PM, Matt W. Benjamin m...@linuxbox.com wrote: Hi, We've

Re: [ceph-users] Ceph mds laggy and failed assert in function replay mds/journal.cc

2014-04-25 Thread Gregory Farnum
Hmm, it looks like your on-disk SessionMap is horrendously out of date. Did your cluster get full at some point? In any case, we're working on tools to repair this now but they aren't ready for use yet. Probably the only thing you could do is create an empty sessionmap with a higher version than

Re: bandwidth with Ceph - v0.59 (Bobtail)

2014-04-25 Thread Gregory Farnum
Bobtail is really too old to draw any meaningful conclusions from; why did you choose it? That's not to say that performance on current code will be better (though it very much might be), but the internal architecture has changed in some ways that will be particularly important for the futex

Re: xio-rados-firefly branch update

2014-04-28 Thread Gregory Farnum
A few days later than I wanted, but I got through various pieces of this today. It wasn't a thorough review, more of a shape-of-things check, but I have a bunch of notes. On Tue, Apr 22, 2014 at 2:50 PM, Matt W. Benjamin m...@linuxbox.com wrote: Hi Greg, Sure. I'm interested in all feedback

Re: default filestore max sync interval

2014-04-29 Thread Gregory Farnum
On Tue, Apr 29, 2014 at 1:10 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi all, Why is the default max sync interval only 5 seconds? Today we realized what a huge difference increasing this to 30 or 60s can make for small write latency. Basically, with a 5s interval our 4k

Re: default filestore max sync interval

2014-04-29 Thread Gregory Farnum
On Tue, Apr 29, 2014 at 1:35 PM, Stefan Priebe s.pri...@profihost.ag wrote: Hi Greg, On 29.04.2014 22:23, Gregory Farnum wrote: On Tue, Apr 29, 2014 at 1:10 PM, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi all, Why is the default max sync interval only 5 seconds? Today we

Re: xio-rados-firefly branch update

2014-05-01 Thread Gregory Farnum
On Mon, Apr 28, 2014 at 8:14 PM, Matt W. Benjamin m...@linuxbox.com wrote: Hi Greg, - Gregory Farnum g...@inktank.com wrote: The re-org mostly looks fine. I notice you're adding a few more friend declarations though, and I don't think those should be necessary — Connection can label

Re: [PATCH] locks: ensure that fl_owner is always initialized properly in flock and lease codepaths

2014-05-06 Thread Gregory Farnum
The Ceph bit is fine. Acked-by: Greg Farnum g...@inktank.com On Mon, Apr 28, 2014 at 10:50 AM, Jeff Layton jlay...@poochiereds.net wrote: Currently, the fl_owner isn't set for flock locks. Some filesystems use byte-range locks to simulate flock locks and there is a common idiom in those that

Re: [ceph-users] v0.80 Firefly released

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 8:44 AM, Dan van der Ster daniel.vanders...@cern.ch wrote: Hi, Sage Weil wrote: * *Primary affinity*: Ceph now has the ability to skew selection of OSDs as the primary copy, which allows the read workload to be cheaply skewed away from parts of the cluster
