Re: [ceph-users] IO wait spike in VM

2014-09-29 Thread Bécholey Alexandre
Hello Quenten, Thanks for your reply. We have a 5 GB journal for each OSD on the same disk. Right now, we are migrating our OSDs to XFS and we'll add a 5th monitor. We will perform the benchmarks afterwards. Cheers, Alexandre -Original Message- From: Quenten Grasso
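For context, a quick way to confirm where each OSD's journal lives and how big it is; a minimal sketch, assuming the default data path and using osd.0 as an example:

    # The journal is normally a symlink inside the OSD data directory
    ls -l /var/lib/ceph/osd/ceph-0/journal
    # Size in bytes of the device/partition it points at
    blockdev --getsize64 "$(readlink -f /var/lib/ceph/osd/ceph-0/journal)"
    # Configured journal size (in MB) as seen by the running OSD
    ceph daemon osd.0 config show | grep journal_size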

Re: [ceph-users] what does osd's ms_objecter do? and who will connect it?

2014-09-29 Thread yuelongguang
Thanks, Sage Weil. Writing an fs is a serious matter; we should make it clear, including coding style. There are other places we should fix. Thanks. At 2014-09-29 12:10:52, Sage Weil sw...@redhat.com wrote: On Mon, 29 Sep 2014, yuelongguang wrote: hi, Sage Weil, 1. you mean if i use cache

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi Wido, On 26 Sep 2014, at 23:14, Wido den Hollander w...@42on.com wrote: On 26-09-14 17:16, Dan Van Der Ster wrote: Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? Suppose you have 5 spinning disks
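For readers following the thread, a rough sketch of the usual remove-and-recreate sequence when a data disk dies but its journal partition on the shared SSD survives; the OSD id, device names and journal partition below are illustrative assumptions, and exact ceph-disk behaviour with a pre-existing journal partition varies by version:

    # Remove the dead OSD from the cluster (assume it was osd.12)
    ceph osd out 12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12
    # Recreate on the replacement disk, reusing the surviving journal partition
    ceph-disk prepare /dev/sdd /dev/sdb3
    ceph-disk activate /dev/sdd1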

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Daniel Swarbrick
On 26/09/14 17:16, Dan Van Der Ster wrote: Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? I’m just curious, for such a routine operation, what are most admins doing in this case? I think ceph-osd is

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi, On 29 Sep 2014, at 10:01, Daniel Swarbrick daniel.swarbr...@profitbricks.com wrote: On 26/09/14 17:16, Dan Van Der Ster wrote: Hi, Apologies for this trivial question, but what is the correct procedure to replace a failed OSD that uses a shared journal device? I’m just curious,

[ceph-users] SSD MTBF

2014-09-29 Thread Emmanuel Lacour
Dear ceph users, we have been managing Ceph clusters for a year now. Our setup is typically made of Supermicro servers with SATA OSD drives and journals on SSD. Those SSDs are all failing one after the other after one year :( We used the Samsung 850 Pro (120 GB) with two setups (small nodes with 2 SSDs, 2

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Owen Synge
Hi Dan, At least looking at upstream, getting journals and partitions to work persistently requires GPT partitions and being able to add a GPT partition UUID; with that it works perfectly with minimal modification. I am not sure of the status of this on RHEL6; the latest Fedora and openSUSE support this
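To make the GPT point concrete, a small check that a journal symlink refers to a stable partition UUID rather than a /dev/sdX name that can change across reboots; osd.0 is an example and this assumes udev has populated by-partuuid:

    # Stable identifiers udev creates for GPT partitions
    ls -l /dev/disk/by-partuuid/
    # The journal symlink should resolve to one of those, not to /dev/sdb3 directly
    readlink /var/lib/ceph/osd/ceph-0/journal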

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
Hi Owen, On 29 Sep 2014, at 10:33, Owen Synge osy...@suse.com wrote: Hi Dan, At least looking at upstream to get journals and partitions persistently working, this requires gpt partitions, and being able to add a GPT partition UUID to work perfectly with minimal modification. I am not

Re: [ceph-users] SSD MTBF

2014-09-29 Thread Christian Balzer
Hello, On Mon, 29 Sep 2014 10:31:03 +0200 Emmanuel Lacour wrote: Dear ceph users, we have been managing Ceph clusters for a year now. Our setup is typically made of Supermicro servers with SATA OSD drives and journals on SSD. Those SSDs are all failing one after the other after one year

Re: [ceph-users] SSD MTBF

2014-09-29 Thread Dan Van Der Ster
Hi Emmanuel, This is interesting, because we’ve had sales guys telling us that those Samsung drives are definitely the best for a Ceph journal O_o ! The conventional wisdom has been to use the Intel DC S3700 because of its massive durability. Anyway, I’m curious what the SMART counters say
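For anyone wanting to answer that question, wear-related SMART attributes can be read with smartctl; the device name below is an example, and the attribute names differ by vendor (e.g. Wear_Leveling_Count on Samsung, Media_Wearout_Indicator on Intel):

    # Dump all SMART attributes for the journal SSD
    smartctl -a /dev/sdb
    # Narrow down to the usual wear / bytes-written counters
    smartctl -a /dev/sdb | egrep -i 'wear|lbas_written|wearout'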

Re: [ceph-users] SSD MTBF

2014-09-29 Thread Emmanuel Lacour
On Mon, Sep 29, 2014 at 05:57:12PM +0900, Christian Balzer wrote: Given your SSDs, are they failing after more than 150 TB have been written? Between 30 and 40 TB ... though our statistics give 60 GB (option 2) to 100 GB (option 1) of writes per day on SSD, on a not really overloaded cluster.
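A back-of-the-envelope check of those numbers (a sketch that takes the quoted 100 GB/day at face value and ignores any write amplification inside the drive):

    # ~100 GB/day sustained for a year lands right in the 30-40 TB range at which
    # the drives died, well short of an endurance figure like 150 TB
    echo $(( 100 * 365 ))   # GB per year, i.e. about 36.5 TB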

Re: [ceph-users] SSD MTBF

2014-09-29 Thread Emmanuel Lacour
On Mon, Sep 29, 2014 at 08:58:38AM +, Dan Van Der Ster wrote: Hi Emmanuel, This is interesting, because we’ve had sales guys telling us that those Samsung drives are definitely the best for a Ceph journal O_o ! The conventional wisdom has been to use the Intel DC S3700 because of its

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Dan Van Der Ster
On 29 Sep 2014, at 10:47, Dan Van Der Ster daniel.vanders...@cern.ch wrote: Hi Owen, On 29 Sep 2014, at 10:33, Owen Synge osy...@suse.com wrote: Hi Dan, At least looking at upstream to get journals and partitions persistently working, this requires gpt partitions, and being able to

[ceph-users] high load on snap rollback

2014-09-29 Thread Stefan Priebe - Profihost AG
Hi, I saw the following commit in dumpling: commit b5dafe1c0f7ecf7c3a25d0be5dfddcbe3d07e69e Author: Sage Weil s...@redhat.com Date: Wed Jun 18 11:02:58 2014 -0700 osd: allow io priority to be set for the disk_tp. The disk_tp covers scrubbing, pg deletion, and snap trimming. I've
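The options added by that commit can be set in ceph.conf or injected into running OSDs; a minimal sketch with illustrative values (the idle class only takes effect when the disks use the CFQ scheduler):

    # De-prioritise scrub / snap-trim / pg-deletion I/O on all running OSDs
    ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'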

Re: [ceph-users] ceph osd replacement with shared journal device

2014-09-29 Thread Sage Weil
On Mon, 29 Sep 2014, Dan Van Der Ster wrote: Hi Owen, On 29 Sep 2014, at 10:33, Owen Synge osy...@suse.com wrote: Hi Dan, At least looking at upstream to get journals and partitions persistently working, this requires gpt partitions, and being able to add a GPT partition UUID to

Re: [ceph-users] IO wait spike in VM

2014-09-29 Thread Christian Balzer
On Mon, 29 Sep 2014 09:04:51 + Quenten Grasso wrote: Hi Alexandre, No problem, I hope this saves you some pain. It's probably worth going for a larger journal, probably around 20 GB, and if you wish to play with tuning, filestore max sync interval could have some interesting results.
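A sketch of the kind of experiment Quenten is describing, with placeholder values rather than recommendations:

    # filestore_max_sync_interval (seconds) can be injected into running OSDs for testing;
    # a larger journal -- e.g. "osd journal size = 20480" (MB) in ceph.conf -- only takes
    # effect for newly prepared OSDs, since the journal is created at prepare time.
    ceph tell osd.* injectargs '--filestore_max_sync_interval 30'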

Re: [ceph-users] Ceph Filesystem - Production?

2014-09-29 Thread James Devine
The issue hasn't popped up since I upgraded the kernel, so what I was experiencing seems to have been addressed. On Tue, Sep 9, 2014 at 12:13 PM, James Devine fxmul...@gmail.com wrote: The issue isn't so much mounting the ceph client as it is the mounted ceph client becoming unusable

[ceph-users] rbd command and kernel driver compatibility

2014-09-29 Thread Shawn Edwards
What are the limits on compatibility between versions of the 'rbd' command and versions of the rbd kernel driver? I have a project that requires running 'rbd' on machines that have a fairly new kernel (3.10) and really old versions of libc and other libs (based on CentOS 5.10). It would be really
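Two low-risk checks relevant to the question, with example pool/image names; the userspace CLI and the kernel module can be inspected independently:

    # What format/features does the image have? (pool/image names are examples)
    rbd info rbd/myimage
    # What rbd module does the running kernel carry?
    uname -r
    modinfo rbd | head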

[ceph-users] failed to sync object

2014-09-29 Thread Lyn Mitchell
Hello ceph users, We have a federated gateway configured to replicate between two zones. Replication seems to be working smoothly between the master and slave zone, however I have a recurring error in the replication log with the following info: INFO:radosgw_agent.worker:17573 is
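One way to narrow down a single object that radosgw-agent keeps reporting is to stat it in both zones and compare; bucket and object names below are placeholders, not taken from the thread:

    # Run against the master zone's gateway and then the slave's, and diff the output
    radosgw-admin object stat --bucket=mybucket --object=path/to/key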

Re: [ceph-users] failed to sync object

2014-09-29 Thread Yehuda Sadeh
On Mon, Sep 29, 2014 at 10:44 AM, Lyn Mitchell mitc...@bellsouth.net wrote: Hello ceph users, We have a federated gateway configured to replicate between two zones. Replication seems to be working smoothly between the master and slave zone, however I have a recurring error in the

Re: [ceph-users] dumpling fiemap

2014-09-29 Thread Haomai Wang
Sorry, it seems that I missed this. You can test it via ./ceph_test_librbd_fsx and by running an fsx test in a VM with the librbd backend. On Fri, Sep 26, 2014 at 4:07 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: Hi, On 26.09.2014 at 10:02, Haomai Wang wrote: If a user enables fiemap
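For anyone wanting to reproduce, a hedged sketch: fiemap is toggled by the filestore option below (off by default), the fsx exerciser ships with the Ceph test binaries, the pool/image names are examples, and the exact flags accepted may differ by build:

    # In ceph.conf, [osd] section:   filestore fiemap = true
    # Then exercise librbd with fsx against a scratch image
    ./ceph_test_librbd_fsx testpool testimage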