[ceph-users] Multi Level Tiering

2014-09-19 Thread Nick Fisk
Hi, I'm just wondering if it's possible (or planned to be implemented) to configure more than 2 levels of tiering? I'm thinking SSD pool -> normal replica pool -> EC pool. We are looking at building a cluster to hold backups of VM's. As the retention copies are effectively static data

Re: [ceph-users] Multi Level Tiering

2014-09-20 Thread Nick Fisk
Excellent, thank you for your response. Sage Weil sweil@... writes: Eventually, yes, but right now only 2 levels are supported. There is a blueprint, see http://wiki.ceph.com/Planning/Blueprints/Emperor/osd%3A_tiering%3A_object_redirects sage

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-10-27 Thread Nick Fisk
runs, please let me know. Nick Nick Fisk Technical Support Engineer System Professional Ltd tel: 01825 83 mob: 07711377522 fax: 01825 830001 mail: nick.f...@sys-pro.co.uk web: www.sys-pro.co.uk IT SUPPORT SERVICES | VIRTUALISATION | STORAGE | BACKUP AND DR

Re: [ceph-users] What a maximum theoretical and practical capacity in ceph cluster?

2014-10-28 Thread Nick Fisk
I've been looking at various categories of disks and how the performance/reliability/cost varies. There seem to be 5 main categories (WD disks given as examples): Budget (WD Green - 5400RPM, no TLER), Desktop Drives (WD Blue - 5400/7200RPM, no TLER), NAS Drives (WD Red - 5400RPM, TLER), Enterprise Capacity

[ceph-users] Redundant Power Supplies

2014-10-30 Thread Nick Fisk
What are everyone's opinions on having redundant power supplies in your OSD nodes? One part of me says let Ceph do the redundancy and plan for the hardware to fail; the other side says that they are probably worth having as they lessen the chance of losing a whole node. Considering they can

Re: [ceph-users] prioritizing reads over writes

2014-10-31 Thread Nick Fisk
Hi Simon, Have you tried using the Deadline scheduler on the Linux nodes? The deadline scheduler prioritises reads over writes. I believe it tries to service all reads within 500ms whilst writes can be delayed up to 5s. I don't know the exact effect Ceph will have over the top of this, but

Re: [ceph-users] prioritizing reads over writes

2014-10-31 Thread Nick Fisk
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Xu (Simon) Chen Sent: 31 October 2014 19:51 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] prioritizing reads over writes I am already using deadline scheduler, with the default parameters: read_expire=500 write_expire

Re: [ceph-users] prioritizing reads over writes

2014-10-31 Thread Nick Fisk
October 2014 20:15 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] prioritizing reads over writes We have SSD journals, backend disks are actually on SSD-fronted bcache devices in writeback mode. The client VMs have rbd cache enabled too... -Simon On Fri, Oct 31

[ceph-users] RBD Diff based on Timestamp

2014-11-06 Thread Nick Fisk
I have been thinking about the implications of losing the snapshot chain on an RBD when doing export-diff / import-diff between two separate physical locations. As I understand it, in this scenario when you take the first snapshot again on the source, you would in effect end up copying the whole RBD

[ceph-users] Cache Tier Statistics

2014-11-08 Thread Nick Fisk
Hi, Does anyone know if there are any statistics available specific to the cache tier functionality? I'm thinking along the lines of cache hit ratios. Or should I be pulling out the read statistics for backing+cache pools and assuming that if a read happens from the backing pool it was a miss and

Re: [ceph-users] Cache Tier Statistics

2014-11-10 Thread Nick Fisk
-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jean-Charles Lopez Sent: 09 November 2014 01:43 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Cache Tier Statistics Hi Nick If my brain doesn't fail me you can try ceph daemon osd.{id} perf dump ceph report

[ceph-users] Stackforge Puppet Module

2014-11-11 Thread Nick Fisk
Hi, I'm just looking through the different methods of deploying Ceph and I particularly liked the idea the stackforge puppet module advertises of using discovery to automatically add new disks. I understand the principle of how it should work; using ceph-disk list to find unknown disks, but I

Re: [ceph-users] Stackforge Puppet Module

2014-11-12 Thread Nick Fisk
:05 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Stackforge Puppet Module Hi Nick, The great thing about puppet-ceph's implementation on Stackforge is that it is both unit and integration tested. You can see the integration tests here: https://github.com/ceph/puppet-ceph

Re: [ceph-users] Stackforge Puppet Module

2014-11-13 Thread Nick Fisk
Sent: 12 November 2014 14:25 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Stackforge Puppet Module What comes to mind is that you need to make sure that you've cloned the git repository to /etc/puppet/modules/ceph and not /etc/puppet/modules/puppet-ceph. Feel free to hop

Re: [ceph-users] Ceph Monitoring with check_MK

2014-11-14 Thread Nick Fisk
Hi Robert, I've just been testing your ceph check and I have made a small modification to allow it to adjust itself to suit the autoscaling of the units Ceph outputs. Here is the relevant section I have modified:- if line[1] == 'TB': used = saveint(line[0]) * 1099511627776

Re: [ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-18 Thread Nick Fisk
Has anyone tried applying this fix to see if it makes any difference? https://github.com/ceph/ceph/pull/2374 I might be in a position in a few days to build a test cluster to test myself, but was wondering if anyone else has had any luck with it? Nick -Original Message- From:

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-18 Thread Nick Fisk
Hi David, Have you tried on a normal replicated pool with no cache? I've seen a number of threads recently where caching is causing various things to block/hang. It would be interesting to see if this still happens without the caching layer, at least it would rule it out. Also is there any sign

Re: [ceph-users] Stackforge Puppet Module

2014-11-18 Thread Nick Fisk
14:25 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Stackforge Puppet Module What comes to mind is that you need to make sure that you've cloned the git repository to /etc/puppet/modules/ceph and not /etc/puppet/modules/puppet-ceph. Feel free to hop on IRC to discuss about

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-20 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David Moreau Simard Sent: 19 November 2014 10:48 To: Ramakrishna Nishtala (rnishtal) Cc: ceph-users@lists.ceph.com; Nick Fisk Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target Rama, Thanks

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-11-20 Thread Nick Fisk
emit } -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David Moreau Simard Sent: 20 November 2014 20:03 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target Nick, Can you share more

Re: [ceph-users] evaluating Ceph

2014-11-25 Thread Nick Fisk
The two numbers (ints) are meant to be the ids of the pools you have created for data and metadata. Assuming you have already created the pools, run ceph osd lspools and use the numbers from there to create the FS. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On
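For illustration, a minimal sketch of that lookup and the older id-based filesystem creation used on the releases discussed in this thread (the pool ids 4 and 3 are placeholders; verify the exact syntax for your version):

    # list pools with their numeric ids, e.g. "3 data" and "4 metadata"
    ceph osd lspools
    # older pre-"ceph fs new" form: metadata pool id first, then data pool id
    ceph mds newfs 4 3 --yes-i-really-mean-it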

Re: [ceph-users] Ceph as backend for 2012 Hyper-v?

2014-11-26 Thread Nick Fisk
Hi Jay, The way I would do it until Ceph supports HA iSCSI (see blueprint) would be to configure a Ceph cluster as normal and then create RBD’s for your block storage. I would then map these RBD’s on some “proxy” servers; these would be running in an HA cluster with resource agents for

Re: [ceph-users] Suitable SSDs for journal

2014-12-04 Thread Nick Fisk
Hi Eneko, There has been various discussions on the list previously as to the best SSD for Journal use. All of them have pretty much come to the conclusion that the Intel S3700 models are the best suited and in fact work out the cheapest in terms of write durability. Nick -Original

[ceph-users] Erasure Encoding Chunks

2014-12-05 Thread Nick Fisk
Hi All, Does anybody have any input on what ratio and total number of data + coding chunks you would choose? For example I could create a pool with 7 data chunks and 3 coding chunks and get an efficiency of 70%, or I could create a pool with 17 data chunks and 3 coding chunks and
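The quoted efficiencies follow directly from k/(k+m):

    usable fraction = k / (k + m)
    k=7,  m=3  ->   7/10 = 70%
    k=17, m=3  ->  17/20 = 85%

The larger profile is more space-efficient but touches far more OSDs per object and takes longer to recover.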

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Nick Fisk
This is probably due to the Kernel RBD client not being recent enough. Have you tried upgrading your kernel to a newer version? 3.16 should contain all the relevant features required by Giant. -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of

Re: [ceph-users] Giant or Firefly for production

2014-12-05 Thread Nick Fisk
should be released early next year. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Antonio Messina Sent: 05 December 2014 15:38 To: Nick Fisk Cc: ceph-users@lists.ceph.com; Antonio Messina Subject: Re: [ceph-users] Giant or Firefly

Re: [ceph-users] Erasure Encoding Chunks

2014-12-06 Thread Nick Fisk
Sent: 05 December 2014 17:28 To: Nick Fisk; 'Ceph Users' Subject: Re: [ceph-users] Erasure Encoding Chunks On 05/12/2014 17:41, Nick Fisk wrote: Hi Loic, Thanks for your response. The idea for this cluster will be for our VM Replica storage in our secondary site. Initially we

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-12-06 Thread Nick Fisk
-boun...@lists.ceph.com] On Behalf Of David Moreau Simard Sent: 05 December 2014 16:03 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Poor RBD performance as LIO iSCSI target I've flushed everything - data, pools, configs and reconfigured the whole thing. I was particularly

Re: [ceph-users] Poor RBD performance as LIO iSCSI target

2014-12-08 Thread Nick Fisk
? Or is it more about a capacity thing ? Perhaps if someone else can chime in, I'm really curious. -- David Moreau Simard On Dec 6, 2014, at 11:18 AM, Nick Fisk n...@fisk.me.uk wrote: Hi David, Very strange, but I'm glad you managed to finally get the cluster working normally. Thank you for posting

Re: [ceph-users] Number of SSD for OSD journal

2014-12-15 Thread Nick Fisk
Hi Florent, Journals don’t need to be very big; 5-10GB per OSD would normally be ample. The key is that you get an SSD with high write endurance, which makes the Intel S3700 drives perfect for journal use. In terms of how many OSD’s you can run per SSD, it depends purely on how important
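As a rough sketch of how that sizing is normally expressed (the value is illustrative, not from the thread), the journal size is given in MB in ceph.conf and the SSD is simply carved into one such partition per OSD:

    [osd]
    osd journal size = 10240    # 10GB journal per OSD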

Re: [ceph-users] Improving Performance with more OSD's?

2014-12-28 Thread Nick Fisk
Hi Lindsay, Ceph is really designed to scale across large numbers of OSD's, and whilst it will still function with only 2 OSD's, I wouldn't expect it to perform as well as a RAID 1 mirror with Battery Backed Cache. I wouldn't recommend running the OSD's on USB, although it should work

Re: [ceph-users] Improving Performance with more OSD's?

2014-12-29 Thread Nick Fisk
: 29 December 2014 22:24 To: Nick Fisk Cc: ceph-us...@ceph.com Subject: Re: [ceph-users] Improving Performance with more OSD's? On Sun, 28 Dec 2014 04:08:03 PM Nick Fisk wrote: If you can't add another full host, your best bet would be to add another 2-3 disks to each server. This should give

Re: [ceph-users] Block and NAS Services for Non Linux OS

2014-12-30 Thread Nick Fisk
I'm working on something very similar at the moment to present RBD's to ESXi Hosts. I'm going to run 2 or 3 VM's on the local ESXi storage to act as iSCSI proxy nodes. They will run a pacemaker HA setup with the RBD and LIO iSCSI resource agents to provide a failover iSCSI target which maps back

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-23 Thread Nick Fisk
until the Redhat patches make their way into the kernel? From: Jake Young [mailto:jak3...@gmail.com] Sent: 23 January 2015 16:46 To: Zoltan Arnold Nagy Cc: Nick Fisk; ceph-users Subject: Re: [ceph-users] Ceph, LIO, VMWARE anyone? Thanks for the feedback Nick and Zoltan, I have been

Re: [ceph-users] Ceph Supermicro hardware recommendation

2015-02-03 Thread Nick Fisk
Hi, Just a couple of points, you might want to see if you can get a Xeon v3 board+CPU as they have more performance and use less power. You can also get a SM 2U chassis which has 2x 2.5” disk slots at the rear, this would allow you to have an extra 2x 3.5” disks in the front of the

[ceph-users] Cache Settings

2015-02-07 Thread Nick Fisk
Hi All, Time for a little Saturday evening Ceph-related quiz. This documentation page http://ceph.com/docs/master/rados/operations/cache-tiering/ seems to indicate that you can either flush/evict using relative sizing (cache_target_dirty_ratio) or absolute sizing (target_max_bytes).
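For reference, both kinds of threshold are set per cache pool, and the ratios are evaluated against the absolute target; a minimal sketch (the pool name 'hot-pool' and the values are placeholders):

    # flush dirty objects once they exceed 40% of the pool's target size
    ceph osd pool set hot-pool cache_target_dirty_ratio 0.4
    # evict clean objects above 80% of the target size
    ceph osd pool set hot-pool cache_target_full_ratio 0.8
    # the absolute ceiling the ratios are calculated against
    ceph osd pool set hot-pool target_max_bytes 1099511627776    # 1TiB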

Re: [ceph-users] Cache Settings

2015-02-07 Thread Nick Fisk
: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sage Weil Sent: 07 February 2015 20:57 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Cache Settings On Sat, 7 Feb 2015, Nick Fisk wrote: Hi All, Time for a little Saturday evening Ceph related quiz

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Nick Fisk
/eliminate a lot of the troubles I have had with resources failing over. Nick From: Jake Young [mailto:jak3...@gmail.com] Sent: 14 January 2015 12:50 To: Nick Fisk Cc: Giuseppe Civitella; ceph-users Subject: Re: [ceph-users] Ceph, LIO, VMWARE anyone? Nick, Where did you read that having

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-28 Thread Nick Fisk
that helps Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mike Christie Sent: 28 January 2015 03:06 To: Zoltan Arnold Nagy; Jake Young Cc: Nick Fisk; ceph-users Subject: Re: [ceph-users] Ceph, LIO, VMWARE anyone? Oh yeah, I am not completely sure

Re: [ceph-users] Improving Performance with more OSD's?

2015-01-05 Thread Nick Fisk
Hi Udo, Lindsay did this for performance reasons so that the data is spread evenly over the disks; I believe it has been accepted that the remaining 2TB on the 3TB disks will not be used. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of

Re: [ceph-users] Improving Performance with more OSD's?

2015-01-05 Thread Nick Fisk
-users-boun...@lists.ceph.com] On Behalf Of Lindsay Mathieson Sent: 05 January 2015 12:35 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Improving Performance with more OSD's? On Mon, 5 Jan 2015 09:21:16 AM Nick Fisk wrote: Lindsay did this for performance reasons so that the data is spread

[ceph-users] Erasure Encoding Chunks Number of Hosts

2015-01-05 Thread Nick Fisk
Hi All, Would anybody have an idea a) if it's possible and b) if it's a good idea to have more EC chunks than the total number of hosts? For instance if I wanted to have k=6, m=2, but only across 4 hosts, and I wanted to be able to withstand 1 host failure and 1 disk failure (any host),

Re: [ceph-users] Erasure Encoding Chunks Number of Hosts

2015-01-06 Thread Nick Fisk
: 05 January 2015 17:38 To: Nick Fisk; ceph-us...@ceph.com Subject: Re: [ceph-users] Erasure Encoding Chunks Number of Hosts Hi Nick, What about subdividing your hosts using containers ? For instance four container per host on your four hosts which gives you 16 hosts. When you add more hosts you

Re: [ceph-users] Erasure coded PGs incomplete

2015-01-09 Thread Nick Fisk
Hi Italo, If you check for a post from me from a couple of days back, I have done exactly this. I created a k=5 m=3 pool over 4 hosts. This ensured that I could lose a whole host and then an OSD on another host and the cluster was still fully operational. I'm not sure if the method I used
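A hedged sketch (not the poster's actual rule) of the kind of CRUSH rule that places the 8 chunks of a k=5, m=3 profile as two OSDs on each of 4 hosts, so a whole host plus one further OSD can be lost:

    rule ec-k5m3 {
            ruleset 1
            type erasure
            min_size 8
            max_size 8
            step take default
            step choose indep 4 type host
            step chooseleaf indep 2 type osd
            step emit
    }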

Re: [ceph-users] Ceph, LIO, VMWARE anyone?

2015-01-14 Thread Nick Fisk
Hi Giuseppe, I am working on something very similar at the moment. I currently have it working on some test hardware and it seems to be working reasonably well. I say reasonably as I have had a few instabilities, but these are on the HA side; the LIO and RBD side of things have been rock

Re: [ceph-users] ISCSI LIO hang after 2-3 days of working

2015-02-10 Thread Nick Fisk
Hi Mike, I also seem to be able to reproduce this behaviour. If I shut down a Ceph node, the delay while Ceph works out that the OSD's are down seems to trigger similar error messages. It seems fairly reliable that if an OSD is down for more than 10 seconds, LIO will have this problem. Below is an

Re: [ceph-users] combined ceph roles

2015-02-11 Thread Nick Fisk
Hi David, I have had a few weird issues when shutting down a node, although I can replicate it by doing a “stop ceph-all” as well. It seems that OSD failure detection takes a lot longer when a monitor goes down at the same time; sometimes I have seen the whole cluster grind to a halt for

Re: [ceph-users] Erasure Encoding Chunks Number of Hosts

2015-01-06 Thread Nick Fisk
Of Nick Fisk Sent: 06 January 2015 07:43 To: 'Loic Dachary'; ceph-us...@ceph.com Subject: Re: [ceph-users] Erasure Encoding Chunks Number of Hosts Hi Loic, That's an interesting idea, I suppose the same could probably be achieved by just creating more Crush Host Buckets for each actual host

Re: [ceph-users] Cache Tier Flush = immediate base tier journal sync?

2015-03-18 Thread Nick Fisk
Hi Greg, Thanks for your input, and I completely agree that we cannot expect developers to fully document what impact each setting has on a cluster, particularly in a performance-related way. That said, if you or others could spare some time for a few pointers it would be much appreciated and I will

Re: [ceph-users] Cache Tier Flush = immediate base tier journal sync?

2015-03-16 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Gregory Farnum Sent: 16 March 2015 17:33 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Cache Tier Flush = immediate base tier journal sync? On Wed, Mar 11

Re: [ceph-users] Terrible iSCSI tgt RBD performance

2015-03-17 Thread Nick Fisk
Hi Robin, Just a few things to try: 1. Increase the number of worker threads for tgt (it's a parameter of tgtd, so modify however it's being started) 2. Disable librbd caching in ceph.conf 3. Do you see the same performance problems exporting a krbd as a block device via tgt? Nick
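On point 2, a minimal sketch of turning off the librbd client-side cache (the generic [client] section is shown; scope it more narrowly if needed):

    [client]
    rbd cache = false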

Re: [ceph-users] tgt and krbd

2015-03-17 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Mike Christie Sent: 17 March 2015 21:27 To: Nick Fisk; 'Jake Young' Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] tgt and krbd On 03/15/2015 08:42 PM, Mike Christie wrote

Re: [ceph-users] Cache Tier Flush = immediate base tier journal sync?

2015-03-19 Thread Nick Fisk
I think this could be part of what I am seeing. I found this post from back in 2013 http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/12083 which seems to describe a workaround for the behaviour I am seeing. The constant small block IO I was seeing looks like it was either

Re: [ceph-users] PGs issue

2015-03-19 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Bogdan SOLGA Sent: 19 March 2015 20:51 To: ceph-users@lists.ceph.com Subject: [ceph-users] PGs issue Hello, everyone! I have created a Ceph cluster (v0.87.1-1) using the info from the

[ceph-users] OSD + Flashcache + udev + Partition uuid

2015-03-19 Thread Nick Fisk
I'm looking at trialling OSD's with a small flashcache device over them to hopefully reduce the impact of metadata updates when doing small block io. Inspiration from here:- http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/12083 One thing I suspect will happen, is that when the OSD

Re: [ceph-users] PGs issue

2015-03-20 Thread Nick Fisk
I see the problem: as your OSD's are only 8GB they have a zero weight. I think the minimum size you can get away with in Ceph is 10GB, as the weight is based on the size in TB and only has 2 decimal places. As a workaround, try running: ceph osd crush reweight osd.X 1 for each OSD; this will
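A quick sketch of applying that workaround to every OSD in one go (assumes a small test cluster where a uniform nominal weight of 1 is acceptable):

    # give each zero-weight OSD a CRUSH weight of 1
    for id in $(ceph osd ls); do
        ceph osd crush reweight osd.$id 1
    done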

Re: [ceph-users] OSD + Flashcache + udev + Partition uuid

2015-03-20 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Burkhard Linke Sent: 20 March 2015 09:09 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] OSD + Flashcache + udev + Partition uuid Hi, On 03/19/2015 10:41 PM, Nick Fisk wrote

[ceph-users] Disk serial number from OSD

2015-03-09 Thread Nick Fisk
Hi All, I just created this little bash script to retrieve the /dev/disk/by-id string for each OSD on a host. Our disks are internally mounted so have no concept of drive bays; this should make it easier to work out which disk has failed. #!/bin/bash DISKS=`ceph-disk list | grep "ceph data"`
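The preview cuts the script off; a hedged reconstruction of the same idea (not the original script) might look like:

    #!/bin/bash
    # map each "ceph data" partition to the stable by-id names of its parent disk
    ceph-disk list | grep "ceph data" | awk '{print $1}' | while read part; do
        disk=$(basename "$part" | sed 's/[0-9]*$//')   # e.g. sdb1 -> sdb
        echo "== $part =="
        ls -l /dev/disk/by-id/ | grep -e "-> \.\./\.\./$disk\$"
    done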

Re: [ceph-users] tgt and krbd

2015-03-09 Thread Nick Fisk
Hi Mike, I was using bs_aio with the krbd and still saw a small caching effect. I'm not sure if it was on the ESXi or tgt/krbd page cache side, but I was definitely seeing the IO's being coalesced into larger ones on the krbd device in iostat. Either way, it would make me potentially nervous to

Re: [ceph-users] EC Pool and Cache Tier Tuning

2015-03-09 Thread Nick Fisk
will be a lot faster using a cache tier if the data resides in it. -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Steffen Winther Sent: 09 March 2015 20:47 To: ceph-users@lists.ceph.com Subject: Re: [ceph-users] EC Pool and Cache Tier Tuning Nick

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-03-09 Thread Nick Fisk
: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of mad Engineer Sent: 09 March 2015 17:23 To: Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel Thank you Nick for explaining the problem with 4k writes.Queue

[ceph-users] Cache Tier Flush = immediate base tier journal sync?

2015-03-11 Thread Nick Fisk
I'm not sure if it's something I'm doing wrong or just experiencing an oddity, but when my cache tier flushes dirty blocks out to the base tier, the writes seem to hit the OSD's straight away instead of coalescing in the journals, is this correct? For example if I create a RBD on a standard 3 way

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jake Young Sent: 06 March 2015 12:52 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] tgt and krbd On Thursday, March 5, 2015, Nick Fisk n...@fisk.me.uk wrote: Hi All, Just a heads up after

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Nick Fisk
. Nick From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Jake Young Sent: 06 March 2015 15:07 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] tgt and krbd My initator is also VMware software iscsi. I had my tgt iscsi targets' write-cache

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
Just tried cfq, deadline and noop which more or less all show identical results -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER Sent: 06 March 2015 11:59 To: Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Strange krbd

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
Dryomov Sent: 06 March 2015 15:09 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Strange krbd behaviour with queue depths On Thu, Mar 5, 2015 at 8:17 PM, Nick Fisk n...@fisk.me.uk wrote: I’m seeing a strange queue depth behaviour with a kernel mapped RBD, librbd does

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
:10 AM, Nick Fisk wrote: Just tried cfq, deadline and noop which more or less all show identical results -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER Sent: 06 March 2015 11:59 To: Nick Fisk Cc: ceph-users Subject

Re: [ceph-users] tgt and krbd

2015-03-06 Thread Nick Fisk
Hi Jake, Good to see it’s not just me. I’m guessing that the fact you are doing 1MB writes means that the latency difference is having a less noticeable impact on the overall write bandwidth. What I have been discovering with Ceph + iSCSi is that due to all the extra hops (client-iscsi

Re: [ceph-users] Strange krbd behaviour with queue depths

2015-03-06 Thread Nick Fisk
: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Somnath Roy Sent: 06 March 2015 16:02 To: Alexandre DERUMIER; Nick Fisk Cc: ceph-users Subject: Re: [ceph-users] Strange krbd behaviour with queue depths Nick, I think this is because of the krbd you are using is using

[ceph-users] EC Pool and Cache Tier Tuning

2015-03-07 Thread Nick Fisk
Hi All, I have been experimenting with EC pools and Cache Tiers to make them more useful for more active data sets on RBD volumes and I thought I would share my findings so far as they have made quite a significant difference. My Ceph cluster comprises of 4 Nodes each with the following:-

Re: [ceph-users] Extreme slowness in SSD cluster with 3 nodes and 9 OSD with 3.16-3 kernel

2015-03-07 Thread Nick Fisk
You are hitting serial latency limits. For a 4kb sync write to happen it has to: 1. Travel across the network from the client to the primary OSD 2. Be processed by Ceph 3. Get written to the primary OSD 4. Ack travels across the network to the client. At 4kb these 4 steps take up a very high percentage of the actual
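As a hedged worked example of why those steps dominate at small block sizes (round illustrative figures, not measurements from the thread):

    per-IO latency   ≈ 0.5 ms network round trips + 0.5 ms Ceph processing and journal write ≈ 1 ms
    single-threaded  ≈ 1 IO per ms ≈ 1,000 IOPS
    at 4 KB per IO   ≈ 1,000 x 4 KB ≈ 4 MB/s, however fast the SSDs underneath are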

Re: [ceph-users] Firefly Tiering

2015-03-11 Thread Nick Fisk
Hi Stefan, If the majority of your hot data fits on the cache tier you will see quite a marked improvement in read performance and similar write performance (assuming you would have had your hdds backed by SSD journals). However for data that is not in the cache tier you will get 10-20% less

Re: [ceph-users] Firefly Tiering

2015-03-11 Thread Nick Fisk
On 11.03.2015 at 11:17 Nick Fisk wrote: Hi Nick, On 11.03.2015 at 10:52 Nick Fisk wrote: Hi Stefan, If the majority of your hot data fits on the cache tier you will see quite a marked improvement in read performance I don't have reads ;-) just around 5%. 95% are writes

Re: [ceph-users] Many Reads of an object

2015-03-13 Thread Nick Fisk
Hi Alexander, Assuming the images would fit in the page cache of all your OSD nodes, you would see a massive performance increase as reads would be coming straight from RAM. But otherwise no, reads are not balanced across replicas; only the primary one responds to reads. But don't forget an RBD

Re: [ceph-users] running Qemu / Hypervisor AND Ceph on the same nodes

2015-03-29 Thread Nick Fisk
There's probably a middle ground where you get the best of both worlds. Maybe 2-4 OSD's per compute node alongside dedicated Ceph nodes. That way you get a bit of extra storage and can still use lower end CPU's, but don't have to worry so much about resource contention. -Original

Re: [ceph-users] Persistent Write Back Cache

2015-03-04 Thread Nick Fisk
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John Spray Sent: 04 March 2015 11:34 To: Nick Fisk; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Persistent Write Back Cache On 04/03/2015 08:26, Nick Fisk wrote: To illustrate the difference

[ceph-users] tgt and krbd

2015-03-05 Thread Nick Fisk
Hi All, Just a heads up after a day's experimentation. I believe tgt with its default settings has a small write cache when exporting a kernel mapped RBD. Doing some write tests I saw 4 times the write throughput when using tgt aio + krbd compared to tgt with the builtin librbd. After
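A hedged sketch of the tgt targets.conf form being compared here (the IQN and device path are placeholders), exporting an already-mapped krbd device through the aio backing store rather than the built-in librbd one:

    <target iqn.2015-03.com.example:rbd-test>
        # kernel-mapped RBD exported as a plain block device
        backing-store /dev/rbd/rbd/test-image
        bs-type aio
        write-cache off
    </target>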

Re: [ceph-users] Persistent Write Back Cache

2015-03-04 Thread Nick Fisk
I can understand that this feature would prove more of a challenge if you are using Qemu and RBD. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Christian Balzer Sent: 04 March 2015 08:40 To: ceph-users@lists.ceph.com Cc: Nick Fisk Subject

[ceph-users] Booting from journal devices

2015-02-28 Thread Nick Fisk
Hi All, Thought I would just share this in case someone finds it useful. I've just finished building our new Ceph cluster where the journals are installed on the same SSD's as the OS. The SSD's have MD RAID partitions for the OS and swap, and the rest of the SSD's are used for individual

[ceph-users] New Cluster - Any requests?

2015-02-28 Thread Nick Fisk
Hi All, I've just finished building a new POC cluster comprised of the following:- 4 Hosts in 1 chassis (http://www.supermicro.com/products/system/4U/F617/SYS-F617H6-FTPT_.cfm) each with the following:- 2x Xeon 2620 v2 (2.1GHz) 32GB RAM 2x Onboard 10GB-T into 10GB switches 10x 3TB

Re: [ceph-users] Booting from journal devices

2015-03-01 Thread Nick Fisk
cursor, which suggests to me ceph-deploy overwrote the 1st sector of the disk, where grub normally resides. Nick -Original Message- From: Christian Balzer [mailto:ch...@gol.com] Sent: 01 March 2015 03:44 To: ceph-users@lists.ceph.com Cc: Nick Fisk Subject: Re: [ceph-users] Booting from

Re: [ceph-users] stuck ceph-deploy mon create-initial / giant

2015-02-23 Thread Nick Fisk
Hi Stephan, I've just had a very similar problem today. It turned out that the problem was the mgmt network which I use to manage the nodes is different to the public network. I had created host entries resolving to the mgmt network ip's, so when initial create was running it was trying to

Re: [ceph-users] OSD + Flashcache + udev + Partition uuid

2015-03-23 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Brendan Moloney Sent: 23 March 2015 21:02 To: Noah Mehl Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] OSD + Flashcache + udev + Partition uuid This would be in addition to

Re: [ceph-users] OSD + Flashcache + udev + Partition uuid

2015-03-23 Thread Nick Fisk
Just to add, the main reason it seems to make a difference is the metadata updates which lie on the actual OSD. When you are doing small block writes, these metadata updates seem to take almost as long as the actual data, so although the writes are getting coalesced, the actual performance isn't

Re: [ceph-users] Cores/Memory/GHz recommendation for SSD based OSD servers

2015-04-02 Thread Nick Fisk
I'm probably going to get shot down for saying this...but here goes. As a very rough guide, think of it more as you need around 10MHz for every IO; whether that IO is 4k or 4MB it uses roughly the same amount of CPU, as most of the CPU usage is around Ceph data placement rather than the actual
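Taking that rule of thumb at face value (it is only a rough guide), the dual E5-2620 v2 nodes mentioned elsewhere in these threads would work out to roughly:

    2 CPUs x 6 cores x 2,100 MHz = 25,200 MHz per node
    25,200 MHz / 10 MHz per IO   ≈ 2,500 IOPS per node (order-of-magnitude only)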

Re: [ceph-users] Cores/Memory/GHz recommendation for SSD based OSD servers

2015-04-02 Thread Nick Fisk
On Thursday, April 2, 2015, Nick Fisk n...@fisk.me.uk wrote: I'm probably going to get shot down for saying this...but here goes. As a very rough guide, think of it more as you need around 10Mhz for every IO, whether that IO is 4k or 4MB it uses roughly the same amount of CPU, as most

Re: [ceph-users] long blocking with writes on rbds

2015-04-23 Thread Nick Fisk
Hi Jeff, I believe these are normal; they are just idle connections to the OSD's timing out because no traffic has flowed recently. They are probably a symptom rather than a cause. Nick -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of

Re: [ceph-users] read performance VS network usage

2015-04-23 Thread Nick Fisk
Hi Frederic, If you are using EC pools, the primary OSD requests the remaining shards of the object from the other OSD's, reassembles it and then sends the data to the client. The entire object needs to be reconstructed even for a small IO operation, so 4kb reads could lead to quite a large IO
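A hedged worked example of that amplification (the object size and profile are illustrative):

    4 MB object, k=6, m=2  ->  each data shard ≈ 4 MB / 6 ≈ 683 KB
    one 4 KB client read   ->  primary gathers k shards ≈ 6 x 683 KB ≈ 4 MB of backend IO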

Re: [ceph-users] Serving multiple applications with a single cluster

2015-04-23 Thread Nick Fisk
Hi Rafael, Do you require a shared FS for these applications, or would a block device with a traditional filesystem be suitable? If so, then you could create separate pools with an RBD block device in each. Just out of interest, what is the reason for separation, security or performance? Nick
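A minimal sketch of that separation (pool names, PG counts, sizes and client names are placeholders), which also allows per-application cephx keys for isolation:

    # one pool per application, each with its own RBD image and key
    ceph osd pool create app1 128
    rbd create app1-vol --pool app1 --size 102400    # size in MB on these releases
    ceph auth get-or-create client.app1 mon 'allow r' osd 'allow rwx pool=app1'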

Re: [ceph-users] Having trouble getting good performance

2015-04-23 Thread Nick Fisk
-Original Message- From: jdavidli...@gmail.com [mailto:jdavidli...@gmail.com] On Behalf Of J David Sent: 23 April 2015 21:22 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Having trouble getting good performance On Thu, Apr 23, 2015 at 3:05 PM, Nick

Re: [ceph-users] Having trouble getting good performance

2015-04-23 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of J David Sent: 23 April 2015 20:19 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Having trouble getting good performance On Thu, Apr 23, 2015 at 3:05 PM, Nick

Re: [ceph-users] Serving multiple applications with a single cluster

2015-04-23 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Rafael Coninck Teigão Sent: 23 April 2015 22:35 To: Nick Fisk; ceph-users@lists.ceph.com Subject: Re: [ceph-users] Serving multiple applications with a single cluster Hi Nick, Thanks

Re: [ceph-users] Having trouble getting good performance

2015-04-23 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of J David Sent: 23 April 2015 17:51 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Having trouble getting good performance On Wed, Apr 22, 2015 at 4:30 PM, Nick

Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC

2015-04-20 Thread Nick Fisk
: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Christian Eichelmann Sent: 20 April 2015 14:41 To: Nick Fisk; ceph-users@lists.ceph.com Subject: Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC I'm using xfs on the rbd disks. They are between 1 and 10TB in size

Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC

2015-04-20 Thread Nick Fisk
Ah ok, good point. What FS are you using on the RBD? -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Christian Eichelmann Sent: 20 April 2015 13:16 To: Nick Fisk; ceph-users@lists.ceph.com Subject: Re: [ceph-users] 100% IO Wait with CEPH

Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC

2015-04-20 Thread Nick Fisk
Hi Christian, A very non-technical answer but as the problem seems related to the RBD client it might be worth trying the latest Kernel if possible. The RBD client is Kernel based and so there may be a fix which might stop this from happening. Nick -Original Message- From: ceph-users

Re: [ceph-users] Having trouble getting good performance

2015-04-24 Thread Nick Fisk
Hi David, Thanks for posting those results. From the fio runs, I see you are getting around 200 iops at a 128kb write IO size. I would imagine you should be getting somewhere around 200-300 iops for the cluster you described in the initial post, so it looks like it's performing about right. 200 iops

Re: [ceph-users] Having trouble getting good performance

2015-04-24 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of J David Sent: 24 April 2015 15:40 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Having trouble getting good performance On Fri, Apr 24, 2015 at 6:39 AM, Nick

Re: [ceph-users] Having trouble getting good performance

2015-04-22 Thread Nick Fisk
Hi David, I suspect you are hitting problems with sync writes, which Ceph isn't known for being the fastest at. I'm not a big expert on ZFS, but I do know that an SSD ZIL is normally recommended to allow fast sync writes. If you don't have this you are waiting on Ceph to ack the write

Re: [ceph-users] Having trouble getting good performance

2015-04-24 Thread Nick Fisk
-Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of J David Sent: 24 April 2015 18:41 To: Nick Fisk Cc: ceph-users@lists.ceph.com Subject: Re: [ceph-users] Having trouble getting good performance On Fri, Apr 24, 2015 at 10:58 AM, Nick Fisk
