[ceph-users] ceph usage for very small objects

2019-12-26 Thread Adrian Nicolae
Hi all, I have a ceph cluster with 4+2 EC used as a secondary storage system for offloading big files from another storage system. Even though most of the files are big (at least 50MB), we also have some small objects - less than 4MB each. The current storage usage is 358TB of raw data and 237TB
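
(For context, and assuming the truncated 237TB figure above is the stored/user-data side: a 4+2 EC pool writes 6 chunks for every 4 data chunks, so raw usage should be roughly 1.5x the stored data, which matches the numbers quoted.)

    # k=4, m=2 erasure coding: raw = stored * (k+m)/k
    echo "scale=1; (4+2)/4" | bc        # 1.5x overhead
    echo "scale=1; 237 * 1.5" | bc      # ~355.5 TB raw for 237 TB stored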

Re: [ceph-users] Second radosgw install

2019-02-16 Thread Adrian Nicolae
. I'm not sure what happens with the existing system pools if I already have a working rgw server... Thanks. On 2/15/2019 6:35 PM, Adrian Nicolae wrote: Hi, I want to install a second radosgw to my existing ceph cluster (mimic) on another server. Should I create it like the first one

[ceph-users] Second radosgw install

2019-02-15 Thread Adrian Nicolae
Hi, I want to install a second radosgw to my existing ceph cluster (mimic) on another server. Should I create it like the first one, with 'ceph-deploy rgw create' ? I don't want to mess with the existing rgw system pools. Thanks.

Re: [ceph-users] v12.2.8 Luminous released

2018-09-05 Thread Adrian Saul
Can I confirm if this bluestore compression assert issue is resolved in 12.2.8? https://tracker.ceph.com/issues/23540 I notice that it has a backport that is listed against 12.2.8 but there is no mention of that issue or backport listed in the release notes. > -Original Message- >

Re: [ceph-users] SSDs for data drives

2018-07-12 Thread Adrian Saul
We started our cluster with consumer (Samsung EVO) disks and the write performance was pitiful, they had periodic spikes in latency (average of 8ms, but much higher spikes) and just did not perform anywhere near where we were expecting. When replaced with SM863 based devices the difference

[ceph-users] Different write pools for RGW objects

2018-07-09 Thread Adrian Nicolae
Hi, I was wondering if I can have different destination pools for the S3 objects uploaded to Ceph via RGW based on the object's size. For example: - smaller S3 objects (let's say smaller than 1MB) should go to a replicated pool - medium and big objects should go to an EC pool Is there

Re: [ceph-users] pg inconsistent, scrub stat mismatch on bytes

2018-06-06 Thread Adrian
ed the issue for now but the info.stats.stat_sum.num_bytes still differs so presumably will become inconsistent again next time it scrubs. Adrian. On Tue, Jun 5, 2018 at 12:09 PM, Adrian wrote: > Hi Cephers, > > We recently upgraded one of our clusters from hammer to jewel and then to

[ceph-users] pg inconsistent, scrub stat mismatch on bytes

2018-06-04 Thread Adrian
if this is ok to issue a pg repair on 6.20 or if there's something else we should be looking at first ? Thanks in advance, Adrian. --- Adrian : aussie...@gmail.com If violence doesn't solve your problem, you're not using enough of it.
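
(A minimal sketch of the commands under discussion, with pg 6.20 from the thread; deep-scrub and inspect the inconsistency before repairing, since repair on older releases trusts the primary copy.)

    ceph health detail | grep -i inconsistent
    rados list-inconsistent-obj 6.20 --format=json-pretty
    ceph pg deep-scrub 6.20
    ceph pg repair 6.20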

Re: [ceph-users] multi site with cephfs

2018-05-21 Thread Adrian Saul
with automount would probably work for you. From: Up Safe [mailto:upands...@gmail.com] Sent: Tuesday, 22 May 2018 12:33 AM To: David Turner <drakonst...@gmail.com> Cc: Adrian Saul <adrian.s...@tpgtelecom.com.au>; ceph-users <ceph-users@lists.ceph.com> Subject: Re: [ceph-users] multi site

Re: [ceph-users] multi site with cephfs

2018-05-21 Thread Adrian Saul
We run CephFS in a limited fashion in a stretched cluster of about 40km with redundant 10G fibre between sites – link latency is in the order of 1-2ms. Performance is reasonable for our usage but is noticeably slower than comparable local ceph based RBD shares. Essentially we just setup the

Re: [ceph-users] jewel to luminous upgrade, chooseleaf_vary_r and chooseleaf_stable

2018-05-15 Thread Adrian
Thanks Dan, After talking it through we've decided to adopt your approach too and leave the tunables till after the upgrade. Regards, Adrian. On Mon, May 14, 2018 at 5:14 PM, Dan van der Ster <d...@vanderster.com> wrote: > Hi Adrian, > > Is there a strict reason why you

Re: [ceph-users] no rebalance when changing chooseleaf_vary_r tunable

2018-04-04 Thread Adrian
rebalance though, had me worried so thanks for the info. Regards, Adrian. On Thu, Apr 5, 2018 at 9:16 AM, Gregory Farnum <gfar...@redhat.com> wrote: > http://docs.ceph.com/docs/master/rados/operations/crush- > map/#firefly-crush-tunables3 > > "The optimal value (in term

[ceph-users] no rebalance when changing chooseleaf_vary_r tunable

2018-04-04 Thread Adrian
"straw_calc_version": 0, "allowed_bucket_algs": 22, "profile": "firefly", "optimal_tunables": 1, "legacy_tunables": 0, "require_feature_tunables": 1, "require_feature_tunables2": 1, "require_fe
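
(The output quoted above looks like it comes from "ceph osd crush show-tunables"; a sketch of the related commands — switching the tunables profile updates chooseleaf_vary_r among others and normally triggers data movement, which is what this thread is about.)

    ceph osd crush show-tunables -f json-pretty
    # switching profiles, e.g. to optimal, changes chooseleaf_vary_r:
    ceph osd crush tunables optimal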

Re: [ceph-users] Ceph iSCSI is a prank?

2018-03-04 Thread Adrian Saul
We are using Ceph+RBD+NFS under pacemaker for VMware. We are doing iSCSI using SCST but have not used it against VMware, just Solaris and Hyper-V. It generally works and performs well enough – the biggest issues are the clustering for iSCSI ALUA support and NFS failover, most of which we have

Re: [ceph-users] Thick provisioning

2017-10-18 Thread Adrian Saul
I concur - at the moment we need to manually sum the RBD images to look at how much we have "provisioned" vs what ceph df shows. In our case we had a rapid run of provisioning new LUNs but it took a while before usage started to catch up with what was provisioned as data was migrated in.
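
(One way to get the provisioned-vs-used comparison without summing images by hand; rbd du is available from roughly Infernalis/Jewel onwards, and enabling object-map/fast-diff on the images makes it much faster.)

    rbd du -p <pool>       # provisioned and used per image, plus a total
    ceph df detail         # what the cluster reports per pool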

Re: [ceph-users] Ceph-ISCSI

2017-10-11 Thread Adrian Saul
. From: Samuel Soulard [mailto:samuel.soul...@gmail.com] Sent: Thursday, 12 October 2017 11:20 AM To: Adrian Saul <adrian.s...@tpgtelecom.com.au> Cc: Zhu Lingshan <ls...@suse.com>; dilla...@redhat.com; ceph-users <ceph-us...@ceph.com> Subject: RE: [ceph-users] Ceph-I

Re: [ceph-users] Ceph-ISCSI

2017-10-11 Thread Adrian Saul
As an aside, SCST iSCSI will support ALUA and does PGRs through the use of DLM. We have been using that with Solaris and Hyper-V initiators for RBD backed storage but still have some ongoing issues with ALUA (probably our current config, we need to lab later recommendations). >

Re: [ceph-users] bad crc/signature errors

2017-10-04 Thread Adrian Saul
We see the same messages and are similarly on a 4.4 KRBD version that is affected by this. I have seen no impact from it so far that I know about > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Jason Dillaman > Sent: Thursday, 5

Re: [ceph-users] osd create returns duplicate ID's

2017-09-29 Thread Adrian Saul
Do you mean that after you delete and remove the crush and auth entries for the OSD, when you go to create another OSD later it will re-use the previous OSD ID that you have destroyed in the past? Because I have seen that behaviour as well - but only for previously allocated OSD IDs that
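
(For reference, the removal sequence being referred to is roughly the following, with osd.12 as a placeholder; once the auth key and the crush/osdmap entries are gone, the ID becomes free and a later "ceph osd create" can hand the same number out again.)

    ceph osd out 12
    systemctl stop ceph-osd@12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12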

Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-22 Thread Adrian Saul
Thanks for bringing this to attention Wido - its of interest to us as we are currently looking to migrate mail platforms onto Ceph using NFS, but this seems far more practical. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Wido den

Re: [ceph-users] ceph-osd restartd via systemd in case of disk error

2017-09-19 Thread Adrian Saul
> I understand what you mean and it's indeed dangerous, but see: > https://github.com/ceph/ceph/blob/master/systemd/ceph-osd%40.service > > Looking at the systemd docs it's difficult though: > https://www.freedesktop.org/software/systemd/man/systemd.service.ht > ml > > If the OSD crashes due to
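
(A hedged sketch of a local override, assuming the goal is to cap restarts rather than patch the packaged unit; the stock ceph-osd@.service already ships its own Restart/StartLimit settings, so the values here are purely illustrative.)

    # /etc/systemd/system/ceph-osd@.service.d/restart-limit.conf
    [Service]
    Restart=on-failure
    RestartSec=30
    StartLimitInterval=30min
    StartLimitBurst=3

    # then: systemctl daemon-reload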

Re: [ceph-users] Ceph release cadence

2017-09-07 Thread Adrian Saul
> * Drop the odd releases, and aim for a ~9 month cadence. This splits the > difference between the current even/odd pattern we've been doing. > > + eliminate the confusing odd releases with dubious value > + waiting for the next release isn't quite as bad > - required upgrades every 9

Re: [ceph-users] Monitoring a rbd map rbd connection

2017-08-25 Thread Adrian Saul
If you are monitoring to ensure that it is mounted and active, a simple check_disk on the mountpoint should work. If the mount is not present, or the filesystem is non-responsive then this should pick it up. A second check to perhaps test you can actually write files to the file system would
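
(An illustrative write probe to pair with check_disk — not a polished Nagios plugin; the mountpoint path is a placeholder.)

    #!/bin/bash
    # verify an RBD-backed filesystem is mounted and still accepts writes
    MNT=${1:-/mnt/rbd0}
    PROBE="$MNT/.write_probe.$$"
    mountpoint -q "$MNT" || { echo "CRITICAL: $MNT not mounted"; exit 2; }
    if timeout 10 bash -c "echo ok > '$PROBE' && rm -f '$PROBE'"; then
        echo "OK: $MNT mounted and writable"; exit 0
    else
        echo "CRITICAL: write to $MNT failed or hung"; exit 2
    fi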

Re: [ceph-users] Ruleset vs replica count

2017-08-25 Thread Adrian Saul
Yes - ams5-ssd would have 2 replicas, ams6-ssd would have 1 (at size 3, 3 - 2 = 1). Although for this ruleset the min_size should be set to at least 2, or more practically 3 or 4. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sinan Polat Sent: Friday, 25 August 2017

Re: [ceph-users] Ceph cluster with SSDs

2017-08-20 Thread Adrian Saul
> SSD make details : SSD 850 EVO 2.5" SATA III 4TB Memory & Storage - MZ- > 75E4T0B/AM | Samsung The performance difference between these and the SM or PM863 range is night and day. I would not use these for anything you care about with performance, particularly IOPS or latency. Their write

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Adrian Saul
> I'd be interested in details of this small versus large bit. The smaller shares are simply to distribute the workload over more RBDs so the RBD device doesn’t become the bottleneck. The size itself doesn’t particularly matter; the idea is just to distribute VMs across many shares rather

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-16 Thread Adrian Saul
We are using Ceph on NFS for VMWare – we are using SSD tiers in front of SATA and some direct SSD pools. The datastores are just XFS file systems on RBD managed by a pacemaker cluster for failover. Lessons so far are that large datastores quickly run out of IOPS and compete for performance –

Re: [ceph-users] Iscsi configuration

2017-08-08 Thread Adrian Saul
support in SCST was a lot better. HTH. Cheers, Adrian From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Samuel Soulard Sent: Wednesday, 9 August 2017 6:45 AM To: ceph-us...@ceph.com Subject: [ceph-users] Iscsi configuration Hi all, Platform : Centos 7 Luminous 12.1.2

Re: [ceph-users] Does ceph pg scrub error affect all of I/O in ceph cluster?

2017-08-03 Thread Adrian Saul
Depends on the error case – usually you will see blocked IO messages as well if there is a condition causing OSDs to be unresponsive. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of ??? Sent: Friday, 4 August 2017 1:34 PM To: ceph-users@lists.ceph.com Subject:

Re: [ceph-users] PGs per OSD guidance

2017-07-19 Thread Adrian Saul
Anyone able to offer any advice on this? Cheers, Adrian > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Adrian Saul > Sent: Friday, 14 July 2017 6:05 PM > To: 'ceph-users@lists.ceph.com' > Subject: [ceph-users] PG

[ceph-users] PGs per OSD guidance

2017-07-14 Thread Adrian Saul
will be more dispersed? I am also curious, from a performance standpoint, whether we are better off with more PGs to reduce PG lock contention and so on? Cheers, Adrian

[ceph-users] Deep scrub distribution

2017-07-05 Thread Adrian Saul
batches of PGs to deep scrub over time to push out the distribution again? Adrian Saul

Re: [ceph-users] VMware + CEPH Integration

2017-06-18 Thread Adrian Saul
> Hi Alex, > > Have you experienced any problems with timeouts in the monitor action in > pacemaker? Although largely stable, every now and again in our cluster the > FS and Exportfs resources timeout in pacemaker. There's no mention of any > slow requests or any peering..etc from the ceph logs so

Re: [ceph-users] design guidance

2017-06-06 Thread Adrian Saul
> > Early usage will be CephFS, exported via NFS and mounted on ESXi 5.5 > > and > > 6.0 hosts(migrating from a VMWare environment), later to transition to > > qemu/kvm/libvirt using native RBD mapping. I tested iscsi using lio > > and saw much worse performance with the first cluster, so it seems

Re: [ceph-users] rbd iscsi gateway question

2017-04-06 Thread Adrian Saul
for krbd. From: Nick Fisk [mailto:n...@fisk.me.uk] Sent: Thursday, 6 April 2017 5:43 PM To: Adrian Saul; 'Brady Deetz'; 'ceph-users' Subject: RE: [ceph-users] rbd iscsi gateway question I assume Brady is referring to the death spiral LIO gets into with some initiators, including vmware

Re: [ceph-users] rbd iscsi gateway question

2017-04-05 Thread Adrian Saul
I am not sure if there is a hard and fast rule you are after, but pretty much anything that would cause ceph transactions to be blocked (flapping OSD, network loss, hung host) has the potential to block RBD IO which would cause your iSCSI LUNs to become unresponsive for that period. For the

Re: [ceph-users] MySQL and ceph volumes

2017-03-07 Thread Adrian Saul
] Sent: Wednesday, 8 March 2017 10:36 AM To: Adrian Saul Cc: ceph-users Subject: Re: [ceph-users] MySQL and ceph volumes Thank you Adrian! I had forgotten this option and I can reproduce the problem. Now, what could be the problem on the ceph side with O_DSYNC writes? Regards Matteo

Re: [ceph-users] MySQL and ceph volumes

2017-03-07 Thread Adrian Saul
Possibly MySQL is doing sync writes, where as your FIO could be doing buffered writes. Try enabling the sync option on fio and compare results. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Matteo Dacrema > Sent: Wednesday, 8 March
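
(A minimal way to see the difference; the file path, size and block size are placeholders, and --sync=1 makes fio use synchronous O_SYNC writes, which is much closer to what a database does for its redo log than a default buffered run.)

    # buffered random writes (what a plain fio run measures)
    fio --name=buffered --filename=/mnt/rbd/fio.dat --size=4G --rw=randwrite --bs=16k --iodepth=1 --runtime=60 --time_based
    # synchronous random writes
    fio --name=sync --filename=/mnt/rbd/fio.dat --size=4G --rw=randwrite --bs=16k --iodepth=1 --sync=1 --runtime=60 --time_based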

Re: [ceph-users] Review of Ceph on ZFS - or how not to deploy Ceph for RBD + OpenStack

2017-01-10 Thread Adrian Saul
I would concur having spent a lot of time on ZFS on Solaris. ZIL will reduce the fragmentation problem a lot (because it is not doing intent logging into the filesystem itself which fragments the block allocations) and write response will be a lot better. I would use different devices for

Re: [ceph-users] When Zero isn't 0 (Crush weight mysteries)

2016-12-20 Thread Adrian Saul
I found the other day even though I had 0 weighted OSDs, there was still weight in the containing buckets which triggered some rebalancing. Maybe it is something similar, there was weight added to the bucket even though the OSD underneath was 0. > -Original Message- > From:

Re: [ceph-users] Crush rule check

2016-12-12 Thread Adrian Saul
max_size 4 > step take ssd-sydney > step choose firstn 2 type datacenter > step chooseleaf firstn 2 type host > step emit > } > > This way the ruleset will only work for size = 4. > > Wido > > > > thanks, > > Adrian

Re: [ceph-users] Crush rule check

2016-12-12 Thread Adrian Saul
Thanks Wido. I had found the show-utilization test, but had not seen show-mappings - that confirmed it for me. thanks, Adrian > -Original Message- > From: Wido den Hollander [mailto:w...@42on.com] > Sent: Monday, 12 December 2016 7:07 PM > To: ceph-users@lists.ceph.com;
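
(For the record, the offline checks mentioned here look roughly like this; the rule id and replica count are taken from the thread's example.)

    ceph osd getcrushmap -o crushmap.bin
    crushtool -i crushmap.bin --test --rule 6 --num-rep 4 --show-mappings | head
    crushtool -i crushmap.bin --test --rule 6 --num-rep 4 --show-utilization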

[ceph-users] Crush rule check

2016-12-10 Thread Adrian Saul
this correctly.

rule sydney-ssd {
        ruleset 6
        type replicated
        min_size 2
        max_size 10
        step take ssd-sydney
        step choose firstn -2 type datacenter
        step chooseleaf firstn 2 type host
        step emit
}

Cheers, Adrian

Re: [ceph-users] [EXTERNAL] Re: osd set noin ignored for old OSD ids

2016-11-23 Thread Adrian Saul
To: Gregory Farnum > Cc: Adrian Saul; ceph-users@lists.ceph.com > Subject: Re: [EXTERNAL] Re: [ceph-users] osd set noin ignored for old OSD > ids > > From my experience noin doesn't stop new OSDs from being marked in. noin > only works on OSDs already in the crushmap. To accomplis

[ceph-users] osd set noin ignored for old OSD ids

2016-11-22 Thread Adrian Saul
nobackfill,norebalance). Am I doing something wrong in this process or is there something about "noin" that is ignored for previously existing OSDs that have been removed from both the OSD map and crush map? Cheers, Adrian

[ceph-users] Ceph outage - monitoring options

2016-11-21 Thread Adrian Saul
out it from the osdmap to keep the cluster operational (timeout tuning, flags to be set etc?) Any help appreciated. It's a little scary kicking into production and having an outage where I can't explain why Ceph's redundancy didn't kick in. Cheers, Adrian

Re: [ceph-users] Snap delete performance impact

2016-09-23 Thread Adrian Saul
limit. Sent from my SAMSUNG Galaxy S7 on the Telstra Mobile Network Original message From: Nick Fisk <n...@fisk.me.uk> Date: 23/09/2016 7:26 PM (GMT+10:00) To: Adrian Saul <adrian.s...@tpgtelecom.com.au>, ceph-users@lists.ceph.com Subject: RE: Snap delete perfor

Re: [ceph-users] Snap delete performance impact

2016-09-23 Thread Adrian Saul
that much. Cheers, Adrian > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Adrian Saul > Sent: Thursday, 22 September 2016 7:15 PM > To: n...@fisk.me.uk; ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Snap delete

Re: [ceph-users] Snap delete performance impact

2016-09-22 Thread Adrian Saul
of very small FS metadata updates going on and that is what is killing it. Cheers, Adrian > -Original Message- > From: Nick Fisk [mailto:n...@fisk.me.uk] > Sent: Thursday, 22 September 2016 7:06 PM > To: Adrian Saul; ceph-users@lists.ceph.com > Subject: RE: Snap delete per

Re: [ceph-users] Snap delete performance impact

2016-09-21 Thread Adrian Saul
? It really should not make the entire platform unusable for 10 minutes. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Adrian Saul > Sent: Wednesday, 6 July 2016 3:41 PM > To: 'ceph-users@lists.ceph.com' > Subject: [ceph-

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Adrian Saul
> But shouldn't freezing the fs and doing a snapshot constitute a "clean > unmount" hence no need to recover on the next mount (of the snapshot) - > Ilya? It's what I thought as well, but XFS seems to want to attempt to replay the log regardless on mount and write to the device to do so. This

Re: [ceph-users] Consistency problems when taking RBD snapshot

2016-09-14 Thread Adrian Saul
I found I could ignore the XFS issues and just mount it with the appropriate options (below from my backup scripts):

    #
    # Mount with nouuid (conflicting XFS) and norecovery (ro snapshot)
    #
    if ! mount -o ro,nouuid,norecovery $SNAPDEV /backup${FS}; then

Re: [ceph-users] Lessons learned upgrading Hammer -> Jewel

2016-07-17 Thread Adrian Saul
I have SELinux disabled and it does the restorecon on /var/lib/ceph regardless from the RPM post upgrade scripts. In my case I chose to kill the restorecon processes to save outage time – it didn’t affect the upgrade package completion. From: ceph-users

Re: [ceph-users] Terrible RBD performance with Jewel

2016-07-14 Thread Adrian Saul
I would suggest caution with " filestore_odsync_write" - its fine on good SSDs, but on poor SSDs or spinning disks it will kill performance. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Somnath Roy Sent: Friday, 15 July 2016 3:12 AM To: Garg, Pankaj;

[ceph-users] Snap delete performance impact

2016-07-05 Thread Adrian Saul
tunables (priority and cost). Are there any recommendations around setting these for a Jewel cluster? cheers, Adrian

[ceph-users] OSD out/down detection

2016-06-19 Thread Adrian Saul
ely attempting backfills. Any ideas on how I can improve detection of this condition? Cheers, Adrian

Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-06 Thread Adrian Saul
Centos 7 - the upgrade was done simply with "yum update -y ceph" on each node one by one, so the package order would have been determined by yum. From: Jason Dillaman <jdill...@redhat.com> Sent: Monday, June 6, 2016 10:42 PM To: Adri

Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-05 Thread Adrian Saul
at.com] > Sent: Monday, 6 June 2016 12:37 PM > To: Adrian Saul > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade > > Odd -- sounds like you might have Jewel and Infernalis class objects and > OSDs intermixed. I would double-chec

Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-05 Thread Adrian Saul
they are failing. > -Original Message- > From: Adrian Saul > Sent: Monday, 6 June 2016 12:29 PM > To: Adrian Saul; dilla...@redhat.com > Cc: ceph-users@lists.ceph.com > Subject: RE: [ceph-users] Jewel upgrade - rbd errors after upgrade > > > I have traced it back

Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-05 Thread Adrian Saul
/rados-classes/libcls_rbd.so: undefined symbol: _ZN4ceph6buffer4list8iteratorC1EPS1_j Trying to figure out why that is the case. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Adrian Saul > Sent: Monday, 6 June 2016 11:11 AM

Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-05 Thread Adrian Saul
]# rados stat -p glebe-sata rbd_directory glebe-sata/rbd_directory mtime 2016-06-06 10:18:28.00, size 0 > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Adrian Saul > Sent: Monday, 6 June 2016 11:11 AM > To: dilla...@

Re: [ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-05 Thread Adrian Saul
- > From: Jason Dillaman [mailto:jdill...@redhat.com] > Sent: Monday, 6 June 2016 11:00 AM > To: Adrian Saul > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Jewel upgrade - rbd errors after upgrade > > Are you able to successfully run the following command successf

[ceph-users] Jewel upgrade - rbd errors after upgrade

2016-06-05 Thread Adrian Saul
an issue. I have also tried the commands on other cluster members that have not done anything with RBD before (I was wondering if perhaps the kernel rbd was pinning the old library version open or something) but the same error occurs. Where can I start trying to resolve this? Cheers, Adrian
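
(One quick sanity check after a rolling upgrade like this is to confirm which version each daemon is actually running, since daemons keep the old code — and old class libraries — loaded until they are restarted.)

    ceph tell mon.* version
    ceph tell osd.* version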

Re: [ceph-users] 2 networks vs 2 NICs

2016-06-04 Thread Adrian Sevcenco
switch so I am not sure if it's worth it. Thank you! Adrian -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Adrian Sevcenco Sent: 04 June 2016 16:11 To: ceph-users@lists.ceph.com Subject: [ceph-users] 2 networks vs 2 NICs Hi! I have seen

[ceph-users] 2 networks vs 2 NICs

2016-06-04 Thread Adrian Sevcenco
interface and a vlan (virtual) interface for the cluster network? Thank you! Adrian

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Adrian Saul
> > For two links it should be quite good - it seemed to balance across > > that quite well, but with 4 links it seemed to really prefer 2 in my case. > > > Just for the record, did you also change the LACP policies on the switches? > > From what I gather, having fancy pants L3+4 hashing on the

Re: [ceph-users] Best Network Switches for Redundancy

2016-06-01 Thread Adrian Saul
I am currently running our Ceph POC environment using dual Nexus 9372TX 10G-T switches, each OSD host has two connections to each switch and they are formed into a single 4 link VPC (MC-LAG), which is bonded under LACP on the host side. What I have noticed is that the various hashing policies
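
(For reference, the host-side transmit hash policy can be checked as below; the persistent setting is distro-specific and the switch-side VPC/port-channel hashing has to be configured to match — the values shown are illustrative.)

    grep -i "hash policy" /proc/net/bonding/bond0
    # e.g. on RHEL/CentOS, in ifcfg-bond0:
    #   BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer3+4"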

Re: [ceph-users] Fwd: [Ceph-community] Wasting the Storage capacity when using Ceph based On high-end storage systems

2016-06-01 Thread Adrian Saul
Also if for political reasons you need a “vendor” solution – ask Dell about their DSS 7000 servers – 90 8TB disks and two compute nodes in 4RU would go a long way to making up a multi-PB Ceph solution. Supermicro also do a similar solution with some 36, 60 and 90 disk in 4RU models. Cisco

Re: [ceph-users] seqwrite gets good performance but random rw gets worse

2016-05-25 Thread Adrian Saul
Sync will always be lower – it will cause it to wait for previous writes to complete before issuing more so it will effectively throttle writes to a queue depth of 1. From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ken Peng Sent: Wednesday, 25 May 2016 6:36 PM To:

Re: [ceph-users] seqwrite gets good performance but random rw gets worse

2016-05-25 Thread Adrian Saul
Are you using image-format 2 RBD images? We found a major performance hit using format 2 images under 10.2.0 today in some testing. When we switched to using format 1 images we literally got 10x random write IOPS performance (1600 IOPs up to 3 IOPS for the same test). From: ceph-users

Re: [ceph-users] RBD removal issue

2016-05-23 Thread Adrian Saul
Thanks - all sorted. > -Original Message- > From: Nick Fisk [mailto:n...@fisk.me.uk] > Sent: Monday, 23 May 2016 6:58 PM > To: Adrian Saul; ceph-users@lists.ceph.com > Subject: RE: RBD removal issue > > See here: > > http://cephnotes.ksperis.com/blog/2014

[ceph-users] RBD removal issue

2016-05-23 Thread Adrian Saul
view. From what I can see there is only the rbd_header object remaining - can I just remove that directly or am I risking corrupting something else by not removing it using rbd rm? Cheers, Adrian [root@ceph-glb-fec-01 ~]# rbd info glebe-sata/oemprd01db_lun00 rbd image 'oemprd01db_lun00

Re: [ceph-users] NVRAM cards as OSD journals

2016-05-22 Thread Adrian Saul
I am using Intel P3700DC 400G cards in a similar configuration (two per host) - perhaps you could look at cards of that capacity to meet your needs. I would suggest having such small journals would mean you will be constantly blocking on journal flushes which will impact write performance and
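
(The FileStore documentation's rule of thumb ties journal size to expected throughput and the sync interval, which is why very small journals tend to stall on flushes; the numbers below are purely illustrative.)

    # osd journal size >= 2 * (expected throughput * filestore max sync interval)
    # e.g. ~1000 MB/s of writes with a 5 second sync interval:
    echo $(( 2 * 1000 * 5 ))    # => 10000 MB, i.e. ~10 GB per journal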

Re: [ceph-users] fibre channel as ceph storage interconnect

2016-04-22 Thread Adrian Saul
> from the responses I've gotten, it looks like there's no viable option to use > fibre channel as an interconnect between the nodes of the cluster. > Would it be worth while development effort to establish a block protocol > between the nodes so that something like fibre channel could be used to

Re: [ceph-users] fibre channel as ceph storage interconnect

2016-04-21 Thread Adrian Saul
I could only see it being done using FCIP as the OSD processes use IP to communicate. I guess it would depend on why you are looking to use something like FC instead of Ethernet or IB. > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >

Re: [ceph-users] Mon placement over wide area

2016-04-12 Thread Adrian Saul
om: Maxime Guyot [mailto:maxime.gu...@elits.com] > Sent: Tuesday, 12 April 2016 5:49 PM > To: Adrian Saul; Christian Balzer; 'ceph-users@lists.ceph.com' > Subject: Re: [ceph-users] Mon placement over wide area > > Hi Adrian, > > Looking at the documentation RadosGW has multi region s

Re: [ceph-users] Mon placement over wide area

2016-04-11 Thread Adrian Saul
the heavier IOP or latency sensitive workloads onto it until we get a better feel for how it behaves at scale and can be sure of the performance. As above - for the most part we are going to be having local site pools (replicate at application level), a few metro replicated

[ceph-users] Mon placement over wide area

2016-04-11 Thread Adrian Saul
hat I need to consider when we start building at this scale. Cheers, Adrian

Re: [ceph-users] Ceph.conf

2016-03-30 Thread Adrian Saul
These are the monitors that ceph clients/daemons can connect to initially to reach the cluster. Once they connect to one of the initial mons they will get a full list of all monitors and be able to connect to any of them to pull updated maps. From: ceph-users
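
(The client-side pieces of ceph.conf being described look something like this; names and addresses are placeholders, and only enough monitors to make first contact are needed — the rest comes from the monmap.)

    [global]
        fsid = <cluster fsid>
        mon initial members = mon1, mon2, mon3
        mon host = 192.168.0.1, 192.168.0.2, 192.168.0.3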

[ceph-users] OSD crash after conversion to bluestore

2016-03-30 Thread Adrian Saul
s someone can suggest if this is a bug that needs looking at. Cheers, Adrian

Re: [ceph-users] Ceph RBD latencies

2016-03-06 Thread Adrian Saul
nks. The consideration was I didn't want to lose 36 or 18 OSDs due to a journal failure, so if we lost a card we could do a controlled replacement without totally rebuilding the OSDs (as they are PCI-e it's a host outage anyway). We could maybe look to see if we can put 3 cards i

Re: [ceph-users] Ceph RBD latencies

2016-03-03 Thread Adrian Saul
> Samsung EVO... > Which exact model, I presume this is not a DC one? > > If you had put your journals on those, you would already be pulling your hairs > out due to abysmal performance. > > Also with Evo ones, I'd be worried about endurance. No, I am using the P3700DCs for journals. The

[ceph-users] failure of public network kills connectivity

2016-01-05 Thread Adrian Imboden
mon data = /var/lib/ceph/mon/ceph-node1/

[mon.node3]
host = node3
mon data = /var/lib/ceph/mon/ceph-node3/

[mon.node2]
host = node2
mon data = /var/lib/ceph/mon/ceph-node2/

[mon.node4]
host = node4
mon data = /var/lib/ceph/mon/ceph-node4/

Thank you very much Greeti

[ceph-users] OSD load simulator

2015-03-10 Thread Adrian Sevcenco
Intel, Atom, Xeon D and upcoming ARMs (like Cavium's ThunderX, X-Gene, AMD's opteron ARM) etc...) Thank you! Adrian

Re: [ceph-users] CEPH hardware recommendations and cluster design questions

2015-03-05 Thread Adrian Sevcenco
Thank you all for all the good advice and much-needed documentation. I have a lot to digest :) Adrian On 03/04/2015 08:17 PM, Stephen Mercier wrote: To expand upon this, the very nature and existence of Ceph is to replace RAID. The FS itself replicates data and handles the HA functionality

[ceph-users] CEPH hardware recommendations and cluster design questions

2015-03-04 Thread Adrian Sevcenco
for me? (the read:writes ratios will be 10:1) Thank you!! Adrian

Re: [ceph-users] Monitoring ceph statistics using rados python module

2014-05-14 Thread Adrian Banasiak
Thank you, that should do the trick. 2014-05-14 6:41 GMT+02:00 Kai Zhang log1...@yeah.net: Hi Adrian, You may be interested in rados -p pool_name df --format json, although it's pool oriented, you could probably add the values together :) Regards, Kai On 2014-05-13 08:33:11, Adrian
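
(The pool-oriented command suggested above, plus the cluster-wide equivalent; the JSON field names vary a little between releases, so treat any parsing as release-specific.)

    rados df --format json
    ceph df --format json-pretty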

[ceph-users] Monitoring ceph statistics

2014-05-13 Thread Adrian Banasiak
= conn.open_ioctx(pool) stats[pool] = io.get_stats() read+=int(stats[pool]['num_rd']) write+=int(stats[pool]['num_wr']) Could someone share his knowledge about the rados module for retrieving ceph statistics? BTW Ceph is awesome! -- Best regards, Adrian Banasiak email: adr

[ceph-users] Monitoring ceph statistics using rados python module

2014-05-13 Thread Adrian Banasiak
= conn.open_ioctx(pool) stats[pool] = io.get_stats() read+=int(stats[pool]['num_rd']) write+=int(stats[pool]['num_wr']) Could someone share his knowledge about the rados module for retrieving ceph statistics? BTW Ceph is awesome! -- Best regards, Adrian Banasiak email: adr

Re: [ceph-users] Monitoring ceph statistics using rados python module

2014-05-13 Thread Adrian Banasiak
ceph --admin-daemon /var/run/ceph/ceph-osd.x.asok perf dump to get the monitor info. The result can be parsed easily with simplejson via python. On Tue, May 13, 2014 at 10:56 PM, Adrian Banasiak adr...@banasiak.it wrote: Hi, I am working with a test Ceph cluster and now I want to implement
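
(The admin-socket counters mentioned here can be pulled and pretty-printed without writing any code first, to see which keys are worth graphing; osd.0 is a placeholder.)

    ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump | python -m json.tool | less
    # shorthand form when run on the daemon's host:
    ceph daemon osd.0 perf dump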