Hi Sebastien,
That sounds promising. Did you enable the sharded ops to get this result?
Cheers, Dan
On 02 Sep 2014, at 02:19, Sebastien Han sebastien@enovance.com wrote:
Mark and all, Ceph IOPS performance has definitely improved with Giant.
With this version: ceph version
perform adequately,
that’d give us quite a few SSDs to build a dedicated high-IOPS pool.
I’d also appreciate any other suggestions/experiences which might be relevant.
Thanks!
Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department
(n-1 or n-2) will be a bit too old from where we
want to be, which I'm sure will work wonderfully on Red Hat, but how will n.1,
n.2 or n.3 run?
Robert LeBlanc
On Thu, Sep 4, 2014 at 11:22 AM, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
Hi Robert,
That's
to see what you decide to do and what your results are.
On Thu, Sep 4, 2014 at 12:12 PM, Dan Van Der Ster wrote:
I've just been reading the bcache docs. It's a pity the mirrored writes
aren't implemented yet. Do you know if you can use an md RAID1 as a cache dev
Hi Stefan,
September 4 2014 9:13 PM, Stefan Priebe s.pri...@profihost.ag wrote:
Hi Dan, hi Robert,
Am 04.09.2014 21:09, schrieb Dan van der Ster:
Thanks again for all of your input. I agree with your assessment -- in
our cluster we avg 3ms for a random (hot) 4k read already, but 40ms
Hi Craig,
September 4 2014 11:50 PM, Craig Lewis cle...@centraldesktop.com wrote:
On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster daniel.vanders...@cern.ch
wrote:
1) How often are DC S3700's failing in your deployments?
None of mine have failed yet. I am planning to monitor the wear
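e.g. with something like this (a sketch; the device path is hypothetical, and Media_Wearout_Indicator is the attribute name Intel SSDs report):
smartctl -A /dev/sdX | grep -i -E 'Media_Wearout_Indicator|Reallocated_Sector'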
Hi Christian,
On 05 Sep 2014, at 03:09, Christian Balzer ch...@gol.com wrote:
Hello,
On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote:
On Thu, Sep 4, 2014 at 9:21 AM, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
1) How often are DC S3700's failing in your deployments
On 05 Sep 2014, at 10:30, Nigel Williams nigel.d.willi...@gmail.com wrote:
On Fri, Sep 5, 2014 at 5:46 PM, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
On 05 Sep 2014, at 03:09, Christian Balzer ch...@gol.com wrote:
You might want to look into cache pools (and dedicated SSD servers
On 05 Sep 2014, at 11:04, Christian Balzer ch...@gol.com wrote:
Hello Dan,
On Fri, 5 Sep 2014 07:46:12 + Dan Van Der Ster wrote:
Hi Christian,
On 05 Sep 2014, at 03:09, Christian Balzer ch...@gol.com wrote:
Hello,
On Thu, 4 Sep 2014 14:49:39 -0700 Craig Lewis wrote
Hi Christian,
Let's keep debating until a dev corrects us ;)
September 6 2014 1:27 PM, Christian Balzer ch...@gol.com wrote:
On Fri, 5 Sep 2014 09:42:02 + Dan Van Der Ster wrote:
On 05 Sep 2014, at 11:04, Christian Balzer ch...@gol.com wrote:
On Fri, 5 Sep 2014 07:46:12 + Dan Van
September 6 2014 4:01 PM, Christian Balzer ch...@gol.com wrote:
On Sat, 6 Sep 2014 13:07:27 + Dan van der Ster wrote:
Hi Christian,
Let's keep debating until a dev corrects us ;)
For the time being, I give the recent:
https://www.mail-archive.com/ceph-users@lists.ceph.com
considered RAID 5 over your SSDs? Practically
speaking, there's no performance downside to RAID 5 when your devices aren't
IOPS-bound.
On Sat Sep 06 2014 at 8:37:56 AM Christian Balzer
ch...@gol.com wrote:
On Sat, 6 Sep 2014 14:50:20 + Dan van der Ster wrote:
September 6
Hi Blair,
On 09 Sep 2014, at 09:05, Blair Bethwaite blair.bethwa...@gmail.com wrote:
Hi folks,
In lieu of a prod ready Cephfs I'm wondering what others in the user
community are doing for file-serving out of Ceph clusters (if at all)?
We're just about to build a pretty large cluster -
On 09 Sep 2014, at 16:39, Michal Kozanecki mkozane...@evertz.com wrote:
On 9 September 2014 08:47, Blair Bethwaite blair.bethwa...@gmail.com wrote:
On 9 September 2014 20:12, Dan Van Der Ster daniel.vanders...@cern.ch
wrote:
One thing I’m not comfortable with is the idea of ZFS checking
Hi,
On 16 Sep 2014, at 16:46, shiva rkreddy
shiva.rkre...@gmail.com wrote:
2. Has any one used SSD devices for Monitors. If so, can you please share the
details ? Any specific changes to the configuration files?
We use SSDs on our monitors — a spinning disk was
Hi Greg,
I believe Marc is referring to the corruption triggered by set_extsize on xfs.
That option was disabled by default in 0.80.4... See the thread "firefly scrub error".
Cheers,
Dan
From: Gregory Farnum g...@inktank.com
Sent: Sep 16, 2014 8:15 PM
To: Marc
Cc: ceph-users@lists.ceph.com
, xfs, something else or doesn't matter?
I think it doesn’t matter.
We use xfs.
Cheers, Dan
On Tue, Sep 16, 2014 at 10:15 AM, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
Hi,
On 16 Sep 2014, at 16:46, shiva rkreddy
shiva.rkre...@gmail.com
Hi Florian,
On 17 Sep 2014, at 17:09, Florian Haas flor...@hastexo.com wrote:
Hi Craig,
just dug this up in the list archives.
On Fri, Mar 28, 2014 at 2:04 AM, Craig Lewis cle...@centraldesktop.com
wrote:
In the interest of removing variables, I removed all snapshots on all pools,
to e.g. 16, or to fix the loss of purged_snaps after
backfilling. Actually, probably both of those are needed. But a real dev would
know better.
Cheers, Dan
From: Florian Haas flor...@hastexo.com
Sent: Sep 17, 2014 5:33 PM
To: Dan Van Der Ster
Cc: Craig Lewis cle...@centraldesktop.com;ceph-users
Hi Mike,
On 25 Sep 2014, at 17:47, Mike Dawson mike.daw...@cloudapt.com wrote:
On 9/25/2014 11:09 AM, Sage Weil wrote:
v0.67.11 Dumpling
===
This stable update for Dumpling fixes several important bugs that affect a
small set of users.
We recommend that all Dumpling
Hi,
Apologies for this trivial question, but what is the correct procedure to
replace a failed OSD that uses a shared journal device?
Suppose you have 5 spinning disks (sde,sdf,sdg,sdh,sdi) and these each have a
journal partition on sda (sda1-5). Now sde fails and is replaced with a new
drive.
Hi Wido,
On 26 Sep 2014, at 23:14, Wido den Hollander w...@42on.com wrote:
On 26-09-14 17:16, Dan Van Der Ster wrote:
Hi,
Apologies for this trivial question, but what is the correct procedure to
replace a failed OSD that uses a shared journal device?
Suppose you have 5 spinning disks
Hi,
On 29 Sep 2014, at 10:01, Daniel Swarbrick
daniel.swarbr...@profitbricks.com wrote:
On 26/09/14 17:16, Dan Van Der Ster wrote:
Hi,
Apologies for this trivial question, but what is the correct procedure to
replace a failed OSD that uses a shared journal device?
I’m just curious
ceph-disk prepare /dev/sde /dev/sda1
and try to coerce that to use the persistent name.
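e.g. something like this, pointing the journal at a persistent symlink instead (a sketch; JOURNAL-PARTUUID is a hypothetical placeholder for the real partition uuid):
ceph-disk prepare /dev/sde /dev/disk/by-partuuid/JOURNAL-PARTUUID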
Cheers, Dan
Best of luck.
Owen
On 09/29/2014 10:24 AM, Dan Van Der Ster wrote:
Hi,
On 29 Sep 2014, at 10:01, Daniel Swarbrick
daniel.swarbr...@profitbricks.com wrote:
On 26/09/14 17:16
Hi Emmanuel,
This is interesting, because we’ve had sales guys telling us that those Samsung
drives are definitely the best for a Ceph journal O_o !
The conventional wisdom has been to use the Intel DC S3700 because of its
massive durability.
Anyway, I’m curious what do the SMART counters say
On 29 Sep 2014, at 10:47, Dan Van Der Ster daniel.vanders...@cern.ch wrote:
Hi Owen,
On 29 Sep 2014, at 10:33, Owen Synge osy...@suse.com wrote:
Hi Dan,
At least looking at upstream to get journals and partitions persistently
working, this requires gpt partitions, and being able
On 30 Sep 2014, at 16:38, Mark Nelson mark.nel...@inktank.com wrote:
On 09/29/2014 03:58 AM, Dan Van Der Ster wrote:
Hi Emmanuel,
This is interesting, because we’ve had sales guys telling us that those
Samsung drives are definitely the best for a Ceph journal O_o !
Our sales guys
Hi Chad,
That sounds bizarre to me, and I can't reproduce it. I added an osd (which was
previously not in the crush map) to a fake host=test:
ceph osd crush create-or-move osd.52 1.0 rack=RJ45 host=test
that resulted in some data movement of course. Then I removed that osd from the
crush
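Roughly, the removal step was something like (a sketch with the same example osd id):
ceph osd crush remove osd.52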
Hi,
October 15 2014 7:05 PM, Chad Seys cws...@physics.wisc.edu wrote:
Hi Dan,
I'm using Emperor (0.72). Though I would think CRUSH maps have not changed
that much between versions?
I'm using dumpling, with the hashpspool flag enabled, which I believe could
have been the only difference.
That
Hi Ceph users,
(sorry for the novel, but perhaps this might be useful for someone)
During our current project to upgrade our cluster from disks-only to
SSD journals, we've found it useful to convert our legacy puppet-ceph
deployed cluster (using something like the enovance module) to one that
Hi,
October 24 2014 5:28 PM, HURTEVENT VINCENT vincent.hurtev...@univ-lyon1.fr
wrote:
Hello,
I was running a multi-mon (3) Ceph cluster and in a migration move, I
reinstalled 2 of the 3 monitor nodes without deleting them properly from the cluster.
So, there is only one monitor left
Hi,
October 27 2014 5:07 PM, Wido den Hollander w...@42on.com wrote:
On 10/27/2014 04:30 PM, Mike wrote:
Hello,
My company is planning to build a big Ceph cluster for archiving and
storing data.
By requirements from the customer: 70% of capacity is SATA, 30% SSD.
First-day data is stored
On 28 Oct 2014, at 08:25, Robert van Leeuwen
robert.vanleeu...@spilgames.com wrote:
For now we've decided to use a SuperMicro SKU with 72 bays for HDD = 22 SSD +
50 SATA drives.
Our racks can hold 10 of these servers, and 50 of these racks in the Ceph cluster =
36,000 OSDs,
With 4TB SATA drives and
On 28 Oct 2014, at 09:30, Christian Balzer ch...@gol.com wrote:
On Tue, 28 Oct 2014 07:46:30 + Dan Van Der Ster wrote:
On 28 Oct 2014, at 08:25, Robert van Leeuwen
robert.vanleeu...@spilgames.com wrote:
For now we've decided to use a SuperMicro SKU with 72 bays for HDD = 22 SSD
+ 50
Hi,
You should try the new osd_disk_thread_ioprio_class /
osd_disk_thread_ioprio_priority options.
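For example, to put the disk thread into the idle class (a sketch; these options only take effect with the CFQ disk scheduler):
ceph tell osd.* injectargs '--osd_disk_thread_ioprio_class idle --osd_disk_thread_ioprio_priority 7'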
Cheers, dan
On 28 Oct 2014, at 09:27, Mateusz Skała
mateusz.sk...@budikom.net wrote:
Hello,
We are using Ceph as a storage backend for KVM, used for hosting MS
Hi RHEL/CentOS users,
This is just a heads up that we observe slow requests during the RHEL6.6
upgrade. The upgrade includes selinux-policy-targeted, which runs this during
the update:
/sbin/restorecon -i -f - -R -p -e /sys -e /proc -e /dev -e /mnt -e /var/tmp
-e /home -e /tmp -e /dev
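If you need to avoid this during production hours, one possible (untested) workaround is to exclude the policy package from the batch update and apply it later in a quiet period:
yum update --exclude='selinux-policy*'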
Hi Daniel,
I can't remember if deleting a pool invokes the snap trimmer to do the
actual work deleting objects. But if it does, then it is most definitely
broken in everything except the latest releases (current dumpling doesn't have
the fix in a release yet).
Given a release with those fixes (see
October 30 2014 11:32 AM, Daniel Schneller
daniel.schnel...@centerdevice.com wrote:
On 2014-10-30 10:14:44 +, Dan van der Ster said:
Hi Daniel,
I can't remember if deleting a pool invokes the snap trimmer to do the
actual work deleting objects. But if it does, then it is most
There's this one:
http://gitbuilder.ceph.com/kmod-rpm-rhel7beta-x86_64-basic/ref/rhel7/x86_64/
But that hasn't been updated since July.
Cheers, Dan
On Mon Nov 03 2014 at 5:35:23 AM Alexandre DERUMIER aderum...@odiso.com
wrote:
Hi,
I would like to known if a repository is available for
Between two hosts on an HP Procurve 6600, no jumbo frames:
rtt min/avg/max/mdev = 0.096/0.128/0.151/0.019 ms
Cheers, Dan
On Thu Nov 06 2014 at 2:19:07 PM Wido den Hollander w...@42on.com wrote:
Hello,
While working at a customer I've run into 10GbE latency which seems
high to me.
I
Hi,
I've only ever seen (1), EIO to read a file. In this case I've always just
killed / formatted / replaced that OSD completely -- that moves the PG to a
new master and the new replication fixes the inconsistency. This way,
I've never had to pg repair. I don't know if this is a best or even good
IIRC, the EIO we had also correlated with a SMART status that showed the
disk was bad enough for a warranty replacement -- so yes, I replaced the
disk in these cases.
Cheers, Dan
On Thu Nov 06 2014 at 2:44:08 PM GuangYang yguan...@outlook.com wrote:
Thanks Dan. By killed/formatted/replaced the
Hi,
Did you mkjournal the reused journal?
ceph-osd -i $ID --mkjournal
Cheers, Dan
On Thu Nov 13 2014 at 2:34:51 PM Anthony Alba ascanio.al...@gmail.com
wrote:
When I create a new OSD with a block device as journal that has
existing data on it, ceph is causing FAILED assert. The block
Hi,
On Thu Nov 13 2014 at 3:35:55 PM Anthony Alba ascanio.al...@gmail.com
wrote:
Ah no.
On 13 Nov 2014 21:49, Dan van der Ster daniel.vanders...@cern.ch
wrote:
Hi,
Did you mkjournal the reused journal?
ceph-osd -i $ID --mkjournal
Cheers, Dan
No - however the man page states
Hi all,
What is compatibility/incompatibility of dumpling clients to talk to
firefly and giant clusters? I know that tunables=firefly will prevent
dumpling clients from talking to a firefly cluster, but how about the
existence or not of erasure pools? Can a dumpling client talk to a
Firefly/Giant
Hi Greg,
On 24 Nov 2014, at 22:01, Gregory Farnum g...@gregs42.com wrote:
On Thu, Nov 20, 2014 at 9:08 AM, Dan van der Ster
daniel.vanders...@cern.ch wrote:
Hi all,
What is compatibility/incompatibility of dumpling clients to talk to firefly
and giant clusters?
We sadly don't have
Hi,
On 26 Nov 2014, at 17:07, Yujian Peng pengyujian5201...@126.com wrote:
Thanks a lot!
IOPS is a bottleneck in my cluster and the object disks are much slower than
SSDs. I don't know whether SSDs will be used as caches if
filestore_max_sync_interval is set to a big value. I will set
Hi all,
We throttle (with qemu-kvm) rbd devices to 100 w/s and 100 r/s (and 80MB/s
write and read).
With fio we cannot exceed 51.2MB/s sequential or random reads, no matter the
read block size. (But with large writes we can achieve 80MB/s.)
I just realised that the VM subsystem is probably
, Dan
On 27 Nov 2014 18:26, Dan Van Der Ster daniel.vanders...@cern.ch wrote:
Hi all,
We throttle (with qemu-kvm) rbd devices to 100 w/s and 100 r/s (and 80MB/s
write and read).
With fio we cannot exceed 51.2MB/s sequential or random reads, no matter the
read block size. (But with large writes we
rule or similar that could set max_sectors_kb
when a RBD device is attached?
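Something like this udev rule might do it (an untested sketch; the 4096 is just an example value):
KERNEL=="rbd*", SUBSYSTEM=="block", ACTION=="add", ATTR{queue/max_sectors_kb}="4096"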
Cheers, Dan
On 27 Nov 2014, at 20:29, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
Oops, I was off by a factor of 1000 in my original subject. We actually have 4M
and 8M reads
if this impacts performance? Like small block size performance, etc?
Cheers
From: Dan Van Der Ster
daniel.vanders...@cern.ch
To: ceph-users ceph-users@lists.ceph.com
Sent: Friday, 28 November, 2014 1:33
Hi,
Which version of Ceph are you using? This could be related:
http://tracker.ceph.com/issues/9487
See "ReplicatedPG: don't move on to the next snap immediately"; basically, the
OSD is getting into a tight loop trimming the snapshot objects. The fix above
breaks out of that loop more frequently,
Hi,
On Apr 3, 2014 4:49 AM, Christian Balzer ch...@gol.com wrote:
On Tue, 1 Apr 2014 14:18:51 + Dan Van Der Ster wrote:
[snip]
http://www.slideshare.net/Inktank_Ceph/scaling-ceph-at-cern
[snap]
In that slide it says that replacing failed OSDs is automated via puppet.
I'm
Hi,
I'm not looking at your hardware in detail (except to say that you absolutely
must have 3 monitors and that I don't know what a load balancer would be useful
for in this setup), but perhaps the two parameters below may help you evaluate
your system.
To estimate the IOPS capacity of your
It’s pretty basic, but we run this hourly:
https://github.com/cernceph/ceph-scripts/blob/master/ceph-health-cron/ceph-health-cron
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 11 Apr 2014 at 09:12:13, Pavel V. Kaygorodov
(pa...@inasan.ru)
65536: 87783
131072: 87279
12288: 66735
49152: 50170
24576: 47794
262144: 45199
466944: 23064
So reads are mostly 512kB, which is probably some default read-ahead size.
-- Dan van der Ster || Data Storage Services || CERN IT Department --
Hi,
Gregory Farnum wrote:
I forget which clients you're using — is rbd caching enabled?
Yes, the clients are qemu-kvm-rhev with latest librbd from dumpling and
rbd cache = true.
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department
The last num is the size of the write/read.
Then run this:
https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl
Cheers, Dan
--
-- Dan van der Ster || Data Storage Services || CERN IT Department --
isn't the
file accesses leading to many small writes. Any other theories?
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
ceph osd reweight-by-utilization
Is that still in 0.79?
I'd start with reweight-by-utilization 200 and then adjust that number down
until you get to 120 or so.
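i.e. something like this, where the argument is the utilization threshold in percent of the cluster average:
ceph osd reweight-by-utilization 200
ceph osd reweight-by-utilization 150
ceph osd reweight-by-utilization 120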
Cheers, Dan
On Apr 18, 2014 12:49 PM, Kenneth Waegeman kenneth.waege...@ugent.be wrote:
Hi,
Some osds of our cluster filled up:
(d7ab4244396b57aac8b7e80812115bbd079e6b73)
How can i delete it forever?
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 28/04/14 14:54, Wido den Hollander wrote:
On 04/28/2014 02:15 PM, Andrija Panic wrote:
Thank you very much Wido,
any suggestion on compiling libvirt with support (I already found a way)
or perhaps use some prebuilt , that you would recommend ?
No special suggestions, just make sure you
rbd enabled qemu, qemu-img etc from the ceph.com site)
I need just libvirt with rbd support ?
Thanks
On 28 April 2014 15:05, Andrija Panic andrija.pa...@gmail.com wrote:
Thanks Dan :)
On 28 April 2014 15:02, Dan van der Ster
I've followed this recipe successfully in the past:
http://wiki.skytech.dk/index.php/Ceph_-_howto,_rbd,_lvm,_cluster#Add.2Fmove_journal_in_running_cluster
On May 6, 2014 12:34 PM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
Hi to all,
I would like to replace a disk used as
Hi,
Sage Weil wrote:
*Primary affinity*: Ceph now has the ability to skew selection of
OSDs as the primary copy, which allows the read workload to be
cheaply skewed away from parts of the cluster without migrating any
data.
Can you please elaborate a bit on this one? I found the
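For anyone searching later, the knob is set per OSD, something like this (a sketch; as far as I know it also needs 'mon osd allow primary affinity = true'):
ceph osd primary-affinity osd.0 0.5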
-0.67.7-0.el6.x86_64.rpm (568 kiB)
Yet the dates haven't changed.
Is that understood? It's not a malicious incident, is it?
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
Hi,
I think you're not getting many replies simply because those are rather
large servers and not many have such hardware in prod.
We run with 24x3TB drives, 64GB ram, one 10Gbit NIC. Memory-wise there
are no problems. Throughput-wise, the bottleneck is somewhere between
the NIC (~1GB/s) and
to a safe place.
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 23 May 2014, at 15:45, Fabian Zimmermann f.zimmerm...@xplosion.de wrote:
Hello,
I’m running a 3 node cluster with 2 hdd/osd and one mon on each node.
Sadly the fsyncs done by mon-processes
I very briefly tried kernel-rt from RH MRG, and it didn't make any noticeable
difference. Though I didn't spend any time tuning things.
Cheers, Dan
On May 25, 2014 11:04 AM, Stefan Priebe - Profihost AG s.pri...@profihost.ag
wrote:
Hi,
has anybody ever tried to use a low latency kernel for
for osd_snap_trim_sleep = … ?
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 04 Jun 2014, at 16:06, Sage Weil s...@inktank.com wrote:
On Wed, 4 Jun 2014, Dan Van Der Ster wrote:
Hi Sage, all,
On 21 May 2014, at 22:02, Sage Weil s...@inktank.com wrote:
* osd: allow snap trim throttling with simple delay (#6278, Sage Weil)
Do you have some advice about how
On 04 Jun 2014, at 16:06, Sage Weil s...@inktank.com wrote:
You can adjust this on running OSDs with something like 'ceph daemon
osd.NN config set osd_snap_trim_sleep .01' or with 'ceph tell osd.*
injectargs -- --osd-snap-trim-sleep .01'.
Thanks, trying that now.
I noticed that using =
I have not faced this, though I have done several Ceph cluster installations
with the package manager. I don’t want the EPEL
version of Ceph.
You probably need to tweak the repo priorities. We use priority=30 for
epel.repo, priority=5 for ceph.repo.
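i.e., with the yum priorities plugin installed, something like (a sketch):
# /etc/yum.repos.d/ceph.repo
[ceph]
...
priority=5
# /etc/yum.repos.d/epel.repo
[epel]
...
priority=30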
Cheers, Dan
-- Dan van der Ster || Data Storage
?
- If OTOH a disk/op thread is switching between scrubbing and client IO
responsibilities, could Ceph use ioprio_set to change the io priorities on the
fly??
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 10 Jun 2014, at 00:22, Craig Lewis
cle
Hi Greg,
This tracker issue is relevant: http://tracker.ceph.com/issues/7288
Cheers, Dan
On 11 Jun 2014, at 00:30, Gregory Farnum g...@inktank.com wrote:
Hey Mike, has your manual scheduling resolved this? I think I saw
another similar-sounding report, so a feature request to improve scrub
On 10 Jun 2014, at 11:59, Dan Van Der Ster daniel.vanders...@cern.ch wrote:
One idea I had was to check the behaviour under different disk IO schedulers,
trying to exploit thread IO priorities with cfq. So I have a question for the
developers about using ionice or ioprio_set to lower the IO
Hi Greg,
We're also due for a similar splitting exercise in the not too distant future,
and will also need to minimize the impact on latency.
In addition to increasing pg_num in small steps and using a minimal
max_backfills/recoveries configuration, I was planning to increase pgp_num very
Hi,
On 09 Jul 2014, at 14:44, Robert van Leeuwen robert.vanleeu...@spilgames.com
wrote:
I cannot add a new OSD to a current Ceph cluster.
It just hangs, here is the debug log:
This is ceph 0.72.1 on CentOS.
Found the issue:
Although I installed the specific ceph (0.72.1) version the
On 09 Jul 2014, at 15:30, Robert van Leeuwen robert.vanleeu...@spilgames.com
wrote:
Which leveldb from where? 1.12.0-5 that tends to be in el6/7 repos is broken
for Ceph.
You need to remove the “basho fix” patch.
1.7.0 is the only readily available version that works, though it is so old
this, and google isn’t helping.
I suggest that this should be added to ceph.com docs.
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 08 Aug 2014, at 10:46, Christian Kauhaus k...@gocept.com wrote:
Hi,
today I'd like to share a severe problem we've found
.
BTW, do you throttle your clients? We found that it's absolutely necessary,
since without a throttle just a few active VMs can eat up the entire IOPS
capacity of the cluster.
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 08 Aug 2014, at 13:51, Andrija
On 8 August 2014 15:44, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
Hi,
Here’s what we do to identify our top RBD users.
First, enable log level 10 for the filestore so you can see all the IOs coming
from the VMs. Then use a script like this (used
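Enabling that log level on the fly would be something like (a sketch):
ceph tell osd.* injectargs '--debug_filestore 10'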
Hi,
I changed the script to be a bit more flexible with the osd path. Give this a
try again:
https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl
Cheers, Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 11 Aug 2014, at 12:48, Andrija Panic
, then it might be that something else is reading
the disks heavily. One thing to check is updatedb — we had to disable it from
indexing /var/lib/ceph on our OSDs.
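That is roughly a one-line change (a sketch; merge the path into whatever your distro already prunes):
# /etc/updatedb.conf
PRUNEPATHS = "/tmp /var/spool /var/lib/ceph"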
Best Regards,
Dan
-- Dan van der Ster || Data Storage Services || CERN IT Department --
On 20 Aug 2014, at 16:39, Hugo Mills h.r.mi
max ops/bytes, and the filestore wbthrottle xfs *
options. (I’m not going to publish exact configs here because I haven’t
finished tuning yet).
Cheers, Dan
Thanks a lot!!
Best regards,
German Anders
On Wednesday 20/08/2014 at 11:51, Dan Van Der Ster wrote:
Hi,
Do you get slow
Hi Hugo,
On 20 Aug 2014, at 17:54, Hugo Mills h.r.mi...@reading.ac.uk wrote:
What are you using for OSD journals?
On each machine, the three OSD journals live on the same ext4
filesystem on an SSD, which is also the root filesystem of the
machine.
Also check the CPU usage for the mons
Hi,
You only have one OSD? I’ve seen similar strange things in test pools having
only one OSD — and I kinda explained it by assuming that OSDs need peers (other
OSDs sharing the same PG) to behave correctly. Install a second OSD and see how
it goes...
Cheers, Dan
On 21 Aug 2014, at 02:59,
Hi Hugo,
On 21 Aug 2014, at 14:17, Hugo Mills h.r.mi...@reading.ac.uk wrote:
Not sure what you mean about colocated journal/OSD. The journals
aren't on the same device as the OSDs. However, all three journals on
each machine are on the same SSD.
embarrassed I obviously didn’t drink
On Sep 19, 2013, at 6:10 PM, Gregory Farnum g...@inktank.com
wrote:
On Wed, Sep 18, 2013 at 11:43 PM, Dan Van Der Ster
daniel.vanders...@cern.ch wrote:
On Sep 18, 2013, at 11:50 PM, Gregory Farnum g...@inktank.com
wrote:
On Wed, Sep 18, 2013 at 6:33 AM, Dan Van Der Ster
daniel.vanders
On Sep 19, 2013, at 3:43 PM, Mark Nelson mark.nel...@inktank.com wrote:
If you set:
osd pool default flag hashpspool = true
Theoretically that will cause different pools to be distributed more randomly.
The name seems to imply that it should be settable per pool. Is that possible
now?
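For what it's worth, recent releases do appear to allow setting it per pool, something like (a sketch):
ceph osd pool set <poolname> hashpspool true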
. If the checksum is already stored per object in the OSD, is this retrievable
by librados? We have some applications which also need to know the checksum of
the data and this would be handy if it was already calculated by Ceph.
Thanks in advance!
Dan van der Ster
CERN
at the moment to check for myself, and the answer is relevant to this
discussion anyway).
Cheers,
Dan
Sage Weil s...@inktank.com wrote:
On Wed, 16 Oct 2013, Dan Van Der Ster wrote:
Hi all,
There has been some confusion the past couple days at the CHEP
conference during conversations about Ceph
On Wed, Oct 16, 2013 at 6:12 PM, Sage Weil s...@inktank.com wrote:
3. During deep scrub of an object with 2 replicas, suppose the checksum is
different for the two objects -- which object wins? (I.e. if you store the
checksum locally, this is trivial since the consistency of objects can be
On Thu, Oct 31, 2013 at 2:29 PM, Alexis GÜNST HORN
alexis.gunsth...@outscale.com wrote:
step take example
step emit
This is the problem, AFAICT. Just omit those two lines in both rules
and it should work.
Cheers, dan
On Thu, Oct 31, 2013 at 2:29 PM, Alexis GÜNST HORN
alexis.gunsth...@outscale.com wrote:
-11 0 drive hdd
-21 0 datacenter hdd-dc1
-1020 room hdd-dc1-A
-5030 host A-ceph-osd-2
Hi,
We’re trying the same, on SLC. We tried rbdmap but it seems to have some
ubuntu-isms which cause errors.
We also tried with rc.local, and you can map and mount easily, but at shutdown
we’re seeing the still-mapped images blocking a machine from shutting down
(libceph connection refused
Dear users/experts,
Does anyone know how to use radosgw-admin log show? It seems to not properly
read the --bucket parameter.
# radosgw-admin log show --bucket=asdf --date=2013-11-28-09
--bucket-id=default.7750582.1
error reading log 2013-11-28-09-default.7750582.1-: (2) No such file or
On Fri, Nov 29, 2013 at 12:13 PM, Charles 'Boyo charlesb...@gmail.com wrote:
That's because qemu-kvm
in CentOS 6.4 doesn't support librbd.
RedHat just added RBD support in qemu-kvm-rhev in RHEV 6.5. I don't
know if that will trickle down to CentOS but you can probably
recompile it yourself like
Hi,
See this one also: http://tracker.ceph.com/issues/6365
But I’m not sure the Inktank patched qemu-kvm is relevant any longer since
RedHat just released qemu-kvm-rhev with RBD support.
Cheers, Dan
On 02 Dec 2013, at 15:36, Darren Birkett darren.birk...@gmail.com wrote:
Hi List,
Any
to track down if rbd is actually enabled.
When/if I figure it out I will post it to the list.
On 12/02/2013 10:46 AM, Dan Van Der Ster wrote:
Hi,
See this one also: http://tracker.ceph.com/issues/6365
But I’m not sure the Inktank patched qemu-kvm is relevant any longer since
RedHat just