Second this. Also, regarding the long-lasting snapshot problem and related
performance issues, I can say that cuttlefish improved things greatly,
but creation/deletion of a large snapshot (hundreds of gigabytes of
committed data) can still bring the cluster down for minutes, despite
usage of every possible
Created #5844.
On Thu, Aug 1, 2013 at 10:38 PM, Samuel Just sam.j...@inktank.com wrote:
Is there a bug open for this? I suspect we don't sufficiently
throttle the snapshot removal work.
-Sam
On Thu, Aug 1, 2013 at 7:50 AM, Andrey Korolyov and...@xdel.ru wrote:
Second this. Also for long
On Tue, Aug 20, 2013 at 7:36 PM, Wido den Hollander w...@42on.com wrote:
Hi,
The current [0] libvirt storage pool code simply calls rbd_remove without
anything else.
As far as I know rbd_remove will fail if the image still has snapshots, you
have to remove those snapshots first before you
You may want to reduce the number of scrubbing PGs per OSD to 1 using the
config option and check the results.
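A minimal sketch of what that might look like in ceph.conf, assuming the
option meant is osd max scrubs (the restart/injectargs mechanics are omitted):
[osd]
    osd max scrubs = 1    # cap simultaneous scrub operations per OSD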
On Fri, Aug 30, 2013 at 8:03 PM, Mike Dawson mike.daw...@cloudapt.com wrote:
We've been struggling with an issue of spikes of high i/o latency with
qemu/rbd guests. As we've been chasing this bug, we've
PM, Andrey Korolyov wrote:
You may want to reduce the number of scrubbing PGs per OSD to 1 using the
config option and check the results.
On Fri, Aug 30, 2013 at 8:03 PM, Mike Dawson mike.daw...@cloudapt.com
wrote:
We've been struggling with an issue of spikes of high i/o latency with
qemu/rbd guests
Hello,
Since it has been a long time since cephx was enabled by default and we may
assume that everyone is using it, it seems worthwhile to introduce bits of
code hiding the key from the cmdline. The first applicable place for such an
improvement is most likely OpenStack envs with their sparse security
and usage of the admin
If anyone attends CloudConf Europe, it would be nice to meet
in the real world too.
On Wed, Sep 25, 2013 at 2:29 PM, Wido den Hollander w...@42on.com wrote:
On 09/25/2013 10:53 AM, Loic Dachary wrote:
Hi Eric Patrick,
Yesterday morning Eric suggested that organizing a ceph user meetup
Hello,
Not sure if this matches any real-world problem:
step time server 192.168.10.125 offset 30763065.968946 sec
#0 0x7f2d0294d405 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x7f2d02950b5b in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2 0x7f2d0324b875 in
Just my two cents:
XFS is quite unstable with Ceph, especially along with heavy CPU
usage, up to 3.7 (primarily soft lockups). I used 3.7 for eight months
before upgrading on a production system and it performs just perfectly.
On Tue, Oct 22, 2013 at 1:29 PM, Jeff Liu jeff@oracle.com wrote:
Hello,
Due to the many reports of ENOSPC for xfs-based stores, maybe it is worth
introducing an option to, say, ceph-deploy which will pass the allocsize=
param to the mount, effectively disabling dynamic preallocation? Of
course, not every case is really worth it because of the related performance
impact. If
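A hedged sketch of one place such a parameter can already be set, assuming
the OSD filesystems are mounted through the ceph tooling that honours the
osd mount options xfs setting; the allocsize value is only illustrative:
[osd]
    osd mount options xfs = rw,noatime,inode64,allocsize=4m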
On 03/24/2014 05:30 PM, Haomai Wang wrote:
Hi all,
As we know, a snapshot is a lightweight resource in librbd and we
don't have any statistics about it. But this causes some
problems for cloud management.
We can't measure the size of a snapshot; different snapshots will occur
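As a hedged aside, one common workaround for sizing a snapshot is to sum the
extents reported by rbd diff; the pool, image and snapshot names below are
illustrative:
$ rbd diff rbd/vm0@snap1 | awk '{ sum += $2 } END { print sum " bytes" }'
(add --from-snap to measure only the delta between two snapshots)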
Hello,
I do not know how many of you are aware of this work by Michael Hines
[0], but it looks like it can be extremely useful for critical applications
using qemu and, of course, Ceph at the block level. My thought was that
if the qemu rbd driver can provide any kind of metadata interface to mark
On Fri, Dec 27, 2013 at 9:09 PM, Andrey Korolyov and...@xdel.ru wrote:
On 12/27/2013 08:15 PM, Justin Erenkrantz wrote:
On Thu, Dec 26, 2013 at 9:17 PM, Sage Weil s...@inktank.com wrote:
I think the question comes down to whether Ceph should take some internal
action based on the information
On Wed, Jan 16, 2013 at 10:35 PM, Andrey Korolyov and...@xdel.ru wrote:
On Wed, Jan 16, 2013 at 8:58 PM, Sage Weil s...@inktank.com wrote:
Hi,
On Wed, 16 Jan 2013, Andrey Korolyov wrote:
On Wed, Jan 16, 2013 at 4:58 AM, Chen, Xiaoxi xiaoxi.c...@intel.com wrote:
Hi list,
We
Hi Matthew,
Seems to be a low value of /proc/sys/kernel/threads-max.
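A minimal sketch of checking and raising that limit; the value is illustrative:
$ cat /proc/sys/kernel/threads-max
$ sysctl -w kernel.threads-max=131072    # persist it via /etc/sysctl.conf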
On Thu, Jan 17, 2013 at 12:37 PM, Matthew Anderson
matth...@base3.com.au wrote:
I've run into a limit on the maximum number of RBD backed VM's that I'm able
to run on a single host. I have 20 VM's (21 RBD volumes open)
On Thu, Jan 17, 2013 at 7:00 PM, Atchley, Scott atchle...@ornl.gov wrote:
On Jan 17, 2013, at 9:48 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2013/1/17 Atchley, Scott atchle...@ornl.gov:
IB DDR should get you close to 2 GB/s with IPoIB. I have gotten our IB QDR
PCI-E
On Tue, Jan 22, 2013 at 10:05 AM, Sage Weil s...@inktank.com wrote:
We observed an interesting situation over the weekend. The XFS volume
ceph-osd locked up (hung in xfs_ilock) for somewhere between 2 and 4
minutes. After 3 minutes (180s), ceph-osd gave up waiting and committed
suicide. XFS
On Thu, Jan 24, 2013 at 12:59 AM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
Hi Sage,
I think the problem now is just that 'osd target transaction size' is
I set it to 50, and that seems to have solved all my problems.
After a day or so my cluster got to a HEALTH_OK state again.
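For reference, the setting Jens describes would look like this in ceph.conf:
[osd]
    osd target transaction size = 50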
On Thu, Jan 24, 2013 at 8:39 AM, Sage Weil s...@inktank.com wrote:
On Thu, 24 Jan 2013, Andrey Korolyov wrote:
On Thu, Jan 24, 2013 at 12:59 AM, Jens Kristian S?gaard
j...@mermaidconsulting.dk wrote:
Hi Sage,
I think the problem now is just that 'osd target transaction size' is
I set
On Fri, Jan 25, 2013 at 7:51 PM, Sage Weil s...@inktank.com wrote:
On Fri, 25 Jan 2013, Andrey Korolyov wrote:
On Fri, Jan 25, 2013 at 4:52 PM, Ugis ugi...@gmail.com wrote:
I mean if you map an rbd and do not use the rbd lock .. command. Can you
tell which client has mapped a certain rbd anyway
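A hedged sketch of the advisory locking being referred to (image and lock
names are illustrative); without it there is no built-in record of who
mapped the image:
$ rbd lock add mypool/myimage host-a-lock
$ rbd lock list mypool/myimage    # shows lock ids and the locker entities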
On Sat, Jan 26, 2013 at 3:40 AM, Sam Lang sam.l...@inktank.com wrote:
On Fri, Jan 25, 2013 at 10:07 AM, Andrey Korolyov and...@xdel.ru wrote:
Sorry, I wrote too little yesterday because of being sleepy.
That's obviously cache pressure, since dropping caches resulted in the
disappearance
On Sat, Jan 26, 2013 at 12:41 PM, Andrey Korolyov and...@xdel.ru wrote:
On Sat, Jan 26, 2013 at 3:40 AM, Sam Lang sam.l...@inktank.com wrote:
On Fri, Jan 25, 2013 at 10:07 AM, Andrey Korolyov and...@xdel.ru wrote:
Sorry, I wrote too little yesterday because of being sleepy.
That`s
On Mon, Jan 28, 2013 at 5:48 PM, Sam Lang sam.l...@inktank.com wrote:
On Sun, Jan 27, 2013 at 2:52 PM, Andrey Korolyov and...@xdel.ru wrote:
Ahem, once on an almost empty node the same trace was produced by a qemu
process (which was actually pinned to a specific NUMA node), so it seems
that's generally
On Mon, Jan 28, 2013 at 8:55 PM, Andrey Korolyov and...@xdel.ru wrote:
On Mon, Jan 28, 2013 at 5:48 PM, Sam Lang sam.l...@inktank.com wrote:
On Sun, Jan 27, 2013 at 2:52 PM, Andrey Korolyov and...@xdel.ru wrote:
Ahem, once on an almost empty node the same trace was produced by a qemu
process(which
http://xdel.ru/downloads/ceph-log/rados-out.txt.gz
On Thu, Jan 31, 2013 at 10:31 PM, Gregory Farnum g...@inktank.com wrote:
Can you pastebin the output of rados -p rbd ls?
On Thu, Jan 31, 2013 at 10:17 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
Please take a look, this data remains
On Thu, Jan 31, 2013 at 11:18 PM, Andrey Korolyov and...@xdel.ru wrote:
On Thu, Jan 31, 2013 at 10:56 PM, Gregory Farnum g...@inktank.com wrote:
On Thu, Jan 31, 2013 at 10:50 AM, Andrey Korolyov and...@xdel.ru wrote:
http://xdel.ru/downloads/ceph-log/rados-out.txt.gz
On Thu, Jan 31, 2013
On Mon, Feb 4, 2013 at 1:46 AM, Gregory Farnum g...@inktank.com wrote:
On Sunday, February 3, 2013 at 11:45 AM, Andrey Korolyov wrote:
Just an update: this data stayed after pool deletion, so there is
probably a way to delete garbage bytes on live pool without doing any
harm(hope so), since
Hi Stefan,
you may be interested in throttle(1) as a side solution with the stdout
export option. By the way, on which interconnect have you managed to
get such speeds, if you mean 'committed' bytes (e.g. not an almost empty
allocated image)?
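A hedged sketch of that combination, assuming the Debian throttle(1) utility
(where -M limits the stream to the given number of megabytes per second) and
an illustrative image name:
$ rbd export rbd/vm-image@snap - | throttle -M 30 > /backup/vm-image.raw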
On Wed, Feb 13, 2013 at 12:22 AM, Stefan Priebe
Can anyone who hit this bug please confirm that your system contains libc 2.15+?
On Tue, Feb 5, 2013 at 1:27 AM, Sébastien Han han.sebast...@gmail.com wrote:
oh nice, the pattern also matches path :D, didn't know that
thanks Greg
--
Regards,
Sébastien Han.
On Mon, Feb 4, 2013 at 10:22 PM,
On Thu, Jan 24, 2013 at 10:01 PM, Sage Weil s...@inktank.com wrote:
On Thu, 24 Jan 2013, Andrey Korolyov wrote:
On Thu, Jan 24, 2013 at 8:39 AM, Sage Weil s...@inktank.com wrote:
On Thu, 24 Jan 2013, Andrey Korolyov wrote:
On Thu, Jan 24, 2013 at 12:59 AM, Jens Kristian S?gaard
j
On Wed, Feb 13, 2013 at 12:22 AM, Stefan Priebe s.pri...@profihost.ag wrote:
Hi,
is there a speed limit option for rbd export? Right now I'm able to cause
several SLOW requests for IMPORTANT valid requests just by exporting a
snapshot which is not really important.
rbd export runs
On Tue, Feb 26, 2013 at 6:56 PM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:
Hi list,
how can I do a short maintenance like a kernel upgrade on an OSD host?
Right now ceph starts to backfill immediately if I say:
ceph osd out 41
...
Without the ceph osd out command all clients
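One commonly used approach for short maintenance windows (not taken from this
message, just a hedged sketch) is to prevent the OSDs from being marked out
while the host is down, so no backfill is triggered:
$ ceph osd set noout
... reboot / upgrade the host, wait for its OSDs to rejoin ...
$ ceph osd unset noout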
Hello,
I'm experiencing the same long-lasting problem - during recovery ops, some
percentage of read I/O remains in-flight for seconds, rendering the
upper-level filesystem on the qemu client very slow and almost
unusable. Different striping has almost no effect on the visible delays,
and reads may be
Hello,
Is there an existing or planned way to save an image from such a thing,
other than a protected snapshot? While ``rbd snap protect'' is good enough
for small or inactive images, large ones may add significant overhead
in space or I/O when the 'locking' snapshot is present, so it would be nice
to
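For reference, the protection mechanism being weighed here looks roughly like
this (pool/image names are illustrative):
$ rbd snap create dev-rack0/vm0@keep
$ rbd snap protect dev-rack0/vm0@keep     # the snapshot cannot be removed until unprotected
$ rbd snap unprotect dev-rack0/vm0@keep   # undo before deleting the snapshot or the image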
On Thu, Apr 18, 2013 at 5:43 PM, Mark Nelson mark.nel...@inktank.com wrote:
On 04/18/2013 06:46 AM, James Harper wrote:
I'm doing some basic testing so I'm not really fussed about poor
performance, but my write performance appears to be so bad I think I'm doing
something wrong.
Using dd to
Hello,
Using db2bb270e93ed44f9252d65d1d4c9b36875d0ea5 I have observed some
disaster-like behavior after the ``pool create'' command - every osd
daemon in the cluster will die at least once (some will crash several times in
a row after being brought back). Please take a look at the
backtraces (almost identical)
Wow, very glad to hear that. I tried with the regular FS tunable and
there was almost no effect on the regular test, so I thought that
reads cannot be improved at all in this direction.
On Mon, Jul 29, 2013 at 2:24 PM, Li Wang liw...@ubuntukylin.com wrote:
We performed Iozone read test on a
Hi,
Recently I've reduced my test suite from 6 to 4 osds at ~60% usage on a
six-node setup,
and I have removed a bunch of rbd objects during recovery to avoid
overfill.
Right now I'm constantly receiving a warning about nearfull state on a
non-existing osd:
health HEALTH_WARN 1 near full osd(s)
monmap
On Fri, Jul 13, 2012 at 9:09 PM, Sage Weil s...@inktank.com wrote:
On Fri, 13 Jul 2012, Gregory Farnum wrote:
On Fri, Jul 13, 2012 at 1:17 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
Recently I`ve reduced my test suite from 6 to 4 osds at ~60% usage on
six-node,
and I have removed
On Mon, Jul 16, 2012 at 10:48 PM, Gregory Farnum g...@inktank.com wrote:
ceph pg set_full_ratio 0.95
ceph pg set_nearfull_ratio 0.94
On Monday, July 16, 2012 at 11:42 AM, Andrey Korolyov wrote:
On Mon, Jul 16, 2012 at 8:12 PM, Gregory Farnum g...@inktank.com
(mailto:g...@inktank.com
On Wed, Jul 18, 2012 at 10:09 AM, Gregory Farnum g...@inktank.com wrote:
On Monday, July 16, 2012 at 11:55 AM, Andrey Korolyov wrote:
On Mon, Jul 16, 2012 at 10:48 PM, Gregory Farnum g...@inktank.com
(mailto:g...@inktank.com) wrote:
ceph pg set_full_ratio 0.95
ceph pg set_nearfull_ratio
On Wed, Jul 18, 2012 at 11:18 AM, Gregory Farnum g...@inktank.com wrote:
On Tuesday, July 17, 2012 at 11:22 PM, Andrey Korolyov wrote:
On Wed, Jul 18, 2012 at 10:09 AM, Gregory Farnum g...@inktank.com
(mailto:g...@inktank.com) wrote:
On Monday, July 16, 2012 at 11:55 AM, Andrey Korolyov
On Wed, Jul 18, 2012 at 10:30 PM, Gregory Farnum g...@inktank.com wrote:
On Wed, Jul 18, 2012 at 12:47 AM, Andrey Korolyov and...@xdel.ru wrote:
On Wed, Jul 18, 2012 at 11:18 AM, Gregory Farnum g...@inktank.com wrote:
On Tuesday, July 17, 2012 at 11:22 PM, Andrey Korolyov wrote:
On Wed, Jul 18
On Thu, Jul 19, 2012 at 1:28 AM, Gregory Farnum g...@inktank.com wrote:
On Wed, Jul 18, 2012 at 12:07 PM, Andrey Korolyov and...@xdel.ru wrote:
On Wed, Jul 18, 2012 at 10:30 PM, Gregory Farnum g...@inktank.com wrote:
On Wed, Jul 18, 2012 at 12:47 AM, Andrey Korolyov and...@xdel.ru wrote
Hi,
I've finally managed to run an rbd-related test on relatively powerful
machines and here is what I have got:
1) Reads on an almost fairly balanced cluster (eight nodes) did very well,
utilizing almost all disk and network bandwidth (dual gbit 802.3ad nics, sata
disks behind an lsi sas 2108 with wt cache gave me
On 07/31/2012 07:17 PM, Mark Nelson wrote:
Hi Andrey!
On 07/31/2012 10:03 AM, Andrey Korolyov wrote:
Hi,
I`ve finally managed to run rbd-related test on relatively powerful
machines and what I have got:
1) Reads on almost fair balanced cluster(eight nodes) did very well,
utilizing almost
On 07/31/2012 07:53 PM, Josh Durgin wrote:
On 07/31/2012 08:03 AM, Andrey Korolyov wrote:
Hi,
I`ve finally managed to run rbd-related test on relatively powerful
machines and what I have got:
1) Reads on almost fair balanced cluster(eight nodes) did very well,
utilizing almost all disk
On Thu, Aug 23, 2012 at 2:33 AM, Sage Weil s...@inktank.com wrote:
On Thu, 23 Aug 2012, Andrey Korolyov wrote:
Hi,
today during a heavy test a pair of osds and one mon died, resulting in a
hard lockup of some kvm processes - they became unresponsive and were
killed, leaving zombie processes ([kvm
On Tue, Aug 28, 2012 at 12:47 AM, Sébastien Han han.sebast...@gmail.com wrote:
Hi community,
For those of you who are interested, I performed several benchmarks of
RADOS and RBD on different types of hardware and use cases.
You can find my results here:
(commit:a7ad701b9bd479f20429f19e6fea7373ca6bba7c)
On Sun, Aug 26, 2012 at 8:52 PM, Andrey Korolyov and...@xdel.ru wrote:
During recovery, the following crash happens (similar to
http://tracker.newdream.net/issues/2126 which was marked resolved long
ago):
http://xdel.ru/downloads/ceph-log/osd-2012-08-26.txt
Hi,
This is completely off-list, but I'm asking because only ceph triggers
such a bug :).
With 0.51, the following happens: if I kill an osd, one or more neighbor
nodes may go into a hung state with cpu lockups, not related to
temperature or overall interrupt count or load average, and it happens randomly
over
On Thu, Sep 13, 2012 at 1:09 AM, Tommi Virtanen t...@inktank.com wrote:
On Wed, Sep 12, 2012 at 10:33 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
This is completely off-list, but I`m asking because only ceph trigger
such a bug :) .
With 0.51, following happens: if I kill an osd, one
On Tue, Sep 18, 2012 at 4:37 PM, Guido Winkelmann
guido-c...@thisisnotatest.de wrote:
Am Dienstag, 11. September 2012, 17:25:49 schrieben Sie:
The next stable release will have cephx authentication enabled by default.
Hm, that could be a problem for me. I have tried multiple times to get cephx
On Tue, Sep 18, 2012 at 5:34 PM, Andrey Korolyov and...@xdel.ru wrote:
On Tue, Sep 18, 2012 at 4:37 PM, Guido Winkelmann
guido-c...@thisisnotatest.de wrote:
Am Dienstag, 11. September 2012, 17:25:49 schrieben Sie:
The next stable release will have cephx authentication enabled by default.
Hm
On Thu, Sep 13, 2012 at 1:43 AM, Andrey Korolyov and...@xdel.ru wrote:
On Thu, Sep 13, 2012 at 1:09 AM, Tommi Virtanen t...@inktank.com wrote:
On Wed, Sep 12, 2012 at 10:33 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
This is completely off-list, but I`m asking because only ceph trigger
On Mon, Oct 1, 2012 at 8:42 PM, Tommi Virtanen t...@inktank.com wrote:
On Sun, Sep 30, 2012 at 2:55 PM, Andrey Korolyov and...@xdel.ru wrote:
Short post mortem - EX3200/12.1R2.9 may begin to drop packets (seems
to appear more likely on 0.51 traffic patterns, which is very strange
for L2
Hi,
Recent tests on my test rack with a 20G IB (IPoIB, 64k mtu, default
CUBIC, CFQ, LSI SAS 2108 w/ wb cache) interconnect show quite
fantastic performance - on both reads and writes Ceph completely
utilizes all disk bandwidth, as high as 0.9 of the theoretical limit of
the sum of all bandwidths, bearing
On Wed, Oct 31, 2012 at 1:07 AM, Josh Durgin josh.dur...@inktank.com wrote:
On 10/28/2012 03:02 AM, Andrey Korolyov wrote:
Hi,
Should the following behavior be considered normal?
$ rbd map test-rack0/debiantest --user qemukvm --secret qemukvm.key
$ fdisk /dev/rbd1
Command (m for help): p
On Mon, Nov 5, 2012 at 11:33 PM, Stefan Priebe s.pri...@profihost.ag wrote:
Am 04.11.2012 15:12, schrieb Sage Weil:
On Sun, 4 Nov 2012, Stefan Priebe wrote:
Can i merge wip-rbd-read into master?
Yeah. I'm going to do a bit more testing first before I do it, but it
should apply cleanly.
On Thu, Nov 8, 2012 at 4:00 PM, Wido den Hollander w...@widodh.nl wrote:
On 08-11-12 10:04, Stefan Priebe - Profihost AG wrote:
Hello list,
is there any preferred way to set up clock synchronisation?
I've tried running openntpd and ntpd on all servers but I'm still getting:
2012-11-08
On Thu, Nov 8, 2012 at 7:02 PM, Atchley, Scott atchle...@ornl.gov wrote:
On Nov 8, 2012, at 10:00 AM, Scott Atchley atchle...@ornl.gov wrote:
On Nov 8, 2012, at 9:39 AM, Mark Nelson mark.nel...@inktank.com wrote:
On 11/08/2012 07:55 AM, Atchley, Scott wrote:
On Nov 8, 2012, at 3:22 AM,
On Thu, Nov 8, 2012 at 7:53 PM, Alexandre DERUMIER aderum...@odiso.com wrote:
So it is a problem of KVM which lets the processes jump between cores a
lot.
Maybe numad from Red Hat can help?
http://fedoraproject.org/wiki/Features/numad
It tries to keep a process on the same NUMA node and I think
Hi,
Please take a look, seems harmless:
$ rbd mv vm0
terminate called after throwing an instance of 'std::logic_error'
what(): basic_string::_S_construct null not valid
*** Caught signal (Aborted) **
in thread 7f85f5981780
ceph version 0.53 (commit:2528b5ee105b16352c91af064af5c0b5a7d45d7c)
Hi,
For this version, rbd cp assumes that the destination pool is the same as
the source, not 'rbd', if the pool in the destination path is omitted.
rbd cp install/img testimg
rbd ls install
img testimg
Is this change permanent?
Thanks!
Hi,
In 0.54, cephx is probably broken somehow:
$ ceph auth add client.qemukvm osd 'allow *' mon 'allow *' mds 'allow
*' -i qemukvm.key
2012-11-14 15:51:23.153910 7ff06441f780 -1 read 65 bytes from qemukvm.key
added key for client.qemukvm
$ ceph auth list
...
client.admin
key: [xx]
On Thu, Nov 15, 2012 at 4:56 AM, Dan Mick dan.m...@inktank.com wrote:
On 11/12/2012 02:47 PM, Josh Durgin wrote:
On 11/12/2012 08:30 AM, Andrey Korolyov wrote:
Hi,
For this version, rbd cp assumes that destination pool is the same as
source, not 'rbd', if pool in the destination path
On Thu, Nov 15, 2012 at 5:03 PM, Andrey Korolyov and...@xdel.ru wrote:
On Thu, Nov 15, 2012 at 5:12 AM, Yehuda Sadeh yeh...@inktank.com wrote:
On Wed, Nov 14, 2012 at 4:20 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
In the 0.54 cephx is probably broken somehow:
$ ceph auth add
closely to the /dev layout, or,
at least, iSCSI targets; when not specifying a full path or using some
predefined default prefix it makes no sense at all.
On Wed, Nov 14, 2012 at 10:43 PM, Andrey Korolyov and...@xdel.ru wrote:
On Thu, Nov 15, 2012 at 4:56 AM, Dan Mick dan.m...@inktank.com wrote:
On 11/12
Hi,
Somehow I have managed to produce an unkillable snapshot, which does not
allow removing itself or the parent image:
$ rbd snap purge dev-rack0/vm2
Removing all snapshots: 100% complete...done.
$ rbd rm dev-rack0/vm2
2012-11-21 16:31:24.184626 7f7e0d172780 -1 librbd: image has snapshots
- not
On Thu, Nov 22, 2012 at 2:05 AM, Josh Durgin josh.dur...@inktank.com wrote:
On 11/21/2012 04:50 AM, Andrey Korolyov wrote:
Hi,
Somehow I have managed to produce unkillable snapshot, which does not
allow to remove itself or parent image:
$ rbd snap purge dev-rack0/vm2
Removing all
Hi,
In recent versions Ceph introduces some unexpected behavior for
permanent connections (VM or kernel clients) - after crash
recovery, I/O will hang on the next planned scrub in the following
scenario:
- launch a bunch of clients doing non-intensive writes,
- lose one or more osd, mark
On Fri, Nov 23, 2012 at 12:35 AM, Sage Weil s...@inktank.com wrote:
On Thu, 22 Nov 2012, Andrey Korolyov wrote:
Hi,
In the recent versions Ceph introduces some unexpected behavior for
the permanent connections (VM or kernel clients) - after crash
recovery, I/O will hang on the next planned
On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil s...@inktank.com wrote:
Hi Stefan,
On Thu, 15 Nov 2012, Sage Weil wrote:
On Thu, 15 Nov 2012, Stefan Priebe - Profihost AG wrote:
Am 14.11.2012 15:59, schrieb Sage Weil:
Hi Stefan,
It would be nice to confirm that no clients are waiting on the
readjusted cluster before the bug shows
itself (say, within a day)?
On Tue, Nov 27, 2012 at 11:47 PM, Andrey Korolyov and...@xdel.ru wrote:
On Wed, Nov 28, 2012 at 5:51 AM, Sage Weil s...@inktank.com wrote:
Hi Stefan,
On Thu, 15 Nov 2012, Sage Weil wrote:
On Thu, 15 Nov 2012, Stefan
On Thu, Nov 29, 2012 at 8:34 PM, Sage Weil s...@inktank.com wrote:
On Thu, 29 Nov 2012, Andrey Korolyov wrote:
$ ceph osd down -
osd.0 is already down
$ ceph osd down ---
osd.0 is already down
the same for ``+'', ``/'', ``%'' and so on - I think that for the osd subsystem
the ceph cli should explicitly
:
If you can reproduce it again, what we really need are the osd logs
from the acting set of a pg stuck in scrub with
debug osd = 20
debug ms = 1
debug filestore = 20.
Thanks,
-Sam
On Sun, Nov 25, 2012 at 2:08 PM, Andrey Korolyov and...@xdel.ru wrote:
On Fri, Nov 23, 2012 at 12:35 AM, Sage
, it's our handling of active_pushes. I'll
have a patch shortly.
Thanks!
-Sam
On Fri, Nov 30, 2012 at 4:14 AM, Andrey Korolyov and...@xdel.ru wrote:
http://xdel.ru/downloads/ceph-log/ceph-scrub-stuck.log.gz
http://xdel.ru/downloads/ceph-log/cluster-w.log.gz
Here, please.
I have initiated
Hi,
Today during a planned kernel upgrade one of the osds (which I have not
touched yet) started to complain about a ``misdirected client'':
2012-12-12 21:22:59.107648 osd.20 [WRN] client.2774043
10.5.0.33:0/1013711 misdirected client.2774043.0:114 pg 5.ad140d42 to
osd.20 in e23834, client e23834 pg 5.542
On Sun, Dec 16, 2012 at 5:59 PM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
Hi,
My log is filling up with warnings about a single slow request that has been
around for a very long time:
osd.1 10.0.0.2:6800/900 162926 : [WRN] 1 slow requests, 1 included below;
oldest blocked for
Hi,
After the recent switch to the default ``--stripe-count 1'' on image upload I
have observed a strange thing - a single import or deletion of a
striped image may temporarily turn off the entire cluster, literally (see
log below).
Of course the next issued osd map fixes the situation, but all in-flight
On Mon, Dec 17, 2012 at 2:42 AM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
Hi Andrey,
Thanks for your reply!
Please take a look at this thread:
http://comments.gmane.org/gmane.comp.file-systems.ceph.devel/10843
I took your advice and restarted each of my three osd's
On Mon, Dec 17, 2012 at 2:36 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
After recent switch do default ``--stripe-count 1'' on image upload I
have observed some strange thing - single import or deletion of the
striped image may temporarily turn off entire cluster, literally(see
log below
, I have
started playing with TCP settings and found that ipv4.tcp_low_latency
raises the possibility of a ``wrong mark'' event several times over when enabled
- so the area of all possible causes quickly collapsed to a media-only
problem and I fixed the problem soon.
On Wed, Dec 19, 2012 at 3:53 AM, Andrey
On Sun, Dec 30, 2012 at 9:05 PM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
Hi guys,
I'm testing Ceph as storage for KVM virtual machine images and found an
inconvenience that I am hoping it is possible to find the cause of.
I'm running a single KVM Linux guest on top of Ceph
On Mon, Dec 31, 2012 at 3:12 AM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
Hi Andrey,
Thanks for your reply!
You may try to play with SCHED_RT; I have found it hard to use
myself, but you can achieve your goal by adding small RT slices via the
``cpu'' cgroup to the vcpu/emulator
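A hedged sketch with the cgroup-v1 cpu controller (requires RT group
scheduling in the kernel); the cgroup path is hypothetical and depends on how
libvirt lays out the vcpu/emulator groups:
# allow up to 10 ms of realtime runtime per 100 ms period for one vcpu group
echo 100000 > /sys/fs/cgroup/cpu/machine/vm1/vcpu0/cpu.rt_period_us
echo 10000  > /sys/fs/cgroup/cpu/machine/vm1/vcpu0/cpu.rt_runtime_us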
On Mon, Dec 31, 2012 at 2:58 PM, Jens Kristian Søgaard
j...@mermaidconsulting.dk wrote:
Hi Andrey,
As I understood it, you have an md device holding both the journal and
the filestore? What type of raid do you have here?
Yes, same md device holding both journal and filestore. It is a raid5.
Ahem, of
Hi,
All osds in the dev cluster died shortly after the upgrade (package-only,
i.e. a binary upgrade, even without restarting the running processes); please
see the attached file.
Was: 0.55.1-356-g850d1d5
Upgraded to: 0.56 tag
The only difference is the version of libstdc++ corresponding to the gcc
- 4.6 on the
On Tue, Jan 1, 2013 at 9:49 PM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
All osds in the dev cluster died shortly after upgrade (packet-only,
i.e. binary upgrade, even without restart running processes), please
see attached file.
Was: 0.55.1-356-g850d1d5
Upgraded to: 0.56 tag
The only
On Wed, Jan 2, 2013 at 12:16 AM, Andrey Korolyov and...@xdel.ru wrote:
On Tue, Jan 1, 2013 at 9:49 PM, Andrey Korolyov and...@xdel.ru wrote:
Hi,
All osds in the dev cluster died shortly after upgrade (packet-only,
i.e. binary upgrade, even without restart running processes), please
see
I have just observed that the ceph-mon process, at least the bobtail one, has
an extremely high density of writes - times above the _overall_ cluster
amount of writes, as measured by the qemu driver (and those numbers are very
close to being fair). For example, a test cluster of 32 osds has 7.5 MByte/s of
writes on each mon node
On Wed, Jan 2, 2013 at 8:00 PM, Joao Eduardo Luis joao.l...@inktank.com wrote:
On 01/02/2013 03:40 PM, Andrey Korolyov wrote:
I have just observed that ceph-mon process, at least bobtail one, has
an extremely high density of writes - times above _overall_ cluster
amount of writes, measured
On Tue, Jan 8, 2013 at 11:30 AM, Stefan Priebe - Profihost AG
s.pri...@profihost.ag wrote:
Hi,
I cannot see any git tag or branch claiming to be 0.56.1? Which commit id is
this?
Greets
Stefan
Same for me; github simply did not send the new tag in the pull to the
local tree for some reason.
/osd-lockup-2-14-33-16.741603.log.gz
Timestamps in the filenames were added for easier lookup; the osdmap marked the
osds as down a couple of beats after those marks.
On Mon, Dec 31, 2012 at 1:16 AM, Andrey Korolyov and...@xdel.ru wrote:
On Sun, Dec 30, 2012 at 10:56 PM, Samuel Just sam.j...@inktank.com
with? If you run a rados bench from both
machines, what do the results look like?
Also, can you do the ceph osd bench on each of your OSDs, please?
(http://ceph.newdream.net/wiki/Troubleshooting#OSD_performance)
-Greg
On Monday, March 19, 2012 at 6:46 AM, Andrey Korolyov wrote:
More strangely
~4096] 0.17eb9fd8) v4)
Sorry for my previous question about rbd chunks, it was really stupid :)
On Mon, Mar 19, 2012 at 10:40 PM, Josh Durgin josh.dur...@dreamhost.com wrote:
On 03/19/2012 11:13 AM, Andrey Korolyov wrote:
Nope, I'm using KVM for rbd guests. Surely I've noticed that Sage
On Fri, Mar 23, 2012 at 5:25 AM, Andrey Korolyov and...@xdel.ru wrote:
Hi Sam,
Can you please suggest where to start profiling the osd? If the
bottleneck were related to such non-complex things as directio speed,
I'm sure I would have been able to catch it long ago, even just by cross-checking
results
Hi,
# virsh blkdeviotune Test vdb --write_iops_sec 50 //file block device
# virsh blkdeviotune Test vda --write_iops_sec 50 //rbd block device
error: Unable to change block I/O throttle
error: invalid argument: No device found for specified path
2012-04-03 07:38:49.170+: 30171: debug :
But I am able to set static limits in the config for rbd :) All I want
is to change them on-the-fly.
It is NOT a cgroups mechanism, but completely qemu-driven.
On Tue, Apr 3, 2012 at 12:21 PM, Wido den Hollander w...@widodh.nl wrote:
Hi,
Op 3-4-2012 10:02, Andrey Korolyov schreef:
Hi,
# virsh
-2012 10:28, Andrey Korolyov schreef:
But I am able to set static limits in the config for rbd :) All I want
is a change on-the-fly.
It is NOT cgroups mechanism, but completely qemu-driven.
Are you sure about that?
http://libvirt.org/formatdomain.html#elementsBlockTuning
Browsing through
The suggested hack works; it seems that the libvirt devs did not remove the block
limitation because they consider this feature experimental, or forgot about
it.
On Tue, Apr 3, 2012 at 12:55 PM, Andrey Korolyov and...@xdel.ru wrote:
At least the elements under the iotune block apply to rbd and you can
test
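For reference, a sketch of the iotune element under a disk definition, per the
libvirt domain XML documentation linked above (device and value are
illustrative):
<disk type='network' device='disk'>
  ...
  <iotune>
    <write_iops_sec>50</write_iops_sec>
  </iotune>
</disk>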