Re: Collection of strange lockups on 0.51

2012-09-12 Thread Andrey Korolyov
On Thu, Sep 13, 2012 at 1:09 AM, Tommi Virtanen wrote: > On Wed, Sep 12, 2012 at 10:33 AM, Andrey Korolyov wrote: >> Hi, >> This is completely off-list, but I'm asking because only Ceph triggers >> such a bug :) . >> >> With 0.51, the following happens: if I kill an o

Re: enabling cephx by default

2012-09-18 Thread Andrey Korolyov
On Tue, Sep 18, 2012 at 4:37 PM, Guido Winkelmann wrote: > On Tuesday, 11 September 2012, at 17:25:49, you wrote: >> The next stable release will have cephx authentication enabled by default. > > Hm, that could be a problem for me. I have tried multiple times to get cephx > working in the past,

Re: enabling cephx by default

2012-09-18 Thread Andrey Korolyov
On Tue, Sep 18, 2012 at 5:34 PM, Andrey Korolyov wrote: > On Tue, Sep 18, 2012 at 4:37 PM, Guido Winkelmann > wrote: >> On Tuesday, 11 September 2012, at 17:25:49, you wrote: >>> The next stable release will have cephx authentication enabled by default. >> >> H

Re: Collection of strange lockups on 0.51

2012-09-30 Thread Andrey Korolyov
On Thu, Sep 13, 2012 at 1:43 AM, Andrey Korolyov wrote: > On Thu, Sep 13, 2012 at 1:09 AM, Tommi Virtanen wrote: >> On Wed, Sep 12, 2012 at 10:33 AM, Andrey Korolyov wrote: >>> Hi, >>> This is completely off-list, but I'm asking because only Ceph triggers >>>

Re: Collection of strange lockups on 0.51

2012-10-03 Thread Andrey Korolyov
On Mon, Oct 1, 2012 at 8:42 PM, Tommi Virtanen wrote: > On Sun, Sep 30, 2012 at 2:55 PM, Andrey Korolyov wrote: >> Short post-mortem - the EX3200/12.1R2.9 may begin to drop packets (this seems >> to happen more often under 0.51 traffic patterns, which is very strange >> for an L2 switc

Ignore O_SYNC for rbd cache

2012-10-10 Thread Andrey Korolyov
Hi, Recent tests on my test rack with a 20G IB interconnect (IPoIB, 64k MTU, default CUBIC, CFQ, LSI SAS 2108 w/ wb cache) show quite fantastic performance - on both reads and writes Ceph completely utilizes the disk bandwidth, reaching as much as 0.9 of the theoretical limit of the sum of all bandwidths bearing i
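
A minimal sketch of how the rbd client cache is typically switched on in this era, assuming qemu built with rbd support; the cache sizes and the pool/image name rbd/vm0 are arbitrary examples, and the cache still honours flushes rather than silently ignoring O_SYNC:

    # ceph.conf on the client side
    [client]
        rbd cache = true
        rbd cache size = 33554432        # 32 MB per-client cache (example value)
        rbd cache max dirty = 25165824   # start flushing once this much is dirty

    # guest drive using the cache; cache=writeback lets librbd buffer writes
    qemu-system-x86_64 ... -drive file=rbd:rbd/vm0,format=raw,cache=writeback,if=virtio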

Re: Different geoms for an rbd block device

2012-10-30 Thread Andrey Korolyov
On Wed, Oct 31, 2012 at 1:07 AM, Josh Durgin wrote: > On 10/28/2012 03:02 AM, Andrey Korolyov wrote: >> >> Hi, >> >> Should the following behavior be considered normal? >> >> $ rbd map test-rack0/debiantest --user qemukvm --secret qemukvm.key >> $
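
For comparing the two views, a sketch along these lines (image name taken from the quoted command, guest device path is an assumption) shows the size the kernel client reports versus what the guest tooling derives:

    # size and object layout as seen by the host-side kernel client
    rbd info test-rack0/debiantest
    blockdev --getsize64 /dev/rbd/test-rack0/debiantest

    # geometry printed inside the guest; the CHS values are derived by the
    # tooling, not stored on the rbd image itself
    fdisk -l /dev/vda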

Re: BUG: kvm crashing in void librbd::AioCompletion::complete_request

2012-11-05 Thread Andrey Korolyov
On Mon, Nov 5, 2012 at 11:33 PM, Stefan Priebe wrote: > On 04.11.2012 15:12, Sage Weil wrote: >> >> On Sun, 4 Nov 2012, Stefan Priebe wrote: >>> >>> Can I merge wip-rbd-read into master? >> >> >> Yeah. I'm going to do a bit more testing first before I do it, but it >> should apply cleanly. Hop

Re: clock syncronisation

2012-11-08 Thread Andrey Korolyov
On Thu, Nov 8, 2012 at 4:00 PM, Wido den Hollander wrote: > > > On 08-11-12 10:04, Stefan Priebe - Profihost AG wrote: >> >> Hello list, >> >> is there any preferred way to do clock synchronisation? >> >> I've tried running openntpd and ntpd on all servers but I'm still getting: >> 2012-11-08 09:55
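
A sketch of the usual checks plus the knob that governs the monitor's skew warning; the 0.1 s value is only an example, and running ntpd on every monitor host remains the real fix:

    # confirm each mon host has actually converged on its NTP peers
    ntpq -p

    # ceph.conf: relax the clock-skew warning threshold (default is 0.05 s)
    [mon]
        mon clock drift allowed = 0.1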

Re: SSD journal suggestion

2012-11-08 Thread Andrey Korolyov
On Thu, Nov 8, 2012 at 7:02 PM, Atchley, Scott wrote: > On Nov 8, 2012, at 10:00 AM, Scott Atchley wrote: > >> On Nov 8, 2012, at 9:39 AM, Mark Nelson wrote: >> >>> On 11/08/2012 07:55 AM, Atchley, Scott wrote: On Nov 8, 2012, at 3:22 AM, Gandalf Corvotempesta wrote: > 2012/

Re: less cores more iops / speed

2012-11-08 Thread Andrey Korolyov
On Thu, Nov 8, 2012 at 7:53 PM, Alexandre DERUMIER wrote: >>>So it is a problem of KVM which lets the processes jump between cores a >>>lot. > > maybe numad from redhat can help ? > http://fedoraproject.org/wiki/Features/numad > > It tries to keep a process on the same NUMA node and I think it's also d
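
Besides numad, affinity can be set by hand; a sketch assuming a guest named vm0 (the name, pgrep pattern and core numbers are placeholders):

    # pin the whole qemu process for one guest to cores 4-7
    taskset -cp 4-7 $(pgrep -of 'qemu.*vm0')

    # or pin individual vCPUs through libvirt
    virsh vcpupin vm0 0 4
    virsh vcpupin vm0 1 5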

``rbd mv'' crash when no destination issued

2012-11-09 Thread Andrey Korolyov
Hi, Please take a look, seems harmless: $ rbd mv vm0 terminate called after throwing an instance of 'std::logic_error' what(): basic_string::_S_construct null not valid *** Caught signal (Aborted) ** in thread 7f85f5981780 ceph version 0.53 (commit:2528b5ee105b16352c91af064af5c0b5a7d45d7c)
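
For reference, the rename subcommand expects both a source and a destination; the crash above comes from the missing second argument. A correct invocation looks like this (image names are examples):

    rbd mv vm0 vm0-renamed
    # 'rbd rename' is the same operation under its long name
    rbd rename vm0 vm0-renamed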

Re: Limited IOP/s on Dual Xeon KVM Host

2012-11-10 Thread Andrey Korolyov
On Sat, Nov 10, 2012 at 5:49 PM, Stefan Priebe wrote: > On 10.11.2012 14:41, Mark Nelson wrote: > >> On 11/10/2012 02:03 AM, Stefan Priebe wrote: >>> >>> Hello lists, >>> >>> on a dual Xeon KVM host I get at most 6000 IOP/s random 4k writes AND reads. >>> On a single Xeon KVM host I get 17.000-18.00

changed rbd cp behavior in 0.53

2012-11-12 Thread Andrey Korolyov
Hi, In this version, rbd cp assumes that the destination pool is the same as the source, not 'rbd', if the pool in the destination path is omitted. rbd cp install/img testimg rbd ls install img testimg Is this change permanent? Thanks!
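
Spelling the pool out on both sides avoids depending on whichever default applies; a short example using the names from the message:

    # copy within the source pool (what 0.53 now does when the pool is omitted)
    rbd cp install/img install/testimg

    # copy into a different pool explicitly
    rbd cp install/img rbd/testimg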

Authorization issues in the 0.54

2012-11-14 Thread Andrey Korolyov
Hi, In 0.54, cephx is probably broken somehow: $ ceph auth add client.qemukvm osd 'allow *' mon 'allow *' mds 'allow *' -i qemukvm.key 2012-11-14 15:51:23.153910 7ff06441f780 -1 read 65 bytes from qemukvm.key added key for client.qemukvm $ ceph auth list ... client.admin key: [xx]
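
For comparison, a minimal sketch of how such a key is normally generated and registered (file and client names follow the quoted commands, the caps are the ones listed above):

    # create a keyring holding a fresh key for the client
    ceph-authtool --create-keyring qemukvm.key --gen-key -n client.qemukvm

    # import it with the desired capabilities
    ceph auth add client.qemukvm osd 'allow *' mon 'allow *' mds 'allow *' -i qemukvm.key

    # verify what the monitors actually stored
    ceph auth list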

Re: changed rbd cp behavior in 0.53

2012-11-14 Thread Andrey Korolyov
On Thu, Nov 15, 2012 at 4:56 AM, Dan Mick wrote: > > > On 11/12/2012 02:47 PM, Josh Durgin wrote: >> >> On 11/12/2012 08:30 AM, Andrey Korolyov wrote: >>> >>> Hi, >>> >>> In this version, rbd cp assumes that the destination pool is the same

Re: Authorization issues in the 0.54

2012-11-15 Thread Andrey Korolyov
On Thu, Nov 15, 2012 at 5:03 PM, Andrey Korolyov wrote: > On Thu, Nov 15, 2012 at 5:12 AM, Yehuda Sadeh wrote: >> On Wed, Nov 14, 2012 at 4:20 AM, Andrey Korolyov wrote: >>> Hi, >>> In the 0.54 cephx is probably broken somehow: >>> >>> $ ceph aut

Re: changed rbd cp behavior in 0.53

2012-11-15 Thread Andrey Korolyov
Deborah Barba Speaking of standards, the rbd layout is closer to the /dev layout, or at least to iSCSI targets, where not specifying a full path or using some predefined default prefix makes no sense at all. > > On Wed, Nov 14, 2012 at 10:43 PM, Andrey Korolyov wrote: >> >> On Thu, Nov

'zombie snapshot' problem

2012-11-21 Thread Andrey Korolyov
Hi, Somehow I have managed to produce an unkillable snapshot which cannot be removed and which prevents removing its parent image: $ rbd snap purge dev-rack0/vm2 Removing all snapshots: 100% complete...done. $ rbd rm dev-rack0/vm2 2012-11-21 16:31:24.184626 7f7e0d172780 -1 librbd: image has snapshots - not removi
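
If the leftover snapshot turns out to be a protected format-2 snapshot, 'snap purge' skips it; a sketch of the usual escalation (the snapshot name is a placeholder):

    # list what is still attached to the image and whether it is protected
    rbd snap ls dev-rack0/vm2

    # unprotect, remove, then drop the image
    rbd snap unprotect dev-rack0/vm2@snapname
    rbd snap rm dev-rack0/vm2@snapname
    rbd rm dev-rack0/vm2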

Mysteriously poor write performance

2012-03-17 Thread Andrey Korolyov
Hi, I've done some performance tests on the following configuration: mon0, osd0 and mon1, osd1 - two twelve-core R410s with 32G RAM; mon2 - a dom0 with three dedicated cores and 1.5G, mostly idle. The first three disks in each R410 are arranged into a raid0 and hold the osd data, while the fourth holds the OS and the osd jour

Re: Mysteriously poor write performance

2012-03-19 Thread Andrey Korolyov
e43546dee9246773ffd6877b4f9495f1ec61cd55 and 1468d95101adfad44247016a1399aab6b86708d2 - both caused crashes under heavy load. On Sun, Mar 18, 2012 at 10:22 PM, Sage Weil wrote: > On Sat, 17 Mar 2012, Andrey Korolyov wrote: >> Hi, >> >> I've done some performance tests on the following conf

Re: Mysteriously poor write performance

2012-03-19 Thread Andrey Korolyov
nning dd with? If you run a rados bench from both > machines, what do the results look like? > Also, can you do the ceph osd bench on each of your OSDs, please? > (http://ceph.newdream.net/wiki/Troubleshooting#OSD_performance) > -Greg > > > On Monday, March 19, 2012 at 6:46 A
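
The benchmarks referred to above are roughly the following (pool name, duration and thread count are examples):

    # client-side object throughput: 60 s of 4 MB writes, 16 concurrent ops
    rados bench -p rbd 60 write -t 16

    # backend write test per OSD, as in the troubleshooting wiki page
    ceph osd tell 0 bench
    ceph osd tell 1 bench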

Re: Mysteriously poor write performance

2012-03-19 Thread Andrey Korolyov
cheaper hardware. At first, I blamed the recent crash and recreated the cluster from scratch about an hour ago, but those objects were created in a bare data/ pool with only one VM. On Mon, Mar 19, 2012 at 10:40 PM, Josh Durgin wrote: > On 03/19/2012 11:13 AM, Andrey Korolyov wrote: >> >

Re: Mysteriously poor write performance

2012-03-20 Thread Andrey Korolyov
599.0:2533 rb.0.2.0040 [write 1220608~4096] 0.17eb9fd8) v4) Sorry for my previous question about rbd chunks, it was really stupid :) On Mon, Mar 19, 2012 at 10:40 PM, Josh Durgin wrote: > On 03/19/2012 11:13 AM, Andrey Korolyov wrote: >> >> Nope, I`m using KVM for rbd guests

Re: Mysteriously poor write performance

2012-03-22 Thread Andrey Korolyov
0.01%, 50=0.01% On Thu, Mar 22, 2012 at 9:26 PM, Samuel Just wrote: > Our journal writes are actually sequential. Could you send FIO > results for sequential 4k writes to osd.0's journal and osd.1's journal? > -Sam > > On Thu, Mar 22, 2012 at 5:21 AM, Andrey Korolyov wrote: >
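
The requested journal test would look roughly like this with fio; the journal path is a placeholder, and writing to it destroys the journal contents, so stop the osd and recreate the journal afterwards:

    # sequential 4k direct writes against osd.0's journal device or file
    fio --name=journal-seq --filename=/path/to/osd.0/journal \
        --rw=write --bs=4k --direct=1 --ioengine=libaio --iodepth=16 \
        --runtime=60 --time_based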

Re: Mysteriously poor write performance

2012-03-24 Thread Andrey Korolyov
> -Sam > > On Fri, Mar 23, 2012 at 5:25 AM, Andrey Korolyov wrote: >> Hi Sam, >> >> Can you please suggest where to start profiling the osd? If the >> bottleneck were related to such non-complex things as directio speed, >> I'm sure that I would have been able to catc

Setting iotune limits on rbd

2012-04-03 Thread Andrey Korolyov
Hi, # virsh blkdeviotune Test vdb --write_iops_sec 50 //file block device # virsh blkdeviotune Test vda --write_iops_sec 50 //rbd block device error: Unable to change block I/O throttle error: invalid argument: No device found for specified path 2012-04-03 07:38:49.170+: 30171: debug : virDo

Re: Setting iotune limits on rbd

2012-04-03 Thread Andrey Korolyov
But I am able to set static limits in the config for rbd :) All I want is to change them on the fly. It is NOT a cgroups mechanism, but completely qemu-driven. On Tue, Apr 3, 2012 at 12:21 PM, Wido den Hollander wrote: > Hi, > > On 3-4-2012 10:02, Andrey Korolyov wrote: > >>

Re: Setting iotune limits on rbd

2012-04-03 Thread Andrey Korolyov
> That's why you get this error, it's assuming the device you want to set the > limits on is a block device or a regular file. > > Wido > > >> >> On Tue, Apr 3, 2012 at 12:21 PM, Wido den Hollander >>  wrote: >>> >>> Hi, >>>

Re: Setting iotune limits on rbd

2012-04-03 Thread Andrey Korolyov
The suggested hack works; it seems the libvirt devs have not removed the block-device limitation because they count this feature as experimental, or they forgot about it. On Tue, Apr 3, 2012 at 12:55 PM, Andrey Korolyov wrote: > At least, the elements under block apply to rbd and you can > test it by running fi
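
For reference, the static workaround mentioned above is an iotune block inside the disk definition of the domain XML (edited with 'virsh edit Test'); the pool/image name here is a guess, only the structure matters:

    <disk type='network' device='disk'>
      <driver name='qemu' type='raw'/>
      <source protocol='rbd' name='rbd/Test'/>
      <!-- monitor hosts / auth elements omitted for brevity -->
      <target dev='vda' bus='virtio'/>
      <iotune>
        <write_iops_sec>50</write_iops_sec>
      </iotune>
    </disk>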

Re: defaults paths

2012-04-05 Thread Andrey Korolyov
Right, but we probably need journal separation at the directory level by default, because there are very few cases where the speed of the main storage is sufficient for the journal, or where the resulting slowdown is not significant, so the journal by default could go into /var/lib/ceph/osd/journals/$i/j
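
As a sketch, such a layout could be expressed in ceph.conf like this (paths and the journal size are only illustrative):

    [osd]
        osd data = /var/lib/ceph/osd/$name
        # journals split out into their own tree, e.g. a faster device mounted there
        osd journal = /var/lib/ceph/osd/journals/$name/journal
        osd journal size = 1024   # MB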

Re: defaults paths

2012-04-05 Thread Andrey Korolyov
p to the sysadmin to mount / symlink the correct storage devices > on the correct paths - ceph should not be concerned that some volumes might > need to sit together. > > Rgds, > Bernard > > On 05 Apr 2012, at 09:12, Andrey Korolyov wrote: > >> Right, but probably we ne

Re: rbd snapshot in qemu and libvirt

2012-04-18 Thread Andrey Korolyov
I tested all of them about a week ago and everything works fine. It would also be very nice if rbd could list the actual allocated size of every image or snapshot in the future. On Wed, Apr 18, 2012 at 5:22 PM, Martin Mailand wrote: > Hi Wido, > > I am looking into doing the snapshots via libvirt, create, dele

Re: rbd snapshot in qemu and libvirt

2012-04-18 Thread Andrey Korolyov
lid: Disk 'rbd/vm1:rbd_cache_enabled=1' > does not support snapshotting > > maybe the rbd_cache option is the problem? > > > -martin > > > On 18.04.2012 16:39, Andrey Korolyov wrote: > >> I tested all of them about a week ago and everything works fine.

collectd and ceph plugin

2012-04-21 Thread Andrey Korolyov
Hello everyone, I have just tried the ceph collectd fork on wheezy and noticed that the ceph plugin records nothing but zeroes (see below) for all types of nodes. The Python ceph tool works just fine. Collectd runs as root, there are no obvious errors like socket permission problems, and no tips from its l
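
If the fork follows the upstream-style plugin configuration, the relevant pieces are a Daemon block per ceph daemon pointing at its admin socket; the daemon name and socket path below are assumptions for illustration:

    # collectd.conf (assumed upstream-style syntax for the ceph plugin)
    LoadPlugin ceph
    <Plugin ceph>
      <Daemon "osd.0">
        SocketPath "/var/run/ceph/ceph-osd.0.asok"
      </Daemon>
    </Plugin>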

'rbd map' asynchronous behavior

2012-05-15 Thread Andrey Korolyov
Hi, There is a strange bug when I try to map a large number of block devices inside the pool, like the following: for vol in $(rbd ls); do rbd map $vol; [some-microsleep]; [some operation or nothing, I have stubbed a guestfs mount here]; [some-microsleep]; unmap /dev/rbd/rbd/$vol; [some-microslee
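
The race is between 'rbd map' returning and udev creating the /dev/rbd/<pool>/<image> symlink; a sketch of the loop with an explicit barrier instead of the micro-sleeps (do_something stands in for the stubbed guestfs mount):

    for vol in $(rbd ls); do
        rbd map "$vol"
        udevadm settle                      # wait until pending udev events are processed
        do_something /dev/rbd/rbd/"$vol"    # placeholder for the real work
        rbd unmap /dev/rbd/rbd/"$vol"
        udevadm settle
    done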

Re: 'rbd map' asynchronous behavior

2012-05-16 Thread Andrey Korolyov
ut on symlinks creation. On Tue, May 15, 2012 at 7:40 PM, Josh Durgin wrote: > On 05/15/2012 04:49 AM, Andrey Korolyov wrote: >> >> Hi, >> >> There are strange bug when I tried to map excessive amounts of block >> devices inside the pool, like following >&

Re: how to debug slow rbd block device

2012-05-22 Thread Andrey Korolyov
Hi, I ran into almost the same problem about two months ago, and there are a couple of corner cases: near-default TCP parameters, a small journal size, disks that are not backed by a controller with NVRAM cache, and high load on the osd's CPU caused by side processes. Finally, I was able to achieve 115Mb/s for

Re: how to debug slow rbd block device

2012-05-23 Thread Andrey Korolyov
Hi, For Stefan: Increasing socket memory gained me a few percent on fio tests inside the VM (I measured the 'max-iops-until-ceph-throws-message-about-delayed-write' parameter). More importantly, the osd process should, if possible, be pinned to a dedicated core or two, and all other processes sho
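
A sketch of the two tunings described here; the buffer sizes and core numbers are only examples:

    # raise socket buffer limits well above the defaults
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_max=16777216
    sysctl -w net.ipv4.tcp_rmem='4096 87380 16777216'
    sysctl -w net.ipv4.tcp_wmem='4096 65536 16777216'

    # keep each ceph-osd on its own cores, away from the qemu processes
    for p in $(pidof ceph-osd); do taskset -cp 0-1 "$p"; done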

Re: 'rbd map' asynchronous behavior

2012-05-25 Thread Andrey Korolyov
99.701343] [] ? rbd_remove+0x102/0x11e [rbd] [ 99.701352] [] ? sysfs_write_file+0xd3/0x10f [ 99.701361] [] ? vfs_write+0xaa/0x136 [ 99.701369] [] ? sys_write+0x45/0x6e [ 99.701377] [] ? system_call_fastpath+0x16/0x1b On Wed, May 16, 2012 at 12:24 PM, Andrey Korolyov wrote: >>Thi
