[ceph-users] error mapping device in firefly
Hi, I am trying to map an rbd device in Ubuntu 14.04 (kernel 3.13.0-30-generic):

# rbd -p mypool create test1 --size 500
# rbd -p mypool ls
test1
# rbd -p mypool map test1
rbd: add failed: (5) Input/output error

and in the syslog:

Jul 4 09:31:48 testceph kernel: [70503.356842] libceph: mon2 172.16.64.18:6789 feature set mismatch, my 4a042a42 < server's 2004a042a42, missing 200
Jul 4 09:31:48 testceph kernel: [70503.356938] libceph: mon2 172.16.64.18:6789 socket error on read

My environment: the cluster version on all MONs and OSDs is 0.80.1. On the client machine:

ii  ceph-common  0.80.1-1trusty  amd64  common utilities to mount and interact with a ceph storage cluster
ii  python-ceph  0.80.1-1trusty  amd64  Python libraries for the Ceph distributed filesystem
ii  librados2    0.80.1-1trusty  amd64  RADOS distributed object store client library

I think I started getting this error when I switched the tunables from legacy to optimal after upgrading from 0.72 to 0.80.

Thanks in advance!
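A quick way to confirm this kind of mismatch (a generic aside, not from the original message) is to compare the CRUSH tunables profile the cluster advertises with the kernel running on the client; the feature bits in the libceph log line come from the tunables, so these two checks usually show whether the kernel client is simply too old for the selected profile:

# on a node with an admin keyring: show the active CRUSH tunables
ceph osd crush show-tunables

# on the client that fails to map: check the kernel version
uname -r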
[ceph-users] emperor -> firefly : Significant increase in RAM usage
Hi, Yesterday I finally updated our cluster to firefly (latest stable commit) and what's fairly apparent is a much higher RAM usage on the OSDs: http://i.imgur.com/qw9iKSV.png Has anyone noticed the same? I mean, a sudden 25% increase in idle RAM usage is hard to ignore... Those OSDs are pretty much entirely dedicated to RGW data pools, FWIW. Cheers, Sylvain
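One way to see where the extra memory sits (a suggestion, not from the original message; it assumes the OSDs are built against tcmalloc, as the upstream packages are) is to dump the heap statistics of a single OSD and, if much of it is freed-but-unreturned memory, hand it back to the OS:

ceph tell osd.0 heap stats     # print tcmalloc heap usage for osd.0
ceph tell osd.0 heap release   # return freed memory to the operating system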
Re: [ceph-users] write performance per disk
On 07/03/2014 04:32 PM, VELARTIS Philipp Dürhammer wrote:
> Hi,
>
> ceph.conf:
>
> osd journal size = 15360
> rbd cache = true
> rbd cache size = 2147483648
> rbd cache max dirty = 1073741824
> rbd cache max dirty age = 100
> osd recovery max active = 1
> osd max backfills = 1
> osd mkfs options xfs = -f -i size=2048
> osd mount options xfs = rw,noatime,nobarrier,logbsize=256k,logbufs=8,inode64,allocsize=4M
> osd op threads = 8
>
> so it should be 8 threads?

How many threads are you using with rados bench? Don't touch the op threads from the start, usually the default is just fine.

> All 3 machines have more or less the same disk load at the same time. Also the disks:
>
> sdb   35.56    87.10   6849.09   617310   48540806
> sdc   26.75    72.62   5148.58   514701   36488992
> sdd   35.15    53.48   6802.57   378993   48211141
> sde   31.04    79.04   6208.48   560141   44000710
> sdf   32.79    38.35   6238.28   271805   44211891
> sdg   31.67    77.84   5987.45   551680   42434167
> sdh   32.95    51.29   6315.76   363533   44761001
> sdi   31.67    56.93   5956.29   403478   42213336
> sdj   35.83    77.82   6929.31   551501   49109354
> sdk   36.86    73.84   7291.00   523345   51672704
> sdl   36.02   112.90   7040.47   800177   49897132
> sdm   33.25    38.02   6455.05   269446   45748178
> sdn   33.52    39.10   6645.19   277101   47095696
> sdo   33.26    46.22   6388.20   327541   45274394
> sdp   33.38    74.12   6480.62   525325   45929369
>
> The question is: is this poor performance, getting a maximum of 500 MB/s writes with 45 disks and replica 2, or should I expect this?

You should be able to get more as long as the I/O is done in parallel.

Wido

> -----Original Message-----
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Wido den Hollander
> Sent: Thursday, 03 July 2014 15:22
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] write performance per disk
>
> On 07/03/2014 03:11 PM, VELARTIS Philipp Dürhammer wrote:
> Hi,
>
> I have a ceph cluster setup (with 45 SATA disks, journals on the same disks) and get only 450 MB/s sequential writes (the maximum, playing around with threads in rados bench) with a replica of 2.
>
> How many threads?
>
> Which is about ~20 MB of writes per disk (what I see in atop also). Theoretically, with replica 2 and journals on the data disks it should be 45 x 100 MB (SATA) / 2 (replica) / 2 (journal writes), which makes 1125 MB/s; the SATAs in reality do 120 MB/s, so the theoretical output should be more. I would expect between 40-50 MB/s for each SATA disk. Can somebody confirm that he can reach this speed with a setup with journals on the SATAs (with journals on SSD the speed should be 100 MB per disk)? Or does ceph only give about ¼ of the speed of a disk (and not the ½ expected because of the journals)?
>
> Did you verify how much each machine is doing? It could be that the data is not distributed evenly and that on a certain machine the drives are doing 50MB/sec.
>
> My setup is 3 servers with: 2 x 2.6 GHz Xeons, 128 GB RAM, 15 SATAs for ceph (and SSDs for the system), 1 x 10 GbE for external traffic, 1 x 10 GbE for OSD traffic. With reads I can saturate the network, but writes are far away. And I would expect to at least saturate the 10 GbE with sequential writes as well.
>
> Should be possible, but with 3 servers the data distribution might not be optimal, causing a lower write performance. I've seen 10Gbit write performance on multiple clusters without any problems.
>
> Thank you
>
> --
> Wido den Hollander
> Ceph consultant and trainer
> 42on B.V.
> Phone: +31 (0)20 700 9902
> Skype: contact42on

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
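To make the thread count comparable between runs, it helps to drive the cluster from several clients at once. A minimal sketch (the pool name is a placeholder, not from this thread): run the same bench concurrently on two or three machines and add the numbers up.

# run from each of several client machines at the same time
rados bench -p testpool 60 write -t 32 --no-cleanup

# afterwards, remove the benchmark objects
rados -p testpool cleanup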
[ceph-users] Bad Write-Performance on Ceph/Possible bottlenecks?
Hello Ceph-Community, I'm writing here because we have a bad write-performance on our Ceph-Cluster of about As an overview the technical details of our Cluster: 3 x monitoring-Servers; each with 2 x 1 Gbit/s NIC configured as Bond (Link Aggregation-Mode) 5 x datastore-Servers; each with 10 x 4 TB HDDs serving as OSDs, as Journal we use a 15 GB LVM on an 256 GB SSD-Raid1; 2 x 10 Gbit/s NIC configured as Bond (Link Aggregation-Mode) ceph.conf [global] auth_service_required = cephx filestore_xattr_use_omap = true auth_client_required = cephx auth_cluster_required = cephx mon_host = 172.30.30.8,172.30.30.9 mon_initial_members = monitoring1, monitoring2, monitoring3 fsid = 5f22ab94-8d96-48c2-88d3-cff7bad443a9 public network = 172.30.30.0/24 [mon.monitoring1] host = monitoring1 addr = 172.30.30.8:6789 [mon.monitoring2] host = monitoring2 addr = 172.30.30.9:6789 [mon.monitoring3] host = monitoring3 addr = 172.30.30.10:6789 [filestore] filestore max sync interval = 10 [osd] osd recovery max active = 1 osd journal size = 15360 osd op threads = 40 osd disk threads = 40 [osd.0] host = datastore1 [osd.1] host = datastore1 [osd.2] host = datastore1 [osd.3] host = datastore1 [osd.4] host = datastore1 [osd.5] host = datastore1 [osd.6] host = datastore1 [osd.7] host = datastore1 [osd.8] host = datastore1 [osd.9] host = datastore1 [osd.10] host = datastore2 [osd.11] host = datastore2 [osd.11] host = datastore2 [osd.12] host = datastore2 [osd.13] host = datastore2 [osd.14] host = datastore2 [osd.15] host = datastore2 [osd.16] host = datastore2 [osd.17] host = datastore2 [osd.18] host = datastore2 [osd.19] host = datastore2 [osd.20] host = datastore3 [osd.21] host = datastore3 [osd.22] host = datastore3 [osd.23] host = datastore3 [osd.24] host = datastore3 [osd.25] host = datastore3 [osd.26] host = datastore3 [osd.27] host = datastore3 [osd.28] host = datastore3 [osd.29] host = datastore3 [osd.30] host = datastore4 [osd.31] host = datastore4 [osd.32] host = datastore4 [osd.33] host = datastore4 [osd.34] host = datastore4 [osd.35] host = datastore4 [osd.36] host = datastore4 [osd.37] host = datastore4 [osd.38] host = datastore4 [osd.39] host = datastore4 [osd.0] host = datastore5 [osd.40] host = datastore5 [osd.41] host = datastore5 [osd.42] host = datastore5 [osd.43] host = datastore5 [osd.44] host = datastore5 [osd.45] host = datastore5 [osd.46] host = datastore5 [osd.47] host = datastore5 [osd.48] host = datastore5 We have 3 pools: - 2 x 1000 pgs with 2 Replicas distributing the data equally to two racks (Used for datastore 1-4) - 1 x 100 pgs without replication; data only stored on datastore 5. 
This Pool is used to compare the performance on local disks without networking Here are the performance values, which I get using fio-Bench on a 32GB rbd: On 1000 pgs-Pool with distribution fio --bs=1M --rw=randwrite --ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --name=/dev/rbd/pool1/bench1 fio-2.0.13 Starting 1 process Jobs: 1 (f=1): [w] [100.0% done] [0K/312.0M/0K /s] [0 /312 /0 iops] [eta 00m:00s] /dev/rbd/pool1/bench1: (groupid=0, jobs=1): err= 0: pid=21675: Fri Jul 4 11:03:52 2014 write: io=21071MB, bw=358989KB/s, iops=350 , runt= 60104msec slat (usec): min=127 , max=8040 , avg=511.49, stdev=216.27 clat (msec): min=5 , max=4018 , avg=90.74, stdev=215.83 lat (msec): min=6 , max=4018 , avg=91.25, stdev=215.83 clat percentiles (msec): | 1.00th=[8], 5.00th=[9], 10.00th=[ 11], 20.00th=[ 15], | 30.00th=[ 21], 40.00th=[ 30], 50.00th=[ 45], 60.00th=[ 63], | 70.00th=[ 83], 80.00th=[ 105], 90.00th=[ 129], 95.00th=[ 190], | 99.00th=[ 1254], 99.50th=[ 1680], 99.90th=[ 2409], 99.95th=[ 2638], | 99.99th=[ 3556] bw (KB/s) : min=68210, max=479232, per=100.00%, avg=368399.55, stdev=84457.12 lat (msec) : 10=9.50%, 20=20.02%, 50=23.56%, 100=24.56%, 250=18.09% lat (msec) : 500=1.39%, 750=0.81%, 1000=0.65%, 2000=1.13%, =2000=0.29% cpu : usr=11.17%, sys=7.46%, ctx=17772, majf=0, minf=24 IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=99.9%, =64=0.0% submit: 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, =64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, =64=0.0% issued: total=r=0/w=21071/d=0, short=r=0/w=0/d=0 Run status group 0 (all jobs): WRITE: io=21071MB, aggrb=358989KB/s, minb=358989KB/s, maxb=358989KB/s,
Re: [ceph-users] Bad Write-Performance on Ceph/Possible bottlenecks?
Hi, I wouldn't put those SSD's in raid, just use them separately as journals for half of your's HDD's. This should make your write performance somewhat better. W dniu 04.07.2014 o 11:13 Marco Allevato m.allev...@nwe.de pisze: Hello Ceph-Community, I’m writing here because we have a bad write-performance on our Ceph-Cluster of about As an overview the technical details of our Cluster: 3 x monitoring-Servers; each with 2 x 1 Gbit/s NIC configured as Bond (Link Aggregation-Mode) 5 x datastore-Servers; each with 10 x 4 TB HDDs serving as OSDs, as Journal we use a 15 GB LVM on an 256 GB SSD-Raid1; 2 x 10 Gbit/s NIC configured as Bond (Link Aggregation-Mode) ceph.conf [global] auth_service_required = cephx filestore_xattr_use_omap = true auth_client_required = cephx auth_cluster_required = cephx mon_host = 172.30.30.8,172.30.30.9 mon_initial_members = monitoring1, monitoring2, monitoring3 fsid = 5f22ab94-8d96-48c2-88d3-cff7bad443a9 public network = 172.30.30.0/24 [mon.monitoring1] host = monitoring1 addr = 172.30.30.8:6789 [mon.monitoring2] host = monitoring2 addr = 172.30.30.9:6789 [mon.monitoring3] host = monitoring3 addr = 172.30.30.10:6789 [filestore] filestore max sync interval = 10 [osd] osd recovery max active = 1 osd journal size = 15360 osd op threads = 40 osd disk threads = 40 [osd.0] host = datastore1 [osd.1] host = datastore1 [osd.2] host = datastore1 [osd.3] host = datastore1 [osd.4] host = datastore1 [osd.5] host = datastore1 [osd.6] host = datastore1 [osd.7] host = datastore1 [osd.8] host = datastore1 [osd.9] host = datastore1 [osd.10] host = datastore2 [osd.11] host = datastore2 [osd.11] host = datastore2 [osd.12] host = datastore2 [osd.13] host = datastore2 [osd.14] host = datastore2 [osd.15] host = datastore2 [osd.16] host = datastore2 [osd.17] host = datastore2 [osd.18] host = datastore2 [osd.19] host = datastore2 [osd.20] host = datastore3 [osd.21] host = datastore3 [osd.22] host = datastore3 [osd.23] host = datastore3 [osd.24] host = datastore3 [osd.25] host = datastore3 [osd.26] host = datastore3 [osd.27] host = datastore3 [osd.28] host = datastore3 [osd.29] host = datastore3 [osd.30] host = datastore4 [osd.31] host = datastore4 [osd.32] host = datastore4 [osd.33] host = datastore4 [osd.34] host = datastore4 [osd.35] host = datastore4 [osd.36] host = datastore4 [osd.37] host = datastore4 [osd.38] host = datastore4 [osd.39] host = datastore4 [osd.0] host = datastore5 [osd.40] host = datastore5 [osd.41] host = datastore5 [osd.42] host = datastore5 [osd.43] host = datastore5 [osd.44] host = datastore5 [osd.45] host = datastore5 [osd.46] host = datastore5 [osd.47] host = datastore5 [osd.48] host = datastore5 We have 3 pools: - 2 x 1000 pgs with 2 Replicas distributing the data equally to two racks (Used for datastore 1-4) - 1 x 100 pgs without replication; data only stored on datastore 5. 
This Pool is used to compare the performance on local disks without networking Here are the performance values, which I get using fio-Bench on a 32GB rbd: On 1000 pgs-Pool with distribution fio --bs=1M --rw=randwrite --ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --name=/dev/rbd/pool1/bench1 fio-2.0.13 Starting 1 process Jobs: 1 (f=1): [w] [100.0% done] [0K/312.0M/0K /s] [0 /312 /0 iops] [eta 00m:00s] /dev/rbd/pool1/bench1: (groupid=0, jobs=1): err= 0: pid=21675: Fri Jul 4 11:03:52 2014 write: io=21071MB, bw=358989KB/s, iops=350 , runt= 60104msec slat (usec): min=127 , max=8040 , avg=511.49, stdev=216.27 clat (msec): min=5 , max=4018 , avg=90.74, stdev=215.83 lat (msec): min=6 , max=4018 , avg=91.25, stdev=215.83 clat percentiles (msec): | 1.00th=[8], 5.00th=[9], 10.00th=[ 11], 20.00th=[ 15], | 30.00th=[ 21], 40.00th=[ 30], 50.00th=[ 45], 60.00th=[ 63], | 70.00th=[ 83], 80.00th=[ 105], 90.00th=[ 129], 95.00th=[ 190], | 99.00th=[ 1254], 99.50th=[ 1680], 99.90th=[ 2409], 99.95th=[ 2638], | 99.99th=[ 3556] bw (KB/s) : min=68210, max=479232, per=100.00%, avg=368399.55, stdev=84457.12 lat (msec) : 10=9.50%, 20=20.02%, 50=23.56%, 100=24.56%, 250=18.09% lat (msec) : 500=1.39%, 750=0.81%, 1000=0.65%, 2000=1.13%, =2000=0.29% cpu : usr=11.17%, sys=7.46%, ctx=17772, majf=0, minf=24 IO depths: 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%,
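A minimal sketch of what "use them separately as journals" can look like with ceph-deploy (host and device names are placeholders, not taken from this thread): give each HDD its own journal partition on one of the SSDs instead of the RAID1 LVM.

# data on the HDD, journal on a dedicated SSD partition, one pair per OSD
ceph-deploy osd prepare datastore1:/dev/sdb:/dev/sdk1
ceph-deploy osd prepare datastore1:/dev/sdc:/dev/sdk2
# ...and so on, alternating journal partitions across the two SSDs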
Re: [ceph-users] write performance per disk
I use between 1 and 128 in different steps... But 500mb write is the max playing around. Uff its so hard to tune ceph... so many people have problems... ;-) -Ursprüngliche Nachricht- Von: Wido den Hollander [mailto:w...@42on.com] Gesendet: Freitag, 04. Juli 2014 10:55 An: VELARTIS Philipp Dürhammer; ceph-users@lists.ceph.com Betreff: Re: AW: [ceph-users] write performance per disk On 07/03/2014 04:32 PM, VELARTIS Philipp Dürhammer wrote: HI, Ceph.conf: osd journal size = 15360 rbd cache = true rbd cache size = 2147483648 rbd cache max dirty = 1073741824 rbd cache max dirty age = 100 osd recovery max active = 1 osd max backfills = 1 osd mkfs options xfs = -f -i size=2048 osd mount options xfs = rw,noatime,nobarrier,logbsize=256k,logbufs=8,inode64,allocsize=4M osd op threads = 8 so it should be 8 threads? How many threads are you using with rados bench? Don't touch the op threads from the start, usually the default is just fine. All 3 machines have more or less the same disk load at the same time. also the disks: sdb 35.5687.10 6849.09 617310 48540806 sdc 26.7572.62 5148.58 514701 36488992 sdd 35.1553.48 6802.57 378993 48211141 sde 31.0479.04 6208.48 560141 44000710 sdf 32.7938.35 6238.28 271805 44211891 sdg 31.6777.84 5987.45 551680 42434167 sdh 32.9551.29 6315.76 363533 44761001 sdi 31.6756.93 5956.29 403478 42213336 sdj 35.8377.82 6929.31 551501 49109354 sdk 36.8673.84 7291.00 523345 51672704 sdl 36.02 112.90 7040.47 800177 49897132 sdm 33.2538.02 6455.05 269446 45748178 sdn 33.5239.10 6645.19 277101 47095696 sdo 33.2646.22 6388.20 327541 45274394 sdp 33.3874.12 6480.62 525325 45929369 the question is: is this a poor performance to get max 500mb/write with 45 disks and replica 2 or should I expect this? You should be able to get more as long as the I/O is done in parallel. Wido -Ursprüngliche Nachricht- Von: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] Im Auftrag von Wido den Hollander Gesendet: Donnerstag, 03. Juli 2014 15:22 An: ceph-users@lists.ceph.com Betreff: Re: [ceph-users] write performance per disk On 07/03/2014 03:11 PM, VELARTIS Philipp Dürhammer wrote: Hi, I have a ceph cluster setup (with 45 sata disk journal on disks) and get only 450mb/sec writes seq (maximum playing around with threads in rados bench) with replica of 2 How many threads? Which is about ~20Mb writes per disk (what y see in atop also) theoretically with replica2 and having journals on disk should be 45 X 100mb (sata) / 2 (replica) / 2 (journal writes) which makes it 1125 satas in reality have 120mb/sec so the theoretical output should be more. I would expect to have between 40-50mb/sec for each sata disk Can somebody confirm that he can reach this speed with a setup with journals on the satas (with journals on ssd speed should be 100mb per disk)? or does ceph only give about ¼ of the speed for a disk? (and not the ½ as expected because of journals) Did you verify how much each machine is doing? It could be that the data is not distributed evenly and that on a certain machine the drives are doing 50MB/sec. My setup is 3 servers with: 2 x 2.6ghz xeons, 128gb ram 15 satas for ceph (and ssds for system) 1 x 10gig for external traffic, 1 x 10gig for osd traffic with reads I can saturate the network but writes is far away. And I would expect at least to saturate the 10gig with sequential writes also Should be possible, but with 3 servers the data distribution might not be optimal causing a lower write performance. I've seen 10Gbit write performance on multiple clusters without any problems. 
Thank you

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
Re: [ceph-users] Bad Write-Performance on Ceph/Possible bottlenecks?
On 07/04/2014 11:33 AM, Daniel Schwager wrote: Hi, I think, the problem is the rbd device. It's only ONE device. I fully agree. Ceph excels in parallel performance. You should run multiple fio instances in parallel on different RBD devices and even better on different clients. Then you will see a big difference. Wido fio --bs=1M --rw=randwrite --ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --name=/dev/rbd/pool1/bench1 Try to create e.g. 20 (small) rbd devices, putting them all in a lvm vg, creating a logical volume (Raid0) with 20 stripes and e.g. stripeSize 1MB (better bandwith) or 4kb (better io) - or use md-raid0 (it's maybe 10% faster - but not that flexible): # create disks for i in `seq -f %02.f 0 19` ; do rbd create --size 4 vmware/vol6-$i.dsk ; done emacs -nw /etc/lvm/lvm.conf types = [ rbd, 16 ] # rbd map # pvcreate for i in `seq -f %02.f 0 19` ; do pvcreate /dev/rbd/vmware/vol6-$i.dsk ; done # vcreate VG vgcreate VG_RBD20x40_VOL6 /dev/rbd/vmware/vol6-00.dsk for i in `seq -f %02.f 1 19` ; do vgextend VG_RBD20x40_VOL6 /dev/rbd/vmware/vol6-$i.dsk ; done # lvcreate raid0 # -i, --stripes Stripes - This is equal to the number of physical volumes to scatter the logical volume. # -I, --stripesize StripeSize - Gives the number of kilobytes for the granularity of the stripes, 2^n, (n = 2 to 9) # 20 stripes und 4k StripeSize lvcreate -i20 -I1024 -L70m-n VmProd06VG_RBD20x40_VOL6 Now, try to run fio against /dev/mapper/ VG_RBD20x40_VOL6-VmProd06 I think, the performance will be about 10GBi. regards Danny *From:*ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of *Marco Allevato *Sent:* Friday, July 04, 2014 11:13 AM *To:* ceph-users@lists.ceph.com *Subject:* [ceph-users] Bad Write-Performance on Ceph/Possible bottlenecks? Hello Ceph-Community, I’m writing here because we have a bad write-performance on our Ceph-Cluster of about _As an overview the technical details of our Cluster:_ 3 x monitoring-Servers; each with 2 x 1 Gbit/s NIC configured as Bond (Link Aggregation-Mode) 5 x datastore-Servers; each with 10 x 4 TB HDDs serving as OSDs, as Journal we use a 15 GB LVM on an 256 GB SSD-Raid1; 2 x 10 Gbit/s NIC configured as Bond (Link Aggregation-Mode) __ _ceph.conf_ [global] auth_service_required = cephx filestore_xattr_use_omap = true auth_client_required = cephx auth_cluster_required = cephx mon_host = 172.30.30.8,172.30.30.9 mon_initial_members = monitoring1, monitoring2, monitoring3 fsid = 5f22ab94-8d96-48c2-88d3-cff7bad443a9 public network = 172.30.30.0/24 [mon.monitoring1] host = monitoring1 addr = 172.30.30.8:6789 [mon.monitoring2] host = monitoring2 addr = 172.30.30.9:6789 [mon.monitoring3] host = monitoring3 addr = 172.30.30.10:6789 [filestore] filestore max sync interval = 10 [osd] osd recovery max active = 1 osd journal size = 15360 osd op threads = 40 osd disk threads = 40 [osd.0] host = datastore1 [osd.1] host = datastore1 [osd.2] host = datastore1 [osd.3] host = datastore1 [osd.4] host = datastore1 [osd.5] host = datastore1 [osd.6] host = datastore1 [osd.7] host = datastore1 [osd.8] host = datastore1 [osd.9] host = datastore1 [osd.10] host = datastore2 [osd.11] host = datastore2 [osd.11] host = datastore2 [osd.12] host = datastore2 [osd.13] host = datastore2 [osd.14] host = datastore2 [osd.15] host = datastore2 [osd.16] host = datastore2 [osd.17] host = datastore2 [osd.18] host = datastore2 [osd.19] host = datastore2 [osd.20] host = datastore3 [osd.21] host = datastore3 [osd.22] host = datastore3 [osd.23] host = datastore3 [osd.24] host = datastore3 
[osd.25] host = datastore3 [osd.26] host = datastore3 [osd.27] host = datastore3 [osd.28] host = datastore3 [osd.29] host = datastore3 [osd.30] host = datastore4 [osd.31] host = datastore4 [osd.32] host = datastore4 [osd.33] host = datastore4 [osd.34] host = datastore4 [osd.35] host = datastore4 [osd.36] host = datastore4 [osd.37] host = datastore4 [osd.38] host = datastore4 [osd.39] host = datastore4 [osd.0] host = datastore5 [osd.40] host = datastore5 [osd.41] host = datastore5 [osd.42] host = datastore5 [osd.43] host = datastore5 [osd.44] host = datastore5 [osd.45] host = datastore5 [osd.46] host = datastore5 [osd.47] host = datastore5 [osd.48] host = datastore5 We have 3 pools: - 2 x 1000 pgs
Re: [ceph-users] write performance per disk
On 07/04/2014 11:40 AM, VELARTIS Philipp Dürhammer wrote: I use between 1 and 128 in different steps... But 500mb write is the max playing around. I just mentioned it in a different thread, make sure you do parallel I/O! That's where Ceph really makes the difference. Run rados bench from multiple clients. Uff its so hard to tune ceph... so many people have problems... ;-) No, Ceph is simply different from any other storage. Distributed storage is a lot different in terms of performance from existing storage projects/products. Wido -Ursprüngliche Nachricht- Von: Wido den Hollander [mailto:w...@42on.com] Gesendet: Freitag, 04. Juli 2014 10:55 An: VELARTIS Philipp Dürhammer; ceph-users@lists.ceph.com Betreff: Re: AW: [ceph-users] write performance per disk On 07/03/2014 04:32 PM, VELARTIS Philipp Dürhammer wrote: HI, Ceph.conf: osd journal size = 15360 rbd cache = true rbd cache size = 2147483648 rbd cache max dirty = 1073741824 rbd cache max dirty age = 100 osd recovery max active = 1 osd max backfills = 1 osd mkfs options xfs = -f -i size=2048 osd mount options xfs = rw,noatime,nobarrier,logbsize=256k,logbufs=8,inode64,allocsize=4M osd op threads = 8 so it should be 8 threads? How many threads are you using with rados bench? Don't touch the op threads from the start, usually the default is just fine. All 3 machines have more or less the same disk load at the same time. also the disks: sdb 35.5687.10 6849.09 617310 48540806 sdc 26.7572.62 5148.58 514701 36488992 sdd 35.1553.48 6802.57 378993 48211141 sde 31.0479.04 6208.48 560141 44000710 sdf 32.7938.35 6238.28 271805 44211891 sdg 31.6777.84 5987.45 551680 42434167 sdh 32.9551.29 6315.76 363533 44761001 sdi 31.6756.93 5956.29 403478 42213336 sdj 35.8377.82 6929.31 551501 49109354 sdk 36.8673.84 7291.00 523345 51672704 sdl 36.02 112.90 7040.47 800177 49897132 sdm 33.2538.02 6455.05 269446 45748178 sdn 33.5239.10 6645.19 277101 47095696 sdo 33.2646.22 6388.20 327541 45274394 sdp 33.3874.12 6480.62 525325 45929369 the question is: is this a poor performance to get max 500mb/write with 45 disks and replica 2 or should I expect this? You should be able to get more as long as the I/O is done in parallel. Wido -Ursprüngliche Nachricht- Von: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] Im Auftrag von Wido den Hollander Gesendet: Donnerstag, 03. Juli 2014 15:22 An: ceph-users@lists.ceph.com Betreff: Re: [ceph-users] write performance per disk On 07/03/2014 03:11 PM, VELARTIS Philipp Dürhammer wrote: Hi, I have a ceph cluster setup (with 45 sata disk journal on disks) and get only 450mb/sec writes seq (maximum playing around with threads in rados bench) with replica of 2 How many threads? Which is about ~20Mb writes per disk (what y see in atop also) theoretically with replica2 and having journals on disk should be 45 X 100mb (sata) / 2 (replica) / 2 (journal writes) which makes it 1125 satas in reality have 120mb/sec so the theoretical output should be more. I would expect to have between 40-50mb/sec for each sata disk Can somebody confirm that he can reach this speed with a setup with journals on the satas (with journals on ssd speed should be 100mb per disk)? or does ceph only give about ¼ of the speed for a disk? (and not the ½ as expected because of journals) Did you verify how much each machine is doing? It could be that the data is not distributed evenly and that on a certain machine the drives are doing 50MB/sec. 
My setup is 3 servers with: 2 x 2.6ghz xeons, 128gb ram 15 satas for ceph (and ssds for system) 1 x 10gig for external traffic, 1 x 10gig for osd traffic with reads I can saturate the network but writes is far away. And I would expect at least to saturate the 10gig with sequential writes also Should be possible, but with 3 servers the data distribution might not be optimal causing a lower write performance. I've seen 10Gbit write performance on multiple clusters without any problems. Thank you ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Wido den Hollander Ceph consultant and trainer 42on B.V. Phone: +31 (0)20 700 9902 Skype: contact42on ___ ceph-users mailing list ceph-users@lists.ceph.com
Re: [ceph-users] nginx (tengine) and radosgw
Hi David, Do you mind sharing the howto/documentation with examples of configs, etc.? I am tempted to give it a go and replace the Apache reverse proxy that I am currently using. cheers Andrei - Original Message - From: David Moreau Simard dmsim...@iweb.com To: ceph-users@lists.ceph.com Sent: Sunday, 22 June, 2014 2:37:00 AM Subject: Re: [ceph-users] nginx (tengine) and radosgw Hi, I just wanted to chime in and say that I didn’t notice any problems swapping nginx out in favor of tengine. tengine is used as a load balancer that also handles SSL termination. I found that disabling body buffering saves a lot on upload times as well. I took the time to do a post about it and linked this thread: http://dmsimard.com/2014/06/21/a-use-case-of-tengine-a-drop-in-replacement-and-fork-of-nginx/ - David On May 29, 2014, at 12:20 PM, Michael Lukzak mis...@vp.pl wrote: Re[2]: [ceph-users] nginx (tengine) and radosgw Hi, Ups, so I don't read carefully a doc... I will try this solution. Thanks! Michael From the docs, you need this setting in ceph.conf (if you're using nginx/tengine): rgw print continue = false This will fix the 100-continue issues. On 5/29/2014 5:56 AM, Michael Lukzak wrote: Re[2]: [ceph-users] nginx (tengine) and radosgw Hi, I'm also use tengine, works fine with SSL (I have a Wildcard). But I have other issue with HTTP 100-Continue. Clients like boto or Cyberduck hangs if they can't make HTTP 100-Continue. IP_REMOVED - - [29/May/2014:11:27:53 +] PUT /temp/1b6f6a11d7aa188f06f8255fdf0345b4 HTTP/1.1 100 0 - Boto/2.27.0 Python/2.7.6 Linux/3.13.0-24-generic Do You have also problem with that? I used for testing oryginal nginx and also have a problem with 100-Continue. Only Apache 2.x works fine. BR, Michael I haven't tried SSL yet. We currently don't have a wildcard certificate for this, so it hasn't been a concern (and our current use case, all the files are public anyway). On 5/20/2014 4:26 PM, Andrei Mikhailovsky wrote: That looks very interesting indeed. I've tried to use nginx, but from what I recall it had some ssl related issues. Have you tried to make the ssl work so that nginx acts as an ssl proxy in front of the radosgw? Cheers Andrei From: Brian Rak b...@gameservers.com To: ceph-users@lists.ceph.com Sent: Tuesday, 20 May, 2014 9:11:58 PM Subject: [ceph-users] nginx (tengine) and radosgw I've just finished converting from nginx/radosgw to tengine/radosgw, and it's fixed all the weird issues I was seeing (uploads failing, random clock skew errors, timeouts). The problem with nginx and radosgw is that nginx insists on buffering all the uploads to disk. This causes a significant performance hit, and prevents larger uploads from working. Supposedly, there is going to be an option in nginx to disable this, but it hasn't been released yet (nor do I see anything on the nginx devel list about it). tengine ( http://tengine.taobao.org/ ) is an nginx fork that implements unbuffered uploads to fastcgi. It's basically a drop in replacement for nginx. 
My configuration looks like this:

server {
    listen 80;
    server_name *.rados.test rados.test;
    client_max_body_size 10g;
    # This is the important option that tengine has, but nginx does not
    fastcgi_request_buffering off;

    location / {
        fastcgi_pass_header Authorization;
        fastcgi_pass_request_headers on;

        if ($request_method = PUT) {
            rewrite ^ /PUT$request_uri;
        }

        include fastcgi_params;
        fastcgi_pass unix:/path/to/ceph.radosgw.fastcgi.sock;
    }

    location /PUT/ {
        internal;
        fastcgi_pass_header Authorization;
        fastcgi_pass_request_headers on;

        include fastcgi_params;
        fastcgi_param CONTENT_LENGTH $content_length;
        fastcgi_pass unix:/path/to/ceph.radosgw.fastcgi.sock;
    }
}

If anyone else is looking to run radosgw without having to run Apache, I would recommend you look into tengine :)
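For completeness, the ceph.conf change mentioned earlier in the thread has to be in place as well, since nginx/tengine do not handle the 100-continue that radosgw emits by default (the section name and the restart command are assumptions based on the usual Ubuntu packaging):

# in ceph.conf, in the [client.radosgw.gateway] section:
#   rgw print continue = false
# then restart the gateway
service radosgw restart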
Re: [ceph-users] Bad Write-Performance on Ceph/Possible bottlenecks?
> Try to create e.g. 20 (small) rbd devices, putting them all in an LVM VG, creating a logical volume (RAID0) with 20 stripes and e.g. a stripe size of 1 MB (better bandwidth) or 4 kB (better IO) - or use md-raid0 (it's maybe 10% faster, but not that flexible):

BTW - we use this approach for VMware:

- one LVM LV (RAID0: 20 stripes, 1 MB stripe size) LUN, based on
- one VG containing 20 rbd's (each 40 GB), based on
- a ceph pool with 24 OSDs, 3 replicas, inside our
- ceph cluster: 3 nodes x 8 x 4 TB OSDs, 2 x 10 GBit
- published by scst (Fibre Channel, 4 GBit QLA) to vSphere ESX.

IOmeter (one worker, one disk) inside a w2k8r2 VM on ESX tells me:

iometer: 270/360 MB/sec write/read (1 MByte block size, 4 outstanding IOs)

And - important - other VMs share the bandwidth of the 20 rbd volumes, so now our 4 GBit fibre channel is the bottleneck, not the (one) rbd volume anymore.

Also, we will add a flashcache in front of the RAID0 LV to boost the 4k IOs - at the moment 4k is terribly slow:

iometer: 4/14 MB/sec write/read (4k block size, 8 outstanding IOs)

With a 10 GByte flashcache, it's about:

iometer: 14/60 MB/sec write/read (4k block size, 8 outstanding IOs)

regards
Danny
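Because the commands in Daniel's earlier message lost some spacing in the archive, here is the same striped-volume idea written out as a sketch; the image size, LV size and stripe size are illustrative assumptions, not his exact values:

# 20 RBD images of 40 GB each, and allow LVM to use rbd devices as PVs
for i in `seq -f %02.f 0 19`; do rbd create --size 40960 vmware/vol6-$i.dsk; done
# in /etc/lvm/lvm.conf:  types = [ "rbd", 16 ]
for i in `seq -f %02.f 0 19`; do rbd map vmware/vol6-$i.dsk; done
for i in `seq -f %02.f 0 19`; do pvcreate /dev/rbd/vmware/vol6-$i.dsk; done

vgcreate VG_RBD20x40_VOL6 /dev/rbd/vmware/vol6-00.dsk
for i in `seq -f %02.f 1 19`; do vgextend VG_RBD20x40_VOL6 /dev/rbd/vmware/vol6-$i.dsk; done

# 20 stripes, 1 MB stripe size, ~700 GB logical volume
lvcreate -i 20 -I 1024 -L 700G -n VmProd06 VG_RBD20x40_VOL6

fio (or the scst export) can then be pointed at /dev/VG_RBD20x40_VOL6/VmProd06.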
[ceph-users] OSD recovery problem
Hi,

I extracted a disk with two partitions (journal and data) and copied its content in the hope of restarting the OSD and recovering its content:

mount /dev/sdb1 /mnt
rsync -avH --numeric-ids /mnt/ /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/
rm /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal
dd if=/dev/sdb2 of=/var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal

and then start ceph-osd id=$(cat /mnt/whoami). It crashes on https://github.com/ceph/ceph/blob/v0.72.2/src/osd/PG.cc#L2182 and before it happens there is "load_pgs ignoring unrecognized meta". The full debug osd = 20 logs are in http://paste.ubuntu.com/7746993/ and this is:

root@bm4202:/etc/ceph# dpkg -l | grep ceph
ii ceph         0.72.2-1trusty amd64 distributed storage and file system
ii ceph-common  0.72.2-1trusty amd64 common utilities to mount and interact with a ceph storage
ii python-ceph  0.72.2-1trusty amd64 Python libraries for the Ceph distributed filesystem
root@bm4202:/etc/ceph# ceph --version
ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)

Cheers

--
Loïc Dachary, Artisan Logiciel Libre
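A quick sanity check after a copy like this (a generic aside, not part of the original message; it assumes the attr package is installed and uses an arbitrary object file as the example) is to compare extended attributes between source and destination, since the OSD keeps essential metadata in xattrs:

# pick any object file and compare its xattrs on the source and the copy
getfattr -d -m - -e hex /mnt/current/meta/<some object file>
getfattr -d -m - -e hex /var/lib/ceph/osd/ceph-N/current/meta/<some object file>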
Re: [ceph-users] OSD recovery problem
On 07/04/2014 03:18 PM, Loic Dachary wrote:
> Hi,
>
> I extracted a disk with two partitions (journal and data) and copied its content in the hope of restarting the OSD and recovering its content:
>
> mount /dev/sdb1 /mnt
> rsync -avH --numeric-ids /mnt/ /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/

I think you went wrong there. From the rsync man page:

-a, --archive    archive mode; equals -rlptgoD (no -H,-A,-X)
-X, --xattrs     preserve extended attributes

So you didn't copy over the xattrs, so basically the data is lost/unusable.

> rm /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal
> dd if=/dev/sdb2 of=/var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal
>
> and then start ceph-osd id=$(cat /mnt/whoami). It crashes on https://github.com/ceph/ceph/blob/v0.72.2/src/osd/PG.cc#L2182 and before it happens there is "load_pgs ignoring unrecognized meta". The full debug osd = 20 logs are in http://paste.ubuntu.com/7746993/ and this is:
>
> root@bm4202:/etc/ceph# dpkg -l | grep ceph
> ii ceph         0.72.2-1trusty amd64 distributed storage and file system
> ii ceph-common  0.72.2-1trusty amd64 common utilities to mount and interact with a ceph storage
> ii python-ceph  0.72.2-1trusty amd64 Python libraries for the Ceph distributed filesystem
> root@bm4202:/etc/ceph# ceph --version
> ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60)
>
> Cheers

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
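For reference, a corrected copy would be Loic's original rsync with -X added so that extended attributes (and hard links) come along; this assumes the OSD is stopped while the copy runs:

rsync -avXH --numeric-ids /mnt/ /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/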
Re: [ceph-users] OSD recovery problem
On 04/07/2014 15:25, Wido den Hollander wrote: On 07/04/2014 03:18 PM, Loic Dachary wrote: Hi, I extracted a disk with two partitions (journal and data) and copied its content in the hope to restart the OSD and recover its content. mount /dev/sdb1 /mnt rsync -avH --numeric-ids /mnt/ /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/ I think you went wrong there, rsync man page: -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) -X, --xattrspreserve extended attributes So you didn't copy over the xattrs, so basically the data is lost/unusable. Thanks ! Fortunately the original disks are still available ;-) rm /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal dd if=/dev/sdb2 of=/var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal and then start ceph-osd id=$(cat /mnt/whoami) It crashes on https://github.com/ceph/ceph/blob/v0.72.2/src/osd/PG.cc#L2182 and before it happens there is load_pgs ignoring unrecognized meta and the full debug osd = 20 logs are in http://paste.ubuntu.com/7746993/ and this is root@bm4202:/etc/ceph# dpkg -l | grep ceph ii ceph0.72.2-1trusty amd64 distributed storage and file system ii ceph-common 0.72.2-1trusty amd64 common utilities to mount and interact with a ceph storage ii python-ceph 0.72.2-1trusty amd64 Python libraries for the Ceph distributed filesystem root@bm4202:/etc/ceph# ceph --version ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60) Cheers ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Loïc Dachary, Artisan Logiciel Libre signature.asc Description: OpenPGP digital signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Re: [ceph-users] Multipart upload on ceph 0.8 doesn't work?
Thank you Luis for your response. Quite unbelievable, but your solution worked!

Unfortunately, I'm stuck again when trying to upload the parts of the file. Apache's logs:

==> apache.access.log <==
127.0.0.1 l - [04/Jul/2014:15:40:41 +0200] PUT /bucketbig/ HTTP/1.1 200 477 {Referer}i Boto/2.30.0 Python/2.7.6 Linux/3.13.0-30-generic
127.0.0.1 l - [04/Jul/2014:15:40:41 +0200] POST /bucketbig/Bosphorus?uploads HTTP/1.1 200 249 {Referer}i Boto/2.30.0 Python/2.7.6 Linux/3.13.0-30-generic

==> apache.error.log <==
[Fri Jul 04 15:40:41.868621 2014] [fastcgi:error] [pid 14199] [client 127.0.0.1:46571] FastCGI: incomplete headers (0 bytes) received from server /home/pszablow/ceph/src/htdocs/rgw.fcgi

==> apache.access.log <==
127.0.0.1 l - [04/Jul/2014:15:40:41 +0200] PUT /bucketbig/Bosphorus?uploadId=2/fURJChPdpUqA3Z1oVLUjT7ROsnxIqZ9&partNumber=1 HTTP/1.1 500 531 {Referer}i Boto/2.30.0 Python/2.7.6 Linux/3.13.0-30-generic

==> apache.error.log <==
[Fri Jul 04 15:40:42.571543 2014] [fastcgi:error] [pid 14200] (111)Connection refused: [client 127.0.0.1:46572] FastCGI: failed to connect to server /home/pszablow/ceph/src/htdocs/rgw.fcgi: connect() failed
[Fri Jul 04 15:40:42.571660 2014] [fastcgi:error] [pid 14200] [client 127.0.0.1:46572] FastCGI: incomplete headers (0 bytes) received from server /home/pszablow/ceph/src/htdocs/rgw.fcgi

I'm using the default fastcgi module, not the one provided by Ceph. I've tried installing it on my Ubuntu 14.04, but unfortunately I keep getting the error:

libapache2-mod-fastcgi : requires: apache2.2-common (>= 2.2.4)

Is the modified fastcgi module mandatory in order to use multipart upload?

Thanks,
Patrycja Szabłowska

2014-07-03 18:34 GMT+02:00 Luis Periquito <luis.periqu...@ocado.com>:

> I was at this issue this morning. It seems radosgw requires you to have a pool named '' to work with multipart. I just created a pool with that name:
>
> rados mkpool ''
>
> either that or allow the pool to be created by the radosgw...
>
> On 3 July 2014 16:27, Patrycja Szabłowska <szablowska.patry...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm trying to make multipart upload work. I'm using ceph 0.80-702-g9bac31b (from ceph's github). I've tried the code provided by Mark Kirkwood here:
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-October/034940.html
>>
>> But unfortunately, it gives me the error:
>>
>> (multitest)pszablow@pat-desktop:~/$ python boto_multi.py
>> begin upload of abc.yuv
>> size 746496, 7 parts
>> Traceback (most recent call last):
>>   File "boto_multi.py", line 36, in <module>
>>     part = bucket.initiate_multipart_upload(objname)
>>   File "/home/pszablow/venvs/multitest/local/lib/python2.7/site-packages/boto/s3/bucket.py", line 1742, in initiate_multipart_upload
>>     response.status, response.reason, body)
>> boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
>> <?xml version="1.0" encoding="UTF-8"?><Error><Code>AccessDenied</Code></Error>
>>
>> The single part upload works for me. I am able to create buckets and objects. I've also tried other similar examples, but none of them works.
>>
>> Any ideas what's wrong? Does ceph's multipart upload actually work for anybody?
>>
>> Thanks,
>> Patrycja Szabłowska

--
Luis Periquito
Unix Engineer
Ocado.com http://www.ocado.com/
--
Pozdrawiam
Patrycja Szabłowska
Re: [ceph-users] write performance per disk
On 07/03/2014 08:11 AM, VELARTIS Philipp Dürhammer wrote: Hi, I have a ceph cluster setup (with 45 sata disk journal on disks) and get only 450mb/sec writes seq (maximum playing around with threads in rados bench) with replica of 2 Which is about ~20Mb writes per disk (what y see in atop also) theoretically with replica2 and having journals on disk should be 45 X 100mb (sata) / 2 (replica) / 2 (journal writes) which makes it 1125 satas in reality have 120mb/sec so the theoretical output should be more. I would expect to have between 40-50mb/sec for each sata disk Can somebody confirm that he can reach this speed with a setup with journals on the satas (with journals on ssd speed should be 100mb per disk)? or does ceph only give about ¼ of the speed for a disk? (and not the ½ as expected because of journals) My setup is 3 servers with: 2 x 2.6ghz xeons, 128gb ram 15 satas for ceph (and ssds for system) 1 x 10gig for external traffic, 1 x 10gig for osd traffic with reads I can saturate the network but writes is far away. And I would expect at least to saturate the 10gig with sequential writes also In addition to the advice wido is providing (which I wholeheartedly agree with!), you might want to check your controller/disk configuration. If you have journals on the same disks as the data, some times putting the disks into single-disk RAID0 LUNs with writeback cache enabled can help keep journal and data writes from causing seek contention. This only works if you have a controller with cache and a battery though. Thank you ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
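To see whether journal and data writes on the same spindles are fighting each other, it is worth watching per-disk utilisation on the OSD hosts while the benchmark runs (a generic suggestion, not from this thread):

# on each OSD host during the rados bench run
iostat -xm 5
# high %util combined with modest MB/s on the data disks usually points at seek contention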
Re: [ceph-users] Multipart upload on ceph 0.8 doesn't work?
Still not sure do I need the ceph's modified fastcgi or not. But I guess this explains my problem with the installation: http://tracker.ceph.com/issues/8233 It would be nice to have at least a workaround for this... Thanks, Patrycja Szabłowska 2014-07-04 16:02 GMT+02:00 Patrycja Szabłowska szablowska.patry...@gmail.com: Thank you Luis for your response. Quite unbelievable, but your solution worked! Unfortunately, I'm stuck again when trying to upload parts of the file. Apache's logs: == apache.access.log == 127.0.0.1 l - [04/Jul/2014:15:40:41 +0200] PUT /bucketbig/ HTTP/1.1 200 477 {Referer}i Boto/2.30.0 Python/2.7.6 Linux/3.13.0-30-generic 127.0.0.1 l - [04/Jul/2014:15:40:41 +0200] POST /bucketbig/Bosphorus?uploads HTTP/1.1 200 249 {Referer}i Boto/2.30.0 Python/2.7.6 Linux/3.13.0-30-generic == apache.error.log == [Fri Jul 04 15:40:41.868621 2014] [fastcgi:error] [pid 14199] [client 127.0.0.1:46571] FastCGI: incomplete headers (0 bytes) received from server /home/pszablow/ceph/src/htdocs/rgw.fcgi == apache.access.log == 127.0.0.1 l - [04/Jul/2014:15:40:41 +0200] PUT /bucketbig/Bosphorus?uploadId=2/fURJChPdpUqA3Z1oVLUjT7ROsnxIqZ9partNumber=1 HTTP/1.1 500 531 {Referer}i Boto/2.30.0 Python/2.7.6 Linux/3.13.0-30-generic == apache.error.log == [Fri Jul 04 15:40:42.571543 2014] [fastcgi:error] [pid 14200] (111)Connection refused: [client 127.0.0.1:46572] FastCGI: failed to connect to server /home/pszablow/ceph/src/htdocs/rgw.fcgi: connect() failed [Fri Jul 04 15:40:42.571660 2014] [fastcgi:error] [pid 14200] [client 127.0.0.1:46572] FastCGI: incomplete headers (0 bytes) received from server /home/pszablow/ceph/src/htdocs/rgw.fcgi I'm using the default fastcgi module, not the one provided by Ceph. I've tried installing it on my ubuntu 14.04, but unfortunately I keep getting the error: libapache2-mod-fastcgi : requires: apache2.2-common (= 2.2.4) Is the modified fastcgi module mandatory in order to use multi part upload? Thanks, Patrycja Szabłowska 2014-07-03 18:34 GMT+02:00 Luis Periquito luis.periqu...@ocado.com: I was at this issue this morning. It seems radosgw requires you to have a pool named '' to work with multipart. I just created a pool with that name rados mkpool '' either that or allow the pool be created by the radosgw... On 3 July 2014 16:27, Patrycja Szabłowska szablowska.patry...@gmail.com wrote: Hi, I'm trying to make multi part upload work. I'm using ceph 0.80-702-g9bac31b (from the ceph's github). I've tried the code provided by Mark Kirkwood here: http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-October/034940.html But unfortunately, it gives me the error: (multitest)pszablow@pat-desktop:~/$ python boto_multi.py begin upload of abc.yuv size 746496, 7 parts Traceback (most recent call last): File boto_multi.py, line 36, in module part = bucket.initiate_multipart_upload(objname) File /home/pszablow/venvs/multitest/local/lib/python2.7/site-packages/boto/s3/bucket.py, line 1742, in initiate_multipart_upload response.status, response.reason, body) boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden ?xml version=1.0 encoding=UTF-8?ErrorCodeAccessDenied/Code/Error The single part upload works for me. I am able to create buckets and objects. I've tried also other similar examples, but none of them works. Any ideas what's wrong? Does the ceph's multi part upload actually work for anybody? 
Thanks,
Patrycja Szabłowska

--
Luis Periquito
Unix Engineer
Ocado.com http://www.ocado.com/

--
Pozdrawiam
Patrycja Szabłowska
[ceph-users] radosgw-agent failed to parse
I am having issues running radosgw-agent to sync data between two radosgw zones. As far as I can tell, both zones are running correctly. My issue is when I run the radosgw-agent command:

radosgw-agent -v --src-access-key access_key --src-secret-key secret_key --dest-access-key access_key --dest-secret-key secret_key --src-zone us-master http://us-secondary.example.com:80

I get the following error:

DEBUG:boto:Using access key provided by client.
DEBUG:boto:Using secret key provided by client.
DEBUG:boto:StringToSign:
GET

Fri, 04 Jul 2014 15:25:53 GMT
/admin/config
DEBUG:boto:Signature:
AWS EA20YO07DA8JJJX7ZIPJ:WbykwyXu5m5IlbEsBzo8bKEGIzg=
DEBUG:boto:url = 'http://us-secondary.example.comhttp://us-secondary.example.com/admin/config'
params={}
headers={'Date': 'Fri, 04 Jul 2014 15:25:53 GMT', 'Content-Length': '0', 'Authorization': 'AWS EA20YO07DA8JJJX7ZIPJ:WbykwyXu5m5IlbEsBzo8bKEGIzg=', 'User-Agent': 'Boto/2.20.1 Python/2.7.6 Linux/3.13.0-24-generic'}
data=None
ERROR:root:Could not retrieve region map from destination
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/radosgw_agent/cli.py", line 269, in main
    region_map = client.get_region_map(dest_conn)
  File "/usr/lib/python2.7/dist-packages/radosgw_agent/client.py", line 391, in get_region_map
    region_map = request(connection, 'get', 'admin/config')
  File "/usr/lib/python2.7/dist-packages/radosgw_agent/client.py", line 153, in request
    result = handler(url, params=params, headers=request.headers, data=data)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 55, in get
    return request('get', url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 349, in request
    prep = self.prepare_request(req)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 287, in prepare_request
    hooks=merge_hooks(request.hooks, self.hooks),
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 287, in prepare
    self.prepare_url(url, params)
  File "/usr/lib/python2.7/dist-packages/requests/models.py", line 334, in prepare_url
    scheme, auth, host, port, path, query, fragment = parse_url(url)
  File "/usr/lib/python2.7/dist-packages/urllib3/util.py", line 390, in parse_url
    raise LocationParseError("Failed to parse: %s" % url)
LocationParseError: Failed to parse: Failed to parse: us-secondary.example.comhttp:

Is this a bug, or is my setup wrong? I can navigate to http://us-secondary.example.com/admin/config and it correctly outputs the zone details. In the output above,

DEBUG:boto:url = 'http://us-secondary.example.comhttp://us-secondary.example.com/admin/config'

should the URL be repeated like that? Any help would be greatly appreciated. Thanks
Re: [ceph-users] error mapping device in firefly
On Fri, Jul 4, 2014 at 11:48 AM, Xabier Elkano <xelk...@hostinet.com> wrote:
> Hi, I am trying to map an rbd device in Ubuntu 14.04 (kernel 3.13.0-30-generic):
>
> # rbd -p mypool create test1 --size 500
> # rbd -p mypool ls
> test1
> # rbd -p mypool map test1
> rbd: add failed: (5) Input/output error
>
> and in the syslog:
>
> Jul 4 09:31:48 testceph kernel: [70503.356842] libceph: mon2 172.16.64.18:6789 feature set mismatch, my 4a042a42 < server's 2004a042a42, missing 200
> Jul 4 09:31:48 testceph kernel: [70503.356938] libceph: mon2 172.16.64.18:6789 socket error on read
>
> My environment: the cluster version on all MONs and OSDs is 0.80.1. On the client machine:
>
> ii ceph-common 0.80.1-1trusty amd64 common utilities to mount and interact with a ceph storage cluster
> ii python-ceph 0.80.1-1trusty amd64 Python libraries for the Ceph distributed filesystem
> ii librados2   0.80.1-1trusty amd64 RADOS distributed object store client library
>
> I think I started getting this error when I switched the tunables from legacy to optimal after upgrading from 0.72 to 0.80.

Hi Xabier,

You need to do

ceph osd getcrushmap -o /tmp/crush
crushtool -i /tmp/crush --set-chooseleaf_vary_r 0 -o /tmp/crush.new
ceph osd setcrushmap -i /tmp/crush.new

or upgrade your kernel to 3.15.

Thanks,

Ilya
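After injecting the modified crushmap (or booting a 3.15 kernel), the map should go through; a quick way to verify, reusing the pool and image names from the report:

rbd -p mypool map test1
rbd showmapped
dmesg | tail    # should no longer show the feature set mismatch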
Re: [ceph-users] OSD recovery problem
For the record here is a summary of what happened : http://dachary.org/?p=3131 On 04/07/2014 15:35, Loic Dachary wrote: On 04/07/2014 15:25, Wido den Hollander wrote: On 07/04/2014 03:18 PM, Loic Dachary wrote: Hi, I extracted a disk with two partitions (journal and data) and copied its content in the hope to restart the OSD and recover its content. mount /dev/sdb1 /mnt rsync -avH --numeric-ids /mnt/ /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/ I think you went wrong there, rsync man page: -a, --archive archive mode; equals -rlptgoD (no -H,-A,-X) -X, --xattrspreserve extended attributes So you didn't copy over the xattrs, so basically the data is lost/unusable. Thanks ! Fortunately the original disks are still available ;-) rm /var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal dd if=/dev/sdb2 of=/var/lib/ceph/osd/ceph-$(cat /mnt/whoami)/journal and then start ceph-osd id=$(cat /mnt/whoami) It crashes on https://github.com/ceph/ceph/blob/v0.72.2/src/osd/PG.cc#L2182 and before it happens there is load_pgs ignoring unrecognized meta and the full debug osd = 20 logs are in http://paste.ubuntu.com/7746993/ and this is root@bm4202:/etc/ceph# dpkg -l | grep ceph ii ceph0.72.2-1trusty amd64 distributed storage and file system ii ceph-common 0.72.2-1trusty amd64 common utilities to mount and interact with a ceph storage ii python-ceph 0.72.2-1trusty amd64 Python libraries for the Ceph distributed filesystem root@bm4202:/etc/ceph# ceph --version ceph version 0.72.2 (a913ded2ff138aefb8cb84d347d72164099cfd60) Cheers ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com -- Loïc Dachary, Artisan Logiciel Libre signature.asc Description: OpenPGP digital signature ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com