Re: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 -- the VM's read request become too slow

2015-02-21 Thread Alexandre DERUMIER
"so maybe we can be sure this problem is caused by 0.80.8"

That's good news, if you are sure that it comes from 0.80.8.

The commits applied between 0.80.7 and 0.80.8 are here:

https://github.com/ceph/ceph/compare/v0.80.7...v0.80.8


Now, we need to find which of them touch librados/librbd.
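
For reference, a rough sketch of how one might narrow that range down to librbd/librados changes, assuming a local clone of the ceph repository (the clone path below is hypothetical):

# Sketch: list commits between the two firefly tags that touch librbd or librados.
# Assumes a local clone of https://github.com/ceph/ceph with both tags fetched.
import subprocess

out = subprocess.check_output(
    ["git", "log", "--oneline", "v0.80.7..v0.80.8", "--", "src/librbd", "src/librados"],
    cwd="/path/to/ceph",  # hypothetical path to the clone
)
print(out.decode())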


- Original Message -
From: 杨万元 yangwanyuan8...@gmail.com
To: aderumier aderum...@odiso.com
Cc: ceph-users ceph-users@lists.ceph.com
Sent: Friday, February 13, 2015 04:39:05
Subject: Re: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 -- the VM's read request become too slow

Thanks very much for your advice.
Yes, as you said, disabling rbd_cache improves the read requests, but if I disable rbd_cache, the randwrite requests get worse. So this method may not solve my problem, right?
In addition, I also tested the 0.80.6 and 0.80.7 librbd; they perform as well as 0.80.5, so maybe we can be sure this problem is caused by 0.80.8.

2015-02-12 19:33 GMT+08:00 Alexandre DERUMIER  aderum...@odiso.com  : 


Hi, 
Can you test with rbd_cache disabled?

I remember a bug detected in giant; not sure whether it also applies to firefly.

This was the tracker:

http://tracker.ceph.com/issues/9513

But it has been solved and backported to firefly.

Also, can you test 0.80.6 and 0.80.7 ? 
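
For the rbd_cache on/off comparison, a minimal sketch using the python-rbd bindings; it assumes a throwaway test image "bench-img" in pool "rbd" (both names are hypothetical), and it only exercises librbd on the host where it runs, not inside the guest:

import time
import rados
import rbd

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.conf_set("rbd_cache", "false")   # flip to "true" for the second run
cluster.connect()
ioctx = cluster.open_ioctx("rbd")
image = rbd.Image(ioctx, "bench-img")
try:
    start = time.time()
    # 4k sequential reads over the first 256 MB of the image
    for offset in range(0, 256 * 1024 * 1024, 4096):
        image.read(offset, 4096)
    print("4k reads over 256 MB took %.1f s" % (time.time() - start))
finally:
    image.close()
    ioctx.close()
    cluster.shutdown()

Running the same loop once with the cache override and once without gives a crude librbd-level comparison independent of qemu.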







- Original Message -
From: killingwolf killingw...@qq.com
To: ceph-users ceph-users@lists.ceph.com
Sent: Thursday, February 12, 2015 12:16:32
Subject: [ceph-users] re: Upgrade 0.80.5 to 0.80.8 -- the VM's read request become too slow

I have this problem too. Help!

-- Original Message --
From: 杨万元 yangwanyuan8...@gmail.com
Sent: Thursday, February 12, 2015, 11:14 AM
To: ceph-users@lists.ceph.com ceph-users@lists.ceph.com
Subject: [ceph-users] Upgrade 0.80.5 to 0.80.8 -- the VM's read request become too slow

Hello! 
We use Ceph + OpenStack in our private cloud. Recently we upgraded our CentOS 6.5 based cluster from Ceph Emperor to Ceph Firefly.
At first we used the Red Hat EPEL yum repo to upgrade; that Ceph version is 0.80.5. We upgraded the monitors first, then the OSDs, and finally the clients. When we completed this upgrade, we booted a VM on the cluster and used fio to test the I/O performance. The I/O performance was as good as before. Everything was OK!
Then we upgraded the cluster from 0.80.5 to 0.80.8. When that was complete, we rebooted the VM to load the newest librbd. After that we again used fio to test the I/O performance, and we found that randwrite and write are as good as before, but randread and read have become worse: the read IOPS dropped from 4000-5000 to 300-400, the latency got worse, and the read bandwidth dropped from 400 MB/s to 115 MB/s. Then I downgraded the Ceph client version from 0.80.8 to 0.80.5 and the results became normal again.
So I think the cause is something in librbd. I compared the 0.80.8 release notes with 0.80.5 (http://ceph.com/docs/master/release-notes/#v0-80-8-firefly), and the only read-related change I found in 0.80.8 is: "librbd: cap memory utilization for read requests (Jason Dillaman)". Who can explain this?


My ceph cluster has 400 OSDs and 5 mons:
ceph -s 
health HEALTH_OK 
monmap e11: 5 mons at {BJ-M1-Cloud71= 
172.28.2.71:6789/0,BJ-M1-Cloud73=172.28.2.73:6789/0,BJ-M2-Cloud80=172.28.2.80:6789/0,BJ-M2-Cloud81=172.28.2.81:6789/0,BJ-M3-Cloud85=172.28.2.85:6789/0
 }, election epoch 198, quorum 0,1,2,3,4 
BJ-M1-Cloud71,BJ-M1-Cloud73,BJ-M2-Cloud80,BJ-M2-Cloud81,BJ-M3-Cloud85 
osdmap e120157: 400 osds: 400 up, 400 in 
pgmap v26161895: 29288 pgs, 6 pools, 20862 GB data, 3014 kobjects 
41084 GB used, 323 TB / 363 TB avail 
29288 active+clean 
client io 52640 kB/s rd, 32419 kB/s wr, 5193 op/s 


The following is my ceph client conf:
[global] 
auth_service_required = cephx 
filestore_xattr_use_omap = true 
auth_client_required = cephx 
auth_cluster_required = cephx 
mon_host = 
172.29.204.24,172.29.204.48,172.29.204.55,172.29.204.58,172.29.204.73 
mon_initial_members = ZR-F5-Cloud24, ZR-F6-Cloud48, ZR-F7-Cloud55, 
ZR-F8-Cloud58, ZR-F9-Cloud73 
fsid = c01c8e28-304e-47a4-b876-cb93acc2e980 
mon osd full ratio = .85 
mon osd nearfull ratio = .75 
public network = 172.29.204.0/24 
mon warn on legacy crush tunables = false 

[osd] 
osd op threads = 12 
filestore journal writeahead = true 
filestore merge threshold = 40 
filestore split multiple = 8 

[client] 
rbd cache = true 
rbd cache writethrough until flush = false 
rbd cache size = 67108864 
rbd cache max dirty = 50331648 
rbd cache target dirty = 33554432 

[client.cinder] 
admin socket = /var/run/ceph/rbd-$pid.asok 
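
As a side note, with that admin socket in place you can confirm what a running librbd client actually uses; a rough sketch, assuming the socket path pattern above and that the ceph CLI is installed on the hypervisor:

import glob
import json
import subprocess

# "config show" on the admin socket dumps the running client's configuration as JSON
for sock in glob.glob("/var/run/ceph/rbd-*.asok"):
    cfg = json.loads(subprocess.check_output(
        ["ceph", "--admin-daemon", sock, "config", "show"]))
    print("%s: rbd_cache=%s rbd_cache_size=%s"
          % (sock, cfg.get("rbd_cache"), cfg.get("rbd_cache_size")))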



My VM has 8 cores and 16 GB of RAM. The fio commands we use are:
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randread -size=60G 
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200 
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=randwrite -size=60G 
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200 
fio -ioengine=libaio -bs=4k -direct=1 -thread -rw=read -size=60G 
-filename=/dev/vdb -name=EBS -iodepth=32 -runtime=200 
fio -ioengine=libaio -bs=4k 

Re: [ceph-users] HELP FOR CEPH SOURCE CODE

2015-02-21 Thread Stefan Priebe - Profihost AG
This will be very difficult with a broken keyboard!

 On 21.02.2015 at 12:16, khyati joshi kpjosh...@gmail.com wrote:
 
 I WANT TO ADD 2 NEW FEATURES IN CEPH NAMELY GARBAGE COLLECTION AND REPLICA 
 BALANCING.BUT I DONT KNOW WHICH SOURCE CODE TO CHANGE AND HOW?
 
 I WANT TO KNOW EXACT AND STEPWISE PROCEDURE FOR CHANGING SOURCE CODE AND 
 TESTING IT.DO I NEED TO EXTRACT ZIP FILE OF SOURCE CODE OR NEED TO INSTALL 
 ANYTHING ELSE?
 
 PLEASE HELP.
 
 
 KHYATI,
 STUDENT, M.TECH(COMP),
 GUJARAT TECHNOLOGICALUNIVERSITY.
 
 
 
 


[ceph-users] HELP FOR CEPH SOURCE CODE

2015-02-21 Thread khyati joshi
I WANT TO ADD 2 NEW FEATURES IN CEPH NAMELY GARBAGE COLLECTION AND REPLICA
BALANCING.BUT I DONT KNOW WHICH SOURCE CODE TO CHANGE AND HOW?

I WANT TO KNOW EXACT AND STEPWISE PROCEDURE FOR CHANGING SOURCE CODE AND
TESTING IT.DO I NEED TO EXTRACT ZIP FILE OF SOURCE CODE OR NEED TO INSTALL
ANYTHING ELSE?

PLEASE HELP.


KHYATI,
STUDENT, M.TECH(COMP),
GUJARAT TECHNOLOGICALUNIVERSITY.


[ceph-users] Fwd: OSD fail on client writes

2015-02-21 Thread Jeffrey McDonald
Hi,

We have a Ceph Giant installation with a radosgw interface. There are 198
OSDs on seven OSD servers, and we're seeing OSD failures on the system when
users try to write files via the S3 interface. We're more likely to see
the failures if the files are larger than 1 GB and if the files go to a
newly created bucket. We have seen failures for older buckets, but those
seem to happen less frequently. I can regularly crash an OSD with a 3.6
GB file written to a newly created bucket.

Three weeks ago, we upgraded from firefly to Giant to achieve better
performance. Under firefly it was impossible to break the system. We
have had these issues since we moved to Giant. We've gone through
tests with iptables, sysctl parameters, and different versions of
s3cmd (along with different Python versions); there is no indication that
any of these matter for the failures.
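
In case it helps narrow things down, a rough sketch for pulling the assertion backtraces out of the OSD logs on an OSD host after a reproduction (default log location assumed; "FAILED assert" is just the usual marker string in OSD crash logs):

import glob
import subprocess

# Run on each OSD host after reproducing the failure: print the assertion
# backtrace (if any) from every OSD log on that machine.
for log in sorted(glob.glob("/var/log/ceph/ceph-osd.*.log")):
    print("== %s ==" % log)
    subprocess.call(["grep", "-A", "30", "FAILED assert", log])

`ceph osd tree` will show which OSDs are currently marked down, which narrows the hosts worth checking.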

Here is the client interaction:

 $ ls -lh 420N.bam
-rw---. 1 jmcdonal tech 3.6G Feb 19 07:52 420N.bam

$ s3cmd put 420N.bam s3://jmtestbigfiles2/
420N.bam -> s3://jmtestbigfiles2/420N.bam  [part 1 of 4, 1024MB]
 1073741824 of 1073741824   100% in   22s    45.95 MB/s  done
420N.bam -> s3://jmtestbigfiles2/420N.bam  [part 2 of 4, 1024MB]
 1073741824 of 1073741824   100% in   23s    44.35 MB/s  done
420N.bam -> s3://jmtestbigfiles2/420N.bam  [part 3 of 4, 1024MB]
 1073741824 of 1073741824   100% in   21s    48.33 MB/s  done
420N.bam -> s3://jmtestbigfiles2/420N.bam  [part 4 of 4, 562MB]
 589993365 of 589993365   100% in   42s    13.28 MB/s  done
ERROR: syntax error: line 1, column 49
ERROR:
Upload of '420N.bam' part 4 failed. Use
  /usr/bin/s3cmd abortmp s3://jmtestbigfiles2/420N.bam
2/A5m20_uvjRllfTNB4wplXZH0eYDjyen
to abort the upload, or
  /usr/bin/s3cmd --upload-id 2/A5m20_uvjRllfTNB4wplXZH0eYDjyen put ...
to continue the upload.

!
An unexpected error has occurred.
  Please try reproducing the error using
  the latest s3cmd code from the git master
  branch found at:
https://github.com/s3tools/s3cmd
  If the error persists, please report the
  following lines (removing any private
  info as necessary) to:
   s3tools-b...@lists.sourceforge.net

!

Invoked as: /usr/bin/s3cmd put 420N.bam s3://jmtestbigfiles2/
Problem: AttributeError: 'module' object has no attribute 'ParseError'
S3cmd:   1.5.0-rc1
python:   2.6.6 (r266:84292, Jan 22 2014, 09:42:36)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-4)]
environment LANG=en_US.UTF-8

Traceback (most recent call last):
  File "/usr/bin/s3cmd", line 2523, in <module>
    rc = main()
  File "/usr/bin/s3cmd", line 2441, in main
    rc = cmd_func(args)
  File "/usr/bin/s3cmd", line 380, in cmd_object_put
    response = s3.object_put(full_name, uri_final, extra_headers, extra_label = seq_label)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 516, in object_put
    return self.send_file_multipart(file, headers, uri, size)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1037, in send_file_multipart
    upload.upload_all_parts()
  File "/usr/lib/python2.6/site-packages/S3/MultiPart.py", line 111, in upload_all_parts
    self.upload_part(seq, offset, current_chunk_size, labels, remote_status = remote_statuses.get(seq))
  File "/usr/lib/python2.6/site-packages/S3/MultiPart.py", line 165, in upload_part
    response = self.s3.send_file(request, self.file, labels, buffer, offset = offset, chunk_size = chunk_size)
  File "/usr/lib/python2.6/site-packages/S3/S3.py", line 1010, in send_file
    warning("Upload failed: %s (%s)" % (resource['uri'], S3Error(response)))
  File "/usr/lib/python2.6/site-packages/S3/Exceptions.py", line 51, in __init__
    except ET.ParseError:
AttributeError: 'module' object has no attribute 'ParseError'

!
An unexpected error has occurred.
  Please try reproducing the error using
  the latest s3cmd code from the git master
  branch found at:
https://github.com/s3tools/s3cmd
  If the error persists, please report the
  above lines (removing any private
  info as necessary) to:
   s3tools-b...@lists.sourceforge.net
!
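
For what it's worth, the final traceback frame looks like a client-side Python 2.6 problem rather than anything the gateway sent back: xml.etree.ElementTree only gained ParseError in Python 2.7, so the "except ET.ParseError:" in S3/Exceptions.py itself raises AttributeError on 2.6 and masks the real server-side error. A minimal compatibility sketch of the kind of guard that avoids it (hypothetical, not the upstream s3cmd fix):

import xml.etree.ElementTree as ET

# ElementTree.ParseError exists only on Python >= 2.7; Python 2.6 propagates
# xml.parsers.expat.ExpatError from the underlying parser instead.
try:
    from xml.etree.ElementTree import ParseError
except ImportError:
    from xml.parsers.expat import ExpatError as ParseError

def parse_error_body(data):
    """Parse an S3 error response body, returning None if it is not valid XML."""
    try:
        return ET.fromstring(data)
    except ParseError:
        return None

The OSD failures themselves are a separate problem; this only explains why s3cmd dies with AttributeError instead of reporting the gateway's error cleanly.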
-
on the gateway, I see this object request:

ops: [
{ tid: 14235,
  pg: 70.59a712df,
  osd: 21,
  object_id:
default.315696.1__shadow_420N.bam.2\/A5m20_uvjRllfTNB4wplXZH0eYDjyen.4_140,
  object_locator: @70,
  target_object_id:
default.315696.1__shadow_420N.bam.2\/A5m20_uvjRllfTNB4wplXZH0eYDjyen.4_140,
  target_object_locator: @70,
  paused: 0,
  used_replica: 0,
  precalc_pgid: 0,
  last_sent: 2015-02-21 11:20:11.317593,
  attempts: 7,
  snapid: head,
  snap_context: 0=[],
  mtime: 2015-02-21 11:18:58.114452,
  osd_ops: [
write 2621440~169365]}],
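
For context, a dump like the one above can be polled from the radosgw client admin socket; a rough sketch, assuming the gateway exposes an asok under /var/run/ceph/ (the path pattern is an assumption) and that the objecter_requests admin-socket command is available in this release:

import glob
import json
import subprocess

# Flag in-flight rados ops from the gateway that have been retried repeatedly,
# like the "attempts: 7" request shown above.
for sock in glob.glob("/var/run/ceph/ceph-client.rgw.*.asok"):   # assumed naming
    raw = subprocess.check_output(["ceph", "--admin-daemon", sock, "objecter_requests"])
    for op in json.loads(raw).get("ops", []):
        if op.get("attempts", 0) > 3:
            print("%s: tid %s -> osd.%s, %s attempts, object %s"
                  % (sock, op.get("tid"), op.get("osd"),
                     op.get("attempts"), op.get("object_id")))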


Re: [ceph-users] HELP FOR CEPH SOURCE CODE

2015-02-21 Thread Michael Andersen
C++ is case sensitive, it will be very difficult...
On Feb 21, 2015 3:44 AM, Stefan Priebe - Profihost AG 
s.pri...@profihost.ag wrote:

 This will be very difficult with a broken keyboard!

  On 21.02.2015 at 12:16, khyati joshi kpjosh...@gmail.com wrote:
 
  I WANT TO ADD 2 NEW FEATURES IN CEPH NAMELY GARBAGE COLLECTION AND
 REPLICA BALANCING.BUT I DONT KNOW WHICH SOURCE CODE TO CHANGE AND HOW?
 
  I WANT TO KNOW EXACT AND STEPWISE PROCEDURE FOR CHANGING SOURCE CODE AND
 TESTING IT.DO I NEED TO EXTRACT ZIP FILE OF SOURCE CODE OR NEED TO INSTALL
 ANYTHING ELSE?
 
  PLEASE HELP.
 
 
  KHYATI,
  STUDENT, M.TECH(COMP),
  GUJARAT TECHNOLOGICALUNIVERSITY.
 
 
 
 


[ceph-users] Radosgw keeps writing to specific OSDs while there are other free OSDs

2015-02-21 Thread B L
Hi Ceph community,

I'm trying to upload a 5 GB file through radosgw. I have 9 OSDs deployed on 3 machines, and my cluster is healthy.

The problem is: the 5 GB file is being uploaded to osd.0 and osd.1, which are near full, while the other OSDs have more free space that could hold this file. Why would ceph (or maybe radosgw) do this?

Is this a misconfiguration of ceph, or a misunderstanding of how ceph works?


Another question: is data sharded by default in ceph, or do we have to configure it? Or is the problem not related to sharding in the first place?
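
For reference, one way to investigate is to check which PGs/OSDs the uploaded rados objects actually map to (placement is computed by CRUSH). A rough sketch, assuming the rgw data pool is named .rgw.buckets (the pool and object names below are assumptions; list the real ones with rados -p <pool> ls):

import subprocess

pool = ".rgw.buckets"   # assumed rgw data pool name; check with `ceph osd lspools`
objects = [
    # hypothetical object names; list the real ones with `rados -p .rgw.buckets ls`
    "default.4712.1__shadow_bigfile.2~abc.1_1",
    "default.4712.1__shadow_bigfile.2~abc.1_2",
]
for obj in objects:
    # `ceph osd map <pool> <object>` prints the PG and acting OSD set for that object
    print(subprocess.check_output(["ceph", "osd", "map", pool, obj]).decode().strip())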


I appreciate everybody’s help

Thanks!
Beanos