Re:Re: Consult some problems of Ceph when reading source code

2015-08-06 Thread Sage Weil
On Thu, 6 Aug 2015, 蔡毅 wrote: Dear Dr. Sage: Thank you for your detailed reply! These answers help me a lot. I also have some questions about Question (1). In your reply, the requests are enqueued into the ShardedWQ according to their PG; if I have 3 requests (that is

Newbie question about metadata_list.

2015-08-06 Thread Łukasz Szymczyk
Hi, I'm writing a program to replace an image in the cluster with its copy, but I have a problem with metadata_list. I created a pool: #rados mkpool dupa then I created an image: #rbd create --size 1000 -p mypool image --image-format 2 Below is the code which tries to get the metadata, but it fails with

Re: civetweb health check

2015-08-06 Thread Wido den Hollander
On 05-08-15 18:37, Srikanth Madugundi wrote: Hi, We are planning to move our radosgw setup from apache to civetweb. We were able to successfully set up and run civetweb on a test cluster. The radosgw instances are fronted by a VIP which currently checks the health by getting /status.html

Re: FileStore should not use syncfs(2)

2015-08-06 Thread Yan, Zheng
On Thu, Aug 6, 2015 at 5:26 AM, Sage Weil sw...@redhat.com wrote: Today I learned that syncfs(2) does an O(n) search of the superblock's inode list searching for dirty items. I've always assumed that it was only traversing dirty inodes (e.g., a list of dirty inodes), but that appears not to

Re: More ondisk_finisher thread?

2015-08-06 Thread Ding Dinghua
Sorry for the noise. I have found the cause in our setup and case: we gathered too many logs in our RADOS IO path, and the latency seems to be reasonable (about 0.026 ms) if we don't gather that many logs... 2015-08-05 20:29 GMT+08:00 Sage Weil s...@newdream.net: On Wed, 5 Aug 2015, Ding

Re: wip-user status

2015-08-06 Thread Sage Weil
On Wed, 5 Aug 2015, Milan Broz wrote: On 08/04/2015 10:53 PM, Sage Weil wrote: I rebased the wip-user patches from wip-selinux-policy onto wip-selinux-policy-no-user + merge to master so that it sits on top of the newly-merged systemd changes. Great, so if it is build-ready state, I

Re: civetweb health check

2015-08-06 Thread Srikanth Madugundi
Hitting the '/' endpoint worked. Thanks Srikanth On Thu, Aug 6, 2015 at 1:26 AM, Wido den Hollander w...@42on.com wrote: On 05-08-15 18:37, Srikanth Madugundi wrote: Hi, We are planning to move our radosgw setup from apache to civetweb. We were able to successfully set up and run civetweb on a

RE: Erasure Code Plugins : PLUGINS_V3 feature

2015-08-06 Thread Miyamae, Takeshi
Hi Loic, Thank you for arranging the PLUGINS_V3 feature. I have just started to review pull request #5493. Please wait just a moment. By the way, may I ask what the current status of #5257 (decoding cache: the last immediate request from SHEC) is? https://github.com/ceph/ceph/pull/5257 Tell us if

Re: [ceph-users] Is it safe to increase pg number in a production environment

2015-08-06 Thread Jevon Qiao
Hi Jan, Thank you very much for the suggestion. Regards, Jevon On 5/8/15 19:36, Jan Schermer wrote: Hi, comments inline. On 05 Aug 2015, at 05:45, Jevon Qiao qiaojianf...@unitedstack.com wrote: Hi Jan, Thank you for the detailed suggestion. Please see my reply in-line. On 5/8/15 01:23, Jan

Re: About the Ceph erasure pool with ISA plugin on Intel xeon CPU

2015-08-06 Thread Derek Su
Hello Loic, the following are my steps and configurations: (1) The 11 OSDs and 3 monitors were run in the docker container on the same host machine. (2) Each OSD had one 1T HDD. (3) I set the erasure coding pool profiles: ## Jerasure, reed-solomon $ ceph osd erasure-code-profile set reed_k4m2_A

RE: OSD sometimes stuck in init phase

2015-08-06 Thread Gurjar, Unmesh
Thanks for quick response Haomai! Please find the backtrace here [1]. [1] - http://paste.openstack.org/show/411139/ Regards, Unmesh G. IRC: unmeshg -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Thursday, August 06, 2015 5:31 PM To: Gurjar, Unmesh Cc:

Re: OSD sometimes stuck in init phase

2015-08-06 Thread Haomai Wang
Could you print all your thread backtraces via 'thread apply all bt'? On Thu, Aug 6, 2015 at 7:52 PM, Gurjar, Unmesh unmesh.gur...@hp.com wrote: Hi, On a Ceph Firefly cluster (version [1]), OSDs are configured to use separate data and journal disks (using the ceph-disk utility). It is observed,

RE: OSD sometimes stuck in init phase

2015-08-06 Thread Gurjar, Unmesh
Please find ceph.conf at [1] and the corresponding OSD log at [2]. To clarify one thing I skipped earlier: while bringing up the OSDs, 'ceph-disk activate' was hanging (due to issue [3]). To get past this, I had to temporarily disable 'journal dio' to get the disk activated (with a

Re: FileStore should not use syncfs(2)

2015-08-06 Thread Christoph Hellwig
On Wed, Aug 05, 2015 at 02:26:30PM -0700, Sage Weil wrote: Today I learned that syncfs(2) does an O(n) search of the superblock's inode list searching for dirty items. I've always assumed that it was only traversing dirty inodes (e.g., a list of dirty inodes), but that appears not to be

Re: FileStore should not use syncfs(2)

2015-08-06 Thread Sage Weil
On Thu, 6 Aug 2015, Haomai Wang wrote: Agree On Thu, Aug 6, 2015 at 5:38 AM, Somnath Roy somnath@sandisk.com wrote: Thanks Sage for digging down..I was suspecting something similar.. As I mentioned in today's call, in idle time also syncfs is taking ~60ms. I have 64 GB of RAM in

Re: FileStore should not use syncfs(2)

2015-08-06 Thread Sage Weil
On Thu, 6 Aug 2015, Christoph Hellwig wrote: On Wed, Aug 05, 2015 at 02:26:30PM -0700, Sage Weil wrote: Today I learned that syncfs(2) does an O(n) search of the superblock's inode list searching for dirty items. I've always assumed that it was only traversing dirty inodes (e.g., a list

OSD sometimes stuck in init phase

2015-08-06 Thread Gurjar, Unmesh
Hi, On a Ceph Firefly cluster (version [1]), OSDs are configured to use separate data and journal disks (using the ceph-disk utility). It is observed that few OSDs start up fine (are in 'up' and 'in' state); however, others are stuck in the 'init creating/touching snapmapper object' phase. Below

Re: Consult some problems of Ceph when reading source code

2015-08-06 Thread Sage Weil
Hi! On Thu, 6 Aug 2015, 蔡毅 wrote: Dear developers, My name is Cai Yi, and I am a graduate student majored in CS of Xi'an Jiaotong University in China. From Ceph's homepage, I know Sage is the author of Ceph and I get the email address from your GitHub and Ceph's official website.

Re: FileStore should not use syncfs(2)

2015-08-06 Thread Sage Weil
On Thu, 6 Aug 2015, Yan, Zheng wrote: On Thu, Aug 6, 2015 at 5:26 AM, Sage Weil sw...@redhat.com wrote: Today I learned that syncfs(2) does an O(n) search of the superblock's inode list searching for dirty items. I've always assumed that it was only traversing dirty inodes (e.g., a list of

About the Ceph erasure pool with ISA plugin on Intel xeon CPU

2015-08-06 Thread Derek Su
Dear Mr. Dachary and all, Recently, I found that your blog shows the performance tests of erasure pools (http://dachary.org/?p=3042, http://dachary.org/?p=3665). The results indicate that the write throughput can be enhanced significantly using Intel Xeon CPUs. I tried to create an erasure pool with isa

Re: Newbie question about metadata_list.

2015-08-06 Thread Ilya Dryomov
On Thu, Aug 6, 2015 at 12:26 PM, Łukasz Szymczyk lukasz.szymc...@corp.ovh.com wrote: Hi, I'm writing a program to replace an image in the cluster with its copy, but I have a problem with metadata_list. I created a pool: #rados mkpool dupa then I created an image: #rbd create --size 1000 -p mypool

Re: OSD sometimes stuck in init phase

2015-08-06 Thread Haomai Wang
I don't see anything strange. Could you paste your ceph.conf? And restart this osd with debug_osd=20/20, debug_filestore=20/20 :-) On Thu, Aug 6, 2015 at 8:09 PM, Gurjar, Unmesh unmesh.gur...@hp.com wrote: Thanks for quick response Haomai! Please find the backtrace here [1]. [1] -

Re: FileStore should not use syncfs(2)

2015-08-06 Thread Christoph Hellwig
On Thu, Aug 06, 2015 at 06:00:42AM -0700, Sage Weil wrote: I'm guessing the strategy here should be to fsync the file (leaf) and then any affected ancestors, such that the directory fsyncs are effectively no-ops? Or does it matter? All metadata transactions log the involved parties (parent

Re: OSD sometimes stuck in init phase

2015-08-06 Thread Haomai Wang
It seems filestore doesn't do the transaction as expected. Sorry, you need to add debug_journal=20/20 to help find the reason. :-) BTW, what's your OS version? How many OSDs do you have in this cluster, and how many OSDs failed to start like this? On Thu, Aug 6, 2015 at 9:17 PM, Gurjar, Unmesh

Re: Erasure Code Plugins : PLUGINS_V3 feature

2015-08-06 Thread Loic Dachary
Hi Takeshi, https://github.com/ceph/ceph/pull/5493 is ready for your review. The matching integration tests can be found at https://github.com/ceph/ceph-qa-suite/pull/523 Cheers On 06/08/2015 02:28, Miyamae, Takeshi wrote: Dear Sage, note that what this really means is that the on-disk

testing the teuthology OpenStack backend

2015-08-06 Thread Loic Dachary
Hi, I'm looking into testing the OpenStack backend for teuthology on a new cluster to verify it's portable. I think it is but ... ;-) I'm told you have an OpenStack cluster and would be interested in running teuthology workloads on it. Does it have a public-facing API? Cheers -- Loïc

Re:Re: Consult some problems of Ceph when reading source code

2015-08-06 Thread 蔡毅
Dear Dr. Sage: Thank you for your detailed reply! These answers help me a lot. I also have some questions about Question (1). In your reply, the requests are enqueued into the ShardedWQ according to their PG; if I have 3 requests (that is, pg1,r1, pg2,r2, pg3,r3), and I put them into the ShardedWQ, is

Re: radosgw + civetweb latency issue on Hammer

2015-08-06 Thread Mark Nelson
Hi Srikanth, Can you make a ticket on tracker.ceph.com for this? We'd like to not lose track of it. Thanks! Mark On 08/05/2015 07:01 PM, Srikanth Madugundi wrote: Hi, After upgrading to Hammer and moving from apache to civetweb, we started seeing high PUT latency on the order of 2 sec