Re: Blueprint: Add LevelDB support to ceph cluster backend store

2013-07-31 Thread Haomai Wang
2013-7-31, 2:01, Sage Weil s...@inktank.com wrote: Hi Haomai, On Wed, 31 Jul 2013, Haomai Wang wrote: Every node of a ceph cluster has a backend filesystem such as btrfs, xfs or ext4 that provides storage for data objects, whose locations are determined by the CRUSH algorithm. There should

Re: Blueprint: Add LevelDB support to ceph cluster backend store

2013-08-28 Thread Haomai Wang
On Aug 28, 2013, at 7:01 AM, Sage Weil s...@inktank.com wrote: Hi Haomai, I just wanted to check in to see if things have progressed at all since we talked at CDS. If you have any questions or there is anything I can help with, let me know! I'd love to see this alternative backend make

[RBD][OpenStack]The way to solve problem when boot VM and root disk size is specified

2013-11-11 Thread Haomai Wang
Hi all, the OpenStack Nova master branch still has a bug when you boot a VM with a specified root disk size and rbd as the Nova storage backend. For example, you boot a VM and specify 10G as the root disk size, but the image is only 1G. The VM will then be spawned and the root disk size will
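
The usual remedy for the situation described above is to grow the cloned RBD image to the requested root disk size before the guest starts. A minimal sketch with the rbd CLI, using a made-up pool/image name and assuming this rbd version interprets --size in megabytes:

    # the clone is only 1G because the source image is 1G
    rbd info vms/instance-0001_disk
    # grow it to the requested 10G root disk
    rbd resize --size 10240 vms/instance-0001_disk

The guest filesystem still has to be grown separately (e.g. with resize2fs inside the guest).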

Re: [RBD][OpenStack]The way to solve problem when boot VM and root disk size is specified

2013-11-13 Thread Haomai Wang
On Nov 13, 2013, at 10:58 AM, Josh Durgin josh.dur...@inktank.com wrote: On 11/11/2013 11:10 PM, Haomai Wang wrote: Hi all, Now OpenStack Nova master branch still exists a bug when you boot a VM which root disk size is specified. The storage backend of Nova also is rbd. For example, you

Re: [RBD][OpenStack]The way to solve problem when boot VM and root disk size is specified

2013-11-13 Thread Haomai Wang
On Nov 13, 2013, at 9:14 PM, Haomai Wang haomaiw...@gmail.com wrote: On Nov 13, 2013, at 10:58 AM, Josh Durgin josh.dur...@inktank.com wrote: On 11/11/2013 11:10 PM, Haomai Wang wrote: Hi all, Now OpenStack Nova master branch still exists a bug when you boot a VM which root disk size

Re: [ceph-users] rocksdb Seen today - replacement for leveldb?

2013-11-27 Thread Haomai Wang
Yes, we have a related blueprint for the rocksdb backend: wiki.ceph.com/index.php?title=01Planning/02Blueprints/Firefly/osd:_new_key%2F%2Fvalue_backend On Wed, Nov 27, 2013 at 6:56 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: the performance comparisons are very impressive:

Re: Refactor DBObjectMap Proposal

2013-12-21 Thread Haomai Wang
On Dec 13, 2013, at 1:01 AM, Sage Weil s...@inktank.com wrote: On Thu, 12 Dec 2013, Haomai Wang wrote: On Thu, Dec 12, 2013 at 1:26 PM, Sage Weil s...@inktank.com wrote: [adding cc ceph-devel] [attempt 2] On Wed, 11 Dec 2013, Haomai Wang wrote: Hi Sage, Since last CDS, you have

Re: Refactor DBObjectMap Proposal

2013-12-21 Thread Haomai Wang
On Dec 22, 2013, at 1:20 PM, Sage Weil s...@inktank.com wrote: On Sat, 21 Dec 2013, Haomai Wang wrote: On Dec 13, 2013, at 1:01 AM, Sage Weil s...@inktank.com wrote: On Thu, 12 Dec 2013, Haomai Wang wrote: On Thu, Dec 12, 2013 at 1:26 PM, Sage Weil s...@inktank.com wrote: [adding cc ceph

Re: Refactor DBObjectMap Proposal

2013-12-22 Thread Haomai Wang
On Dec 22, 2013, at 2:02 PM, Haomai Wang haomaiw...@gmail.com wrote: On Dec 22, 2013, at 1:20 PM, Sage Weil s...@inktank.com wrote: On Sat, 21 Dec 2013, Haomai Wang wrote: On Dec 13, 2013, at 1:01 AM, Sage Weil s...@inktank.com wrote: On Thu, 12 Dec 2013, Haomai Wang wrote: On Thu

Proposal for adding disable FileJournal option

2014-01-09 Thread Haomai Wang
Hi all, We know the FileJournal plays an important role in the FileStore backend: it can hugely reduce write latency and improve small write performance. But in practice there are exceptions, such as when we already use FlashCache or a cache pool (although it's not ready). If the cache pool is enabled, we may use

Re: Proposal for adding disable FileJournal option

2014-01-09 Thread Haomai Wang
on FileStore to implement it. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Thu, Jan 9, 2014 at 12:13 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi all, We know FileJournal plays a important role in FileStore backend, it can hugely reduce write latency and improve

Re: Proposal for adding disable FileJournal option

2014-01-10 Thread Haomai Wang
If the FileJournal could be disabled, there must be something else to implement the Transaction interface. But that seems hard, since in my opinion no local filesystem provides such a function. On 10 January 2014 10:04, Haomai Wang haomaiw...@gmail.com wrote: On Fri, Jan 10, 2014 at 1:28 AM, Gregory

Re: Proposal for adding disable FileJournal option

2014-01-12 Thread Haomai Wang
, Haomai Wang haomaiw...@gmail.com wrote: On Fri, Jan 10, 2014 at 11:13 AM, Gregory Farnum g...@inktank.com wrote: Exactly. We can't do a safe update without a journal — what if power goes out while the write is happening? When we boot back up, we don't know what version the object is actually

Re: rados io hints

2014-01-17 Thread Haomai Wang
That's great for performance; more advice from the client side is welcome for the ObjectStore implementation. Maybe the new operation can be just like fadvise: accept flags as an argument, and ObjectStore will try to honor them but without a guarantee. KeyValueStore will also benefit greatly from it. On Sat,

[RFC]About GenericObjectMap in KeyValueStore

2014-01-21 Thread Haomai Wang
will expose the header to callers across write operations. StripObjectMap is the successor of GenericObjectMap and is used by KeyValueStore directly. It encapsulates the interface of GenericObjectMap and makes it more suitable for KeyValueStore. Best regards, Haomai Wang, UnitedStack Inc.

[RFC]About FileStore optimizations

2014-01-29 Thread Haomai Wang
Hi all, I noticed a pull request (https://github.com/ceph/ceph/pull/1152) which makes a good deal of optimizations to FileStore. Although the way it modifies ObjectStore and the callers' code is not acceptable to me, it still seems good for performance purposes. I agree that we need a session mechanism to cache

[Announce]The progress of KeyValueStore in Firefly

2014-02-27 Thread Haomai Wang
Hi all, last release I proposed a KeyValueStore prototype (performance results and problems at http://sebastien-han.fr/blog/2013/12/02/ceph-performance-interesting-things-going-on). Now I'd like to refresh our thoughts on KeyValueStore. KeyValueStore is pursuing

Re: [Announce]The progress of KeyValueStore in Firefly

2014-02-28 Thread Haomai Wang
On Sat, Mar 1, 2014 at 8:04 AM, Danny Al-Gaaf danny.al-g...@bisect.de wrote: Hi, On 28.02.2014 03:45, Haomai Wang wrote: [...] I use fio with the rbd engine from TelekomCloud (https://github.com/TelekomCloud/fio/commits/rbd-engine) to test rbd. I would recommend to no longer use

Re: [RFC] add rocksdb support

2014-03-05 Thread Haomai Wang
I think the reason for the small difference between leveldb and rocksdb under FileStore is that the main latency cause isn't the KeyValueDB backend. So we may not get much benefit from using rocksdb instead of leveldb with FileStore. On Wed, Mar 5, 2014 at 4:23 PM, Alexandre DERUMIER aderum...@odiso.com

[librbd] Add interface of get the snapshot size?

2014-03-24 Thread Haomai Wang
Hi all, As we know, a snapshot is a lightweight resource in librbd and we don't have any statistics about it. But this causes some problems for cloud management: we can't measure the size of a snapshot, and different snapshots occupy different amounts of space. So we don't have a way to estimate

[Share]Performance tuning on Ceph FileStore with SSD backend

2014-04-09 Thread Haomai Wang
Hi all, I would like to share some ideas about how to improve performance on ceph with SSDs. Not much preciseness. Our SSDs are 500GB and each OSD owns an SSD (the journal is on the same SSD). The ceph version is 0.67.5 (Dumpling). At first, we found three bottlenecks in the filestore: 1. fdcache_lock(changed in

Re: [Share]Performance tuning on Ceph FileStore with SSD backend

2014-04-11 Thread Haomai Wang
the new coming leveldb backend store help for this specific case ? ----- Original Message ----- From: Gregory Farnum g...@inktank.com To: Haomai Wang haomaiw...@gmail.com Cc: ceph-devel@vger.kernel.org Sent: Wednesday, 9 April 2014 16:15:14 Subject: Re: [Share]Performance tuning on Ceph FileStore

Re: osd systemtap initial steps

2014-04-29 Thread Haomai Wang
So cool! I remember a TelekomCloud guy has also been working on this since the last summit. On Wed, Apr 30, 2014 at 6:17 AM, Samuel Just sam.j...@inktank.com wrote: I'm starting on adding systemtap trace points to the osd. https://github.com/athanatos/ceph/tree/wip-osd-stap has some initial build

Re: [ceph-users] Red Hat to acquire Inktank

2014-04-30 Thread Haomai Wang
Congratulations! On Wed, Apr 30, 2014 at 8:18 PM, Sage Weil s...@inktank.com wrote: Today we are announcing some very big news: Red Hat is acquiring Inktank. We are very excited about what this means for Ceph, the community, the team, our partners, and our customers. Ceph has come a long way in

[Performance] Improvement on DB Performance

2014-05-21 Thread Haomai Wang
Hi all, I remember there was a discussion about DB (mysql) performance on rbd. Recently I tested mysql-bench with rbd and found awful performance. So I dug into it and found that the main cause is flush requests from the guest. As we know, applications such as mysql (like ceph itself) have their own journal for durability and
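
The flush-heavy pattern described above can be approximated with fio for anyone who wants to reproduce it; this is only an illustrative sketch (the device path, block size and runtime are made up):

    fio --name=db-sim --filename=/dev/vdb --rw=randwrite --bs=16k \
        --ioengine=libaio --direct=1 --iodepth=1 --fsync=1 \
        --runtime=60 --time_based

fsync=1 issues a flush after every write, which is roughly what a database commit path does on top of rbd.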

Re: [Performance] Improvement on DB Performance

2014-05-21 Thread Haomai Wang
the root cause for db on rbd performance. On Wed, May 21, 2014 at 6:15 PM, Haomai Wang haomaiw...@gmail.com wrote: Hi all, I remember there exists discuss about DB(mysql) performance on rbd. Recently I test mysql-bench with rbd and found awful performance. So I dive into it and find that main

Re: [Performance] Improvement on DB Performance

2014-05-21 Thread Haomai Wang
-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Haomai Wang Sent: Wednesday, 21 May, 2014 6:22 PM To: ceph-devel@vger.kernel.org Subject: Re: [Performance] Improvement on DB Performance I pushed the commit to fix this problem(https://github.com/ceph/ceph/pull

Re: Questions of KeyValueStore (leveldb) backend

2014-05-25 Thread Haomai Wang
On Mon, May 26, 2014 at 9:46 AM, Guang Yang yguan...@outlook.com wrote: Hello Haomai, We are evaluating the key-value store backend which comes along with Firefly release (thanks for implementing it in Ceph), it is very promising for a couple of our use cases, after going through the related

Re: Questions of KeyValueStore (leveldb) backend

2014-05-26 Thread Haomai Wang
On Mon, May 26, 2014 at 5:23 PM, Wido den Hollander w...@42on.com wrote: On 05/26/2014 06:55 AM, Haomai Wang wrote: On Mon, May 26, 2014 at 9:46 AM, Guang Yang yguan...@outlook.com wrote: Hello Haomai, We are evaluating the key-value store backend which comes along with Firefly release

Re: [Performance] Improvement on DB Performance

2014-05-26 Thread Haomai Wang
On Wed, May 21, 2014 at 11:23 PM, Sage Weil s...@inktank.com wrote: On Wed, 21 May 2014, Haomai Wang wrote: I pushed the commit to fix this problem(https://github.com/ceph/ceph/pull/1848). With test program(Each sync request is issued with ten write request), a significant improvement

Re: [Share]Performance tuning on Ceph FileStore with SSD backend

2014-05-26 Thread Haomai Wang
performance improvements with this branch. Greets, Stefan On 09.04.2014 12:05, Haomai Wang wrote: Hi all, I would like to share some ideas about how to improve performance on ceph with SSD. Not much preciseness. Our ssd is 500GB and each OSD own a SSD(journal is on the same SSD). ceph

Re: [Share]Performance tuning on Ceph FileStore with SSD backend

2014-05-27 Thread Haomai Wang
I'm not fully sure of the correctness of the changes, although they look OK to me. And I have applied them to a production environment with no problems. On Tue, May 27, 2014 at 2:05 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: On 27.05.2014 06:42, Haomai Wang wrote: On Tue, May 27, 2014 at 4:29

Re: [Share]Performance tuning on Ceph FileStore with SSD backend

2014-05-27 Thread Haomai Wang
Not yet, I will try to push it to the master branch. On Tue, May 27, 2014 at 2:45 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: On 27.05.2014 08:37, Haomai Wang wrote: I'm not fully sure of the correctness of the changes although they look OK to me. And I have applied them to a production env

Re: xattr spillout appears broken :(

2014-05-30 Thread Haomai Wang
Hi Gregory, I tried to reproduce the bug on my local machine but failed. My test cmdline: ./ceph_test_rados --op read 100 --op write 100 --op delete 50 --max-ops 40 --objects 1024 --max-in-flight 64 --size 400 --min-stride-size 40 --max-stride-size 80 --max-seconds 600 --op

Re: [ceph-users] [Announce]The progress of KeyValueStore in Firefly

2014-06-03 Thread Haomai Wang
using similar conf file as yours. Is this the expected behavior or am I missing something? Thanks, Sushma On Fri, Feb 28, 2014 at 11:00 PM, Haomai Wang haomaiw...@gmail.com wrote: On Sat, Mar 1, 2014 at 8:04 AM, Danny Al-Gaaf danny.al-g...@bisect.de wrote: Hi, Am 28.02.2014 03:45

Re: xattr spillout appears broken :(

2014-06-03 Thread Haomai Wang
. On Sat, May 31, 2014 at 2:06 AM, Gregory Farnum g...@inktank.com wrote: On Fri, May 30, 2014 at 2:18 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi Gregory, I try to reproduce the bug in my local machine but failed. My test cmdline: ./ceph_test_rados --op read 100 --op write 100 --op delete 50

Re: xattr spillout appears broken :(

2014-06-03 Thread Haomai Wang
/familiar/not familiar/ On Tue, Jun 3, 2014 at 10:33 PM, Haomai Wang haomaiw...@gmail.com wrote: Hi Gregory, I checked again and again each line change about spill out codes, still failed to find anything wrong. I ran ceph_test_rados then activate scrub process several times locally

Re: [ceph-users] [Announce]The progress of KeyValueStore in Firefly

2014-06-03 Thread Haomai Wang
The fix pull request is https://github.com/ceph/ceph/pull/1912/files. Could someone help review and merge it? On Wed, Jun 4, 2014 at 3:38 AM, Sushma R gsus...@gmail.com wrote: ceph version : master (ceph version 0.80-713-g86754cc (86754cc78ca570f19f5a68fb634d613f952a22eb)) fio version :

[Feature]Proposal for adding a new flag named shared to support performance and statistic purpose

2014-06-05 Thread Haomai Wang
Hi, Previously I sent a mail about the difficulty of rbd snapshot size statistics. The main solution is using an object map to store the changes. The problem is that we can't handle concurrent modification by multiple clients. The lack of an object map (like the pointer map in qcow2) causes many problems in librbd, such as

Re: [Feature]Proposal for adding a new flag named shared to support performance and statistic purpose

2014-06-05 Thread Haomai Wang
On Thu, Jun 5, 2014 at 3:25 PM, Wido den Hollander w...@42on.com wrote: On 06/05/2014 09:01 AM, Haomai Wang wrote: Hi, Previously I sent a mail about the difficult of rbd snapshot size statistic. The main solution is using object map to store the changes. The problem is we can't handle

Re: [Feature]Proposal for adding a new flag named shared to support performance and statistic purpose

2014-06-05 Thread Haomai Wang
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Haomai Wang Sent: Thursday, June 05, 2014 12:43 AM To: Wido den Hollander Cc: Sage Weil; Josh Durgin; ceph-devel@vger.kernel.org Subject: Re: [Feature]Proposal for adding a new flag named shared to support performance and statistic

Re: xattr spillout appears broken :(

2014-06-06 Thread Haomai Wang
, Jun 3, 2014 at 10:33 PM, Haomai Wang haomaiw...@gmail.com wrote: /familiar/not familiar/ On Tue, Jun 3, 2014 at 10:33 PM, Haomai Wang haomaiw...@gmail.com wrote: Hi Gregory, I checked again and again each line change about spill out codes, still failed to find anything wrong. I ran

Re: xattr spillout appears broken :(

2014-06-06 Thread Haomai Wang
The fix should make the clone method copy cephos-prefixed xattrs. On Sat, Jun 7, 2014 at 2:54 AM, Haomai Wang haomaiw...@gmail.com wrote: Hi Greg, I have found the reason: user.cephos.spill_out isn't applied to the new object when calling the clone method. So if the origin object is spilled out, the new

Re: xattr spillout appears broken :(

2014-06-06 Thread Haomai Wang
Yes, maybe you can add it in your branch, because it will happen when creating an object and setting spill out. On Sat, Jun 7, 2014 at 2:59 AM, Gregory Farnum g...@inktank.com wrote: On Fri, Jun 6, 2014 at 11:55 AM, Haomai Wang haomaiw...@gmail.com wrote: The fix should make clone method copy cephos

Re: xattr spillout appears broken :(

2014-06-07 Thread Haomai Wang
Greg, I have submitted a patch to fix it: https://github.com/ceph/ceph/pull/1932 On Sat, Jun 7, 2014 at 3:00 AM, Haomai Wang haomaiw...@gmail.com wrote: Yes, maybe you can add it in your branch. Because it will happen when creating object and set spill out On Sat, Jun 7, 2014 at 2:59 AM, Gregory

Re: osd systemtap initial steps

2014-06-07 Thread Haomai Wang
bit :P. I'm focusing on instrumenting the OSD op life cycle. -Sam On Wed, Apr 30, 2014 at 6:19 AM, Danny Al-Gaaf danny.al-g...@bisect.de wrote: On 30.04.2014 04:21, Haomai Wang wrote: So cool! I remember a telekomcloud guy is also worked for it since last summit. Correct ... our

Re: [Feature]Proposal for adding a new flag named shared to support performance and statistic purpose

2014-06-10 Thread Haomai Wang
Thanks, Josh! Your points are really helpful. Maybe we can schedule this bp for the upcoming CDS? I hope the implementation can have a great performance effect on librbd. On Tue, Jun 10, 2014 at 9:16 AM, Josh Durgin josh.dur...@inktank.com wrote: On 06/05/2014 12:01 AM, Haomai Wang wrote: Hi

About set_alloc_hint op

2014-06-16 Thread Haomai Wang
Hi, Now librbd is the only user of the set_alloc_hint op: void AioWrite::add_write_ops(librados::ObjectWriteOperation &wr) { wr.set_alloc_hint(m_ictx->get_object_size(), m_ictx->get_object_size()); wr.write(m_object_off, m_write_data); } According to the above, the arguments for set_alloc_hint
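
For context, a minimal librados-level sketch of how a client other than librbd might attach the same hint to a write; the 4 MB sizes and the object name are made-up values, and the IoCtx is assumed to be already connected:

    #include <rados/librados.hpp>

    void write_with_hint(librados::IoCtx &ioctx, librados::bufferlist &bl) {
      librados::ObjectWriteOperation op;
      // expected object size and expected write size; both are hints only
      op.set_alloc_hint(4 << 20, 4 << 20);
      op.write(0, bl);
      ioctx.operate("example_object", &op);  // hypothetical object name
    }

The backend ObjectStore is free to ignore the hint, matching the fadvise-like semantics discussed elsewhere in this archive.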

Re: CEPH IOPS Baseline Measurements with MemStore

2014-06-24 Thread Haomai Wang
I would like to say that MemStore isn't a good backend for evaluating performance; it's just a prototype for ObjectStore. On Tue, Jun 24, 2014 at 8:13 PM, Andreas Joachim Peters andreas.joachim.pet...@cern.ch wrote: I made the same MemStore measurements with the master branch. It seems that the

Re: [RFC] add rocksdb support

2014-06-27 Thread Haomai Wang
As I mentioned days ago, there are two points related to kvstore perf: 1. The order of the image and the strip size are important to performance. Because the header (like an inode in a fs) is much more lightweight than an fd, the order of the image is expected to be lower, and the strip size can be configured to 4kb

Re: how to improve read through on sparse files (vm image file)

2014-06-27 Thread Haomai Wang
You need to pay attention to the fiemap option: ceph --show-config | grep fiemap. There are some tricks about fiemap; please search for it at tracker.ceph.com. On Fri, Jun 27, 2014 at 12:27 AM, huang jun hjwsm1...@gmail.com wrote: hi,all We migrate vmware EXSI vm from ceph cluster via NFS to local

Re: [RFC] add rocksdb support

2014-07-01 Thread Haomai Wang
understand the reason for this lock and whether it can be replaced with a RWLock or any other suggestions to avoid serialization due to this lock? Thanks, Sushma -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Friday, June 27, 2014 1:08 AM To: Sushma

Re: [RFC] add rocksdb support

2014-07-01 Thread Haomai Wang
Of Haomai Wang Sent: Monday, June 30, 2014 11:10 PM To: Sushma Gurram Cc: Shu, Xinxin; Mark Nelson; Sage Weil; Zhang, Jian; ceph-devel@vger.kernel.org Subject: Re: [RFC] add rocksdb support Hi Sushma, Thanks for your investigations! We already noticed the serializing risk on GenericObjectMap

Re: [RFC] add rocksdb support

2014-07-01 Thread Haomai Wang
DBObjectMap has a header cache; if the header cache is enabled, is header_lock still an awful point? Same for KeyValueStore, I will try to check too. Thanks, Sushma -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Tuesday, July 01, 2014 1:06 AM To: Somnath Roy Cc: Sushma

Re: [RFC] add rocksdb support

2014-07-02 Thread Haomai Wang
implementation). Thanks, Sushma -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Tuesday, July 01, 2014 10:03 AM To: Sushma Gurram Cc: Somnath Roy; Shu, Xinxin; Mark Nelson; Sage Weil; Zhang, Jian; ceph-devel@vger.kernel.org Subject: Re: [RFC] add rocksdb

Re: [ceph-users] Why is librbd1 / librados2 from Firefly 20% slower than the one from dumpling?

2014-07-02 Thread Haomai Wang
Could you give some perf counters from the rbd client side, such as op latency? On Wed, Jul 2, 2014 at 9:01 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag wrote: On 02.07.2014 00:51, Gregory Farnum wrote: On Thu, Jun 26, 2014 at 11:49 PM, Stefan Priebe - Profihost AG s.pri...@profihost.ag
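
For anyone wondering how to collect those client-side counters: if the librbd client has an admin socket configured, a perf dump can be pulled from it. A sketch with a made-up socket path:

    # in ceph.conf on the client host, e.g.:
    #   [client]
    #   admin socket = /var/run/ceph/$cluster-$type.$id.$pid.asok
    ceph --admin-daemon /var/run/ceph/ceph-client.admin.12345.asok perf dump

The dump typically includes objecter and librbd counters, op latency among them.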

Re: [RFC] add rocksdb support

2014-07-02 Thread Haomai Wang
exclusive lock , there maybe some unsafe scenarios . I am not sure whether my understanding is right ? if my understanding is right , I think RWlock or a fine-grain lock is a good suggestion. -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Tuesday, July 01, 2014

Re: [Feature]Proposal for adding a new flag named shared to support performance and statistic purpose

2014-07-14 Thread Haomai Wang
down anywhere, Josh? -Greg On Tue, Jun 10, 2014 at 12:38 PM, Josh Durgin josh.dur...@inktank.com wrote: On Tue, 10 Jun 2014 14:52:54 +0800 Haomai Wang haomaiw...@gmail.com wrote: Thanks, Josh! Your points are really helpful. Maybe we can schedule this bp to the near CDS? The implementation I

Re: Any concern about Ceph on CentOS

2013-07-17 Thread Haomai Wang
Hi Kasper, can you talk about how you make use of Ceph, with detailed information on CentOS? I guess that you use CephFS on the Ceph cluster? Best regards, Wheats. On 2013-7-17, at 2:16 PM, Kasper Dieter dieter.kas...@ts.fujitsu.com wrote: Hi Xiaoxi, we are really running Ceph on CentOS-6.4 (6 server

Blueprint: Add LevelDB support to ceph cluster backend store

2013-07-30 Thread Haomai Wang
Every node of a ceph cluster has a backend filesystem such as btrfs, xfs or ext4 that provides storage for data objects, whose locations are determined by the CRUSH algorithm. There should exist an abstract interface sitting between the osd and the backend store, allowing different backend store

Re: wip-memstore and wip-objectstore

2014-07-22 Thread Haomai Wang
Thanks, I will dive into it and fix it next. On Tue, Jul 22, 2014 at 11:49 PM, Sage Weil sw...@redhat.com wrote: Hi Haomai, Hmm, one other thing: I'm testing the fix in wip-8701 and it is tripping over the KeyValueStore test. This ./ceph_test_objectstore

Re: wip-memstore and wip-objectstore

2014-07-22 Thread Haomai Wang
Hi sage, The fix is https://github.com/ceph/ceph/pull/2136. :-) On Wed, Jul 23, 2014 at 12:48 AM, Haomai Wang haomaiw...@gmail.com wrote: Thanks, I will dive into it and fix it next. On Tue, Jul 22, 2014 at 11:49 PM, Sage Weil sw...@redhat.com wrote: Hi Haomai, Hmm, one other thing: I'm

Re: KeyFileStore ?

2014-07-31 Thread Haomai Wang
Awesome job! On Thu, Jul 31, 2014 at 1:49 PM, Mark Kirkwood mark.kirkw...@catalyst.net.nz wrote: On 31/07/14 17:25, Sage Weil wrote: After the latest set of bug fixes to the FileStore file naming code I am newly inspired to replace it with something less complex. Right now I'm mostly

[RFC]About ImageIndex

2014-08-11 Thread Haomai Wang
Hi Sage, Josh: ImageIndex is aimed at holding each object's location info, which avoids extra checking for non-existent objects. It's only used when the image flags include LIBRBD_CREATE_NONSHARED; otherwise ImageIndex has no effect. Each object has three states: 1. UNKNOWN: default
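
To make the idea concrete, here is a rough sketch of the per-object state such an index might track; only UNKNOWN-as-default comes from the mail, the EXISTS/NONEXISTENT names are placeholders:

    #include <cstdint>
    #include <vector>

    // Hypothetical per-object state for a non-shared image index.
    enum class ObjectState : uint8_t {
      UNKNOWN = 0,   // default: this object has not been checked yet
      EXISTS,        // placeholder: object is known to exist
      NONEXISTENT    // placeholder: object is known to be absent
    };

    // One entry per RADOS object backing the image; I/O to an object whose
    // state is already known can skip the extra existence check.
    struct ImageIndexSketch {
      std::vector<ObjectState> states;
      explicit ImageIndexSketch(size_t num_objects)
        : states(num_objects, ObjectState::UNKNOWN) {}
    };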

Re: [RFC]About ImageIndex

2014-08-13 Thread Haomai Wang
On Wed, Aug 13, 2014 at 10:14 AM, Josh Durgin josh.dur...@inktank.com wrote: On 08/11/2014 07:50 PM, Haomai Wang wrote: Hi Sage, Josh: ImageIndex is aimed to hold each object's location info which avoid extra checking for none-existing object. It's only used when image flags exists

Re: [RFC]About ImageIndex

2014-08-13 Thread Haomai Wang
, 2014 7:45 AM To: Haomai Wang; Sage Weil Cc: ceph-devel@vger.kernel.org Subject: Re: [RFC]About ImageIndex On 08/11/2014 07:50 PM, Haomai Wang wrote: Hi Sage, Josh: ImageIndex is aimed to hold each object's location info which avoid extra checking for none-existing object. It's only used

Re: [ceph-users] ceph osd unexpected error

2014-09-06 Thread Haomai Wang
Hi, could you give some more detailed info, such as the operations performed before the error occurred? And what's your ceph version? On Fri, Sep 5, 2014 at 3:16 PM, 廖建锋 de...@f-club.cn wrote: Dear CEPH, Urgent question: I met a FAILED assert(0 == unexpected error) yesterday. Now I have no way to start

Re: 答复: [ceph-users] ceph osd unexpected error

2014-09-06 Thread Haomai Wang
? From: Somnath Roy [somnath@sandisk.com] Sent: September 7, 2014 1:12 To: Haomai Wang; 廖建锋 Cc: ceph-users; ceph-devel Subject: RE: [ceph-users] ceph osd unexpected error Have you set the open file descriptor limit on the OSD node? Try setting it like 'ulimit -n 65536' -Original Message- From

Re: Regarding key/value interface

2014-09-11 Thread Haomai Wang
tell more about your key/value interface. I'm doing some work on an NVMe interface with Intel NVMe SSDs. Thanks Regards Somnath -Original Message- From: Sage Weil [mailto:sw...@redhat.com] Sent: Thursday, September 11, 2014 6:31 PM To: Somnath Roy Cc: Haomai Wang (haomaiw

[RFC]New Message Implementation Based on Event

2014-09-11 Thread Haomai Wang
Hi all, Recently I did some basic work on a new messenger implementation based on events (https://github.com/yuyuyu101/ceph/tree/msg-event). The basic idea is that we use a Processor thread for each Messenger to monitor all sockets and dispatch fds to a threadpool. The event mechanism can be epoll,
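
For readers unfamiliar with the event mechanism being described, this is a bare-bones epoll loop of the kind such a Processor thread could run; error handling and the thread-pool hand-off are omitted and the names are illustrative, not the ones used in the branch:

    #include <sys/epoll.h>

    void handle_readable(int fd);  // hypothetical: hand the fd to a worker

    // Minimal sketch of the Processor thread's event loop.
    void processor_loop(int epfd, const bool &stop) {
      epoll_event events[64];
      while (!stop) {
        int n = epoll_wait(epfd, events, 64, 100 /* ms timeout */);
        for (int i = 0; i < n; ++i)
          handle_readable(events[i].data.fd);
      }
    }

Sockets are registered on epfd with epoll_ctl(EPOLL_CTL_ADD) as connections are accepted; kqueue or select could back the same loop on other platforms.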

Re: [RFC]New Message Implementation Based on Event

2014-09-15 Thread Haomai Wang
at 11:51 PM, Sage Weil sw...@redhat.com wrote: Hi Haomai, On Fri, 12 Sep 2014, Haomai Wang wrote: Hi all, Recently, I did some basic work on new message implementation based on event(https://github.com/yuyuyu101/ceph/tree/msg-event). The basic idea is that we use a Processor thread for each

Re: severe librbd performance degradation in Giant

2014-09-17 Thread Haomai Wang
According to http://tracker.ceph.com/issues/9513, do you mean that rbd cache causes a 10x performance degradation for random read? On Thu, Sep 18, 2014 at 7:44 AM, Somnath Roy somnath@sandisk.com wrote: Josh/Sage, I should mention that even after turning off rbd cache I am getting ~20%

Re: Impact of page cache on OSD read performance for SSD

2014-09-23 Thread Haomai Wang
Good point, but have you considered the impact on write ops? And if we skip the page cache, is FileStore then responsible for the data cache? On Wed, Sep 24, 2014 at 3:29 AM, Sage Weil sw...@redhat.com wrote: On Tue, 23 Sep 2014, Somnath Roy wrote: Milosz, Thanks for the response. I will see if I

Re: Impact of page cache on OSD read performance for SSD

2014-09-23 Thread Haomai Wang
. Thanks Regards Somnath -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Tuesday, September 23, 2014 7:07 PM To: Sage Weil Cc: Somnath Roy; Milosz Tanski; ceph-devel@vger.kernel.org Subject: Re: Impact of page cache on OSD read performance for SSD Good

Re: Impact of page cache on OSD read performance for SSD

2014-09-24 Thread Haomai Wang
On Wed, Sep 24, 2014 at 8:38 PM, Sage Weil sw...@redhat.com wrote: On Wed, 24 Sep 2014, Haomai Wang wrote: I agree that direct read will help for disk reads. But if the read data is hot and small enough to fit in memory, the page cache is a good place to hold the data cache. If we discard the page cache, we
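
For readers following the direct-read discussion, this is roughly what skipping the page cache means at the syscall level; a sketch only, with a made-up path, and note that O_DIRECT requires aligned buffers and sizes (Linux):

    #include <fcntl.h>
    #include <unistd.h>
    #include <cstdlib>

    // Read 4 KB while bypassing the kernel page cache.
    ssize_t direct_read_4k(const char *path, off_t offset, void *&out) {
      int fd = open(path, O_RDONLY | O_DIRECT);
      if (fd < 0) return -1;
      void *buf = nullptr;
      if (posix_memalign(&buf, 4096, 4096) != 0) { close(fd); return -1; }
      ssize_t r = pread(fd, buf, 4096, offset);
      close(fd);
      out = buf;  // caller frees with free()
      return r;
    }

Every such read hits the device, which is exactly the trade-off being debated: it avoids double caching, but the store itself then has to cache hot data.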

Re: Impact of page cache on OSD read performance for SSD

2014-09-24 Thread Haomai Wang
direct_io read option (Need to quantify direct_io write also) as Sage suggested. Thanks Regards Somnath -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Wednesday, September 24, 2014 9:06 AM To: Sage Weil Cc: Somnath Roy; Milosz Tanski; ceph-devel

Re: Weekly performance meeting

2014-09-25 Thread Haomai Wang
) On Fri, Sep 26, 2014 at 10:27 AM, Haomai Wang haomaiw...@gmail.com wrote: Thanks, Sage! I'm on a flight on Oct 1. :-( My team is now mainly working on the performance of ceph; we have observed these points: 1. encode/decode adds remarkable latency, especially in ObjectStore::Transaction

Re: ceph-disk vs keyvaluestore

2014-09-29 Thread Haomai Wang
No problem, I would like to do it. On Tue, Sep 30, 2014 at 7:48 AM, Sage Weil sw...@redhat.com wrote: Hi Haomai, Not sure if you saw http://tracker.ceph.com/issues/9580 which came from an issue Mark had getting ceph-disk to work with the keyvaluestore backend. I think the answer is

Re: ceph-disk vs keyvaluestore

2014-09-29 Thread Haomai Wang
Hi sage, What do you think about using the existing ObjectStore::peek_journal_fsid interface to detect whether a journal is needed? KeyValueStore and MemStore could set the passed fsid argument to zero to indicate no journal. On Tue, Sep 30, 2014 at 10:23 AM, Haomai Wang haomaiw...@gmail.com wrote: No problem, I

Re: Weekly Ceph Performance Meeting Invitation

2014-10-01 Thread Haomai Wang
Thanks, Mark! It's a pity that I can't join while on a flight. I hope I'll have time to view the video (does one exist?). As a reminder, AsyncMessenger (https://github.com/yuyuyu101/ceph/tree/msg-event-worker-mode) is ready for developers to test. For io depth 1 (4k randwrite), AsyncMessenger saves

Re: ceph-disk vs keyvaluestore

2014-10-01 Thread Haomai Wang
On Thu, Oct 2, 2014 at 7:53 AM, Sage Weil sw...@redhat.com wrote: On Thu, 2 Oct 2014, Mark Kirkwood wrote: On 30/09/14 17:05, Sage Weil wrote: On Tue, 30 Sep 2014, Haomai Wang wrote: Hi sage, What do you think use existing ObjectStore::peek_journal_fsid interface to detect whether

Re: Regarding key/value interface

2014-10-02 Thread Haomai Wang
Correct; maybe we can move this super metadata to the backend! On Fri, Oct 3, 2014 at 6:47 AM, Somnath Roy somnath@sandisk.com wrote: Hi Sage/Haomai, I was going through the key/value store implementation and have one basic question regarding the way it is designed. I think key/value

Re: The Async messenger benchmark with latest master

2014-10-18 Thread Haomai Wang
Thanks Somnath! I have another simple performance test for the async messenger: for 4k object read, the master branch took 4.46s to complete the tests while the async Messenger took 3.14s; for 4k object write, the master branch took 10.6s while the async Messenger took 6.6s!! Detailed results below, 4k object read

Re: Ceph Full-SSD Performance Improvement

2014-10-21 Thread Haomai Wang
[cc to ceph-devel] On Tue, Oct 21, 2014 at 11:51 PM, Sage Weil s...@newdream.net wrote: Hi Haomai, You and your team have been doing great work and I'm very happy that you are working with Ceph! The performance gains you've seen are very encouraging. 1. Use AsyncMessenger for both

Re: The Async messenger benchmark with latest master

2014-10-22 Thread Haomai Wang
? OR I need to add some config option for that ? 3. Other than ms_event_op_threads , is there any tunable parameter I should be playing with ? Thanks Regards Somnath -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Saturday, October 18, 2014 10:15 PM

Re: The Async messenger benchmark with latest master

2014-10-22 Thread Haomai Wang
with the client which is not using this async messenger for example krbd ? Regards Somnath -Original Message- From: Haomai Wang [mailto:haomaiw...@gmail.com] Sent: Wednesday, October 22, 2014 10:02 AM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: The Async messenger benchmark

Re: 10/22/2014 Weekly Ceph Performance Meeting

2014-10-23 Thread Haomai Wang
https://wiki.ceph.com/Planning/Blueprints/Submissions/Fixed_memory_layout_for_Message%2F%2FOp_passing @sage, thanks On Thu, Oct 23, 2014 at 12:24 AM, Sage Weil s...@newdream.net wrote: Hi Everyone, Just a reminder that there won't be a performance call next week because of CDS, which is

Re: optimizing buffers, encode/decode

2014-10-29 Thread Haomai Wang
I'm interested in bufferlist's own encode/decode performance. But from what I've measured so far, I think we need to consider changing callers' behavior to get better performance. Combined (https://wiki.ceph.com/Planning/Blueprints/Hammer/Fixed_memory_layout_for_Message%2F%2FOp_passing) with

Re: [ceph-users] Micro Ceph and OpenStack Design Summit November 3rd, 2014 11:40am

2014-10-30 Thread Haomai Wang
Thanks, Loic! I will join. On Thu, Oct 30, 2014 at 1:54 AM, Loic Dachary l...@dachary.org wrote: Hi Ceph, TL;DR: Register for the Micro Ceph and OpenStack Design Summit November 3rd, 2014 11:40am http://kilodesignsummit.sched.org/event/f2e49f4547a757cc3d51f5641b2000cb November 3rd,

Re: krbd blk-mq support ?

2014-10-30 Thread Haomai Wang
Could you describe more about the 2x7 iops? Do you mean that 8 OSDs, each backed by an SSD, can achieve 140k iops? Is it read or write? Could you give the fio options? On Fri, Oct 31, 2014 at 12:01 AM, Alexandre DERUMIER aderum...@odiso.com wrote: I'll try to add more OSD next week, if it scales it's a
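
For reference, a typical krbd 4k random-write benchmark of the sort being asked about might look like the fio invocation below; the device path, queue depth and runtime are illustrative, not the ones used in this thread:

    fio --name=krbd-4k-randwrite --filename=/dev/rbd0 --ioengine=libaio \
        --direct=1 --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
        --runtime=60 --time_based --group_reporting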

Re: The Async messenger benchmark with latest master

2014-11-04 Thread Haomai Wang
an instance of 'ceph::FailedAssertion' Aborted ----- Original Message ----- From: Haomai Wang haomaiw...@gmail.com To: Alexandre DERUMIER aderum...@odiso.com Cc: Somnath Roy somnath@sandisk.com, ceph-devel@vger.kernel.org, Sage Weil s...@newdream.net Sent: Tuesday, 4 November 2014 10:29:46 Subject: Re

Re: KeyValueStore: Some bug fixing in KeyValueStore prevent osd runtime crash #2875

2014-11-08 Thread Haomai Wang
Hi Chendi, It seems that you found two bugs in KeyValueStore: 1. A potential race on strip_header->buffers: strip_header is owned by a thread that wants to access the header. To avoid a lock bottleneck, KeyValueStore only uses Sequencer-level (much like PG) locking to solve concurrent

Re: ObjectStore collections

2014-11-09 Thread Haomai Wang
On Sun, Nov 9, 2014 at 5:59 AM, Sage Weil s...@newdream.net wrote: On Sat, 8 Nov 2014, Haomai Wang wrote: As for OOM, I think the root cause is the mistake commit above too. Because meta collection will be updated each transaction and StripObjectHeader::buffers will be always kept in memory

Re: question on OSDService::infos_oid

2014-11-12 Thread Haomai Wang
On Wed, Nov 12, 2014 at 9:51 PM, Sage Weil sw...@redhat.com wrote: On Wed, 12 Nov 2014, xinxin shu wrote: recently we focus on 4k random write , dump transaction on every 4k random write op , found that , for every 4k random write , it will update pg epoch and pg info on OSDService::infos_oid

Re: client cpu usage : kbrd vs librbd perf report

2014-11-13 Thread Haomai Wang
Hmm, I think buffer alloc/dealloc is a good perf topic to discuss. For example, maybe frequently allocated objects can use a memory pool (each pool stores the same kind of objects), but the biggest challenge here is also the STL structures. On Fri, Nov 14, 2014 at 1:05 AM, Mark Nelson

Re: [ceph-users] fiemap bug on giant

2014-11-24 Thread Haomai Wang
It's surprising that the test machines run very new kernels but still hit this problem: plana 47 is 12.04.5 with kernel 3.18.0-rc6-ceph-00024-geb0e5fd, plana 50 is 12.04.4 with kernel 3.17.0-rc6-ceph-2-ge8acad6. Which local filesystem is it running on? On Tue, Nov 25, 2014 at 5:03 AM, Samuel Just

Re: [ceph-users] fiemap bug on giant

2014-11-24 Thread Haomai Wang
Oh, sorry. This series seems to have already been backported... On Tue, Nov 25, 2014 at 10:14 AM, Haomai Wang haomaiw...@gmail.com wrote: Backport this series (https://github.com/yuandong1222/ceph/commit/3f8fb85b341726ae7bb44b4c699707333cd63ccd) and test again? Or actually, my up to ten ceph

Re: [ceph-users] fiemap bug on giant

2014-11-24 Thread Haomai Wang
think the bug is actually with FileStore::_do_sparse_copy_range, which isn't truncating out properly to create 0 filled sections. -Sam On Mon, Nov 24, 2014 at 6:17 PM, Haomai Wang haomaiw...@gmail.com wrote: Oh, sorry. This series seemed that already backported... On Tue, Nov 25, 2014 at 10:14

Lock Constrains about Fast Dispatch for Messenger

2014-12-02 Thread Haomai Wang
Hi Gregory and Sage, I'm writing Messenger unit tests to ensure that SimpleMessenger and AsyncMessenger behave the same and as expected. I think the most unclear thing is the locking rules for fast dispatch. When fast dispatch was introduced, there were three methods: ms_fast_connect, ms_fast_accept,

Re: Lock Constrains about Fast Dispatch for Messenger

2014-12-02 Thread Haomai Wang
supplement On Tue, Dec 2, 2014 at 4:27 PM, Haomai Wang haomaiw...@gmail.com wrote: Hi Gregory and Sage, I'm writing Messenger unit tests to ensure that SimpleMessenger and AsyncMessenger behave the same and as expected. I think the most unclear thing is the locking rules for fast dispatch. When

  1   2   3   >