Re: ack vs commit

2015-12-03 Thread Haomai Wang
On Fri, Dec 4, 2015 at 9:39 AM, Gregory Farnum wrote: > On Thu, Dec 3, 2015 at 4:54 PM, Sage Weil wrote: >> From the beginning Ceph has had two kinds of acks for rados write/update >> operations: ack (indicating the operation is accepted, serialized, and >> staged in the osd's buffer cache) and c
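
For illustration, here is a minimal sketch of how a client sees both acknowledgements through the 2015-era librados C++ API, where an AioCompletion takes separate "complete" (ack) and "safe" (commit) callbacks; the object name and wrapper function are made up for the example:

    #include <rados/librados.hpp>
    #include <iostream>

    // "complete" fires on ack (accepted/serialized, possibly only in the OSD's
    // buffer cache); "safe" fires on commit (durable on the OSDs).
    static void on_ack(librados::completion_t, void*) {
      std::cout << "write acked" << std::endl;
    }
    static void on_commit(librados::completion_t, void*) {
      std::cout << "write committed" << std::endl;
    }

    int write_with_both_acks(librados::Rados& cluster, librados::IoCtx& io) {
      librados::bufferlist bl;
      bl.append("hello", 5);
      librados::AioCompletion* c =
          cluster.aio_create_completion(nullptr, on_ack, on_commit);
      int r = io.aio_write("obj", c, bl, bl.length(), 0);
      if (r < 0) { c->release(); return r; }
      c->wait_for_safe();   // block for the commit; wait_for_complete() waits for the ack only
      r = c->get_return_value();
      c->release();
      return r;
    }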

Re: Ceph_objectstore_bench crashed on keyvaluestore bench with ceph master branch

2015-12-02 Thread Haomai Wang
Thanks! Fixed in https://github.com/ceph/ceph/pull/6783. Please review. On Thu, Dec 3, 2015 at 3:19 AM, James (Fei) Liu-SSI wrote: > Hi Haomai, >I happened to run ceph_objectstore_bench against key value store on master > branch. It always crashed at finisher_thread_entry : > assert(!ls_rval.e

Re: queue_transaction interface + unique_ptr

2015-12-02 Thread Haomai Wang
On Thu, Dec 3, 2015 at 8:17 AM, Somnath Roy wrote: > Hi Sage/Sam, > As discussed in today's performance meeting , I am planning to change the > queue_transactions() interface to the following. > > int queue_transactions(Sequencer *osr, list& tls, > Context *onreadable,

Re: Compiling for FreeBSD

2015-11-29 Thread Haomai Wang
On Mon, Nov 30, 2015 at 1:44 AM, Willem Jan Withagen wrote: > Hi, > > Not unlike many others running FreeBSD I'd like to see if I/we can get > Ceph to build and run on FreeBSD. If not all components than at least > certain components. > > With compilation I do get quite some way, even with the CLA

Re: why my cluster become unavailable (min_size of pool)

2015-11-26 Thread Haomai Wang
s! > > > -- > hzwulibin > 2015-11-26 > > --------- > From: "hzwulibin" > Sent: 2015-11-23 09:00 > To: Sage Weil, Haomai Wang > Cc: ceph-devel > Subject: Re: why my cluster become unavailable >

Re: Reply: journal alignment

2015-11-23 Thread Haomai Wang
, 2015 at 9:29 PM, Haomai Wang wrote: > On Fri, Nov 20, 2015 at 9:08 PM, Sage Weil wrote: >> On Fri, 20 Nov 2015, Haomai Wang wrote: >>> On Fri, Nov 20, 2015 at 7:41 PM, Sage Weil wrote: >>> > On Fri, 20 Nov 2015, changtao381 wrote: >>> >> Hi All, >>

Re: why my cluster become unavailable

2015-11-21 Thread Haomai Wang
On Thu, Nov 19, 2015 at 11:26 PM, Libin Wu wrote: > Hi, cepher > > I have a cluster of 6 OSD server, every server has 8 OSDs. > > I out 4 OSDs on every server, then my client io is blocking. > > I reboot my client and then create a new rbd device, but the new > device also can't write io. > > Yeah

Re: Reply: journal alignment

2015-11-20 Thread Haomai Wang
On Fri, Nov 20, 2015 at 9:08 PM, Sage Weil wrote: > On Fri, 20 Nov 2015, Haomai Wang wrote: >> On Fri, Nov 20, 2015 at 7:41 PM, Sage Weil wrote: >> > On Fri, 20 Nov 2015, changtao381 wrote: >> >> Hi All, >> >> >> >> Thanks for you apply!

Re: Reply: journal alignment

2015-11-20 Thread Haomai Wang
On Fri, Nov 20, 2015 at 7:41 PM, Sage Weil wrote: > On Fri, 20 Nov 2015, changtao381 wrote: >> Hi All, >> >> Thanks for you apply! >> >> If directioIO + async IO requirement that alignment, it shouldn't aligned by >> PAGE for each journal entry. >> For it may write many entries of journal once ti

Re: journal alignment

2015-11-20 Thread Haomai Wang
On Fri, Nov 20, 2015 at 4:33 PM, changtao381 wrote: > HI All, > > Why it is needed an entry of journal t is aligned by CEPH_PAGE_MASK ? For > it causes the data of journal write are amplified by 2X for small io > linux aio/dio required this > For example write io size 4096 bytes, it may write
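
The alignment requirement comes from O_DIRECT plus libaio: buffers, offsets, and lengths must be block-aligned, so each journal entry is padded up to a page boundary. A rough standalone sketch of that rounding and the resulting amplification for a 4 KiB payload (the 40-byte header is a made-up number; the constants mirror Ceph's CEPH_PAGE_SIZE/CEPH_PAGE_MASK):

    #include <cstdint>
    #include <cstdio>

    constexpr uint64_t PAGE_SIZE_ = 4096;                 // assumed x86 page size
    constexpr uint64_t PAGE_MASK_ = ~(PAGE_SIZE_ - 1);

    // Round a journal entry up to the next page boundary, as direct/async IO requires.
    static uint64_t page_aligned(uint64_t len) {
      return (len + PAGE_SIZE_ - 1) & PAGE_MASK_;
    }

    int main() {
      uint64_t header = 40;      // hypothetical per-entry header
      uint64_t payload = 4096;   // the client's 4 KiB write
      uint64_t on_disk = page_aligned(header + payload);
      std::printf("payload %llu -> journal entry %llu bytes (%.1fx)\n",
                  (unsigned long long)payload, (unsigned long long)on_disk,
                  (double)on_disk / payload);             // 4136 -> 8192, i.e. 2x
      return 0;
    }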

Re: Request for Comments: Weighted Round Robin OP Queue

2015-11-09 Thread Haomai Wang
On Tue, Nov 10, 2015 at 2:19 AM, Samuel Just wrote: > Ops are hashed from the messenger (or any of the other enqueue sources > for non-message items) into one of N queues, each of which is serviced > by M threads. We can't quite have a single thread own a single queue > yet because the current de

Re: ceph encoding optimization

2015-11-07 Thread Haomai Wang
Hi Sage, could you let us know your progress on refactoring MSubOP and the hobject_t/pg_stat_t decode problem? We could build on your work if there is any. On Thu, Nov 5, 2015 at 1:29 AM, Haomai Wang wrote: > On Thu, Nov 5, 2015 at 1:19 AM, piotr.da...@ts.fujitsu.com > wrote: >>>

Re: ceph encoding optimization

2015-11-04 Thread Haomai Wang
On Thu, Nov 5, 2015 at 1:19 AM, piotr.da...@ts.fujitsu.com wrote: >> -Original Message- >> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- >> ow...@vger.kernel.org] On Behalf Of ??? >> Sent: Wednesday, November 04, 2015 4:34 PM >> To: Gregory Farnum >> Cc: ceph-devel@vger.kernel

Re: Re: [ceph-users] Understanding the number of TCP connections between clients and OSDs

2015-10-26 Thread Haomai Wang
On Tue, Oct 27, 2015 at 9:12 AM, hzwulibin wrote: > Hi, develops > > I also concerns about this problem. And my problem is how many threads will > the qemu-system-x86 has? > When will it cut down the threads? It's because of the network model: each connection has two threads. We are actually wo

Re: [ceph-users] Write performance issue under rocksdb kvstore

2015-10-20 Thread Haomai Wang
Actually KeyValueStore submits transactions with the sync flag too (relying on the KeyValueDB impl's journal/log file). Yes, if we disable the sync flag, KeyValueStore's performance increases a lot, but we don't provide that option now. On Tue, Oct 20, 2015 at 9:22 PM, Z Zhang wrote: > Thanks, Sage, for
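
For context, a minimal sketch of what the sync flag means at the RocksDB level (the KeyValueDB wrapper inside Ceph differs; key/value contents here are made up): WriteOptions::sync forces the WAL to be fsync'd before the write returns.

    #include <rocksdb/db.h>
    #include <rocksdb/write_batch.h>
    #include <cassert>

    void submit(rocksdb::DB* db, bool sync_flag) {
      rocksdb::WriteBatch batch;
      batch.Put("pglog.key", "value");

      rocksdb::WriteOptions opts;
      opts.sync = sync_flag;  // true: WAL fsync'd before returning (durable, slow on SATA)
                              // false: left to the OS page cache until a later sync
      rocksdb::Status s = db->Write(opts, &batch);
      assert(s.ok());
    }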

Re: [ceph-users] Write performance issue under rocksdb kvstore

2015-10-20 Thread Haomai Wang
On Tue, Oct 20, 2015 at 8:47 PM, Sage Weil wrote: > On Tue, 20 Oct 2015, Z Zhang wrote: >> Hi Guys, >> >> I am trying latest ceph-9.1.0 with rocksdb 4.1 and ceph-9.0.3 with >> rocksdb 3.11 as OSD backend. I use rbd to test performance and following >> is my cluster info. >> >> [ceph@xxx ~]$ ceph -

Re: newstore direction

2015-10-19 Thread Haomai Wang
On Tue, Oct 20, 2015 at 3:49 AM, Sage Weil wrote: > The current design is based on two simple ideas: > > 1) a key/value interface is better way to manage all of our internal > metadata (object metadata, attrs, layout, collection membership, > write-ahead logging, overlay data, etc.) > > 2) a fil

Re: XFS xattr limit and Ceph

2015-10-15 Thread Haomai Wang
XFS has three storage formats for xattrs: inline, btree, and extents. We only want xattrs to be stored inline so they don't need an extra disk hit, so we need to limit the number of xattrs. On Thu, Oct 15, 2015 at 10:54 PM, Somnath Roy wrote: > Sage, > Why we are using XFS max inline xattr value as 10 onl
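
For reference, a hedged ceph.conf sketch of the FileStore knobs being discussed (option names from memory; defaults vary by release and filesystem, and the size value below is only a placeholder): exceeding either limit makes FileStore spill the remaining xattrs into omap instead of the filesystem.

    [osd]
    # keep xattrs few and small so XFS can store them inline in the inode
    filestore_max_inline_xattrs = 10          # per-object count kept as fs xattrs
    filestore_max_inline_xattr_size = 254     # placeholder; bytes per xattr kept inline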

Re: [ceph-users] Potential OSD deadlock?

2015-10-13 Thread Haomai Wang
On Wed, Oct 14, 2015 at 1:03 AM, Sage Weil wrote: > On Mon, 12 Oct 2015, Robert LeBlanc wrote: >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA256 >> >> After a weekend, I'm ready to hit this from a different direction. >> >> I replicated the issue with Firefly so it doesn't seem an issue that >

Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Haomai Wang
gt; > Regards > Somnath > > > -----Original Message- > From: Haomai Wang [mailto:haomaiw...@gmail.com] > Sent: Monday, October 12, 2015 11:35 PM > To: Somnath Roy > Cc: Mark Nelson; ceph-devel; ceph-us...@lists.ceph.com > Subject: Re: [ceph-users] Initial performance cluste

Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Haomai Wang
s out of messenger thread. > > Could you please send out any documentation around Async messenger ? I tried > to google it , but, not even blueprint is popping up. > > > > > > Thanks & Regards > > Somnath > > From: ceph-users [mailto:ceph-users-boun...@lists

Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Haomai Wang
resend On Tue, Oct 13, 2015 at 10:56 AM, Haomai Wang wrote: > COOL > > Interesting that async messenger will consume more memory than simple, in my > mind I always think async should use less memory. I will give a look at this > > On Tue, Oct 13, 2015 at 12:50 AM, Mark Nelso

Re: wip-addr

2015-10-09 Thread Haomai Wang
resend to ML On Sat, Oct 10, 2015 at 11:20 AM, Haomai Wang wrote: > > > On Sat, Oct 10, 2015 at 5:49 AM, Sage Weil wrote: >> >> Hey Marcus, >> >> On Fri, 2 Oct 2015, Marcus Watts wrote: >> > wip-addr >> > >> > 1. where is it? >&g

Re: advice on indexing sequential data?

2015-10-01 Thread Haomai Wang
resend On Thu, Oct 1, 2015 at 7:56 PM, Haomai Wang wrote: > > > On Thu, Oct 1, 2015 at 6:44 PM, Tom Nakamura wrote: >> >> Hello ceph-devel, >> >> My lab is concerned with developing data mining application for >> detecting and 'deanonymizing'

Re: About Fio backend with ObjectStore API

2015-09-12 Thread Haomai Wang
It's really cool. Do you plan to push it upstream? I think it would be more convenient if we made the fio repo a submodule. On Sat, Sep 12, 2015 at 5:04 PM, Haomai Wang wrote: > I found my problem why segment: > > because fio links librbd/librados from my /usr/local/lib but

Re: About Fio backend with ObjectStore API

2015-09-12 Thread Haomai Wang
t; > I just looked back at the results you posted, and saw that you were using > iodepth=1. Setting this higher should help keep the FileStore busy. > > Casey > > - Original Message - >> From: "James (Fei) Liu-SSI" >> To: "Casey Bodley" &

Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3

2015-09-11 Thread Haomai Wang
On Fri, Sep 11, 2015 at 10:09 PM, Sage Weil wrote: > On Fri, 11 Sep 2015, Haomai Wang wrote: >> On Fri, Sep 11, 2015 at 8:56 PM, Sage Weil wrote: >> On Fri, 11 Sep 2015, ?? wrote: >> > Thank Sage Weil: >> > >> > 1. I delete so

Re: Failed on starting osd-daemon after upgrade giant-0.87.1 tohammer-0.94.3

2015-09-11 Thread Haomai Wang
Yesterday I had a chat with wangrui, and the reason is that "infos" (the legacy oid) is missing. I'm not sure why it's missing. PS: resending again because of plain text. On Fri, Sep 11, 2015 at 8:56 PM, Sage Weil wrote: > On Fri, 11 Sep 2015, ?? wrote: >> Thank Sage Weil: >> >> 1. I delete some testing pools

Re: [NewStore]About PGLog Workload With RocksDB

2015-09-08 Thread Haomai Wang
On Tue, Sep 8, 2015 at 10:12 PM, Gregory Farnum wrote: > On Tue, Sep 8, 2015 at 3:06 PM, Haomai Wang wrote: >> Hit "Send" by accident for previous mail. :-( >> >> some points about pglog: >> 1. short-alive but frequency(HIGH) > > Is this really true? Th

Re: [NewStore]About PGLog Workload With RocksDB

2015-09-08 Thread Haomai Wang
other omap keys. 5. a simple loopback impl is efficient and simple On Tue, Sep 8, 2015 at 9:58 PM, Haomai Wang wrote: > Hi Sage, > > I notice your post in rocksdb page about make rocksdb aware of short > alive key/value pairs. > > I think it would be great if one keyvalue db i

[NewStore]About PGLog Workload With RocksDB

2015-09-08 Thread Haomai Wang
Hi Sage, I noticed your post on the RocksDB page about making RocksDB aware of short-lived key/value pairs. I think it would be great if one key/value DB impl could support different key types with different storage behaviors, but it looks difficult for me to add this feature to an existing DB. So co

Re: About Fio backend with ObjectStore API

2015-09-04 Thread Haomai Wang
ing without any problems. I > wonder if there's a problem with the autotools build? I've only tested it > with cmake. When I find some time, I'll rebase it on master and do another > round of testing. > > Casey > > - Original Message - >> From:

Re: wakeup( ) in async messenger‘ event

2015-08-28 Thread Haomai Wang
On Fri, Aug 28, 2015 at 2:35 PM, Jianhui Yuan wrote: > Hi Haomai, > > when we use async messenger, the client(as: ceph -s) always stuck in > WorkerPool::barrier for 30 seconds. It seems the wakeup don't work. What's the ceph version and os version? It should be a bug we already fixed before. > >

Re: async messenger

2015-08-27 Thread Haomai Wang
I ran a random job(http://pulpito.ceph.com/haomai-2015-08-27_04:09:28-rados-master-distro-basic-multi/) and the result seemed ok to me. On Tue, Aug 25, 2015 at 9:22 AM, Haomai Wang wrote: > On Tue, Aug 25, 2015 at 5:28 AM, Sage Weil wrote: >> Hi Haomai, >> >> How did y

Re: support of non-block connect in async messenger?

2015-08-27 Thread Haomai Wang
On Thu, Aug 27, 2015 at 3:47 PM, Jianhui Yuan wrote: > Hi Haomai, > > In my environment, I suffer from long timeout when connect a breakdown node. > So I write some code to support non-block connect in async . And It seems to > be working well. So, I want to know if non-block connect in async may

Re: format 2TB rbd device is too slow

2015-08-26 Thread Haomai Wang
On Wed, Aug 26, 2015 at 11:16 PM, huang jun wrote: > hi,all > we create a 2TB rbd image, after map it to local, > then we format it to xfs with 'mkfs.xfs /dev/rbd0', it spent 318 > seconds to finish, but local physical disk with the same size just > need 6 seconds. > I think librbd has two PR r

Re: async messenger

2015-08-24 Thread Haomai Wang
On Tue, Aug 25, 2015 at 5:28 AM, Sage Weil wrote: > Hi Haomai, > > How did your most recent async messenger run go? > > If there aren't major issues, we'd like to start mixing it in with the > regular rados suite by doing 'ms type = random'... From the last run, we have no async-related failed jobs.

Re: Inline dedup/compression

2015-08-20 Thread Haomai Wang
sorry, should be this blog (http://mysqlserverteam.com/innodb-transparent-page-compression/) On Fri, Aug 21, 2015 at 10:51 AM, Haomai Wang wrote: > I found a > blog(http://mysqlserverteam.com/innodb-transparent-pageio-compression/) > about mysql innodb transparent compression. It

Re: Inline dedup/compression

2015-08-20 Thread Haomai Wang
030| M: +1 408 780 6416 > allen.samu...@sandisk.com > > > -Original Message- > From: Chaitanya Huilgol > Sent: Thursday, July 02, 2015 3:50 AM > To: James (Fei) Liu-SSI; Allen Samuels; Haomai Wang > Cc: ceph-devel > Subject: RE: Inline dedup/compression > &g

Re: Ceph Hackathon: More Memory Allocator Testing

2015-08-20 Thread Haomai Wang
On Thu, Aug 20, 2015 at 2:35 PM, Dałek, Piotr wrote: >> -Original Message- >> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- >> ow...@vger.kernel.org] On Behalf Of Blinick, Stephen L >> Sent: Wednesday, August 19, 2015 6:58 PM >> >> [.. >> Regarding the all-HDD or high density

Re: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Haomai Wang
On Wed, Aug 19, 2015 at 1:36 PM, Somnath Roy wrote: > Mark, > Thanks for verifying this. Nice report ! > Since there is a big difference in memory consumption with jemalloc, I would > say a recovery performance data or client performance data during recovery > would be helpful. > The RSS memory

Re: infernalis feature freeze

2015-08-12 Thread Haomai Wang
I hope this PR can be pushed into I :-) It seems to have been waiting so long. https://github.com/ceph/ceph/pull/3595 On Thu, Aug 13, 2015 at 5:20 AM, Sage Weil wrote: > The infernalis feature freeze is coming up Real Soon Now. I've marked > some of the pull requests on github that I would like to see merged

Re: Async reads, sync writes, op thread model discussion

2015-08-12 Thread Haomai Wang
map > > Thanks & Regards > Somnath > > -Original Message- > From: ceph-devel-ow...@vger.kernel.org > [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Haomai Wang > Sent: Tuesday, August 11, 2015 7:50 PM > To: Yehuda Sadeh-Weinraub > Cc: Samuel Just;

Re: bufferlist allocation optimization ideas

2015-08-11 Thread Haomai Wang
Sure, so we could introduce it to the async messenger. We could create a buffer pool, and the bufferlist API could accept a buffer pool argument to allocate memory from. On Wed, Aug 12, 2015 at 12:43 PM, Dałek, Piotr wrote: >> -Original Message- >> From: Haomai Wang [mailto:haomaiw...@g
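
A toy sketch of the buffer pool idea (not the real bufferlist API; the class and sizes are invented for illustration): the messenger would own a pool per worker and recycle fixed-size chunks instead of asking the allocator for every message.

    #include <cstddef>
    #include <mutex>
    #include <vector>

    class BufferPool {
      std::vector<char*> free_;
      std::mutex lock_;
      const size_t chunk_size_;
    public:
      BufferPool(size_t chunk_size, size_t count) : chunk_size_(chunk_size) {
        for (size_t i = 0; i < count; ++i)
          free_.push_back(new char[chunk_size]);
      }
      ~BufferPool() { for (char* p : free_) delete[] p; }
      char* get() {
        std::lock_guard<std::mutex> l(lock_);
        if (free_.empty())
          return new char[chunk_size_];   // fall back to the heap when exhausted
        char* p = free_.back();
        free_.pop_back();
        return p;
      }
      void put(char* p) {
        std::lock_guard<std::mutex> l(lock_);
        free_.push_back(p);
      }
    };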

Re: bufferlist allocation optimization ideas

2015-08-11 Thread Haomai Wang
On Wed, Aug 12, 2015 at 5:48 AM, Dałek, Piotr wrote: >> -Original Message- >> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- >> ow...@vger.kernel.org] On Behalf Of Sage Weil >> Sent: Tuesday, August 11, 2015 10:11 PM >> >> I went ahead and implemented both of these pieces. See

Re: Async reads, sync writes, op thread model discussion

2015-08-11 Thread Haomai Wang
On Wed, Aug 12, 2015 at 6:34 AM, Yehuda Sadeh-Weinraub wrote: > Already mentioned it on irc, adding to ceph-devel for the sake of > completeness. I did some infrastructure work for rgw and it seems (at > least to me) that it could at least be partially useful here. > Basically it's an async execut

Re: OSD sometimes stuck in init phase

2015-08-06 Thread Haomai Wang
lly on first attempt > and others on subsequent service restarts)! > > [1] - http://paste.openstack.org/show/411161/ > [2] - http://paste.openstack.org/show/411162/ > [3] - http://tracker.ceph.com/issues/9768 > > Regards, > Unmesh G. > IRC: unmeshg > >> -Original Me

Re: OSD sometimes stuck in init phase

2015-08-06 Thread Haomai Wang
.openstack.org/show/411139/ > > Regards, > Unmesh G. > IRC: unmeshg > >> -Original Message- >> From: Haomai Wang [mailto:haomaiw...@gmail.com] >> Sent: Thursday, August 06, 2015 5:31 PM >> To: Gurjar, Unmesh >> Cc: ceph-devel@vger.kernel.org >&g

Re: OSD sometimes stuck in init phase

2015-08-06 Thread Haomai Wang
Could you print all your thread backtraces via "thread apply all bt"? On Thu, Aug 6, 2015 at 7:52 PM, Gurjar, Unmesh wrote: > Hi, > > On a Ceph Firefly cluster (version [1]), OSDs are configured to use separate > data and journal disks (using the ceph-disk utility). It is observed, that > few OSD
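
For anyone following along, one way to capture that (assuming gdb is installed on the OSD host and you attach to the stuck ceph-osd process):

    gdb -p <pid-of-stuck-ceph-osd>
    (gdb) set pagination off
    (gdb) set logging file osd-backtraces.txt
    (gdb) set logging on
    (gdb) thread apply all bt
    (gdb) set logging off
    (gdb) detach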

Re: FileStore should not use syncfs(2)

2015-08-05 Thread Haomai Wang
Agree On Thu, Aug 6, 2015 at 5:38 AM, Somnath Roy wrote: > Thanks Sage for digging down..I was suspecting something similar.. As I > mentioned in today's call, in idle time also syncfs is taking ~60ms. I have > 64 GB of RAM in the system. > The workaround I was talking about today is working p

Re: More ondisk_finisher thread?

2015-08-04 Thread Haomai Wang
It's interesting that ondisk_finisher takes 1ms; could you replay this workload and check from iostat whether there is read IO? I guess that may help find the cause. On Wed, Aug 5, 2015 at 12:13 AM, Somnath Roy wrote: > Yes, it has to re-acquire pg_lock today.. > But, between journal write and

Re: Odd QA Test Running

2015-08-03 Thread Haomai Wang
github.com/ceph/ceph-qa-suite/pull/518) and hope fix this point. On Fri, Jul 31, 2015 at 5:50 PM, Haomai Wang wrote: > Hi all, > > I ran a test > suite(http://pulpito.ceph.com/haomai-2015-07-29_11:40:40-rados-master-distro-basic-multi/) > and found the failed jobs are failed

Odd QA Test Running

2015-07-31 Thread Haomai Wang
Hi all, I ran a test suite(http://pulpito.ceph.com/haomai-2015-07-29_11:40:40-rados-master-distro-basic-multi/) and found the failed jobs are failed by "2015-07-29 10:52:35.313197 7f16ae655780 -1 unrecognized ms_type 'async'" Then I found the failed jobs(like http://pulpito.ceph.com/haomai-2015

Re: Ceph write path optimization

2015-07-28 Thread Haomai Wang
On Wed, Jul 29, 2015 at 5:08 AM, Somnath Roy wrote: > Hi, > Eventually, I have a working prototype and able to gather some performance > comparison data with the changes I was talking about in the last performance > meeting. Mark's suggestion of a write up was long pending, so, trying to > summ

Re: About Fio backend with ObjectStore API

2015-07-22 Thread Haomai Wang
calls get_ioengine(). All I can suggest is that you verify that your job > file is pointing to the correct fio_ceph_objectstore.so. If you've made any > other interesting changes to the job file, could you share it here? > > Casey > > - Original Message - > From:

Re: About Fio backend with ObjectStore API

2015-07-21 Thread Haomai Wang
that the read support doesn't appear to work anymore, so give > "rw=write" a try. And because it does a mkfs(), make sure you're > pointing it to an empty xfs directory with the "directory=" option. > > Casey > > On Tue, Jul 14, 2015 at 2:45 AM, H

Re: About Fio backend with ObjectStore API

2015-07-13 Thread Haomai Wang
is not null. Maybe it's related to dlopen? On Fri, Jul 10, 2015 at 3:51 PM, Haomai Wang wrote: > I have rebased the branch with master, and push it to ceph upstream > repo. https://github.com/ceph/ceph/compare/fio-objectstore?expand=1 > > Plz let me know if who is working on t

Re: About Adding eventfd support for LibRBD

2015-07-13 Thread Haomai Wang
m > http://www.redhat.com > > > - Original Message - >> From: "Haomai Wang" >> To: "Josh Durgin" >> Cc: ceph-devel@vger.kernel.org, "Jason Dillaman" >> Sent: Thursday, July 9, 2015 11:16:14 PM >> Subject: Re: About

Re: Patches for review on keyvaluestore

2015-07-10 Thread Haomai Wang
I suggest we split this PR into the plugin DB impl and the ceph-disk/init-script parts. I think it will then be easier to get merge-ready. On Thu, Jul 9, 2015 at 4:31 PM, Varada Kari wrote: > Hi Sage/Sam/Haomai, > > Sent pull requests for two enhancement for key value store. Can you please > rev

Re: About Fio backend with ObjectStore API

2015-07-10 Thread Haomai Wang
ny help from my side. >> >> >> Regards, >> James >> >> >> >> -Original Message- >> From: Casey Bodley [mailto:cbod...@gmail.com] >> Sent: Thursday, July 09, 2015 12:32 PM >> To: James (Fei) Liu-SSI >> Cc: Haomai Wang; ceph

Re: About Adding eventfd support for LibRBD

2015-07-09 Thread Haomai Wang
; will get -EOPNOTSUPP. On Wed, Jul 8, 2015 at 11:46 AM, Haomai Wang wrote: > On Wed, Jul 8, 2015 at 11:08 AM, Josh Durgin wrote: >> On 07/07/2015 08:18 AM, Haomai Wang wrote: >>> >>> Hi All, >>> >>> Currently librbd support aio_read/write with specif

Re: About Adding eventfd support for LibRBD

2015-07-07 Thread Haomai Wang
On Wed, Jul 8, 2015 at 11:08 AM, Josh Durgin wrote: > On 07/07/2015 08:18 AM, Haomai Wang wrote: >> >> Hi All, >> >> Currently librbd support aio_read/write with specified >> callback(AioCompletion). It would be nice for simple caller logic, but >> it also

About Adding eventfd support for LibRBD

2015-07-07 Thread Haomai Wang
Hi All, Currently librbd supports aio_read/write with a specified callback (AioCompletion). That is nice for simple caller logic, but it also has some problems: 1. Performance bottleneck: creating/freeing an AioCompletion and having the librbd internal finisher thread complete the "callback" isn't a *very lightweight*
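
A rough sketch of the completion path the proposal is aiming at, using only the plain Linux eventfd primitive (this is not an existing librbd interface): the library would write to an eventfd when an aio op finishes, and the caller reaps completions from its own poll loop instead of getting a per-op callback on librbd's finisher thread.

    #include <sys/eventfd.h>
    #include <poll.h>
    #include <unistd.h>
    #include <cstdint>
    #include <cstdio>

    int main() {
      int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
      if (efd < 0) return 1;

      uint64_t one = 1;
      (void)write(efd, &one, sizeof(one));   // stand-in for "one aio op completed"

      struct pollfd pfd = { efd, POLLIN, 0 };
      if (poll(&pfd, 1, 1000) > 0 && (pfd.revents & POLLIN)) {
        uint64_t n = 0;
        (void)read(efd, &n, sizeof(n));      // n = completions since the last read
        std::printf("%llu completion(s) ready to reap in the caller's thread\n",
                    (unsigned long long)n);
      }
      close(efd);
      return 0;
    }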

Re: About Fio backend with ObjectStore API

2015-07-07 Thread Haomai Wang
>>> >>> It would be fantastic if folks decided to work on this and got it pushed >>> upstream into fio proper. :D >>> >>> Mark >>> >>> On 06/30/2015 04:19 PM, James (Fei) Liu-SSI wrote: >>>> >>>> Hi Casey

Re: Transaction struct Op

2015-07-02 Thread Haomai Wang
Yes, some fields are only used for special ops, but a union may increase the complexity of the struct. And the extra memory may not be a problem because the number of "Ops" in one transaction should be within ten. On Thu, Jul 2, 2015 at 10:05 PM, Dałek, Piotr wrote: > Hello, > > In ObjectStore.h we have the following stuc
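
A toy illustration of the trade-off (the field sets below are invented, not the real ObjectStore::Transaction::Op layout): overlapping op-specific fields in a union shrinks each Op, at the cost of every reader having to know which member is live.

    #include <cstdint>
    #include <cstdio>

    struct OpPlain {            // every field present for every op
      uint32_t op;
      uint64_t off, len;        // only meaningful for extent ops
      uint64_t expected_size;   // only meaningful for alloc-hint ops
      uint32_t split_bits;      // only meaningful for collection split
    };

    struct OpUnion {            // op-specific fields overlapped
      uint32_t op;
      union {
        struct { uint64_t off, len; } extent;
        struct { uint64_t expected_size; } alloc_hint;
        struct { uint32_t split_bits; } split;
      } u;
    };

    int main() {
      std::printf("plain: %zu bytes, union: %zu bytes\n",
                  sizeof(OpPlain), sizeof(OpUnion));
      return 0;
    }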

About Fio backend with ObjectStore API

2015-06-30 Thread Haomai Wang
Hi all, Long long ago, didn't someone mention an fio backend using the Ceph ObjectStore API? Then we could use the existing, mature fio facility to benchmark the Ceph ObjectStore. -- Best Regards, Wheat -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to m

Re: Inline dedup/compression

2015-06-29 Thread Haomai Wang
. Let me think > more about it. > > Regards, > James > > -Original Message- > From: Haomai Wang [mailto:haomaiw...@gmail.com] > Sent: Friday, June 26, 2015 8:55 PM > To: James (Fei) Liu-SSI > Cc: ceph-devel > Subject: Re: Inline dedup/compression > > On Sat

Re: Inline dedup/compression

2015-06-26 Thread Haomai Wang
ke Hdevig > and Springpath provide inline dedupe/compression. It is not apple to apple > comparison. But it is good reference. The datacenters need cost effective > solution. > > Regards, > James > > > > -Original Message- > From: Haomai Wang [mailto:haoma

Re: Inline dedup/compression

2015-06-25 Thread Haomai Wang
On Fri, Jun 26, 2015 at 6:01 AM, James (Fei) Liu-SSI wrote: > Hi Cephers, > It is not easy to ask when Ceph is going to support inline > dedup/compression across OSDs in RADOS because it is not easy task and > answered. Ceph is providing replication and EC for performance and failure > reco

Re: [ceph-users] Blueprint Submission Open for CDS Jewel

2015-06-08 Thread Haomai Wang
Hi Patrick, it looks confusing to use this. Do we need to upload a txt file to describe a blueprint instead of editing directly online? On Wed, May 27, 2015 at 5:05 AM, Patrick McGarry wrote: > It's that time again, time to gird up our loins and submit blueprints > for all work slated for the

Re: Looking to improve small I/O performance

2015-06-06 Thread Haomai Wang
We could wait for the next benchmark until this PR (https://github.com/ceph/ceph/pull/4775) is merged. On Sat, Jun 6, 2015 at 11:06 PM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > I found similar results in my testing as well. Ceph is certainly great > at large I/O, b

Re: Looking to improve small I/O performance

2015-06-06 Thread Haomai Wang
On Sat, Jun 6, 2015 at 2:07 PM, Dałek, Piotr wrote: >> -Original Message- >> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- >> >> I'm digging into perf and the code to see here/how I might be able to >> improve performance for small I/O around 16K. >> >> I ran fio with rados an

Re: [RFC] Implement a new journal mode

2015-06-02 Thread Haomai Wang
On Tue, Jun 2, 2015 at 5:28 PM, Li Wang wrote: > I think for scrub, we have a relatively easy way to solve it, > add a field to object metadata with the value being either UNSTABLE > or STABLE, the algorithm is as below, > 1 Mark the object be UNSTABLE > 2 Perform object data write I guess this w

Re: [ceph-users] Memory Allocators and Ceph

2015-05-27 Thread Haomai Wang
On Thu, May 28, 2015 at 1:40 AM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > With all the talk of tcmalloc and jemalloc, I decided to do some > testing og the different memory allocating technologies between KVM > and Ceph. These tests were done a pre-production s

Re: Some thoughts regarding the new store

2015-05-27 Thread Haomai Wang
On Wed, May 27, 2015 at 4:41 PM, Li Wang wrote: > I have just noticed the new store development, and had a > look at the idea behind it (http://www.spinics.net/lists/ceph- > devel/msg22712.html), so my understanding, we wanna avoid the > double-write penalty of WRITE_AHEAD_LOGGING journal mechanis

Re: OSD-Based Object Stubs

2015-05-27 Thread Haomai Wang
I guess it should be something like sam designed in CDS(https://wiki.ceph.com/Planning/Blueprints/Infernalis/osd%3A_Tiering_II_(Warm-%3ECold)) On Wed, May 27, 2015 at 4:39 PM, Marcel Lauhoff wrote: > Hi, > > I wrote a prototype for an OSD-based object stub feature. An object stub > being an objec

[no subject]

2015-05-16 Thread Haomai Wang
ing on X86 works fine, never seen bad crc error. > > > 2015-05-16 17:30 GMT+08:00 Haomai Wang : >> is this always happen or occasionally? >> >> On Sat, May 16, 2015 at 10:10 AM, huang jun wrote: >>> hi,steve >>> >>> 2015-05-15 16:36 GMT+08:00 S

Re: newstore performance update

2015-04-30 Thread Haomai Wang
On Thu, Apr 30, 2015 at 12:38 AM, Sage Weil wrote: > On Wed, 29 Apr 2015, Chen, Xiaoxi wrote: >> Hi Mark, >> Really good test:) I only played a bit on SSD, the parallel WAL >> threads really helps but we still have a long way to go especially on >> all-ssd case. I tried this >> https://githu

Re: async messenger small benchmark result

2015-04-28 Thread Haomai Wang
Not yet; we currently focus only on bug fixes and stability. But I think performance improvements will be picked up soon (May?); the problem is clear, I think. On Wed, Apr 29, 2015 at 2:10 PM, Alexandre DERUMIER wrote: >>>Thanks! So far we've gotten a report that asyncmesseneger was a little >

Re: async messenger small benchmark result

2015-04-28 Thread Haomai Wang
Thanks for your benchmark! Yeah, the async messenger has a bottleneck under high concurrency and IOPS, because of an annoying lock related to CRC calculation. My main job now is getting the async messenger through the QA tests. If no tests fail, I will solve this problem. On Tue, Apr 28,

Re: 回复: Re: NewStore performance analysis

2015-04-21 Thread Haomai Wang
On Tue, Apr 21, 2015 at 2:43 PM, Chen, Xiaoxi wrote: > Hi Sage, > Well, that's > submit_transaction -- submit a transaction , whether block > waiting for fdatasync depends on rocksdb-disable-sync. > submit_transaction_sync -- queue transaction and wait unti

Re: Regarding newstore performance

2015-04-17 Thread Haomai Wang
Mark Nelson [mailto:mnel...@redhat.com] > Sent: Friday, April 17, 2015 8:11 PM > To: Sage Weil > Cc: Somnath Roy; Chen, Xiaoxi; Haomai Wang; ceph-devel > Subject: Re: Regarding newstore performance > > > > On 04/16/2015 07:38 PM, Sage Weil wrote: >> On Thu, 16 Apr 2015

Re: Regarding newstore performance

2015-04-16 Thread Haomai Wang
On Fri, Apr 17, 2015 at 8:38 AM, Sage Weil wrote: > On Thu, 16 Apr 2015, Mark Nelson wrote: >> On 04/16/2015 01:17 AM, Somnath Roy wrote: >> > Here is the data with omap separated to another SSD and after 1000GB of fio >> > writes (same profile).. >> > >> > omap writes: >> > - >> > >>

Re: Regarding newstore performance

2015-04-15 Thread Haomai Wang
On Wed, Apr 15, 2015 at 2:01 PM, Somnath Roy wrote: > Hi Sage/Mark, > I did some WA experiment with newstore with the similar settings I mentioned > yesterday. > > Test: > --- > > 64K Random write with 64 QD and writing total of 1 TB of data. > > > Newstore: > > > Fio output at t

Re: Initial newstore vs filestore results

2015-04-08 Thread Haomai Wang
On Wed, Apr 8, 2015 at 10:58 AM, Sage Weil wrote: > On Tue, 7 Apr 2015, Mark Nelson wrote: >> On 04/07/2015 02:16 PM, Mark Nelson wrote: >> > On 04/07/2015 09:57 AM, Mark Nelson wrote: >> > > Hi Guys, >> > > >> > > I ran some quick tests on Sage's newstore branch. So far given that >> > > this is

Re: Regarding ceph rbd write path

2015-04-06 Thread Haomai Wang
oblem, look forward it. > > Thanks & Regards > Somnath > > -Original Message- > From: Haomai Wang [mailto:haomaiw...@gmail.com] > Sent: Friday, April 03, 2015 9:47 PM > To: Somnath Roy > Cc: ceph-devel@vger.kernel.org > Subject: Re: Regarding ceph rbd write

Re: Regarding ceph rbd write path

2015-04-03 Thread Haomai Wang
On Sat, Apr 4, 2015 at 8:30 AM, Somnath Roy wrote: > In fact, we can probably do it from the OSD side like this. > > 1. A thread in the sharded opWq is taking the ops within a pg by acquiring > lock in the pg_for_processing data structure. > > 2. Now, before taking the job, it can do a bit proces

Re: keyvaluestore speed up?

2015-03-19 Thread Haomai Wang
On Thu, Mar 19, 2015 at 5:22 PM, Xinze Chi wrote: > hi, all: > > Currently at keyvaluestore, osd send sync > request(submit_transaction_sync) to filestore when it finishes a > transaction. But sata disk is not suitable for doing sync request. ssd > disk is more suitable. I think here you mean

Re: crc error when decode_message?

2015-03-16 Thread Haomai Wang
On Mon, Mar 16, 2015 at 10:04 PM, Xinze Chi wrote: > How to process the write request in primary? > > Thanks. > > 2015-03-16 22:01 GMT+08:00 Haomai Wang : >> AFAR Pipe and AsyncConnection both will mark self fault and shutdown >> socket and peer will detect this reset.

Re: crc error when decode_message?

2015-03-16 Thread Haomai Wang
AFAIR, Pipe and AsyncConnection will both mark themselves faulted and shut down the socket, and the peer will detect this reset. So each side has a chance to rebuild the session. On Mon, Mar 16, 2015 at 9:19 PM, Xinze Chi wrote: > Such as, Client send write request to osd.0 (primary), osd.0 send > MOSDSubOp to osd.1 a

Re: About _setattr() optimazation and recovery accelerate

2015-03-08 Thread Haomai Wang
On Mon, Mar 9, 2015 at 1:26 PM, Nicheal wrote: > 2015-03-07 16:43 GMT+08:00 Haomai Wang : >> On Sat, Mar 7, 2015 at 12:03 AM, Sage Weil wrote: >>> Hi! >>> >>> [copying ceph-devel] >>> >>> On Fri, 6 Mar 2015, Nicheal wrote: >>>>

Re: About _setattr() optimazation and recovery accelerate

2015-03-07 Thread Haomai Wang
On Sat, Mar 7, 2015 at 12:03 AM, Sage Weil wrote: > Hi! > > [copying ceph-devel] > > On Fri, 6 Mar 2015, Nicheal wrote: >> Hi Sage, >> >> Cool for issue #3878, Duplicated pg_log write, which is post early in >> my issue #3244 and Single omap_setkeys transaction improve the >> performance in FileSt

Re: FileStore performance: coalescing operations

2015-03-04 Thread Haomai Wang
I think the expected performance improvement can be seen in https://github.com/ceph/ceph/pull/2972, for which I did a simple benchmark comparison. This coalescing should give a smaller improvement, but I think it should still obviously be better. On Thu, Mar 5, 2015 at 8:10 AM, Sage Weil wrote: > On Tue, 3 Mar 2015,
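
A minimal sketch of the kind of coalescing being discussed, assuming consecutive omap_setkeys that target the same object can be merged into one batch before hitting the backend (the types are stand-ins, not the FileStore code):

    #include <map>
    #include <string>
    #include <vector>

    struct OmapSet {                     // stand-in for one OP_OMAP_SETKEYS op
      std::string object;
      std::map<std::string, std::string> keys;
    };

    // Merge runs of omap_setkeys on the same object; the later value for a duplicate
    // key wins, matching what applying the ops one by one would produce.
    std::vector<OmapSet> coalesce(const std::vector<OmapSet>& ops) {
      std::vector<OmapSet> out;
      for (const auto& op : ops) {
        if (!out.empty() && out.back().object == op.object) {
          for (const auto& kv : op.keys)
            out.back().keys[kv.first] = kv.second;
        } else {
          out.push_back(op);
        }
      }
      return out;
    }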

Re: [Manila] Ceph native driver for manila

2015-02-26 Thread Haomai Wang
On Fri, Feb 27, 2015 at 1:19 PM, Sage Weil wrote: > On Fri, 27 Feb 2015, Haomai Wang wrote: >> > Anyway, this leads to a few questions: >> > >> > - Who is interested in using Manila to attach CephFS to guest VMs? >> >> Yeah, actually we are doing this &g

Re: [Manila] Ceph native driver for manila

2015-02-26 Thread Haomai Wang
On Fri, Feb 27, 2015 at 8:01 AM, Sage Weil wrote: > Hi everyone, > > The online Ceph Developer Summit is next week[1] and among other things > we'll be talking about how to support CephFS in Manila. At a high level, > there are basically two paths: > > 1) Ganesha + the CephFS FSAL driver > > - T

Re: FileStore performance: coalescing operations

2015-02-26 Thread Haomai Wang
Hmm, we already observe this duplicate omap key set from pglog operations, and I think we need to resolve it in the upper layer; of course, coalescing omap operations in FileStore is also useful. @Somnath Have you already done this dedup work in KeyValueStore? On Thu, Feb 26, 2015 at 10:28 PM, Andreas

Re: About in_seq, out_seq in Messenger

2015-02-25 Thread Haomai Wang
ssenger, I added a inject-error stress test for lossless_peer_reuse policy, it can reproduce it easily On Wed, Feb 25, 2015 at 2:27 AM, Gregory Farnum wrote: > >> On Feb 24, 2015, at 7:18 AM, Haomai Wang wrote: >> >> On Tue, Feb 24, 2015 at 12:04 AM, Greg Farnum wrote:

Re: About in_seq, out_seq in Messenger

2015-02-24 Thread Haomai Wang
On Tue, Feb 24, 2015 at 12:04 AM, Greg Farnum wrote: > On Feb 12, 2015, at 9:17 PM, Haomai Wang wrote: >> >> On Fri, Feb 13, 2015 at 1:26 AM, Greg Farnum wrote: >>> Sorry for the delayed response. >>> >>>> On Feb 11, 2015, at 3:48 AM, Haomai Wang wr

Re: [ceph-users] Ceph Dumpling/Firefly/Hammer SSD/Memstore performance comparison

2015-02-22 Thread Haomai Wang
I don't have detailed perf numbers for sync IO latency now, but a few days ago I did a single-OSD, single-IO-depth benchmark. In short, per-op latency was Firefly > Dumpling > Hammer. It's great to see Mark's benchmark result! As for PCIe SSDs, I think Ceph can't make full use of them currently with one OSD. W

Re: NewStore update

2015-02-20 Thread Haomai Wang
itive cases and it may trigger lookup operation(get_onode). On Fri, Feb 20, 2015 at 11:00 PM, Sage Weil wrote: > On Fri, 20 Feb 2015, Haomai Wang wrote: >> So cool! >> >> A little notes: >> >> 1. What about sync thread in NewStore? > > My thought right now is t
