Re: Rbd map failure in 3.16.0-55

2015-12-12 Thread Somnath Roy
crc enabled is recommended..we will come back for your help if it is really hurting performance.. Thanks Somnath Sent from my iPhone > On Dec 12, 2015, at 10:56 AM, Ilya Dryomov wrote: > >> On Sat, Dec 12, 2015 at 6:42 PM, Somnath Roy wrote: >> Ilya, >> If we map with

RE: Rbd map failure in 3.16.0-55

2015-12-12 Thread Somnath Roy
Ilya, If we map with 'nocrc' would that help ? Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Ilya Dryomov Sent: Saturday, December 12, 2015 3:12 AM To: Varada Kari Cc: ceph-devel@vger.kernel.org S

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
mnath -Original Message- From: Somnath Roy Sent: Thursday, December 03, 2015 10:16 PM To: 'Adam C. Emerson' Cc: Samuel Just; Casey Bodley; Sage Weil; Samuel Just (sam.j...@inktank.com); ceph-devel@vger.kernel.org Subject: RE: queue_transaction interface + unique_ptr + performance Ad

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
lto:aemer...@redhat.com] Sent: Thursday, December 03, 2015 5:44 PM To: Somnath Roy Cc: Samuel Just; Casey Bodley; Sage Weil; Samuel Just (sam.j...@inktank.com); ceph-devel@vger.kernel.org Subject: Re: queue_transaction interface + unique_ptr + performance On 04/12/2015, Somnath Roy wrote: [snip] > #

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
:sj...@redhat.com] Sent: Thursday, December 03, 2015 3:24 PM To: Casey Bodley Cc: Sage Weil; Somnath Roy; Samuel Just (sam.j...@inktank.com); ceph-devel@vger.kernel.org Subject: Re: queue_transaction interface + unique_ptr + performance From a simplicity point of view, I'd rather just move a

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
03, 2015 9:51 AM To: Somnath Roy Cc: Adam C. Emerson; Sage Weil; Samuel Just (sam.j...@inktank.com); ceph-devel@vger.kernel.org Subject: Re: queue_transaction interface + unique_ptr + performance As far as I know, there are no current users which want to use the Transaction later. You could also

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
afterwards. Thanks & Regards Somnath -Original Message- From: Adam C. Emerson [mailto:aemer...@redhat.com] Sent: Thursday, December 03, 2015 9:25 AM To: Somnath Roy Cc: Sage Weil; Samuel Just (sam.j...@inktank.com); ceph-devel@vger.kernel.org Subject: Re: queue_transaction interfac

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
sage- From: Adam C. Emerson [mailto:aemer...@redhat.com] Sent: Thursday, December 03, 2015 9:17 AM To: Somnath Roy Cc: Casey Bodley; Sage Weil; Samuel Just (sam.j...@inktank.com); ceph-devel@vger.kernel.org Subject: Re: queue_transaction interface + unique_ptr + performance On 03/12/2015, Somnath

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
this mail chain in case you have missed) taking Transaction, any thought of that ? Should we reconsider having two queue_transaction interface ? Thanks & Regards Somnath -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: Thursday, December 03, 2015 3:50 AM To: Som

RE: queue_transaction interface + unique_ptr + performance

2015-12-03 Thread Somnath Roy
I don't think make_shared / make_unique is part of c++11 (and ceph is using that). It is part of c++14 I guess.. Thanks & Regards Somnath -Original Message- From: Casey Bodley [mailto:cbod...@redhat.com] Sent: Thursday, December 03, 2015 7:17 AM To: Sage Weil Cc: Somnath Ro
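
For reference, std::make_shared is part of C++11 while std::make_unique only arrived in C++14; the usual C++11-era workaround was a small hand-rolled shim along these lines (an illustrative sketch, not code from the Ceph tree):

    #include <memory>
    #include <utility>

    // Minimal C++11 stand-in for std::make_unique (single-object form only).
    template <typename T, typename... Args>
    std::unique_ptr<T> make_unique(Args&&... args) {
      return std::unique_ptr<T>(new T(std::forward<Args>(args)...));
    }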

RE: queue_transaction interface + unique_ptr + performance

2015-12-02 Thread Somnath Roy
d for shared_ptr overhead is >2X. Thanks & Regards Somnath -Original Message- From: Somnath Roy Sent: Wednesday, December 02, 2015 7:59 PM To: 'James (Fei) Liu-SSI'; Sage Weil (s...@newdream.net); Samuel Just (sam.j...@inktank.com) Cc: ceph-devel@vger.kernel.org Subject:

RE: queue_transaction interface + unique_ptr + performance

2015-12-02 Thread Somnath Roy
Thanks James for looking into this.. Shared_ptr used heavily in the OSD.cc/Replicated PG path.. Regards Somnath -Original Message- From: James (Fei) Liu-SSI [mailto:james@ssi.samsung.com] Sent: Wednesday, December 02, 2015 7:50 PM To: Somnath Roy; Sage Weil (s...@newdream.net

RE: queue_transaction interface + unique_ptr + performance

2015-12-02 Thread Somnath Roy
tart.tv_usec); printf("micros_used for shared ptr: %d\n",micros_used); std::cout <<"Existing..\n"; return 0; } So, my guess is, the heavy use of these smart pointers in the Ceph IO path is bringing iops/core down substantially. My suggestion is *not
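
A minimal reconstruction of the kind of gettimeofday()-based micro-benchmark the truncated snippet ends with (my own sketch for illustration, not the original test program): it times a burst of raw-pointer allocations against the same burst through std::shared_ptr, which adds a control block and atomic reference counting per object.

    #include <cstdio>
    #include <memory>
    #include <sys/time.h>

    static long elapsed_usec(const timeval &a, const timeval &b) {
      return (b.tv_sec - a.tv_sec) * 1000000L + (b.tv_usec - a.tv_usec);
    }

    int main() {
      const int N = 1000000;
      timeval start, end;

      gettimeofday(&start, NULL);
      for (int i = 0; i < N; i++) {
        int *p = new int(i);      // raw allocation as the baseline
        delete p;
      }
      gettimeofday(&end, NULL);
      printf("micros_used for raw ptr: %ld\n", elapsed_usec(start, end));

      gettimeofday(&start, NULL);
      for (int i = 0; i < N; i++) {
        std::shared_ptr<int> sp = std::make_shared<int>(i);  // refcounted allocation
      }
      gettimeofday(&end, NULL);
      printf("micros_used for shared ptr: %ld\n", elapsed_usec(start, end));
      return 0;
    }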

queue_transaction interface + unique_ptr

2015-12-02 Thread Somnath Roy
Hi Sage/Sam, As discussed in today's performance meeting , I am planning to change the queue_transactions() interface to the following. int queue_transactions(Sequencer *osr, list& tls, Context *onreadable, Context *ondisk=0, Context *onreadable
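
The proposed signature is cut off in the preview; as a self-contained toy (names here are illustrative, not the real ObjectStore API), the ownership question behind this thread looks like the following: queueing a Transaction through a unique_ptr hands it over with a single pointer move and no reference counting, at the cost of the caller losing access to it afterwards.

    #include <list>
    #include <memory>
    #include <utility>

    struct Transaction { /* payload elided */ };

    struct ToyStore {
      std::list<std::unique_ptr<Transaction>> queued;
      int queue_transaction(std::unique_ptr<Transaction> t) {
        queued.push_back(std::move(t));   // one pointer move, no atomic refcount traffic
        return 0;
      }
    };

    int main() {
      ToyStore store;
      std::unique_ptr<Transaction> t(new Transaction);
      store.queue_transaction(std::move(t));
      // t is now empty; the Transaction is owned by the store until it completes.
      return 0;
    }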

Write path changes

2015-11-20 Thread Somnath Roy
Hi Sage, FYI, I have sent out a new PR addressing your earlier comments and some more enhancement. Here it is.. https://github.com/ceph/ceph/pull/6670 Did some exhaustive comparison with ceph latest master code base and found up to 32 OSDs (4 OSD nodes , one per 8TB SAS SSD) , my changes are gi

RE: Regarding op_t, local_t

2015-11-18 Thread Somnath Roy
well , reducing one transaction would help there. Thanks & Regards Somnath -Original Message- From: 池信泽 [mailto:xmdx...@gmail.com] Sent: Wednesday, November 18, 2015 6:00 PM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Regarding op_t, local_t Good catch. I think it doe

Regarding op_t, local_t

2015-11-18 Thread Somnath Roy
Hi Sage, I saw we are now having single transaction in submit_transaction. But, in the replication path we are still having two transaction, can't we merge it to one there ? Thanks & Regards Somnath -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message

Regarding op_t, local_t

2015-11-18 Thread Somnath Roy
Hi Sage, I saw we are now having single transaction in submit_transaction. But, in the replication path we are still having two transaction, can't we merge it to one there ? Thanks & Regards Somnath -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message t

test

2015-11-11 Thread Somnath Roy
Sorry for the spam , having some issues with devl -- To unsubscribe from this list: send the line "unsubscribe ceph-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

RE: Increasing # Shards vs multi-OSDs per device

2015-11-11 Thread Somnath Roy
mber 11, 2015 12:57 PM To: ceph-devel@vger.kernel.org; Mark Nelson; Samuel Just; Kyle Bader; Somnath Roy Subject: Increasing # Shards vs multi-OSDs per device Sorry about the microphone issues in the performance meeting today. This is a followup to the 11/4 performance meeting where we di

RE: why we use two ObjectStore::Transaction in ReplicatedBackend::submit_transaction?

2015-11-01 Thread Somnath Roy
Huh..It seems the op_t is already copied in generate_subop() -> ::encode(*op_t, wr->get_data());...So, this shouldn't be an issue.. -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy Sent: Sunday,

RE: why we use two ObjectStore::Transaction in ReplicatedBackend::submit_transaction?

2015-11-01 Thread Somnath Roy
Sage, Is it possible that we can't reuse the op_t because it could be still there in the messenger queue before calling parent->log_operation() ? Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sa

RE: why we use two ObjectStore::Transaction in ReplicatedBackend::submit_transaction?

2015-10-31 Thread Somnath Roy
BTW, latest code base is already separating out 2 transaction. No more append call.. Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Ning Yao Sent: Saturday, October 31, 2015 8:35 AM To: Sage Weil

RE: why we should use two Mutex in OSD ShardData?

2015-10-30 Thread Somnath Roy
Sent: Friday, October 30, 2015 8:38 AM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: why we should use two Mutex in OSD ShardData? I do not see any improvement by moving to a single mutex. I just feel puzzled about why we use two mutexes. But I also do not see any improvement using two mutex i

RE: why we should use two Mutex in OSD ShardData?

2015-10-30 Thread Somnath Roy
Hi xinze, This is mainly for reducing lock contention on a single mutex. Conditional wakeup on a mutex is expensive and that's why we wanted to make it separate from the mutex protecting Sharddata priority queue and pg_for_processing map. Are you seeing any improvement by moving to single mutex ?
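
A simplified sketch of the split this reply describes (loosely modeled on the OSD's ShardData members, not the actual code): one mutex exists only for the condition-variable wait/wakeup, while a second mutex protects the work queue and per-PG map, so waking a worker does not contend with producers touching the queue.

    #include <condition_variable>
    #include <deque>
    #include <mutex>

    struct ShardDataSketch {
      std::mutex sdata_lock;               // guards only the cond-var wait/notify
      std::condition_variable sdata_cond;

      std::mutex sdata_op_ordering_lock;   // guards the work queue / pg map
      std::deque<int> pqueue;              // stand-in for the priority queue

      void queue(int item) {
        {
          std::lock_guard<std::mutex> q(sdata_op_ordering_lock);
          pqueue.push_back(item);
        }
        std::lock_guard<std::mutex> w(sdata_lock);
        sdata_cond.notify_one();           // wake a worker without holding the queue lock
      }
    };

    int main() {
      ShardDataSketch sd;
      sd.queue(42);
      return 0;
    }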

RE: Lock contention in do_rule

2015-10-24 Thread Somnath Roy
Thanks Sage, I will test with this patch.. Regards Somnath -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: Saturday, October 24, 2015 3:04 PM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: RE: Lock contention in do_rule On Sat, 24 Oct 2015, Somnath Roy

RE: Lock contention in do_rule

2015-10-23 Thread Somnath Roy
l-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Somnath Roy Sent: Friday, October 23, 2015 7:02 PM To: Sage Weil Cc: ceph-devel@vger.kernel.org Subject: RE: Lock contention in do_rule Thanks for the clarification Sage.. I don't have much knowledge on this part ,

RE: Lock contention in do_rule

2015-10-23 Thread Somnath Roy
r thread ? Thanks & Regards Somnath -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: Friday, October 23, 2015 6:10 PM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Lock contention in do_rule On Sat, 24 Oct 2015, Somnath Roy wrote: > Hi Sage, >

Lock contention in do_rule

2015-10-23 Thread Somnath Roy
Hi Sage, We are seeing the following mapper_lock is heavily contended and commenting out this lock is improving performance ~10 % (in the short circuit path). This is called for every io from osd_is_valid_op_target(). I looked into the code, but couldn't understand the purpose of the lock , it s

RE: newstore direction

2015-10-19 Thread Somnath Roy
Sage, I fully support that. If we want to saturate SSDs , we need to get rid of this filesystem overhead (which I am in process of measuring). Also, it will be good if we can eliminate the dependency on the k/v dbs (for storing allocators and all). The reason is the unknown write amps they cause

Re: XFS xattr limit and Ceph

2015-10-15 Thread Somnath Roy
xattrs > >> On Thu, Oct 15, 2015 at 10:54 PM, Somnath Roy >> wrote: >> Sage, >> Why we are using XFS max inline xattr value as 10 only ? >> >> OPTION(filestore_max_inline_xattrs_xfs, OPT_U32, 10) >> >> XFS is supporting 1k limit I guess. Is

XFS xattr limit and Ceph

2015-10-15 Thread Somnath Roy
Sage, Why we are using XFS max inline xattr value as 10 only ? OPTION(filestore_max_inline_xattrs_xfs, OPT_U32, 10) XFS is supporting 1k limit I guess. Is there any performance reason behind that ? Thanks & Regards Somnath

RE: throttles

2015-10-13 Thread Somnath Roy
BTW, you can completely turn off these throttles ( other than the filestore throttle ) by setting the value to 0. Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Deneau, Tom Sent: Tuesday, October

RE: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Somnath Roy
aomai Wang [mailto:haomaiw...@gmail.com] Sent: Monday, October 12, 2015 11:35 PM To: Somnath Roy Cc: Mark Nelson; ceph-devel; ceph-us...@lists.ceph.com Subject: Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results On Tue, Oct 13, 2015 at 12:18 PM, Somnath Roy

RE: throttles

2015-10-12 Thread Somnath Roy

RE: perf counters from a performance discrepancy

2015-10-08 Thread Somnath Roy
If I remember correctly, Nick faced similar issue and we debugged down to the xattr access issue in the find_object_context(). I am not sure if it is resolved though for him or not. Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow

Pull request for FileStore write path optimization

2015-09-29 Thread Somnath Roy
Hi Mark, I have sent out the following pull request for my write path changes. https://github.com/ceph/ceph/pull/6112 Meanwhile, if you want to give it a spin to your SSD cluster , take the following branch. https://github.com/somnathr/ceph/tree/wip-write-path-optimization 1. Please use the fo

RE: Very slow recovery/peering with latest master

2015-09-28 Thread Somnath Roy
nput/output errors on accessing the drives which are not reserved for this host. This is an inefficiency part of blkid* calls (?) since calls like fdisk/lsscsi are not taking time. Regards Somnath -Original Message- From: Chen, Xiaoxi [mailto:xiaoxi.c...@intel.com] Sent: Monday, September 28

RE: Very slow recovery/peering with latest master

2015-09-24 Thread Somnath Roy
86_64/clone.S:111 Strace was not helpful much since other threads are not block and keep printing the futex traces.. Thanks & Regards Somnath -Original Message- From: Podoski, Igor [mailto:igor.podo...@ts.fujitsu.com] Sent: Wednesday, September 23, 2015 11:33 PM To: Somnath Roy C

Copyright header

2015-09-23 Thread Somnath Roy
Hi Sage, In the latest master, I am seeing a new Copyright header entry for HP in the file Filestore.cc. Is this incidental ? * Copyright (c) 2015 Hewlett-Packard Development Company, L.P. Thanks & Regards Somnath

RE: Very slow recovery/peering with latest master

2015-09-23 Thread Somnath Roy
<mailto:joseph.t.hand...@hpe.com] Sent: Wednesday, September 23, 2015 4:20 PM To: Samuel Just Cc: Somnath Roy; Samuel Just (sam.j...@inktank.com); Sage Weil (s...@newdream.net); ceph-devel Subject: Re: Very slow recovery/peering with latest master I added that, there is code up the stack

RE: Very slow recovery/peering with latest master

2015-09-23 Thread Somnath Roy
Sent: Wednesday, September 23, 2015 4:07 PM To: Somnath Roy Cc: Samuel Just (sam.j...@inktank.com); Sage Weil (s...@newdream.net); ceph-devel Subject: Re: Very slow recovery/peering with latest master Wow. Why would that take so long? I think you are correct that it's only used for metadata, we c

RE: Very slow recovery/peering with latest master

2015-09-23 Thread Somnath Roy
wip-write-path-optimization/src# lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description:Ubuntu 14.04.2 LTS Release:14.04 Codename: trusty Thanks & Regards Somnath -Original Message- From: Somnath Roy Sent: Wednesday, September 16, 2015 2:2

RE: Very slow recovery/peering with latest master

2015-09-16 Thread Somnath Roy
communication getting slower ? Let me know if more verbose logging is required and how should I share the log.. Thanks & Regards Somnath -Original Message- From: Gregory Farnum [mailto:gfar...@redhat.com] Sent: Wednesday, September 16, 2015 11:35 AM To: Somnath Roy Cc: ceph-devel Subje

Very slow recovery/peering with latest master

2015-09-15 Thread Somnath Roy
Hi, I am seeing very slow recovery when I am adding OSDs with the latest master. Also, If I just restart all the OSDs (no IO is going on in the cluster) , cluster is taking a significant amount of time to reach in active+clean state (and even detecting all the up OSDs). I saw the recovery/backfi

RE: Question about big EC pool.

2015-09-13 Thread Somnath Roy
<mailto:mike.almat...@gmail.com] Sent: Sunday, September 13, 2015 10:39 AM To: Somnath Roy; ceph-devel Subject: Re: Question about big EC pool. 13-Sep-15 01:12, Somnath Roy пишет: > 12-Sep-15 19:34, Somnath Roy пишет: >> >I don't think there is any limit from Ceph side.. >

RE: Question about big EC pool.

2015-09-12 Thread Somnath Roy
<mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Mike Almateia Sent: Saturday, September 12, 2015 12:13 PM To: ceph-devel Subject: Re: Question about big EC pool. 12-Sep-15 19:34, Somnath Roy пишет: > I don't think there is any limit from Ceph side.. > We are testing with ~768

RE: Question about big EC pool.

2015-09-12 Thread Somnath Roy
I don't think there is any limit from Ceph side.. We are testing with ~768 TB deployment with 4:2 EC on Flash and it is working well so far.. Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Mike Al

RE: Regarding journal replay

2015-09-10 Thread Somnath Roy
Yeah, thanks Sage for confirming this. Regards Somnath -Original Message- From: Sage Weil [mailto:sw...@redhat.com] Sent: Thursday, September 10, 2015 3:04 PM To: Somnath Roy Cc: ceph-devel Subject: Re: Regarding journal replay On Thu, 10 Sep 2015, Somnath Roy wrote: > Sage et.

Regarding journal replay

2015-09-10 Thread Somnath Roy
Sage et. al, Could you please let me know what will happen during journal replay in this scenario ? 1. Say last committed seq is 3 and after that one more independent transaction with say 4 came. Transaction seq 4, has say delete xattr, delete object, create a new object, set xattr 2. Seq 4 i

RE: Ceph Write Path Improvement

2015-09-09 Thread Somnath Roy
For mixed workload it is with QD = 8 and num_job= 1 and 10. Thanks & Regards Somnath -Original Message- From: Blinick, Stephen L [mailto:stephen.l.blin...@intel.com] Sent: Thursday, September 03, 2015 1:02 PM To: Somnath Roy Cc: ceph-devel Subject: RE: Ceph Write Path Improvement

RE: Ceph Write Path Improvement

2015-09-03 Thread Somnath Roy
Stephen L [mailto:stephen.l.blin...@intel.com] Sent: Thursday, September 03, 2015 1:02 PM To: Somnath Roy Cc: ceph-devel Subject: RE: Ceph Write Path Improvement Somnath -- thanks for publishing all the data, will be great to look at it offline. I didn't find this info: How many RBD volumes, and

RE: Ceph Write Path Improvement

2015-09-03 Thread Somnath Roy
data with that config. That's why I have introduced a new throttling scheme that should benefit in all the scenarios. Thanks & Regards Somnath -Original Message- From: Mark Nelson [mailto:mnel...@redhat.com] Sent: Thursday, September 03, 2015 9:42 AM To: Robert LeBlanc; Somna

Ceph Write Path Improvement

2015-09-02 Thread Somnath Roy
Hi, Here is the link of the document I presented in today's performance meeting. https://docs.google.com/presentation/d/1lCoLpFRjD8t_YCeHyWDV7ddv7ZkwfETgyjUzXw0-ttU/edit?usp=sharing It has the benchmark result of the filestore changes I proposed earlier for the ceph write path optimization. Than

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-23 Thread Somnath Roy
please use the patch to verify this ? Did you build fio/rados bench also with tcmalloc/jemalloc ? If not, how/why it is improving ? Thanks & Regards Somnath -Original Message- From: Alexandre DERUMIER [mailto:aderum...@odiso.com] Sent: Sunday, August 23, 2015 6:13 AM To: Somnath Ro

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-22 Thread Somnath Roy
[mailto:aderum...@odiso.com] Sent: Saturday, August 22, 2015 9:57 AM To: Somnath Roy Cc: Sage Weil; Milosz Tanski; Shishir Gowda; Stefan Priebe; Mark Nelson; ceph-devel Subject: Re: Ceph Hackathon: More Memory Allocator Testing >>Wanted to know is there any reason we didn't link clien

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-22 Thread Somnath Roy
& Regards Somnath -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: Saturday, August 22, 2015 6:56 AM To: Milosz Tanski Cc: Shishir Gowda; Somnath Roy; Stefan Priebe; Alexandre DERUMIER; Mark Nelson; ceph-devel Subject: Re: Ceph Hackathon: More Memory Allocator Testing

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Somnath Roy
Yeah , I can see ceph-osd/ceph-mon built with jemalloc. Thanks & Regards Somnath -Original Message- From: Stefan Priebe [mailto:s.pri...@profihost.ag] Sent: Wednesday, August 19, 2015 1:41 PM To: Somnath Roy; Alexandre DERUMIER; Mark Nelson Cc: ceph-devel Subject: Re: Ceph Hacka

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Somnath Roy
: Wednesday, August 19, 2015 1:31 PM To: Somnath Roy; Alexandre DERUMIER; Mark Nelson Cc: ceph-devel Subject: Re: Ceph Hackathon: More Memory Allocator Testing On 19.08.2015 at 22:29, Somnath Roy wrote: > Hmm...We need to fix that as part of configure/Makefile I guess (?).. > Since we hav

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Somnath Roy
build environment to get this done How do I do that ? I am using Ubuntu and can't afford to remove libc* packages. Thanks & Regards Somnath -Original Message- From: Stefan Priebe [mailto:s.pri...@profihost.ag] Sent: Wednesday, August 19, 2015 1:18 PM To: Somnath Roy; Alexandre DERUMIER;

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Somnath Roy
Alexandre, I am not able to build librados/librbd by using the following config option. ./configure --without-tcmalloc --with-jemalloc It seems it is building osd/mon/Mds/RGW with jemalloc enabled.. root@emsnode10:~/ceph-latest/src# ldd ./ceph-osd linux-vdso.so.1 => (0x7ffd0eb43000)

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Somnath Roy
DERUMIER [mailto:aderum...@odiso.com] Sent: Wednesday, August 19, 2015 9:55 AM To: Somnath Roy Cc: Mark Nelson; ceph-devel Subject: Re: Ceph Hackathon: More Memory Allocator Testing << I think that tcmalloc have a fixed size (TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES), and share it between all process.

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-19 Thread Somnath Roy
<< I think that tcmalloc have a fixed size (TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES), and share it between all process. I think it is per tcmalloc instance loaded , so, at least with num_osds * num_tcmalloc_instance * TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES in a box. Also, I think there is no point

RE: Ceph Hackathon: More Memory Allocator Testing

2015-08-18 Thread Somnath Roy
Mark, Thanks for verifying this. Nice report ! Since there is a big difference in memory consumption with jemalloc, I would say a recovery performance data or client performance data during recovery would be helpful. Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vge

RE: Async reads, sync writes, op thread model discussion

2015-08-11 Thread Somnath Roy
Haomai, Yes, one of the goals is to make async read xattr.. IMO, this scheme should benefit in the following scenario.. Ops within a PG will not be serialized any more as long as it is not coming on the same object and this could be a big win. In our workload at least we are not seeing the shar

RE: FileStore should not use syncfs(2)

2015-08-05 Thread Somnath Roy
s. As we discussed earlier, in case of only fsync approach, we still need to do a db sync to make sure the leveldb stuff persisted, right ? Thanks & Regards Somnath -Original Message- From: Sage Weil [mailto:sw...@redhat.com] Sent: Wednesday, August 05, 2015 2:27 PM To: Somnath Roy Cc

RE: More ondisk_finisher thread?

2015-08-04 Thread Somnath Roy
Yes, it has to re-acquire pg_lock today.. But, between journal write and initiating the ondisk ack, there is one context switch in the code path. So, I guess the pg_lock is not the only one that is causing this 1 ms delay... Not sure increasing the finisher threads will help in the pg_lock case

RE: Ceph write path optimization

2015-07-29 Thread Somnath Roy
similar to existing one today). Thanks & Regards Somnath -Original Message- From: Shu, Xinxin [mailto:xinxin@intel.com] Sent: Wednesday, July 29, 2015 12:50 AM To: Somnath Roy; ceph-devel@vger.kernel.org Subject: RE: Ceph write path optimization Hi Somnath, any performance data

RE: Ceph write path optimization

2015-07-29 Thread Somnath Roy
M To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Ceph write path optimization Hi Somnath, A few comments! The throttling changes you've made sound like they are a big improvement. I'm a little worried about the op_seq change, as I remember that being quite fragile, but if

RE: Ceph write path optimization

2015-07-29 Thread Somnath Roy
<mailto:h...@infradead.org] Sent: Tuesday, July 28, 2015 11:57 PM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Ceph write path optimization On Tue, Jul 28, 2015 at 09:08:27PM +0000, Somnath Roy wrote: > 2. Each filestore Op threads is now doing O_DSYNC write follo

RE: Ceph write path optimization

2015-07-28 Thread Somnath Roy
Haomai, <mailto:haomaiw...@gmail.com] Sent: Tuesday, July 28, 2015 7:18 PM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Ceph write path optimization On Wed, Jul 29, 2015 at 5:08 AM, Somnath Roy wrote: > Hi, > Eventually, I have a working prototype and able to ga

RE: Ceph write path optimization

2015-07-28 Thread Somnath Roy
Hi Lukas, According to (http://linux.die.net/man/8/mkfs.xfs) lazy-count is by default set to 1 not 0 with newer kernel. I am using 3.16.0-41-generic, so, should be fine. Thanks & Regards Somnath -Original Message- From: Somnath Roy Sent: Tuesday, July 28, 2015 3:04 PM To:

RE: Ceph write path optimization

2015-07-28 Thread Somnath Roy
sage- From: mr.e...@gmail.com [mailto:mr.e...@gmail.com] On Behalf Of Lukasz Redynk Sent: Tuesday, July 28, 2015 2:46 PM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Ceph write path optimization Hi, Have you tried to tune XFS mkfs options? From mkfs.xfs(8) a) (log section, -l) la

Ceph write path optimization

2015-07-28 Thread Somnath Roy
Hi, Eventually, I have a working prototype and able to gather some performance comparison data with the changes I was talking about in the last performance meeting. Mark's suggestion of a write up was long pending, so, trying to summarize what I am trying to do. Objective: --- 1. Is to

RE: Probable memory leak in Hammer write path ?

2015-07-01 Thread Somnath Roy
allocator and moving to jemalloc. Thanks Greg for asking me to relook at tcmalloc otherwise I was kind of out of option :-).. Regards Somnath -Original Message- From: Somnath Roy Sent: Wednesday, July 01, 2015 4:58 PM To: 'Gregory Farnum' Cc: ceph-devel@vger.kernel.org Subject: RE

RE: Probable memory leak in Hammer write path ?

2015-07-01 Thread Somnath Roy
Thanks Greg! Yeah, I will double check..But, I built the code without tcmalloc (with glibc) and it was also showing the similar behavior. Thanks & Regards Somnath -Original Message- From: Gregory Farnum [mailto:g...@gregs42.com] Sent: Wednesday, July 01, 2015 9:07 AM To: Somnath Ro

RE: Probable memory leak in Hammer write path ?

2015-06-29 Thread Somnath Roy
oing is to install ceph from ceph.com and see the behavior. Thanks & Regards Somnath -Original Message- From: Gregory Farnum [mailto:g...@gregs42.com] Sent: Monday, June 29, 2015 3:53 AM To: Somnath Roy Cc: ceph-devel@vger.kernel.org Subject: Re: Probable memory leak in Hammer wri

RE: CRC32 of messages

2015-06-29 Thread Somnath Roy
28, 2015 11:27 PM To: ceph-devel@vger.kernel.org Subject: RE: CRC32 of messages > -Original Message- > From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel- > ow...@vger.kernel.org] On Behalf Of Somnath Roy > Sent: Friday, June 26, 2015 7:52 PM > > ceph_crc32c_inte

RE: Probable memory leak in Hammer write path ?

2015-06-28 Thread Somnath Roy
Some more data point.. 1. I am not seeing this in 3.13.0-24-generic 2. Seeing this in 3.16.0-23-generic , 3.19.0-21-generic Could this be related to gcc 4.9.* ? Thanks & Regards Somnath -Original Message- From: Somnath Roy Sent: Saturday, June 27, 2015 5:57 PM To: ceph-d

Probable memory leak in Hammer write path ?

2015-06-27 Thread Somnath Roy
Hi, I am chasing a substantial memory leak in latest Hammer code base in the write path since yesterday and wanted to know if anybody else is also observing this or not. This is as simple as running a fio-rbd random_write workload in my single OSD server with say block size 16K and num_jobs = 8.

RE: CRC32 of messages

2015-06-26 Thread Somnath Roy
ceph_crc32c_intel_fast is ~6 times faster than ceph_crc32c_sctp. If you are not using intel cpus or you have older intel cpus where this sse4 instruction sets are not enabled , the performance will be badly impacted as you saw. If you are building ceph yourself, make sure you have 'yasm' install

RE: [ceph-users] xattrs vs. omap with radosgw

2015-06-16 Thread Somnath Roy
Guang, Try to play around with the following conf attributes specially filestore_max_inline_xattr_size and filestore_max_inline_xattrs // Use omap for xattrs for attrs over // filestore_max_inline_xattr_size or OPTION(filestore_max_inline_xattr_size, OPT_U32, 0) //Override OPTION(filestore_ma
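
If experimenting with these, the overrides go in ceph.conf; a sketch of the shape (the [osd] placement and the numbers are assumptions for illustration, not recommendations):

    [osd]
        filestore_max_inline_xattr_size = 65536
        filestore_max_inline_xattrs = 10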

RE: Rados multi-object transaction use cases

2015-06-12 Thread Somnath Roy
Also, wouldn't this help in case of some kind of write coalescing for librbd/librados and sending one transaction down in case of multiple ? Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Yehuda

RE: Regarding hadoop over RGW blueprint

2015-06-10 Thread Somnath Roy
Thanks Yuan ! This is helpful. Regards Somnath -Original Message- From: Zhou, Yuan [mailto:yuan.z...@intel.com] Sent: Wednesday, June 10, 2015 8:44 PM To: Somnath Roy; Zhang, Jian; ceph-devel Subject: RE: Regarding hadoop over RGW blueprint Hi Somnath, The background was a bit

RE: Regarding hadoop over RGW blueprint

2015-06-10 Thread Somnath Roy
Hadoop + S3 + RGWProxy + RGW ? Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Zhang, Jian Sent: Wednesday, June 10, 2015 7:06 PM To: Somnath Roy; ceph-devel Cc: Zhang, Jian Subject: RE: Regarding hadoop over RGW

Regarding hadoop over RGW blueprint

2015-06-10 Thread Somnath Roy
Hi Yuan/Jian I was going through your following blueprint. http://tracker.ceph.com/projects/ceph/wiki/Hadoop_over_Ceph_RGW_status_update This is very interesting. I have some query though. 1. Did you guys benchmark RGW + S3 interface integrated with Hadoop. This should work as is today. Are yo

RE: rbd_cache, limiting read on high iops around 40k

2015-06-10 Thread Somnath Roy
Hi Alexandre, Thanks for sharing the data. I need to try out the performance on qemu soon and may come back to you if I need some qemu setting trick :-) Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER Sent: T

RE: Looking to improve small I/O performance

2015-06-07 Thread Somnath Roy
06, 2015 11:06 PM To: Somnath Roy Cc: Dałek, Piotr; ceph-devel Subject: Re: Looking to improve small I/O performance -BEGIN PGP SIGNED MESSAGE- Hash: SHA256 This is the test that we are running that simulates the workload size and ratios of our typical servers. Of course we are not doing

RE: Looking to improve small I/O performance

2015-06-06 Thread Somnath Roy
Robert, You can try the following config option to enable the async messenger. ms_type = async enable_experimental_unrecoverable_data_corrupting_features = ms-type-async BTW, what kind of workload you are trying , random read or write ? Also, is this SSD or HDD cluster ? Thanks & Regards Somnath
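
As a ceph.conf fragment (the two option lines are the ones quoted above; putting them under [global] is an assumption):

    [global]
        ms_type = async
        enable_experimental_unrecoverable_data_corrupting_features = ms-type-async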

RE: Discuss: New default recovery config settings

2015-05-29 Thread Somnath Roy
Sam, We are seeing some good client IO results during recovery by using the following values.. osd recovery max active = 1 osd max backfills = 1 osd recovery threads = 1 osd recovery op priority = 1 It is all flash though. The recovery time in case of entire node (~120 TB) failure/a single dri
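
Collected as a ceph.conf fragment for convenience (values as quoted above; the [osd] placement is assumed):

    [osd]
        osd recovery max active = 1
        osd max backfills = 1
        osd recovery threads = 1
        osd recovery op priority = 1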

RE: Perfomance CPU and IOPS

2015-05-26 Thread Somnath Roy
Ceph journal in case of filestore is providing transactional writes, so, you can't remove journal. Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Casier David Sent: Sunday, May 24, 2015 4:22 AM To

RE: CephFS + Erasure coding

2015-05-05 Thread Somnath Roy
Thanks Wang ! But, is this supported right now or coming with object stub implementation in Infernalis ? Regards Somnath -Original Message- From: Wang, Zhiqiang [mailto:zhiqiang.w...@intel.com] Sent: Tuesday, May 05, 2015 7:42 PM To: Somnath Roy; Gregory Farnum Cc: ceph-devel Subject

RE: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-27 Thread Somnath Roy
tely. Thanks & Regards Somnath -Original Message- From: Mark Nelson [mailto:mnel...@redhat.com] Sent: Monday, April 27, 2015 10:42 AM To: Somnath Roy; Alexandre DERUMIER Cc: ceph-users; ceph-devel; Milosz Tanski Subject: Re: [ceph-users] strange benchmark problem : restarting osd daemon im

RE: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-27 Thread Somnath Roy
eedback from folks trying it out, so please >>>> feel free to give it a shot. :D >> >> Some feedback, I have runned bench all the night, no speed regression. >> >> And I have a speed increase with fio with more jobs. (with tcmalloc, >> it seem to be the re

RE: Hitting tcmalloc bug even with patch applied

2015-04-27 Thread Somnath Roy
to resolve the traces. Thanks & Regards Somnath -Original Message- From: Milosz Tanski [mailto:mil...@adfin.com] Sent: Monday, April 27, 2015 7:53 AM To: Alexandre DERUMIER; ceph-devel; Somnath Roy Subject: Re: Hitting tcmalloc bug even with patch applied On 4/27/15 9:21 AM, Alexa

RE: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-04-23 Thread Somnath Roy
Alexandre, You can configure with --with-jemalloc or ./do_autogen -J to build ceph with jemalloc. Thanks & Regards Somnath -Original Message- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Alexandre DERUMIER Sent: Thursday, April 23, 2015 4:56 AM To: Mark Nelso

RE: cluster io rate is not matching with rbd client IO rate

2015-04-19 Thread Somnath Roy
Hi, Could you try with rbd_cache = false in ceph.conf (global section) and see what is the behavior ? Thanks & Regards Somnath -Original Message- From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Srinivasula Maram Sent: Friday, April 17, 2015

RE: tcmalloc issue

2015-04-16 Thread Somnath Roy
Thanks James ! We will try this out. Regards Somnath -Original Message- From: James Page [mailto:james.p...@ubuntu.com] Sent: Thursday, April 16, 2015 4:48 AM To: Chaitanya Huilgol; Somnath Roy; Sage Weil; ceph-maintain...@ceph.com Cc: ceph-devel@vger.kernel.org Subject: Re: tcmalloc

RE: Regarding newstore performance

2015-04-15 Thread Somnath Roy
l-ow...@vger.kernel.org] On Behalf Of Somnath Roy Sent: Wednesday, April 15, 2015 9:22 PM To: Chen, Xiaoxi; Haomai Wang Cc: ceph-devel Subject: RE: Regarding newstore performance Thanks Xiaoxi.. But, I have already initiated test by making db/ a symbolic link to another SSD..Will share the result

RE: Regarding newstore performance

2015-04-15 Thread Somnath Roy
Thanks Xiaoxi.. But, I have already initiated test by making db/ a symbolic link to another SSD..Will share the result soon. Regards Somnath -Original Message- From: Chen, Xiaoxi [mailto:xiaoxi.c...@intel.com] Sent: Wednesday, April 15, 2015 6:48 PM To: Somnath Roy; Haomai Wang Cc
