MAX AVAIL is the amount of data you can still write to the cluster
before *any one of your OSDs* becomes near full. If MAX AVAIL is not
what you expect it to be, look at the data distribution using ceph osd
tree and make sure you have a uniform distribution.
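A rough sketch of that rule, with made-up numbers: MAX AVAIL is bounded by the fullest OSD, not the average. The 0.85 near-full ratio below matches the stock default, but this is only an illustration, not Ceph's actual accounting code.

```python
# Sketch of why MAX AVAIL tracks the fullest OSD, not the average:
# the cluster can only accept writes until the most-utilized OSD hits
# the near-full ratio. All numbers here are hypothetical.

def max_avail(osd_capacity_gb, osd_used_gb, near_full_ratio=0.85):
    """Approximate writable headroom, limited by the tightest OSD.

    Assumes new writes spread uniformly across OSDs, so total headroom
    is (per-OSD headroom of the fullest OSD) * number of OSDs.
    """
    headrooms = [
        cap * near_full_ratio - used
        for cap, used in zip(osd_capacity_gb, osd_used_gb)
    ]
    return max(0.0, min(headrooms)) * len(osd_capacity_gb)

# Uniform distribution: every OSD has the same headroom.
print(max_avail([1000, 1000, 1000], [400, 400, 400]))   # ~1350 GB
# Same total usage, but one hot OSD drags MAX AVAIL way down.
print(max_avail([1000, 1000, 1000], [700, 250, 250]))   # ~450 GB
```

The second call shows the symptom described above: identical total usage, much lower MAX AVAIL, purely because of skew.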
Mohamad
On 6/25/19 11:46 AM, Davis
Hi Florian,
On 3/7/19 10:27 AM, Florian Engelmann wrote:
>
> So the settings are recognized and used by qemu. But any value higher
> than the default (32MB) of the cache size leads to strange IOPS
> results. IOPS are very constant with 32MB ~20.000 - 23.000 but if we
> define a bigger cache size
On 2/27/19 4:57 PM, Marc Roos wrote:
> They are 'thin provisioned' meaning if you create a 10GB rbd, it does
> not use 10GB at the start. (afaik)
You can use 'rbd -p rbd du' to see how much of these devices is
provisioned and see if it's coherent.
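For illustration, here is a hypothetical parser for 'rbd du'-style output that compares provisioned against actually used space per image. The sample text mimics the command's tabular output; the exact column layout can differ by release, so treat this as a sketch.

```python
# Hypothetical parser for `rbd -p rbd du` output. The SAMPLE text below
# is illustrative; check the real output format on your cluster.

SAMPLE = """\
NAME   PROVISIONED USED
vm-1        10 GiB 2 GiB
vm-2        10 GiB 9 GiB
<TOTAL>     20 GiB 11 GiB
"""

UNITS = {"KiB": 1 / (1024 ** 2), "MiB": 1 / 1024, "GiB": 1, "TiB": 1024}

def parse_du(text):
    """Return {image: (provisioned_gib, used_gib)}, skipping the total row."""
    usage = {}
    for line in text.splitlines()[1:]:          # skip the header row
        name, prov, pu, used, uu = line.split()
        if name == "<TOTAL>":
            continue
        usage[name] = (float(prov) * UNITS[pu], float(used) * UNITS[uu])
    return usage

for image, (prov, used) in parse_du(SAMPLE).items():
    print(f"{image}: {used:.0f} GiB used of {prov:.0f} GiB provisioned")
```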
Mohamad
>
>
> -Original Message-
>
Hi Glen,
On 2/24/19 9:21 PM, Glen Baars wrote:
> I am tracking down a performance issue with some of our mimic 13.2.4 OSDs. It
> feels like a lack of memory but I have no real proof of the issue. I have
> used the memory profiling (pprof tool) and the OSDs are maintaining their
> 4GB
On 2/21/19 1:22 PM, Fabio Abreu wrote:
> Hi Everybody,
>
> Is it recommended to join different hardware types in the same rack?
>
> For example I have a sata rack with Apollo 4200 storage and I will get
> another hardware type to expand this rack, Hp 380 Gen10.
>
> I made a lot of tests to understand
>
>
> *From: *Mohamad Gebai
> *Date: *Thursday, February 21, 2019 at 9:44 AM
> *To: *"Smith, Eric" , Sinan Polat
> , "ceph-users@lists.ceph.com"
> *Subject: *Re: [ceph-users] BlueStore / OpenStack Rocky performance issues
>
>
>
What is your setup with Bluestore? Standalone OSDs? Or do they have
their WAL/DB partitions on another device? How does it compare to your
Filestore setup for the journal?
On a separate note, these look like they're consumer SSDs, which makes
them not a great fit for Ceph.
Mohamad
On 2/21/19
Hi all,
I want to share a performance issue I just encountered on a test cluster
of mine, specifically related to tuned. I started by setting the
"throughput-performance" tuned profile on my OSD nodes and ran some
benchmarks. I then applied that same profile to my client node, which
intuitively
>
>
> -Original Message-
> From: Mohamad Gebai [mailto:mge...@suse.de]
> Sent: 17 January 2019 15:57
> To: Marc Roos; ceph-users
> Subject: Re: [ceph-users] monitor cephfs mount io's
>
> You can do that either straight from your client, or by querying the
>
You can do that either straight from your client, or by querying the
perf dump if you're using ceph-fuse.
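To make the perf dump approach concrete, here is an illustrative snippet that pulls read/write op counters out of a dump's JSON (e.g. from 'ceph daemon client.<id> perf dump' for ceph-fuse). The section and counter names below are assumptions for this sketch; inspect your own dump for the exact names.

```python
import json

# Illustrative only: the JSON blob mimics a trimmed `perf dump`; section
# and counter names ("objecter", "op_r", "op_w") are assumed, not verified.

SAMPLE_DUMP = json.loads("""
{
  "objecter": {"op_r": 1200, "op_w": 3400},
  "client": {"reply": 4600}
}
""")

def io_counters(dump, section="objecter"):
    """Return (read_ops, write_ops) from one section of a perf dump."""
    stats = dump.get(section, {})
    return stats.get("op_r", 0), stats.get("op_w", 0)

reads, writes = io_counters(SAMPLE_DUMP)
print(f"reads={reads} writes={writes}")  # reads=1200 writes=3400
```

Sampling this twice and diffing the counters gives an ops/sec rate for the mount.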
Mohamad
On 1/17/19 6:19 AM, Marc Roos wrote:
>
> How / where can I monitor the ios on cephfs mount / client?
>
> ___
> ceph-users mailing list
>
us know.
>
> I wish you all a happy new year.
>
> Regards
> Marcus
>
>> Mohamad Gebai <mailto:mge...@suse.de>
>> 28 December 2018 at 16:10
>> Hi Marcus,
>>
>> On 12/27/18 4:21 PM, Marcus Murwall wrote:
>>> Hey Mohamad
>>>
might help. Is there anything suspicious in the logs?
Also, do you get the same throughput when benchmarking the replicated
compared to the EC pool?
Mohamad
>
>
> Regards
> Marcus
>
>> Mohamad Gebai <mailto:mge...@suse.de>
>> 26 December 2018 at 18:27
>>
What is happening on the individual nodes when you reach that point
(iostat -x 1 on the OSD nodes)? Also, what throughput do you get when
benchmarking the replicated pool?
I guess one way to start would be by looking at ongoing operations at
the OSD level:
ceph daemon osd.X dump_blocked_ops
ceph
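A small sketch of how you might triage the dump_blocked_ops output: rank blocked ops by age to find the worst offenders. The JSON below is a trimmed, hypothetical example of the structure (an "ops" list with an "age" per op); field names can vary by release.

```python
import json

# Hypothetical, trimmed `dump_blocked_ops` output for illustration.
SAMPLE = json.loads("""
{"ops": [
  {"description": "osd_op(client.1 1.2s0 write)", "age": 42.5},
  {"description": "osd_op(client.2 1.7s0 read)",  "age": 3.1}
]}
""")

def slowest_ops(dump, top=5):
    """Return the `top` oldest blocked ops, oldest first."""
    return sorted(dump["ops"], key=lambda op: op["age"], reverse=True)[:top]

for op in slowest_ops(SAMPLE):
    print(f"{op['age']:8.1f}s  {op['description']}")
```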
Last I heard (read) was that the RDMA implementation is somewhat
experimental. Search for "troubleshooting ceph rdma performance" on this
mailing list for more info.
(Adding Roman in CC who has been working on this recently.)
Mohamad
On 12/18/18 11:42 AM, Michael Green wrote:
> I don't know.
>
Hi all,
I was wondering how people were using tuned with Ceph, if at all. I
think it makes sense to enable the throughput-performance profile on OSD
nodes, and maybe the network-latency profiles on mon and mgr nodes. Is
anyone using a similar configuration, and do you have any thought on
this
On 05/16/2018 07:18 AM, Uwe Sauter wrote:
> Hi Mohamad,
>
>>
>> I think this is what you're looking for:
>>
>> $> ceph daemon osd.X dump_historic_slow_ops
>>
>> which gives you recent slow operations, as opposed to
>>
>> $> ceph daemon osd.X dump_blocked_ops
>>
>> which returns current blocked
Hi,
On 05/16/2018 04:16 AM, Uwe Sauter wrote:
> Hi folks,
>
> I'm currently chewing on an issue regarding "slow requests are blocked". I'd
> like to identify the OSD that is causing those events
> once the cluster is back to HEALTH_OK (as I have no monitoring yet that would
> get this info in
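One way to turn dump_historic_slow_ops into an answer to "which OSD is the culprit" is to collect it from each OSD (e.g. a loop over 'ceph daemon osd.N dump_historic_slow_ops') and rank OSDs by how many slow ops they recorded. The per-OSD data below is made up; this is a sketch of the aggregation step only.

```python
# Hypothetical slow-op ages (seconds) collected per OSD; in practice
# you would fill this dict from each OSD's dump_historic_slow_ops output.
slow_ops_per_osd = {
    "osd.0": [31.2, 45.9, 30.4],
    "osd.1": [],
    "osd.2": [120.7],
}

def rank_osds(per_osd):
    """Sort OSDs by slow-op count, then by worst age, descending."""
    return sorted(
        per_osd.items(),
        key=lambda kv: (len(kv[1]), max(kv[1], default=0.0)),
        reverse=True,
    )

for osd, ages in rank_osds(slow_ops_per_osd):
    worst = max(ages, default=0.0)
    print(f"{osd}: {len(ages)} slow ops, worst {worst:.1f}s")
```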
On 04/23/2018 09:24 PM, Christian Balzer wrote:
>
>> If anyone has some ideas/thoughts/pointers, I would be glad to hear them.
>>
> RAM, you'll need a lot of it, even more with Bluestore given the current
> caching.
> I'd say 1GB per TB storage as usual and 1-2GB extra per OSD.
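The rule of thumb quoted above (roughly 1 GB of RAM per TB of storage, plus 1-2 GB extra per OSD) is easy to put into numbers; this calculator is just that rule, nothing more.

```python
# Back-of-the-envelope RAM sizing per the quoted rule of thumb:
# ~1 GB per TB of storage plus 1-2 GB extra per OSD.

def ram_estimate_gb(total_storage_tb, num_osds, per_osd_overhead_gb=2):
    """Estimated node RAM in GB; the 2 GB/OSD overhead is the high end."""
    return total_storage_tb * 1 + num_osds * per_osd_overhead_gb

# e.g. a node with 12 x 4 TB OSDs:
print(ram_estimate_gb(48, 12))  # 72
```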
Does that still
Just to be clear about the issue:
You have a 3 servers setup, performance is good. You add a server (with
1 OSD?) and performance goes down, is that right?
Can you give us more details? What's your complete setup? How many OSDs
per node, bluestore/filestore, WAL/DB setup, etc. You're talking
On 03/28/2018 11:11 AM, Mark Nelson wrote:
> Personally I usually use a modified version of Mark Seger's getput
> tool here:
>
> https://github.com/markhpc/getput/tree/wip-fix-timing
>
> The difference between this version and upstream is primarily to make
> getput more accurate/useful when using
On 10/17/2017 09:57 AM, Sage Weil wrote:
> On Tue, 17 Oct 2017, Mohamad Gebai wrote:
>>
>> Thanks Sage. I assume that's the card you're referring to:
>> https://trello.com/c/SAtGPq0N/65-use-time-span-monotonic-for-durations
>>
>> I can take care of that one i
On 10/17/2017 09:27 AM, Sage Weil wrote:
> On Tue, 17 Oct 2017, Mohamad Gebai wrote:
>
>> It would be good to know if there are any, and maybe prepare for them?
> Adam added a new set of clock primitives that include a monotonic clock
> option that should be used in all
Hi,
I am looking at the following issue: http://tracker.ceph.com/issues/21375
In summary, during a 'rados bench', impossible latency values (e.g.
9.00648e+07) are suddenly reported. I looked briefly at the code, it
seems CLOCK_REALTIME is used, which means that wall clock changes would
affect
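The same pitfall is easy to demonstrate outside Ceph. In Python, time.monotonic() plays the role of a monotonic clock while time.time() behaves like CLOCK_REALTIME; this is only an analogy to the C++ code in question, not the tracker's fix.

```python
import time

# Durations should come from a monotonic clock: wall-clock time (like
# CLOCK_REALTIME) can jump on NTP/admin adjustments and yield absurd
# latencies, while a monotonic clock never goes backwards.

def timed(fn):
    start = time.monotonic()          # immune to wall-clock changes
    fn()
    return time.monotonic() - start   # always >= 0

elapsed = timed(lambda: sum(range(100_000)))
print(f"elapsed: {elapsed:.6f}s")
```

Measuring the same interval with time.time() can, in principle, return a negative or wildly large value if the wall clock is stepped mid-measurement, which is exactly the symptom in the tracker issue.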
Hi,
I'm not answering your questions, but I just want to point out that you
might be using the documentation for an older version of Ceph:
On 10/14/2017 12:25 PM, Oscar Segarra wrote:
>
> http://docs.ceph.com/docs/giant/rbd/rbd-snapshot/
>
If you're not using the 'giant' version of Ceph (which
Hi Jorge,
On 10/10/2017 07:23 AM, Jorge Pinilla López wrote:
> Are .99 KV, .01 MetaData and .0 Data ratios right? They seem a little
> too disproportionate.
Yes, this is correct.
> Also .99 KV and Cache of 3GB for SSD means that almost the 3GB would
> be used for KV but there is also another
Sorry for the delay. We used the default k=2 and m=1.
Mohamad
On 09/07/2017 06:22 PM, Christian Wuerdig wrote:
> What type of EC config (k+m) was used if I may ask?
>
> On Fri, Sep 8, 2017 at 1:34 AM, Mohamad Gebai <mge...@suse.de> wrote:
>> Hi,
>>
>> These num
Hi,
These numbers are probably not as detailed as you'd like, but it's
something. They show the overhead of reading and/or writing to EC pools
as compared to 3x replicated pools using 1, 2, 8 and 16 threads (single
client):
Rep IOPS    EC IOPS    Diff    Slowdown
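The table above is truncated in the archive; for what it's worth, its "Diff" and "Slowdown" columns would be derived from the replicated vs. EC IOPS like this. The IOPS figures below are made up for illustration, not the benchmark's results.

```python
# Derive the overhead columns from replicated vs. EC pool IOPS.
# The numbers passed in are hypothetical.

def ec_overhead(rep_iops, ec_iops):
    """Return (percent difference, slowdown factor) of EC vs. replicated."""
    diff_pct = (ec_iops - rep_iops) / rep_iops * 100
    slowdown = rep_iops / ec_iops
    return diff_pct, slowdown

diff, slowdown = ec_overhead(rep_iops=20000, ec_iops=16000)
print(f"Diff: {diff:.0f}%  Slowdown: {slowdown:.2f}x")  # Diff: -20%  Slowdown: 1.25x
```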
On 07/10/2017 01:51 PM, Jason Dillaman wrote:
> On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar wrote:
>> These are significant differences, to the point where it may not make sense
>> to use rbd journaling / mirroring unless there is only 1 active client.
> I interpreted
Resending as my first try seems to have disappeared.
Hi,
We ran some benchmarks to assess the overhead caused by enabling
client-side RBD journaling in Luminous. The tests consists of:
- Create an image with journaling enabled (--image-feature journaling)
- Run randread, randwrite and randrw