[ceph-users] rbd image-meta

2015-07-23 Thread Maged Mokhtar
Hello, I am trying to use rbd image-meta set. I get an error from rbd that this command is not recognized, yet it is documented in the rbd documentation: http://ceph.com/docs/next/man/8/rbd/ I am using the Hammer release deployed using ceph_deploy on Ubuntu 14.04. Is image-meta set supported in rbd in H
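For reference, the documented syntax looks like this (a minimal sketch; the pool/image name and key are placeholders, and whether the subcommand exists at all depends on the installed release):

  rbd image-meta set rbd/img01 somekey somevalue
  rbd image-meta get rbd/img01 somekey
  rbd image-meta list rbd/img01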

[ceph-users] Number of PGs: fix from start or change as we grow ?

2016-08-03 Thread Maged Mokhtar
Hello, I would like to build a small cluster with 20 disks to start but in the future would like to gradually increase it to maybe 200 disks. Is it better to fix the number of PGs in the pool from the beginning or is it better to start with a small number and then gradually change the number of
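For context, at the time pg_num could only ever be increased, never decreased, so a common approach is to size for the near-term OSD count and grow in steps as OSDs are added. A sketch with a hypothetical pool name:

  # rough rule of thumb: (100 x OSD count) / replica size, rounded to a power of two
  ceph osd pool create mypool 512 512
  ceph osd pool set mypool pg_num 1024     # later, after adding OSDs
  ceph osd pool set mypool pgp_num 1024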

Re: [ceph-users] Config parameters for system tuning

2017-06-22 Thread Maged Mokhtar
: filestore_queue_max_delay_multiple filestore_queue_high_delay_multiple filestore_queue_low_threshhold filestore_queue_high_threshhold again it will be good to update the docs: http://docs.ceph.com/docs/master/rados/configuration/filestore-config-ref/ I guess all eyes are on Bluestore now :) Maged Mokhtar

Re: [ceph-users] Squeezing Performance of CEPH

2017-06-22 Thread Maged Mokhtar
cluster. Cheers, Maged Mokhtar PetaSAN On 2017-06-22 19:19, Massimiliano Cuttini wrote: > Hi everybody, > > I want to squeeze all the performance of CEPH (we are using jewel 10.2.7). > We are testing a testing environment with 2 nodes having the same > configuration: &g

Re: [ceph-users] Ceph random read IOPS

2017-06-24 Thread Maged Mokhtar
My understanding was that this test is targeting latency more than IOPS. This is probably why it was run using QD=1. It also makes sense that cpu freq will be more important than cores. On 2017-06-24 12:52, Willem Jan Withagen wrote: > On 24-6-2017 05:30, Christian Wuerdig wrote: > >> The general

Re: [ceph-users] Ceph random read IOPS

2017-06-26 Thread Maged Mokhtar
r latency below the value you get from the > QD=1 test. To achieve lower latency you need faster cpu freq. Yes it is > expensive and as you said you need lower latency switches and so on but you > just have to pay more to achieve this. > > /Maged > > On Sun, Jun

Re: [ceph-users] Ceph mount rbd

2017-06-28 Thread Maged Mokhtar
On 2017-06-28 22:55, li...@marcelofrota.info wrote: > Hi People, > > I am testing the new enviroment, with ceph + rbd with ubuntu 16.04, and i > have one question. > > I have my cluster ceph and mount the using the comands to ceph in my linux > enviroment : > > rbd create veeamrepo --size 204

Re: [ceph-users] Kernel mounted RBD's hanging

2017-06-30 Thread Maged Mokhtar
On 2017-06-29 16:30, Nick Fisk wrote: > Hi All, > > Putting out a call for help to see if anyone can shed some light on this. > > Configuration: > Ceph cluster presenting RBD's->XFS->NFS->ESXi > Running 10.2.7 on the OSD's and 4.11 kernel on the NFS gateways in a > pacemaker cluster > Both OSD's

Re: [ceph-users] How to force "rbd unmap"

2017-07-05 Thread Maged Mokhtar
On 2017-07-05 20:42, Ilya Dryomov wrote: > On Wed, Jul 5, 2017 at 8:32 PM, David Turner wrote: > >> I had this problem occasionally in a cluster where we were regularly mapping >> RBDs with KRBD. Something else we saw was that after this happened for >> un-mapping RBDs, was that it would start

Re: [ceph-users] New cluster - configuration tips and reccomendation - NVMe

2017-07-05 Thread Maged Mokhtar
On 2017-07-05 23:22, David Clarke wrote: > On 07/05/2017 08:54 PM, Massimiliano Cuttini wrote: > >> Dear all, >> >> luminous is coming and sooner we should be allowed to avoid double writing. >> This means use 100% of the speed of SSD and NVMe. >> Cluster made all of SSD and NVMe will not be pe

[ceph-users] krbd journal support

2017-07-06 Thread Maged Mokhtar
Hi all, Are there any plans to support rbd journal feature in kernel krbd ? Cheers /Maged ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] RBD journaling benchmarks

2017-07-10 Thread Maged Mokhtar
On 2017-07-10 18:14, Mohamad Gebai wrote: > Resending as my first try seems to have disappeared. > > Hi, > > We ran some benchmarks to assess the overhead caused by enabling > client-side RBD journaling in Luminous. The tests consists of: > - Create an image with journaling enabled (--image-fea

Re: [ceph-users] RBD journaling benchmarks

2017-07-10 Thread Maged Mokhtar
On 2017-07-10 20:06, Mohamad Gebai wrote: > On 07/10/2017 01:51 PM, Jason Dillaman wrote: On Mon, Jul 10, 2017 at 1:39 > PM, Maged Mokhtar wrote: These are significant > differences, to the point where it may not make sense > to use rbd journaling / mirroring unless there is o

Re: [ceph-users] RBD journaling benchmarks

2017-07-13 Thread Maged Mokhtar
-- From: "Jason Dillaman" Sent: Thursday, July 13, 2017 4:45 AM To: "Maged Mokhtar" Cc: "Mohamad Gebai" ; "ceph-users" Subject: Re: [ceph-users] RBD journaling benchmarks > On Mon, Jul 10, 2017 at 3:41

Re: [ceph-users] Iscsi configuration

2017-08-09 Thread Maged Mokhtar
Hi Sam, Pacemaker will take care of HA failover but you will need to propagate the PR data yourself. If you are interested in a solution that works out of the box with Windows, have a look at PetaSAN www.petasan.org It works well with MS hyper-v/storage spaces/Scale Out File Server. Cheers /Ma

Re: [ceph-users] Book & questions

2017-08-13 Thread Maged Mokhtar
I would recommend getting all 3 books; they are all very good. I particularly like Nick's book: it covers a lot of hands-on issues and is quite recent. /Maged On 2017-08-13 09:43, Sinan Polat wrote: > Hi all, > > I am quite new with Ceph Storage. Currently we have a Ceph environment > running,

Re: [ceph-users] VMware + Ceph using NFS sync/async ?

2017-08-19 Thread Maged Mokhtar
Hi Nick, Your note on PG locking is interesting, but I would be surprised if its effect is that bad. I would think that in your example the 2 ms is a total latency; the lock will probably be applied to a small portion of that, so the concurrent operations are not serialized for the entire time..but ag

Re: [ceph-users] Small-cluster performance issues

2017-08-22 Thread Maged Mokhtar
It is likely your 2 spinning disks cannot keep up with the load. Things are likely to improve if you double your OSDs, hooking them up to your existing SSD journal. Ideally you would run a load/performance tool (atop/collectl/sysstat) and measure how busy your resources are, but
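As a hedged example of the kind of measurement meant here (standard tools; the devices to watch are whatever your OSD disks and journal SSD map to):

  iostat -x 2      # watch %util and await per device
  atop 2           # combined CPU, disk and network busy view
  sar -d -p 2      # sysstat per-device utilization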

Re: [ceph-users] Power outages!!! help!

2017-08-28 Thread Maged Mokhtar
I would suggest either adding 1 new disk on each of the 2 machines or increasing the osd_backfill_full_ratio to something like 90 or 92 from the default 85. /Maged On 2017-08-28 08:01, hjcho616 wrote: > Hello! > > I've been using ceph for a long time, mostly for network CephFS storage, even > before
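A sketch of how such a ratio change is usually applied at runtime (the value is an example; on Luminous and later the equivalent knob is the cluster-wide backfillfull ratio):

  ceph tell osd.* injectargs '--osd-backfill-full-ratio 0.92'   # pre-Luminous OSD option
  ceph osd set-backfillfull-ratio 0.92                          # Luminous and later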

Re: [ceph-users] Power outages!!! help!

2017-08-29 Thread Maged Mokhtar
One of the things to watch out for in small clusters is that OSDs can get full rather unexpectedly in recovery/backfill cases: In your case you have 2 OSD nodes with 5 disks each. Since you have a replica count of 2, each PG will have 1 copy on each host, so if an OSD fails, all its PGs will have to be re-creat
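A few standard commands that make this easy to watch while recovery is running:

  ceph osd df tree       # per-OSD utilization and variance
  ceph df                # pool-level usage
  ceph health detail     # flags nearfull/full OSDs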

Re: [ceph-users] Bluestore OSD_DATA, WAL & DB

2017-09-21 Thread Maged Mokhtar
On 2017-09-21 07:56, Lazuardi Nasution wrote: > Hi, > > I'm still looking for the answer of these questions. Maybe someone can share > their thought on these. Any comment will be helpful too. > > Best regards, > > On Sat, Sep 16, 2017 at 1:39 AM, Lazuardi Nasution > wrote: > >> Hi, >>

Re: [ceph-users] Bluestore disk colocation using NVRAM, SSD and SATA

2017-09-21 Thread Maged Mokhtar
t; > ___ > ceph-users mailing list > ceph-users@lists.ceph.com > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com My guess is for wal: you are dealing with a 2 step io operation so in case it is colloca

Re: [ceph-users] trying to understanding crush more deeply

2017-09-22 Thread Maged Mokhtar
Per section 3.4.4, the default bucket type straw computes the hash of (PG number, replica number, bucket id) for all buckets using the Jenkins integer hashing function, then multiplies this by the bucket weight (for OSD disks a weight of 1 corresponds to 1 TB; for higher levels it is the sum of contained weights
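One way to observe the resulting placement decisions is to replay mappings against the compiled CRUSH map with crushtool (a hedged sketch; the file name and the rule/replica numbers are examples):

  ceph osd getcrushmap -o crushmap.bin
  crushtool -i crushmap.bin --test --show-mappings --rule 0 --num-rep 3 --min-x 0 --max-x 9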

Re: [ceph-users] trying to understanding crush more deeply

2017-09-22 Thread Maged Mokhtar
d this ? Can you explain something > about this ? Apologize for my dummy. And thank you very much . : ) > > On Fri, Sep 22, 2017 at 3:50 PM, Maged Mokhtar wrote: > >> Per section 3.4.4 The default bucket type straw computes the hash of (PG >> number, replica number, bucket

Re: [ceph-users] RBD features(kernel client) with kernel version

2017-09-26 Thread Maged Mokhtar
On 2017-09-25 14:29, Ilya Dryomov wrote: > On Sat, Sep 23, 2017 at 12:07 AM, Muminul Islam Russell > wrote: > >> Hi Ilya, >> >> Hope you are doing great. >> Sorry for bugging you. I did not find enough resources for my question. I >> would be really helped if you could reply me. My questions

Re: [ceph-users] osd create returns duplicate ID's

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 10:44, Adrian Saul wrote: > Do you mean that after you delete and remove the crush and auth entries for > the OSD, when you go to create another OSD later it will re-use the previous > OSD ID that you have destroyed in the past? > > Because I have seen that behaviour as well - bu

Re: [ceph-users] osd create returns duplicate ID's

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 11:31, Maged Mokhtar wrote: > On 2017-09-29 10:44, Adrian Saul wrote: > > Do you mean that after you delete and remove the crush and auth entries for > the OSD, when you go to create another OSD later it will re-use the previous > OSD ID that you have destro

Re: [ceph-users] Ceph OSD on Hardware RAID

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 17:14, Hauke Homburg wrote: > Hello, > > I think that the Ceph users don't recommend running ceph osd on hardware > RAID, but I haven't found a technical solution for this. > > Can anybody give me such a solution? > > Thanks for your help > > Regards > > Hauke You get better perform

Re: [ceph-users] Get rbd performance stats

2017-09-29 Thread Maged Mokhtar
On 2017-09-29 17:13, Matthew Stroud wrote: > Is there a way I could get a performance stats for rbd images? I'm looking > for iops and throughput. > > This issue we are dealing with is that there was a sudden jump in throughput > and I want to be able to find out with rbd volume might be causi

Re: [ceph-users] rados_read versus rados_aio_read performance

2017-10-01 Thread Maged Mokhtar
On 2017-10-01 16:47, Alexander Kushnirenko wrote: > Hi, Gregory! > > Thanks for the comment. I compiled simple program to play with write speed > measurements (from librados examples). Underline "write" functions are: > rados_write(io, "hw", read_res, 1048576, i*1048576); > rados_aio_write(i

Re: [ceph-users] Ceph-ISCSI

2017-10-12 Thread Maged Mokhtar
On 2017-10-11 14:57, Jason Dillaman wrote: > On Wed, Oct 11, 2017 at 6:38 AM, Jorge Pinilla López > wrote: > >> As far as I am able to understand there are 2 ways of setting iscsi for ceph >> >> 1- using kernel (lrbd) only able on SUSE, CentOS, fedora... > > The target_core_rbd approach is on

Re: [ceph-users] Ceph-ISCSI

2017-10-12 Thread Maged Mokhtar
On 2017-10-12 11:32, David Disseldorp wrote: > On Wed, 11 Oct 2017 14:03:59 -0400, Jason Dillaman wrote: > > On Wed, Oct 11, 2017 at 1:10 PM, Samuel Soulard > wrote: Hmmm, If you failover the identity of the > LIO configuration including PGRs > (I believe they are files on disk), this would wor

Re: [ceph-users] Ceph iSCSI login failed due to authorization failure

2017-10-14 Thread Maged Mokhtar
On 2017-10-14 17:50, Kashif Mumtaz wrote: > Hello Dear, > > I am trying to configure the Ceph iscsi gateway on Ceph Luminious . As per > below > > Ceph iSCSI Gateway -- Ceph Documentation [1] > > [1] > > CEPH ISCSI GATEWAY — CEPH DOCUMENTATION > > Ceph is iscsi gateway are configured and

Re: [ceph-users] osd max scrubs not honored?

2017-10-15 Thread Maged Mokhtar
On 2017-10-14 05:02, J David wrote: > Thanks all for input on this. > > It's taken a couple of weeks, but based on the feedback from the list, > we've got our version of a scrub-one-at-a-time cron script running and > confirmed that it's working properly. > > Unfortunately, this hasn't really so

Re: [ceph-users] osd max scrubs not honored?

2017-10-15 Thread Maged Mokhtar
correction, i limit it to 128K: echo 128 > /sys/block/sdX/queue/read_ahead_kb On 2017-10-15 13:14, Maged Mokhtar wrote: > On 2017-10-14 05:02, J David wrote: > >> Thanks all for input on this. >> >> It's taken a couple of weeks, but based on the feedback

Re: [ceph-users] Bareos and libradosstriper works only for 4M sripe_unit size

2017-10-17 Thread Maged Mokhtar
>> Would it be 4 objects of 24M and 4 objects of 250KB? Or will the last 4 objects be artificially padded (with 0's) to meet the stripe_unit? It will be 4 objects of 24M + 1M stored on the 5th object. If you write 104M: 4 objects of 24M + 8M stored on the 5th object. If you write 105M: 4 obje

Re: [ceph-users] Ceph-ISCSI

2017-10-17 Thread Maged Mokhtar
The issue with active/active is the following condition: the client initiator sends a write operation to gateway server A; server A does not respond within the client timeout; the client initiator re-sends the failed write operation to gateway server B; the client initiator then sends another write operation to gateway server C

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
First a general comment: local RAID will be faster than Ceph for a single threaded (queue depth=1) io operation test. A single threaded Ceph client will see at best single-disk speed for reads, and writes 4-6 times slower than a single disk. Not to mention the latency of local disks will be much better.
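To make the comparison concrete, the same QD=1 test can be pointed at both the local RAID device and an RBD image with fio (a sketch; device, pool and image names are placeholders, and the rbd ioengine must be compiled into your fio build):

  fio --name=qd1-local --filename=/dev/sdX --rw=randwrite --bs=4k --iodepth=1 --direct=1 --runtime=60 --time_based
  fio --name=qd1-rbd --ioengine=rbd --clientname=admin --pool=rbd --rbdname=image01 --rw=randwrite --bs=4k --iodepth=1 --runtime=60 --time_based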

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
0 > Total writes made: 31032 > Write size: 4096 > Object size:4096 > Bandwidth (MB/sec): 3.93282 > Stddev Bandwidth: 3.66265 > Max bandwidth (MB/sec): 13.668 > Min bandwidth (MB/sec): 0 > Average IOPS: 1006 > St

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
de.servers.com/ssd-performance-2017-c4307a92dea [2] > > On Wed, Oct 18, 2017 at 11:53 AM, Maged Mokhtar wrote: > > Check out the following link: some SSDs perform bad in Ceph due to sync > writes to journal > > https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-18 Thread Maged Mokhtar
peed in which in the dd infile can > be read? > And I assume the best test should be run with no other load. > > How does one run the rados bench "as stress"? > > -RG > > On Wed, Oct 18, 2017 at 1:33 PM, Maged Mokhtar wrote: > > measuring resource loa

Re: [ceph-users] Backup VM (Base image + snapshot)

2017-10-20 Thread Maged Mokhtar
Hi all, Can export-diff work effectively without the fast-diff rbd feature as it is not supported in kernel rbd ? Maged On 2017-10-19 23:18, Oscar Segarra wrote: > Hi Richard, > > Thanks a lot for sharing your experience... I have made deeper investigation > and it looks export-diff is t
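For reference, the export-diff flow itself only needs snapshots; fast-diff is an optimization that speeds up finding the changed extents. A sketch with placeholder image and snapshot names:

  rbd snap create rbd/image01@snap1
  rbd snap create rbd/image01@snap2           # taken later, after changes
  rbd export-diff --from-snap snap1 rbd/image01@snap2 image01.diff
  rbd import-diff image01.diff rbd/image01-backup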

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-25 Thread Maged Mokhtar
ry well without any hint of a problem. >>>>> >>>>> Any other ideas or suggestions? >>>>> >>>>> -RG >>>>> >>>>> >>>>> On Wed, Oct 18, 2017 at 3:40 PM, Maged Mokhtar >>>>> wrote

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-26 Thread Maged Mokhtar
and a busy% of below 90% during rados 4k test. Maged On 2017-10-26 16:44, Russell Glaue wrote: > On Wed, Oct 25, 2017 at 7:09 PM, Maged Mokhtar wrote: > >> It depends on what stage you are in: >> in production, probably the best thing is to setup a monitoring tool &g

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-10-27 Thread Maged Mokhtar
h a long stick for anything but small toy-test clusters. > > On Fri, Oct 27, 2017 at 3:44 AM, Russell Glaue wrote: > On Wed, Oct 25, 2017 at 7:09 PM, Maged Mokhtar wrote: > It depends on what stage you are in: > in production, probably the best thing is to setup a monitoring tool >

Re: [ceph-users] Bluestore OSD_DATA, WAL & DB

2017-11-03 Thread Maged Mokhtar
On 2017-11-03 15:59, Wido den Hollander wrote: > Op 3 november 2017 om 14:43 schreef Mark Nelson : > > On 11/03/2017 08:25 AM, Wido den Hollander wrote: > Op 3 november 2017 om 13:33 schreef Mark Nelson : > > On 11/03/2017 02:44 AM, Wido den Hollander wrote: > Op 3 november 2017 om 0:09 schree

Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Maged Mokhtar
Hi Mark, It will be interesting to know: The impact of replication. I guess it will decrease by a higher factor than the replica count. I assume you mean the 30K IOPS per OSD is what the client sees, if so the OSD raw disk itself will be doing more IOPS, is this correct and if so what is the

Re: [ceph-users] Performance, and how much wiggle room there is with tunables

2017-11-10 Thread Maged Mokhtar
rados benchmark is a client application that simulates client io to stress the cluster. This applies whether you run the test from an external client or from a cluster server that will act as a client. For fast clusters the client will saturate (cpu/net) before the cluster does. To get accurate
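A hedged sketch of such a run, launched in parallel from several client machines (pool name, duration and sizes are examples):

  rados bench -p testpool 60 write -b 4096 -t 32 --no-cleanup
  rados bench -p testpool 60 rand -t 32
  rados -p testpool cleanup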

Re: [ceph-users] Bluestore performance 50% of filestore

2017-11-15 Thread Maged Mokhtar
On 2017-11-14 21:54, Milanov, Radoslav Nikiforov wrote: > Hi > > We have 3 node, 27 OSDs cluster running Luminous 12.2.1 > > In filestore configuration there are 3 SSDs used for journals of 9 OSDs on > each hosts (1 SSD has 3 journal paritions for 3 OSDs). > > I've converted filestore to bl

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-27 Thread Maged Mokhtar
On 2017-11-27 15:02, German Anders wrote: > Hi All, > > I've a performance question, we recently install a brand new Ceph cluster > with all-nvme disks, using ceph version 12.2.0 with bluestore configured. The > back-end of the cluster is using a bond IPoIB (active/passive) , and for the > fr

Re: [ceph-users] ceph-disk is now deprecated

2017-11-28 Thread Maged Mokhtar
I tend to agree with Wido. Many of us still rely on ceph-disk and hope to see it live a little longer. Maged On 2017-11-28 13:54, Alfredo Deza wrote: > On Tue, Nov 28, 2017 at 3:12 AM, Wido den Hollander wrote: > Op 27 november 2017 om 14:36 schreef Alfredo Deza : > > For the upcoming Lumin

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-29 Thread Maged Mokhtar
me disk with an additional >> partition to use as journal/wal. We double check the c-state and it was >> not configure to use c1, so we change that on all the osd nodes and mon >> nodes and we're going to make some new tests, and see how it goes. I'll >> get back as soon as

[ceph-users] Single disk per OSD ?

2017-12-01 Thread Maged Mokhtar
Hi all, I believe most existing setups use 1 disk per OSD. Is this going to be the most common setup in the future ? With the move to lvm, will this prefer the use of multiple disks per OSD ? On the other side I also see nvme vendors recommending multiple OSDs ( 2,4 ) per disk as disks are getting

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-12-08 Thread Maged Mokhtar
I have > to believe it is a hardware firmware issue. > And its peculiar seeing performance boost slightly, even 24 hours later, when > I stop then start the OSDs. > > Our actual writes are low, as most of our Ceph Cluster based images are > low-write, high-memory. So a 20GB/d

Re: [ceph-users] How to increase the size of requests written to a ceph image

2017-12-08 Thread Maged Mokhtar
4M block sizes you will only need 22.5 iops On 2017-12-08 09:59, Maged Mokhtar wrote: > Hi Russell, > > It is probably due to the difference in block sizes used in the test vs your > cluster load. You have a latency problem which is limiting your max write > iops to around

Re: [ceph-users] OSDs wrongly marked down

2017-12-20 Thread Maged Mokhtar
It could also be that your hardware is underpowered for the io you have. Try to check your resource load during peak workload, with recovery and scrubbing going on at the same time. On 2017-12-20 17:03, David Turner wrote: > When I have OSDs wrongly marked down it's usually to do with the > filesto

Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Maged Mokhtar
On 2018-01-22 08:39, Wido den Hollander wrote: > On 01/20/2018 02:02 PM, Marc Roos wrote: > >> If I test my connections with sockperf via a 1Gbit switch I get around >> 25usec, when I test the 10Gbit connection via the switch I have around >> 12usec is that normal? Or should there be a differnce

Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Maged Mokhtar
kets transmitted, 10 received, 0% packet loss, time 2363ms > rtt min/avg/max/mdev = 0.014/0.015/0.322/0.006 ms, ipg/ewma 0.023/0.016 ms > > On 22 January 2018 at 22:37, Nick Fisk wrote: > >> Anyone with 25G ethernet willing to do the test? Would love to see what the >> late

Re: [ceph-users] How ceph client read data from ceph cluster

2018-01-26 Thread Maged Mokhtar
On 2018-01-26 09:09, shadow_lin wrote: > Hi List, > I read an old article about how the ceph client reads from the ceph cluster. It said the > client only reads from the primary osd. Since a ceph cluster in replicated mode > has several copies of the data, reading from only one copy seems to waste the > performance of

Re: [ceph-users] How ceph client read data from ceph cluster

2018-01-26 Thread Maged Mokhtar
.2.2) the client only reads from the primary > osd (one copy), is that true? > > 2018-01-27 > - > > lin.yunfan > --------- > > From: Maged Mokhtar > Sent: 2018-01-26 20:27 > Subject: Re: [ceph-users] How ceph client read data f

Re: [ceph-users] troubleshooting ceph performance

2018-01-30 Thread Maged Mokhtar
On 2018-01-31 08:14, Manuel Sopena Ballesteros wrote: > Dear Ceph community, > > I have a very small ceph cluster for testing with this configuration: > > · 2x compute nodes each with: > > · dual port of 25 nic > > · 2x socket (56 cores with hyperthreading) > > ·

Re: [ceph-users] Ceph - incorrect output of ceph osd tree

2018-01-31 Thread Maged Mokhtar
try setting: mon_osd_min_down_reporters = 1 On 2018-01-31 20:46, Steven Vacaroaia wrote: > Hi, > > Why is ceph osd tree reports that osd.4 is up when the server on which osd.4 > is running is actually down ?? > > Any help will be appreciated > > [root@osd01 ~]# ping -c 2 osd02 > PING
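For reference, that setting can go into ceph.conf or be injected at runtime (a sketch):

  [mon]
  mon_osd_min_down_reporters = 1

  ceph tell mon.* injectargs '--mon-osd-min-down-reporters 1'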

Re: [ceph-users] How to clean data of osd with ssd journal(wal, db if it is bluestore) ?

2018-01-31 Thread Maged Mokhtar
I would recommend, as Wido does, to use the dd command. The block db device holds the metadata/allocation of objects stored in the data block; not cleaning this is asking for problems, and besides it does not take any time. In our testing, building a new cluster on top of an older installation, we did see many cases where os
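A sketch of the cleanup meant here (DEVICE is a placeholder; this is destructive, so double-check the device first):

  wipefs -a /dev/DEVICE                                         # drop filesystem/LVM signatures
  dd if=/dev/zero of=/dev/DEVICE bs=1M count=200 oflag=direct   # zero the start of the old db/wal partition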

Re: [ceph-users] How to clean data of osd with ssd journal(wal, db if it is bluestore) ?

2018-02-01 Thread Maged Mokhtar
-- > > lin.yunfan > - > > From: Maged Mokhtar > Sent: 2018-02-01 14:22 > Subject: Re: [ceph-users] How to clean data of osd with ssd journal(wal, db if it > is bluestore) ? > To: "David Turner" > Cc: "shadow_lin","ceph-user

Re: [ceph-users] Newbie question: stretch ceph cluster

2018-02-14 Thread Maged Mokhtar
Hi, You need to set the min_size to 2 in the crush rule. The exact location and replication flow when a client writes data depend on the object name and the number of pgs. The crush rule determines which osds will serve a pg; the first is the primary osd for that pg. The client computes the pg from the
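For concreteness, two commands that show these pieces in practice (pool and object names are placeholders; min_size is commonly adjusted per pool):

  ceph osd pool set mypool min_size 2      # minimum replicas that must be up to serve IO
  ceph osd map mypool someobject           # prints the PG for the object and its acting set, primary first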

Re: [ceph-users] Ceph luminous performance - how to calculate expected results

2018-02-14 Thread Maged Mokhtar
On 2018-02-14 20:14, Steven Vacaroaia wrote: > Hi, > > It is very useful to "set up expectations" from a performance perspective > > I have a cluster using 3 DELL R620 with 64 GB RAM and 10 GB cluster network > > I've seen numerous posts and articles about the topic mentioning the > foll

[ceph-users] new Open Source Ceph based iSCSI SAN project

2016-10-16 Thread Maged Mokhtar
Hello, I am happy to announce PetaSAN, an open source scale-out SAN that uses Ceph storage and LIO iSCSI Target. visit us at: www.petasan.org your feedback will be much appreciated. maged mokhtar ___ ceph-users mailing list ceph-users

Re: [ceph-users] new Open Source Ceph based iSCSI SAN project

2016-10-17 Thread Maged Mokhtar
Sonnenberg 1-3, 63571 Gelnhausen, HRB 93402 at the Hanau district court. Management: Oliver Dzombic. Tax no.: 35 236 3622 1, VAT ID: DE274086107. On 16.10.2016 at 18:57, Maged Mokhtar wrote: Hello, I am happy to announce PetaSAN, an open source scale-out SAN that uses Ceph storage and LIO iS

Re: [ceph-users] new Open Source Ceph based iSCSI SAN project

2016-10-17 Thread Maged Mokhtar
86107 On 17.10.2016 at 13:37, Maged Mokhtar wrote: Hi Oliver, if you are referring to clustered reservations through VAAI: we are using upstream code from SUSE Enterprise Storage which adds clustered support for VAAI (compare and write, write same) in the kernel as well as in ceph (implemented as a

Re: [ceph-users] new Open Source Ceph based iSCSI SAN project

2016-10-17 Thread Maged Mokhtar
016 4:21 PM To: Subject: Re: [ceph-users] new Open Source Ceph based iSCSI SAN project On 2016-10-17T13:37:29, Maged Mokhtar wrote: Hi Maged, glad to see our patches caught your attention. You're aware that they are being upstreamed by David Disseldorp and Mike Christie, right? You don'

Re: [ceph-users] new Open Source Ceph based iSCSI SAN project

2016-10-17 Thread Maged Mokhtar
Thank you David very much and thank you for the correction. -- From: "David Disseldorp" Sent: Monday, October 17, 2016 5:24 PM To: "Maged Mokhtar" Cc: ; "Oliver Dzombic" ; "Mike Christie" Subject: Re:

Re: [ceph-users] new Open Source Ceph based iSCSI SAN project

2016-10-18 Thread Maged Mokhtar
Thank you Mike for this update. I sent you and Dave the relevant changes we found for hyper-v. Cheers /maged -- From: "Mike Christie" Sent: Monday, October 17, 2016 9:40 PM To: "Maged Mokhtar" ; "Lars Marowsky-Bree"

Re: [ceph-users] Ceph Blog Articles

2016-11-11 Thread Maged Mokhtar
Nice article on write latency. If I understand correctly, this latency is measured while there is no overflow of the journal caused by long sustained writes; otherwise you will start hitting the HDD latency. Also, is the queue depth you use 1? I will be interested to see your article on hardware. /Maged

Re: [ceph-users] Ceph Blog Articles

2016-11-12 Thread Maged Mokhtar
ick > >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Maged Mokhtar >> Sent: 11 November 2016 21:48 >> To: ceph-users@lists.ceph.com >> Subject: Re: [ceph-users] Ceph Blog Articles >> >> >

Re: [ceph-users] Ceph Blog Articles

2016-11-14 Thread Maged Mokhtar
-- From: "Nick Fisk" Sent: Monday, November 14, 2016 11:41 AM To: "'Maged Mokhtar'" ; Subject: RE: [ceph-users] Ceph Blog Articles Hi Maged, I would imagine as soon as you start saturating the disks, the latency impact would make t

Re: [ceph-users] Estimate Max IOPS of Cluster

2017-01-04 Thread Maged Mokhtar
Max iops depends on the hardware type/configuration for disks/cpu/network. For disks, the theoretical iops limits are: read = physical disk iops x number of disks; write (with journal on same disk) = physical disk iops x number of disks / num of replicas / 3. In practice real benchmarks will vary
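As an illustrative worked example of that formula (numbers are assumptions): 24 spinning disks at roughly 150 iops each give a theoretical read ceiling of 24 x 150 = 3600 iops, while writes with the journal on the same disk and 3 replicas come to roughly 3600 / 3 / 3 = 400 iops, before any other overhead.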

Re: [ceph-users] Estimate Max IOPS of Cluster

2017-01-04 Thread Maged Mokhtar
if you are asking about what tools to use: http://tracker.ceph.com/projects/ceph/wiki/Benchmark_Ceph_Cluster_Performance You should run many concurrent processes on different clients From: Maged Mokhtar Sent: Wednesday, January 04, 2017 6:45 PM To: John Petrini ; ceph-users Subject: Re

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-06 Thread Maged Mokhtar
The numbers are very low. I would first benchmark the system without the vm client using rbd 4k test such as: rbd bench-write image01  --pool=rbd --io-threads=32 --io-size 4096 --io-pattern rand --rbd_cache=false Original message From: kevin parrikar Date: 07/01/2017 0

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-07 Thread Maged Mokhtar
Adding more nodes is best if you have unlimited budget :) You should add more osds per node until you start hitting cpu or network bottlenecks. Use a perf tool like atop/sysstat to know when this happens. Original message From: kevin parrikar Date: 07/01/2017 19:56 (

Re: [ceph-users] Analysing ceph performance with SSD journal, 10gbe NIC and 2 replicas -Hammer release

2017-01-08 Thread Maged Mokhtar
Why would you still be using journals when running OSDs fully on SSDs? When using a journal, the data is first written to the journal, and then that same data is (later on) written again to disk. This is under the assumption that the time to write the journal is only a fraction of the time it costs to wr

Re: [ceph-users] Performance tuning for SAN SSD config

2018-07-06 Thread Maged Mokhtar
On 2018-06-29 18:30, Matthew Stroud wrote: > We back some of our ceph clusters with SAN SSD disk, particularly VSP G/F and > Purestorage. I'm curious what are some settings we should look into modifying > to take advantage of our SAN arrays. We had to manually set the class for the > luns to SS

Re: [ceph-users] Performance tuning for SAN SSD config

2018-07-06 Thread Maged Mokhtar
ke FC systems > where you can use commodity hardware and grow as you need, you > generally dont need hba/fc enclosed disks but nothing stopping you > from using your existing system. Also you generally dont need any raid > mirroring configurations in the backend since ceph will hand

[ceph-users] advice with erasure coding

2018-09-07 Thread Maged Mokhtar
Good day Cephers, I want to get some guidance on erasure coding, the docs do state the different plugins and settings but to really understand them all and their use cases is not easy: -Are the majority of implementations using jerasure and just configuring k and m ? -For jerasure: when/if woul
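For concreteness, the jerasure configuration the questions refer to is normally expressed like this (profile name, pool name and the k/m values are examples):

  ceph osd erasure-code-profile set myprofile k=4 m=2 plugin=jerasure crush-failure-domain=host
  ceph osd erasure-code-profile get myprofile
  ceph osd pool create ecpool 128 128 erasure myprofile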

Re: [ceph-users] WAL/DB size

2018-09-07 Thread Maged Mokhtar
On 2018-09-07 14:36, Alfredo Deza wrote: > On Fri, Sep 7, 2018 at 8:27 AM, Muhammad Junaid > wrote: > >> Hi there >> >> Asking the questions as a newbie. May be asked a number of times before by >> many but sorry, it is not clear yet to me. >> >> 1. The WAL device is just like journaling dev

Re: [ceph-users] advice with erasure coding

2018-09-07 Thread Maged Mokhtar
On 2018-09-07 13:52, Janne Johansson wrote: > Den fre 7 sep. 2018 kl 13:44 skrev Maged Mokhtar : > >> Good day Cephers, >> >> I want to get some guidance on erasure coding, the docs do state the >> different plugins and settings but to really understand them

Re: [ceph-users] Benchmark does not show gains with DB on SSD

2018-09-12 Thread Maged Mokhtar
On 12/09/18 17:06, Ján Senko wrote: We are benchmarking a test machine which has: 8 cores, 64GB RAM 12 * 12 TB HDD (SATA) 2 * 480 GB SSD (SATA) 1 * 240 GB SSD (NVME) Ceph Mimic Baseline benchmark for HDD only (Erasure Code 4+2) Write 420 MB/s, 100 IOPS, 150ms latency Read 1040 MB/s, 260 IOPS,

Re: [ceph-users] Slow Ceph: Any plans on torrent-like transfers from OSDs ?

2018-09-14 Thread Maged Mokhtar
On 14/09/18 12:13, Alex Lupsa wrote: Hi, Thank you for the answer Ronny. I did indeed try 2x RBD drives (rdb-cache was already active), striping them, and got double write/read speed instantly. So I am chalking this one on KVM who is single-threaded and not fully ceph-aware it seems. Althoug

Re: [ceph-users] Hyper-v ISCSI support

2018-09-21 Thread Maged Mokhtar
Hi Glen, Yes you need clustered SCSI-3 persistent reservations support. This is supported in SUSE SLE kernels, you may also be interested in PetaSAN: http://www.petasan.org which is based on these kernels. Maged On 21/09/18 12:48, Glen Baars wrote: Hello Ceph Users, We have been using ce

Re: [ceph-users] CRUSH puzzle: step weighted-take

2018-09-27 Thread Maged Mokhtar
On 27/09/18 17:18, Dan van der Ster wrote: Dear Ceph friends, I have a CRUSH data migration puzzle and wondered if someone could think of a clever solution. Consider an osd tree like this: -2 4428.02979 room 0513-R-0050 -72911.81897 rack RA01 -4917.

[ceph-users] bcache, dm-cache support

2018-10-04 Thread Maged Mokhtar
Hello all, Do bcache and dm-cache work well with Ceph? Is one recommended over the other? Are there any issues? There are a few posts in this list around them, but I could not determine if they are ready for mainstream use or not. I would appreciate any clarifications. /Maged _

Re: [ceph-users] bcache, dm-cache support

2018-10-10 Thread Maged Mokhtar
On 10/10/18 21:08, Ilya Dryomov wrote: On Wed, Oct 10, 2018 at 8:48 PM Kjetil Joergensen wrote: Hi, We tested bcache, dm-cache/lvmcache, and one more which name eludes me with PCIe NVME on top of large spinning rust drives behind a SAS3 expander - and decided this were not for us. This w

Re: [ceph-users] A basic question on failure domain

2018-10-20 Thread Maged Mokhtar
On 20/10/18 05:28, Cody wrote: Hi folks, I have a rookie question. Must the number of buckets chosen as the failure domain be equal to or greater than the number of replicas (or k+m for erasure coding)? E.g., for an erasure code profile where k=4, m=2, failure domain=rack, does it only w

Re: [ceph-users] Drive for Wal and Db

2018-10-20 Thread Maged Mokhtar
On 20/10/18 18:57, Robert Stanford wrote:  Our OSDs are BlueStore and are on regular hard drives. Each OSD has a partition on an SSD for its DB.  Wal is on the regular hard drives.  Should I move the wal to share the SSD with the DB?  Regards R ___
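For reference, a hedged sketch of how that layout is usually expressed with ceph-volume (device paths are placeholders); when only --block.db is given, the WAL is placed on the DB device, so a separate --block.wal only makes sense if an even faster device is available:

  ceph-volume lvm create --bluestore --data /dev/sdX --block.db /dev/nvme0n1p1
  # only if a separate, faster device exists for the WAL:
  # ceph-volume lvm create --bluestore --data /dev/sdX --block.db /dev/sdY1 --block.wal /dev/nvme0n1p2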

Re: [ceph-users] Slow requests troubleshooting in Luminous - details missing

2018-03-02 Thread Maged Mokhtar
On 2018-03-02 07:54, Alex Gorbachev wrote: > On Thu, Mar 1, 2018 at 10:57 PM, David Turner wrote: > Blocked requests and slow requests are synonyms in ceph. They are 2 names > for the exact same thing. > > On Thu, Mar 1, 2018, 10:21 PM Alex Gorbachev > wrote: > On Thu, Mar 1, 2018 at 2:47 PM

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-09 Thread Maged Mokhtar
Hi Mike, > For the easy case, the SCSI command is sent directly to krbd and so if > osd_request_timeout is less than M seconds then the command will be > failed in time and we would not hit the problem above. > If something happens in the target stack like the SCSI command gets > stuck/queued the

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-10 Thread Maged Mokhtar
-- From: "Jason Dillaman" Sent: Sunday, March 11, 2018 1:46 AM To: "shadow_lin" Cc: "Lazuardi Nasution" ; "Ceph Users" Subject: Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock On Sat, Mar 10, 2018 at 10:11 AM, shadow

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Maged Mokhtar
On 2018-03-12 14:23, David Disseldorp wrote: > On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote: > >> 2)I undertand that before switching the path, the initiator will send a >> TMF ABORT can we pass this to down to the same abort_request() function >> in osd

Re: [ceph-users] Fwd: [ceph bad performance], can't find a bottleneck

2018-03-12 Thread Maged Mokhtar
Hi, Try increasing the queue depth from default 128 to 1024: rbd map image-XX -o queue_depth=1024 Also if you run multiple rbd images/fio tests, do you get higher combined performance ? Maged On 2018-03-12 17:16, Sergey Kotov wrote: > Dear moderator, i subscribed to ceph list today, cou

Re: [ceph-users] iSCSI Multipath (Load Balancing) vs RBD Exclusive Lock

2018-03-12 Thread Maged Mokhtar
On 2018-03-12 21:00, Ilya Dryomov wrote: > On Mon, Mar 12, 2018 at 7:41 PM, Maged Mokhtar wrote: > >> On 2018-03-12 14:23, David Disseldorp wrote: >> >> On Fri, 09 Mar 2018 11:23:02 +0200, Maged Mokhtar wrote: >> >> 2)I undertand that before switching
