Re: [ceph-users] read performance, separate client CRUSH maps or limit osd read access from each client

2018-11-18 Thread Konstantin Shalygin

On 11/17/18 1:07 AM, Vlad Kopylov wrote:

This is what Jean suggested. I understand it, and it works with the primary.
*But what I need is for all clients to access the same files, not separate
sets (like red, blue, green).*


You should look at other solutions, like GlusterFS; Ceph is too much
overhead for this use case, IMHO.




k




Re: [ceph-users] Fwd: what are the potential risks of mixed cluster and client ms_type

2018-11-18 Thread Piotr Dałek

On 2018-11-19 8:17 a.m., Honggang(Joseph) Yang wrote:

Thank you, but I encountered a problem:
https://tracker.ceph.com/issues/37300

I don't know whether this is caused by the mixed use of messenger types.


Have you done basic troubleshooting, like checking osd.179 networking? 
Usually this means firewall or network hardware issues.
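
For reference, a rough sketch of that basic troubleshooting (osd.179 comes
from the tracker report; host, interface and port range are placeholders to
adjust for your environment):

ceph osd find 179                          # which host/address osd.179 lives on
ping -c 3 <osd-host>                       # basic reachability
nc -zv <osd-host-ip> 6800-7300             # OSD port range, if your nc accepts ranges
ethtool -S <iface> | grep -iE 'drop|err'   # NIC errors/drops
iptables -L -n                             # firewall rules in the way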


--
Piotr Dałek
piotr.da...@corp.ovh.com
https://www.ovhcloud.com


Re: [ceph-users] get cephfs mounting clients' infomation

2018-11-18 Thread Yan, Zheng
On Mon, Nov 19, 2018 at 3:06 PM Zhenshi Zhou  wrote:
>
> Many thanks Yan!
>
> This command can get IP, hostname, mounting point and kernel version. All
> of these data are exactly what I need.
> Besides, is there a way I can get the sub directory's usage other than the 
> whole
> cephfs usage from the server. For instance, I have /docker, /kvm, /backup, 
> etc.
> I wanna know how much space is taken up by each of them.
>

'getfattr -d -m - sub-dir'
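
For example (a hedged sketch; the mount point /mnt/cephfs is an assumption,
the directory names are taken from the question), CephFS exposes recursive
statistics as virtual extended attributes, so per-directory usage can be
read directly:

getfattr -n ceph.dir.rbytes /mnt/cephfs/docker     # recursive bytes under /docker
getfattr -n ceph.dir.rfiles /mnt/cephfs/kvm        # recursive file count
getfattr -n ceph.dir.rentries /mnt/cephfs/backup   # recursive files + subdirectories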

> Thanks.
>
> Yan, Zheng wrote on Mon, Nov 19, 2018 at 2:50 PM:
>>
>> 'ceph daemon mds.xx session ls'
>> On Mon, Nov 19, 2018 at 2:40 PM Zhenshi Zhou  wrote:
>> >
>> > Hi,
>> >
>> > I have a cluster providing cephfs and it looks well. But as times
>> > goes by, more and more clients use it. I wanna write a script
>> > for getting the clients' informations so that I can keep everything
>> > in good order.
>> >
>> > I google a lot but dont find any solution which I can get clients
>> > information. Is there a way for me to get statistics, such as clients'
>> > IP or mounting point and etc, so that I can deal with it.
>> >
>> > Thanks.


Re: [ceph-users] get cephfs mounting clients' infomation

2018-11-18 Thread Zhenshi Zhou
Many thanks, Yan!

This command gives me the IP, hostname, mount point and kernel version; all
of this data is exactly what I need.
Besides that, is there a way to get a subdirectory's usage, rather than the
whole CephFS usage, from the server? For instance, I have /docker, /kvm,
/backup, etc., and I want to know how much space is taken up by each of them.

Thanks.

Yan, Zheng wrote on Mon, Nov 19, 2018 at 2:50 PM:

> 'ceph daemon mds.xx session ls'
> On Mon, Nov 19, 2018 at 2:40 PM Zhenshi Zhou  wrote:
> >
> > Hi,
> >
> > I have a cluster providing cephfs and it looks well. But as times
> > goes by, more and more clients use it. I wanna write a script
> > for getting the clients' informations so that I can keep everything
> > in good order.
> >
> > I google a lot but dont find any solution which I can get clients
> > information. Is there a way for me to get statistics, such as clients'
> > IP or mounting point and etc, so that I can deal with it.
> >
> > Thanks.
>


Re: [ceph-users] Fwd: what are the potential risks of mixed cluster and client ms_type

2018-11-18 Thread Piotr Dałek

On 2018-11-19 5:05 a.m., Honggang(Joseph) Yang wrote:

Hello,

Our cluster-side ms_type is async, while the client-side ms_type is
simple. I want to know whether this is a proper way to run, and what the
potential risks are.


None, as long as Ceph doesn't complain about the async messenger being
experimental; both messengers speak the same wire protocol.
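
To double-check which messenger each side is actually running (a hedged
sketch; daemon names and the admin-socket path are placeholders), the live
value can be read from the admin sockets:

ceph daemon osd.0 config get ms_type        # on the OSD host
ceph daemon mon.<id> config get ms_type     # on a monitor host
# for a client, check its ceph.conf, or its admin socket if it exposes one:
ceph --admin-daemon /var/run/ceph/ceph-client.<name>.asok config get ms_type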


--
Piotr Dałek
piotr.da...@corp.ovh.com
https://www.ovhcloud.com


Re: [ceph-users] get cephfs mounting clients' infomation

2018-11-18 Thread Yan, Zheng
'ceph daemon mds.xx session ls'
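
For a script, a rough sketch of pulling the interesting fields out of that
JSON with jq (run on the node hosting the active MDS; field names such as
inst, client_metadata.hostname and client_metadata.root can differ between
releases, so verify them against your own output first):

ceph daemon mds.<name> session ls \
  | jq -r '.[] | [.inst, .client_metadata.hostname, .client_metadata.root] | @tsv'
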
On Mon, Nov 19, 2018 at 2:40 PM Zhenshi Zhou  wrote:
>
> Hi,
>
> I have a cluster providing cephfs and it looks well. But as times
> goes by, more and more clients use it. I wanna write a script
> for getting the clients' informations so that I can keep everything
> in good order.
>
> I google a lot but dont find any solution which I can get clients
> information. Is there a way for me to get statistics, such as clients'
> IP or mounting point and etc, so that I can deal with it.
>
> Thanks.


[ceph-users] get cephfs mounting clients' infomation

2018-11-18 Thread Zhenshi Zhou
Hi,

I have a cluster providing CephFS and it is working well. But as time
goes by, more and more clients are using it. I want to write a script
that gathers the clients' information so that I can keep everything
in good order.

I googled a lot but didn't find a way to get client information.
Is there a way to get statistics, such as the clients' IPs or mount
points, so that I can keep track of them?

Thanks.


[ceph-users] openstack swift multitenancy problems with ceph RGW

2018-11-18 Thread Dilip Renkila
Hi all,

We are provisioning the OpenStack Swift API through Ceph RGW (Mimic). We
have problems when trying to create two containers with the same name in
two different projects. After scraping the web, I learned that I have to
enable rgw_keystone_implicit_tenants in the Ceph conf file, but it has no
effect. Is multitenancy really supported for Swift? Can anyone share a
working ceph.conf?

ceph conf file


[client.rgw.ctrl1]
host = ctrl1
keyring = /var/lib/ceph/radosgw/ceph-rgw.ctrl1/keyring
log file = /var/log/ceph/ceph-rgw-ctrl1.log
rgw frontends = civetweb port=10.70.1.1:8080 num_threads=100


[client.rgw.ctrl2]
host = ctrl2
keyring = /var/lib/ceph/radosgw/ceph-rgw.ctrl2/keyring
log file = /var/log/ceph/ceph-rgw-ctrl2.log
rgw frontends = civetweb port=10.70.1.2:8080 num_threads=100

[client.rgw.ctrl3]
host = ctrl3
keyring = /var/lib/ceph/radosgw/ceph-rgw.ctrl3/keyring
log file = /var/log/ceph/ceph-rgw-ctrl3.log
rgw frontends = civetweb port=10.70.1.3:8080 num_threads=100


# Please do not change this file directly since it is managed by Ansible
# and will be overwritten
[global]
cluster network = 10.60.0.0/22
fsid = 6cb9b9ca-7cdd-4200-a311-5b132ddc89f7
mon host = 10.70.1.1,10.70.1.2,10.70.1.3
mon initial members = ctrl1,ctrl2,ctrl3
public network = 10.70.0.0/22
rgw bucket default quota max objects = 1638400
rgw override bucket index max shards = 16
mon_max_pg_per_osd = 500
# Keystone information
rgw_keystone_url = http://10.20.0.199:5000
rgw keystone api version = 3
rgw_keystone_admin_user = swift
rgw_keystone_admin_password = password
rgw_keystone_admin_domain = default
rgw_keystone_admin_project = service

rgw_keystone_accepted_roles = admin,user,project-admin,cloud-admin
rgw_keystone_token_cache_size = 0
rgw_keystone_revocation_interval = 0
rgw_keystone_make_new_tenants = true
rgw_keystone_implicit_tenants = true
rgw_s3_auth_use_keystone = true
#nss_db_path = {path to nss db}
rgw_keystone_verify_ssl = false
mon_pg_warn_max_object_skew = 1000

#mon_allow_pool_delete = true
[mon]
mgr initial modules = dashboard, prometheus

[osd]
bluestore_block_db_size = 200
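
A hedged way to check whether implicit tenants are actually taking effect
(standard radosgw-admin commands; as far as I understand, the option only
applies to users created after it was enabled, and each tenant then gets
its own container namespace, which is what allows same-named containers in
different projects):

radosgw-admin user list
# with implicit tenants, new Keystone-created users should be listed
# tenant-qualified, i.e. <tenant>$<uid>, rather than as bare uids
radosgw-admin user info --uid=<uid> --tenant=<keystone-project-id>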




Best Regards / Kind Regards

Dilip Renkila
Linux / Unix SysAdmin



Linserv AB
Direct: +46 8 473 60 64
Mobile: +46 705080243
dilip.renk...@linserv.se

www.linserv.se


Re: [ceph-users] Huge latency spikes

2018-11-18 Thread Ashley Merrick
Ah yes, sorry, that will be because you're behind a RAID card.

You need to check the RAID config. On an HP card, for example, there is an
option called "enable disk cache".

This is separate from enabling the RAID card's cache; the setting should be
per drive (it is on HP), so it is worth checking the config output from your
RAID CLI.
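
For the MegaRAID/PERC setup below, a hedged sketch of checking and toggling
the per-drive cache (MegaCli syntax quoted from memory, so verify it against
your MegaCli version; N is the megaraid device number from -PDList). Note
that hdparm cannot see through the RAID card, which is why it reports SG_IO
errors, but smartctl with the megaraid passthrough usually can:

/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -LAll -aAll   # controller's per-LD disk cache policy
smartctl -g wcache -d megaraid,N /dev/sdb                          # what the drive itself reports
# to change it: MegaCli64 -LDSetProp -EnDskCache (or -DisDskCache) -LAll -aAll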

On Mon, 19 Nov 2018 at 12:47 AM, Alex Litvak 
wrote:

> Hmm,
>
> On all nodes
>
>   hdparm -W /dev/sdb
>
> /dev/sdb:
> SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0d 00 00 00 00
> 20 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>   write-caching = not supported
>
>
> On 11/18/2018 10:30 AM, Ashley Merrick wrote:
> > hdparm -W /dev/xxx should show you
> >
> > On Mon, 19 Nov 2018 at 12:28 AM, Alex Litvak <
> alexander.v.lit...@gmail.com > wrote:
> >
> > All machines state the same.
> >
> > /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -Lall -a0
> >
> > Adapter 0-VD 0(target id: 0): Disk Write Cache : Disk's Default
> > Adapter 0-VD 1(target id: 1): Disk Write Cache : Disk's Default
> >
> > I assume they are all on which is actually bad based on common sense.
> >
> >
> https://notesbytom.wordpress.com/2016/10/21/dell-perc-megaraid-disk-cache-policy/
> >
> > An I couldn't find how to confirm it if it is true but vendor
> wouldn't ship drives with cache disabled.
> >
> > I am getting logs in the controller log which are not shown on other
> servers
> >
> > 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03407a0, localAddr
> e03407a0
> > 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03409e0, localAddr
> e03409e0
> > 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340c20, localAddr
> e0340c20
> > 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340e60, localAddr
> e0340e60
> > 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03410a0, localAddr
> e03410a0
> > 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03401a0, localAddr
> e03401a0
> >
> > Not sure if it has any relation to the issue of latency but search
> returned nothing substantial.
> >
> > On 11/18/2018 8:52 AM, Serkan Çoban wrote:
> >  > I am not saying controller cache, you should check ssd disk
> caches.
> >  > On Sun, Nov 18, 2018 at 11:40 AM Alex Litvak
> >  >  alexander.v.lit...@gmail.com>> wrote:
> >  >>
> >  >> All 3 nodes have this status for SSD mirror.  Controller cache
> is on for all 3.
> >  >>
> >  >> Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write
> Cache if Bad BBU
> >  >> Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write
> Cache if Bad BBU
> >  >>
> >  >>
> >  >> On 11/18/2018 12:45 AM, Serkan Çoban wrote:
> >  >>> Does write cache on SSDs enabled on three servers? Can you
> check them?
> >  >>> On Sun, Nov 18, 2018 at 9:05 AM Alex Litvak
> >  >>>  alexander.v.lit...@gmail.com>> wrote:
> >  
> >   Raid card for journal disks is Perc H730 (Megaraid), RAID 1,
> battery back cache is on
> >  
> >   Default Cache Policy: WriteBack, ReadAdaptive, Direct, No
> Write Cache if Bad BBU
> >   Current Cache Policy: WriteBack, ReadAdaptive, Direct, No
> Write Cache if Bad BBU
> >  
> >   I have  2 other nodes with older Perc H710 and similar SSDs
> with slightly higher wear (6.3% vs 5.18%) but from observation they hardly
> hit 1.5 ms on rear occasion
> >   Cache, RAID, and battery situation is the same.
> >  
> >   On 11/17/2018 11:38 PM, Serkan Çoban wrote:
> >  >> 10ms w_await for SSD is too much. How that SSD is connected
> to the system? Any raid card installed on this system? What is the raid
> mode?
> >  > On Sun, Nov 18, 2018 at 8:25 AM Alex Litvak
> >  >  alexander.v.lit...@gmail.com>> wrote:
> >  >>
> >  >> Here is another snapshot.  I wonder if this write io wait is
> too big
> >  >> Device: rrqm/s   wrqm/s r/s w/srkB/s
> wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> >  >> dm-14 0.00 0.000.00   23.00 0.00
>  336.0029.22 0.34   14.740.00   14.74   2.87   6.60
> >  >> dm-15 0.00 0.000.00   16.00 0.00
>  200.0025.00 0.010.750.000.75   0.75   1.20
> >  >> dm-16 0.00 0.000.00   17.00 0.00
>  276.0032.47 0.25   14.940.00   14.94   3.35   5.70
> >  >> dm-17 0.00 0.000.00   17.00 0.00
>  252.0029.65 0.32   18.650.00   18.65   4.00   6.80
> >  >> dm-18 0.00 0.000.00   15.00 0.00
>  152.0020.27 0.25   16.800.00   16.80   4.07   6.10
> >  >> dm-19 0.00 0.000.00   13.00 0.00
>  152.0023.38 0.21   15.920.00   15.92   4.85   6.30
> >  >> dm-20 0.00 0.000.00   20.00 0.00
>  248.00 

Re: [ceph-users] Huge latency spikes

2018-11-18 Thread Alex Litvak

Hmm,

On all nodes

 hdparm -W /dev/sdb

/dev/sdb:
SG_IO: bad/missing sense data, sb[]:  70 00 05 00 00 00 00 0d 00 00 00 00 20 00 
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 write-caching = not supported


On 11/18/2018 10:30 AM, Ashley Merrick wrote:

hdparm -W /dev/xxx should show you

On Mon, 19 Nov 2018 at 12:28 AM, Alex Litvak <alexander.v.lit...@gmail.com> wrote:

All machines state the same.

/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -Lall -a0

Adapter 0-VD 0(target id: 0): Disk Write Cache : Disk's Default
Adapter 0-VD 1(target id: 1): Disk Write Cache : Disk's Default

I assume they are all on which is actually bad based on common sense.


https://notesbytom.wordpress.com/2016/10/21/dell-perc-megaraid-disk-cache-policy/

An I couldn't find how to confirm it if it is true but vendor wouldn't ship 
drives with cache disabled.

I am getting logs in the controller log which are not shown on other servers

11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03407a0, localAddr e03407a0
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03409e0, localAddr e03409e0
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340c20, localAddr e0340c20
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340e60, localAddr e0340e60
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03410a0, localAddr e03410a0
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03401a0, localAddr e03401a0

Not sure if it has any relation to the issue of latency but search returned 
nothing substantial.

On 11/18/2018 8:52 AM, Serkan Çoban wrote:
 > I am not saying controller cache, you should check ssd disk caches.
 > On Sun, Nov 18, 2018 at 11:40 AM Alex Litvak
 > mailto:alexander.v.lit...@gmail.com>> 
wrote:
 >>
 >> All 3 nodes have this status for SSD mirror.  Controller cache is on 
for all 3.
 >>
 >> Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache 
if Bad BBU
 >> Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache 
if Bad BBU
 >>
 >>
 >> On 11/18/2018 12:45 AM, Serkan Çoban wrote:
 >>> Does write cache on SSDs enabled on three servers? Can you check them?
 >>> On Sun, Nov 18, 2018 at 9:05 AM Alex Litvak
 >>> mailto:alexander.v.lit...@gmail.com>> 
wrote:
 
  Raid card for journal disks is Perc H730 (Megaraid), RAID 1, battery 
back cache is on
 
  Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache 
if Bad BBU
  Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache 
if Bad BBU
 
  I have  2 other nodes with older Perc H710 and similar SSDs with 
slightly higher wear (6.3% vs 5.18%) but from observation they hardly hit 1.5 ms on rear 
occasion
  Cache, RAID, and battery situation is the same.
 
  On 11/17/2018 11:38 PM, Serkan Çoban wrote:
 >> 10ms w_await for SSD is too much. How that SSD is connected to the 
system? Any raid card installed on this system? What is the raid mode?
 > On Sun, Nov 18, 2018 at 8:25 AM Alex Litvak
 > mailto:alexander.v.lit...@gmail.com>> 
wrote:
 >>
 >> Here is another snapshot.  I wonder if this write io wait is too big
 >> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s 
avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
 >> dm-14             0.00     0.00    0.00   23.00     0.00   336.00   
 29.22     0.34   14.74    0.00   14.74   2.87   6.60
 >> dm-15             0.00     0.00    0.00   16.00     0.00   200.00   
 25.00     0.01    0.75    0.00    0.75   0.75   1.20
 >> dm-16             0.00     0.00    0.00   17.00     0.00   276.00   
 32.47     0.25   14.94    0.00   14.94   3.35   5.70
 >> dm-17             0.00     0.00    0.00   17.00     0.00   252.00   
 29.65     0.32   18.65    0.00   18.65   4.00   6.80
 >> dm-18             0.00     0.00    0.00   15.00     0.00   152.00   
 20.27     0.25   16.80    0.00   16.80   4.07   6.10
 >> dm-19             0.00     0.00    0.00   13.00     0.00   152.00   
 23.38     0.21   15.92    0.00   15.92   4.85   6.30
 >> dm-20             0.00     0.00    0.00   20.00     0.00   248.00   
 24.80     0.27   13.60    0.00   13.60   3.25   6.50
 >> dm-21             0.00     0.00    0.00   17.00     0.00   188.00   
 22.12     0.27   16.00    0.00   16.00   3.59   6.10
 >> dm-22             0.00     0.00    0.00   20.00     0.00   156.00   
 15.60     0.11    5.55    0.00    5.55   2.95   5.90
 >> dm-24             0.00     0.00    0.00    8.00     0.00    56.00   
 14.00     0.12   14.62    0.00   14.62   4.75   3.80
 >> dm-25             0.00     0.00    0.00   19.00     0.00   200.00   
 21.05     0.21   10.89    0.00   10.89   2.74   5.20
 >>
 >> Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    

Re: [ceph-users] Huge latency spikes

2018-11-18 Thread Ashley Merrick
hdparm -W /dev/xxx should show you

On Mon, 19 Nov 2018 at 12:28 AM, Alex Litvak 
wrote:

> All machines state the same.
>
> /opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -Lall -a0
>
> Adapter 0-VD 0(target id: 0): Disk Write Cache : Disk's Default
> Adapter 0-VD 1(target id: 1): Disk Write Cache : Disk's Default
>
> I assume they are all on which is actually bad based on common sense.
>
>
> https://notesbytom.wordpress.com/2016/10/21/dell-perc-megaraid-disk-cache-policy/
>
> An I couldn't find how to confirm it if it is true but vendor wouldn't
> ship drives with cache disabled.
>
> I am getting logs in the controller log which are not shown on other
> servers
>
> 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03407a0, localAddr e03407a0
> 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03409e0, localAddr e03409e0
> 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340c20, localAddr e0340c20
> 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340e60, localAddr e0340e60
> 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03410a0, localAddr e03410a0
> 11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03401a0, localAddr e03401a0
>
> Not sure if it has any relation to the issue of latency but search
> returned nothing substantial.
>
> On 11/18/2018 8:52 AM, Serkan Çoban wrote:
> > I am not saying controller cache, you should check ssd disk caches.
> > On Sun, Nov 18, 2018 at 11:40 AM Alex Litvak
> >  wrote:
> >>
> >> All 3 nodes have this status for SSD mirror.  Controller cache is on
> for all 3.
> >>
> >> Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache
> if Bad BBU
> >> Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache
> if Bad BBU
> >>
> >>
> >> On 11/18/2018 12:45 AM, Serkan Çoban wrote:
> >>> Does write cache on SSDs enabled on three servers? Can you check them?
> >>> On Sun, Nov 18, 2018 at 9:05 AM Alex Litvak
> >>>  wrote:
> 
>  Raid card for journal disks is Perc H730 (Megaraid), RAID 1, battery
> back cache is on
> 
>  Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache
> if Bad BBU
>  Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache
> if Bad BBU
> 
>  I have  2 other nodes with older Perc H710 and similar SSDs with
> slightly higher wear (6.3% vs 5.18%) but from observation they hardly hit
> 1.5 ms on rear occasion
>  Cache, RAID, and battery situation is the same.
> 
>  On 11/17/2018 11:38 PM, Serkan Çoban wrote:
> >> 10ms w_await for SSD is too much. How that SSD is connected to the
> system? Any raid card installed on this system? What is the raid mode?
> > On Sun, Nov 18, 2018 at 8:25 AM Alex Litvak
> >  wrote:
> >>
> >> Here is another snapshot.  I wonder if this write io wait is too big
> >> Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> >> dm-14 0.00 0.000.00   23.00 0.00   336.00
>   29.22 0.34   14.740.00   14.74   2.87   6.60
> >> dm-15 0.00 0.000.00   16.00 0.00   200.00
>   25.00 0.010.750.000.75   0.75   1.20
> >> dm-16 0.00 0.000.00   17.00 0.00   276.00
>   32.47 0.25   14.940.00   14.94   3.35   5.70
> >> dm-17 0.00 0.000.00   17.00 0.00   252.00
>   29.65 0.32   18.650.00   18.65   4.00   6.80
> >> dm-18 0.00 0.000.00   15.00 0.00   152.00
>   20.27 0.25   16.800.00   16.80   4.07   6.10
> >> dm-19 0.00 0.000.00   13.00 0.00   152.00
>   23.38 0.21   15.920.00   15.92   4.85   6.30
> >> dm-20 0.00 0.000.00   20.00 0.00   248.00
>   24.80 0.27   13.600.00   13.60   3.25   6.50
> >> dm-21 0.00 0.000.00   17.00 0.00   188.00
>   22.12 0.27   16.000.00   16.00   3.59   6.10
> >> dm-22 0.00 0.000.00   20.00 0.00   156.00
>   15.60 0.115.550.005.55   2.95   5.90
> >> dm-24 0.00 0.000.008.00 0.0056.00
>   14.00 0.12   14.620.00   14.62   4.75   3.80
> >> dm-25 0.00 0.000.00   19.00 0.00   200.00
>   21.05 0.21   10.890.00   10.89   2.74   5.20
> >>
> >> Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s
> avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
> >> dm-14 0.00 0.000.00   11.00 0.00   136.00
>   24.73 0.119.730.009.73   1.82   2.00
> >> dm-15 0.00 0.000.00   12.00 0.00   136.00
>   22.67 0.043.750.003.75   1.08   1.30
> >> dm-16 0.00 0.000.009.00 0.00   104.00
>   23.11 0.09   10.440.00   10.44   2.44   2.20
> >> dm-17 0.00 0.000.005.00 0.00   160.00
>   64.00 0.02

Re: [ceph-users] Huge latency spikes

2018-11-18 Thread Alex Litvak

All machines state the same.

/opt/MegaRAID/MegaCli/MegaCli64 -LDGetProp -DskCache -Lall -a0

Adapter 0-VD 0(target id: 0): Disk Write Cache : Disk's Default
Adapter 0-VD 1(target id: 1): Disk Write Cache : Disk's Default

I assume they are all on, which, based on common sense, is actually bad.

https://notesbytom.wordpress.com/2016/10/21/dell-perc-megaraid-disk-cache-policy/

And I couldn't find a way to confirm whether that is true, but a vendor
wouldn't ship drives with the cache disabled.

I am seeing entries in the controller log that do not show up on the other
servers:

11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03407a0, localAddr e03407a0
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03409e0, localAddr e03409e0
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340c20, localAddr e0340c20
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e0340e60, localAddr e0340e60
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03410a0, localAddr e03410a0
11/18/18  8:21:55: C0:SysDma: localAddrPlb 50e03401a0, localAddr e03401a0

I'm not sure whether this is related to the latency issue, but searching
turned up nothing substantial.

On 11/18/2018 8:52 AM, Serkan Çoban wrote:

I am not saying controller cache, you should check ssd disk caches.
On Sun, Nov 18, 2018 at 11:40 AM Alex Litvak
 wrote:


All 3 nodes have this status for SSD mirror.  Controller cache is on for all 3.

Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU


On 11/18/2018 12:45 AM, Serkan Çoban wrote:

Does write cache on SSDs enabled on three servers? Can you check them?
On Sun, Nov 18, 2018 at 9:05 AM Alex Litvak
 wrote:


Raid card for journal disks is Perc H730 (Megaraid), RAID 1, battery back cache 
is on

Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU

I have  2 other nodes with older Perc H710 and similar SSDs with slightly 
higher wear (6.3% vs 5.18%) but from observation they hardly hit 1.5 ms on rear 
occasion
Cache, RAID, and battery situation is the same.

On 11/17/2018 11:38 PM, Serkan Çoban wrote:

10ms w_await for SSD is too much. How that SSD is connected to the system? Any 
raid card installed on this system? What is the raid mode?

On Sun, Nov 18, 2018 at 8:25 AM Alex Litvak
 wrote:


Here is another snapshot.  I wonder if this write io wait is too big
Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
avgqu-sz   await r_await w_await  svctm  %util
dm-14 0.00 0.000.00   23.00 0.00   336.0029.22 
0.34   14.740.00   14.74   2.87   6.60
dm-15 0.00 0.000.00   16.00 0.00   200.0025.00 
0.010.750.000.75   0.75   1.20
dm-16 0.00 0.000.00   17.00 0.00   276.0032.47 
0.25   14.940.00   14.94   3.35   5.70
dm-17 0.00 0.000.00   17.00 0.00   252.0029.65 
0.32   18.650.00   18.65   4.00   6.80
dm-18 0.00 0.000.00   15.00 0.00   152.0020.27 
0.25   16.800.00   16.80   4.07   6.10
dm-19 0.00 0.000.00   13.00 0.00   152.0023.38 
0.21   15.920.00   15.92   4.85   6.30
dm-20 0.00 0.000.00   20.00 0.00   248.0024.80 
0.27   13.600.00   13.60   3.25   6.50
dm-21 0.00 0.000.00   17.00 0.00   188.0022.12 
0.27   16.000.00   16.00   3.59   6.10
dm-22 0.00 0.000.00   20.00 0.00   156.0015.60 
0.115.550.005.55   2.95   5.90
dm-24 0.00 0.000.008.00 0.0056.0014.00 
0.12   14.620.00   14.62   4.75   3.80
dm-25 0.00 0.000.00   19.00 0.00   200.0021.05 
0.21   10.890.00   10.89   2.74   5.20

Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s avgrq-sz 
avgqu-sz   await r_await w_await  svctm  %util
dm-14 0.00 0.000.00   11.00 0.00   136.0024.73 
0.119.730.009.73   1.82   2.00
dm-15 0.00 0.000.00   12.00 0.00   136.0022.67 
0.043.750.003.75   1.08   1.30
dm-16 0.00 0.000.009.00 0.00   104.0023.11 
0.09   10.440.00   10.44   2.44   2.20
dm-17 0.00 0.000.005.00 0.00   160.0064.00 
0.024.000.004.00   4.00   2.00
dm-18 0.00 0.000.005.00 0.0052.0020.80 
0.035.800.005.80   3.60   1.80
dm-19 0.00 0.000.00   10.00 0.00   104.0020.80 
0.087.900.007.90   2.10   2.10
dm-20 0.00 0.000.009.00 0.00   132.0029.33 
0.10   11.220.00   11.22   2.56   2.30
dm-21 0.00 0.000.006.00 0.0068.0022.67   

Re: [ceph-users] Huge latency spikes

2018-11-18 Thread Serkan Çoban
I am not talking about the controller cache; you should check the SSDs' own disk caches.
On Sun, Nov 18, 2018 at 11:40 AM Alex Litvak
 wrote:
>
> All 3 nodes have this status for SSD mirror.  Controller cache is on for all 
> 3.
>
> Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad 
> BBU
> Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad 
> BBU
>
>
> On 11/18/2018 12:45 AM, Serkan Çoban wrote:
> > Does write cache on SSDs enabled on three servers? Can you check them?
> > On Sun, Nov 18, 2018 at 9:05 AM Alex Litvak
> >  wrote:
> >>
> >> Raid card for journal disks is Perc H730 (Megaraid), RAID 1, battery back 
> >> cache is on
> >>
> >> Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if 
> >> Bad BBU
> >> Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if 
> >> Bad BBU
> >>
> >> I have  2 other nodes with older Perc H710 and similar SSDs with slightly 
> >> higher wear (6.3% vs 5.18%) but from observation they hardly hit 1.5 ms on 
> >> rear occasion
> >> Cache, RAID, and battery situation is the same.
> >>
> >> On 11/17/2018 11:38 PM, Serkan Çoban wrote:
>  10ms w_await for SSD is too much. How that SSD is connected to the 
>  system? Any raid card installed on this system? What is the raid mode?
> >>> On Sun, Nov 18, 2018 at 8:25 AM Alex Litvak
> >>>  wrote:
> 
>  Here is another snapshot.  I wonder if this write io wait is too big
>  Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s 
>  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>  dm-14 0.00 0.000.00   23.00 0.00   336.00
>  29.22 0.34   14.740.00   14.74   2.87   6.60
>  dm-15 0.00 0.000.00   16.00 0.00   200.00
>  25.00 0.010.750.000.75   0.75   1.20
>  dm-16 0.00 0.000.00   17.00 0.00   276.00
>  32.47 0.25   14.940.00   14.94   3.35   5.70
>  dm-17 0.00 0.000.00   17.00 0.00   252.00
>  29.65 0.32   18.650.00   18.65   4.00   6.80
>  dm-18 0.00 0.000.00   15.00 0.00   152.00
>  20.27 0.25   16.800.00   16.80   4.07   6.10
>  dm-19 0.00 0.000.00   13.00 0.00   152.00
>  23.38 0.21   15.920.00   15.92   4.85   6.30
>  dm-20 0.00 0.000.00   20.00 0.00   248.00
>  24.80 0.27   13.600.00   13.60   3.25   6.50
>  dm-21 0.00 0.000.00   17.00 0.00   188.00
>  22.12 0.27   16.000.00   16.00   3.59   6.10
>  dm-22 0.00 0.000.00   20.00 0.00   156.00
>  15.60 0.115.550.005.55   2.95   5.90
>  dm-24 0.00 0.000.008.00 0.0056.00
>  14.00 0.12   14.620.00   14.62   4.75   3.80
>  dm-25 0.00 0.000.00   19.00 0.00   200.00
>  21.05 0.21   10.890.00   10.89   2.74   5.20
> 
>  Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s 
>  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>  dm-14 0.00 0.000.00   11.00 0.00   136.00
>  24.73 0.119.730.009.73   1.82   2.00
>  dm-15 0.00 0.000.00   12.00 0.00   136.00
>  22.67 0.043.750.003.75   1.08   1.30
>  dm-16 0.00 0.000.009.00 0.00   104.00
>  23.11 0.09   10.440.00   10.44   2.44   2.20
>  dm-17 0.00 0.000.005.00 0.00   160.00
>  64.00 0.024.000.004.00   4.00   2.00
>  dm-18 0.00 0.000.005.00 0.0052.00
>  20.80 0.035.800.005.80   3.60   1.80
>  dm-19 0.00 0.000.00   10.00 0.00   104.00
>  20.80 0.087.900.007.90   2.10   2.10
>  dm-20 0.00 0.000.009.00 0.00   132.00
>  29.33 0.10   11.220.00   11.22   2.56   2.30
>  dm-21 0.00 0.000.006.00 0.0068.00
>  22.67 0.07   12.330.00   12.33   3.83   2.30
>  dm-22 0.00 0.000.003.00 0.0020.00
>  13.33 0.013.670.003.67   3.67   1.10
>  dm-24 0.00 0.000.004.00 0.0024.00
>  12.00 0.07   18.000.00   18.00   5.25   2.10
>  dm-25 0.00 0.000.006.00 0.0064.00
>  21.33 0.06   10.330.00   10.33   3.67   2.20
> 
>  Device: rrqm/s   wrqm/s r/s w/srkB/swkB/s 
>  avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
>  dm-14 0.00 0.000.005.00 0.00   140.00
>  56.00 0.08   15.20

[ceph-users] Ceph balancer history and clarity

2018-11-18 Thread Marc Roos



- If my cluster is not well balanced, do I have to run the balancer's
execute step several times, because it only optimizes in small steps?

- Is there some history of applied plans, so I can see how optimizing
brings down the reported final score of 0.054781?

- How can I get the current score?

- I have some 8TB, 4TB (the majority) and 3TB drives; should I keep
crush-compat or move to upmap?

- What would be good MIN/MAX and STDDEV values in 'ceph osd df'?
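
For the score and plan questions above, a hedged sketch of the relevant
balancer commands (Luminous-era mgr balancer module; the plan name is
arbitrary):

ceph balancer status             # mode and whether it is active
ceph balancer eval               # current cluster score (lower is better)
ceph balancer optimize myplan
ceph balancer eval myplan        # expected score if the plan were applied
ceph balancer show myplan        # the individual steps
ceph balancer execute myplan

As far as I know there is no built-in history of executed plans, so logging
the eval output before and after each execute is the usual workaround.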




Re: [ceph-users] Use SSDs for metadata or for a pool cache?

2018-11-18 Thread Marc Roos
 

- Everyone here will tell you not to use a 2x replica; use some erasure
code instead if you want to save space.
- I cannot say much about the cache pool; we did not use it, and I read
some things that made me doubt it would be useful for us. We decided to
put some VMs on an SSD RBD pool instead. When starting with Ceph, maybe
keep things simple; if something goes wrong it is easier to fix.
- Just try HDD performance first: our Linux VMs have no problem running on
HDDs only, but our Windows VMs do. And remember that if you lose the SSD,
you lose all OSDs attached to it, which could have a high impact on a small
cluster.
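
For the "SSDs for BlueStore metadata" option asked about below, a minimal
sketch (device names are assumptions: /dev/sdc is a data HDD, /dev/sdb1 a
partition on one of the SSDs) of creating an OSD with its DB on the SSD;
when only --block.db is given, the WAL lives on the same device:

ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdb1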




-Original Message-
From: Gesiel Galvão Bernardes [mailto:gesiel.bernar...@gmail.com] 
Sent: zondag 18 november 2018 0:03
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Use SSDs for metadata or for a pool cache?


Hello,
I am building a new cluster with 4 hosts, which have the following 
configuration:

128GB RAM
12 SATA HDDs, 8TB, 7.2k rpm
2 SSDs, 240GB
2x 10Gb network

I will use the cluster to store RBD images of VMs. I was thinking of using
2x replication, if it does not get too slow.

My question is: using BlueStore (the default since Luminous, right?), should
I use the SSDs as a "cache pool" or use them to store the BlueStore
metadata? Or could I use one SSD for metadata and the other for a "cache
pool"?

Thank you in advance for your opinions.

Gesiel

