Re: [ceph-users] ceph-ansible firewalld blocking ceph comms

2019-07-26 Thread Nathan Harper
The firewalld service 'ceph' includes the range of ports required.
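To double-check exactly what that service opens, the definition can be inspected
directly (on recent firewalld packages the MON ports live in a separate
'ceph-mon' service, so it is worth looking at both):

firewall-cmd --info-service=ceph        # typically 6800-7300/tcp for OSD/MGR/MDS daemons
firewall-cmd --info-service=ceph-mon    # typically 3300/tcp and 6789/tcp for the monitors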

Not sure why it helped, but after a reboot of each OSD node the issue went
away!
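If it happens again, it may be worth comparing the runtime and permanent
firewalld configuration before reaching for a reboot; a mismatch between the
two (for example, a change written only to the permanent config and never
reloaded, or a runtime-only change) would behave exactly like this and would
be cleared by a reboot. A quick sketch of the checks:

firewall-cmd --zone=public --list-all               # runtime config actually being enforced
firewall-cmd --permanent --zone=public --list-all   # config that applies after a reload/reboot
firewall-cmd --get-active-zones                     # confirm the OSD interfaces/sources land in the expected zone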

On Thu, 25 Jul 2019 at 23:14,  wrote:

> Nathan;
>
> I'm not an expert on firewalld, but shouldn't you have a list of open
> ports?
>
>  ports: ?
>
> Here's the configuration on my test cluster:
> public (active)
>   target: default
>   icmp-block-inversion: no
>   interfaces: bond0
>   sources:
>   services: ssh dhcpv6-client
>   ports: 6789/tcp 3300/tcp 6800-7300/tcp 8443/tcp
>   protocols:
>   masquerade: no
>   forward-ports:
>   source-ports:
>   icmp-blocks:
>   rich rules:
> trusted (active)
>   target: ACCEPT
>   icmp-block-inversion: no
>   interfaces: bond1
>   sources:
>   services:
>   ports: 6789/tcp 3300/tcp 6800-7300/tcp 8443/tcp
>   protocols:
>   masquerade: no
>   forward-ports:
>   source-ports:
>   icmp-blocks:
>   rich rules:
>
> I use interfaces as selectors, but I would think source selectors would work
> the same.
>
> You might start by adding the MON ports to the firewall on the MONs:
> firewall-cmd --zone=public --add-port=6789/tcp --permanent
> firewall-cmd --zone=public --add-port=3300/tcp --permanent
> firewall-cmd --reload
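>
> If your distribution ships the Ceph firewalld service definitions, an
> equivalent approach (just a sketch, check which services your packages
> actually provide) is to open the services rather than the individual ports:
>
> firewall-cmd --zone=public --add-service=ceph-mon --permanent   # MON nodes: 3300/tcp, 6789/tcp
> firewall-cmd --zone=public --add-service=ceph --permanent       # OSD/MGR/MDS nodes: 6800-7300/tcp
> firewall-cmd --reload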
>
> Thank you,
>
> Dominic L. Hilsbos, MBA
> Director – Information Technology
> Perform Air International Inc.
> dhils...@performair.com
> www.PerformAir.com
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Nathan Harper
> Sent: Thursday, July 25, 2019 2:08 PM
> To: ceph-us...@ceph.com
> Subject: Re: [ceph-users] ceph-ansible firewalld blocking ceph
> comms
>
> This is a new issue to us, and we did not have the same problem running the
> same activity on our test system.
> Regards,
> Nathan
>
> On 25 Jul 2019, at 22:00, solarflow99  wrote:
> I used ceph-ansible just fine, never had this problem.
>
> On Thu, Jul 25, 2019 at 1:31 PM Nathan Harper 
> wrote:
> Hi all,
>
> We've run into a strange issue with one of our clusters managed with
> ceph-ansible.   We're adding some RGW nodes to our cluster, and so re-ran
> site.yml against the cluster.  The new RGWs were added successfully, but as
> soon as we did, we started to get slow requests, effectively across the
> whole cluster.   Quickly we realised that the firewall was now (apparently)
> blocking Ceph communications.   I say apparently, because the config looks
> correct:
>
> [root@osdsrv05 ~]# firewall-cmd --list-all
> public (active)
>   target: default
>   icmp-block-inversion: no
>   interfaces:
>   sources: 172.20.22.0/24 172.20.23.0/24
>   services: ssh dhcpv6-client ceph
>   ports:
>   protocols:
>   masquerade: no
>   forward-ports:
>   source-ports:
>   icmp-blocks:
>   rich rules:
>
> If we drop the firewall everything goes back to healthy.   All the clients
> (Openstack cinder) are on the 172.20.22.0 network (172.20.23.0 is the
> replication network).  Has anyone seen this?
> --
> Nathan Harper // IT Systems Lead
>


-- 
*Nathan Harper* // IT Systems Lead

*e: *nathan.har...@cfms.org.uk   *t*: 0117 906 1104  *m*:  0787 551 0891
*w: *www.cfms.org.uk
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons
Green // Bristol // BS16 7FR

CFMS Services Ltd is registered in England and Wales No 05742022 - a
subsidiary of CFMS Ltd
CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1
4QP
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-ansible firewalld blocking ceph comms

2019-07-25 Thread Nathan Harper
This is a new issue to us, and we did not have the same problem running the
same activity on our test system.

Regards,
Nathan

> On 25 Jul 2019, at 22:00, solarflow99  wrote:
> 
> I used ceph-ansible just fine, never had this problem.  
> 
>> On Thu, Jul 25, 2019 at 1:31 PM Nathan Harper  
>> wrote:
>> Hi all,
>> 
>> We've run into a strange issue with one of our clusters managed with
>> ceph-ansible.   We're adding some RGW nodes to our cluster, and so re-ran
>> site.yml against the cluster.  The new RGWs were added successfully, but as
>> soon as we did, we started to get slow requests, effectively across the
>> whole cluster.   Quickly we realised that the firewall was now (apparently)
>> blocking Ceph communications.   I say apparently, because the config looks
>> correct:
>> 
>>> [root@osdsrv05 ~]# firewall-cmd --list-all
>>> public (active)
>>>   target: default
>>>   icmp-block-inversion: no
>>>   interfaces:
>>>   sources: 172.20.22.0/24 172.20.23.0/24
>>>   services: ssh dhcpv6-client ceph
>>>   ports:
>>>   protocols:
>>>   masquerade: no
>>>   forward-ports:
>>>   source-ports:
>>>   icmp-blocks:
>>>   rich rules:
>> 
>> If we drop the firewall everything goes back to healthy.   All the clients
>> (Openstack cinder) are on the 172.20.22.0 network (172.20.23.0 is the 
>> replication network).  Has anyone seen this?
>> -- 
>> Nathan Harper // IT Systems Lead
>> 
>> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-ansible firewalld blocking ceph comms

2019-07-25 Thread Nathan Harper
Hi all,

We've run into a strange issue with one of our clusters managed with
ceph-ansible.   We're adding some RGW nodes to our cluster, and so re-ran
site.yml against the cluster.  The new RGWs were added successfully, but as
soon as we did, we started to get slow requests, effectively across the whole
cluster.   Quickly we realised that the firewall was now (apparently)
blocking Ceph communications.   I say apparently, because the config looks
correct:

[root@osdsrv05 ~]# firewall-cmd --list-all
> public (active)
>   target: default
>   icmp-block-inversion: no
>   interfaces:
>   sources: 172.20.22.0/24 172.20.23.0/24
>   services: ssh dhcpv6-client ceph
>   ports:
>   protocols:
>   masquerade: no
>   forward-ports:
>   source-ports:
>   icmp-blocks:
>   rich rules:
>

If we drop the firewall everything goes back to healthy.   All the clients
(Openstack cinder) are on the 172.20.22.0 network (172.20.23.0 is the
replication network).  Has anyone seen this?
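For anyone hitting something similar, a few checks that help narrow down
whether the zone shown above is actually the one handling the Ceph traffic
(values taken from the output above, purely illustrative):

firewall-cmd --get-active-zones                      # which zones are active, and what they are bound to
firewall-cmd --get-zone-of-source=172.20.22.0/24     # which zone handles the client network
firewall-cmd --zone=public --query-service=ceph      # is the ceph service present in the runtime config?
firewall-cmd --permanent --zone=public --list-all    # compare the permanent config against the runtime one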
-- 
*Nathan Harper* // IT Systems Lead
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph zabbix monitoring

2019-06-27 Thread Nathan Harper
Have you configured any encryption on your Zabbix infrastructure?   We took
a brief look at ceph+Zabbix a while ago, and the exporter didn't have the
capability to use encryption.   I don't know if it's changed in the
meantime though.
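
If encryption is the blocker, one way to narrow it down is to take the mgr
module out of the loop and run zabbix_sender by hand from the active mgr host
(as far as I recall the module just shells out to it). Roughly, with the item
key below being a placeholder for one from the Ceph template:

ceph zabbix config-show
zabbix_sender -vv -z <zabbix-server> -p 10051 -s <identifier-from-config-show> -k <item-key-from-template> -o 0

If the manual send fails too, the problem is on the Zabbix/network side rather
than in the mgr module; the active mgr's log usually shows the underlying
zabbix_sender error as well.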

On Thu, 27 Jun 2019 at 09:43, Majid Varzideh  wrote:

> Hi friends
> I have installed Ceph Mimic with Zabbix 3.0. I configured everything to
> monitor my cluster with Zabbix and I can see data in the Zabbix frontend,
> but 'ceph -s' says "Failed to send data to Zabbix".
> Why does this happen?
> My Ceph version: ceph version 13.2.6
> (7b695f835b03642f85998b2ae7b6dd093d9fbce4) mimic (stable)
> Zabbix 3.0.14
> Thanks,
>


-- 
*Nathan Harper* // IT Systems Lead

*e: *nathan.har...@cfms.org.uk   *t*: 0117 906 1104  *m*:  0787 551 0891
*w: *www.cfms.org.uk
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons
Green // Bristol // BS16 7FR

CFMS Services Ltd is registered in England and Wales No 05742022 - a
subsidiary of CFMS Ltd
CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1
4QP
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Latest recommendations on sizing

2019-03-28 Thread Nathan Harper
Hi,

We are looking at extending one of our Ceph clusters, currently running
Luminous.   The cluster is all SSD, providing RBD to Openstack, using 70
OSDs on 5 hosts.

We have a couple of projects kicking off that will need significantly more,
albeit slower storage.  I am looking at speccing out some new OSD nodes
with higher capacity spinning drives.

We are deploying 25GbE these days, so I am not worried about network
bandwidth (and have taken on board recent comments suggesting that there is
no reason to run separate cluster/public networks).

What about CPUs - is it still worth 2x CPUs?  Our current OSD hosts have 2x
CPUs but neither seems particularly busy.  Would a single higher-spec CPU
win out over dual lower-spec CPUs, taking on board previous discussion that
GHz is king?

SSD/NVMe for WAL etc?   We're running Bluestore on all of our SSD OSDs with
colocated WAL.
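
For spinning OSDs it generally pays to put the RocksDB/WAL on flash. A minimal
ceph-volume sketch for that layout (device names are placeholders, and the DB
partition is commonly sized at a few percent of the data device):

ceph-volume lvm create --bluestore --data /dev/sdX --block.db /dev/nvme0n1pY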

We are looking to provide ~500TB into a separate (non-default) storage
pool, and so would appreciate suggestions about where my money should be
going (or not going).
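
As a purely illustrative sizing check, assuming 3x replication and 12 TB
drives: 500 TB usable x 3 = 1.5 PB raw, and keeping OSDs at or below roughly
80% full pushes that to about 1.9 PB of raw capacity, i.e. somewhere around
150-160 drives, spread across enough hosts that losing one does not take out
too large a fraction of the pool.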
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Debugging fstrim issues

2018-01-29 Thread Nathan Harper
Hi,

Thanks all for your quick responses.   In my enthusiasm to test I might
have been masking the problem, plus I didn't know what the output of
'fstrim' should actually show.

Firstly, to answer the question: yes, I have the relevant libvirt config,
and have set the correct virtio-scsi settings in the image configuration.
I am also not expecting to see anything in userland from a 'df'.

After re-running the tests from the linked article once an instance had been
created, it initially appeared that trimming wasn't working, but the number
of objects was growing significantly after logging in (almost doubling), and
this was masking the effect of the trim.

tl;dr - it does actually appear to be working, but thank you for the
responses and also for confirming that there was no Ceph-specific
configuration required.
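
For anyone wanting to verify the same thing, a before-and-after comparison
along these lines makes the reclaim visible ('volumes' and the image name are
placeholders for your Cinder pool and volume):

rbd du volumes/volume-<uuid>        # note the USED column
# run 'sudo fstrim -v /' inside the guest
rbd du volumes/volume-<uuid>        # USED should drop once the discards have been processed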

On 29 January 2018 at 12:00, Ric Wheeler <rwhee...@redhat.com> wrote:

> I might have missed something in the question.
>
> Fstrim does not free up space at the user level that you see with a normal
> df.
>
> It is meant to let the block device know about all of the space unused by
> the file system.
>
> Regards,
>
> Ric
>
>
> On Jan 29, 2018 11:56 AM, "Wido den Hollander" <w...@42on.com> wrote:
>
>>
>>
>> On 01/29/2018 12:29 PM, Nathan Harper wrote:
>>
>>> Hi,
>>>
>>> I don't know if this is strictly a Ceph issue, but hoping someone will
>>> be able to shed some light.   We have an Openstack environment (Ocata)
>>> backed onto a Jewel cluster.
>>>
>>> We recently ran into some issues with full OSDs but couldn't work out
>>> what was filling up the pools.
>>>
>>> It appears that fstrim isn't actually working (or being ignored) within
>>> the instances:
>>>
>>> [centos@with ~]$ df -h
>>> Filesystem  Size  Used Avail Use% Mounted on
>>> /dev/sda1   8.0G  1.2G  6.9G  14% /
>>>
>>> plenty of unused space that can be reclaimed:
>>>
>>> running fstrim:
>>> [centos@with ~]$ sudo fstrim -a -v
>>> /: 6.9 GiB (7416942592 bytes) trimmed
>>>
>>> however, running the command again:
>>>
>>> [centos@with ~]$ sudo fstrim -a -v
>>> /: 6.9 GiB (7416942592 bytes) trimmed
>>>
>>>
>> It depends on the FS you use inside the VM.
>>
>> iirc XFS will always tell you the total free space of the FS being
>> trimmed and ext4 only the actual bytes which are trimmed.
>>
>> The discard option was enabled following:
>>> https://ceph.com/geen-categorie/openstack-and-ceph-rbd-discard/
>>>
>>> Have I missed anything at the Ceph layer to enable this?   Is there
>>> anything I can do to see if the request is making its way through to the
>>> Ceph layer?
>>>
>>>
>> Try using 'rbd du' and see if it makes a difference.
>>
>> You also want to check if the XML definition of libvirt has unmap enabled
>> although I expect it does since you don't see any errors.
>>
>> Wido
>>
>>


-- 
*Nathan Harper* // IT Systems Lead

*e: *nathan.har...@cfms.org.uk   *t*: 0117 906 1104  *m*:  0787 551 0891
*w: *www.cfms.org.uk
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons
Green // Bristol // BS16 7FR

CFMS Services Ltd is registered in England and Wales No 05742022 - a
subsidiary of CFMS Ltd
CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1
4QP
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Debugging fstrim issues

2018-01-29 Thread Nathan Harper
Hi,

I don't know if this is strictly a Ceph issue, but hoping someone will be
able to shed some light.   We have an Openstack environment (Ocata) backed
onto a Jewel cluster.

We recently ran into some issues with full OSDs but couldn't work out what
was filling up the pools.

It appears that fstrim isn't actually working (or being ignored) within the
instances:

[centos@with ~]$ df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda1   8.0G  1.2G  6.9G  14% /

plenty of unused space that can be reclaimed:

running fstrim:
[centos@with ~]$ sudo fstrim -a -v
/: 6.9 GiB (7416942592 bytes) trimmed

however, running the command again:

[centos@with ~]$ sudo fstrim -a -v
/: 6.9 GiB (7416942592 bytes) trimmed

The discard option was enabled following:
https://ceph.com/geen-categorie/openstack-and-ceph-rbd-discard/

Have I missed anything at the Ceph layer to enable this?   Is there
anything I can do to see if the request is making its way through to the
Ceph layer?
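
For reference, and from memory (worth re-checking against the article and your
Ocata configuration), the settings that article relies on boil down to roughly
the following; the image name is a placeholder:

# nova.conf on the compute nodes
[libvirt]
hw_disk_discard = unmap

# Glance image properties so instances get a virtio-scsi disk (needed for discard)
openstack image set --property hw_scsi_model=virtio-scsi --property hw_disk_bus=scsi <image>

# 'virsh dumpxml <instance>' should then show discard='unmap' on the disk <driver> line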



-- 
*Nathan Harper* // IT Systems Lead

*e: *nathan.har...@cfms.org.uk   *t*: 0117 906 1104  *m*:  0787 551 0891
*w: *www.cfms.org.uk
CFMS Services Ltd // Bristol & Bath Science Park // Dirac Crescent // Emersons
Green // Bristol // BS16 7FR

CFMS Services Ltd is registered in England and Wales No 05742022 - a
subsidiary of CFMS Ltd
CFMS Services Ltd registered office // 43 Queens Square // Bristol // BS1
4QP
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com