Re: [ceph-users] ceph-ansible - where to ask questions?

2019-01-31 Thread Martin Palma
Hi Will, there is a dedicated mailing list for ceph-ansible: http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com Best, Martin On Thu, Jan 31, 2019 at 5:07 PM Will Dennis wrote: > > Hi all, > > > > Trying to utilize the ‘ceph-ansible’ project > (https://github.com/ceph/ceph-ansible ) to

Re: [ceph-users] cephfs kernel client instability

2019-01-28 Thread Martin Palma
Upgrading to 4.15.0-43-generic fixed the problem. Best, Martin On Fri, Jan 25, 2019 at 9:43 PM Ilya Dryomov wrote: > > On Fri, Jan 25, 2019 at 9:40 AM Martin Palma wrote: > > > > > Do you see them repeating every 30 seconds? > > > > yes: > > > > Jan

Re: [ceph-users] cephfs kernel client instability

2019-01-25 Thread Martin Palma
> Do you see them repeating every 30 seconds? yes: Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4 10.8.55.203:6789 session lost, hunting for new mon Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3 10.8.55.202:6789 session lost, hunting for new mon Jan 25 09:34:37

Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Martin Palma
Hi Ilya, thank you for the clarification. After setting "osd_map_messages_max" to 10, the IO errors and the MDS error "MDS_CLIENT_LATE_RELEASE" are gone. The messages of "mon session lost, hunting for new mon" didn't go away... can it be that this is related to

Re: [ceph-users] cephfs kernel client instability

2019-01-24 Thread Martin Palma
We are experiencing the same issues on clients with CephFS mounted using the kernel client and 4.x kernels. The problem shows up when we add new OSDs, on reboots after installing patches, and when changing the weight. Here are the logs of a misbehaving client: [6242967.890611] libceph: mon4

[ceph-users] Correlate Ceph kernel module version with Ceph version

2018-12-14 Thread Martin Palma
Hello, maybe a dumb question, but is there a way to correlate the ceph kernel module version with a specific Ceph version? For example, can I figure this out using "modinfo ceph"? What's the best way to check if a specific client is running at least Luminous? Best, Martin
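
For reference, a hedged sketch of the usual checks; the kernel module version does not map cleanly onto a Ceph release, and "ceph features" is only available on Luminous or later clusters:

    # on the client: kernel module info (no direct Ceph release mapping)
    modinfo ceph | grep -i -E 'version|srcversion'
    # on a Luminous+ cluster: connected clients grouped by feature release
    ceph features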

Re: [ceph-users] list admin issues

2018-10-08 Thread Martin Palma
Same here also on Gmail with G Suite. On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich wrote: > > I'm also seeing this once every few months or so on Gmail with G Suite. > > Paul > Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen > : > > > > I also got removed once, got another warning once

Re: [ceph-users] Best handling network maintenance

2018-10-05 Thread Martin Palma
> Mons are also on a 30s timeout. > Even a short loss of quorum isn't noticeable for ongoing IO. > > Paul > > > On 04.10.2018 at 11:03, Martin Palma wrote: > > > > Also monitor election? That is the biggest fear we have since the monitor > > nodes will not see each other

Re: [ceph-users] Best handling network maintenance

2018-10-04 Thread Martin Palma
Also monitor election? That is the biggest fear we have since the monitor nodes will not see each other for that timespan... On Thu, Oct 4, 2018 at 10:21 AM Paul Emmerich wrote: > > 10 seconds is far below any relevant timeout values (generally 20-30 > seconds); so you will be fine without any

[ceph-users] Best handling network maintenance

2018-10-04 Thread Martin Palma
Hi all, our Ceph cluster is distributed across two datacenters. Due to network maintenance, the link between the two datacenters will be down for ca. 8 - 10 seconds. During this time the public network of Ceph between the two DCs will also be down. What is the best way to handle this scenario so that we have
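
A minimal sketch of the commonly suggested approach for a short, planned outage (standard Ceph flags; adjust to your setup):

    ceph osd set noout        # don't mark OSDs out during the window
    ceph osd set norebalance  # avoid data movement if anything flaps
    # ... maintenance window (~10 s link outage) ...
    ceph osd unset norebalance
    ceph osd unset noout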

Re: [ceph-users] Force unmap of RBD image

2018-09-10 Thread Martin Palma
Thanks for the suggestions; we will check for LVM volumes, etc. in the future... the kernel version is 3.10.0-327.4.4.el7.x86_64 and the OS is CentOS 7.2.1511 (Core). Best, Martin On Mon, Sep 10, 2018 at 12:23 PM Ilya Dryomov wrote: > > On Mon, Sep 10, 2018 at 10:46 AM Martin Palma

[ceph-users] Force unmap of RBD image

2018-09-10 Thread Martin Palma
We are trying to unmap an rbd image from a host for deletion and are hitting the following error: rbd: sysfs write failed rbd: unmap failed: (16) Device or resource busy We used commands like "lsof" and "fuser" but nothing is reported as using the device. Also checked for watchers with "rados -p pool
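
For reference, a hedged sketch of the usual checks and the force-unmap option; pool/image names are hypothetical and "-o force" requires a kernel client recent enough to support it:

    rbd status vms/disk01                            # list watchers, if your rbd CLI supports it
    rados -p vms listwatchers rbd_header.<image-id>  # watchers on a format-2 image header
    rbd unmap -o force /dev/rbd0                     # last resort; needs kernel support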

[ceph-users] How to secure Prometheus endpoints (mgr plugin and node_exporter)

2018-06-29 Thread Martin Palma
Since Prometheus uses a pull model over HTTP for collecting metrics, what are the best practices to secure these HTTP endpoints? - With a reverse proxy with authentication? - Export the node_exporter only on the cluster network? (not usable for the mgr plugin and for nodes like mons, mdss,...) -
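
A hedged sketch of the two simplest options mentioned above (addresses are hypothetical):

    # bind node_exporter to the cluster-network interface only
    node_exporter --web.listen-address="192.168.128.11:9100"
    # or restrict the scrape port to the Prometheus host via the firewall
    iptables -A INPUT -p tcp --dport 9100 -s 192.168.128.5 -j ACCEPT
    iptables -A INPUT -p tcp --dport 9100 -j DROP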

[ceph-users] Question on cluster balance and data distribution

2018-06-08 Thread Martin Palma
Hi all, In our current production cluster we have the following CRUSH hierarchy, see https://pastebin.com/640Q4XSH or the attached image. This reflects 1:1 the real physical deployment. We currently also use a replication factor of 3 with the following CRUSH rule on our pools: rule hdd_replicated {

[ceph-users] CephFS get directory size without mounting the fs

2018-04-18 Thread Martin Palma
Hello, Is it possible to get directory/file layout information (size, pool) of a CephFS directory directly from a metadata server without the need to mount the fs? Or, even better, through the restful plugin... When mounted, I can get info about the directory/file layout using the getfattr command...
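
For the mounted case referenced above, a minimal sketch using the CephFS virtual xattrs (mount point and path are hypothetical):

    getfattr -n ceph.dir.rbytes /mnt/cephfs/projects       # recursive size in bytes
    getfattr -n ceph.dir.rentries /mnt/cephfs/projects     # recursive file/dir count
    getfattr -n ceph.dir.layout.pool /mnt/cephfs/projects  # data pool of the directory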

Re: [ceph-users] Updating standby mds from 12.2.2 to 12.2.4 caused up:active 12.2.2 mds's to suicide

2018-03-21 Thread Martin Palma
Just ran into this problem on our production cluster. It would have been nice if the release notes of 12.2.4 had been updated to inform users about this. Best, Martin On Wed, Mar 14, 2018 at 9:53 PM, Gregory Farnum wrote: > On Wed, Mar 14, 2018 at 12:41 PM, Lars

Re: [ceph-users] reweight-by-utilization reverse weight after adding new nodes?

2018-02-28 Thread Martin Palma
data distribution should be the same. If you were to reset the > weights for the previous OSDs, you would only incur an additional round of > reweighting for no discernible benefit. > > On Mon, Feb 26, 2018 at 7:13 AM Martin Palma <mar...@palma.bz> wrote: >> >> Hell

[ceph-users] reweight-by-utilization reverse weight after adding new nodes?

2018-02-26 Thread Martin Palma
Hello, for some OSDs in our cluster we got the "nearfull" warning message, so we ran the "ceph osd reweight-by-utilization" command to better distribute the data. Now that we have expanded our cluster with new nodes, should we reset the weight of the changed OSDs to 1.0? Best, Martin
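
A hedged sketch of the two options, assuming the override weights were set by reweight-by-utilization (the OSD id is hypothetical):

    ceph osd reweight 42 1.0               # reset the override weight of a single OSD
    ceph osd test-reweight-by-utilization  # dry run of another reweight round
    ceph osd reweight-by-utilization       # apply it once the expansion has settled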

[ceph-users] librados for MacOS

2017-08-03 Thread Martin Palma
Hello, is there a way to get librados for MacOS? Has anybody tried to build librados for MacOS? Is this even possible? Best, Martin

Re: [ceph-users] Ceph kraken: Calamari Centos7

2017-07-20 Thread Martin Palma
Hi, Calamari is deprecated; it was replaced by ceph-mgr [0], from what I know. Bye, Martin [0] http://docs.ceph.com/docs/master/mgr/ On Wed, Jul 19, 2017 at 6:28 PM, Oscar Segarra wrote: > Hi, > > Anybody has been able to setup Calamari on Centos7?? > > I've done a

Re: [ceph-users] hammer -> jewel 10.2.8 upgrade and setting sortbitwise

2017-07-18 Thread Martin Palma
Can the "sortbitwise" also be set if we have a cluster running OSDs on 10.2.6 and some OSDs on 10.2.9? Or should we wait that all OSDs are on 10.2.9? Monitor nodes are already on 10.2.9. Best, Martin On Fri, Jul 14, 2017 at 1:16 PM, Dan van der Ster wrote: > On Mon, Jul

Re: [ceph-users] Stealth Jewel release?

2017-07-14 Thread Martin Palma
Thank you for the clarification and yes we saw that v10.2.9 was just released. :-) Best, Martin On Fri, Jul 14, 2017 at 3:53 PM, Patrick Donnelly <pdonn...@redhat.com> wrote: > On Fri, Jul 14, 2017 at 12:26 AM, Martin Palma <mar...@palma.bz> wrote: >> So only the ceph-mds i

Re: [ceph-users] Stealth Jewel release?

2017-07-14 Thread Martin Palma
So only the ceph-mds is affected? Let's say if we have mons and osds on 10.2.8 and the MDS on 10.2.6 or 10.2.7 we would be "safe"? I'm asking since we need to add new storage nodes to our production cluster. Best, Martin On Wed, Jul 12, 2017 at 10:44 PM, Patrick Donnelly

Re: [ceph-users] XFS attempt to access beyond end of device

2017-03-22 Thread Martin Palma
> [429280.254400] attempt to access beyond end of device > [429280.254412] sdi1: rw=0, want=19134412768, limit=19134412767 We are seeing the same for our OSDs which have the journal as a separate partition always on the same disk and only for OSDs which we added after our cluster was upgraded to

Re: [ceph-users] mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail

2017-01-31 Thread Martin Palma
Hi Wido, thank you for the clarification. We will wait until recovery is over; we have plenty of space on the mons :-) Best, Martin On Tue, Jan 31, 2017 at 10:35 AM, Wido den Hollander <w...@42on.com> wrote: > >> On 31 January 2017 at 10:22, Martin Palma <mar...@palma.bz

[ceph-users] mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail

2017-01-31 Thread Martin Palma
Hi all, our cluster is currently performing a big expansion and is in recovery mode (we doubled in size and OSD count, going from 600 TB to 1.2 PB). Now we get the following message from our monitor nodes: mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail Reading [0] it says that it is
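
Once recovery has finished, the monitor store can usually be trimmed and compacted; a hedged sketch, using the monitor id from the warning above:

    ceph tell mon.mon01 compact   # manual compaction, typically run after recovery completes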

Re: [ceph-users] Question about user's key

2017-01-20 Thread Martin Palma
Could you pls tell me where it is on the monitor nodes? Only in memory, or > persisted in any files or DBs? Looks like it's not just in memory but I > cannot find where those values are saved, thanks! > > Best Regards, > Dave Chen > > From: Martin Palma [mailto:mar...@palma.bz]

Re: [ceph-users] Question about user's key

2017-01-19 Thread Martin Palma
Hi, They are stored on the monitor nodes. Best, Martin On Fri, 20 Jan 2017 at 04:53, Chen, Wei D wrote: > Hi, > > > > I have read through some documents about authentication and user > management about ceph, everything works fine with me, I can create > > a user and

Re: [ceph-users] RBD key permission to unprotect a rbd snapshot

2017-01-15 Thread Martin Palma
need to ensure that the user that will > perform the "snap unprotect" has the "allow class-read object_prefix > rbd_children" on all pools [1]. > > [1] http://docs.ceph.com/docs/master/man/8/ceph-authtool/#capabilities > > On Thu, Jan 12, 2017 at 10:56 AM, Martin Palm

[ceph-users] RBD key permission to unprotect a rbd snapshot

2017-01-12 Thread Martin Palma
Hi all, what permissions do I need to unprotect a protected rbd snapshot? Currently the key interacting with the pool containing the rbd image has the following permissions: mon 'allow r' osd 'allow rwx pool=vms' When I try to unprotect a snapshot with the following command "rbd snap unprotect
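
Per the reply quoted above in this thread (class-read on the rbd_children prefix), a hedged sketch of the adjusted caps; the client name is hypothetical:

    ceph auth caps client.vms \
        mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=vms'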

Re: [ceph-users] Prevent cephfs clients from mount and browsing "/"

2016-12-07 Thread Martin Palma
Thanks all for the clarification. Best, Martin On Mon, Dec 5, 2016 at 2:14 PM, John Spray <jsp...@redhat.com> wrote: > On Mon, Dec 5, 2016 at 12:35 PM, David Disseldorp <dd...@suse.de> wrote: >> Hi Martin, >> >> On Mon, 5 Dec 2016 13:27:01 +0100, Martin Palma

Re: [ceph-users] Prevent cephfs clients from mount and browsing "/"

2016-12-05 Thread Martin Palma
Ok, just discovered that with the fuse client we have to add the '-r /path' option to treat that path as root. So I assume the caps 'mds allow r' are only needed if we also want to be able to mount the directory with the kernel client. Right? Best, Martin On Mon, Dec 5, 2016 at 1:20 PM, Martin Palma

[ceph-users] Prevent cephfs clients from mount and browsing "/"

2016-12-05 Thread Martin Palma
Hello, is it possible to prevent cephfs clients from mounting the root of a cephfs filesystem and browsing through it? We want to restrict cephfs clients to a particular directory, but when we define a specific cephx auth key for a client we need to add the following caps: "mds 'allow r'" which then gives
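
A hedged sketch of path-restricted caps as documented for Jewel-era CephFS; client name, path, and pool name are hypothetical:

    ceph auth get-or-create client.projectA \
        mon 'allow r' \
        mds 'allow r, allow rw path=/projectA' \
        osd 'allow rw pool=cephfs_data'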

Re: [ceph-users] Antw: Re: Best practices for extending a ceph cluster with minimal client impact data movement

2016-11-18 Thread Martin Palma
> I was wondering how exactly you accomplish that? > Can you do this with a "ceph-deploy create" with "noin" or "noup" flags > set, or does one need to follow the manual steps of adding an osd? You can do it either way (manual or with ceph-deploy). Here are the steps using ceph-deploy: 1. Add

Re: [ceph-users] Best practices for extending a ceph cluster with minimal client impact data movement

2016-08-25 Thread Martin Palma
42on.com> wrote: > >> Op 9 augustus 2016 om 17:44 schreef Martin Palma <mar...@palma.bz>: >> >> >> Hi Wido, >> >> thanks for your advice. >> > > Just keep in mind, you should update the CRUSHMap in one big bang. The >

Re: [ceph-users] Best practices for extending a ceph cluster with minimal client impact data movement

2016-08-09 Thread Martin Palma
Hi Wido, thanks for your advice. Best, Martin On Tue, Aug 9, 2016 at 10:05 AM, Wido den Hollander <w...@42on.com> wrote: > >> On 8 August 2016 at 16:45, Martin Palma <mar...@palma.bz> wrote: >> >> >> Hi all, >> >> we are in the process o

[ceph-users] Best practices for extending a ceph cluster with minimal client impact data movement

2016-08-08 Thread Martin Palma
Hi all, we are in the process of expanding our cluster and I would like to know if there are some best practices for doing so. Our current cluster is composed as follows: - 195 OSDs (14 Storage Nodes) - 3 Monitors - Total capacity 620 TB - Used 360 TB We will expand the cluster by another 14

Re: [ceph-users] ceph health

2016-07-18 Thread Martin Palma
I assume you installed Ceph using 'ceph-deploy'. I noticed the same thing on CentOS when deploying a cluster for testing... As Wido already noted the OSDs are marked as down & out. From each OSD node you can do a "ceph-disk activate-all" to start the OSDs. On Mon, Jul 18, 2016 at 12:59 PM, Wido

Re: [ceph-users] repomd.xml: [Errno 14] HTTP Error 404 - Not Found on download.ceph.com for rhel7

2016-07-08 Thread Martin Palma
It seems that the packages "ceph-release-*.noarch.rpm" contain a ceph.repo pointing to the baseurl "http://ceph.com/rpm-hammer/rhel7/$basearch" which does not exist. It should probably point to "http://ceph.com/rpm-hammer/el7/$basearch". - Martin On Thu, Jul 7, 2016 at 5:57 P
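
A hedged sketch of the corrected repo entry as suggested above (verify the path before use):

    # /etc/yum.repos.d/ceph.repo
    [ceph]
    name=Ceph packages
    baseurl=http://ceph.com/rpm-hammer/el7/$basearch
    gpgcheck=1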

[ceph-users] repomd.xml: [Errno 14] HTTP Error 404 - Not Found on download.ceph.com for rhel7

2016-07-07 Thread Martin Palma
Hi All, it seems that the "rhel7" folder/symlink on "download.ceph.com/rpm-hammer" does not exist anymore, and therefore ceph-deploy fails to deploy a new cluster. Just tested it by setting up a new lab environment. We have the same issue on our production cluster currently, which keeps us of

Re: [ceph-users] Failing upgrade from Hammer to Jewel on Centos 7

2016-06-15 Thread Martin Palma
; > -Original Message- > From: m...@palma.bz [mailto:m...@palma.bz] On Behalf Of Martin Palma > Sent: Wednesday, June 15, 2016 16:03 > To: DAVY Stephane OBS/OCB > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Failing upgrade from Hammer to Jewel on Cento

Re: [ceph-users] Failing upgrade from Hammer to Jewel on Centos 7

2016-06-15 Thread Martin Palma
Hi Stéphane, We had the same issue: https://www.mail-archive.com/ceph-users%40lists.ceph.com/msg27507.html Since then we have applied the fix suggested by Dan by simply adding "ceph-disk activate-all" to rc.local. Best, Martin On Wed, Jun 15, 2016 at 10:39 AM, wrote:
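
The workaround referenced above boils down to one line (sketch; on CentOS 7 make sure rc.local is executable so it actually runs at boot):

    # appended to /etc/rc.local
    ceph-disk activate-all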

Re: [ceph-users] Calculating PG in an mixed environment

2016-03-15 Thread Martin Palma
Michael J. Kidd > Sr. Software Maintenance Engineer > Red Hat Ceph Storage > +1 919-442-8878 > > On Tue, Mar 15, 2016 at 11:41 AM, Martin Palma <mar...@palma.bz> wrote: >> >> Hi all, >> >> The documentation [0] gives us the following formula

[ceph-users] Calculating PG in an mixed environment

2016-03-15 Thread Martin Palma
Hi all, The documentation [0] gives us the following formula for calculating the number of PGs if the cluster is bigger than 50 OSDs: Total PGs = (OSDs * 100) / pool size. When we have mixed storage servers (HDD disks and SSD disks) and we have
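
A worked example of the formula (numbers are illustrative only):

    # 200 OSDs, replication size 3:
    #   (200 * 100) / 3 = 6666.7  -> round up to the next power of two = 8192 PGs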

Re: [ceph-users] After Reboot no OSD disks mountend

2016-03-07 Thread Martin Palma
: > Hi, > > To clarify, I didn't notice this issue in 0.94.6 specifically... I > just don't trust the udev magic to work every time after every kernel > upgrade, etc. > > -- Dan > > On Mon, Mar 7, 2016 at 10:20 AM, Martin Palma <mar...@palma.bz> wrote: >> Hi Dan, >> &g

Re: [ceph-users] After Reboot no OSD disks mountend

2016-03-07 Thread Martin Palma
local. > (We use this all the time anyway just in case...) > > -- Dan > > On Mon, Mar 7, 2016 at 9:38 AM, Martin Palma <mar...@palma.bz> wrote: >> Hi All, >> >> we are in the middle of patching our OSD servers and noticed that >> after rebooting no OSD

[ceph-users] After Reboot no OSD disks mountend

2016-03-07 Thread Martin Palma
Hi All, we are in the middle of patching our OSD servers and noticed that after rebooting no OSD disk is mounted and therefore no OSD service starts. We then have to manually call "ceph-disk-activate /dev/sdX1" for all our disks in order to mount and start the OSD services again. Here are the

Re: [ceph-users] ceph-deploy create-initial errors out with "Some monitors have still not reached quorum"

2016-01-05 Thread Martin Palma
Hi Maruthi, happy to hear that it is working now. Yes, with the latest stable release, infernalis, the "ceph" username is reserved for the Ceph daemons. Best, Martin On Tuesday, 5 January 2016, Maruthi Seshidhar wrote: > Thank you Martin, > > Yes, "nslookup "

Re: [ceph-users] ceph-deploy create-initial errors out with "Some monitors have still not reached quorum"

2016-01-01 Thread Martin Palma
Hi Maruthi, and did you test that DNS name lookup works properly (e.g. nslookup ceph-mon1 etc...) on all hosts? From the output of 'ceph-deploy' it seems that the host can only resolve its own name but not the others: [ceph-mon1][DEBUG ] "monmap": { [ceph-mon1][DEBUG ] "created":

Re: [ceph-users] recommendations for file sharing

2015-12-15 Thread Martin Palma
Currently, we use approach #1 with kerberized NFSv4 and Samba (with AD as KDC) - desperately waiting for CephFS :-) Best, Martin On Tue, Dec 15, 2015 at 11:51 AM, Wade Holler wrote: > Keep it simple is my approach. #1 > > If needed Add rudimentary HA with pacemaker. > >

Re: [ceph-users] ceph-deploy mon create failing with exception

2015-10-12 Thread Martin Palma
Hi, from what I'm seeing, your ceph.conf isn't quite right if we take into account your cluster description "...with one monitor node and one osd...". The parameters "mon_initial_members" and "mon_host" should only contain monitor nodes, not all the nodes in your cluster. Moreover, you should
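
A hedged sketch of the relevant ceph.conf lines (hostname and address are hypothetical):

    [global]
    mon_initial_members = ceph-mon1    # monitor hosts only, not OSD hosts
    mon_host = 192.168.1.10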

Re: [ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Martin Palma
my perspective. > > > > And a swap partition is still needed even though the memory is big. > > Martin Palma 于 2015年9月18日,下午11:07写道: Hi, > > > > Is it a good idea to use a software raid for the system disk (Operating > > System) on a Ceph storage node?

[ceph-users] Software Raid 1 for system disks on storage nodes (not for OSD disks)

2015-09-18 Thread Martin Palma
Hi, Is it a good idea to use a software raid for the system disk (Operating System) on a Ceph storage node? I mean only for the OS not for the OSD disks. And what about a swap partition? Is that needed? Best, Martin

Re: [ceph-users] SSD disk distribution

2015-06-01 Thread Martin Palma
should be your correct approach. Hope this helps, Thanks Regards Somnath *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf Of *Martin Palma *Sent:* Saturday, May 30, 2015 1:37 AM *To:* ceph-users@lists.ceph.com *Subject:* [ceph-users] SSD disk distribution

[ceph-users] SSD disk distribution

2015-05-30 Thread Martin Palma
Hello, We are planning to deploy our first Ceph cluster with 14 storage nodes and 3 monitor nodes. The storage nodes have 12 SATA disks and 4 SSDs. We plan to use 2 of the SSDs as journal disks and 2 for cache tiering. Now the question was raised in our team whether it would be better to put all SSDs, let's