Re: [ceph-users] Cephfs kernel driver availability

2018-07-23 Thread Michael Kuriger
If you're using CentOS/RHEL you can try the elrepo kernels
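For reference, pulling in a mainline kernel from elrepo on CentOS 7 looks
roughly like this (a sketch; the release RPM URL below is the usual elrepo
location and may change, so check elrepo.org first):

rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
yum --enablerepo=elrepo-kernel install kernel-ml   # mainline 4.x kernel
grub2-set-default 0                                # boot the newest kernel, then reboot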

Mike Kuriger 



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of John 
Spray
Sent: Monday, July 23, 2018 5:07 AM
To: Bryan Henderson
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Cephfs kernel driver availability

On Sun, Jul 22, 2018 at 9:03 PM Bryan Henderson  wrote:
>
> Is there some better place to get a filesystem driver for the longterm
> stable Linux kernel (3.16) than the regular kernel.org source distribution?

The general advice[1] on this is not to try to use a 3.x kernel with
CephFS.  The only exception is if your distro provider is doing
special backports (the latest RHEL releases have CephFS backports).  This
causes some confusion, because a number of distros have shipped
"stable" kernels with older, known-unstable CephFS code.

If you're building your own kernels, then you definitely want to be on
a recent 4.x.

John

1. http://docs.ceph.com/docs/master/cephfs/best-practices/#which-kernel-version

> The reason I ask is that I have been trying to get some clients running
> Linux kernel 3.16 (the current long term stable Linux kernel) and so far
> I have run into two serious bugs that, it turns out, were found and fixed
> years ago in more current mainline kernels.
>
> In both cases, I emailed Ben Hutchings, the apparent maintainer of 3.16,
> asking if the fixes could be added to 3.16, but was met with silence.  This
> leads me to believe that there are many more bugs in the 3.16 cephfs
> filesystem driver waiting for me.  Indeed, I've seen panics not yet explained.
>
> So what are other people using?  A less stable kernel?  An out-of-tree driver?
> FUSE?  Is there a working process for getting known bugs fixed in 3.16?
>
> --
> Bryan Henderson   San Jose, California
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSDs for data drives

2018-07-16 Thread Michael Kuriger
I dunno, to me benchmark tests are only really useful to compare different 
drives.


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Paul 
Emmerich
Sent: Monday, July 16, 2018 8:41 AM
To: Satish Patel
Cc: ceph-users
Subject: Re: [ceph-users] SSDs for data drives

This doesn't look like a good benchmark:

(from the blog post)

dd if=/dev/zero of=/mnt/rawdisk/data.bin bs=1G count=20 oflag=direct
1. It writes compressible data, which some SSDs might compress; you should use 
urandom instead.
2. That workload does not look like anything Ceph will do to your disk, like 
not at all.
If you want a quick estimate of an SSD in a worst-case scenario, run the usual 4k 
oflag=direct,dsync test (or better: fio).
A bad SSD will get < 1k IOPS, a good one > 10k.
But that doesn't test everything. In particular, performance might degrade as 
the disks fill up. Also, it's the absolute
worst case, i.e., a disk used for multiple journal/WAL devices.
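As a sketch of that worst-case test (assuming /dev/sdX is a disk whose contents
you can destroy; the parameters mirror the commonly used single-threaded
sync-write test):

# quick dd variant: incompressible data, synchronous 4k writes
dd if=/dev/urandom of=/dev/sdX bs=4k count=10000 oflag=direct,dsync

# fio variant: 60 seconds of 4k sync writes at queue depth 1
fio --name=sync-write-test --filename=/dev/sdX --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 \
    --runtime=60 --time_based --group_reporting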


Paul

2018-07-16 10:09 GMT-04:00 Satish Patel 
mailto:satish@gmail.com>>:
https://blog.cypressxt.net/hello-ceph-and-samsung-850-evo/

On Thu, Jul 12, 2018 at 3:37 AM, Adrian Saul
mailto:adrian.s...@tpgtelecom.com.au>> wrote:
>
>
> We started our cluster with consumer (Samsung EVO) disks and the write
> performance was pitiful; they had periodic latency spikes (an average of
> 8 ms, with much higher peaks) and just did not perform anywhere near where we
> were expecting.
>
>
>
> When replaced with SM863-based devices the difference was night and day.
> The DC-grade disks held a nearly constant low latency (constantly sub-ms), no
> spiking, and performance was massively better.  For a period I ran both
> disks in the cluster and was able to graph them side by side with the same
> workload.  This was not even a moderately loaded cluster, so I am glad we
> discovered this before we went full scale.
>
>
>
> So while you certainly can do cheap and cheerful and let the data
> availability be handled by Ceph, don’t expect the performance to keep up.
>
>
>
>
>
>
>
> From: ceph-users 
> [mailto:ceph-users-boun...@lists.ceph.com]
>  On Behalf Of
> Satish Patel
> Sent: Wednesday, 11 July 2018 10:50 PM
> To: Paul Emmerich mailto:paul.emmer...@croit.io>>
> Cc: ceph-users mailto:ceph-users@lists.ceph.com>>
> Subject: Re: [ceph-users] SSDs for data drives
>
>
>
> Prices going way up if I am picking Samsung SM863a for all data drives.
>
>
>
> We have many servers running on consumer-grade SSDs and we have never
> noticed any performance issues or faults so far (but we never used Ceph before).
>
>
>
> I thought that was the whole point of Ceph: to provide high availability if a
> drive goes down, plus parallel reads from multiple OSD nodes.
>
>
>
> Sent from my iPhone
>
>
> On Jul 11, 2018, at 6:57 AM, Paul Emmerich 
> mailto:paul.emmer...@croit.io>> wrote:
>
> Hi,
>
>
>
> we‘ve no long-term data for the SM variant.
>
> Performance is fine as far as we can tell, but the main difference between
> these two models should be endurance.
>
>
>
>
>
> Also, I forgot to mention that my experiences are only for the 1, 2, and 4
> TB variants. Smaller SSDs are often proportionally slower (especially below
> 500GB).
>
>
>
> Paul
>
>
> Robert Stanford mailto:rstanford8...@gmail.com>>:
>
> Paul -
>
>
>
>  That's extremely helpful, thanks.  I do have another cluster that uses
> Samsung SM863a just for journal (spinning disks for data).  Do you happen to
> have an opinion on those as well?
>
>
>
> On Wed, Jul 11, 2018 at 4:03 AM, Paul Emmerich 
> mailto:paul.emmer...@croit.io>>
> wrote:
>
> PM/SM863a are usually great disks and should be the default go-to option,
> they outperform
>
> even the more expensive PM1633 in our experience.
>
> (But that really doesn't matter if it's for the full OSD and not as
> dedicated WAL/journal)
>
>
>
> We got a cluster with a few hundred SanDisk Ultra II (discontinued, I
> believe) that was built on a budget.
>
> Not the best disk but great value. They have been running for ~3 years now
> with very few failures and okayish overall performance.
>
>
>
> We also got a few clusters with a few hundred SanDisk Extreme Pro, but we
> are not yet sure about their long-term durability, as they are only ~9 months
> old (an average of ~1000 write IOPS on each disk over that time).
>
> Some of them report only 50-60% lifetime left.
>
>
>
> For NVMe, the Intel NVMe 750 is still a great disk
>
>
>
> Be careful to get these exact models. Seemingly similar disks might be just
> completely bad; for example, the Samsung PM961 is just unusable for Ceph in
> our experience.
>
>
>
> Paul
>
>
>
> 2018-07-11 10:14 GMT+02:00 Wido den Hollander 
> mailto:w...@42on.com>>:
>
>
>
> On 07/11/2018 10:10 AM, 

Re: [ceph-users] Ceph Mimic on CentOS 7.5 dependency issue (liboath)

2018-06-23 Thread Michael Kuriger
CentOS 7.5 is pretty new.  Have you tried CentOS 7.4?

Mike Kuriger 
Sr. Unix Systems Engineer 



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Brian :
Sent: Saturday, June 23, 2018 1:41 AM
To: Stefan Kooman
Cc: ceph-users
Subject: Re: [ceph-users] Ceph Mimic on CentOS 7.5 dependency issue (liboath)

Hi Stefan

$ sudo yum provides liboath
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.strencom.net
 * epel: mirror.sax.uk.as61049.net
 * extras: mirror.strencom.net
 * updates: mirror.strencom.net
liboath-2.4.1-9.el7.x86_64 : Library for OATH handling
Repo: epel
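So on a stock CentOS 7.5 box the dependency should resolve once EPEL is
enabled, roughly (assuming the mimic repo itself is already configured):

yum install epel-release
yum install ceph-common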



On Sat, Jun 23, 2018 at 9:02 AM, Stefan Kooman  wrote:
> Hi list,
>
> I'm trying to install "Ceph mimic" on a CentOS 7.5 client (base
> install). I Added the "rpm-mimic" repo from our mirror and tried to
> install ceph-common, but I run into a dependency problem:
>
> --> Finished Dependency Resolution
> Error: Package: 2:ceph-common-13.2.0-0.el7.x86_64 
> (ceph.download.bit.nl_rpm-mimic_el7_x86_64)
>Requires: liboath.so.0()(64bit)
> Error: Package: 2:ceph-common-13.2.0-0.el7.x86_64 
> (ceph.download.bit.nl_rpm-mimic_el7_x86_64)
>Requires: liboath.so.0(LIBOATH_1.10.0)(64bit)
> Error: Package: 2:ceph-common-13.2.0-0.el7.x86_64 
> (ceph.download.bit.nl_rpm-mimic_el7_x86_64)
>Requires: liboath.so.0(LIBOATH_1.2.0)(64bit)
> Error: Package: 2:librgw2-13.2.0-0.el7.x86_64 
> (ceph.download.bit.nl_rpm-mimic_el7_x86_64)
>
> Is this "oath" package something I need to install from a 3rd party repo?
>
> Gr. Stefan
>
>
> --
> | BIT BV  http://www.bit.nl/   Kamer van Koophandel 09090351
> | GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Install ceph manually with some problem

2018-06-18 Thread Michael Kuriger
Don’t use the installer scripts.  Try  yum install ceph

Mike Kuriger
Sr. Unix Systems Engineer
T: 818-649-7235 M: 818-434-6195

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Ch Wan
Sent: Monday, June 18, 2018 2:40 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Install ceph manually with some problem

Hi, recently I've been trying to build Ceph Luminous on CentOS 7, following 
the documentation:
sudo ./install-deps.sh
./do_cmake.sh
cd build && sudo make install

But when I run /usr/local/bin/ceph -v, it fails with this error:
Traceback (most recent call last):
  File "/usr/local/bin/ceph", line 125, in <module>
    import rados
ImportError: No module named rados

I noticed that there were some warning messages during make install:
Copying /data/ceph/ceph/build/src/pybind/rgw/rgw.egg-info to 
/usr/local/lib64/python2.7/site-packages/rgw-2.0.0-py2.7.egg-info
running install_scripts
writing list of installed files to '/dev/null'
running install
Checking .pth file support in /usr/local/lib/python2.7/site-packages/
/bin/python2.7 -E -c pass
TEST FAILED: /usr/local/lib/python2.7/site-packages/ does NOT support .pth files
error: bad install directory or PYTHONPATH
You are attempting to install a package to a directory that is not
on PYTHONPATH and which Python does not read ".pth" files from.  The
installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:
/usr/local/lib/python2.7/site-packages/
and your PYTHONPATH environment variable currently contains:
''
Here are some of your options for correcting the problem:
* You can choose a different installation directory, i.e., one that is
  on PYTHONPATH or supports .pth files
* You can add the installation directory to the PYTHONPATH environment
  variable.  (It must then also be on PYTHONPATH whenever you run
  Python and want to use the package(s) you are installing.)
* You can set up the installation directory to support ".pth" files by
  using one of the approaches described here:
  
https://pythonhosted.org/setuptools/easy_install.html#custom-installation-locations

But there is no rados.py under /usr/local/lib/python2.7/site-packages/:
[ceph@ceph-test ceph]$ ll /usr/local/lib/python2.7/site-packages/
total 132
-rw-r--r-- 1 root root 43675 Jun  8 00:21 ceph_argparse.py
-rw-r--r-- 1 root root 14242 Jun  8 00:21 ceph_daemon.py
-rw-r--r-- 1 root root 17426 Jun  8 00:21 ceph_rest_api.py
-rw-r--r-- 1 root root 51076 Jun  8 00:21 ceph_volume_client.py

Would someone help me please?
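If you do want to keep the source build, one hedged workaround for the .pth
error above is to put the prefixes the build actually used onto PYTHONPATH
(the lib64 path is taken from the install log above; librados itself may also
need to be found under the same prefix):

export PYTHONPATH=/usr/local/lib/python2.7/site-packages:/usr/local/lib64/python2.7/site-packages
export LD_LIBRARY_PATH=/usr/local/lib64:$LD_LIBRARY_PATH   # shared libs from the same prefix
/usr/local/bin/ceph -v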
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cannot add new OSDs in mimic

2018-06-10 Thread Michael Kuriger
Oh boy! Thankfully I upgraded our sandbox cluster so I’m not in a sticky 
situation right now :-D

Mike Kuriger
Sr. Unix Systems Engineer


From: Sergey Malinin [mailto:h...@newmail.com]
Sent: Friday, June 08, 2018 4:22 PM
To: Michael Kuriger; Paul Emmerich
Cc: ceph-users
Subject: Re: [ceph-users] cannot add new OSDs in mimic

The lack of developer response (I reported the issue on Jun 4) leads me to 
believe that it's not a trivial problem and we should all be getting prepared 
for a hard time playing with osdmaptool...
On Jun 9, 2018, 02:10 +0300, Paul Emmerich , wrote:

Hi,

we are also seeing this (I've also posted to the issue tracker). It only 
affects clusters upgraded from Luminous, not new ones.
Also, it's not about re-using OSDs. Deleting any OSD seems to trigger this bug 
for all new OSDs on upgraded clusters.

We are still using the pre-Luminous way to remove OSDs, i.e.:

* ceph osd down/stop service
* ceph osd crush remove
* ceph osd auth del
* ceph osd rm
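Spelled out for a single OSD (osd.58 is used purely as an example id), that
sequence is roughly:

ceph osd down osd.58
systemctl stop ceph-osd@58      # on the OSD host
ceph osd crush remove osd.58
ceph auth del osd.58
ceph osd rm osd.58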

Paul


2018-06-08 22:14 GMT+02:00 Michael Kuriger 
mailto:mk7...@dexyp.com>>:
Hi everyone,
I appreciate the suggestions. However, this is still an issue. I've tried 
adding the OSD using ceph-deploy, and manually from the OSD host. I'm not able 
to start newly added OSDs at all, even if I use a new ID. It seems the OSD is 
added to CEPH but I cannot start it. OSDs that existed prior to the upgrade to 
mimic are working fine. Here is a copy of an OSD log entry.

osd.58 0 failed to load OSD map for epoch 378084, got 0 bytes

fsid 1ce494ac-a218-4141-9d4f-295e6fa12f2a
last_changed 2018-06-05 15:40:50.179880
created 0.00
0: 10.3.71.36:6789/0 mon.ceph-mon3
1: 10.3.74.109:6789/0 mon.ceph-mon2
2: 10.3.74.214:6789/0 mon.ceph-mon1

   -91> 2018-06-08 12:48:20.697 7fada058e700  1 -- 10.3.56.69:6800/1807239 <== 
mon.0 10.3.71.36:6789/0 7  auth_reply(proto 2 0 (0) Success) v1  194+0+0 
(645793352 0 0) 0x559f7a3dafc0 con 0x559f7994ec00
   -90> 2018-06-08 12:48:20.697 7fada058e700 10 monclient: _check_auth_rotating 
have uptodate secrets (they expire after 2018-06-08 12:47:50.699337)
   -89> 2018-06-08 12:48:20.698 7fadbc9d7140 10 monclient: wait_auth_rotating 
done
   -88> 2018-06-08 12:48:20.698 7fadbc9d7140 10 monclient: _send_command 1 
[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["58"]}]
   -87> 2018-06-08 12:48:20.698 7fadbc9d7140 10 monclient: _send_mon_message to 
mon.ceph-mon3 at 10.3.71.36:6789/0
   -86> 2018-06-08 12:48:20.698 7fadbc9d7140  1 -- 10.3.56.69:6800/1807239 --> 
10.3.71.36:6789/0 -- mon_command({"prefix": "osd crush set-device-class", 
"class": "hdd", "ids": ["58"]} v 0) v1 -- 0x559f793e73c0 con 0
   -85> 2018-06-08 12:48:20.700 7fadabaa4700  5 -- 10.3.56.69:6800/1807239 >> 
10.3.71.36:6789/0

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-08 Thread Michael Kuriger
Hi everyone,
I appreciate the suggestions. However, this is still an issue. I've tried 
adding the OSD using ceph-deploy, and manually from the OSD host. I'm not able 
to start newly added OSDs at all, even if I use a new ID. It seems the OSD is 
added to CEPH but I cannot start it. OSDs that existed prior to the upgrade to 
mimic are working fine. Here is a copy of an OSD log entry. 

osd.58 0 failed to load OSD map for epoch 378084, got 0 bytes

fsid 1ce494ac-a218-4141-9d4f-295e6fa12f2a
last_changed 2018-06-05 15:40:50.179880
created 0.00
0: 10.3.71.36:6789/0 mon.ceph-mon3
1: 10.3.74.109:6789/0 mon.ceph-mon2
2: 10.3.74.214:6789/0 mon.ceph-mon1

   -91> 2018-06-08 12:48:20.697 7fada058e700  1 -- 10.3.56.69:6800/1807239 <== 
mon.0 10.3.71.36:6789/0 7  auth_reply(proto 2 0 (0) Success) v1  
194+0+0 (645793352 0 0) 0x559f7a3dafc0 con 0x559f7994ec00
   -90> 2018-06-08 12:48:20.697 7fada058e700 10 monclient: _check_auth_rotating 
have uptodate secrets (they expire after 2018-06-08 12:47:50.699337)
   -89> 2018-06-08 12:48:20.698 7fadbc9d7140 10 monclient: wait_auth_rotating 
done
   -88> 2018-06-08 12:48:20.698 7fadbc9d7140 10 monclient: _send_command 1 
[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["58"]}]
   -87> 2018-06-08 12:48:20.698 7fadbc9d7140 10 monclient: _send_mon_message to 
mon.ceph-mon3 at 10.3.71.36:6789/0
   -86> 2018-06-08 12:48:20.698 7fadbc9d7140  1 -- 10.3.56.69:6800/1807239 --> 
10.3.71.36:6789/0 -- mon_command({"prefix": "osd crush set-device-class", 
"class": "hdd", "ids": ["58"]} v 0) v1 -- 0x559f793e73c0 con 0
   -85> 2018-06-08 12:48:20.700 7fadabaa4700  5 -- 10.3.56.69:6800/1807239 >> 
10.3.71.36:6789/0 conn(0x559f7994ec00 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=25741 cs=1 l=1). rx mon.0 seq 
8 0x559f793e73c0 mon_command_ack([{"prefix": "osd crush set-device-class", 
"class": "hdd", "ids": ["58"]}]=0 osd.58 already set to class hdd. 
set-device-class item id 58 name 'osd.58' device_class 'hdd': no change.  
v378738) v1
   -84> 2018-06-08 12:48:20.701 7fada058e700  1 -- 10.3.56.69:6800/1807239 <== 
mon.0 10.3.71.36:6789/0 8  mon_command_ack([{"prefix": "osd crush 
set-device-class", "class": "hdd", "ids": ["58"]}]=0 osd.58 already set to 
class hdd. set-device-class item id 58 name 'osd.58' device_class 'hdd': no 
change.  v378738) v1  211+0+0 (4063854475 0 0) 0x559f793e73c0 con 
0x559f7994ec00
   -83> 2018-06-08 12:48:20.701 7fada058e700 10 monclient: 
handle_mon_command_ack 1 [{"prefix": "osd crush set-device-class", "class": 
"hdd", "ids": ["58"]}]
   -82> 2018-06-08 12:48:20.701 7fada058e700 10 monclient: _finish_command 1 = 
0 osd.58 already set to class hdd. set-device-class item id 58 name 'osd.58' 
device_class 'hdd': no change.
   -81> 2018-06-08 12:48:20.701 7fadbc9d7140 10 monclient: _send_command 2 
[{"prefix": "osd crush create-or-move", "id": 58, "weight":0.5240, "args": 
["host=sacephnode12", "root=default"]}]
   -80> 2018-06-08 12:48:20.701 7fadbc9d7140 10 monclient: _send_mon_message to 
mon.ceph-mon3 at 10.3.71.36:6789/0
   -79> 2018-06-08 12:48:20.701 7fadbc9d7140  1 -- 10.3.56.69:6800/1807239 --> 
10.3.71.36:6789/0 -- mon_command({"prefix": "osd crush create-or-move", "id": 
58, "weight":0.5240, "args": ["host=sacephnode12", "root=default"]} v 0) v1 -- 
0x559f793e7600 con 0
   -78> 2018-06-08 12:48:20.703 7fadabaa4700  5 -- 10.3.56.69:6800/1807239 >> 
10.3.71.36:6789/0 conn(0x559f7994ec00 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=25741 cs=1 l=1). rx mon.0 seq 
9 0x559f793e7600 mon_command_ack([{"prefix": "osd crush create-or-move", "id": 
58, "weight":0.5240, "args": ["host=sacephnode12", "root=default"]}]=0 
create-or-move updated item name 'osd.58' weight 0.524 at location 
{host=sacephnode12,root=default} to crush map v378738) v1
   -77> 2018-06-08 12:48:20.703 7fada058e700  1 -- 10.3.56.69:6800/1807239 <== 
mon.0 10.3.71.36:6789/0 9  mon_command_ack([{"prefix": "osd crush 
create-or-move", "id": 58, "weight":0.5240, "args": ["host=sacephnode12", 
"root=default"]}]=0 create-or-move updated item name 'osd.58' weight 0.524 at 
location {host=sacephnode12,root=default} to crush map v378738) v1  258+0+0 
(1998484028 0 0) 0x559f793e7600 con 0x559f7994ec00
   -76> 2018-06-08 12:48:20.703 7fada058e700 10 monclient: 
handle_mon_command_ack 2 [{"prefix": "osd crush create-or-move", "id": 58, 
"weight":0.5240, "args": ["host=sacephnode12", "root=default"]}]
   -75> 2018-06-08 12:48:20.703 7fada058e700 10 monclient: _finish_command 2 = 
0 create-or-move updated item name 'osd.58' weight 0.524 at location 
{host=sacephnode12,root=default} to crush map
   -74> 2018-06-08 12:48:20.703 7fadbc9d7140  0 osd.58 0 done with init, 
starting boot process
   -73> 2018-06-08 12:48:20.703 7fadbc9d7140 10 monclient: _renew_subs
   -72> 2018-06-08 12:48:20.703 7fadbc9d7140 10 monclient: _send_mon_message to 
mon.ceph-mon3 at 10.3.71.36:6789/0
   -71> 2018-06-08 12:48:20.703 7fadbc9d7140  1 -- 

Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Michael Kuriger
Yes, I followed the procedure. Also, I'm not able to create new OSDs at all in 
mimic, even on a newly deployed OSD server. Same error. Even if I pass the --id 
{id} parameter to the ceph-volume command, it still uses the first available ID 
and not the one I specify.


Mike Kuriger 
Sr. Unix Systems Engineer 
T: 818-649-7235 M: 818-434-6195 



-Original Message-
From: Vasu Kulkarni [mailto:vakul...@redhat.com] 
Sent: Thursday, June 07, 2018 1:53 PM
To: Michael Kuriger
Cc: ceph-users
Subject: Re: [ceph-users] cannot add new OSDs in mimic

It is actually documented for the OSD replacement case:
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/#replacing-an-osd
I hope you followed that procedure?
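For reference, the documented replacement flow keeps the OSD id and, with
ceph-volume, looks roughly like this (osd id 58 and /dev/sdX are placeholders):

ceph osd destroy 58 --yes-i-really-mean-it
ceph-volume lvm zap /dev/sdX
ceph-volume lvm prepare --osd-id 58 --data /dev/sdX
ceph-volume lvm activate --all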

On Thu, Jun 7, 2018 at 1:11 PM, Michael Kuriger  wrote:
> Do you mean:
> ceph osd destroy {ID}  --yes-i-really-mean-it
>
> Mike Kuriger
>
>
>
> -Original Message-
> From: Vasu Kulkarni [mailto:vakul...@redhat.com]
> Sent: Thursday, June 07, 2018 12:28 PM
> To: Michael Kuriger
> Cc: ceph-users
> Subject: Re: [ceph-users] cannot add new OSDs in mimic
>
There is an 'osd destroy' command, though it is not documented; did you run that as well?
>
> On Thu, Jun 7, 2018 at 12:21 PM, Michael Kuriger  wrote:
>> CEPH team,
>> Is there a solution yet for adding OSDs in mimic - specifically re-using old 
>> IDs?  I was looking over this BUG report - 
>> https://tracker.ceph.com/issues/24423
>>  and my issue is similar.  I removed a bunch of OSD's after upgrading to 
>> mimic and I'm not able to re-add them using the new volume format.  I 
>> haven't tried manually adding them using 'never used' IDs.  I'll try that 
>> now but was hoping there would be a fix.
>>
>> Thanks!
>>
>> Mike Kuriger
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Michael Kuriger
Do you mean:
ceph osd destroy {ID}  --yes-i-really-mean-it

Mike Kuriger 



-Original Message-
From: Vasu Kulkarni [mailto:vakul...@redhat.com] 
Sent: Thursday, June 07, 2018 12:28 PM
To: Michael Kuriger
Cc: ceph-users
Subject: Re: [ceph-users] cannot add new OSDs in mimic

There is an 'osd destroy' command, though it is not documented; did you run that as well?

On Thu, Jun 7, 2018 at 12:21 PM, Michael Kuriger  wrote:
> CEPH team,
> Is there a solution yet for adding OSDs in mimic - specifically re-using old 
> IDs?  I was looking over this BUG report - 
> https://tracker.ceph.com/issues/24423
>  and my issue is similar.  I removed a bunch of OSD's after upgrading to 
> mimic and I'm not able to re-add them using the new volume format.  I haven't 
> tried manually adding them using 'never used' IDs.  I'll try that now but was 
> hoping there would be a fix.
>
> Thanks!
>
> Mike Kuriger
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cannot add new OSDs in mimic

2018-06-07 Thread Michael Kuriger
CEPH team,
Is there a solution yet for adding OSDs in mimic - specifically re-using old 
IDs?  I was looking over this BUG report - 
https://tracker.ceph.com/issues/24423 and my issue is similar.  I removed a 
bunch of OSD's after upgrading to mimic and I'm not able to re-add them using 
the new volume format.  I haven't tried manually adding them using 'never used' 
IDs.  I'll try that now but was hoping there would be a fix.

Thanks!

Mike Kuriger 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] a big cluster or several small

2018-05-14 Thread Michael Kuriger
The more servers you have in your cluster, the less impact a failure has on 
the cluster. Monitor your systems and keep them up to date.  You can also 
isolate data with clever CRUSH rules and by creating multiple zones.
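As a sketch of the 'multiple zones' idea (all names below are made up), a pool
can be pinned to its own CRUSH root so one workload's data never lands on
another workload's OSDs:

ceph osd crush add-bucket mailroot root
ceph osd crush move mailhost1 root=mailroot          # repeat per host
ceph osd crush rule create-replicated mail-rule mailroot host
ceph osd pool set mailpool crush_rule mail-rule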

Mike Kuriger


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Marc 
Boisis
Sent: Monday, May 14, 2018 9:50 AM
To: ceph-users
Subject: [ceph-users] a big cluster or several small


Hi,

Hello,
Currently we have a 294 OSD (21 hosts/3 racks) cluster with RBD clients only, 1 
single pool (size=3).

We want to divide this cluster into several to minimize the risk in case of 
failure/crash.
For example, a cluster for the mail, another for the file servers, a test 
cluster ...
Do you think it's a good idea?

Do you have any feedback from experience with multiple clusters in production on 
the same hardware:
- containers (LXD or Docker)
- multiple clusters on the same host without virtualization (with ceph-deploy 
... --cluster ...)
- multiple pools
...


Do you have any advice?





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd and cephfs (data) in one pool?

2017-12-27 Thread Michael Kuriger
Making the filesystem might blow away all the rbd images though.

Mike Kuriger
Sr. Unix Systems Engineer
T: 818-649-7235 M: 818-434-6195

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David 
Turner
Sent: Wednesday, December 27, 2017 1:44 PM
To: Chad William Seys
Cc: ceph-users
Subject: Re: [ceph-users] rbd and cephfs (data) in one pool?


AFAIK, nothing will stop you from doing it, but it is not a documented or 
supported use case.

On Wed, Dec 27, 2017, 3:52 PM Chad William Seys 
> wrote:
Hello,
   Is it possible to place rbd and cephfs data in the same pool?

Thanks!
Chad.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replaced a disk, first time. Quick question

2017-12-04 Thread Michael Kuriger
I've seen that before (over 100%) but I forget the cause.  At any rate, the way 
I replace disks is to first set the OSD's CRUSH weight to 0, wait for the data to 
rebalance, then mark the OSD down and out.  I don't think Ceph does any reads from a 
disk once you've marked it out, so hopefully there are other copies.
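In command form, that flow is roughly (using osd.17 from this thread as the
example):

ceph osd crush reweight osd.17 0
ceph -w                                    # wait for the rebalance to finish
ceph osd out osd.17
systemctl stop ceph-osd@17                 # on the OSD host
ceph osd purge osd.17 --yes-i-really-mean-it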

Mike Kuriger
Sr. Unix Systems Engineer
T: 818-649-7235 M: 818-434-6195

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Drew 
Weaver
Sent: Monday, December 04, 2017 8:39 AM
To: 'ceph-us...@ceph.com'
Subject: [ceph-users] Replaced a disk, first time. Quick question

Howdy,

I replaced a disk today because it was marked as Predicted failure. These were 
the steps I took

ceph osd out osd17
ceph -w #waited for it to get done
systemctl stop ceph-osd@osd17
ceph osd purge osd17 --yes-i-really-mean-it
umount /var/lib/ceph/osd/ceph-osdX

I noticed that after I ran the 'osd out' command that it started moving data 
around.

19446/16764 objects degraded (115.999%) <-- I noticed that number seems odd

So then I replaced the disk
Created a new label on it
Ceph-deploy osd prepare OSD5:sdd

THIS time, it started rebuilding

40795/16764 objects degraded (243.349%) <-- Now I'm really concerned.

Perhaps I don't quite understand what the numbers are telling me, but is it 
normal for it to rebuild more objects than exist?

Thanks,
-Drew


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD daemons active in nodes after removal

2017-10-25 Thread Michael Kuriger
When I do this, I reweight all of the OSDs I want to remove to 0 first, wait 
for the rebalance, then proceed to remove the OSDs.  Doing it your way, you 
have to wait for the rebalance after removing each OSD one by one.

Mike Kuriger
Sr. Unix Systems Engineer
818-434-6195

From: ceph-users  on behalf of Karun Josy 

Date: Wednesday, October 25, 2017 at 10:15 AM
To: "ceph-users@lists.ceph.com" 
Subject: [ceph-users] OSD daemons active in nodes after removal

Hello everyone! :)

I have an interesting problem. For a few weeks, we've been testing Luminous in 
a cluster made up of 8 servers and with about 20 SSD disks almost evenly 
distributed. It is running erasure coding.

Yesterday, we decided to bring the cluster to a minimum of 8 servers and 1 disk 
per server.

So, we went ahead and removed the additional disks from the ceph cluster, by 
executing commands like this from the admin server:

---
$ ceph osd out osd.20
osd.20 is already out.
$ ceph osd down osd.20
marked down osd.20.
$ ceph osd purge osd.20 --yes-i-really-mean-it
Error EBUSY: osd.20 is not `down`.

So I logged in to the host it resides on and killed it: systemctl stop 
ceph-osd@26
$ ceph osd purge osd.20 --yes-i-really-mean-it
purged osd.20


We waited for the cluster to be healthy once again and I physically removed the 
disks (hot swap, connected to an LSI 3008 controller). A few minutes after 
that, I needed to turn off one of the OSD servers to swap out a piece of 
hardware inside. So, I issued:

ceph osd set noout

And proceeded to turn off that 1 OSD server.

But the interesting thing happened then. Once that 1 server came back up, the 
cluster all of a sudden showed that out of the 8 nodes, only 2 were up!

8 (2 up, 5 in)

Even more interesting is that it seems Ceph, in each OSD server, still thinks 
the missing disks are there!

When I start ceph on each OSD server with "systemctl start ceph-osd.target", 
/var/logs/ceph gets filled with logs for disks that are not supposed to exist 
anymore.

The contents of the logs show something like:

# cat /var/log/ceph/ceph-osd.7.log
2017-10-20 08:45:16.389432 7f8ee6e36d00  0 set uid:gid to 167:167 (ceph:ceph)
2017-10-20 08:45:16.389449 7f8ee6e36d00  0 ceph version 12.2.1 
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process 
(unknown), pid 2591
2017-10-20 08:45:16.389639 7f8ee6e36d00 -1  ** ERROR: unable to open OSD 
superblock on /var/lib/ceph/osd/ceph-7: (2) No such file or directory
2017-10-20 08:45:36.639439 7fb389277d00  0 set uid:gid to 167:167 (ceph:ceph)

The actual Ceph cluster sees only 8 disks, as you can see here:

$ ceph osd tree
ID  CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
 -1   7.97388 root default
 -3   1.86469 host ceph-las1-a1-osd
  1   ssd 1.86469 osd.1   down0 1.0
 -5   0.87320 host ceph-las1-a2-osd
  2   ssd 0.87320 osd.2   down0 1.0
 -7   0.87320 host ceph-las1-a3-osd
  4   ssd 0.87320 osd.4   down  1.0 1.0
 -9   0.87320 host ceph-las1-a4-osd
  8   ssd 0.87320 osd.8 up  1.0 1.0
-11   0.87320 host ceph-las1-a5-osd
 12   ssd 0.87320 osd.12  down  1.0 1.0
-13   0.87320 host ceph-las1-a6-osd
 17   ssd 0.87320 osd.17up  1.0 1.0
-15   0.87320 host ceph-las1-a7-osd
 21   ssd 0.87320 osd.21  down  1.0 1.0
-17   0.87000 host ceph-las1-a8-osd
 28   ssd 0.87000 osd.28  down0 1.0


Linux, in the OSD servers, seems to also think the disks are in:

# df -h
Filesystem  Size  Used Avail Use% Mounted on
/dev/sde2   976M  183M  727M  21% /boot
/dev/sdd197M  5.4M   92M   6% /var/lib/ceph/osd/ceph-7
/dev/sdc197M  5.4M   92M   6% /var/lib/ceph/osd/ceph-6
/dev/sda197M  5.4M   92M   6% /var/lib/ceph/osd/ceph-4
/dev/sdb197M  5.4M   92M   6% /var/lib/ceph/osd/ceph-5
tmpfs   6.3G 0  6.3G   0% /run/user/0

It should show only one disk, not 4.

I tried to issue again the commands to remove the disks, this time, in the OSD 
server itself:

$ ceph osd out osd.X
osd.X does not exist.

$ ceph osd purge osd.X --yes-i-really-mean-it
osd.X does not exist

Yet, if I again issue "systemctl start ceph-osd.target", /var/log/ceph again 
shows logs for a disk that does not exist (to make sure, I deleted all logs 
prior).

So, it seems, somewhere, Ceph in the OSD still thinks there should be more 
disks?

The Ceph cluster is unusable though. We've tried everything to bring it back 
again. But as Dr. Bones would say, it's dead Jim.



___
ceph-users mailing list
ceph-users@lists.ceph.com

Re: [ceph-users] Brand new cluster -- pg is stuck inactive

2017-10-13 Thread Michael Kuriger
You may not have enough OSDs to satisfy the crush ruleset.  
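A few things worth checking when every PG sits in 'creating' with an empty
acting set (standard commands; the pool name is just an example):

ceph osd tree                 # are any OSDs actually up and in?
ceph osd crush rule dump      # which failure domain does the rule require?
ceph osd pool get rbd size    # replica count vs. available hosts/OSDs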

 
Mike Kuriger 
Sr. Unix Systems Engineer
818-434-6195 
 

On 10/13/17, 9:53 AM, "ceph-users on behalf of dE" 
 wrote:

Hi,

 I'm running ceph 10.2.5 on Debian (official package).

It cant seem to create any functional pools --

ceph health detail
HEALTH_ERR 64 pgs are stuck inactive for more than 300 seconds; 64 pgs 
stuck inactive; too few PGs per OSD (21 < min 30)
pg 0.39 is stuck inactive for 652.741684, current state creating, last 
acting []
pg 0.38 is stuck inactive for 652.741688, current state creating, last 
acting []
pg 0.37 is stuck inactive for 652.741690, current state creating, last 
acting []
pg 0.36 is stuck inactive for 652.741692, current state creating, last 
acting []
pg 0.35 is stuck inactive for 652.741694, current state creating, last 
acting []
pg 0.34 is stuck inactive for 652.741696, current state creating, last 
acting []
pg 0.33 is stuck inactive for 652.741698, current state creating, last 
acting []
pg 0.32 is stuck inactive for 652.741701, current state creating, last 
acting []
pg 0.3 is stuck inactive for 652.741762, current state creating, last 
acting []
pg 0.2e is stuck inactive for 652.741715, current state creating, last 
acting []
pg 0.2d is stuck inactive for 652.741719, current state creating, last 
acting []
pg 0.2c is stuck inactive for 652.741721, current state creating, last 
acting []
pg 0.2b is stuck inactive for 652.741723, current state creating, last 
acting []
pg 0.2a is stuck inactive for 652.741725, current state creating, last 
acting []
pg 0.29 is stuck inactive for 652.741727, current state creating, last 
acting []
pg 0.28 is stuck inactive for 652.741730, current state creating, last 
acting []
pg 0.27 is stuck inactive for 652.741732, current state creating, last 
acting []
pg 0.26 is stuck inactive for 652.741734, current state creating, last 
acting []
pg 0.3e is stuck inactive for 652.741707, current state creating, last 
acting []
pg 0.f is stuck inactive for 652.741761, current state creating, last 
acting []
pg 0.3f is stuck inactive for 652.741708, current state creating, last 
acting []
pg 0.10 is stuck inactive for 652.741763, current state creating, last 
acting []
pg 0.4 is stuck inactive for 652.741773, current state creating, last 
acting []
pg 0.5 is stuck inactive for 652.741774, current state creating, last 
acting []
pg 0.3a is stuck inactive for 652.741717, current state creating, last 
acting []
pg 0.b is stuck inactive for 652.741771, current state creating, last 
acting []
pg 0.c is stuck inactive for 652.741772, current state creating, last 
acting []
pg 0.3b is stuck inactive for 652.741721, current state creating, last 
acting []
pg 0.d is stuck inactive for 652.741774, current state creating, last 
acting []
pg 0.3c is stuck inactive for 652.741722, current state creating, last 
acting []
pg 0.e is stuck inactive for 652.741776, current state creating, last 
acting []
pg 0.3d is stuck inactive for 652.741724, current state creating, last 
acting []
pg 0.22 is stuck inactive for 652.741756, current state creating, last 
acting []
pg 0.21 is stuck inactive for 652.741758, current state creating, last 
acting []
pg 0.a is stuck inactive for 652.741783, current state creating, last 
acting []
pg 0.20 is stuck inactive for 652.741761, current state creating, last 
acting []
pg 0.9 is stuck inactive for 652.741787, current state creating, last 
acting []
pg 0.1f is stuck inactive for 652.741764, current state creating, last 
acting []
pg 0.8 is stuck inactive for 652.741790, current state creating, last 
acting []
pg 0.7 is stuck inactive for 652.741792, current state creating, last 
acting []
pg 0.6 is stuck inactive for 652.741794, current state creating, last 
acting []
pg 0.1e is stuck inactive for 652.741770, current state creating, last 
acting []
pg 0.1d is stuck inactive for 652.741772, current state creating, last 
acting []
pg 0.1c is stuck inactive for 652.741774, current state creating, last 
acting []
pg 0.1b is stuck inactive for 652.741777, current state creating, last 
acting []
pg 0.1a is stuck inactive for 652.741784, current state creating, last 
acting []
pg 0.2 is stuck inactive for 652.741812, current state creating, last 
acting []
pg 0.31 is stuck inactive for 652.741762, current state creating, last 
acting []
pg 0.19 is stuck inactive for 652.741789, current state creating, last 
acting []
pg 0.11 is stuck inactive for 

Re: [ceph-users] can't figure out why I have HEALTH_WARN in luminous

2017-09-25 Thread Michael Kuriger
Thanks!!  I did see that warning, but it never occurred to me that I needed to 
disable it.
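For anyone else hitting this, a minimal way to silence it (a sketch, using the
option name from John's reply; set it on the monitors and restart them):

# ceph.conf on each monitor
[mon]
mon health preluminous compat warning = false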

 
Mike Kuriger 
Sr. Unix Systems Engineer 
T: 818-649-7235 M: 818-434-6195 
 

On 9/23/17, 5:52 AM, "John Spray" <jsp...@redhat.com> wrote:

On Fri, Sep 22, 2017 at 6:48 PM, Michael Kuriger <mk7...@dexyp.com> wrote:
> I have a few running ceph clusters.  I built a new cluster using luminous,
> and I also upgraded a cluster running hammer to luminous.  In both cases, 
I
> have a HEALTH_WARN that I can't figure out.  The cluster appears healthy
> except for the HEALTH_WARN in overall status.  For now, I’m monitoring
> health from the “status” instead of “overall_status” until I can find out
> what the issue is.
>
>
>
> Any ideas?  Thanks!

There is a setting called mon_health_preluminous_compat_warning (true
by default), that forces the old overall_status field to WARN, to
create the awareness that your script is using the old health output.

If you do a "ceph health detail -f json" you'll see an explanatory message.

We should probably have made that visible in "status" too (or wherever
we output the overall_status as warning like this) -

https://github.com/ceph/ceph/pull/17930

John

>
>
> # ceph health detail
>
> HEALTH_OK
>
>
>
> # ceph -s
>
>   cluster:
>
> id: 11d436c2-1ae3-4ea4-9f11-97343e5c673b
>
> health: HEALTH_OK
>
>
>
> # ceph -s --format json-pretty
>
>
>
> {
>
> "fsid": "11d436c2-1ae3-4ea4-9f11-97343e5c673b",
>
> "health": {
>
> "checks": {},
>
> "status": "HEALTH_OK",
>
> "overall_status": "HEALTH_WARN"
>
>
>
> 
>
>
>
>
>
>
>
> Mike Kuriger
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] can't figure out why I have HEALTH_WARN in luminous

2017-09-22 Thread Michael Kuriger
I have a few running ceph clusters.  I built a new cluster using luminous, and 
I also upgraded a cluster running hammer to luminous.  In both cases, I have a 
HEALTH_WARN that I can't figure out.  The cluster appears healthy except for 
the HEALTH_WARN in overall status.  For now, I’m monitoring health from the 
“status” instead of “overall_status” until I can find out what the issue is.

Any ideas?  Thanks!

# ceph health detail
HEALTH_OK

# ceph -s
  cluster:
id: 11d436c2-1ae3-4ea4-9f11-97343e5c673b
health: HEALTH_OK

# ceph -s --format json-pretty

{
"fsid": "11d436c2-1ae3-4ea4-9f11-97343e5c673b",
"health": {
"checks": {},
"status": "HEALTH_OK",
"overall_status": "HEALTH_WARN"





Mike Kuriger

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] which CentOS 7 kernel is compatible with jewel?

2016-06-15 Thread Michael Kuriger
Hmm, if I enable only the layering feature I can get it to work.  But I'm puzzled 
why all the (default) features are not working with my system fully up to date.

Any ideas?  Is this not yet supported?


[root@test ~]# rbd create `hostname` --size 102400 --image-feature layering
[root@test ~]# rbd map `hostname`
/dev/rbd0

[root@test ~]# rbd info `hostname`
rbd image 'test.np.wc1.example.com':
size 102400 MB in 25600 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.13582ae8944a
format: 2
features: layering
flags: 
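For context, feature bitmask 0x3c in the dmesg output is exclusive-lock +
object-map + fast-diff + deep-flatten, which this kernel client doesn't
support. A hedged workaround is to strip those features from existing images,
or change the client-side default for new ones:

rbd feature disable test1 deep-flatten fast-diff object-map exclusive-lock

# or, for newly created images, in ceph.conf on the client:
[client]
rbd default features = 1      # layering only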




 

 
Michael Kuriger
Sr. Unix Systems Engineer
* mk7...@yp.com |( 818-649-7235








On 6/15/16, 9:56 AM, "ceph-users on behalf of Michael Kuriger" 
<ceph-users-boun...@lists.ceph.com on behalf of mk7...@yp.com> wrote:

>Still not working with newer client.  But I get a different error now.
>
>
>
>[root@test ~]# rbd ls
>
>test1
>
>
>
>[root@test ~]# rbd showmapped
>
>
>
>[root@test ~]# rbd map test1
>
>rbd: sysfs write failed
>
>RBD image feature set mismatch. You can disable features unsupported by the 
>kernel with "rbd feature disable".
>
>In some cases useful info is found in syslog - try "dmesg | tail" or so.
>
>rbd: map failed: (6) No such device or address
>
>
>
>[root@test ~]# dmesg | tail
>
>[52056.980880] rbd: loaded (major 251)
>
>[52056.990399] libceph: mon0 10.1.77.165:6789 session established
>
>[52056.992567] libceph: client4966 fsid f1466aaa-b08b-4103-ba7f-69165d675ba1
>
>[52057.024913] rbd: image mk7193.np.wc1.yellowpages.com: image uses 
>unsupported features: 0x3c
>
>[52085.856605] libceph: mon0 10.1.77.165:6789 session established
>
>[52085.858696] libceph: client4969 fsid f1466aaa-b08b-4103-ba7f-69165d675ba1
>
>[52085.883350] rbd: image test1: image uses unsupported features: 0x3c
>
>[52167.683868] libceph: mon1 10.1.78.75:6789 session established
>
>[52167.685990] libceph: client4937 fsid f1466aaa-b08b-4103-ba7f-69165d675ba1
>
>[52167.709796] rbd: image test1: image uses unsupported features: 0x3c
>
>
>
>[root@test ~]# uname -a
>
>Linux test.np.4.6.2-1.el7.elrepo.x86_64 #1 SMP Wed Jun 8 14:49:20 EDT 2016 
>x86_64 x86_64 x86_64 GNU/Linux
>
>
>
>[root@test ~]# ceph --version
>
>ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)
>
>
>
>
>
>
>
>
>
> 
>
>
>
> 
>
>Michael Kuriger
>
>Sr. Unix Systems Engineer
>
>* mk7...@yp.com |( 818-649-7235
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 6/14/16, 12:28 PM, "Ilya Dryomov" <idryo...@gmail.com> wrote:
>
>
>
>>On Mon, Jun 13, 2016 at 8:37 PM, Michael Kuriger <mk7...@yp.com> wrote:
>
>>> I just realized that this issue is probably because I’m running jewel 
>>> 10.2.1 on the servers side, but accessing from a client running hammer 
>>> 0.94.7 or infernalis 9.2.1
>
>>>
>
>>> Here is what happens if I run rbd ls from a client on infernalis.  I was 
>>> testing this access since we weren’t planning on building rpms for Jewel on 
>>> CentOS 6
>
>>>
>
>>> $ rbd ls
>
>>> 2016-06-13 11:24:06.881591 7fe61e568700  0 -- :/3877046932 >> 
>>> 10.1.77.165:6789/0 pipe(0x562ed3ea7550 sd=3 :0 s=1 pgs=0 cs=0 l=1 
>>> c=0x562ed3ea0ac0).fault
>
>>> 2016-06-13 11:24:09.882051 7fe61137f700  0 -- :/3877046932 >> 
>>> 10.1.78.75:6789/0 pipe(0x7fe608000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1 
>>> c=0x7fe608004ef0).fault
>
>>> 2016-06-13 11:24:12.882389 7fe61e568700  0 -- :/3877046932 >> 
>>> 10.1.77.165:6789/0 pipe(0x7fe608008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 
>>> c=0x7fe60800c5f0).fault
>
>>> 2016-06-13 11:24:18.883642 7fe61e568700  0 -- :/3877046932 >> 
>>> 10.1.77.165:6789/0 pipe(0x7fe608008350 sd=3 :0 s=1 pgs=0 cs=0 l=1 
>>> c=0x7fe6080078e0).fault
>
>>> 2016-06-13 11:24:21.884259 7fe61137f700  0 -- :/3877046932 >> 
>>> 10.1.78.75:6789/0 pipe(0x7fe608000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 
>>> c=0x7fe608007110).fault
>
>>
>
>>Accessing jewel with older clients should work as long as you don't
>
>>enable jewel tunables and such; the same goes for older kernels.  Can
>
>>you do
>
>>
>
>>rbd --debug-ms=20 ls
>
>>
>
>>and attach the output?
>
>>
>
>>Thanks,
>
>>
>
>>Ilya
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] which CentOS 7 kernel is compatible with jewel?

2016-06-15 Thread Michael Kuriger
Still not working with newer client.  But I get a different error now.

[root@test ~]# rbd ls
test1

[root@test ~]# rbd showmapped

[root@test ~]# rbd map test1
rbd: sysfs write failed
RBD image feature set mismatch. You can disable features unsupported by the 
kernel with "rbd feature disable".
In some cases useful info is found in syslog - try "dmesg | tail" or so.
rbd: map failed: (6) No such device or address

[root@test ~]# dmesg | tail
[52056.980880] rbd: loaded (major 251)
[52056.990399] libceph: mon0 10.1.77.165:6789 session established
[52056.992567] libceph: client4966 fsid f1466aaa-b08b-4103-ba7f-69165d675ba1
[52057.024913] rbd: image mk7193.np.wc1.yellowpages.com: image uses unsupported 
features: 0x3c
[52085.856605] libceph: mon0 10.1.77.165:6789 session established
[52085.858696] libceph: client4969 fsid f1466aaa-b08b-4103-ba7f-69165d675ba1
[52085.883350] rbd: image test1: image uses unsupported features: 0x3c
[52167.683868] libceph: mon1 10.1.78.75:6789 session established
[52167.685990] libceph: client4937 fsid f1466aaa-b08b-4103-ba7f-69165d675ba1
[52167.709796] rbd: image test1: image uses unsupported features: 0x3c

[root@test ~]# uname -a
Linux test.np.4.6.2-1.el7.elrepo.x86_64 #1 SMP Wed Jun 8 14:49:20 EDT 2016 
x86_64 x86_64 x86_64 GNU/Linux

[root@test ~]# ceph --version
ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269)




 

 
Michael Kuriger
Sr. Unix Systems Engineer
* mk7...@yp.com |( 818-649-7235








On 6/14/16, 12:28 PM, "Ilya Dryomov" <idryo...@gmail.com> wrote:

>On Mon, Jun 13, 2016 at 8:37 PM, Michael Kuriger <mk7...@yp.com> wrote:
>> I just realized that this issue is probably because I’m running jewel 10.2.1 
>> on the servers side, but accessing from a client running hammer 0.94.7 or 
>> infernalis 9.2.1
>>
>> Here is what happens if I run rbd ls from a client on infernalis.  I was 
>> testing this access since we weren’t planning on building rpms for Jewel on 
>> CentOS 6
>>
>> $ rbd ls
>> 2016-06-13 11:24:06.881591 7fe61e568700  0 -- :/3877046932 >> 
>> 10.1.77.165:6789/0 pipe(0x562ed3ea7550 sd=3 :0 s=1 pgs=0 cs=0 l=1 
>> c=0x562ed3ea0ac0).fault
>> 2016-06-13 11:24:09.882051 7fe61137f700  0 -- :/3877046932 >> 
>> 10.1.78.75:6789/0 pipe(0x7fe608000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1 
>> c=0x7fe608004ef0).fault
>> 2016-06-13 11:24:12.882389 7fe61e568700  0 -- :/3877046932 >> 
>> 10.1.77.165:6789/0 pipe(0x7fe608008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 
>> c=0x7fe60800c5f0).fault
>> 2016-06-13 11:24:18.883642 7fe61e568700  0 -- :/3877046932 >> 
>> 10.1.77.165:6789/0 pipe(0x7fe608008350 sd=3 :0 s=1 pgs=0 cs=0 l=1 
>> c=0x7fe6080078e0).fault
>> 2016-06-13 11:24:21.884259 7fe61137f700  0 -- :/3877046932 >> 
>> 10.1.78.75:6789/0 pipe(0x7fe608000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 
>> c=0x7fe608007110).fault
>
>Accessing jewel with older clients should work as long as you don't
>enable jewel tunables and such; the same goes for older kernels.  Can
>you do
>
>rbd --debug-ms=20 ls
>
>and attach the output?
>
>Thanks,
>
>Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] which CentOS 7 kernel is compatible with jewel?

2016-06-13 Thread Michael Kuriger
I just realized that this issue is probably because I’m running jewel 10.2.1 on 
the servers side, but accessing from a client running hammer 0.94.7 or 
infernalis 9.2.1

Here is what happens if I run rbd ls from a client on infernalis.  I was 
testing this access since we weren’t planning on building rpms for Jewel on 
CentOS 6

$ rbd ls
2016-06-13 11:24:06.881591 7fe61e568700  0 -- :/3877046932 >> 
10.1.77.165:6789/0 pipe(0x562ed3ea7550 sd=3 :0 s=1 pgs=0 cs=0 l=1 
c=0x562ed3ea0ac0).fault
2016-06-13 11:24:09.882051 7fe61137f700  0 -- :/3877046932 >> 10.1.78.75:6789/0 
pipe(0x7fe608000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1 c=0x7fe608004ef0).fault
2016-06-13 11:24:12.882389 7fe61e568700  0 -- :/3877046932 >> 
10.1.77.165:6789/0 pipe(0x7fe608008350 sd=4 :0 s=1 pgs=0 cs=0 l=1 
c=0x7fe60800c5f0).fault
2016-06-13 11:24:18.883642 7fe61e568700  0 -- :/3877046932 >> 
10.1.77.165:6789/0 pipe(0x7fe608008350 sd=3 :0 s=1 pgs=0 cs=0 l=1 
c=0x7fe6080078e0).fault
2016-06-13 11:24:21.884259 7fe61137f700  0 -- :/3877046932 >> 10.1.78.75:6789/0 
pipe(0x7fe608000c00 sd=4 :0 s=1 pgs=0 cs=0 l=1 c=0x7fe608007110).fault



 

 
Michael Kuriger
Sr. Unix Systems Engineer
* mk7...@yp.com |( 818-649-7235







On 6/13/16, 2:10 AM, "Ilya Dryomov" <idryo...@gmail.com> wrote:

>On Fri, Jun 10, 2016 at 9:29 PM, Michael Kuriger <mk7...@yp.com> wrote:
>> Hi Everyone,
>> I’ve been running jewel for a while now, with tunables set to hammer.  
>> However, I want to test the new features but cannot find a fully compatible 
>> Kernel for CentOS 7.  I’ve tried a few of the elrepo kernels - elrepo-kernel 
>> 4.6 works perfectly in CentOS 6, but not CentOS 7.  I’ve tried 3.10, 4.3, 
>> 4.5, and 4.6.
>>
>> What does seem to work with the 4.6 kernel is mounting, read/write to a 
>> cephfs, and rbd map / mounting works also.  I just can’t do 'rbd ls'
>>
>> 'rbd ls' does not work with 4.6 kernel but it does work with the stock 3.10 
>> kernel.
>
>"rbd ls" operation doesn't depend on the kernel.  What do you mean by
>"can't do" - no output at all?
>
>Something similar was reported here [1].  What's the output of "rados
>-p  stat rbd_directory"?
>
>> 'rbd mount' does not work with the stock 3.10 kernel, but works with the 4.6 
>> kernel.
>
>Anything in dmesg on 3.10?
>
>[1] 
>https://www.mail-archive.com/ceph-users@lists.ceph.com/msg29515.html
> 
>
>Thanks,
>
>Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] which CentOS 7 kernel is compatible with jewel?

2016-06-10 Thread Michael Kuriger
Hi Everyone,
I’ve been running jewel for a while now, with tunables set to hammer.  However, 
I want to test the new features but cannot find a fully compatible Kernel for 
CentOS 7.  I’ve tried a few of the elrepo kernels - elrepo-kernel 4.6 works 
perfectly in CentOS 6, but not CentOS 7.  I’ve tried 3.10, 4.3, 4.5, and 4.6.

What does seem to work with the 4.6 kernel is mounting, read/write to a cephfs, 
and rbd map / mounting works also.  I just can’t do 'rbd ls'  

'rbd ls' does not work with 4.6 kernel but it does work with the stock 3.10 
kernel.
'rbd mount' does not work with the stock 3.10 kernel, but works with the 4.6 
kernel.  

Very odd.  Any advice? 

Thanks!

 

 
Michael Kuriger
Sr. Unix Systems Engineer
* mk7...@yp.com |( 818-649-7235


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Migrating from one Ceph cluster to another

2016-06-09 Thread Michael Kuriger
This is how I did it.  I upgraded my old cluster first (live, one node at a time).  
Then I added my new OSD servers to the running cluster.  Once they were all 
added, I set the CRUSH weight to 0 on all my original OSDs.  This causes a lot of IO, 
but all data will be migrated to the new servers.  Then you can remove the old 
OSD servers from the cluster.



 
Michael Kuriger
Sr. Unix Systems Engineer

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Wido 
den Hollander
Sent: Thursday, June 09, 2016 12:47 AM
To: Marek Dohojda; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Migrating from one Ceph cluster to another


> Op 8 juni 2016 om 22:49 schreef Marek Dohojda <mdoho...@altitudedigital.com>:
> 
> 
> I have a ceph cluster (Hammer) and I just built a new cluster 
> (Infernalis).  This cluster contains VM boxes based on KVM.
> 
> What I would like to do is move all the data from one ceph cluster to 
> another.  However the only way I could find from my google searches 
> would be to move each image to local disk, copy this image across to 
> new cluster, and import it.
> 
> I am hoping that there is a way to just synch the data (and I do 
> realize that KVMs will have to be down for the full migration) from 
> one cluster to another.
> 

You can do this with the rbd command using export and import.

Something like:

$ rbd export image1 -|rbd import image1 -

Where you have both RBD commands connect to a different Ceph cluster. See 
--help on how to do that.

You can run this in a loop with the output of 'rbd ls'.
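
A minimal sketch of such a loop, assuming the two clusters are reachable via separate conf files /etc/ceph/old.conf and /etc/ceph/new.conf (names are only an example):

for img in $(rbd -c /etc/ceph/old.conf ls); do
    rbd -c /etc/ceph/old.conf export "$img" - | rbd -c /etc/ceph/new.conf import - "$img"
done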

But that's about the only way.

Wido

> Thank you
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_lis
> tinfo.cgi_ceph-2Dusers-2Dceph.com=CwICAg=lXkdEK1PC7UK9oKA-BBSI8p1A
> amzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=lhisR2C1GH95fR5NYNEGWebX
> LILh56cyhY8u9v56o6M=ddR_8bexw5SKK1wD5UNp9Oijw0Z0I9RnhaIJbcfUS-8=
___
ceph-users mailing list
ceph-users@lists.ceph.com
https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=CwICAg=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=lhisR2C1GH95fR5NYNEGWebXLILh56cyhY8u9v56o6M=ddR_8bexw5SKK1wD5UNp9Oijw0Z0I9RnhaIJbcfUS-8=
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems with Calamari setup

2016-06-02 Thread Michael Kuriger
For me, this same issue was caused by having too new a version of salt.  I’m 
running salt-2014.1.5-1 in centos 7.2, so yours will probably be different.  
But I thought it was worth mentioning.
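
If salt does turn out to be the culprit, a hedged downgrade sketch for the minions (exact package versions depend on your distro and repo):

yum downgrade salt salt-minion                                             # CentOS/RHEL
apt-get install salt-common=<older-version> salt-minion=<older-version>   # Ubuntu
service salt-minion restart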





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
fridifree
Sent: Wednesday, June 01, 2016 6:00 AM
To: Ceph Users
Subject: [ceph-users] Problems with Calamari setup

Hello, Everyone.

I'm trying to install a Calamari server in my organisation and I'm encountering 
some problems.

I have a small dev environment, just 4 OSD nodes and 5 monitors (one of them is 
also the RADOS GW). We chose to use Ubuntu 14.04 LTS for all our servers. The 
Calamari server is provisioned by VMware for now, the rest of the servers are 
physical.

The packages' versions are as follows:
- calamari-server - 1.3.1.1-1trusty
- calamari-client - 1.3.1.1-1trusty
- salt - 0.7.15
- diamond - 3.4.67

I used Calamari Survival 
Guide<https://urldefense.proofpoint.com/v2/url?u=http-3A__ceph.com_planet_ceph-2Dcalamari-2Dthe-2Dsurvival-2Dguide_=CwMFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=HYKmd-dJJAf1DrtaQM7Hs2hJN80sTAo4DWNcW14cYtw=e5MDHG9JjYjQU6WqzfyaiZLcwVIj7FYJhk4g8_f45l0=>
 but without the 'build' part.

The problem is I've managed to install the server and the web page, but the 
Calamari server doesn't recognize the cluster. It does show the OSD nodes 
connected to it, but not the cluster (which does exist).

Also, the output of the "salt '*' ceph.get_heartbeats" command seems to look 
fine, as the Cthultu log (but maybe I'm looking for the wrong thing). 
Re-installing the cluster is not an option, we want to connect the Calamari as 
it is, without hurting the Ceph cluster.

Thanks so much!

Jacob Goldenberg,
Israel.

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-disk: Error: No cluster conf found in /etc/ceph with fsid

2016-05-26 Thread Michael Kuriger
Are you using an old ceph.conf with the original FSID from your first attempt 
(in your deploy directory)?
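
A quick way to compare the fsids involved, using the paths from the activate log below:

# fsid recorded in the OSD data directory that ceph-disk is activating
cat /home/albert/my-cluster/cephd2/ceph_fsid

# fsid in the conf actually installed on the node
grep fsid /etc/ceph/ceph.conf

# fsid in the conf sitting in the ceph-deploy working directory
grep fsid /home/albert/my-cluster/ceph.conf

If the first one differs from the other two, the directory was most likely prepared against an earlier cluster and needs to be wiped and re-prepared.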





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Albert.K.Chong (git.usca07.Newegg) 22201
Sent: Thursday, May 26, 2016 8:49 AM
To: Albert.K.Chong (git.usca07.Newegg) 22201; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] ceph-disk: Error: No cluster conf found in /etc/ceph 
with fsid

Hi,

Can anyone help on this topic?


Albert

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Albert.K.Chong (git.usca07.Newegg) 22201
Sent: Wednesday, May 25, 2016 3:04 PM
To: ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
Subject: [ceph-users] ceph-disk: Error: No cluster conf found in /etc/ceph with 
fsid

Hi,

I followed the Storage Cluster Quick Start guide with my CentOS 7 VM.  I have failed at the 
same step more than 10 times, including complete cleaning and reinstallation. 
On the last try I just created the OSD on the local drive to avoid some permission 
warnings, ran "ceph-deploy osd prepare ..", and then:

[albert@admin-node my-cluster]$ ceph-deploy osd activate 
admin-node:/home/albert/my-cluster/cephd2
[ceph_deploy.conf][DEBUG ] found configuration file at: 
/home/albert/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.33): /usr/bin/ceph-deploy osd activate 
admin-node:/home/albert/my-cluster/cephd2
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username  : None
[ceph_deploy.cli][INFO  ]  verbose   : False
[ceph_deploy.cli][INFO  ]  overwrite_conf: False
[ceph_deploy.cli][INFO  ]  subcommand: activate
[ceph_deploy.cli][INFO  ]  quiet : False
[ceph_deploy.cli][INFO  ]  cd_conf   : 

[ceph_deploy.cli][INFO  ]  cluster   : ceph
[ceph_deploy.cli][INFO  ]  func  : 
[ceph_deploy.cli][INFO  ]  ceph_conf : None
[ceph_deploy.cli][INFO  ]  default_release   : False
[ceph_deploy.cli][INFO  ]  disk  : [('admin-node', 
'/home/albert/my-cluster/cephd2', None)]
[ceph_deploy.osd][DEBUG ] Activating cluster ceph disks 
admin-node:/home/albert/my-cluster/cephd2:
[admin-node][DEBUG ] connection detected need for sudo
[admin-node][DEBUG ] connected to host: admin-node
[admin-node][DEBUG ] detect platform information from remote host
[admin-node][DEBUG ] detect machine type
[admin-node][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
[ceph_deploy.osd][DEBUG ] activating host admin-node disk 
/home/albert/my-cluster/cephd2
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[admin-node][DEBUG ] find the location of an executable
[admin-node][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
--mark-init systemd --mount /home/albert/my-cluster/cephd2
[admin-node][WARNIN] main_activate: path = /home/albert/my-cluster/cephd2
[admin-node][WARNIN] activate: Cluster uuid is 
8f9bf207-6c6a-4764-8b9e-63f70810837b
[admin-node][WARNIN] command: Running command: /usr/bin/ceph-osd --cluster=ceph 
--show-config-value=fsid
[admin-node][WARNIN] Traceback (most recent call last):
[admin-node][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in <module>
[admin-node][WARNIN] load_entry_point('ceph-disk==1.0.0', 
'console_scripts', 'ceph-disk')()
[admin-node][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4964, in run
[admin-node][WARNIN] main(sys.argv[1:])
[admin-node][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 4915, in main
[admin-node][WARNIN] args.func(args)
[admin-node][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3277, in 
main_activate
[admin-node][WARNIN] init=args.mark_init,
[admin-node][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3097, in activate_dir
[admin-node][WARNIN] (osd_id, cluster) = activate(path, 
activate_key_template, init)
[admin-node][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 3173, in activate
[admin-node][WARNIN] ' with fsid %s' % ceph_fsid)
[admin-node][WARNIN] ceph_disk.main.Error: Error: No cluster conf found in 
/etc/ceph with fsid 8f9bf207-6c6a-4764-8b9e-63f70810837b
[admin-node][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy][ERROR ] RuntimeError: Failed to execute command: 
/usr/sbin/ceph-disk -v activate --mark-init systemd --mount 
/home/albert/my-cluster/cephd2


Need some help.  Really appreciated.


Albert
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Can't Start / Stop Ceph jewel under Centos 7.2

2016-05-26 Thread Michael Kuriger
Did you update to ceph version 10.2.1 
(3a66dd4f30852819c1bdaa8ec23c795d4ad77269)?  This issue should have been 
resolved with the last update.  (It was for us)



 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Hauke 
Homburg
Sent: Thursday, May 26, 2016 2:42 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Can't Start / Stop Ceph jewel under Centos 7.2


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hello,

I need to test my Icinga monitoring for Ceph, so I want to shut down the Ceph 
services on one server. But systemctl start ceph.target doesn't run.

How can I stop the services?


Thanks for Help

Hauke

- --
www.w3-creative.de

www.westchat.de
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (GNU/Linux)

iQIcBAEBAgAGBQJXRsTaAAoJEEIVizQb/Y0m67AP+wQzrpYKSnNA7KURP4mHOo0m
zoPmDkbucNaMBM1tFXNVJRJhyN6MVHKjkrE3TOC20AXVg1F3/nGOuW+kPjoDFek1
sRizx6kfTVrh9rQzLoadpimg32cmxjglFfA9pByPZUusPx2tOjjy1t+ucGyP91Mv
b7UXnhL2crX0dtstkYiQAq1Lyb14KdVSO8wCYfzBWgK8nf6Pc0ylSkQj+9012O6c
FLceIJA6og8ALBXTl/t0Xw09oOWOxCPblnY8Gt4I1lBKmlqq5Ztc3PM1Pg+xRt7U
zwPyiG2PqpiNfF+bc3tCavLjV+L6nASoJJ1vpYEvu3Y7W+FleIRiI8qYMyOhjXsd
SfBZSjG3fGKe+kXgKYjlIcv1GdYid6gLLDoKdPLkvnUuFDbodGBuJFq6hZ3gY1kd
rXZwCmME4mlK1uIuYjOZx+TuQj74ET4gHBe5GHq4veZe+q2Q7azJpihEPxtPLqJM
jtF7WNAsSHPIvqbvwAMgRtcyS5DBSg3mrpiTF80557Gdk1eltfLRJBW+I6SyweLr
05HBElskq3kg7rSkjJYvPDGE9qj6auLCZAgazOG2vbcqk9pm1u+spgIajQ0fRKnr
p/VfHZPqlV9rOqHH05MgA/On9DuyxEPulnZga3qsIHDMSmtQGSqU+CbNLXg0kdOE
7uQjqQFADi8FSSBfZXPR
=ilug
-END PGP SIGNATURE-

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to remove a placement group?

2016-05-15 Thread Michael Kuriger
I would try:
ceph pg repair 15.3b3
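
And a small sketch for keeping an eye on it afterwards:

ceph health detail | grep 15.3b3    # is the pg still listed as incomplete?
ceph pg dump_stuck inactive         # anything still stuck inactive?
ceph -w                             # watch for scrub/repair events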





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Romero 
Junior
Sent: Saturday, May 14, 2016 11:46 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] How to remove a placement group?

Hi all,

I’m currently having trouble with an incomplete pg.
Our cluster has a replication factor of 3, however somehow I found this pg to 
be present in 9 different OSDs (being active only in 3 of them, of course).
Since I don’t really care about data loss, I was wondering if it’s possible to 
get rid of this by simply removing the pg. Is that even possible?

I’m running ceph version 0.94.3.

Here are some quick details:

1 pgs incomplete
1 pgs stuck inactive
100 requests are blocked > 32 sec

6 ops are blocked > 67108.9 sec on osd.130
94 ops are blocked > 33554.4 sec on osd.130

pg 15.3b3 is stuck inactive since forever, current state incomplete, last 
acting [130,210,148]
pg 15.3b3 is stuck unclean since forever, current state incomplete, last acting 
[130,210,148]
pg 15.3b3 is incomplete, acting [130,210,148]

Running a: “ceph pg 15.3b3 query” hangs without response.

I’ve tried setting OSD 130 as down, but then OSD 210 becomes the one keeping 
things stuck (query hangs), same for OSD 148.

Any ideas?

Kind regards,
Romero Junior
DevOps Infra Engineer
LeaseWeb Global Services B.V.

T: +31 20 316 0230
M: +31 6 2115 9310
E: r.jun...@global.leaseweb.com<mailto:r.jun...@global.leaseweb.com>
W: 
www.leaseweb.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__www.leaseweb.com=CwMGaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=Tscmkf_Z7k--OL5OSU9Z7wVkGAo5p0jK7WwWxuRpOM8=UBExeF-l0mdhN7KXDDE8imOJV2d0HP2q5YnDrGd32gw=>



Luttenbergweg 8,

1101 EC Amsterdam,

Netherlands




LeaseWeb is the brand name under which the various independent LeaseWeb 
companies operate. Each company is a separate and distinct entity that provides 
services in a particular geographic area. LeaseWeb Global Services B.V. does 
not provide third-party services. Please see 
www.leaseweb.com/en/legal<http://www.leaseweb.com/en/legal> for more 
information.



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How do I start ceph jewel in CentOS?

2016-05-04 Thread Michael Kuriger
I was able to hack the ceph /etc/init.d script to start my OSDs.
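
If you would rather not patch the init script, a couple of stopgap ways to start a daemon by hand (OSD id 0 is only an example):

ceph-osd -i 0                                   # start the daemon directly
# on jewel you may need to drop privileges explicitly:
ceph-osd -i 0 --setuser ceph --setgroup ceph
systemctl start ceph-osd@0                      # or the per-instance systemd unit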

 

 

 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235








On 5/4/16, 9:58 AM, "ceph-users on behalf of Michael Kuriger" 
<ceph-users-boun...@lists.ceph.com on behalf of mk7...@yp.com> wrote:

>How are others starting ceph services?  Am I the only person trying to install 
>jewel on CentOS 7?
>
>Unfortunately, systemctl status does not list any “ceph” services at all.
>
>
>
> 
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>On 5/4/16, 9:37 AM, "Vasu Kulkarni" <vakul...@redhat.com> wrote:
>
>
>
>>sadly there are still some issues with jewel/master branch for centos
>
>>systemctl service,
>
>>As a workaround if you run "systemctl status" and look at the top most
>
>>service name in the ceph-osd service tree and use that to stop/start
>
>>it should work.
>
>>
>
>>
>
>>On Wed, May 4, 2016 at 9:00 AM, Michael Kuriger <mk7...@yp.com> wrote:
>
>>> I’m running CentOS 7.2.  I upgraded one server from hammer to jewel.   I
>
>>> cannot get ceph to start using these new systems scripts.  Can anyone help?
>
>>>
>
>>> I tried to enable ceph-osd@.service by creating symlinks manually.
>
>>>
>
>>> # systemctl list-unit-files|grep ceph
>
>>>
>
>>> ceph-create-keys@.service  static
>
>>>
>
>>> ceph-disk@.service static
>
>>>
>
>>> ceph-mds@.service  disabled
>
>>>
>
>>> ceph-mon@.service  disabled
>
>>>
>
>>> ceph-osd@.service  enabled
>
>>>
>
>>> ceph-mds.targetdisabled
>
>>>
>
>>> ceph-mon.targetdisabled
>
>>>
>
>>> ceph-osd.targetenabled
>
>>>
>
>>> ceph.targetenabled
>
>>>
>
>>>
>
>>>
>
>>> # systemctl start ceph.target
>
>>>
>
>>>
>
>>> # systemctl status ceph.target
>
>>>
>
>>> ● ceph.target - ceph target allowing to start/stop all ceph*@.service
>
>>> instances at once
>
>>>
>
>>>Loaded: loaded (/usr/lib/systemd/system/ceph.target; enabled; vendor
>
>>> preset: disabled)
>
>>>
>
>>>Active: active since Wed 2016-05-04 08:53:30 PDT; 4min 6s ago
>
>>>
>
>>>
>
>>> May 04 08:53:30  systemd[1]: Reached target ceph target allowing to
>
>>> start/stop all ceph*@.service instances at once.
>
>>>
>
>>> May 04 08:53:30  systemd[1]: Starting ceph target allowing to start/stop all
>
>>> ceph*@.service instances at once.
>
>>>
>
>>> May 04 08:57:32  systemd[1]: Reached target ceph target allowing to
>
>>> start/stop all ceph*@.service instances at once.
>
>>>
>
>>>
>
>>> # systemctl status ceph-osd.target
>
>>>
>
>>> ● ceph-osd.target - ceph target allowing to start/stop all ceph-osd@.service
>
>>> instances at once
>
>>>
>
>>>Loaded: loaded (/usr/lib/systemd/system/ceph-osd.target; enabled; vendor
>
>>> preset: disabled)
>
>>>
>
>>>Active: active since Wed 2016-05-04 08:53:30 PDT; 4min 20s ago
>
>>>
>
>>>
>
>>> May 04 08:53:30  systemd[1]: Reached target ceph target allowing to
>
>>> start/stop all ceph-osd@.service instances at once.
>
>>>
>
>>> May 04 08:53:30  systemd[1]: Starting ceph target allowing to start/stop all
>
>>> ceph-osd@.service instances at once.
>
>>>
>
>>>
>
>>> # systemctl status ceph-osd@.service
>
>>>
>
>>> Failed to get properties: Unit name ceph-osd@.service is not valid.
>
>>>
>
>>>
>
>>>
>
>>>
>
>>>
>
>>>
>
>>>
>
>>> ___
>
>>> ceph-users mailing list
>
>>> ceph-users@lists.ceph.com
>
>>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=CwIFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=ha3XvQGcc5Yztz98b7hb8pYQo14dcIiYxfOoMzyUM00=VdVOtGV4JQUKyQDDC_QYn1-7wBcSh-eYwx_cCSQWlQk=
>>>  
>
>>>
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=CwIGaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=54N4L4csPxJYvC5YZlDB9mMwEKANhFwo2m6R0HMUGZ0=LER873rXoF5--GPvmOzkJaQDhPpSvRptxAZ3QP-mlBM=
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How do I start ceph jewel in CentOS?

2016-05-04 Thread Michael Kuriger
How are others starting ceph services?  Am I the only person trying to install 
jewel on CentOS 7?
Unfortunately, systemctl status does not list any “ceph” services at all.

 








On 5/4/16, 9:37 AM, "Vasu Kulkarni" <vakul...@redhat.com> wrote:

>sadly there are still some issues with jewel/master branch for centos
>systemctl service,
>As a workaround if you run "systemctl status" and look at the top most
>service name in the ceph-osd service tree and use that to stop/start
>it should work.
>
>
>On Wed, May 4, 2016 at 9:00 AM, Michael Kuriger <mk7...@yp.com> wrote:
>> I’m running CentOS 7.2.  I upgraded one server from hammer to jewel.   I
>> cannot get ceph to start using these new systems scripts.  Can anyone help?
>>
>> I tried to enable ceph-osd@.service by creating symlinks manually.
>>
>> # systemctl list-unit-files|grep ceph
>>
>> ceph-create-keys@.service  static
>>
>> ceph-disk@.service static
>>
>> ceph-mds@.service  disabled
>>
>> ceph-mon@.service  disabled
>>
>> ceph-osd@.service  enabled
>>
>> ceph-mds.targetdisabled
>>
>> ceph-mon.targetdisabled
>>
>> ceph-osd.targetenabled
>>
>> ceph.targetenabled
>>
>>
>>
>> # systemctl start ceph.target
>>
>>
>> # systemctl status ceph.target
>>
>> ● ceph.target - ceph target allowing to start/stop all ceph*@.service
>> instances at once
>>
>>Loaded: loaded (/usr/lib/systemd/system/ceph.target; enabled; vendor
>> preset: disabled)
>>
>>Active: active since Wed 2016-05-04 08:53:30 PDT; 4min 6s ago
>>
>>
>> May 04 08:53:30  systemd[1]: Reached target ceph target allowing to
>> start/stop all ceph*@.service instances at once.
>>
>> May 04 08:53:30  systemd[1]: Starting ceph target allowing to start/stop all
>> ceph*@.service instances at once.
>>
>> May 04 08:57:32  systemd[1]: Reached target ceph target allowing to
>> start/stop all ceph*@.service instances at once.
>>
>>
>> # systemctl status ceph-osd.target
>>
>> ● ceph-osd.target - ceph target allowing to start/stop all ceph-osd@.service
>> instances at once
>>
>>Loaded: loaded (/usr/lib/systemd/system/ceph-osd.target; enabled; vendor
>> preset: disabled)
>>
>>Active: active since Wed 2016-05-04 08:53:30 PDT; 4min 20s ago
>>
>>
>> May 04 08:53:30  systemd[1]: Reached target ceph target allowing to
>> start/stop all ceph-osd@.service instances at once.
>>
>> May 04 08:53:30  systemd[1]: Starting ceph target allowing to start/stop all
>> ceph-osd@.service instances at once.
>>
>>
>> # systemctl status ceph-osd@.service
>>
>> Failed to get properties: Unit name ceph-osd@.service is not valid.
>>
>>
>>
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=CwIFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=ha3XvQGcc5Yztz98b7hb8pYQo14dcIiYxfOoMzyUM00=VdVOtGV4JQUKyQDDC_QYn1-7wBcSh-eYwx_cCSQWlQk=
>>  
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] How do I start ceph jewel in CentOS?

2016-05-04 Thread Michael Kuriger
I’m running CentOS 7.2.  I upgraded one server from hammer to jewel.   I cannot 
get ceph to start using these new systemd scripts.  Can anyone help?

I tried to enable ceph-osd@.service by creating symlinks manually.


# systemctl list-unit-files|grep ceph

ceph-create-keys@.service  static

ceph-disk@.service static

ceph-mds@.service  disabled

ceph-mon@.service  disabled

ceph-osd@.service  enabled

ceph-mds.targetdisabled

ceph-mon.targetdisabled

ceph-osd.targetenabled

ceph.targetenabled



# systemctl start ceph.target


# systemctl status ceph.target

● ceph.target - ceph target allowing to start/stop all ceph*@.service instances 
at once

   Loaded: loaded (/usr/lib/systemd/system/ceph.target; enabled; vendor preset: 
disabled)

   Active: active since Wed 2016-05-04 08:53:30 PDT; 4min 6s ago


May 04 08:53:30  systemd[1]: Reached target ceph target allowing to start/stop 
all ceph*@.service instances at once.

May 04 08:53:30  systemd[1]: Starting ceph target allowing to start/stop all 
ceph*@.service instances at once.

May 04 08:57:32  systemd[1]: Reached target ceph target allowing to start/stop 
all ceph*@.service instances at once.


# systemctl status ceph-osd.target

● ceph-osd.target - ceph target allowing to start/stop all ceph-osd@.service 
instances at once

   Loaded: loaded (/usr/lib/systemd/system/ceph-osd.target; enabled; vendor 
preset: disabled)

   Active: active since Wed 2016-05-04 08:53:30 PDT; 4min 20s ago


May 04 08:53:30  systemd[1]: Reached target ceph target allowing to start/stop 
all ceph-osd@.service instances at once.

May 04 08:53:30  systemd[1]: Starting ceph target allowing to start/stop all 
ceph-osd@.service instances at once.


# systemctl status ceph-osd@.service

Failed to get properties: Unit name ceph-osd@.service is not valid.





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] SSD pool and SATA pool

2015-11-17 Thread Michael Kuriger
Hey everybody,
I have 10 servers, each with 2 SSD drives, and 8 SATA drives.  Is it possible 
to create 2 pools, one made up of SSD and one made up of SATA?  I tried 
manually editing the crush map to do it, but the configuration doesn’t seem to 
persist across reboots.  Any help would be much appreciated.

Thanks!

Mike


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD pool and SATA pool

2015-11-17 Thread Michael Kuriger
Many thanks!





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235


From: Sean Redmond <sean.redmo...@gmail.com<mailto:sean.redmo...@gmail.com>>
Date: Tuesday, November 17, 2015 at 2:00 PM
To: Nikola Ciprich 
<nikola.cipr...@linuxbox.cz<mailto:nikola.cipr...@linuxbox.cz>>
Cc: Michael Kuriger <mk7...@yp.com<mailto:mk7...@yp.com>>, 
"ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>" 
<ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>>
Subject: Re: [ceph-users] SSD pool and SATA pool

Hi,

The below should help you:

http://www.sebastien-han.fr/blog/2014/08/25/ceph-mix-sata-and-ssd-within-the-same-box/<https://urldefense.proofpoint.com/v2/url?u=http-3A__www.sebastien-2Dhan.fr_blog_2014_08_25_ceph-2Dmix-2Dsata-2Dand-2Dssd-2Dwithin-2Dthe-2Dsame-2Dbox_=CwMFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=sTcO3vqSIi0TXQbqBxJLeNk1XOWssOl7mENi9pWpIsA=05WfoGzVqIdA9hwi8pIEBNz4yBwCb7QYwM9QrkP07Gk=>

Thanks

On Tue, Nov 17, 2015 at 9:58 PM, Nikola Ciprich 
<nikola.cipr...@linuxbox.cz<mailto:nikola.cipr...@linuxbox.cz>> wrote:
I'm not an ceph expert, but I needed to use

osd crush update on start = false

in [osd] config section..
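
A minimal sketch of where that setting lives in ceph.conf (the rest of the file is assumed):

[osd]
osd crush update on start = false

With this set, each OSD keeps the CRUSH location you assigned it (for example under an ssd or sata root) instead of re-registering under its host bucket at startup.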

BR

nik


On Tue, Nov 17, 2015 at 08:53:37PM +, Michael Kuriger wrote:
> Hey everybody,
> I have 10 servers, each with 2 SSD drives, and 8 SATA drives.  Is it possible 
> to create 2 pools, one made up of SSD and one made up of SATA?  I tried 
> manually editing the crush map to do it, but the configuration doesn’t seem 
> to persist reboots.  Any help would be very appreciated.
>
> Thanks!
>
> Mike
>
>

> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=CwMFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=sTcO3vqSIi0TXQbqBxJLeNk1XOWssOl7mENi9pWpIsA=njhE1213mChb57wiXO0HXdX6rJ5PRl-OE0mqTupe1-4=>


--
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214<tel:%2B420%20591%20166%20214>
fax:+420 596 621 273<tel:%2B420%20596%20621%20273>
mobil:  +420 777 093 799<tel:%2B420%20777%20093%20799>
www.linuxbox.cz<https://urldefense.proofpoint.com/v2/url?u=http-3A__www.linuxbox.cz=CwMFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=sTcO3vqSIi0TXQbqBxJLeNk1XOWssOl7mENi9pWpIsA=ZsWLl_AUzsZ4KJIfkG7DwV8C7hFf1JIiWnxfN3WMP98=>

mobil servis: +420 737 238 656<tel:%2B420%20737%20238%20656>
email servis: ser...@linuxbox.cz<mailto:ser...@linuxbox.cz>
-

___
ceph-users mailing list
ceph-users@lists.ceph.com<mailto:ceph-users@lists.ceph.com>
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.com=CwMFaQ=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ=CSYA9OS6Qd7fQySI2LDvlQ=sTcO3vqSIi0TXQbqBxJLeNk1XOWssOl7mENi9pWpIsA=njhE1213mChb57wiXO0HXdX6rJ5PRl-OE0mqTupe1-4=>


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Important security noticed regarding release signing key

2015-09-17 Thread Michael Kuriger
Thanks for the notice!



 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Sage 
Weil
Sent: Thursday, September 17, 2015 9:30 AM
To: ceph-annou...@ceph.com; ceph-de...@vger.kernel.org; ceph-us...@ceph.com; 
ceph-maintain...@ceph.com
Subject: [ceph-users] Important security noticed regarding release signing key

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Last week, Red Hat investigated an intrusion on the sites of both the Ceph 
community project (ceph.com) and Inktank (download.inktank.com), which were 
hosted on a computer system outside of Red Hat infrastructure.

Ceph.com provided Ceph community versions downloads signed with a Ceph signing 
key (id 7EBFDD5D17ED316D). Download.inktank.com provided releases of the Red Hat 
Ceph product for Ubuntu and CentOS operating systems signed with an Inktank 
signing key (id 5438C7019DCEEEAD). While the investigation into the intrusion 
is ongoing, our initial focus was on the integrity of the software and 
distribution channel for both sites.

To date, our investigation has not discovered any compromised code or binaries 
available for download on these sites. However, we cannot fully rule out the 
possibility that some compromised code or binaries were available for download 
at some point in the past. Further, we can no longer trust the integrity of the 
Ceph signing key, and therefore have created a new signing key (id 
E84AC2C0460F3994) for verifying downloads. 
This new key is committed to the ceph.git repository and is also available from

https://git.ceph.com/release.asc

The new key should look like:

pub   4096R/460F3994 2015-09-15
uid  Ceph.com (release key) <secur...@ceph.com>

All future release git tags will be signed with this new key.

This intrusion did not affect other Ceph sites such as download.ceph.com (which 
contained some older Ceph downloads) or git.ceph.com (which mirrors various 
source repositories), and is not known to have affected any other Ceph 
community infrastructure.  There is no evidence that build system or the Ceph 
github source repository were compromised.

New hosts for ceph.com and download.ceph.com have been created and the sites 
have been rebuilt.  All content available on download.ceph.com has been 
verified, and all ceph.com URLs for package locations now redirect there.  
There is still some content missing from download.ceph.com that will appear 
later today: source tarballs will be regenerated from git, and older release 
packages are being resigned with the new release key.  DNS changes are still 
propagating so you may not see the new versions of the ceph.com and 
download.ceph.com sites for another hour or so.

The download.inktank.com host has been retired and affected Red Hat customers 
have been notified, further information is available at 
https://securityblog.redhat.com/2015/09/17/.

Users of Ceph packages should take action as a precautionary measure to 
download the newly-signed versions.  Please see the instructions below.

The Ceph community would like to thank Kai Fabian for initially alerting us to 
this issue.

Any questions can be directed to the email discussion lists or the #ceph IRC 
channel on irc.oftc.net.

Thank you!
sage

- -

The following steps should be performed on all nodes with Ceph software 
installed.

Replace APT keys (Debian, Ubuntu)

sudo apt-key del 17ED316D
curl https://git.ceph.com/release.asc | sudo apt-key add -

Replace RPM keys (Fedora, CentOS, SUSE, etc.)

sudo rpm -e --allmatches gpg-pubkey-17ed316d-4fb96ee8
sudo rpm --import 'https://git.ceph.com/release.asc'

Reinstalling packages (Fedora, CentOS, SUSE, etc.)

sudo yum clean metadata
sudo yum reinstall -y $(repoquery --disablerepo= --enablerepo=ceph \
--queryformat='%{NAME}' list '*')

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.10 (GNU/Linux)

iEYEARECAAYFAlX66k0ACgkQ2kQg7SiJlcg0wQCfVy+/2BfoNqtCfAcbuNABczFx
bpIAoLf8RTHisIn5wFvEb4Akym/UNn5l
=SEws
-END PGP SIGNATURE-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Happy SysAdmin Day!

2015-07-31 Thread Michael Kuriger
Thanks Mark you too

 

 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235





On 7/31/15, 3:02 PM, ceph-users on behalf of Mark Nelson
ceph-users-boun...@lists.ceph.com on behalf of mnel...@redhat.com wrote:

Most folks have either probably already left or are on their way out the
door late on a friday, but I just wanted to say Happy SysAdmin day to
all of the excellent System Administrators out there running Ceph
clusters. :)

Mark
___
ceph-users mailing list
ceph-users@lists.ceph.com
https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinf
o.cgi_ceph-2Dusers-2Dceph.comd=AwICAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSnc
m6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=eWaHPQA1Amni5T9DeUE2Z49jPowepQBRTYxB
Z_Fotdos=HX0cDDeTqopJKkZMAMpDAwHhwtOaunwuMtSupqRllGoe= 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] CEPH-GW replication, disable /admin/log

2015-06-22 Thread Michael Kuriger
Is it possible to disable the replication of /admin/log and other replication 
logs?  It seems that this log replication is occupying a lot of time in my 
cluster(s).  I’d like to only replicate users’ data.

Thanks!






Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] firefly to giant upgrade broke ceph-gw

2015-06-15 Thread Michael Kuriger
Hi all,
I recently upgraded my 2 ceph clusters from firefly to giant.  After the
update, ceph gateway has some issues.  I've even gone so far as to
completely remove all gateway related pools and recreated from scratch.

I can write data into the gateway, and that seems to work (most of the
time) but deleting is not working unless I specify an exact file to
delete.  Also, my radosgw-agent is not syncing buckets any longer.  I'm
using s3cmd to test reads/writes to the gateway.

Has anyone else had problems in giant?
 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw backup

2015-06-11 Thread Michael Kuriger
You may be able to use replication.  Here is a site showing a good example of 
how to set it up.  I have not tested replicating within the same datacenter, 
but you should just be able to define a new zone within your existing ceph 
cluster and replicate to it.

http://cephnotes.ksperis.com/blog/2015/03/13/radosgw-simple-replication-example



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Konstantin Ivanov
Sent: Thursday, May 28, 2015 1:44 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] radosgw backup

Hi everyone.
I'm wondering - is there a way to back up radosgw data?
Here is what I have already tried:
I created a backup pool and copied .rgw.buckets into it. Then I deleted an object 
via the S3 client, and copied the data from the backup pool back to .rgw.buckets. I still 
can't see the object in the S3 client, but I can get it over HTTP at the previously known URL.
Questions: where does radosgw store its info about objects (i.e. how do I make a restored 
object visible from the S3 client)? Is there a better way to back up radosgw data?
Thanks for any advice.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph mount error

2015-06-11 Thread Michael Kuriger
1) set up mds server
ceph-deploy mds --overwrite-conf create hostname of mds server

2) create filesystem
ceph osd pool create cephfs_data 128
ceph osd pool create cephfs_metadata 16
ceph fs new cephfs cephfs_metadata cephfs_data
ceph fs ls
ceph mds stat

3) mount it!
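
A hedged example of that last step, assuming cephx is enabled and the admin keyring is in the default location (replace the monitor address with your own):

mkdir -p /mnt/cephfs
mount -t ceph <mon-host>:6789:/ /mnt/cephfs \
    -o name=admin,secret=$(ceph-authtool -p /etc/ceph/ceph.client.admin.keyring)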


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of ???
Sent: Sunday, June 07, 2015 8:15 AM
To: ceph-us...@ceph.com; community
Cc: xuzh@gmail.com
Subject: [ceph-users] ceph mount error

Hi ,
My ceph health is OK.  Now I want to build a filesystem, following the CEPH FS 
QUICK START guide:
http://ceph.com/docs/master/start/quick-cephfs/https://urldefense.proofpoint.com/v2/url?u=http-3A__ceph.com_docs_master_start_quick-2Dcephfs_d=AwMGbgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=q4j_7A_3Avo64MLd_mNa6jWl9XuLv1sx5SHvl58A0Vos=5Ttzin1olsWLGMMcZsINYfk82p7_jiBGDejDXUqUQvQe=
However, I got an error when I use the command: mount -t ceph 
192.168.1.105:6789:/ /mnt/mycephfs.  The error is: mount error 22 = Invalid 
argument.
I have checked the manual, and I still don't know how to solve it.
I am looking forward to your reply!


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Is Ceph right for me?

2015-06-11 Thread Michael Kuriger
You might be able to accomplish that with something like Dropbox or ownCloud.

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Trevor 
Robinson - Key4ce
Sent: Wednesday, May 20, 2015 2:35 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Is Ceph right for me?

Hello,

Could somebody please advise me if Ceph is suitable for our use?

We are looking for a file system which is able to work over different locations 
which are connected by VPN. If one location was to go offline then the 
filesystem will stay online at both sites and then once connection is regained 
the latest file version will take priority.

The main use will be for website files so the changes are most likely to be any 
uploaded files and cache files as a lot of the data will be stored in a SQL 
database which is already replicated.

With Kind Regards,
Trevor Robinson

CTO at Key4ce
Key4ce - IT Professionals
https://urldefense.proofpoint.com/v2/url?u=https-3A__key4ce.com_d=AwMFAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=2W9LUdP7c-p9ne86lHzC9HrCJlqRadoKZr0lL_2jCpss=-GHFrZVDoc-S05-TAziAR-f-4eLd8MxrKbTkiSlWHyEe=

Skype:  KeyMalus.Trev
xmpp:  t.robin...@im4ce.com
Livechat:  
http://livechat.key4ce.comhttps://urldefense.proofpoint.com/v2/url?u=http-3A__livechat.key4ce.com_d=AwMFAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=2W9LUdP7c-p9ne86lHzC9HrCJlqRadoKZr0lL_2jCpss=EAVTZZsRYNxcyZr2JR9op7sqzfWA49ReJpeH7MkSgWQe=

NL:  +31 (0)40 290 3310
UK:  +44 (0)1332 898 999
CN:  +86 (0)7552 824 5985



The information contained in this message may be confidential and legally 
protected under applicable law. The message is intended solely for the 
addressee(s). If you are not the intended recipient, you are hereby notified 
that any use, forwarding, dissemination, or reproduction of this message is 
strictly prohibited and may be unlawful. If you are not the intended recipient, 
please contact the sender by return e-mail and destroy all copies of the 
original message.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Beginners ceph journal question

2015-06-09 Thread Michael Kuriger
You could mount /dev/sdb to a filesystem, such as /ceph-disk, and then do this:
ceph-deploy osd create ceph-node1:/ceph-disk

Your journal would be a file doing it this way.
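
A rough end-to-end sketch of that approach, assuming /dev/sdb is free to reformat:

mkfs.xfs /dev/sdb
mkdir -p /ceph-disk
mount /dev/sdb /ceph-disk        # add an fstab entry to make this permanent
ceph-deploy osd create ceph-node1:/ceph-disk

# the journal then shows up as a plain file inside the data directory
ls -l /ceph-disk/journal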





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235


From: Vickey Singh 
vickey.singh22...@gmail.commailto:vickey.singh22...@gmail.com
Date: Tuesday, June 9, 2015 at 12:21 AM
To: ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com 
ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com
Subject: [ceph-users] Beginners ceph journal question

Hello Cephers

A beginner's question on Ceph journal creation; I need answers from experts.

- Is it true that by default ceph-deploy creates the journal on a dedicated partition 
and the data on another partition, and does not create the journal as a file?

ceph-deploy osd create ceph-node1:/dev/sdb

This command is creating
data partition : /dev/sdb2
Journal Partition : /dev/sdb1

In the ceph-deploy command I have not specified a journal partition, but it still 
creates a journal on sdb1?

- How can I confirm whether the journal is on a block device partition or in a file?

- How can I create the journal as a file? An example command would be helpful.

Regards
Vicky
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] client.radosgw.gateway for 2 radosgw servers

2015-05-19 Thread Michael Kuriger
I have 3 GW servers, but they are defined like this:


[client.radosgw.ceph-gw1]

rgw_ops_log_data_backlog = 4096

rgw_enable_ops_log = true

keyring = /etc/ceph/ceph.client.radosgw.keyring

rgw_print_continue = true

rgw_ops_log_rados = true

host = ceph-gw1

rgw_frontends = civetweb port=80

rgw_socket_path = /var/run/ceph/ceph-client.radosgw.ceph-gw1.asok


[client.radosgw.ceph-gw2]

rgw_ops_log_data_backlog = 4096

rgw_enable_ops_log = true

keyring = /etc/ceph/ceph.client.radosgw.keyring

rgw_print_continue = true

rgw_ops_log_rados = true

host = ceph-gw2

rgw_frontends = civetweb port=80

rgw_socket_path = /var/run/ceph/ceph-client.radosgw.ceph-gw2.asok


[client.radosgw.ceph-gw3]

rgw_ops_log_data_backlog = 4096

rgw_enable_ops_log = true

keyring = /etc/ceph/ceph.client.radosgw.keyring

rgw_print_continue = true

rgw_ops_log_rados = true

host = ceph-gw3

rgw_frontends = civetweb port=80

rgw_socket_path = /var/run/ceph/ceph-client.radosgw.ceph-gw3.asok
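
For the keys themselves, a hedged sketch of creating one identity per gateway (the caps follow the usual radosgw examples; adjust as needed):

ceph auth get-or-create client.radosgw.ceph-gw1 \
    mon 'allow rwx' osd 'allow rwx' \
    -o /etc/ceph/ceph.client.radosgw.keyring

Repeat for ceph-gw2 and ceph-gw3, copying each resulting keyring only to its own host, so every gateway has its own key while ceph.conf stays identical everywhere.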






Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235


From: Florent MONTHEL fmont...@flox-arts.netmailto:fmont...@flox-arts.net
Date: Monday, May 18, 2015 at 6:14 PM
To: ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com 
ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com
Subject: [ceph-users] client.radosgw.gateway for 2 radosgw servers

Hi List,

I would like to know the best way to have several radosgw servers on the same 
cluster with the same ceph.conf file

For now, I have 2 radosgw servers, but I have 1 conf file on each, with the below 
section on parrot:

[client.radosgw.gateway]
host = parrot
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = 
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
rgw print continue = false
rgw enable usage log = true
rgw usage log tick interval = 30
rgw usage log flush threshold = 1024
rgw usage max shards = 32
rgw usage max user shards = 1

And below section on cougar node :

[client.radosgw.gateway]
host = cougar
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = 
log file = /var/log/radosgw/client.radosgw.gateway.log
rgw frontends = fastcgi socket_port=9000 socket_host=0.0.0.0
rgw print continue = false
rgw enable usage log = true
rgw usage log tick interval = 30
rgw usage log flush threshold = 1024
rgw usage max shards = 32
rgw usage max user shards = 1

Is it possible to have 2 different keys for parrot and cougar and 2 sections 
client.radosgw in order to have the same ceph.conf for whole cluster (and use 
ceph-deploy to push the conf)?

Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Does anyone understand Calamari??

2015-05-13 Thread Michael Kuriger
OK, I finally got mine working.  For whatever reason, the latest version of 
salt was the issue for me.  Leaving the latest version of salt on the calamari 
server is working, but had to downgrade the minions.


  Removed:

salt.noarch 0:2014.7.5-1.el6    salt-minion.noarch 0:2014.7.5-1.el6



  Installed:

salt.noarch 0:2014.7.1-1.el6    salt-minion.noarch 0:2014.7.1-1.el6


This is on CentOS 6.6


-=Mike Kuriger





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235


From: Bruce McFarland 
bruce.mcfarl...@taec.toshiba.commailto:bruce.mcfarl...@taec.toshiba.com
Date: Tuesday, May 12, 2015 at 4:34 PM
To: ceph-calam...@lists.ceph.commailto:ceph-calam...@lists.ceph.com 
ceph-calam...@lists.ceph.commailto:ceph-calam...@lists.ceph.com, ceph-users 
ceph-us...@ceph.commailto:ceph-us...@ceph.com, ceph-devel 
(ceph-de...@vger.kernel.orgmailto:ceph-de...@vger.kernel.org) 
ceph-de...@vger.kernel.orgmailto:ceph-de...@vger.kernel.org
Subject: [ceph-users] Does anyone understand Calamari??

Increasing the audience since ceph-calamari is not responsive. What salt 
event/info does the Calamari Master expect to see from the ceph-mon to 
determine there is a working cluster? I had to change servers hosting the 
calamari master and can’t get the new machine to recognize the cluster. The 
‘salt \* ceph.get_heartbeats’ returns monmap, fsid, ver, epoch, etc for the 
monitor and all of the osd’s. Can anyone point me to docs or code that might 
enlighten me to what I’m overlooking? Thanks.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New Calamari server

2015-05-12 Thread Michael Kuriger
In my case, I did remove all salt keys.  The salt portion of my install is
working.  It’s just that the calamari server is not seeing the ceph
cluster.


 

 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235





On 5/12/15, 1:35 AM, Alexandre DERUMIER aderum...@odiso.com wrote:

Hi, when you have remove salt from nodes,

do you have remove the old master key
/etc/salt/pki/minion/minion_master.pub

?

I have add the same behavior than you when reinstalling calamari server,
and previously installed salt on ceph nodes
(with explicit error about the key in /var/log/salt/minion on ceph nodes)

- Mail original -
De: Michael Kuriger mk7...@yp.com
À: ceph-users ceph-us...@ceph.com
Cc: ceph-devel ceph-de...@vger.kernel.org
Envoyé: Lundi 11 Mai 2015 23:43:34
Objet: [ceph-users] New Calamari server

I had an issue with my calamari server, so I built a new one from
scratch. 
I've been struggling trying to get the new server to start up and see my
ceph cluster. I went so far as to remove salt and diamond from my ceph
nodes and reinstalled again. On my calamari server, it sees the hosts
connected but doesn't detect a cluster. What am I missing? I've set up
many calamari servers on different ceph clusters, but this is the first
time I've tried to build a new calamari server.

Here's what I see on my calamari GUI:

New Calamari Installation

This appears to be the first time you have started Calamari and there are
no clusters currently configured.

33 Ceph servers are connected to Calamari, but no Ceph cluster has been
created yet. Please use ceph-deploy to create a cluster; please see the
Inktank Ceph Enterprise documentation for more details.

Thanks! 
Mike Kuriger 




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 



[ceph-users] New Calamari server

2015-05-11 Thread Michael Kuriger
I had an issue with my calamari server, so I built a new one from scratch.
I've been struggling trying to get the new server to start up and see my
ceph cluster.  I went so far as to remove salt and diamond from my ceph
nodes and reinstalled again.  On my calamari server, it sees the hosts
connected but doesn't detect a cluster.  What am I missing?  I've set up
many calamari servers on different ceph clusters, but this is the first
time I've tried to build a new calamari server.

Here's what I see on my calamari GUI:

New Calamari Installation

This appears to be the first time you have started Calamari and there are
no clusters currently configured.

33 Ceph servers are connected to Calamari, but no Ceph cluster has been
created yet. Please use ceph-deploy to create a cluster; please see the
Inktank Ceph Enterprise documentation for more details.

Thanks!
Mike Kuriger




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down !

2015-03-16 Thread Michael Kuriger
I always keep my pg number a power of 2.  So I’d go from 2048 to 4096.  I’m not 
sure if this is the safest way, but it’s worked for me.
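
The commands themselves, as a sketch (the pool name is only an example); pgp_num has to follow pg_num once the new PGs have been created:

ceph osd pool set <pool> pg_num 4096
# wait for the new PGs to finish creating, then:
ceph osd pool set <pool> pgp_num 4096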





Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235


From: Chu Duc Minh chu.ducm...@gmail.commailto:chu.ducm...@gmail.com
Date: Monday, March 16, 2015 at 7:49 AM
To: Florent B flor...@coppint.commailto:flor...@coppint.com
Cc: ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com 
ceph-users@lists.ceph.commailto:ceph-users@lists.ceph.com
Subject: Re: [ceph-users] [SPAM] Changing pg_num = RBD VM down !

I'm using the latest Giant and have the same issue. When I increase the pg_num of a 
pool from 2048 to 2148, my VMs are still OK. When I increase it from 2148 to 2400, 
some VMs die (the qemu-kvm process dies).
My physical servers (hosting the VMs) run kernel 3.13 and use librbd.
I think it's a bug in librbd with the crushmap.
(I set crush_tunables3 on my ceph cluster, does that make sense?)

Do you know a way to safely increase pg_num? (I don't think increasing pg_num by 100 
each time is a safe and good way.)

Regards,

On Mon, Mar 16, 2015 at 8:50 PM, Florent B 
flor...@coppint.commailto:flor...@coppint.com wrote:
We are on Giant.

On 03/16/2015 02:03 PM, Azad Aliyar wrote:

 May I know your ceph version.?. The latest version of firefly 80.9 has
 patches to avoid excessive data migrations during rewighting osds. You
 may need set a tunable inorder make this patch active.

 This is a bugfix release for firefly.  It fixes a performance regression
 in librbd, an important CRUSH misbehavior (see below), and several RGW
 bugs.  We have also backported support for flock/fcntl locks to ceph-fuse
 and libcephfs.

 We recommend that all Firefly users upgrade.

 For more detailed information, see
   
 http://docs.ceph.com/docs/master/_downloads/v0.80.9.txthttps://urldefense.proofpoint.com/v2/url?u=http-3A__docs.ceph.com_docs_master_-5Fdownloads_v0.80.9.txtd=AwMFaQc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=0MEOMMXqQGLq4weFd85B2Bxn5uBH9V9uMiuajNVb7o0s=-HHkWm2cMQZ06FKpWF4Ai-YkFb9lUR_tH_KR0eITbuUe=

 Adjusting CRUSH maps
 

 * This point release fixes several issues with CRUSH that trigger
   excessive data migration when adjusting OSD weights.  These are most
   obvious when a very small weight change (e.g., a change from 0 to
   .01) triggers a large amount of movement, but the same set of bugs
   can also lead to excessive (though less noticeable) movement in
   other cases.

   However, because the bug may already have affected your cluster,
   fixing it may trigger movement *back* to the more correct location.
   For this reason, you must manually opt-in to the fixed behavior.

   In order to set the new tunable to correct the behavior::

  ceph osd crush set-tunable straw_calc_version 1

   Note that this change will have no immediate effect.  However, from
   this point forward, any 'straw' bucket in your CRUSH map that is
   adjusted will get non-buggy internal weights, and that transition
   may trigger some rebalancing.

   You can estimate how much rebalancing will eventually be necessary
   on your cluster with::

  ceph osd getcrushmap -o /tmp/cm
  crushtool -i /tmp/cm --num-rep 3 --test --show-mappings  /tmp/a 21
  crushtool -i /tmp/cm --set-straw-calc-version 1 -o /tmp/cm2
  crushtool -i /tmp/cm2 --reweight -o /tmp/cm2
  crushtool -i /tmp/cm2 --num-rep 3 --test --show-mappings  /tmp/b
 21
  wc -l /tmp/a  # num total mappings
  diff -u /tmp/a /tmp/b | grep -c ^+# num changed mappings

Divide the total number of lines in /tmp/a with the number of lines
changed.  We've found that most clusters are under 10%.

You can force all of this rebalancing to happen at once with::

  ceph osd crush reweight-all

Otherwise, it will happen at some unknown point in the future when
CRUSH weights are next adjusted.

 Notable Changes
 ---

 * ceph-fuse: flock, fcntl lock support (Yan, Zheng, Greg Farnum)
 * crush: fix straw bucket weight calculation, add straw_calc_version
   tunable (#10095 Sage Weil)
 * crush: fix tree bucket (Rongzu Zhu)
 * crush: fix underflow of tree weights (Loic Dachary, Sage Weil)
 * crushtool: add --reweight (Sage Weil)
 * librbd: complete pending operations before losing image (#10299 Jason
   Dillaman)
 * librbd: fix read caching performance regression (#9854 Jason Dillaman)
 * librbd: gracefully handle deleted/renamed pools (#10270 Jason Dillaman)
 * mon: fix dump of chooseleaf_vary_r tunable (Sage Weil)
 * osd: fix PG ref leak in snaptrimmer on peering (#10421 Kefu Chai)
 * osd: handle no-op write with snapshot (#10262 Sage Weil)
 * radosgw-admi




 On 03/16/2015 12:37 PM, Alexandre DERUMIER wrote:
  VMs are running on the same nodes than OSD
  Are you sure that you didn't some kind of out of memory.
  pg rebalance can be memory hungry. (depend how many osd you have).

 2 OSD per host

Re: [ceph-users] Ceph repo - RSYNC?

2015-03-05 Thread Michael Kuriger
I use reposync to keep mine updated when needed.


Something like:
cd ~ /ceph/repos
reposync -r Ceph -c /etc/yum.repos.d/ceph.repo
reposync -r Ceph-noarch -c /etc/yum.repos.d/ceph.repo
reposync -r elrepo-kernel -c /etc/yum.repos.d/elrepo.repo
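
One step that is usually still needed before clients can point at the local copy is regenerating the repo metadata (paths assume the layout above):

createrepo ~/ceph/repos/Ceph
createrepo ~/ceph/repos/Ceph-noarch
createrepo ~/ceph/repos/elrepo-kernel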



 
Michael Kuriger
Sr. Unix Systems Engineer
mk7...@yp.com | 818-649-7235


-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Brian 
Rak
Sent: Thursday, March 05, 2015 10:14 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph repo - RSYNC?

Do any of the Ceph repositories run rsync?  We generally mirror the repository 
locally so we don't encounter any unexpected upgrades.

eu.ceph.com used to run this, but it seems to be down now.

# rsync rsync://eu.ceph.com
rsync: failed to connect to eu.ceph.com: Connection refused (111) rsync error: 
error in socket IO (code 10) at clientserver.c(124) [receiver=3.0.6]

___
ceph-users mailing list
ceph-users@lists.ceph.com
https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listinfo.cgi_ceph-2Dusers-2Dceph.comd=AwICAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=5oPk_opCf1eJ_BZLqS3mzFHka3r1-lGm_ya8mvkaIh8s=sYjohrI39G9Owm-E92bzgsL53AYrmkFJJEzt-fEC7awe=
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Michael Kuriger
Thanks Sage for the quick reply!

-=Mike


On 2/26/15, 8:05 AM, Sage Weil sw...@redhat.com wrote:

On Thu, 26 Feb 2015, Michael Kuriger wrote:
 I'd also like to set this up.  I'm not sure where to begin.  When you
say
 enabled by default, where is it enabled?

The civetweb frontend is built into the radosgw process, so for the most
part you just have to get radosgw started and configured.  It isn't well
documented yet, but basically just skip everything that has anythig to do
with fastcgi or apache.  For example, if you follow the docs, you can
jump 
straight to here:

   
 https://urldefense.proofpoint.com/v2/url?u=http-3A__docs.ceph.com_docs_ma
ster_install_install-2Dceph-2Dgateway_-23id5d=AwIDAwc=lXkdEK1PC7UK9oKA-B
BSI8p1AamzLOSncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=zHhqYz7SSnUgByi3RhC_G
O0wnyC-Tu-F34CbplxqJuEs=ir3IkFo7P_iloq8saI65giD_fkZy4CCefyRWHQtaIese=

(or maybe set up the wildcard DNS if you want to get fancy).

If you're on firefly, you also need to add the line below to ceph.conf to
enable civetweb; for giant and hammer radosgw will listen on port 7480 by
default.

sage


 
 Many thanks,
 Mike
 
 
 On 2/25/15, 1:49 PM, Sage Weil sw...@redhat.com wrote:
 
 On Wed, 25 Feb 2015, Robert LeBlanc wrote:
  We tried to get radosgw working with Apache + mod_fastcgi, but due to
  the changes in radosgw, Apache, mode_*cgi, etc and the documentation
  lagging and not having a lot of time to devote to it, we abandoned
it.
  Where it the documentation for civetweb? If it is appliance like and
  easy to set-up, we would like to try it to offer some feedback on
your
  question.
 
 In giant and hammer, it is enabled by default on port 7480.  On
firefly,
 you need to add the line
 
  rgw frontends = fastcgi, civetweb port=7480
 
 to ceph.conf (you can of course adjust the port number if you like) and
 radosgw will run standalone w/ no apache or anything else.
 
 sage
 
 
  
  Thanks,
  Robert LeBlanc
  
  On Wed, Feb 25, 2015 at 12:31 PM, Sage Weil sw...@redhat.com wrote:
   Hey,
  
   We are considering switching to civetweb (the embedded/standalone
rgw
 web
   server) as the primary supported RGW frontend instead of the
current
   apache + mod-fastcgi or mod-proxy-fcgi approach.  Supported here
 means
   both the primary platform the upstream development focuses on and
 what the
   downstream Red Hat product will officially support.
  
   How many people are using RGW standalone using the embedded
civetweb
   server instead of apache?  In production?  At what scale?  What
   version(s) (civetweb first appeared in firefly and we've backported
 most
   fixes).
  
   Have you seen any problems?  Any other feedback?  The hope is to
 (vastly)
   simplify deployment.
  
   Thanks!
   sage
   ___
   ceph-users mailing list
   ceph-users@lists.ceph.com
   
 
https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_list
in
 
fo.cgi_ceph-2Dusers-2Dceph.comd=AwICAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzL
OS
 
ncm6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=kDLPSKEyNK6uMLfvU8U7GTva3RMufAb
9w
 81RjI2K1ZUs=VsyJ8UHWG0ApL86WXaeD5eOV5SRA7VmGeSkKGws3qMIe=
  --
  To unsubscribe from this list: send the line unsubscribe
ceph-devel in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at
 
https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_maj
or
 
domo-2Dinfo.htmld=AwICAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQ
r
 
=CSYA9OS6Qd7fQySI2LDvlQm=kDLPSKEyNK6uMLfvU8U7GTva3RMufAb9w81RjI2K1ZUs
=i
 RNo1K7oJZ-14R-LWuNvEI0WoOr5UuaDkm3IQx77VIoe=
  
  
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 
https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.ceph.com_listi
nf
 
o.cgi_ceph-2Dusers-2Dceph.comd=AwICAgc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOS
nc
 
m6Vfn0C_UQr=CSYA9OS6Qd7fQySI2LDvlQm=kDLPSKEyNK6uMLfvU8U7GTva3RMufAb9w8
1R
 jI2K1ZUs=VsyJ8UHWG0ApL86WXaeD5eOV5SRA7VmGeSkKGws3qMIe=
 
 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at
https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_major
domo-2Dinfo.htmld=AwIDAwc=lXkdEK1PC7UK9oKA-BBSI8p1AamzLOSncm6Vfn0C_UQr
=CSYA9OS6Qd7fQySI2LDvlQm=zHhqYz7SSnUgByi3RhC_GO0wnyC-Tu-F34CbplxqJuEs=w
K6iPf6pMHg8FnOgdq4XuuXYeIVz9FEIrNtGI15s6YAe=
 
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] who is using radosgw with civetweb?

2015-02-26 Thread Michael Kuriger
I'd also like to set this up.  I'm not sure where to begin.  When you say
enabled by default, where is it enabled?

Many thanks,
Mike


On 2/25/15, 1:49 PM, Sage Weil sw...@redhat.com wrote:

On Wed, 25 Feb 2015, Robert LeBlanc wrote:
 We tried to get radosgw working with Apache + mod_fastcgi, but due to
 the changes in radosgw, Apache, mod_*cgi, etc. and the documentation
 lagging and not having a lot of time to devote to it, we abandoned it.
 Where is the documentation for civetweb? If it is appliance-like and
 easy to set up, we would like to try it to offer some feedback on your
 question.

In giant and hammer, it is enabled by default on port 7480.  On firefly,
you need to add the line

 rgw frontends = fastcgi, civetweb port=7480

to ceph.conf (you can of course adjust the port number if you like) and
radosgw will run standalone w/ no apache or anything else.

sage
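
For anyone else following along, a minimal firefly-era ceph.conf stanza for a
standalone civetweb radosgw looks something like this (the section name,
keyring path and port below are just an example; adjust them for your own
gateway user and host):

  [client.radosgw.gateway]
      host = gw-host1
      keyring = /etc/ceph/ceph.client.radosgw.gateway.keyring
      rgw frontends = fastcgi, civetweb port=7480

After starting radosgw (e.g. radosgw -n client.radosgw.gateway, or via the
ceph-radosgw init script on RPM-based distros), a quick sanity check is
curl http://gw-host1:7480/ ; if civetweb is answering, you should get back a
small ListAllMyBucketsResult XML document rather than a connection error.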


 
 Thanks,
 Robert LeBlanc
 
 On Wed, Feb 25, 2015 at 12:31 PM, Sage Weil sw...@redhat.com wrote:
  Hey,
 
  We are considering switching to civetweb (the embedded/standalone rgw
web
  server) as the primary supported RGW frontend instead of the current
  apache + mod-fastcgi or mod-proxy-fcgi approach.  Supported here
means
  both the primary platform the upstream development focuses on and
what the
  downstream Red Hat product will officially support.
 
  How many people are using RGW standalone using the embedded civetweb
  server instead of apache?  In production?  At what scale?  What
  version(s) (civetweb first appeared in firefly and we've backported
most
  fixes).
 
  Have you seen any problems?  Any other feedback?  The hope is to
(vastly)
  simplify deployment.
 
  Thanks!
  sage

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] two mount points, two diffrent data

2015-01-16 Thread Michael Kuriger
You’re using a file system on 2 hosts that is not cluster aware.  Metadata 
written on hosta is not sent to hostb in this case.  You may be interested in 
looking at cephfs for this use case.
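
A rough sketch of what that looks like, assuming a monitor at mon-host:6789,
a running MDS, and the admin key stored in /etc/ceph/admin.secret (all of
those names are placeholders for your own setup):

  # run on both hosta and hostb
  mkdir -p /mnt/cephfs
  mount -t ceph mon-host:6789:/ /mnt/cephfs \
      -o name=admin,secretfile=/etc/ceph/admin.secret

  # a file created on hosta ...
  touch /mnt/cephfs/test1
  # ... shows up on hostb right away, because the MDS coordinates the
  # metadata between the clients
  ls /mnt/cephfs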


Michael Kuriger
mk7...@yp.com
818-649-7235
MikeKuriger (IM)

From: Rafał Michalak rafa...@gmail.com
Date: Wednesday, January 14, 2015 at 5:20 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] two mount points, two diffrent data

Hello I have trouble with this situation

#node1
mount /dev/rbd/rbd/test /mnt
cd /mnt
touch test1
ls (i see test1, OK)

#node2
mount /dev/rbd/rbd/test /mnt
cd /mnt
(i see test1, OK)
touch test2
ls (i see test2, OK)

#node1
ls (i see test1, BAD)
touch test3
ls (i see test1, test3 BAD)

#node2
ls (i see test1, test2 BAD)

Why is the data not replicating across the mounted filesystems?
I tried with both ext4 and xfs.
The data is visible only after unmounting and mounting again.

I check health in ceph status and is HEALTH_OK

What I doing wrong ?
Thanks for any help.


My system
Ubuntu 14.04.01 LTS

#ceph --version
ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)

#modinfo libceph
filename:   /lib/modules/3.13.0-44-generic/kernel/net/ceph/libceph.ko
license: GPL
description:Ceph filesystem for Linux
author:   Patience Warnick patie...@newdream.net
author:   Yehuda Sadeh yeh...@hq.newdream.net
author:   Sage Weil s...@newdream.net
srcversion: B8E83D4DFC53B113603CF52
depends:libcrc32c
intree:Y
vermagic:   3.13.0-44-generic SMP mod_unload modversions
signer:   Magrathea: Glacier signing key
sig_key:  50:8C:3B:4B:F1:08:ED:36:B6:06:2F:81:27:82:F7:7C:37:B9:85:37
sig_hashalgo:   sha512

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] calamari dashboard missing usage data after adding/removing ceph nodes

2014-12-30 Thread Michael Kuriger
Hi Brian,
I had this problem when I upgraded to firefly (or possibly giant) – At any 
rate, the data values changed at some point and calamari needs a slight update.

Check this file:
/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/v1.py

diff v1.py v1.py.ori
105c105
<         return kb
---
>         return kb * 1024
111,113c111,113
<             'used_bytes': to_bytes(get_latest_graphite(df_path('total_used_bytes'))),
<             'capacity_bytes': to_bytes(get_latest_graphite(df_path('total_bytes'))),
<             'free_bytes': to_bytes(get_latest_graphite(df_path('total_avail_bytes')))
---
>             'used_bytes': to_bytes(get_latest_graphite(df_path('total_used'))),
>             'capacity_bytes': to_bytes(get_latest_graphite(df_path('total_space'))),
>             'free_bytes': to_bytes(get_latest_graphite(df_path('total_avail')))
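
After patching that file you will, as far as I remember, need to restart the
calamari services so the REST API picks up the change. On my CentOS box that
was roughly the following (service names vary by distro and calamari version,
so treat this as a sketch):

  sudo service supervisord restart   # restarts the calamari backend
  sudo service httpd restart         # the REST API runs under Apache here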



From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Brian 
Jarrett
Sent: Tuesday, December 30, 2014 7:38 AM
To: ceph-us...@ceph.com
Subject: [ceph-users] calamari dashboard missing usage data after 
adding/removing ceph nodes

Cephistas,
I've been running a Ceph cluster for several months now.  I started out with a 
VM called master as the admin node and a monitor and two Dell servers as OSD 
nodes (called  Node1 and Node2) and also made them monitors so I had 3 monitors.
After I got that all running fine, I added Calamari to the admin node.  Then, I 
needed the Dell servers for another project, so we bought 3 used HP servers and 
I called them Ceph1, Ceph2, and Ceph3.  The HP servers were added as OSD and 
Monitor nodes and the Dell servers were removed.  Since I now had 3 monitors 
with the HP servers, I removed the monitor on master.

So now master is my admin node with Calamari still running on it, but now the 
Usage panel on the dashboard is blank and the two data points for it are not 
being populated anymore.

So I found:
http://calamari.readthedocs.org/en/latest/operations/troubleshooting.html

I can see data by browsing to my calamari server /graphite/dashboard for all 
the monitors (there are still metrics for the removed monitors), but I don't 
see any data under ceph.cluster.cluster_id.df.total_used or ...total_avail   
Django posts errors in the log about not finding these values, so I think 
that's where my problem lies.
I've tried to locate all of the graphite and diamond configurations that I can 
find and I can't locate anything that would populate those values.
Any clues as to where I should look to get this working properly would be great.

Thanks!

Brian
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Tip of the week: don't use Intel 530 SSD's for journals

2014-11-25 Thread Michael Kuriger
My cluster is actually very fast without SSD drives.  Thanks for the
advice!

Michael Kuriger 
mk7...@yp.com
818-649-7235

MikeKuriger (IM)




On 11/25/14, 7:49 AM, Mark Nelson mark.nel...@inktank.com wrote:

On 11/25/2014 09:41 AM, Erik Logtenberg wrote:
 If you are like me, you have the journals for your OSD's with rotating
 media stored separately on an SSD. If you are even more like me, you
 happen to use Intel 530 SSD's in some of your hosts. If so, please do
 check your S.M.A.R.T. statistics regularly, because these SSD's really
 can't cope with Ceph.

 Check out the media-wear graphs for the two Intel 530's in my cluster.
 As soon as those declining lines get down to 30% or so, they need to be
 replaced. That means less than half a year between purchase and
 end-of-life :(

 Tip of the week, keep an eye on those statistics, don't let a failing
 SSD surprise you.

This is really good advice, and it's not just the Intel 530s.  Most
consumer grade SSDs have pretty low write endurance.  If you mostly are
doing reads from your cluster you may be OK, but if you have even
moderately high write workloads and you care about avoiding OSD downtime
(which in a production cluster is pretty important though not usually
100% critical), get high write endurance SSDs.

Mark
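
For anyone who wants to automate that check, something along these lines
works against most Intel SSDs (the attribute name and number differ between
vendors and firmware revisions, so treat it as a sketch):

  # Media_Wearout_Indicator counts down from 100 as the NAND wears out;
  # column 4 is the normalized value, alert well before it hits the threshold
  smartctl -A /dev/sdb | awk '/Media_Wearout_Indicator/ {print $2, $4}'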


 Erik.




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Question about mount the same rbd in different machine

2014-11-25 Thread Michael Kuriger
You can't write from 2 nodes mounted to the same rbd at the same time without a 
cluster aware file system.
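
If all you need is to read the same data from a second machine, a safer
pattern is to keep a single writer and map the image read-only everywhere
else (rough sketch; the image and mount point names are placeholders):

  # on the one machine that is allowed to write
  rbd map foo
  mount /dev/rbd/rbd/foo /mnt

  # on every other machine
  rbd map foo --read-only
  mount -o ro /dev/rbd/rbd/foo /mnt

Note the read-only mounts still will not see new writes until they remount;
the point is simply that they cannot corrupt the filesystem. For true shared
read/write access, look at CephFS or a cluster filesystem on top of the rbd.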



-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of mail 
list
Sent: Tuesday, November 25, 2014 7:30 PM
To: ceph-us...@ceph.com
Subject: [ceph-users] Question about mount the same rbd in different machine

Hi, all

I create a rbd named foo, and then map it and mount on two different machine, 
and when i touch a file on the machine A, machine B can not see the new file, 
and machine B can also touch a same file!

I want to know if the rbd the same on machine A and B? or exactly they are two 
rbd?

Any idea will be appreciate!
 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Question about mount the same rbd in different machine

2014-11-25 Thread Michael Kuriger
Each server mounting the rbd device thinks it's the only server writing to it.  
They are not aware of the other server and therefore will overwrite and corrupt 
the filesystem as soon as each server writes a file.

-Original Message-
From: mail list [mailto:louis.hust...@gmail.com] 
Sent: Tuesday, November 25, 2014 8:11 PM
To: Michael Kuriger
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] Question about mount the same rbd in different machine

hi, 

But i have touched the same file on the two machine under the same rbd with no 
error. 
will it cause some problem or just not suggested but can do?

On Nov 26, 2014, at 12:08, Michael Kuriger mk7...@yp.com wrote:

 You can't write from 2 nodes mounted to the same rbd at the same time without 
 a cluster aware file system.
 
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of mail 
 list
 Sent: Tuesday, November 25, 2014 7:30 PM
 To: ceph-us...@ceph.com
 Subject: [ceph-users] Question about mount the same rbd in different machine
 
 Hi, all
 
 I create a rbd named foo, and then map it and mount on two different machine, 
 and when i touch a file on the machine A, machine B can not see the new file, 
 and machine B can also touch a same file!
 
 I want to know if the rbd the same on machine A and B? or exactly they are 
 two rbd?
 
 Any idea will be appreciate!

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Question about mount the same rbd in different machine

2014-11-25 Thread Michael Kuriger
I cannot go into detail about how or where your particular system is writing 
files.  All I can reiterate is that rbd images can only be safely mounted on one host 
at a time, unless you're using a cluster-aware file system.  

Hope that helps!
-Mike
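
For completeness: if you really do need the same block device writable from
both machines, the filesystem on it has to be cluster aware. A very rough
OCFS2 sketch, assuming the o2cb cluster stack is already configured and
running on both nodes (that setup is not shown here):

  mkfs.ocfs2 -L shared-rbd /dev/rbd/rbd/foo   # run once, from one node only
  mount -t ocfs2 /dev/rbd/rbd/foo /mnt        # then mount it on both nodes

GFS2 on top of the same rbd is another option, and CephFS avoids the shared
block device entirely.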

-Original Message-
From: mail list [mailto:louis.hust...@gmail.com] 
Sent: Tuesday, November 25, 2014 8:27 PM
To: Michael Kuriger
Cc: ceph-us...@ceph.com
Subject: Re: [ceph-users] Question about mount the same rbd in different machine

Hi Michael,

I write the same file with different content, and there is no hint for 
overwrite, so when the corrupt will appear?

On Nov 26, 2014, at 12:23, Michael Kuriger mk7...@yp.com wrote:

 Each server mounting the rbd device thinks it's the only server writing to 
 it.  They are not aware of the other server and therefore will overwrite and 
 corrupt the filesystem as soon as each server writes a file.
 
 -Original Message-
 From: mail list [mailto:louis.hust...@gmail.com] 
 Sent: Tuesday, November 25, 2014 8:11 PM
 To: Michael Kuriger
 Cc: ceph-us...@ceph.com
 Subject: Re: [ceph-users] Question about mount the same rbd in different 
 machine
 
 hi, 
 
 But i have touched the same file on the two machine under the same rbd with 
 no error. 
 will it cause some problem or just not suggested but can do?
 
 On Nov 26, 2014, at 12:08, Michael Kuriger mk7...@yp.com wrote:
 
 You can't write from 2 nodes mounted to the same rbd at the same time 
 without a cluster aware file system.
 
 
 
 -Original Message-
 From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
 mail list
 Sent: Tuesday, November 25, 2014 7:30 PM
 To: ceph-us...@ceph.com
 Subject: [ceph-users] Question about mount the same rbd in different machine
 
 Hi, all
 
 I create a rbd named foo, and then map it and mount on two different 
 machine, and when i touch a file on the machine A, machine B can not see the 
 new file, and machine B can also touch a same file!
 
 I want to know if the rbd the same on machine A and B? or exactly they are 
 two rbd?
 
 Any idea will be appreciate!
 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to mount cephfs from fstab

2014-11-24 Thread Michael Kuriger
I mount mine with an init-script.
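
Roughly like this, in case it helps; a trivial sketch rather than a full LSB
init script (monitor address, mount point and secret file are taken from your
example, adjust to taste):

  #!/bin/bash
  # /etc/init.d/cephfs-mount : mount cephfs late in boot, after networking
  case "$1" in
    start)
      # -n skips the mtab update, which also sidesteps the
      # "error writing /etc/mtab" message
      mount -t ceph -n ceph-01:6789:/ /mnt/cephfs \
        -o name=testhost,secretfile=/root/testhost.key,noacl
      ;;
    stop)
      umount /mnt/cephfs
      ;;
  esac

On systemd hosts I believe adding noauto,x-systemd.automount,_netdev to the
fstab options is another way people handle it, though I have not tried that
myself.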

Michael Kuriger 
mk7...@yp.com
818-649-7235

MikeKuriger (IM)




On 11/24/14, 9:08 AM, Erik Logtenberg e...@logtenberg.eu wrote:

Hi,

I would like to mount a cephfs share from fstab, but it doesn't
completely work.

First of all, I followed the documentation [1], which resulted in the
following line in fstab:

ceph-01:6789:/ /mnt/cephfs/ ceph
name=testhost,secretfile=/root/testhost.key,noacl 0 2

Yes, this works when I manually try mount /mnt/cephfs, but it does
give me the following error/warning:

mount: error writing /etc/mtab: Invalid argument

Now, even though this error doesn't influence the mounting itself, it
does prohibit my machine from booting right. Apparently Fedora/systemd
doesn't like this error when going through fstab, so booting is not
possible.

The mtab issue can easily be worked around, by calling mount manually
and using the -n (--no-mtab) argument, like this:

mount -t ceph -n ceph-01:6789:/ /mnt/cephfs/ -o
name=testhost,secretfile=/root/testhost.key,noacl

However, I can't find a way to put that -n option in /etc/fstab itself
(since it's not a -o option. Currently, I have the noauto setting in
fstab, so it doesn't get mounted on boot at all. Then I have to manually
log in and say mount /mnt/cephfs to explicitly mount the share. Far
from ideal.

So, how do my fellow cephfs-users do this?

Thanks,

Erik.

[1] http://ceph.com/docs/giant/cephfs/fstab/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to run rados common by non-root user in ceph node

2014-11-24 Thread Michael Kuriger
The non root user needs to be able to read the key file.
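
Concretely, something along these lines should get rados working for the
oneadmin user (the paths are only an example):

  # export the key for client.oneadmin into its own keyring file and make
  # it readable by that user
  sudo ceph auth get client.oneadmin -o /etc/ceph/ceph.client.oneadmin.keyring
  sudo chown oneadmin /etc/ceph/ceph.client.oneadmin.keyring
  sudo chmod 600 /etc/ceph/ceph.client.oneadmin.keyring

  # then run the tools as oneadmin with that identity; without --id the
  # client defaults to client.admin, whose keyring it cannot read
  rados --id oneadmin --keyring /etc/ceph/ceph.client.oneadmin.keyring df -p one
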
Michael Kuriger
mk7...@yp.com
818-649-7235
MikeKuriger (IM)

From: Huynh Dac Nguyen ndhu...@spsvietnam.vn
Date: Wednesday, November 19, 2014 at 8:44 PM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] how to run rados common by non-root user in ceph node


Hi All,


After setting up ceph, I can run the ceph and rados commands as the root user.


How can I run them from a non-root account?


I created a key already:

[root@ho-srv-ceph-02 ~]# ceph auth list

client.oneadmin

key: AQBQY21UaLLhCBAAKjsM8qRxFpJA4ppbA7Rn9A==

caps: [mon] allow r

caps: [osd] allow rw pool=one


[oneadmin@ho-srv-ceph-02 ~]$ rados df

2014-11-20 11:43:44.143434 7f53ac04a760 -1 monclient(hunting): ERROR: missing 
keyring, cannot use cephx for authentication

2014-11-20 11:43:44.143611 7f53ac04a760  0 librados: client.admin 
initialization error (2) No such file or directory

couldn't connect to cluster! error -2



Regards,

Ndhuynh

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Calamari install issues

2014-11-21 Thread Michael Kuriger
I had to run "salt-call state.highstate" on my ceph nodes.
Also, if you’re running giant you’ll have to make a small change to get your 
disk stats to show up correctly.
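
For reference, the salt part was roughly this (run the first two commands on
the calamari master, the last one on each ceph node; the key steps are only
needed if the minion keys were never accepted):

  salt-key -L                 # list pending/accepted minion keys
  salt-key -A                 # accept all pending keys
  salt-call state.highstate   # on each ceph node: apply the calamari states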


/opt/calamari/venv/lib/python2.6/site-packages/calamari_rest_api-0.1-py2.6.egg/calamari_rest/views/v1.py


$ diff v1.py v1.py.ori
105c105
<         return kb
---
>         return kb * 1024
111,113c111,113
<             'used_bytes': to_bytes(get_latest_graphite(df_path('total_used_bytes'))),
<             'capacity_bytes': to_bytes(get_latest_graphite(df_path('total_bytes'))),
<             'free_bytes': to_bytes(get_latest_graphite(df_path('total_avail_bytes')))
---
>             'used_bytes': to_bytes(get_latest_graphite(df_path('total_used'))),
>             'capacity_bytes': to_bytes(get_latest_graphite(df_path('total_space'))),
>             'free_bytes': to_bytes(get_latest_graphite(df_path('total_avail')))



Michael Kuriger
mk7...@yp.com
818-649-7235
MikeKuriger (IM)

From: Shain Miley smi...@npr.org
Date: Friday, November 21, 2014 at 8:51 AM
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Calamari install issues

Hello all,

I followed the setup steps provided here:

http://karan-mj.blogspot.com/2014/09/ceph-calamari-survival-guide.html

I was able to build and install everything correctly as far as I can 
tell...however I am still not able to get the server to see the cluster.

I am getting the following errors after I log into the web gui:

4 Ceph servers are connected to Calamari, but no Ceph cluster has been created 
yet.


The ceph nodes have salt installed and are being managed by the salt-master:

root@calamari:/home/# salt-run manage.up
hqceph1.npr.org
hqceph2.npr.org
hqceph3.npr.org
hqosd1.npr.org

However something still seems to be missing:

root@calamari:/home/#  salt '*' test.ping; salt '*' ceph.get_heartbeats
hqceph1.npr.org:
True
hqceph2.npr.org:
True
hqosd1.npr.org:
True
hqceph3.npr.org:
True
hqceph1.npr.org:
'ceph.get_heartbeats' is not available.
hqceph3.npr.org:
'ceph.get_heartbeats' is not available.
hqceph2.npr.org:
'ceph.get_heartbeats' is not available.
hqosd1.npr.org:
'ceph.get_heartbeats' is not available.


Any help trying to move forward would be great!

Thanks in advance,

Shain


--
[NPR] | Shain Miley | Manager of Systems and Infrastructure, Digital Media | 
smi...@npr.org | p: 202-513-3649
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pg's degraded

2014-11-21 Thread Michael Kuriger
I have started over from scratch a few times myself ;-)


Michael Kuriger
mk7...@yp.com
818-649-7235
MikeKuriger (IM)

From: JIten Shah jshah2...@me.com
Date: Friday, November 21, 2014 at 9:44 AM
To: Michael Kuriger mk7...@yp.com
Cc: Craig Lewis cle...@centraldesktop.com, ceph-users ceph-us...@ceph.com
Subject: Re: [ceph-users] pg's degraded

Thanks Michael. That was a good idea.

I did:

1. sudo service ceph stop mds

2. ceph mds newfs 1 0 --yes-i-really-mean-it (where 1 and 0 are the pool IDs for 
metadata and data)

3. ceph health (It was healthy now!!!)

4. sudo service ceph start mds.$(hostname -s)

And I am back in business.

Thanks again.

—Jiten



On Nov 20, 2014, at 5:47 PM, Michael Kuriger mk7...@yp.com wrote:

Maybe delete the pool and start over?


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of JIten 
Shah
Sent: Thursday, November 20, 2014 5:46 PM
To: Craig Lewis
Cc: ceph-users
Subject: Re: [ceph-users] pg's degraded

Hi Craig,

Recreating the missing PG’s fixed it.  Thanks for your help.

But when I tried to mount the Filesystem, it gave me the “mount error 5”. I 
tried to restart the MDS server but it won’t work. It tells me that it’s 
laggy/unresponsive.

BTW, all these machines are VM’s.

[jshah@Lab-cephmon001 ~]$ ceph health detail
HEALTH_WARN mds cluster is degraded; mds Lab-cephmon001 is laggy
mds cluster is degraded
mds.Lab-cephmon001 at 17.147.16.111:6800/3745284 rank 0 is replaying journal
mds.Lab-cephmon001 at 17.147.16.111:6800/3745284 is laggy/unresponsive


—Jiten

On Nov 20, 2014, at 4:20 PM, JIten Shah jshah2...@me.com wrote:


Ok. Thanks.

—Jiten

On Nov 20, 2014, at 2:14 PM, Craig Lewis cle...@centraldesktop.com wrote:


If there's no data to lose, tell Ceph to re-create all the missing PGs.

ceph pg force_create_pg 2.33

Repeat for each of the missing PGs.  If that doesn't do anything, you might 
need to tell Ceph that you lost the OSDs.  For each OSD you moved, run ceph osd 
lost OSDID, then try the force_create_pg command again.

If that doesn't work, you can keep fighting with it, but it'll be faster to 
rebuild the cluster.
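
In shell terms that sequence boils down to something like the following (the
PG and OSD ids are placeholders; substitute the ones from your own
"ceph health detail" output):

  # first try recreating each PG that has no surviving copy
  for pg in 2.33 0.30 1.31; do
      ceph pg force_create_pg $pg
  done

  # if they just sit in "creating", declare the rebuilt OSDs lost and then
  # run the loop above again
  ceph osd lost 1 --yes-i-really-mean-it
  ceph osd lost 2 --yes-i-really-mean-it
  ceph osd lost 3 --yes-i-really-mean-it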



On Thu, Nov 20, 2014 at 1:45 PM, JIten Shah jshah2...@me.com wrote:
Thanks for your help.

I was using puppet to install the OSD’s where it chooses a path over a device 
name. Hence it created the OSD in the path within the root volume since the 
path specified was incorrect.

And all 3 of the OSD’s were rebuilt at the same time because it was unused and 
we had not put any data in there.

Any way to recover from this or should i rebuild the cluster altogether.

—Jiten

On Nov 20, 2014, at 1:40 PM, Craig Lewis cle...@centraldesktop.com wrote:


So you have your crushmap set to choose osd instead of choose host?

Did you wait for the cluster to recover between each OSD rebuild?  If you 
rebuilt all 3 OSDs at the same time (or without waiting for a complete recovery 
between them), that would cause this problem.



On Thu, Nov 20, 2014 at 11:40 AM, JIten Shah jshah2...@me.com wrote:
Yes, it was a healthy cluster and I had to rebuild because the OSD’s got 
accidentally created on the root disk. Out of 4 OSD’s I had to rebuild 3 of 
them.


[jshah@Lab-cephmon001 ~]$ ceph osd tree
# id weight type name up/down reweight
-1 0.5 root default
-2 0.0 host Lab-cephosd005
4 0.0 osd.4 up 1
-3 0.0 host Lab-cephosd001
0 0.0 osd.0 up 1
-4 0.0 host Lab-cephosd002
1 0.0 osd.1 up 1
-5 0.0 host Lab-cephosd003
2 0.0 osd.2 up 1
-6 0.0 host Lab-cephosd004
3 0.0 osd.3 up 1


[jshah@Lab-cephmon001 ~]$ ceph pg 2.33 query
Error ENOENT: i don't have pgid 2.33

—Jiten


On Nov 20, 2014, at 11:18 AM, Craig Lewis cle...@centraldesktop.com wrote:


Just to be clear, this is from a cluster that was healthy, had a disk replaced, 
and hasn't returned to healthy?  It's not a new cluster that has never been 
healthy, right?

Assuming it's an existing cluster, how many OSDs did you replace?  It almost 
looks like you replaced multiple OSDs at the same time, and lost data because 
of it.

Can you give us the output of `ceph osd tree`, and `ceph pg 2.33 query`?


On Wed, Nov 19, 2014 at 2:14 PM, JIten Shah jshah2...@me.com wrote:
After rebuilding a few OSD’s, I see that the pg’s are stuck in degraded mode. 
Sone are in the unclean and others are in the stale state. Somehow the MDS is 
also degraded. How do I recover the OSD’s and the MDS back to healthy ? Read 
through the documentation and on the web but no luck so far.

pg 2.33 is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [3]
pg 0.30 is stuck unclean since forever, current state 
stale+active+degraded+remapped, last

Re: [ceph-users] pg's degraded

2014-11-20 Thread Michael Kuriger
Maybe delete the pool and start over?


From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of JIten 
Shah
Sent: Thursday, November 20, 2014 5:46 PM
To: Craig Lewis
Cc: ceph-users
Subject: Re: [ceph-users] pg's degraded

Hi Craig,

Recreating the missing PG's fixed it.  Thanks for your help.

But when I tried to mount the Filesystem, it gave me the mount error 5. I 
tried to restart the MDS server but it won't work. It tells me that it's 
laggy/unresponsive.

BTW, all these machines are VM's.

[jshah@Lab-cephmon001 ~]$ ceph health detail
HEALTH_WARN mds cluster is degraded; mds Lab-cephmon001 is laggy
mds cluster is degraded
mds.Lab-cephmon001 at 17.147.16.111:6800/3745284 rank 0 is replaying journal
mds.Lab-cephmon001 at 17.147.16.111:6800/3745284 is laggy/unresponsive


-Jiten

On Nov 20, 2014, at 4:20 PM, JIten Shah jshah2...@me.com wrote:


Ok. Thanks.

-Jiten

On Nov 20, 2014, at 2:14 PM, Craig Lewis cle...@centraldesktop.com wrote:


If there's no data to lose, tell Ceph to re-create all the missing PGs.

ceph pg force_create_pg 2.33

Repeat for each of the missing PGs.  If that doesn't do anything, you might 
need to tell Ceph that you lost the OSDs.  For each OSD you moved, run ceph osd 
lost OSDID, then try the force_create_pg command again.

If that doesn't work, you can keep fighting with it, but it'll be faster to 
rebuild the cluster.



On Thu, Nov 20, 2014 at 1:45 PM, JIten Shah jshah2...@me.com wrote:
Thanks for your help.

I was using puppet to install the OSD's where it chooses a path over a device 
name. Hence it created the OSD in the path within the root volume since the 
path specified was incorrect.

And all 3 of the OSD's were rebuilt at the same time because it was unused and 
we had not put any data in there.

Any way to recover from this or should i rebuild the cluster altogether.

-Jiten

On Nov 20, 2014, at 1:40 PM, Craig Lewis cle...@centraldesktop.com wrote:


So you have your crushmap set to choose osd instead of choose host?

Did you wait for the cluster to recover between each OSD rebuild?  If you 
rebuilt all 3 OSDs at the same time (or without waiting for a complete recovery 
between them), that would cause this problem.



On Thu, Nov 20, 2014 at 11:40 AM, JIten Shah jshah2...@me.com wrote:
Yes, it was a healthy cluster and I had to rebuild because the OSD's got 
accidentally created on the root disk. Out of 4 OSD's I had to rebuild 3 of 
them.


[jshah@Lab-cephmon001 ~]$ ceph osd tree
# id weight type name up/down reweight
-1 0.5 root default
-2 0.0 host Lab-cephosd005
4 0.0 osd.4 up 1
-3 0.0 host Lab-cephosd001
0 0.0 osd.0 up 1
-4 0.0 host Lab-cephosd002
1 0.0 osd.1 up 1
-5 0.0 host Lab-cephosd003
2 0.0 osd.2 up 1
-6 0.0 host Lab-cephosd004
3 0.0 osd.3 up 1


[jshah@Lab-cephmon001 ~]$ ceph pg 2.33 query
Error ENOENT: i don't have pgid 2.33

-Jiten


On Nov 20, 2014, at 11:18 AM, Craig Lewis cle...@centraldesktop.com wrote:


Just to be clear, this is from a cluster that was healthy, had a disk replaced, 
and hasn't returned to healthy?  It's not a new cluster that has never been 
healthy, right?

Assuming it's an existing cluster, how many OSDs did you replace?  It almost 
looks like you replaced multiple OSDs at the same time, and lost data because 
of it.

Can you give us the output of `ceph osd tree`, and `ceph pg 2.33 query`?


On Wed, Nov 19, 2014 at 2:14 PM, JIten Shah jshah2...@me.com wrote:
After rebuilding a few OSD's, I see that the pg's are stuck in degraded mode. 
Sone are in the unclean and others are in the stale state. Somehow the MDS is 
also degraded. How do I recover the OSD's and the MDS back to healthy ? Read 
through the documentation and on the web but no luck so far.

pg 2.33 is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [3]
pg 0.30 is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [3]
pg 1.31 is stuck unclean since forever, current state stale+active+degraded, 
last acting [2]
pg 2.32 is stuck unclean for 597129.903922, current state 
stale+active+degraded, last acting [2]
pg 0.2f is stuck unclean for 597129.903951, current state 
stale+active+degraded, last acting [2]
pg 1.2e is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [3]
pg 2.2d is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [2]
pg 0.2e is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [3]
pg 1.2f is stuck unclean for 597129.904015, current state 
stale+active+degraded, last acting [2]
pg 2.2c is stuck unclean since forever, current state 
stale+active+degraded+remapped, last acting [3]
pg 0.2d is stuck stale for 422844.566858, 

[ceph-users] installing ceph object gateway

2014-11-06 Thread Michael Kuriger
Is there updated documentation explaining how to install and use the
object gateway?


http://docs.ceph.com/docs/master/install/install-ceph-gateway/

I attempted this install and quickly ran into problems.

Thanks!
-M

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com