[ceph-users] [ceph][nautilus] rbd-target-api Configuration does not have an entry for this host

2020-04-29 Thread Ignazio Cassano
Hello All, I just installed the ceph-iscsi target following the "manual installation" guide at https://docs.ceph.com/docs/nautilus/rbd/iscsi-target-cli-manual-install/ When the API server starts it seems to work, but the service status reports: Apr 30 07:22:51 lab-ceph-01 rbd-target-api[58740]: Processing
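(For anyone hitting the same message: "Configuration does not have an entry for this host" usually points at the gateway host not yet being known to the iSCSI gateway configuration. A minimal troubleshooting sketch, with service and tool names from the manual-install doc; the exact fix depends on the setup:

  # trusted_ip_list in /etc/ceph/iscsi-gateway.cfg must include every gateway IP
  cat /etc/ceph/iscsi-gateway.cfg
  systemctl restart rbd-target-gw rbd-target-api
  gwcli ls        # the host should appear under the target's gateways once defined
)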

[ceph-users] Re: ceph-ansible question

2020-04-29 Thread Szabo, Istvan (Agoda)
Ok, so this is how it works with LVM. I was playing around a bit with the following config:

---
dummy:
osd_scenario: lvm
crush_device_class: "nvme"
osds_per_device: 4
devices:
  - /dev/sde
lvm_volumes:
  - data: /dev/sdc
    db: db_osd1
    db_vg: journal
    crush_device_class: "hdd"
  - data:
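(For reference, a rough shell sketch of what ceph-ansible's lvm scenario ends up driving through ceph-volume, assuming the device and LV/VG names from the snippet above; this is an illustration, not the ansible run itself:

  # split one NVMe into 4 OSDs tagged with the nvme device class
  ceph-volume lvm batch --osds-per-device 4 --crush-device-class nvme /dev/sde
  # HDD data device with its DB on a pre-created LV (journal/db_osd1), class hdd
  ceph-volume lvm prepare --data /dev/sdc --block.db journal/db_osd1 --crush-device-class hdd
)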

[ceph-users] Re: ceph-ansible question

2020-04-29 Thread Robert LeBlanc
Yes, but they are just LVs, so you can not create them or delete them easily so that it returns the space to the VG for something else. Robert LeBlanc PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1 On Tue, Apr 28, 2020 at 6:55 PM Szabo, Istvan (Agoda) <

[ceph-users] Re: Shutdown nautilus cluster, start stuck in peering

2020-04-29 Thread Marc Roos
Hmmm, !@#$!@#$ did not enable cluster network. -Original Message- To: ceph-users Subject: [ceph-users] Shutdown nautilus cluster, start stuck in peering Shutdown nautilus cluster, start stuck in peering I followed some manual here to do before turning all nodes off: ceph osd set

[ceph-users] Shutdown nautilus cluster, start stuck in peering

2020-04-29 Thread Marc Roos
Shutdown nautilus cluster, start stuck in peering I followed some manual here to do before turning all nodes off: ceph osd set norecover ; ceph osd set norebalance ; ceph osd set nobackfill ; ceph osd set nodown ; ceph osd set pause And after reboot almost all pg's were stuck in peering
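(For anyone following the same procedure, a minimal sketch of the usual pattern: set the flags before shutdown, then clear every one of them again once all nodes and OSDs are back up, otherwise pause/nodown alone will keep the cluster looking stuck:

  # before shutting the cluster down
  ceph osd set norecover; ceph osd set norebalance; ceph osd set nobackfill
  ceph osd set nodown; ceph osd set pause
  # after all OSDs are back up
  ceph osd unset pause; ceph osd unset nodown
  ceph osd unset nobackfill; ceph osd unset norebalance; ceph osd unset norecover
)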

[ceph-users] Re: How to apply ceph.conf changes using new tool cephadm

2020-04-29 Thread JC Lopez
Hi, later versions of Ceph no longer rely on the configuration file but on a centralized MON configuration, which explains the lack of a config push function. Set your option in the MON config DB and it will be picked up by the daemon or the client upon connection: ceph config mon.{id}
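(A minimal sketch of working with the MON config DB; the option names and daemon ids below are only examples:

  # set an option centrally for a daemon type, a single daemon, or globally
  ceph config set osd osd_max_backfills 2
  ceph config set osd.3 debug_osd 10
  ceph config set global mon_max_pg_per_osd 300
  # inspect what is stored
  ceph config get osd.3 debug_osd
  ceph config dump
)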

[ceph-users] How to apply ceph.conf changes using new tool cephadm

2020-04-29 Thread Gencer W . Genç
Hi, === NOTE: I do not see my thread in the ceph list for some reason. I don't know if the list received my question or not. So, sorry if this is a duplicate. === I just deployed a new cluster with cephadm instead of ceph-deploy. In the past, if I changed ceph.conf for tweaking, I was able to copy them

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Francois Legrand
Hello, We set bluefs_buffered_io to false for the whole cluster except 2 OSDs (21 and 49), for which we decided to keep the value true for future experiments/troubleshooting as you asked. We then restarted all 25 down OSDs and they started... except one (number 8), which still continues to

[ceph-users] Re: Upgrade Luminous to Nautilus on a Debian system

2020-04-29 Thread Paul Emmerich
We run the Luminous/Mimic -> Nautilus upgrade by upgrading Ceph and Debian at the same time, i.e., your first scenario. Didn't encounter any problems with that; the Nautilus upgrade has been very smooth for us and we've migrated almost all of our deployments using our fully automated upgrade

[ceph-users] Re: Upgrade Luminous to Nautilus on a Debian system

2020-04-29 Thread Herve Ballans
Hi Alex, Thanks a lot for your tips; I note that for my planned upgrade. I take the opportunity here to add a complementary question regarding the require-osd-release functionality (ceph osd require-osd-release nautilus). I remember that one time I did that (on another cluster, a proxmox
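(For reference, a short sketch of checking and raising the flag; it should only be set once every OSD is actually running Nautilus:

  ceph osd dump | grep require_osd_release   # shows the current minimum release
  ceph osd require-osd-release nautilus      # raise it after all OSDs are on 14.2.x
)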

[ceph-users] Re: Upgrade Luminous to Nautilus on a Debian system

2020-04-29 Thread Alex Gorbachev
On Wed, Apr 29, 2020 at 11:54 AM Herve Ballans wrote: > Hi all, > > I'm planning to upgrade one of my Ceph clusters, currently on Luminous > 12.2.13 / Debian Stretch (updated). > On this cluster, Luminous is packaged from the official Ceph repo (deb > https://download.ceph.com/debian-luminous/

[ceph-users] Re: manually configure radosgw

2020-04-29 Thread Patrick Dowler
Marc and Ken, Thanks for the tips! I will try to work through this asap. -- Patrick Dowler Canadian Astronomy Data Centre Victoria, BC, Canada On Tue, 28 Apr 2020 at 16:04, Ken Dreyer wrote: > On Mon, Apr 27, 2020 at 11:21 AM Patrick Dowler > wrote: > > > > I am trying to manually create a

[ceph-users] Re: Newbie Question: CRUSH and Librados Profiling

2020-04-29 Thread Bobby
Hi once again! Can someone please help me with this question :-) Bobby! On Wed, Apr 29, 2020 at 2:05 PM Bobby wrote: > > > Hi, > > It is a newbie question. I would be really thankful if you can answer it > please. I want to compile the Ceph source code. Because I want to profile > *Librados*

[ceph-users] Re: Problems getting ceph-iscsi to work

2020-04-29 Thread Ron Gage
As I continue to try to get this to work in the first place, I came across this gem in journalctl -xe: Apr 29 12:00:57 iscsi1 rbd-target-api[20275]: Traceback (most recent call last): Apr 29 12:00:57 iscsi1 rbd-target-api[20275]: File "/usr/bin/rbd-target-api", line 2951, in Apr 29 12:00:57

[ceph-users] Re: Lock errors in iscsi gateway

2020-04-29 Thread Simone Lazzaris
> > What version of tcmu-runner did you use? Was it one of the 1.4 or 1.5 > releases or from the github master branch? > > There was a bug in the older 1.4 release where due to a linux kernel > initiator side change the behavior for an error code we used went from > retrying for up to 5 minutes

[ceph-users] Upgrade Luminous to Nautilus on a Debian system

2020-04-29 Thread Herve Ballans
Hi all, I'm planning to upgrade one of my Ceph clusters, currently on Luminous 12.2.13 / Debian Stretch (updated). On this cluster, Luminous is packaged from the official Ceph repo (deb https://download.ceph.com/debian-luminous/ stretch main) I would like to upgrade it with Debian Buster and

[ceph-users] Re: Lock errors in iscsi gateway

2020-04-29 Thread Mike Christie
On 4/29/20 2:11 AM, Simone Lazzaris wrote: > On Tuesday 28 April 2020 at 18:41:27 CEST, Mike Christie wrote: > >> Could you send me: >> >> 1. The /var/log/messages for the initiator when you do IO and see those >> lock messages. > > On the initiator (XenServer 7.1

[ceph-users] Re: kernel: ceph: mdsmap_decode got incorrect state(up:standby-replay)

2020-04-29 Thread Jake Grimmett
...the "mdsmap_decode" errors stopped suddenly on all our clients... Not exactly sure what the problem was, but restarting our standby mds demons seems to have been the fix. Here's the log on the standby mds exactly when the errors stopped: 2020-04-29 15:41:22.944 7f3d04e06700 1 mds.ceph-s2

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Igor Fedotov
So the crash seems to be caused by the same issue - big (and presumably incomplete) write and subsequent read failure. I've managed to repro this locally. So bluefs_buffered_io seems to be a remedy for now. But additionally I can observe multiple slow ops indications in this new log and I

[ceph-users] How to apply ceph.conf changes using new tool cephadm

2020-04-29 Thread Gencer W . Genç
Hi, I just deployed a new cluster with cephadm instead of ceph-deploy. In the past, if I changed ceph.conf for tweaking, I was able to copy them and apply to all servers. But I cannot find this in the new cephadm tool. I made a few changes to ceph.conf but Ceph is unaware of those changes. How can I

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Francois Legrand
Here are the logs of the newly crashed OSD. F. On 29/04/2020 at 16:21, Igor Fedotov wrote: Sounds interesting - could you please share the crash log for these new OSDs? They presumably suffer from another issue. At least that first crash is caused by something else. "bluefs buffered io"

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Igor Fedotov
Sounds interesting - could you please share the crash log for these new OSDs? They presumably suffer from another issue. At least that first crash is caused by something else. "bluefs buffered io" can be injected on the fly, but I expect that to help only when the problem isn't at OSD startup. On

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Francois Legrand
Ok we will try that. Indeed, restarting osd.5 triggered the failure of two other OSDs in the cluster. Thus we will set bluefs buffered io = false for all OSDs and force bluefs buffered io = true for one of the down OSDs. Does that modification need to use injectargs, or changing it in the
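(Both routes work; a sketch of each, with the OSD id taken from this thread only as an example. Injection only reaches OSDs that are already running, so it does not survive a restart unless also persisted:

  # on the fly, for running OSDs
  ceph tell osd.* injectargs '--bluefs_buffered_io=false'
  ceph tell osd.21 injectargs '--bluefs_buffered_io=true'
  # persistently, via the centralized config DB (Nautilus and later)
  ceph config set osd bluefs_buffered_io false
  ceph config set osd.21 bluefs_buffered_io true
)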

[ceph-users] Re: Problems getting ceph-iscsi to work

2020-04-29 Thread Ron Gage
Well, some progress - for what it's worth... rbd-target-api ran for about 5 seconds before it failed. It also produced some logs. There are no apparent errors in the logs however: 2020-04-29 09:41:35,273 DEBUG [common.py:139:_open_ioctx()] - (_open_ioctx) Opening connection to rbd pool

[ceph-users] Re: How to debug ssh: ceph orch host add ceph01 10.10.1.1

2020-04-29 Thread Sebastian Wagner
We've improved the docs a little bit. Does https://docs.ceph.com/docs/master/cephadm/troubleshooting/#ssh-errors help you now? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
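(For anyone hitting the same thing, the checks in that troubleshooting page boil down to roughly the following; the host name here is only an example:

  # dump the SSH config and key that cephadm uses, then test the connection by hand
  ceph cephadm get-ssh-config > ssh_config
  ceph config-key get mgr/cephadm/ssh_identity_key > key
  chmod 0600 key
  ssh -F ssh_config -i key root@ceph01
)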

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Igor Fedotov
That's bluefs buffered io = false which did the trick. It modified the write path and this presumably has fixed the large write(s). Trying to reproduce locally, but please preserve at least one failing OSD (i.e. do not start it with the disabled buffered io) for future experiments/troubleshooting for

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Francois Legrand
Hi, It seems much better with these options. The OSD has now been up for 10 minutes without crashing (before, it was restarting after ~1 minute). F. On 29/04/2020 at 15:16, Igor Fedotov wrote: Hi Francois, I'll write a more thorough response a bit later. Meanwhile could you please try OSD startup with

[ceph-users] CDS Pacific: Dashboard planning summary

2020-04-29 Thread Lenz Grimmer
Hi all, a few weeks ago, a number of virtual Ceph Developer Summit meetings took place as a replacement for the in-person summit that was planned as part of Cephalocon in Seoul: https://pad.ceph.com/p/cds-pacific The Ceph Dashboard team also participated in these and held three video conference

[ceph-users] Problems getting ceph-iscsi to work

2020-04-29 Thread Ron Gage
Hi everyone! I have been working for the past week or so trying to get ceph-iscsi to work - Octopus release. Even just getting a single node working would be a major victory in this battle but so far, victory has proven elusive. My setup: a pair of Dell Optiplex 7010 desktops, each with 16

[ceph-users] kernel: ceph: mdsmap_decode got incorrect state(up:standby-replay)

2020-04-29 Thread Jake Grimmett
Dear all, After enabling "allow_standby_replay" on our cluster we are getting lots of identical errors in the client's /var/log/messages, like: Apr 29 14:21:26 hal kernel: ceph: mdsmap_decode got incorrect state(up:standby-replay) We are using the ml kernel 5.6.4-1.el7 on Scientific Linux 7.8
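(For context, standby-replay is toggled per filesystem; a minimal sketch, with the filesystem name as a placeholder:

  # enable / disable standby-replay for a given CephFS filesystem
  ceph fs set <fs_name> allow_standby_replay true
  ceph fs set <fs_name> allow_standby_replay false
  # check which MDS daemons are in up:standby-replay
  ceph fs status
)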

[ceph-users] Re: Problems getting ceph-iscsi to work

2020-04-29 Thread Jason Dillaman
On Wed, Apr 29, 2020 at 9:27 AM Ron Gage wrote: > Hi everyone! > > I have been working for the past week or so trying to get ceph-iscsi to > work - Octopus release. Even just getting a single node working would be a > major victory in this battle but so far, victory has proven elusive. > > My

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Igor Fedotov
Hi Francois, I'll write a more thorough response a bit later. Meanwhile could you please try OSD startup with the following settings now: debug-bluefs and debug-bdev = 20 bluefs sync write = false bluefs buffered io = false Thanks, Igor On 4/29/2020 3:35 PM, Francois Legrand wrote: Hi
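(A sketch of how those settings could be passed on a manual foreground start of the failing OSD; the OSD id is only an example, and the options can equally go into ceph.conf under the OSD section:

  # run the failing OSD in the foreground with verbose bluefs/bdev logging
  ceph-osd -f --cluster ceph --id 5 \
      --debug-bluefs 20 --debug-bdev 20 \
      --bluefs_sync_write=false --bluefs_buffered_io=false
)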

[ceph-users] cephfs change/migrate default data pool

2020-04-29 Thread Kenneth Waegeman
Hi all, I read in some release notes that it is recommended to have your default data pool replicated and to use erasure-coded pools as additional pools through layouts. We still have a CephFS with ±1 PB of usage and an EC default pool. Is there a way to change the default pool, or some other kind of
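(As far as I know, the first/default data pool cannot be swapped in place because it holds the backtrace objects for every file, so the usual options are adding another data pool and pointing layouts at it, or recreating the filesystem and copying data. A sketch of the first option, with pool and path names as examples:

  # add an extra data pool and direct new files under a directory into it
  ceph fs add_data_pool cephfs cephfs_data_ec
  setfattr -n ceph.dir.layout.pool -v cephfs_data_ec /mnt/cephfs/some_dir
  # files created under some_dir now store their data in cephfs_data_ec
)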

[ceph-users] Re: osd crashing and rocksdb corruption

2020-04-29 Thread Francois Legrand
Hi Igor, Here is what we did: first, as other OSDs were failing, we stopped all operations with ceph osd set norecover ceph osd set norebalance ceph osd set nobackfill ceph osd set pause to avoid further crashes! Then we moved to your recommendations (still testing on osd.5): in

[ceph-users] Newbie Question: CRUSH and Librados Profiling

2020-04-29 Thread Bobby
Hi, It is a newbie question. I would be really thankful if you can answer it please. I want to compile the Ceph source code, because I want to profile the *Librados* and *CRUSH* function stacks. Please verify if this is the right track I am following: - I have cloned Ceph from the Ceph Git
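(The usual build sequence is roughly as follows; this is only a sketch, a RelWithDebInfo build keeps symbols for profiling, and exact build targets may differ per release:

  git clone --recurse-submodules https://github.com/ceph/ceph.git
  cd ceph
  ./install-deps.sh                              # pull in build dependencies
  ./do_cmake.sh -DCMAKE_BUILD_TYPE=RelWithDebInfo
  cd build
  ninja                                          # build everything (or a specific target)
)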

[ceph-users] Re: Upgrading to Octopus

2020-04-29 Thread Gert Wieberdink
Hello Simon, The nautilus leftover idea sounds quite realistic to me. When I had the Nautilus/CentOS 7.7 combination running, I had a lot of internal server error 500 trouble using the Ceph Dashboard. But only when I tried to access the RadosGateway tab in the nautilus dashboard. However, I was never

[ceph-users] Re: RGW and the orphans

2020-04-29 Thread Katarzyna Myrek
Hi @Eric Ivancich my cluster has some history and trash gathered over the years. Most (terabytes) is from https://tracker.ceph.com/issues/43756. I was able to reproduce the problem on my LAB and it is for sure connected with https://tracker.ceph.com/issues/43756. When you are on a version older

[ceph-users] Re: Lock errors in iscsi gateway

2020-04-29 Thread Simone Lazzaris
On Tuesday 28 April 2020 at 18:41:27 CEST, Mike Christie wrote: > Could you send me: > > 1. The /var/log/messages for the initiator when you do IO and see those > lock messages. On the initiator (XenServer 7.1, which is based on CentOS AFAIK) the /var/log/messages is empty. I

[ceph-users] Re: Upgrading to Octopus

2020-04-29 Thread Simon Sutter
Hello Gert, I recreated the self-signed certificate. SELinux was disabled and I temporarily disabled the firewall. It still doesn't work and there is no entry in journalctl -f. Somewhere there is still something from the previous Nautilus or CentOS 7 installation causing this problem. I