Re: [ceph-users] Need to replace OSD. How do I find physical disk

2019-07-19 Thread Tarek Zegar
On the host with the osd run:


 ceph-volume lvm list
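
If you only care about a single OSD or device, the listing can be narrowed; a
sketch (the device path here is just an example):

 ceph-volume lvm list                # one block per OSD, showing its LV and backing device(s)
 ceph-volume lvm list /dev/sdb       # restrict the listing to a single device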







From:   "☣Adam" 
To: ceph-users@lists.ceph.com
Date:   07/18/2019 03:25 PM
Subject:[EXTERNAL] Re: [ceph-users] Need to replace OSD. How do I find
physical disk
Sent by:"ceph-users" 



The block device can be found in /var/lib/ceph/osd/ceph-$ID/block
# ls -l /var/lib/ceph/osd/ceph-9/block

In my case it links to /dev/sdbvg/sdb which makes it pretty obvious
which drive this is, but the Volume Group and Logical Volume could be
named anything.  To see what physical disk(s) make up this volume group
use lsblk (as Reed suggested)
# lsblk

If that drive needs to be located in a computer with many drives,
smartctl can be used to pull the make, model, and serial number
# smartctl -i /dev/sdb
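
Putting the pieces together, a rough sequence for going from an OSD ID to a
physical disk might be (a sketch; the OSD ID and /dev/sdb are placeholders):

ls -l /var/lib/ceph/osd/ceph-9/block     # symlink to the LV backing the OSD
lvs -o +devices                          # map each LV/VG to the physical device(s) behind it
smartctl -i /dev/sdb                     # pull make, model and serial number off that device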


I was not aware of ceph-volume, or `ceph-disk list` (which is apparently
now deprecated in favor of ceph-volume), so thank you to all in this
thread for teaching me about alternative (arguably more proper) ways of
doing this. :-)

On 7/18/19 12:58 PM, Pelletier, Robert wrote:
> How do I find the physical disk in a Ceph luminous cluster in order to
> replace it. Osd.9 is down in my cluster which resides on ceph-osd1 host.
>
>
>
> If I run lsblk -io KNAME,TYPE,SIZE,MODEL,SERIAL I can get the serial
> numbers of all the physical disks for example
>
> sdb    disk  1.8T ST2000DM001-1CH1 Z1E5VLRG
>
>
>
> But how do I find out which osd is mapped to sdb and so on?
>
> When I run df -h I get this
>
>
>
> [root@ceph-osd1 ~]# df -h
>
> Filesystem   Size  Used Avail Use% Mounted on
>
> /dev/mapper/ceph--osd1-root   19G  1.9G   17G  10% /
>
> devtmpfs  48G 0   48G   0% /dev
>
> tmpfs 48G 0   48G   0% /dev/shm
>
> tmpfs 48G  9.3M   48G   1% /run
>
> tmpfs 48G 0   48G   0% /sys/fs/cgroup
>
> /dev/sda3    947M  232M  716M  25% /boot
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-2
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-5
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-0
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-8
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-7
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-33
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-10
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-1
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-38
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-4
>
> tmpfs 48G   24K   48G   1% /var/lib/ceph/osd/ceph-6
>
> tmpfs    9.5G 0  9.5G   0% /run/user/0
>
>
>
>
>
> Robert Pelletier, IT and Security Specialist
>
> Eastern Maine Community College
> (207) 974-4782 | 354 Hogan Rd., Bangor, ME 04401
>
>
>
>


Re: [ceph-users] Client admin socket for RBD

2019-06-25 Thread Tarek Zegar

Sasha,

Sorry, I don't get it. The documentation for the command states that in
order to see the config DB for all, do: "ceph config dump"
To see what's in the config DB for a particular daemon, do: "ceph config get
<daemon>"
To see what's set for a particular daemon (be it from the config db,
override, conf file, etc.): "ceph config show <daemon>"
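
For example, against an OSD the three forms would look roughly like this (a
sketch; osd.0 is a placeholder):

ceph config dump              # everything stored in the cluster config database
ceph config get osd.0         # what the config DB holds for osd.0
ceph config show osd.0        # what a running osd.0 actually uses (DB + ceph.conf + overrides)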

I don't see anywhere that the command you mentioned is valid: "ceph
config get client.admin"

Here is output from a monitor node on bare metal
root@hostmonitor1:~# ceph config dump
WHO               MASK LEVEL    OPTION                    VALUE RO
mon.hostmonitor1  advanced mon_osd_down_out_interval 30
mon.hostmonitor1  advanced mon_osd_min_in_ratio  0.10
  mgr unknown  mgr/balancer/active   1*
  mgr unknown  mgr/balancer/mode upmap*
osd.* advanced debug_ms  20/20
osd.* advanced osd_max_backfills 2

root@hostmonitor1:~# ceph config get mon.hostmonitor1
WHO               MASK LEVEL    OPTION                    VALUE RO
mon.hostmonitor1  advanced mon_osd_down_out_interval 30
mon.hostmonitor1  advanced mon_osd_min_in_ratio  0.10

root@hostmonitor1:~# ceph config get client.admin
WHO MASK LEVEL OPTION VALUE RO   <-blank

What am I missing from what you're suggesting?


Thank you for clarifying,
Tarek Zegar
Senior SDS Engineer
Email tze...@us.ibm.com
Mobile 630.974.7172






From:   Sasha Litvak 
To: Tarek Zegar , ceph-users@lists.ceph.com
Date:   06/25/2019 10:38 AM
Subject:[EXTERNAL] Re: Re: Re: [ceph-users] Client admin socket for RBD



Tarek,

Of course you are correct about the client nodes.  I executed this command
inside of container that runs mon.  Or it can be done on the bare metal
node that runs mon.  You essentially quering mon configuration database.

On Tue, Jun 25, 2019 at 8:53 AM Tarek Zegar  wrote:
  "config get" on a client.admin? There is no daemon for client.admin, I
  get nothing. Can you please explain?


  Tarek Zegar
  Senior SDS Engineer
  Email tze...@us.ibm.com
  Mobile 630.974.7172





  From: Sasha Litvak 
  To: Tarek Zegar 
  Date: 06/24/2019 07:48 PM
  Subject: [EXTERNAL] Re: Re: [ceph-users] Client admin socket for RBD



  ceph config get client.admin

  On Mon, Jun 24, 2019, 1:10 PM Tarek Zegar  wrote:
Alex,

Sorry real quick, what did you type to get that last bit of info?

Tarek Zegar
Senior SDS Engineer
Email tze...@us.ibm.com
Mobile 630.974.7172





From: Alex Litvak 
To: ceph-users@lists.ceph.com
Cc: ceph-users <
public-ceph-users-idqoxfivofjgjs9i8mt...@plane.gmane.org>
Date: 06/24/2019 01:07 PM
Subject: [EXTERNAL] Re: [ceph-users] Client admin socket for RBD
Sent by: "ceph-users" 



Jason,

Here you go:

WHO    MASK LEVEL    OPTION                      VALUE                         RO
client      advanced admin_socket                /var/run/ceph/$name.$pid.asok *
global      advanced cluster_network             10.0.42.0/23                  *
global      advanced debug_asok                  0/0
global      advanced debug_auth                  0/0
global      advanced debug_bdev                  0/0
global      advanced debug_bluefs                0/0
global      advanced debug_bluestore             0/0
global      advanced debug_buffer                0/0
global      advanced debug_civetweb              0/0
global      advanced debug_client                0/0
global      advanced debug_compressor            0/0
global      advanced debug_context               0/0
global      advanced debug_crush                 0/0
global      advanced debug_crypto                0/0
global      advanced debug_dpdk                  0/0
global      advanced debug_eventtrace            0/0
global      advanced debug_filer                 0/0
global      advanced debug_filestore             0/0
global      advanced debug_finisher              0/0
global      advanced debug_fuse                  0/0
global      advanced debug_heartbeatmap          0/0
global      advanced debug_javaclient            0/0
global      advanced debug_journal               0/0
global      advanced debug_journaler             0/0
global      advanced debug_kinetic               0/0
  

Re: [ceph-users] Enable buffered write for bluestore

2019-06-13 Thread Tarek Zegar

http://docs.ceph.com/docs/master/rbd/rbd-config-ref/
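
If the goal is just to flip that one option, a hedged alternative on a
Mimic/Nautilus-era cluster is the centralized config database rather than
editing ceph.conf (a sketch; the option may still require an OSD restart to
take effect):

ceph config set osd bluestore_default_buffered_write true    # store it for all OSDs in the mon config DB
ceph config dump | grep bluestore_default_buffered_write     # confirm the stored value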





From:   Trilok Agarwal 
To: ceph-users@lists.ceph.com
Date:   06/12/2019 07:31 PM
Subject:[EXTERNAL] [ceph-users] Enable buffered write for bluestore
Sent by:"ceph-users" 



Hi
How can we enable bluestore_default_buffered_write using ceph-conf utility
Any pointers would be appreciated


Re: [ceph-users] Reweight OSD to 0, why doesn't report degraded if UP set under Pool Size

2019-06-09 Thread Tarek Zegar
Hi Huang,

So you are suggesting that even though osd.4 in this case has weight 0,
it's still getting new data being written to it? I find that counter to
what weight 0 means.

Thanks
Tarek





From:   huang jun 
To: Tarek Zegar 
Cc: Paul Emmerich , Ceph Users

Date:   06/08/2019 05:27 AM
Subject:[EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't
report degraded if UP set under Pool Size



I think the written data will also go to osd.4 in this case,
because osd.4 is not down, so Ceph doesn't consider the PG to have any OSD
down, and it will replicate the data to all OSDs in the acting/backfill set.

Tarek Zegar wrote on Fri, Jun 7, 2019 at 10:37 PM:
  Paul / All

  I'm not sure what warning you are referring to; I'm on Nautilus. The
  point I'm getting at is that if you weight out all OSDs on a host, with a
  cluster of 3 OSD hosts with 3 OSDs each and crush rule = host, then write to
  the cluster, it *should* imo not just say remapped but undersized /
  degraded.

  See below, 1 out of the 3 OSD hosts has ALL its OSDs marked out and
  weight = 0. When you write (say using FIO), the PGs *only* have 2 OSDs in
  them (UP set), which is pool min size. I don't understand why it's not
  saying undersized/degraded; this seems like a bug. Who cares that the
  Acting Set has the 3 original OSDs in it, the actual data is only on 2
  OSDs, which is a degraded state.

  root@hostadmin:~# ceph -s
  cluster:
  id: 33d41932-9df2-40ba-8e16-8dedaa4b3ef6
  health: HEALTH_WARN
  application not enabled on 1 pool(s)

  services:
  mon: 1 daemons, quorum hostmonitor1 (age 29m)
  mgr: hostmonitor1(active, since 31m)
  osd: 9 osds: 9 up, 6 in; 100 remapped pgs

  data:
  pools: 1 pools, 100 pgs
  objects: 520 objects, 2.0 GiB
  usage: 15 GiB used, 75 GiB / 90 GiB avail
  pgs: 520/1560 objects misplaced (33.333%)
  100 active+clean+remapped

  root@hostadmin:~# ceph osd tree
  ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  -1 0.08817 root default
  -3 0.02939 host hostosd1
  0 hdd 0.00980 osd.0 up 1.0 1.0
  3 hdd 0.00980 osd.3 up 1.0 1.0
  6 hdd 0.00980 osd.6 up 1.0 1.0
  -5 0.02939 host hostosd2
  1 hdd 0.00980 osd.1 up 0 1.0
  4 hdd 0.00980 osd.4 up 0 1.0
  7 hdd 0.00980 osd.7 up 0 1.0
  -7 0.02939 host hostosd3
  2 hdd 0.00980 osd.2 up 1.0 1.0
  5 hdd 0.00980 osd.5 up 1.0 1.0
  8 hdd 0.00980 osd.8 up 1.0 1.0


  root@hostadmin:~# ceph osd df
  ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS
  STATUS
  0 hdd 0.00980 1.0 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB
  17.48 1.03 34 up
  3 hdd 0.00980 1.0 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB
  17.48 1.03 36 up
  6 hdd 0.00980 1.0 10 GiB 1.6 GiB 593 MiB 4 KiB 1024 MiB 8.4 GiB 15.80
  0.93 30 up
  1 hdd 0.00980 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
  4 hdd 0.00980 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up
  7 hdd 0.00980 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 100 up
  2 hdd 0.00980 1.0 10 GiB 1.5 GiB 525 MiB 8 KiB 1024 MiB 8.5 GiB 15.13
  0.89 20 up
  5 hdd 0.00980 1.0 10 GiB 1.9 GiB 941 MiB 4 KiB 1024 MiB 8.1 GiB 19.20
  1.13 43 up
  8 hdd 0.00980 1.0 10 GiB 1.6 GiB 657 MiB 8 KiB 1024 MiB 8.4 GiB 16.42
  0.97 37 up
  TOTAL 90 GiB 15 GiB 6.2 GiB 61 KiB 9.0 GiB 75 GiB 16.92
  MIN/MAX VAR: 0.89/1.13 STDDEV: 1.32
  Tarek Zegar
  Senior SDS Engineer
  Email tze...@us.ibm.com
  Mobile 630.974.7172





  From: Paul Emmerich 
  To: Tarek Zegar 
  Cc: Ceph Users 
  Date: 06/07/2019 05:25 AM
  Subject: [EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't
  report degraded if UP set under Pool Size



  remapped no longer triggers a health warning in nautilus.

  Your data is still there, it's just on the wrong OSD if that OSD is still
  up and running.


  Paul

  --
  Paul Emmerich

  Looking for help with your Ceph cluster? Contact us at https://croit.io

  croit GmbH
  Freseniusstr. 31h
  81247 München
  www.croit.io
  Tel: +49 89 1896585 90


  On Thu, Jun 6, 2019 at 10:48 PM Tarek Zegar  wrote:
For testing purposes I set a bunch of OSD to 0 weight, this
correctly forces Ceph to not use said OSD. I took enough out such
that the UP set only had Pool min size # of OSD (i.e 2 OSD).

Two Questions:
1. Why doesn't the acting set eventually match the UP set and
simply point to [6,5] only
2. Why are none of the PGs marked as undersized and degraded? The
data is only hosted on 2 OSD rather then Pool size (3), I would
expect a undersized warning and degraded for PG with data?

Example PG:
PG 1.4d active+clean+remapped UP= [6,5] Acting = [6,5,4]

OSD Tree:
ID CLASS WEIGHT TYPE NAME STATUS

Re: [ceph-users] Reweight OSD to 0, why doesn't report degraded if UP set under Pool Size

2019-06-07 Thread Tarek Zegar

Paul / All

I'm not sure what warning you are referring to; I'm on Nautilus. The point
I'm getting at is that if you weight out all OSDs on a host, with a cluster of 3
OSD hosts with 3 OSDs each and crush rule = host, then write to the cluster, it
*should* imo not just say remapped but undersized / degraded.

See below, 1 out of the 3 OSD hosts has ALL its OSDs marked out and weight
= 0. When you write (say using FIO), the PGs *only* have 2 OSDs in them (UP
set), which is pool min size. I don't understand why it's not saying
undersized/degraded; this seems like a bug. Who cares that the Acting Set
has the 3 original OSDs in it, the actual data is only on 2 OSDs, which is a
degraded state.

root@hostadmin:~# ceph -s
cluster:
id: 33d41932-9df2-40ba-8e16-8dedaa4b3ef6
health: HEALTH_WARN
application not enabled on 1 pool(s)

  services:
mon: 1 daemons, quorum hostmonitor1 (age 29m)
mgr: hostmonitor1(active, since 31m)
osd: 9 osds: 9 up, 6 in; 100 remapped pgs

  data:
pools:   1 pools, 100 pgs
objects: 520 objects, 2.0 GiB
usage:   15 GiB used, 75 GiB / 90 GiB avail
pgs: 520/1560 objects misplaced (33.333%)
 100 active+clean+remapped

root@hostadmin:~# ceph osd tree
ID CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
-1   0.08817 root default
-3   0.02939 host hostosd1
 0   hdd 0.00980 osd.0 up  1.0 1.0
 3   hdd 0.00980 osd.3 up  1.0 1.0
 6   hdd 0.00980 osd.6 up  1.0 1.0
-5   0.02939 host hostosd2
 1   hdd 0.00980 osd.1 up0 1.0
 4   hdd 0.00980 osd.4 up0 1.0
 7   hdd 0.00980 osd.7 up0 1.0
-7   0.02939 host hostosd3
 2   hdd 0.00980 osd.2 up  1.0 1.0
 5   hdd 0.00980 osd.5 up  1.0 1.0
 8   hdd 0.00980 osd.8 up  1.0 1.0


root@hostadmin:~# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE   RAW USE DATA    OMAP   META     AVAIL   %USE  VAR  PGS STATUS
 0   hdd 0.00980  1.0 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB 17.48 1.03  34 up
 3   hdd 0.00980  1.0 10 GiB 1.7 GiB 765 MiB 12 KiB 1024 MiB 8.2 GiB 17.48 1.03  36 up
 6   hdd 0.00980  1.0 10 GiB 1.6 GiB 593 MiB  4 KiB 1024 MiB 8.4 GiB 15.80 0.93  30 up
 1   hdd 0.00980        0    0 B     0 B     0 B    0 B      0 B     0 B       0    0   0 up
 4   hdd 0.00980        0    0 B     0 B     0 B    0 B      0 B     0 B       0    0   0 up
 7   hdd 0.00980        0    0 B     0 B     0 B    0 B      0 B     0 B       0    0 100 up
 2   hdd 0.00980  1.0 10 GiB 1.5 GiB 525 MiB  8 KiB 1024 MiB 8.5 GiB 15.13 0.89  20 up
 5   hdd 0.00980  1.0 10 GiB 1.9 GiB 941 MiB  4 KiB 1024 MiB 8.1 GiB 19.20 1.13  43 up
 8   hdd 0.00980  1.0 10 GiB 1.6 GiB 657 MiB  8 KiB 1024 MiB 8.4 GiB 16.42 0.97  37 up
               TOTAL    90 GiB  15 GiB 6.2 GiB 61 KiB  9.0 GiB  75 GiB 16.92
MIN/MAX VAR: 0.89/1.13  STDDEV: 1.32
Tarek Zegar
Senior SDS Engineer
Email tze...@us.ibm.com
Mobile 630.974.7172






From:   Paul Emmerich 
To: Tarek Zegar 
Cc: Ceph Users 
Date:   06/07/2019 05:25 AM
Subject:[EXTERNAL] Re: [ceph-users] Reweight OSD to 0, why doesn't
report degraded if UP set under Pool Size



remapped no longer triggers a health warning in nautilus.

Your data is still there, it's just on the wrong OSD if that OSD is still
up and running.


Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Thu, Jun 6, 2019 at 10:48 PM Tarek Zegar  wrote:
  For testing purposes I set a bunch of OSD to 0 weight, this correctly
  forces Ceph to not use said OSD. I took enough out such that the UP set
  only had Pool min size # of OSD (i.e 2 OSD).

  Two Questions:
  1. Why doesn't the acting set eventually match the UP set and simply
  point to [6,5] only
  2. Why are none of the PGs marked as undersized and degraded? The data is
  only hosted on 2 OSD rather then Pool size (3), I would expect a
  undersized warning and degraded for PG with data?

  Example PG:
  PG 1.4d active+clean+remapped UP= [6,5] Acting = [6,5,4]

  OSD Tree:
  ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
  -1 0.08817 root default
  -3 0.02939 host hostosd1
  0 hdd 0.00980 osd.0 up 1.0 1.0
  3 hdd 0.00980 osd.3 up 1.0 1.0
  6 hdd 0.00980 osd.6 up 1.0 1.0
  -5 0.02939 host hostosd2
  1 hdd 0.00980 osd.1 up 0 1.0
  4 hdd 0.00980 osd.4 up 0 1.0
  7 hdd 0.00980 osd.7 up 0 1.0
  -7 0.02939 host hostosd3
  2 hdd 0.00980 osd.2 up 1.0 1.0
  5 hdd 0.00980 osd.5 up 1.0 1.0
  8 hdd 0.00980 osd.8 up 0 1.0





[ceph-users] Reweight OSD to 0, why doesn't report degraded if UP set under Pool Size

2019-06-06 Thread Tarek Zegar

For testing purposes I set a bunch of OSD to 0 weight, this correctly
forces Ceph to not use said OSD. I took enough out such that the UP set
only had Pool min size # of OSD (i.e 2 OSD).

Two Questions:
1. Why doesn't the acting set eventually match the UP set and simply point
to [6,5] only
2. Why are none of the PGs marked as undersized and degraded? The data is
only hosted on 2 OSD rather then Pool size (3), I would expect a undersized
warning and degraded for PG with data?

Example PG:
PG 1.4d active+clean+remapped  UP= [6,5] Acting = [6,5,4]
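
As an aside, the up and acting sets for a single PG can also be pulled without
a full query, e.g. (a sketch using the PG from the example above):

ceph pg map 1.4d      # prints the osdmap epoch plus the up set and acting set for that PG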

OSD Tree:
ID CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
-1   0.08817 root default
-3   0.02939 host hostosd1
 0   hdd 0.00980 osd.0 up  1.0 1.0
 3   hdd 0.00980 osd.3 up  1.0 1.0
 6   hdd 0.00980 osd.6 up  1.0 1.0
-5   0.02939 host hostosd2
 1   hdd 0.00980 osd.1 up0 1.0
 4   hdd 0.00980 osd.4 up0 1.0
 7   hdd 0.00980 osd.7 up0 1.0
-7   0.02939 host hostosd3
 2   hdd 0.00980 osd.2 up  1.0 1.0
 5   hdd 0.00980 osd.5 up  1.0 1.0
 8   hdd 0.00980 osd.8 up0 1.0





Re: [ceph-users] Fix scrub error in bluestore.

2019-06-06 Thread Tarek Zegar

Look here
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-pg/#pgs-inconsistent


Read error typically is a disk issue. The doc is not clear on how to
resolve that
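
For the candidate-had-a-read-error case, the usual sequence is to inspect the
inconsistency and then ask the primary to repair the PG; a sketch, not a
guaranteed fix, and the PG id below is the one from the log in this thread:

rados list-inconsistent-obj 10.c5 --format=json-pretty    # show which object/shard is bad
ceph pg repair 10.c5                                      # rewrite the bad copy from an authoritative replica

If the read error keeps coming back, the underlying disk is the real problem.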






From:   Alfredo Rezinovsky 
To: Ceph Users 
Date:   06/06/2019 10:58 AM
Subject:[EXTERNAL] [ceph-users] Fix scrub error in bluestore.
Sent by:"ceph-users" 



https://ceph.com/geen-categorie/ceph-manually-repair-object/

is a little outdated.

After stopping the OSD, flushing the journal I don't have any clue on how
to move the object (easy in filestore).

I have this in my osd log.

2019-06-05 10:46:41.418 7f47d0502700 -1 log_channel(cluster) log [ERR] :
10.c5 shard 2 soid 10:a39e2c78:::183f81f.0001:head : candidate had
a read error

How can I fix it?

--
Alfrenovsky




Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Tarek Zegar

Hi Oliver

Here is the output of the active mgr log after I toggled the balancer off / on.
I grep'd out only "balancer" as it was far too verbose (see below). When I
look at ceph osd df I see it optimized :)
I would like to understand two things, however: why is "prepared 0/10
changes" zero if it actually did something, and what in the log, before I
toggled it, said basically "hey, the balancer isn't going to work
because I still think min-client-compat-level < luminous"?

Thanks for helping me in getting this working!
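
For reference, the toggling above was nothing more than the stock balancer
commands (a sketch):

ceph balancer off
ceph balancer on
ceph balancer status      # active flag, mode, queued plans
ceph balancer eval        # current distribution score the balancer works from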



root@hostmonitor1:/var/log/ceph# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE AVAIL   %USE  VAR  PGS
 1   hdd 0.0098000 B 0 B 0 B 00   0
 3   hdd 0.00980  1.0 10 GiB 5.3 GiB 4.7 GiB 53.25 0.97 150
 6   hdd 0.00980  1.0 10 GiB 5.6 GiB 4.4 GiB 56.07 1.03 150
 0   hdd 0.0098000 B 0 B 0 B 00   0
 5   hdd 0.00980  1.0 10 GiB 5.7 GiB 4.3 GiB 56.97 1.04 151
 7   hdd 0.00980  1.0 10 GiB 5.2 GiB 4.8 GiB 52.35 0.96 149
 2   hdd 0.0098000 B 0 B 0 B 00   0
 4   hdd 0.00980  1.0 10 GiB 5.5 GiB 4.5 GiB 55.25 1.01 150
 8   hdd 0.00980  1.0 10 GiB 5.4 GiB 4.6 GiB 54.07 0.99 150
TOTAL 70 GiB  34 GiB  36 GiB 54.66
MIN/MAX VAR: 0.96/1.04  STDDEV: 1.60


2019-05-29 17:06:49.324 7f40ce42a700  0 log_channel(audit) log [DBG] :
from='client.11262 192.168.0.12:0/4104979884' entity='client.admin' cmd=
[{"prefix": "balancer off", "target": ["mgr", ""]}]: dispatch
2019-05-29 17:06:49.324 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer status'
2019-05-29 17:06:49.324 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer mode'
2019-05-29 17:06:49.324 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer on'
2019-05-29 17:06:49.324 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer off'
2019-05-29 17:06:49.324 7f40cec2b700  1 mgr[balancer] Handling command:
'{'prefix': 'balancer off', 'target': ['mgr', '']}'
2019-05-29 17:06:49.388 7f40d747a700  4 mgr[py] Loaded module_config entry
mgr/balancer/max_misplaced:.50
2019-05-29 17:06:49.388 7f40d747a700  4 mgr[py] Loaded module_config entry
mgr/balancer/mode:upmap
2019-05-29 17:06:49.539 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/active
2019-05-29 17:06:49.539 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/begin_time
2019-05-29 17:06:49.539 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/end_time
2019-05-29 17:06:49.539 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/sleep_interval
2019-05-29 17:06:54.279 7f40ce42a700  4 mgr.server handle_command
prefix=balancer on
2019-05-29 17:06:54.279 7f40ce42a700  0 log_channel(audit) log [DBG] :
from='client.11268 192.168.0.12:0/1339099349' entity='client.admin' cmd=
[{"prefix": "balancer on", "target": ["mgr", ""]}]: dispatch
2019-05-29 17:06:54.279 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer status'
2019-05-29 17:06:54.279 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer mode'
2019-05-29 17:06:54.279 7f40ce42a700  1 mgr.server handle_command
pyc_prefix: 'balancer on'
2019-05-29 17:06:54.279 7f40cec2b700  1 mgr[balancer] Handling command:
'{'prefix': 'balancer on', 'target': ['mgr', '']}'
2019-05-29 17:06:54.287 7f40d747a700  4 mgr[py] Loaded module_config entry
mgr/balancer/active:1
2019-05-29 17:06:54.287 7f40d747a700  4 mgr[py] Loaded module_config entry
mgr/balancer/max_misplaced:.50
2019-05-29 17:06:54.287 7f40d747a700  4 mgr[py] Loaded module_config entry
mgr/balancer/mode:upmap
2019-05-29 17:06:54.299 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/active
2019-05-29 17:06:54.299 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/begin_time
2019-05-29 17:06:54.299 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/end_time
2019-05-29 17:06:54.299 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/sleep_interval
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr[balancer] Optimize plan
auto_2019-05-29_17:06:54
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/mode
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/max_misplaced
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr[balancer] Mode upmap, max
misplaced 0.50
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr[balancer] do_upmap
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/upmap_max_iterations
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr get_config get_config key:
mgr/balancer/upmap_max_deviation
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr[balancer] pools ['rbd']
2019-05-29 17:06:54.327 7f40cd3e8700  4 mgr[balancer] prepared 0/10 changes




From:   Oliver Freyermuth 
To: Tarek Zegar 
Cc: ceph-users@lists.ceph.com
Date:   05/29/2019 11

Re: [ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Tarek Zegar

Hi Oliver,

Thank you for the response. I did ensure that the min-client-compat-level is
indeed Luminous (see below). I have no kernel-mapped rbd clients. Ceph
versions reports mimic. Also below is the output of ceph balancer status.
One thing to note: I enabled the balancer after I had already filled the
cluster, not from the onset. I had hoped that it wouldn't matter, though
your comment "if the compat-level is too old for upmap, you'll only find a
small warning about that in the logfiles" leads me to believe that it will
*not* work done this way; please confirm and let me know what
message to look for in /var/log/ceph.

Thank you!

root@hostadmin:~# ceph balancer status
{
"active": true,
"plans": [],
"mode": "upmap"
}



root@hostadmin:~# ceph features
{
"mon": [
{
"features": "0x3ffddff8ffacfffb",
"release": "luminous",
"num": 3
}
],
"osd": [
{
"features": "0x3ffddff8ffacfffb",
"release": "luminous",
"num": 7
}
],
"client": [
{
"features": "0x3ffddff8ffacfffb",
"release": "luminous",
"num": 1
}
],
"mgr": [
{
"features": "0x3ffddff8ffacfffb",
"release": "luminous",
"num": 3
}
]
}






From:   Oliver Freyermuth 
To: ceph-users@lists.ceph.com
Date:   05/29/2019 11:13 AM
Subject:[EXTERNAL] Re: [ceph-users] Balancer: uneven OSDs
Sent by:"ceph-users" 



Hi Tarek,

what's the output of "ceph balancer status"?
In case you are using "upmap" mode, you must make sure to have a
min-client-compat-level of at least Luminous:
http://docs.ceph.com/docs/mimic/rados/operations/upmap/
Of course, please be aware that your clients must be recent enough
(especially for kernel clients).
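
A quick way to check, and if needed raise, that level (a sketch; only raise it
once no pre-Luminous clients still need to connect):

ceph osd dump | grep require_min_compat_client     # what the cluster currently requires
ceph features                                      # what the connected clients actually report
ceph osd set-require-min-compat-client luminous    # raise the floor so upmap entries are allowed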

Sadly, if the compat-level is too old for upmap, you'll only find a small
warning about that in the logfiles,
but no error on terminal when activating the balancer or any other kind of
erroneous / health condition.

Cheers,
 Oliver

Am 29.05.19 um 17:52 schrieb Tarek Zegar:
> Can anyone help with this? Why can't I optimize this cluster, the pg
counts and data distribution is way off.
> __
>
> I enabled the balancer plugin and even tried to manually invoke it but it
won't allow any changes. Looking at ceph osd df, it's not even at all.
Thoughts?
>
> root@hostadmin:~# ceph osd df
> ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
> 1 hdd 0.00980 0 0 B 0 B 0 B 0 0 0
> 3 hdd 0.00980 1.0 10 GiB 8.3 GiB 1.7 GiB 82.83 1.14 156
> 6 hdd 0.00980 1.0 10 GiB 8.4 GiB 1.6 GiB 83.77 1.15 144
> 0 hdd 0.00980 0 0 B 0 B 0 B 0 0 0
> 5 hdd 0.00980 1.0 10 GiB 9.0 GiB 1021 MiB 90.03 1.23 159
> 7 hdd 0.00980 1.0 10 GiB 7.7 GiB 2.3 GiB 76.57 1.05 141
> 2 hdd 0.00980 1.0 10 GiB 5.5 GiB 4.5 GiB 55.42 0.76 90
> 4 hdd 0.00980 1.0 10 GiB 5.9 GiB 4.1 GiB 58.78 0.81 99
> 8 hdd 0.00980 1.0 10 GiB 6.3 GiB 3.7 GiB 63.12 0.87 111
> TOTAL 90 GiB 53 GiB 37 GiB 72.93
> MIN/MAX VAR: 0.76/1.23 STDDEV: 12.67
>
>
> root@hostadmin:~# osdmaptool om --upmap out.txt --upmap-pool rbd
> osdmaptool: osdmap file 'om'
> writing upmap command output to: out.txt
> checking for upmap cleanups
> upmap, max-count 100, max deviation 0.01 <---really? It's not even close
to 1% across the drives
> limiting to pools rbd (1)
> no upmaps proposed
>
>
> ceph balancer optimize myplan
> Error EALREADY: Unable to find further optimization,or distribution is
already perfect
>
>


[ceph-users] Balancer: uneven OSDs

2019-05-29 Thread Tarek Zegar

Can anyone help with this? Why can't I optimize this cluster, the pg counts
and data distribution is way off.
__

I enabled the balancer plugin and even tried to manually invoke it but it
won't allow any changes. Looking at ceph osd df, it's not even at all.
Thoughts?

root@hostadmin:~# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE AVAIL%USE  VAR  PGS
 1   hdd 0.0098000 B 0 B  0 B 00   0
 3   hdd 0.00980  1.0 10 GiB 8.3 GiB  1.7 GiB 82.83 1.14 156
 6   hdd 0.00980  1.0 10 GiB 8.4 GiB  1.6 GiB 83.77 1.15 144
 0   hdd 0.0098000 B 0 B  0 B 00   0
 5   hdd 0.00980  1.0 10 GiB 9.0 GiB 1021 MiB 90.03 1.23 159
 7   hdd 0.00980  1.0 10 GiB 7.7 GiB  2.3 GiB 76.57 1.05 141
 2   hdd 0.00980  1.0 10 GiB 5.5 GiB  4.5 GiB 55.42 0.76  90
 4   hdd 0.00980  1.0 10 GiB 5.9 GiB  4.1 GiB 58.78 0.81  99
 8   hdd 0.00980  1.0 10 GiB 6.3 GiB  3.7 GiB 63.12 0.87 111
TOTAL 90 GiB  53 GiB   37 GiB 72.93
MIN/MAX VAR: 0.76/1.23  STDDEV: 12.67


root@hostadmin:~# osdmaptool om --upmap out.txt --upmap-pool rbd
osdmaptool: osdmap file 'om'
writing upmap command output to: out.txt
checking for upmap cleanups
upmap, max-count 100, max deviation 0.01  <---really? It's not even close
to 1% across the drives
 limiting to pools rbd (1)
no upmaps proposed


ceph balancer optimize myplan
Error EALREADY: Unable to find further optimization,or distribution is
already perfect
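
For what it's worth, the offline test can be re-run with an explicit deviation
threshold to see whether the tool is simply treating the spread as "good
enough"; a sketch, and note the flag semantics changed in later releases (it
became an integer PG count rather than a fraction):

ceph osd getmap -o om                                                    # export the current osdmap
osdmaptool om --upmap out.txt --upmap-pool rbd --upmap-deviation 0.05    # allow up to 5% deviation before proposing upmaps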


[ceph-users] Balancer: uneven OSDs

2019-05-28 Thread Tarek Zegar

I enabled the balancer plugin and even tried to manually invoke it but it
won't allow any changes. Looking at ceph osd df, it's not even at all.
Thoughts?

root@hostadmin:~# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE   USE AVAIL%USE  VAR  PGS
 1   hdd 0.0098000 B 0 B  0 B 00   0
 3   hdd 0.00980  1.0 10 GiB 8.3 GiB  1.7 GiB 82.83 1.14 156
 6   hdd 0.00980  1.0 10 GiB 8.4 GiB  1.6 GiB 83.77 1.15 144
 0   hdd 0.0098000 B 0 B  0 B 00   0
 5   hdd 0.00980  1.0 10 GiB 9.0 GiB 1021 MiB 90.03 1.23 159
 7   hdd 0.00980  1.0 10 GiB 7.7 GiB  2.3 GiB 76.57 1.05 141
 2   hdd 0.00980  1.0 10 GiB 5.5 GiB  4.5 GiB 55.42 0.76  90
 4   hdd 0.00980  1.0 10 GiB 5.9 GiB  4.1 GiB 58.78 0.81  99
 8   hdd 0.00980  1.0 10 GiB 6.3 GiB  3.7 GiB 63.12 0.87 111
TOTAL 90 GiB  53 GiB   37 GiB 72.93
MIN/MAX VAR: 0.76/1.23  STDDEV: 12.67


root@hostadmin:~# osdmaptool om --upmap out.txt --upmap-pool rbd
osdmaptool: osdmap file 'om'
writing upmap command output to: out.txt
checking for upmap cleanups
upmap, max-count 100, max deviation 0.01  <---really? It's not even close
to 1% across the drives
 limiting to pools rbd (1)
no upmaps proposed


ceph balancer optimize myplan
Error EALREADY: Unable to find further optimization,or distribution is
already perfect


[ceph-users] PG stuck in Unknown after removing OSD - Help?

2019-05-20 Thread Tarek Zegar

Set 3 OSDs to "out"; all were on the same host and should not impact the
pool because it's 3x replication and CRUSH places one OSD per host.
However, now we have one PG stuck UNKNOWN. Not sure why this is the case; I
did have background writes going on at the time of the OSD out. Thoughts?

ceph osd tree
ID CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
-1   0.08817 root default
-5   0.02939 host hostosd1
 3   hdd 0.00980 osd.3 up  1.0 1.0
 4   hdd 0.00980 osd.4 up  1.0 1.0
 5   hdd 0.00980 osd.5 up  1.0 1.0
-7   0.02939 host hostosd2
 0   hdd 0.00980 osd.0 up  1.0 1.0
 6   hdd 0.00980 osd.6 up  1.0 1.0
 8   hdd 0.00980 osd.8 up  1.0 1.0
-3   0.02939 host hostosd3
 1   hdd 0.00980 osd.1 up0 1.0
 2   hdd 0.00980 osd.2 up0 1.0
 7   hdd 0.00980 osd.7 up0 1.0


ceph health detail
PG_AVAILABILITY Reduced data availability: 1 pg inactive
pg 1.e2 is stuck inactive for 1885.728547, current state unknown, last
acting [4,0]


ceph pg 1.e2 query
{
"state": "unknown",
"snap_trimq": "[]",
"snap_trimq_len": 0,
"epoch": 132,
"up": [
4,
0
],
"acting": [
4,
0
],
"info": {
"pgid": "1.e2",
"last_update": "34'3072",
"last_complete": "34'3072",
"log_tail": "0'0",
"last_user_version": 3072,
"last_backfill": "MAX",
"last_backfill_bitwise": 0,
"purged_snaps": [],
"history": {
"epoch_created": 29,
"epoch_pool_created": 29,
"last_epoch_started": 30,
"last_interval_started": 29,
"last_epoch_clean": 30,
"last_interval_clean": 29,
"last_epoch_split": 0,
"last_epoch_marked_full": 0,
"same_up_since": 70,
"same_interval_since": 70,
"same_primary_since": 70,
"last_scrub": "0'0",
"last_scrub_stamp": "2019-05-20 21:15:42.448125",
"last_deep_scrub": "0'0",
"last_deep_scrub_stamp": "2019-05-20 21:15:42.448125",
"last_clean_scrub_stamp": "2019-05-20 21:15:42.448125"
},
"stats": {
"version": "34'3072",
"reported_seq": "3131",
"reported_epoch": "132",
"state": "unknown",
"last_fresh": "2019-05-20 22:52:07.898135",
"last_change": "2019-05-20 22:50:46.711730",
"last_active": "2019-05-20 22:50:26.109185",
"last_peered": "2019-05-20 22:02:01.008787",
"last_clean": "2019-05-20 22:02:01.008787",
"last_became_active": "2019-05-20 21:15:43.662550",
"last_became_peered": "2019-05-20 21:15:43.662550",
"last_unstale": "2019-05-20 22:52:07.898135",
"last_undegraded": "2019-05-20 22:52:07.898135",
"last_fullsized": "2019-05-20 22:52:07.898135",
"mapping_epoch": 70,
"log_start": "0'0",
"ondisk_log_start": "0'0",
"created": 29,
"last_epoch_clean": 30,
"parent": "0.0",
"parent_split_bits": 0,
"last_scrub": "0'0",
"last_scrub_stamp": "2019-05-20 21:15:42.448125",
"last_deep_scrub": "0'0",
"last_deep_scrub_stamp": "2019-05-20 21:15:42.448125",
"last_clean_scrub_stamp": "2019-05-20 21:15:42.448125",
"log_size": 3072,
"ondisk_log_size": 3072,
"stats_invalid": false,
"dirty_stats_invalid": false,
"omap_stats_invalid": false,
"hitset_stats_invalid": false,
"hitset_bytes_stats_invalid": false,
"pin_stats_invalid": false,
"manifest_stats_invalid": false,
"snaptrimq_len": 0,
"stat_sum": {
"num_bytes": 12582912,
"num_objects": 3,
"num_object_clones": 0,
"num_object_copies": 9,
"num_objects_missing_on_primary": 0,
"num_objects_missing": 0,
"num_objects_degraded": 0,
"num_objects_misplaced": 0,
"num_objects_unfound": 0,
"num_objects_dirty": 3,
"num_whiteouts": 0,
"num_read": 0,
"num_read_kb": 0,
"num_write": 3072,
"num_write_kb": 12288,
"num_scrub_errors": 0,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 0,
"num_objects_recovered": 0,
"num_bytes_recovered": 0,
"num_keys_recovered": 0,
"num_objects_omap": 0,
"num_objects_hit_set_archive": 0,

Re: [ceph-users] Lost OSD from PCIe error, recovered, HOW to restore OSD process

2019-05-16 Thread Tarek Zegar

FYI for anyone interested, below is how to recover from someone removing
an NVMe drive (the first two steps show how mine were removed and brought
back).
Steps 3-6 are to get the drive's LVM volume back AND get the OSD daemon
running for the drive.

 1. echo 1 > /sys/block/nvme0n1/device/device/remove
 2.   echo 1 > /sys/bus/pci/rescan
 3.   vgcfgrestore ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841 ; vgchange -ay
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841
 4.   ceph auth add osd.122 osd 'allow *' mon 'allow rwx'
-i /var/lib/ceph/osd/ceph-122/keyring
 5. ceph-volume lvm activate --all
 6.   You should see the drive somewhere in the ceph tree; move it to the
right host (see the sketch below)
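
For step 6, the move itself might look like this (a hypothetical example; the
OSD id, weight and host bucket are the ones from this cluster and will differ
elsewhere):

ceph osd crush create-or-move osd.122 1.81898 host=pok1-qz1-sr1-rk001-s20    # put the OSD back under its host bucket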

Tarek





From:   "Tarek Zegar" 
To: Alfredo Deza 
Cc: ceph-users 
Date:   05/15/2019 10:32 AM
Subject:[EXTERNAL] Re: [ceph-users] Lost OSD from PCIe error,
recovered, to restore OSD process
Sent by:"ceph-users" 



TLDR; I activated the drive successfully but the daemon won't start, looks
like it's complaining about mon config, idk why (there is a valid ceph.conf
on the host). Thoughts? I feel like it's close. Thank you

I executed the command:
ceph-volume lvm activate --all


It found the drive and activated it:
--> Activating OSD ID 122 FSID a151bea5-d123-45d9-9b08-963a511c042a

--> ceph-volume lvm activate successful for osd ID: 122



However, systemd would not start the OSD process 122:
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: 2019-05-15
14:16:13.862 71970700 -1 monclient(hunting): handle_auth_bad_method
server allowed_methods [2] but i only support [2]
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: 2019-05-15
14:16:13.862 7116f700 -1 monclient(hunting): handle_auth_bad_method
server allowed_methods [2] but i only support [2]
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: failed to fetch
mon config (--no-mon-config to skip)
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Main process exited, code=exited, status=1/FAILURE
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Failed with result 'exit-code'.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Service hold-off time over, scheduling restart.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Scheduled restart job, restart counter is at 3.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Automatic restarting of the unit ceph-osd@122.service has been
scheduled, as the result for
-- the configured Restart= setting for the unit.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: Stopped Ceph object
storage daemon osd.122.
-- Subject: Unit ceph-osd@122.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit ceph-osd@122.service has finished shutting down.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Start request repeated too quickly.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Failed with result 'exit-code'.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: Failed to start Ceph
object storage daemon osd.122




From: Alfredo Deza 
To: Bob R 
Cc: Tarek Zegar , ceph-users 
Date: 05/15/2019 08:27 AM
Subject: [EXTERNAL] Re: [ceph-users] Lost OSD from PCIe error, recovered,
to restore OSD process



On Tue, May 14, 2019 at 7:24 PM Bob R  wrote:
>
> Does 'ceph-volume lvm list' show it? If so you can try to activate it
with 'ceph-volume lvm activate 122
74b01ec2--124d--427d--9812--e437f90261d4'

Good suggestion. If `ceph-volume lvm list` can see it, it can probably
activate it again. You can activate it with the OSD ID + OSD FSID, or
do:

ceph-volume lvm activate --all

You didn't say if the OSD wasn't coming up after trying to start it
(the systemd unit should still be there for ID 122), or if you tried
rebooting and that OSD didn't come up.

The systemd unit is tied to both the ID and FSID of the OSD, so it
shouldn't matter if the underlying device changed since ceph-volume
ensures it is the right one every time it activates.
>
> Bob
>
> On Tue, May 14, 2019 at 7:35 AM Tarek Zegar  wrote:
>>
>> Someone nuked and OSD that had 1 replica PGs. They accidentally did echo
1 > /sys/block/nvme0n1/device/device/remove
>> We got it back doing a echo 1 > /sys/bus/pci/rescan
>> However, it reenumerated as a different drive number (guess we didn't
have udev rules)
>> They restored the LVM volume (vgcfgrestore
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841 ; vgchange -ay
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841)
>>
>> lsblk

Re: [ceph-users] Lost OSD from PCIe error, recovered, to restore OSD process

2019-05-15 Thread Tarek Zegar

TLDR; I activated the drive successfully but the daemon won't start, looks
like it's complaining about mon config, idk why (there is a valid ceph.conf
on the host). Thoughts? I feel like it's close. Thank you

I executed the command:
ceph-volume lvm activate --all


It found the drive and activated it:
--> Activating OSD ID 122 FSID a151bea5-d123-45d9-9b08-963a511c042a

--> ceph-volume lvm activate successful for osd ID: 122



However, systemd would not start the OSD process 122:
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: 2019-05-15
14:16:13.862 71970700 -1 monclient(hunting): handle_auth_bad_method
server allowed_methods [2] but i only support [2]
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: 2019-05-15
14:16:13.862 7116f700 -1 monclient(hunting): handle_auth_bad_method
server allowed_methods [2] but i only support [2]
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 ceph-osd[757237]: failed to fetch
mon config (--no-mon-config to skip)
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Main process exited, code=exited, status=1/FAILURE
May 15 14:16:13 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Failed with result 'exit-code'.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Service hold-off time over, scheduling restart.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Scheduled restart job, restart counter is at 3.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Automatic restarting of the unit ceph-osd@122.service has been
scheduled, as the result for
-- the configured Restart= setting for the unit.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: Stopped Ceph object
storage daemon osd.122.
-- Subject: Unit ceph-osd@122.service has finished shutting down
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Unit ceph-osd@122.service has finished shutting down.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Start request repeated too quickly.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: ceph-osd@122.service:
Failed with result 'exit-code'.
May 15 14:16:14 pok1-qz1-sr1-rk001-s20 systemd[1]: Failed to start Ceph
object storage daemon osd.122





From:   Alfredo Deza 
To: Bob R 
Cc: Tarek Zegar , ceph-users

Date:   05/15/2019 08:27 AM
Subject:[EXTERNAL] Re: [ceph-users] Lost OSD from PCIe error,
recovered, to restore OSD process



On Tue, May 14, 2019 at 7:24 PM Bob R  wrote:
>
> Does 'ceph-volume lvm list' show it? If so you can try to activate it
with 'ceph-volume lvm activate 122
74b01ec2--124d--427d--9812--e437f90261d4'

Good suggestion. If `ceph-volume lvm list` can see it, it can probably
activate it again. You can activate it with the OSD ID + OSD FSID, or
do:

ceph-volume lvm activate --all

You didn't say if the OSD wasn't coming up after trying to start it
(the systemd unit should still be there for ID 122), or if you tried
rebooting and that OSD didn't come up.

The systemd unit is tied to both the ID and FSID of the OSD, so it
shouldn't matter if the underlying device changed since ceph-volume
ensures it is the right one every time it activates.
>
> Bob
>
> On Tue, May 14, 2019 at 7:35 AM Tarek Zegar  wrote:
>>
>> Someone nuked and OSD that had 1 replica PGs. They accidentally did echo
1 > /sys/block/nvme0n1/device/device/remove
>> We got it back doing a echo 1 > /sys/bus/pci/rescan
>> However, it reenumerated as a different drive number (guess we didn't
have udev rules)
>> They restored the LVM volume (vgcfgrestore
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841 ; vgchange -ay
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841)
>>
>> lsblk
>> nvme0n2 259:9 0 1.8T 0 disk
>> ceph--8c81b2a3--6c8e--4cae--a3c0--e2d91f82d841-osd--data--74b01ec2--124d--427d--9812--e437f90261d4 253:1 0 1.8T 0 lvm
>>
>> We are stuck here. How do we attach an OSD daemon to the drive? It was
OSD.122 previously
>>
>> Thanks
>>


Re: [ceph-users] Rolling upgrade fails with flag norebalance with background IO [EXT]

2019-05-14 Thread Tarek Zegar

https://github.com/ceph/ceph-ansible/issues/3961   <--- created ticket

Thanks
Tarek



From:   Matthew Vernon 
To:     Tarek Zegar , solarflo...@gmail.com
Cc: ceph-users@lists.ceph.com
Date:   05/14/2019 04:41 AM
Subject:[EXTERNAL] Re: [ceph-users] Rolling upgrade fails with flag
norebalance with background IO [EXT]



On 14/05/2019 00:36, Tarek Zegar wrote:
> It's not just mimic to nautilus
> I confirmed with luminous to mimic
>
> They are checking for clean pgs with flags set, they should unset flags,
> then check. Set flags again, move on to next osd

I think I'm inclined to agree that "norebalance" is likely to get in the
way when upgrading a cluster - our rolling upgrade playbook omits it.

OTOH, you might want to raise this on the ceph-ansible list (
ceph-ansi...@lists.ceph.com ) and/or as a github issue - I don't think
the ceph-ansible maintainers routinely watch this list.

HTH,

Matthew


--
 The Wellcome Sanger Institute is operated by Genome Research
 Limited, a charity registered in England with number 1021457 and a
 company registered in England with number 2742969, whose registered
 office is 215 Euston Road, London, NW1 2BE.





[ceph-users] Lost OSD from PCIe error, recovered, to restore OSD process

2019-05-14 Thread Tarek Zegar

Someone nuked an OSD that had 1-replica PGs. They accidentally did echo 1
> /sys/block/nvme0n1/device/device/remove
We got it back doing a echo 1 > /sys/bus/pci/rescan
However, it reenumerated as a different drive number (guess we didn't have
udev rules)
They restored the LVM volume (vgcfgrestore
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841 ; vgchange -ay
ceph-8c81b2a3-6c8e-4cae-a3c0-e2d91f82d841)

lsblk
nvme0n2                                                                                             259:9    0  1.8T  0 disk
ceph--8c81b2a3--6c8e--4cae--a3c0--e2d91f82d841-osd--data--74b01ec2--124d--427d--9812--e437f90261d4  253:1    0  1.8T  0 lvm

We are stuck here. How do we attach an OSD daemon to the drive? It was
OSD.122 previously

Thanks


Re: [ceph-users] Rolling upgrade fails with flag norebalance with background IO

2019-05-13 Thread Tarek Zegar
It's not just mimic to nautilus
I confirmed with luminous to mimic
 
They are checking for clean pgs with flags set, they should unset flags, then check. Set flags again, move on to next osd
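
The manual workaround in the meantime is roughly (a sketch; run it while the
upgrade is stuck, then restore the flags before the next OSD restart):

ceph osd unset norebalance    # let the remapped PGs backfill
ceph osd unset noout
ceph -s                       # wait for the PGs to go active+clean
ceph osd set norebalance      # put the flags back before continuing
ceph osd set noout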
 
- Original message -
From: solarflow99
To: Tarek Zegar
Cc: Ceph Users
Subject: [EXTERNAL] Re: [ceph-users] Rolling upgrade fails with flag norebalance with background IO
Date: Mon, May 13, 2019 6:36 PM

Are you sure you can really use 3.2 for nautilus?

On Fri, May 10, 2019 at 7:23 AM Tarek Zegar <tze...@us.ibm.com> wrote:

Ceph-ansible 3.2, rolling upgrade mimic -> nautilus. The ansible file sets
flag "norebalance". When there is *no* I/O to the cluster, upgrade works
fine. When upgrading with IO running in the background, some PGs become
`active+undersized+remapped+backfilling`. Flag norebalance prevents them
from backfilling / recovering and the upgrade fails. I'm uncertain why those
OSDs are "backfilling" instead of "recovering" but I guess it doesn't
matter, norebalance halts the process. Setting ceph tell osd.* injectargs
'--osd_max_backfills=2' made no difference.

https://github.com/ceph/ceph-ansible/commit/08d94324545b3c4e0f6a1caf6224f37d1c2b36db  <-- did anyone other than the author verify this?

Tarek
 



[ceph-users] Ceph MGR CRASH : balancer module

2019-05-13 Thread Tarek Zegar


Hello,

My manager daemon keeps dying; the last crash meta log is below. What is causing this? I
do have two roots in the osd tree with shared hosts (see below), but I can't
imagine that is causing the balancer to fail?


meta log:
{
"crash_id":
"2019-05-11_19:09:17.999875Z_aa7afa7c-bc7e-43ec-b32a-821bd47bd68b",
"timestamp": "2019-05-11 19:09:17.999875Z",
"process_name": "ceph-mgr",
"entity_name": "mgr.pok1-qz1-sr1-rk023-s08",
"ceph_version": "14.2.0",
"utsname_hostname": "pok1-qz1-sr1-rk023-s08",
"utsname_sysname": "Linux",
"utsname_release": "4.15.0-1014-ibm-gt",
"utsname_version": "#16-Ubuntu SMP Tue Dec 11 11:19:10 UTC 2018",
"utsname_machine": "x86_64",
"os_name": "Ubuntu",
"os_id": "ubuntu",
"os_version_id": "18.04",
"os_version": "18.04.1 LTS (Bionic Beaver)",
"assert_condition": "osd_weight.count(i.first)",
"assert_func": "int OSDMap::calc_pg_upmaps(CephContext*, float, int,
const std::set&, OSDMap::Incremental*)",
"assert_file": "/build/ceph-14.2.0/src/osd/OSDMap.cc",
"assert_line": 4743,
"assert_thread_name": "balancer",
"assert_msg": "/build/ceph-14.2.0/src/osd/OSDMap.cc: In function 'int
OSDMap::calc_pg_upmaps(CephContext*, float, int, const std::set&,
OSDMap::Incremental*)' thread 7fffd6572700 time 2019-05-11 19:09:17.998114
\n/build/ceph-14.2.0/src/osd/OSDMap.cc: 4743: FAILED ceph_assert
(osd_weight.count(i.first))\n",
"backtrace": [
"(()+0x12890) [0x7fffee586890]",
"(gsignal()+0xc7) [0x7fffed67ee97]",
"(abort()+0x141) [0x7fffed680801]",
"(ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x1a3) [0x7fffef1eb7d3]",
"(ceph::__ceph_assertf_fail(char const*, char const*, int, char
const*, char const*, ...)+0) [0x7fffef1eb95d]",
"(OSDMap::calc_pg_upmaps(CephContext*, float, int, std::set, std::allocator > const&,
OSDMap::Incremental*)+0x274b) [0x7fffef61bb3b]",
"(()+0x1d52b6) [0x557292b6]",
"(PyEval_EvalFrameEx()+0x8010) [0x7fffeeab21d0]",
"(PyEval_EvalCodeEx()+0x7d8) [0x7fffeebe2278]",
"(PyEval_EvalFrameEx()+0x5bf6) [0x7fffeeaafdb6]",
"(PyEval_EvalFrameEx()+0x8b5b) [0x7fffeeab2d1b]",
"(PyEval_EvalFrameEx()+0x8b5b) [0x7fffeeab2d1b]",
"(PyEval_EvalCodeEx()+0x7d8) [0x7fffeebe2278]",
"(()+0x1645f9) [0x7fffeeb675f9]",
"(PyObject_Call()+0x43) [0x7fffeea57333]",
"(()+0x1abd1c) [0x7fffeebaed1c]",
"(PyObject_Call()+0x43) [0x7fffeea57333]",
"(PyObject_CallMethod()+0xc8) [0x7fffeeb7bc78]",
"(PyModuleRunner::serve()+0x62) [0x55725f32]",
"(PyModuleRunner::PyModuleRunnerThread::entry()+0x1cf)
[0x557265df]",
"(()+0x76db) [0x7fffee57b6db]",
"(clone()+0x3f) [0x7fffed76188f]"
]
}

OSD TREE:
ID  CLASS WEIGHTTYPE NAME   STATUS REWEIGHT PRI-AFF
-2954.58200 root tzrootthreenodes
-2518.19400 host pok1-qz1-sr1-rk001-s20
  0   ssd   1.81898 osd.0   up  1.0 1.0
122   ssd   1.81898 osd.122 up  1.0 1.0
135   ssd   1.81898 osd.135 up  1.0 1.0
149   ssd   1.81898 osd.149 up  1.0 1.0
162   ssd   1.81898 osd.162 up  1.0 1.0
175   ssd   1.81898 osd.175 up  1.0 1.0
188   ssd   1.81898 osd.188 up  1.0 1.0
200   ssd   1.81898 osd.200 up  1.0 1.0
213   ssd   1.81898 osd.213 up  1.0 1.0
225   ssd   1.81898 osd.225 up  1.0 1.0
 -518.19400 host pok1-qz1-sr1-rk002-s05
112   ssd   1.81898 osd.112 up  1.0 1.0
120   ssd   1.81898 osd.120 up  1.0 1.0
132   ssd   1.81898 osd.132 up  1.0 1.0
144   ssd   1.81898 osd.144 up  1.0 1.0
156   ssd   1.81898 osd.156 up  1.0 1.0
168   ssd   1.81898 osd.168 up  1.0 1.0
180   ssd   1.81898 osd.180 up  1.0 1.0
192   ssd   1.81898 osd.192 up  1.0 1.0
204   ssd   1.81898 osd.204 up  1.0 1.0
216   ssd   1.81898 osd.216 up  1.0 1.0
-1118.19400 host pok1-qz1-sr1-rk002-s16
115   ssd   1.81898 osd.115 up  1.0 1.0
127   ssd   1.81898 osd.127 up  1.0 1.0
139   ssd   1.81898 osd.139 up  1.0 1.0
151   ssd   1.81898 osd.151 up  1.0 1.0
163   ssd   1.81898 osd.163 up  1.0 1.0
174   ssd   1.81898   

[ceph-users] Rolling upgrade fails with flag norebalance with background IO

2019-05-10 Thread Tarek Zegar


Ceph-ansible 3.2, rolling upgrade mimic -> nautilus. The ansible file sets
flag "norebalance". When there is *no* I/O to the cluster, upgrade works
fine. When upgrading with IO running in the background, some PGs become
`active+undersized+remapped+backfilling`.
Flag norebalance prevents them from backfilling / recovering and the upgrade
fails. I'm uncertain why those OSDs are "backfilling" instead of
"recovering" but I guess it doesn't matter, norebalance halts the process.
Setting ceph tell osd.* injectargs '--osd_max_backfills=2' made no
difference.

https://github.com/ceph/ceph-ansible/commit/08d94324545b3c4e0f6a1caf6224f37d1c2b36db
  <-- did anyone other than the author verify this?

Tarek



[ceph-users] PG in UP set but not Acting? Backfill halted

2019-05-09 Thread Tarek Zegar


Hello,

Been working on Ceph for only a few weeks and have a small cluster in VMs.
I did a ceph-ansible rolling_update to nautilus from mimic and some of my
PG were stuck in 'active+undersized+remapped+backfilling' with no progress.
All OSDs were up and in (see ceph tree below). The PGs only had 2 OSDs in the
acting set, yet 3 in the UP set. I don't understand how the acting set can
have two and the UP set can have 3; if anything, wouldn't the UP set be a
subset of acting?
Anyway, I noticed that the ansible rolling_update set the following flags
'noout' AND 'norebalance'. PG query showed backfill target as OSD 0 (which
was missing from the acting set) and "waiting on backfill" was blank, as
such I'm very confused.
So it wants to backfill OSD 0, it's not blocked per empty set in
waiting_on_backfill, so what's holding it up? Why is it not in the acting
set? (what's the clear definition of acting vs up)
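
A quick way to see which flags are still set, and to clear the one that pins
the backfill, is (a sketch):

ceph osd dump | grep flags    # shows e.g. noout,norebalance
ceph osd unset norebalance    # allow the backfill to OSD 0 to proceed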

ceph osd tree
ID CLASS WEIGHT  TYPE NAME STATUS REWEIGHT PRI-AFF
-1   0.08817 root default
-5   0.02939 host hostosd1
 0   hdd 0.00980 osd.0 up  1.0 1.0
 4   hdd 0.00980 osd.4 up  1.0 1.0
 7   hdd 0.00980 osd.7 up  1.0 1.0
-3   0.02939 host hostosd2
 1   hdd 0.00980 osd.1 up  1.0 1.0
 3   hdd 0.00980 osd.3 up  1.0 1.0
 6   hdd 0.00980 osd.6 up  1.0 1.0
-7   0.02939 host hostosd3
 2   hdd 0.00980 osd.2 up  1.0 1.0
 5   hdd 0.00980 osd.5 up  1.0 1.0
 8   hdd 0.00980 osd.8 up  1.0 1.0


PG Info
1.35  3  0 0 0   0 8388623 0  0 3045 3045 active+undersized+remapped+backfilling 2019-05-09 16:18:02.513033 50'107145 50:108127 [5,6,0] 5   [5,6]

PG Query
"state": "active+undersized+remapped+backfilling",
"snap_trimq": "[]",
"snap_trimq_len": 0,
"epoch": 50,
"up": [
5,
6,
0
],
"acting": [
5,
6
],
"backfill_targets": [
"0"
],
"acting_recovery_backfill": [
"0",
"5",
"6"
]

...

"waiting_on_backfill": [],
"last_backfill_started": "MAX",
"backfill_info": {
"begin": "MAX",
"end": "MAX",
"objects": []
},
"peer_backfill_info": [
"0",
{
"begin": "MAX",
"end": "MAX",
"objects": []
}
],
"backfills_in_flight": [],
"recovering": [],