Re: [ceph-users] Filestore to Bluestore migration question

2018-11-05 Thread Amit Ghadge
On Mon, 5 Nov 2018, 21:13 Hayashida, Mami,  wrote:

> Additional info -- I know that /var/lib/ceph/osd/ceph-{60..69} are not
> mounted at this point (i.e. "mount | grep ceph-60", and likewise for 61-69,
> returns nothing). They don't show up when I run "df", either.
>
The ceph-volume command automatically mounts each ceph-<osd-id> directory on tmpfs.
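A quick way to verify that is to check the mount table after activation; a minimal sketch, assuming the OSD IDs 60-69 from this thread:

  # confirm that ceph-volume has mounted each OSD directory on tmpfs
  for id in $(seq 60 69); do
      findmnt "/var/lib/ceph/osd/ceph-${id}" || echo "ceph-${id} not mounted"
  done

Until an OSD has been activated, its directory will not show up in "mount" or "df" output.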

>
> On Mon, Nov 5, 2018 at 10:15 AM, Hayashida, Mami 
> wrote:
>
>> Well, over the weekend the whole server went down and is now in
>> emergency mode. (I am running Ubuntu 16.04.)  When I run "journalctl -p
>> err -xb" I see that
>>
>> systemd[1]: Timed out waiting for device dev-sdh1.device.
>> -- Subject: Unit dev-sdh1.device has failed
>> -- Defined-By: systemd
>> -- Support: http://lists.freedesktop.org/
>> --
>> -- Unit dev-sdh1.device has failed.
>>
>>
>> I see this for every single one of the newly-converted Bluestore OSD
>> disks (/dev/sd{h..q}1).
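One common cause of such dev-sdX1.device timeouts after a FileStore-to-BlueStore conversion (an assumption here, not confirmed in this thread) is stale /etc/fstab entries or leftover mount units that still reference the wiped partitions. A minimal check, using the sdh-sdq drives mentioned above:

  # look for stale references to the old FileStore data partitions
  grep -E 'sd[h-q]1' /etc/fstab
  systemctl list-units --all | grep -E 'dev-sd[h-q]1'
  # if stale fstab entries exist, comment them out and reload systemd
  systemctl daemon-reload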
>>
>>
>> --
>>
>> On Mon, Nov 5, 2018 at 9:57 AM, Alfredo Deza  wrote:
>>
>>> On Fri, Nov 2, 2018 at 5:04 PM Hayashida, Mami 
>>> wrote:
>>> >
>>> > I followed all the steps Hector suggested, and almost everything seems
>>> to have worked fine.  I say "almost" because one out of the 10 osds I was
>>> migrating could not be activated even though everything up to that point
>>> worked just as well for that osd as the other ones. Here is the output for
>>> that particular failure:
>>> >
>>> > *
>>> > ceph-volume lvm activate --all
>>> > ...
>>> > --> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-XX
>>> > Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
>>> > --> Absolute path not found for executable: restorecon
>>> > --> Ensure $PATH environment variable contains common executable
>>> locations
>>> > Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir
>>> --dev /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>>> >  stderr: failed to read label for /dev/hdd67/data67: (2) No such file
>>> or directory
>>> > -->  RuntimeError: command returned non-zero exit status:
>>>
>>> I wonder if the /dev/sdo device where hdd67/data67 is located is
>>> available, or if something else is missing. You could try poking
>>> around with `lvs` and see if that LV shows up, also `ceph-volume lvm
>>> list hdd67/data67` can help here because it
>>> groups OSDs to LVs. If you run `ceph-volume lvm list --format=json
>>> hdd67/data67` you will also see all the metadata stored in it.
>>>
>>> Would be interesting to see that output to verify things exist and are
>>> usable for OSD activation.
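Spelled out, the checks suggested above look roughly like this (hdd67/data67 is the VG/LV name from the failing OSD):

  # is the LV known to LVM and is its device node present?
  lvs hdd67/data67
  ls -l /dev/hdd67/data67

  # what metadata does ceph-volume have for it?
  ceph-volume lvm list hdd67/data67
  ceph-volume lvm list --format=json hdd67/data67

If lvs does not show the LV at all, the underlying /dev/sdo device is the first thing to check.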
>>>
>>> >
>>> > ***
>>> > I then checked to see if the rest of the migrated OSDs were back in by
>>> calling the ceph osd tree command from the admin node.  Since they were
>>> not, I tried to restart the first of the 10 newly migrated Bluestore osds
>>> by calling
>>> >
>>> > ***
>>> > systemctl start ceph-osd@60
>>> >
>>> > At that point, not only could this particular service not be started,
>>> but ALL the OSDs (daemons) on the entire node shut down!
>>> >
>>> > **
>>> > root@osd1:~# systemctl status ceph-osd@60
>>> > ● ceph-osd@60.service - Ceph object storage daemon osd.60
>>> >Loaded: loaded (/lib/systemd/system/ceph-osd@.service;
>>> enabled-runtime; vendor preset: enabled)
>>> >Active: inactive (dead) since Fri 2018-11-02 15:47:20 EDT; 1h 9min
>>> ago
>>> >   Process: 3473621 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER}
>>> --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
>>> >   Process: 3473147 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh
>>> --cluster ${CLUSTER} --id %i (code=exited, status=0/SUCCESS)
>>> >  Main PID: 3473621 (code=exited, status=0/SUCCESS)
>>> >
>>> > Oct 29 15:57:53 osd1.x.uky.edu ceph-osd[3473621]: 2018-10-29
>>> 15:57:53.868856 7f68adaece00 -1 osd.60 48106 log_to_monitors {default=true}
>>> > Oct 29 15:57:53 osd1.x.uky.edu ceph-osd[3473621]: 2018-10-29
>>> 15:57:53.874373 7f68adaece00 -1 osd.60 48106 mon_cmd_maybe_osd_create fail:
>>> 'you must complete the upgrade and 'ceph osd require-osd-release luminous'
>>> before using crush device classes': (1) Operation not permitted
>>> > Oct 30 06:25:01 osd1.x.uky.edu ceph-osd[3473621]: 2018-10-30
>>> 06:25:01.961720 7f687feb3700 -1 received  signal: Hangup from  PID: 3485955
>>> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse
>>> radosgw  UID: 0
>>> > Oct 31 06:25:02 osd1.x.uky.edu ceph-osd[3473621]: 2018-10-31
>>> 06:25:02.110898 7f687feb3700 -1 received  signal: Hangup from  PID: 3500945
>>> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse
>>> radosgw  UID: 0
>>> > Nov 01 06:25:02 osd1.x.uky.edu ceph-osd[3473621]: 2018-11-01
>>> 06:25:02.101548 7f687feb3700 -1 received  signal: Hangup from  PID: 3514774
>>> task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse
>>> radosgw  UID: 0
>>> > Nov 02 06:25:02 osd1.x.uky.edu ceph-osd[3473621]: 2018-11-02
>>> 06:25:01.997557 7f687feb3700 -1 received  signal: Hangup 
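The mon_cmd_maybe_osd_create failure in the log above ("you must complete the upgrade and 'ceph osd require-osd-release luminous' ...") suggests the cluster has not yet been flagged as fully upgraded. Assuming every mon and OSD really is on Luminous, the flag it asks for is set from an admin node; a sketch:

  # only do this once all mons and OSDs run Luminous
  ceph osd require-osd-release luminous
  # verify the flag
  ceph osd dump | grep require_osd_release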

[ceph-users] Ceph luminous custom plugin

2018-11-14 Thread Amit Ghadge
Hi,
I copied my custom module into /usr/lib64/ceph/mgr and ran "ceph mgr module
enable <module> --force" to enable the plugin. The plugin loads and prints some
messages, but none of them appear in the ceph-mgr log file.


Thanks,
Amit G


Re: [ceph-users] Ceph luminous custom plugin

2018-11-14 Thread Amit Ghadge
On Wed, Nov 14, 2018 at 5:11 PM Amit Ghadge  wrote:

> Hi,
> I copied my custom module into /usr/lib64/ceph/mgr and ran "ceph mgr module
> enable <module> --force" to enable the plugin. The plugin loads and prints some
> messages, but none of them appear in the ceph-mgr log file.
>
>
> Thanks,
> Amit G
>

Yes, it started working; I needed to restart the ceph-mgr service.
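For reference, the sequence that ended up working would look roughly like this (mymodule is a hypothetical module name; the exact ceph-mgr unit name depends on the deployment):

  # the module directory must live where ceph-mgr looks for plugins
  cp -r mymodule /usr/lib64/ceph/mgr/
  ceph mgr module enable mymodule --force
  # the module's log output only shows up after the manager restarts
  systemctl restart ceph-mgr.target
  ceph mgr module ls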

Thanks,


[ceph-users] [Ceph-users] Multisite-Master zone still in recover mode

2019-01-02 Thread Amit Ghadge
Hi,

We followed the steps at http://docs.ceph.com/docs/master/radosgw/multisite/ to
migrate a single-site deployment to a master zone and then set up a secondary zone.
We did not delete the existing data, and all objects synced to the secondary zone,
but the master zone still shows shards in recovery mode. Dynamic resharding is
disabled.

Master zone
# radosgw-admin sync status
  realm 2c642eee-46e0-488e-8566-6a58878c1a95 (movie)
  zonegroup b569583b-ae34-4798-bb7c-a79de191b7dd (us)
   zone 2929a077-6d81-48ee-bf64-3503dcdf2d46 (us-west)
  metadata sync no sync (zone is master)
  data sync source: 5bcbf11e-5626-4773-967d-6d22decb44c0 (us-east)
                syncing
                full sync: 0/128 shards
                incremental sync: 128/128 shards
                128 shards are recovering
                recovering shards:
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127]


Secondary zone
#  radosgw-admin sync status
  realm 2c642eee-46e0-488e-8566-6a58878c1a95 (movie)
  zonegroup b569583b-ae34-4798-bb7c-a79de191b7dd (us)
   zone 5bcbf11e-5626-4773-967d-6d22decb44c0 (us-east)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
  data sync source: 2929a077-6d81-48ee-bf64-3503dcdf2d46 (us-west)
                syncing
                full sync: 0/128 shards
                incremental sync: 128/128 shards
                data is caught up with source


After we pushed objects to the master zone, they synced to the secondary zone,
and the master zone's sync status again started showing shards in recovery mode.

So my question is: is this normal behavior?
We are running ceph version 12.2.9.
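To see why the 128 shards stay in recovery, the sync error log on the master zone is a good starting point; a sketch (these commands are available in 12.2.x, though the output format may vary):

  # on the master zone (us-west)
  radosgw-admin sync error list
  radosgw-admin data sync status --source-zone=us-east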


[ceph-users] Journal drive recommendation

2018-11-26 Thread Amit Ghadge
Hi all,

We are planning to use SSD data drives. For the journal drive, is the
recommendation to use the same drive or a separate one?

Thanks,
Amit


Re: [ceph-users] Journal drive recommendation

2018-11-26 Thread Amit Ghadge
On Tue, 27 Nov 2018, 10:55 Martin Verges,  wrote:

> Hello,
>
> what type of SSD data drives do you plan to use?
>
We plan to use external SSD data drives.

> In general, I would not recommend using an external journal with SSD OSDs, but
> it is possible to squeeze out a bit more performance depending on your data
> disks.
>
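For completeness, with BlueStore the "journal" question becomes where to put the RocksDB/WAL; a minimal ceph-volume sketch, with hypothetical device names:

  # everything on a single SSD (the usual choice for SSD-only OSDs)
  ceph-volume lvm create --bluestore --data /dev/sdb

  # data on the SSD, DB/WAL on a faster device such as NVMe
  ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1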

> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> Am Di., 27. Nov. 2018, 02:50 hat Amit Ghadge 
> geschrieben:
>
>> Hi all,
>>
>> We are planning to use SSD data drives. For the journal drive, is the
>> recommendation to use the same drive or a separate one?
>>
>> Thanks,
>> Amit
>>
>


Re: [ceph-users] Cluster Status:HEALTH_ERR for Full OSD

2019-01-30 Thread Amit Ghadge
A better way is to increase osd set-full-ratio slightly (e.g. to 0.97) and then
remove the buckets.

-AmitG

On Wed, 30 Jan 2019, 21:30 Paul Emmerich,  wrote:

> Quick and dirty solution: take the full OSD down to issue the deletion
> command ;)
>
> Better solutions: temporarily increase the full limit (ceph osd
> set-full-ratio) or reduce the OSD's reweight (ceph osd reweight)
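Both of those workarounds are one-liners; a sketch using the osd.2 from the report below:

  # temporarily raise the full threshold, delete data, then restore the default
  ceph osd set-full-ratio 0.97
  # ... remove buckets / objects ...
  ceph osd set-full-ratio 0.95

  # or push PGs off the full OSD by lowering its reweight
  ceph osd reweight 2 0.9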
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Wed, Jan 30, 2019 at 11:56 AM Fabio - NS3 srl  wrote:
> >
> > Hello guys,
> > I have a Ceph cluster serving S3 with a full OSD
> >
> > ~# ceph health detail
> > HEALTH_ERR 1 full osd(s); 1 near full osd(s)
> > osd.2 is full at 95%
> > osd.5 is near full at 85%
> >
> >
> > I want to delete some buckets, but when I tried to list the buckets
> >
> >
> > ~# radosgw-admin bucket list
> > 2019-01-30 11:41:47.933621 7f467a9d0780  0 client.3967227.objecter
> FULL, paused modify 0x2aaf410 tid 8
> >
> > the command remains blocked... no prompt comes back.
> >
> > Any solutions, besides adding an OSD?
> >
> > Many thank
> > --
> > Fabio
> >
>


Re: [ceph-users] Multisite Ceph setup sync issue

2019-01-30 Thread Amit Ghadge
Have you committed your changes on the slave gateway?
First, run the period commit command on the slave gateway and then try again.
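The commit referred to here is the period commit; a sketch of the usual sequence on the secondary (slave) gateway host, with hypothetical endpoint and credentials:

  # pull the current realm/period from the master if it is not present yet
  radosgw-admin period pull --url=http://<master-gw>:7480 \
      --access-key=<system-access-key> --secret=<system-secret>
  # commit the period so the configuration change takes effect
  radosgw-admin period update --commit
  # restart the gateway so it reloads the new period
  systemctl restart ceph-radosgw.target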

-AmitG

On Wed, 30 Jan 2019, 21:06 Krishna Verma,  wrote:

> Hi Casey,
>
> Thanks for your reply. However, I tried the "--source-zone" option with the
> sync command but am getting the error below:
>
> Sync status from the slave gateway, with master zone "noida1" as the source:
>
> [cephuser@zabbix-client ~]$ radosgw-admin sync status --source-zone
> noida1 2>/dev/null
>   realm 1102c891-d81c-480e-9487-c9f874287d13 (georep)
>   zonegroup 74ad391b-fbca-4c05-b9e7-c90fd4851223 (noida)
>zone 45c690a8-f39c-4b1d-9faf-e0e991ceaaac (san-jose)
>   metadata sync failed to read sync status: (2) No such file or directory
>   data sync source: 71931e0e-1be6-449f-af34-edb4166c4e4a (noida1)
> failed to retrieve sync info: (5) Input/output
> error
> [cephuser@zabbix-client ~]$
>
> Sync status from the master gateway, with slave zone "san-jose" as the source:
>
> [cephuser@zabbix-server ~]$  radosgw-admin sync status --source-zone
> san-jose 2>/dev/null
>   realm 1102c891-d81c-480e-9487-c9f874287d13 (georep)
>   zonegroup 74ad391b-fbca-4c05-b9e7-c90fd4851223 (noida)
>zone 71931e0e-1be6-449f-af34-edb4166c4e4a (noida1)
>   metadata sync no sync (zone is master)
> [cephuser@zabbix-server ~]$
>
> Zonegroup detail from the master gateway:
>
> [cephuser@zabbix-server ~]$ radosgw-admin zonegroup get  2>/dev/null
> {
> "id": "74ad391b-fbca-4c05-b9e7-c90fd4851223",
> "name": "noida",
> "api_name": "noida",
> "is_master": "true",
> "endpoints": [
> "http:\/\/zabbix-server:7480"
> ],
> "hostnames": [],
> "hostnames_s3website": [],
> "master_zone": "71931e0e-1be6-449f-af34-edb4166c4e4a",
> "zones": [
> {
> "id": "71931e0e-1be6-449f-af34-edb4166c4e4a",
> "name": "noida1",
> "endpoints": [
> "http:\/\/vlno-ceph01:7480"
> ],
> "log_meta": "false",
> "log_data": "false",
> "bucket_index_max_shards": 0,
> "read_only": "false"
> }
> ],
> "placement_targets": [
> {
> "name": "default-placement",
> "tags": []
> }
> ],
> "default_placement": "default-placement",
> "realm_id": "1102c891-d81c-480e-9487-c9f874287d13"
> }
>
> [cephuser@zabbix-server ~]$
>
>
> Zonegroup detail from the slave gateway:
>
> [cephuser@zabbix-client ~]$ radosgw-admin zonegroup get  2>/dev/null
> {
> "id": "74ad391b-fbca-4c05-b9e7-c90fd4851223",
> "name": "noida",
> "api_name": "noida",
> "is_master": "true",
> "endpoints": [
> "http:\/\/zabbix-server:7480"
> ],
> "hostnames": [],
> "hostnames_s3website": [],
> "master_zone": "71931e0e-1be6-449f-af34-edb4166c4e4a",
> "zones": [
> {
> "id": "45c690a8-f39c-4b1d-9faf-e0e991ceaaac",
> "name": "san-jose",
> "endpoints": [
> "http:\/\/zabbix-client:7480"
> ],
> "log_meta": "false",
> "log_data": "true",
> "bucket_index_max_shards": 0,
> "read_only": "false"
> },
> {
> "id": "71931e0e-1be6-449f-af34-edb4166c4e4a",
> "name": "noida1",
> "endpoints": [
> "http:\/\/vlno-ceph01:7480"
> ],
> "log_meta": "false",
> "log_data": "true",
> "bucket_index_max_shards": 0,
> "read_only": "false"
> }
> ],
> "placement_targets": [
> {
> "name": "default-placement",
> "tags": []
> }
> ],
> "default_placement": "default-placement",
> "realm_id": "1102c891-d81c-480e-9487-c9f874287d13"
> }
>
> [cephuser@zabbix-client ~]
>
> I need your expert advice.
>
> /Krishna
>
> -Original Message-
> From: Casey Bodley 
> Sent: Wednesday, January 30, 2019 1:54 AM
> To: Krishna Verma 
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Multisite Ceph setup sync issue
>
> EXTERNAL MAIL
>
>
> On Tue, Jan 29, 2019 at 12:24 PM Krishna Verma  wrote:
> >
> > Hi Ceph Users,
> >
> >
> >
> > I need your help to fix a sync issue in a multisite setup.
> >
> >
> >
> > I have 2 clusters in different datacenters that we want to use for
> bidirectional data replication. Following the documentation at
> http://docs.ceph.com/docs/master/radosgw/multisite/
> I have set up a gateway on each site, but when I check the sync
> status it fails as below:
> >
> >
> >
> > Admin node at master:
> >
> > [cephuser@vlno-ceph01 cluster]$ radosgw-admin data sync status
> >
> > ERROR: source zone not specified
> >
> > [cephuser@vlno-ceph01 cluster]$
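The "source zone not specified" error just means the data-sync report needs to know which peer zone to look at; with the zone names from this setup:

  # on the master (noida1), report data sync from the secondary zone
  radosgw-admin data sync status --source-zone=san-jose
  # the per-zone summary view needs no source zone
  radosgw-admin sync status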