Re: [ceph-users] osds fails to start with mismatch in id

2014-11-11 Thread Ramakrishna Nishtala (rnishtal)
Hi
It appears that with pre-created partitions, ceph-deploy create is unable to
change the partition GUIDs. The GUID set by parted remains as it is.

I ran sgdisk manually on each partition:

sgdisk --change-name=2:'ceph data' --partition-guid=2:${osd_uuid} \
  --typecode=2:${ptype2} /dev/${i}

The typecodes for journal and data were taken from ceph-disk-udev.
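
The manual retagging described above can be sketched as a small helper that builds the sgdisk command line for one partition. This is a minimal sketch, not the exact procedure run in this thread: the function name `sgdisk_argv`, the device `/dev/sdb` and the freshly generated UUID are hypothetical, and the mapping of the two type GUIDs to "journal" and "data" is an assumption based on the udev rule values quoted elsewhere in the thread. It only prints the command and does not run sgdisk.

```python
import shlex
import uuid

# Type GUIDs taken from the Ceph udev rules quoted in this thread.
# ASSUMPTION: which one means "journal" and which means "data".
CEPH_JOURNAL_TYPE = "45b0969e-9b03-4f30-b4c6-b4b80ceff106"
CEPH_DATA_TYPE = "4fbd7e29-9d25-41b8-afd0-062c0ceff05d"


def sgdisk_argv(device, partnum, name, part_guid, type_guid):
    """Build the sgdisk argument list that retags one partition."""
    return [
        "sgdisk",
        f"--change-name={partnum}:{name}",
        f"--partition-guid={partnum}:{part_guid}",
        f"--typecode={partnum}:{type_guid}",
        device,
    ]


# Hypothetical example values; nothing is executed here.
example = sgdisk_argv("/dev/sdb", 2, "ceph data", str(uuid.uuid4()), CEPH_DATA_TYPE)
# shlex.join quotes the argument containing a space, so the printed
# line can be pasted into a shell safely.
print(shlex.join(example))
```

Passing the arguments as a list (for example to `subprocess.run`) avoids the quoting problem with the space in the partition name altogether.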

Udev now works fine after reboot, and no changes to fstab are required. All
OSDs are up too.

ceph -s
    cluster 9c6cd1ae-66bf-45ce-b7ba-0256b572a8b7
     health HEALTH_OK
     osdmap e358: 60 osds: 60 up, 60 in
      pgmap v1258: 4096 pgs, 1 pools, 0 bytes data, 0 objects
            2802 MB used, 217 TB / 217 TB avail
                4096 active+clean

Thanks to all who responded.

Regards,

Rama


Re: [ceph-users] osds fails to start with mismatch in id

2014-11-10 Thread Ramakrishna Nishtala (rnishtal)
Hi Greg,

Thanks for the pointer. I think you are right. The full story is as follows.

After installation, everything works fine until I reboot. I do see udevadm
being triggered in the logs, but the devices do not come up after reboot. It
looks like exactly the issue in http://tracker.ceph.com/issues/5194, but per
the tracker entry that was fixed a while back.

As a workaround, I copied the contents of /proc/mounts into fstab, and that is
where I ran into the issue.

Following your suggestion, I defined the mounts by UUID in fstab, but hit a
similar problem. blkid.tab has now moved to tmpfs and is not consistent even
after running blkid explicitly to get the UUIDs. This is in line with the
comments in ceph-disk.

I decided to reinstall, dd the partitions, zap the disks, etc. It did not help.
Strangely, the links below change in /dev/disk/by-uuid, /dev/disk/by-partuuid,
etc.

Before reboot

lrwxrwxrwx 1 root root 10 Nov 10 06:31 11aca3e2-a9d5-4bcc-a5b0-441c53d473b6 -> ../../sdd2
lrwxrwxrwx 1 root root 10 Nov 10 06:31 89594989-90cb-4144-ac99-0ffd6a04146e -> ../../sde2
lrwxrwxrwx 1 root root 10 Nov 10 06:31 c17fe791-5525-4b09-92c4-f90eaaf80dc6 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 10 06:31 c57541a1-6820-44a8-943f-94d68b4b03d4 -> ../../sdc2
lrwxrwxrwx 1 root root 10 Nov 10 06:31 da7030dd-712e-45e4-8d89-6e795d9f8011 -> ../../sdb2

After reboot

lrwxrwxrwx 1 root root 10 Nov 10 09:50 11aca3e2-a9d5-4bcc-a5b0-441c53d473b6 -> ../../sdd2
lrwxrwxrwx 1 root root 10 Nov 10 09:50 89594989-90cb-4144-ac99-0ffd6a04146e -> ../../sde2
lrwxrwxrwx 1 root root 10 Nov 10 09:50 c17fe791-5525-4b09-92c4-f90eaaf80dc6 -> ../../sda2
lrwxrwxrwx 1 root root 10 Nov 10 09:50 c57541a1-6820-44a8-943f-94d68b4b03d4 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Nov 10 09:50 da7030dd-712e-45e4-8d89-6e795d9f8011 -> ../../sdh2

Essentially, the change here is sdb2 -> sdh2 and sdc2 -> sdb2. In fact, I had
not partitioned sdh at all before the test. Probably the only difference from
the standard procedure is that I pre-created the partitions for the journal
and data with parted.
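
The before/after comparison of the links can be automated along these lines. This is a minimal sketch, assuming the link directory is snapshotted once before and once after a reboot; the helper names `read_links` and `changed_links` are hypothetical, and the sample data is copied from the two listings in this message.

```python
import os


def read_links(directory):
    """Map each link name in `directory` to the kernel device it resolves to."""
    return {
        name: os.path.basename(os.path.realpath(os.path.join(directory, name)))
        for name in os.listdir(directory)
    }


def changed_links(before, after):
    """Return {link_name: (old_device, new_device)} for links whose target moved."""
    return {
        name: (before[name], after.get(name))
        for name in before
        if after.get(name) != before[name]
    }


# Sample data from the listings in this message.
before = {
    "11aca3e2-a9d5-4bcc-a5b0-441c53d473b6": "sdd2",
    "89594989-90cb-4144-ac99-0ffd6a04146e": "sde2",
    "c17fe791-5525-4b09-92c4-f90eaaf80dc6": "sda2",
    "c57541a1-6820-44a8-943f-94d68b4b03d4": "sdc2",
    "da7030dd-712e-45e4-8d89-6e795d9f8011": "sdb2",
}
after = dict(before)
after["c57541a1-6820-44a8-943f-94d68b4b03d4"] = "sdb2"
after["da7030dd-712e-45e4-8d89-6e795d9f8011"] = "sdh2"

for name, (old, new) in sorted(changed_links(before, after).items()):
    print(f"{name}: {old} -> {new}")
```

On a live system, `read_links("/dev/disk/by-partuuid")` could be saved before the reboot (for example as JSON) and compared with a fresh call afterwards.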



The OSD rules in /lib/udev/rules.d list four different partition GUID codes:

45b0969e-9b03-4f30-b4c6-5ec00ceff106
45b0969e-9b03-4f30-b4c6-b4b80ceff106
4fbd7e29-9d25-41b8-afd0-062c0ceff05d
4fbd7e29-9d25-41b8-afd0-5ec00ceff05d

But all my partitions, journal and data alike, have
ebd0a0a2-b9e5-4433-87c0-68b6b72699c7 as their partition GUID code.
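
The mismatch described here comes down to a set-membership check: a udev rule keyed on the four GUID codes will not fire for a partition carrying any other code. The sketch below illustrates that check only; it is not how udev or ceph-disk implement it, and the function name `matches_ceph_udev_rules` is hypothetical. The value ebd0a0a2-b9e5-4433-87c0-68b6b72699c7 is the generic "basic data" GPT type.

```python
# The four GUID codes listed from the OSD udev rules in this message.
CEPH_TYPE_GUIDS = {
    "45b0969e-9b03-4f30-b4c6-5ec00ceff106",
    "45b0969e-9b03-4f30-b4c6-b4b80ceff106",
    "4fbd7e29-9d25-41b8-afd0-062c0ceff05d",
    "4fbd7e29-9d25-41b8-afd0-5ec00ceff05d",
}

# The code found on the pre-created partitions.
BASIC_DATA_GUID = "ebd0a0a2-b9e5-4433-87c0-68b6b72699c7"


def matches_ceph_udev_rules(type_guid):
    """True if the GUID code is one the Ceph udev rules would match on."""
    # GUIDs may be reported in upper or lower case, so normalise first.
    return type_guid.strip().lower() in CEPH_TYPE_GUIDS


print(matches_ceph_udev_rules(BASIC_DATA_GUID))
print(matches_ceph_udev_rules("4FBD7E29-9D25-41B8-AFD0-062C0CEFF05D"))
```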



Appreciate any help.

Regards,

Rama

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osds fails to start with mismatch in id

2014-11-10 Thread Irek Fasikhov
Hi, Ramakrishna.
I think you understand what the problem is:
[ceph@ceph05 ~]$ cat /var/lib/ceph/osd/ceph-56/whoami
56
[ceph@ceph05 ~]$ cat /var/lib/ceph/osd/ceph-57/whoami
57
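
The same check can be run across every OSD directory at once. This is a minimal sketch, assuming the default cluster name `ceph` (so directories are named `ceph-<id>`) and the default data root shown in the commands above; the function name `find_whoami_mismatches` is hypothetical.

```python
import os
import re


def find_whoami_mismatches(osd_root="/var/lib/ceph/osd"):
    """Return (directory_id, whoami_id) pairs where the two numbers differ."""
    mismatches = []
    for entry in sorted(os.listdir(osd_root)):
        match = re.fullmatch(r"ceph-(\d+)", entry)
        if match is None:
            continue  # not an OSD data directory
        whoami_path = os.path.join(osd_root, entry, "whoami")
        try:
            with open(whoami_path) as handle:
                stored_id = int(handle.read().strip())
        except (OSError, ValueError):
            continue  # not mounted, or whoami unreadable
        directory_id = int(match.group(1))
        if stored_id != directory_id:
            mismatches.append((directory_id, stored_id))
    return mismatches
```

Calling `find_whoami_mismatches()` on an OSD host returns an empty list when every mounted data directory holds the `whoami` value its name implies.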




Re: [ceph-users] osds fails to start with mismatch in id

2014-11-10 Thread Daniel Schwager
Hi Ramakrishna,

We use the physical path (containing the serial number) to each disk to avoid
complexity and wrong mappings. This path will never change:

/etc/ceph/ceph.conf
[osd.16]
devs = /dev/disk/by-id/scsi-SATA_ST4000NM0033-9Z_Z1Z0SDCY-part1
osd_journal = /dev/disk/by-id/scsi-SATA_INTEL_SSDSC2BA1BTTV330609AU100FGN-part1
...
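
To find the stable name for a partition currently known only by its kernel name, the by-id links can be resolved in reverse. This is a minimal sketch, assuming the usual udev layout under /dev/disk/by-id; the function name `stable_names` is hypothetical.

```python
import os


def stable_names(kernel_name, by_id_dir="/dev/disk/by-id"):
    """Return the by-id paths whose symlink resolves to the given kernel device."""
    names = []
    for link in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, link)
        # realpath follows the symlink, e.g. ../../sdb1 -> /dev/sdb1
        if os.path.basename(os.path.realpath(path)) == kernel_name:
            names.append(path)
    return names
```

For example, `stable_names("sdb1")` lists every by-id alias of /dev/sdb1 at that moment; any of them can then be used in ceph.conf in place of the kernel name.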

regards
Danny




[ceph-users] osds fails to start with mismatch in id

2014-11-09 Thread Ramakrishna Nishtala (rnishtal)
Hi,
I am on Ceph 0.87, RHEL 7.
Out of 60 OSDs, only a few start; the rest complain about mismatched IDs, as
below.

2014-11-09 07:09:55.501177 7f4633e01880 -1 OSD id 56 != my id 53
2014-11-09 07:09:55.810048 7f636edf4880 -1 OSD id 57 != my id 54
2014-11-09 07:09:56.122957 7f459a766880 -1 OSD id 58 != my id 55
2014-11-09 07:09:56.429771 7f87f8e0c880 -1 OSD id 0 != my id 56
2014-11-09 07:09:56.741329 7fadd9b91880 -1 OSD id 2 != my id 57
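
With 60 OSDs it helps to pull the mismatched pairs out of the logs mechanically. This is a minimal sketch that matches only the message format shown above; the names `parse_mismatches`, `found_id` and `my_id` are hypothetical, and which number is the on-disk id versus the daemon's own id is not stated in this thread, so the sketch simply follows the log wording.

```python
import re

# Matches the message text exactly as it appears in the log lines above.
MISMATCH_RE = re.compile(r"OSD id (\d+) != my id (\d+)")


def parse_mismatches(lines):
    """Return (found_id, my_id) pairs from 'OSD id X != my id Y' log lines."""
    pairs = []
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            pairs.append((int(match.group(1)), int(match.group(2))))
    return pairs


log_lines = [
    "2014-11-09 07:09:55.501177 7f4633e01880 -1 OSD id 56 != my id 53",
    "2014-11-09 07:09:55.810048 7f636edf4880 -1 OSD id 57 != my id 54",
    "2014-11-09 07:09:56.122957 7f459a766880 -1 OSD id 58 != my id 55",
    "2014-11-09 07:09:56.429771 7f87f8e0c880 -1 OSD id 0 != my id 56",
    "2014-11-09 07:09:56.741329 7fadd9b91880 -1 OSD id 2 != my id 57",
]
for found_id, my_id in parse_mismatches(log_lines):
    print(f"found {found_id}, expected {my_id}")
```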

I found one OSD ID in /var/lib/ceph/cluster-id/keyring. To test this, I
corrected it manually and also set authentication to none, but that did not help.

Any clues on how this can be corrected?
A few OSDs are up, though.

    cluster 580f6503-2271-44b0-8ee6-e95c8f1c87c6
     health HEALTH_WARN 3451 pgs stale; 3451 pgs stuck stale; 7/17 in osds are down
     monmap e1: 1 mons at {host=192.168.30.201:6789/0}, election epoch 1, quorum 0 host
     osdmap e410: 60 osds: 10 up, 17 in

Regards,

Rama


Re: [ceph-users] osds fails to start with mismatch in id

2014-11-09 Thread Gregory Farnum
On Sun, Nov 9, 2014 at 3:21 PM, Ramakrishna Nishtala (rnishtal)
<rnish...@cisco.com> wrote:
> Hi
>
> I am on ceph 0.87, RHEL 7
>
> Out of 60 few osd’s start and the rest complain about mismatch about id’s as
> below.
>
> 2014-11-09 07:09:55.501177 7f4633e01880 -1 OSD id 56 != my id 53
> 2014-11-09 07:09:55.810048 7f636edf4880 -1 OSD id 57 != my id 54
> 2014-11-09 07:09:56.122957 7f459a766880 -1 OSD id 58 != my id 55
> 2014-11-09 07:09:56.429771 7f87f8e0c880 -1 OSD id 0 != my id 56
> 2014-11-09 07:09:56.741329 7fadd9b91880 -1 OSD id 2 != my id 57
>
> Found one OSD ID in /var/lib/ceph/cluster-id/keyring. To check this out
> manually corrected it and turned authentication to none too, but did not
> help.
>
> Any clues, how it can be corrected?

It sounds like maybe the symlinks to data and journal aren't matching
up with where they're supposed to be. This is usually a result of
using unstable /dev links that don't always match to the same physical
disks. Have you checked that?
-Greg