RE: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Somnath Roy
But, do we know why jumbo frames may have an impact on peering?
In our setup so far, we haven't enabled jumbo frames for anything other than
performance reasons (if at all).

Thanks & Regards
Somnath

-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Robert 
LeBlanc
Sent: Tuesday, March 31, 2015 11:08 AM
To: Sage Weil
Cc: ceph-devel; Ceph-User
Subject: Re: [ceph-users] Force an OSD to try to peer

I was desperate for anything after exhausting every other possibility I could 
think of. Maybe I should put a checklist in the Ceph docs of things to look for.

Thanks,

On Tue, Mar 31, 2015 at 11:36 AM, Sage Weil s...@newdream.net wrote:
 On Tue, 31 Mar 2015, Robert LeBlanc wrote:
 Turns out jumbo frames were not enabled on all the switch ports. Once that
 was resolved the cluster quickly became healthy.

 I always hesitate to point the finger at the jumbo frames
 configuration but almost every time that is the culprit!

 Thanks for the update.  :)
 sage




 On Mon, Mar 30, 2015 at 8:15 PM, Robert LeBlanc rob...@leblancnet.us wrote:
  I've been working at this peering problem all day. I've done a lot
  of testing at the network layer and I just don't believe that we
  have a problem that would prevent OSDs from peering. When looking
  through osd_debug 20/20 logs, it just doesn't look like the OSDs are
  trying to peer. I don't know if it is because there are so many
  outstanding creations or what. OSDs will peer with OSDs on other
  hosts, but for some reason each one only contacts a certain number of peers
  and not the ones it needs to finish the peering process.
 
  I've checked: firewall, open files, number of threads allowed. These
  usually have given me an error in the logs that helped me fix the problem.
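
  As a rough sketch of that kind of host-level checklist (assuming a typical
  Linux OSD host; the 6789 monitor port and 6800+ OSD range are only the
  usual defaults, and the exact limits worth inspecting will vary):

    # Firewall: look for anything that could drop monitor (6789) or OSD (6800+) traffic.
    iptables -nL | grep -Ei 'drop|reject'

    # Open file descriptors actually available to each running OSD.
    for pid in $(pidof ceph-osd); do
        grep 'open files' /proc/$pid/limits
    done

    # Thread/process limits; OSDs spawn several threads per peer connection.
    ulimit -u
    sysctl kernel.pid_max kernel.threads-max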
 
  I can't find a configuration item that specifies how many peers an
  OSD should contact or anything that would be artificially limiting
  the peering connections. I've restarted the OSDs a number of times,
  as well as rebooting the hosts. I believe if the OSDs finish
  peering everything will clear up. I can't find anything in pg query
  that would help me figure out what is blocking it (peering blocked
  by is empty). The PGs are scattered across all the hosts so we can't pin 
  it down to a specific host.
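
  For anyone following along, the place a stuck PG reports what it is waiting
  on is the recovery_state section of pg query. A minimal sketch, using one of
  the PG ids from the output below (field names can differ slightly between
  releases):

    ceph pg 7.171 query > pg-7.171.json
    python -c 'import json,sys; print(json.dumps(json.load(sys.stdin)["recovery_state"], indent=2))' < pg-7.171.json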
 
  Any ideas on what to try would be appreciated.
 
  [ulhglive-root@ceph9 ~]# ceph --version
  ceph version 0.80.7 (6c0127fcb58008793d3c8b62d925bc91963672a3)
  [ulhglive-root@ceph9 ~]# ceph status
  cluster 48de182b-5488-42bb-a6d2-62e8e47b435c
   health HEALTH_WARN 1 pgs down; 1321 pgs peering; 1321 pgs
  stuck inactive; 1321 pgs stuck unclean; too few pgs per osd (17 < min 20)
   monmap e2: 3 mons at
  {mon1=10.217.72.27:6789/0,mon2=10.217.72.28:6789/0,mon3=10.217.72.2
  9:6789/0}, election epoch 30, quorum 0,1,2 mon1,mon2,mon3
   osdmap e704: 120 osds: 120 up, 120 in
pgmap v1895: 2048 pgs, 1 pools, 0 bytes data, 0 objects
  11447 MB used, 436 TB / 436 TB avail
   727 active+clean
   990 peering
37 creating+peering
 1 down+peering
   290 remapped+peering
 3 creating+remapped+peering
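
  For readers reproducing this, the PGs behind a warning like the one above
  can be enumerated with the stock CLI (a sketch; output formats vary a little
  between releases):

    # Per-PG detail for everything contributing to HEALTH_WARN.
    ceph health detail

    # PGs stuck inactive (peering PGs count as inactive); pick one to query.
    ceph pg dump_stuck inactive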
 
  { "state": "peering",
    "epoch": 707,
    "up": [
          40,
          92,
          48,
          91],
    "acting": [
          40,
          92,
          48,
          91],
    "info": { "pgid": "7.171",
        "last_update": "0'0",
        "last_complete": "0'0",
        "log_tail": "0'0",
        "last_user_version": 0,
        "last_backfill": "MAX",
        "purged_snaps": [],
        "history": { "epoch_created": 293,
            "last_epoch_started": 343,
            "last_epoch_clean": 343,
            "last_epoch_split": 0,
            "same_up_since": 688,
            "same_interval_since": 688,
            "same_primary_since": 608,
            "last_scrub": "0'0",
            "last_scrub_stamp": "2015-03-30 11:11:18.872851",
            "last_deep_scrub": "0'0",
            "last_deep_scrub_stamp": "2015-03-30 11:11:18.872851",
            "last_clean_scrub_stamp": "0.00"},
        "stats": { "version": "0'0",
            "reported_seq": 326,
            "reported_epoch": 707,
            "state": "peering",
            "last_fresh": "2015-03-30 20:10:39.509855",
            "last_change": "2015-03-30 19:44:17.361601",
            "last_active": "2015-03-30 11:37:56.956417",
            "last_clean": "2015-03-30 11:37:56.956417",
            "last_became_active": "0.00",
            "last_unstale": "2015-03-30 20:10:39.509855",
            "mapping_epoch": 683,
            "log_start": "0'0",
            "ondisk_log_start": "0'0",
            "created": 293,
            "last_epoch_clean": 343,
            "parent": "0.0",
            "parent_split_bits": 0,
            "last_scrub": "0'0",
            "last_scrub_stamp": "2015-03-30 11:11:18.872851",
            "last_deep_scrub": "0'0",
            "last_deep_scrub_stamp": "2015-03-30 11:11:18.872851",
            "last_clean_scrub_stamp": "0.00",
            "log_size": 0,
            "ondisk_log_size": 0,

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Robert LeBlanc
At the L2 level, if the hosts and switches don't accept jumbo frames,
they just drop them because they are too big. They are not fragmented
because they don't go through a router. My problem is that OSDs were
able to peer with other OSDs on the same host, but my guess is that they
never sent/received packets larger than 1500 bytes. Then other OSD
processes tried to peer but sent packets larger than 1500 bytes,
causing the packets to be dropped and peering to stall.
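
A quick way to confirm that kind of MTU mismatch between two OSD hosts is a
do-not-fragment ping at jumbo size. A sketch, where ceph10 and eth0 are
placeholders and 8972 is 9000 minus 28 bytes of IP/ICMP headers:

  # Fails if any switch port in the path does not pass ~9000-byte frames.
  ping -M do -s 8972 -c 3 ceph10

  # A standard 1500-byte path still works, which is why small heartbeat
  # traffic and same-host peering looked fine.
  ping -M do -s 1472 -c 3 ceph10

  # Check the MTU actually configured on the interface.
  ip link show dev eth0 | grep mtu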

On Tue, Mar 31, 2015 at 12:10 PM, Somnath Roy somnath@sandisk.com wrote:
 But, do we know why jumbo frames may have an impact on peering?
 In our setup so far, we haven't enabled jumbo frames for anything other than
 performance reasons (if at all).

 Thanks & Regards
 Somnath


RE: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread Sage Weil
On Tue, 31 Mar 2015, Somnath Roy wrote:
 But, do we know why jumbo frames may have an impact on peering?
 In our setup so far, we haven't enabled jumbo frames for anything other than
 performance reasons (if at all).

It's nothing specific to peering (or Ceph).  The symptom we've seen is
just that bytes stop passing across a TCP connection, usually when
largish messages are being sent.  The ping/heartbeat messages get through
because they are small and we disable Nagle, so they never end up in large
frames.

It's a pain to diagnose.

sage
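
One way to spot a connection stalled like that from the outside is to watch
for established TCP sockets between OSD hosts whose Send-Q never drains
(a sketch, assuming iproute2's ss and the default 6800-7300 OSD port range):

  # Unsent bytes piling up in Send-Q, and staying there across repeated runs,
  # usually mean large segments are being dropped somewhere in the path.
  ss -tn state established '( sport >= :6800 and sport <= :7300 )'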


 