RE: BGP question

2009-09-17 Thread Rens
That routes detail command doesn't really give me that much extra info over
the other one.

Could it be that my import at RIPE for my customer AS being set to action
pref=100 instead of 120 is what's causing this?
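
For reference, an RPSL import line in an aut-num object looks roughly like
the sketch below (AS numbers made up). Note that per RFC 2622 a smaller pref
value is more preferred, the opposite sense of BGP local-preference, and the
registry entry only changes router behaviour if your configs are actually
generated from the IRR:

  aut-num: AS64500
  import:  from AS64496 action pref=100; accept AS64496   # e.g. upstream
  import:  from AS64511 action pref=50;  accept AS64511   # e.g. customer, preferred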

Regards,

Rens
-Original Message-
From: Richard A Steenbergen [mailto:r...@e-gerbil.net] 
Sent: jeudi 17 septembre 2009 8:33
To: Rens
Cc: nanog@nanog.org
Subject: Re: BGP question

On Thu, Sep 17, 2009 at 06:49:37AM +0200, Rens wrote:
 *i customer_range    Local IX                        110      0 i
 *  customer_range    upstream provider               100      0 i
 *  customer_range    my customer               0     120      0 i
 
Last update to IP routing table: 0h23m34s, 1 path(s) installed:
Route is not advertised to any peers
 
...
 But it seems to choose the one via the local IX as best.
 
 When I disable local IX it sets the one with my upstream provider as best.
 
 Shouldn't it set the customer_range I receive from my customer as best,
 since it has the highest local pref, and advertise it to my upstream
 provider peer?

Yes, which means there must be something else wrong with that route that is
keeping it from getting installed as best path. That CLI output smells like
Foundry, and I don't remember exactly how it would show up there, but on
Cisco for example show ip bgp not only shows your RIB but also the adj-RIB-in
for neighbors which have soft-reconfiguration enabled. Such a route would
show up as (received) but not say used, though again I'm not sure if Foundry
does this or how it would show up.

If you aren't doing something silly like outright rejecting the route in
the route-map or prefix-list on the neighbor, make sure the next-hop is
valid and reachable. IIRC Foundry doesn't do bgp next-hop recursion by
default, you have to manually enable it. If I remember my Foundry BGP
correctly (despite many years and a lot of therapy trying to repress
those memories) you can see more details with show ip bgp routes detail
x.x.x.x.
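
As a rough sketch (commands from memory, neighbor addresses hypothetical),
the checks above would look something like this on Cisco:

  show ip bgp x.x.x.x
  show ip bgp neighbors y.y.y.y received-routes   (needs soft-reconfiguration inbound)
  show ip route z.z.z.z                           (is the BGP next-hop actually reachable?)

and something like this on Foundry:

  show ip bgp routes detail x.x.x.x
  router bgp
   next-hop-recursion                             (enable next-hop recursion, off by default)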

-- 
Richard A Steenbergen r...@e-gerbil.net   http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)




PPPoE design

2009-09-17 Thread Devangnp

Hi,

Are there any design considerations for PPPoE aggregation and routing? Any
good examples covering multiple customers or subnets?


Thanks,
Devang Patel



RE: Keepalives are temporarily in throttle due to closed TCP window

2009-09-17 Thread Michael Ruiz
And is that the one that traverses the 3550 with the 1500 byte MTU? 

Both connections traverse the 3550. I will disable the command on the 7206 
VXR. Thanks.

-Original Message-
From: Richard A Steenbergen r...@e-gerbil.net
To: Michael Ruiz mr...@telwestservices.com
Cc: Brian Dickson brian.dick...@concertia.com; nanog@nanog.org 
nanog@nanog.org
Sent: 9/16/09 8:58 PM
Subject: Re: Keepalives are temporarily in throttle due to closed TCP window

On Wed, Sep 16, 2009 at 06:47:10PM -0500, Michael Ruiz wrote:
 Either a) you have the mtu misconfigured on that 7206vxr
 
 That part is where I am at a loss.  How is it that the 6509 can establish
 an IBGP session with a 7606 when it has to go through the 7206 VXR?  The
 DS-3s are connected to the 7206 VXR. To add more depth to the story: I
 have 8 IBGP sessions that are connected to the 7206 VXR that have been
 up and running for over a year.  Some of the sessions traverse the DS-3s
 and/or GigE long-haul connections.  There are a total of 10 core routers
 that are a mixture of Cisco 7606s, 6509s, and 7206 VXRs w/ NPE-400s or
 G1s.  Only this one IBGP session out of the 9 routers is not being
 established.  Since I have a switch between the 7606 and 7206, I plan to
 put a packet capture server in place and see what I can see. 

And is that the one that traverses the 3550 with the 1500 byte MTU? 
Re-read what we said. You should be able to test the MTU theory by 
disabling path-mtu-discovery, which will cause the MSS to fall back to the 
minimum (536 bytes, i.e. the 576-byte minimum datagram size minus headers).
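
A rough sketch of that test on IOS (syntax from memory, neighbor address
hypothetical):

  conf t
   no ip tcp path-mtu-discovery
  end
  clear ip bgp y.y.y.y
  show ip bgp neighbors y.y.y.y | include segment

With PMTUD off, the neighbor output should show the default 536-byte max data
segment instead of one derived from the outgoing interface MTU; if the
session then stays up, the MTU theory looks right.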

-- 
Richard A Steenbergen r...@e-gerbil.net   http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)



RE: Keepalives are temporarily in throttle due to closed TCP window

2009-09-17 Thread Michael Ruiz
Oh you guys are going to love this... Before I could send out the maintenance 
notification for tonight to make changes, the session has come up and been up 
for 21 hours.  This is before I could put a packet capture server on the 
segment.  Sigh.

SNIP
  BGP state = Established, up for 21:29:25
  Last read 00:00:24, last write 00:00:02, hold time is 180, keepalive interval 
is 60 seconds
  Neighbor capabilities:


-Original Message-
From: Michael Ruiz 
Sent: Thursday, September 17, 2009 7:47 AM
To: Richard A Steenbergen; Michael Ruiz
Cc: Brian Dickson; nanog@nanog.org
Subject: RE: Keepalives are temporarily in throttle due to closed TCP window

And is that the one that traverses the 3550 with the 1500 byte MTU? 

Both connections traverse the 3550. I will disable the command on the 7206 
VXR. Thanks.

-Original Message-
From: Richard A Steenbergen r...@e-gerbil.net
To: Michael Ruiz mr...@telwestservices.com
Cc: Brian Dickson brian.dick...@concertia.com; nanog@nanog.org 
nanog@nanog.org
Sent: 9/16/09 8:58 PM
Subject: Re: Keepalives are temporarily in throttle due to closed TCP window

On Wed, Sep 16, 2009 at 06:47:10PM -0500, Michael Ruiz wrote:
 Either a) you have the mtu misconfigured on that 7206vxr
 
 That part is where I am at a loss.  How is it that the 6509 can establish
 an IBGP session with a 7606 when it has to go through the 7206 VXR?  The
 DS-3s are connected to the 7206 VXR. To add more depth to the story: I
 have 8 IBGP sessions that are connected to the 7206 VXR that have been
 up and running for over a year.  Some of the sessions traverse the DS-3s
 and/or GigE long-haul connections.  There are a total of 10 core routers
 that are a mixture of Cisco 7606s, 6509s, and 7206 VXRs w/ NPE-400s or
 G1s.  Only this one IBGP session out of the 9 routers is not being
 established.  Since I have a switch between the 7606 and 7206, I plan to
 put a packet capture server in place and see what I can see. 

And is that the one that traverses the 3550 with the 1500 byte MTU? 
Re-read what we said. You should be able to test the MTU theory by 
disabling path-mtu-discovery, which will cause the MSS to fall back to the 
minimum (536 bytes, i.e. the 576-byte minimum datagram size minus headers).

-- 
Richard A Steenbergen r...@e-gerbil.net   http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)


Re: Keepalives are temporarily in throttle due to closed TCP window

2009-09-17 Thread Richard A Steenbergen
On Thu, Sep 17, 2009 at 09:17:00AM -0500, Michael Ruiz wrote:
 Oh you guys are going to love this...Before I could send out the maintenance 
 notification for tonight to make changes, the session has been up for 21 
 hours.  This is before I could put a packet capture server on the segment.  
 Sigh 
 
 SNIP
   BGP state = Established, up for 21:29:25
   Last read 00:00:24, last write 00:00:02, hold time is 180, keepalive 
 interval is 60 seconds
   Neighbor capabilities:

You don't need to use an external sniffer; you can use debug ip packet
to see traffic being punted to the control plane, or in the case of the
6500 you can use ELAM or ERSPAN (though this is probably a little bit on
the advanced side). If this were an MTU mismatch, a sniffer wouldn't
reveal anything other than missing packets anyway, which you could just
as easily deduce from a debug or from looking at the retransmit counters
on the BGP neighbor.

http://cisco.cluepon.net/index.php/Using_capture_buffer_with_ELAM
http://cisco.cluepon.net/index.php/6500_SPAN_the_RP

My money is still on MTU mismatch. Assume the simplest and most likely 
explanation until proved otherwise.
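
A rough sketch of scoping that debug to just the one BGP session (addresses
hypothetical; on a production box always restrict debug ip packet with an
ACL, and remember it only shows process-switched/punted traffic):

  access-list 150 permit tcp host 10.0.0.1 host 10.0.0.2 eq bgp
  access-list 150 permit tcp host 10.0.0.1 eq bgp host 10.0.0.2
  debug ip packet 150 detail
  ...
  undebug all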

-- 
Richard A Steenbergen r...@e-gerbil.net   http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)



RE: Keepalives are temporarily in throttle due to closed TCP window

2009-09-17 Thread Michael Ruiz
http://cisco.cluepon.net/index.php/Using_capture_buffer_with_ELAM
http://cisco.cluepon.net/index.php/6500_SPAN_the_RP

Well, the IBGP session came up on its own and has now been up for 1 day
and 1 hour. I can honestly say this is a bizarre situation.  I will
use the above links if something like this or equally weird happens again.
Thank you all. 

-Original Message-
From: Richard A Steenbergen [mailto:r...@e-gerbil.net] 
Sent: Thursday, September 17, 2009 1:11 PM
To: Michael Ruiz
Cc: Brian Dickson; nanog@nanog.org
Subject: Re: Keepalives are temporarily in throttle due to closed TCP
window

On Thu, Sep 17, 2009 at 09:17:00AM -0500, Michael Ruiz wrote:
 Oh you guys are going to love this...Before I could send out the
maintenance notification for tonight to make changes, the session has
been up for 21 hours.  This is before I could put a packet capture
server on the segment.  Sigh 
 
 SNIP
   BGP state = Established, up for 21:29:25
   Last read 00:00:24, last write 00:00:02, hold time is 180, keepalive
interval is 60 seconds
   Neighbor capabilities:

You don't need to use an external sniffer; you can use debug ip packet
to see traffic being punted to the control plane, or in the case of the
6500 you can use ELAM or ERSPAN (though this is probably a little bit on
the advanced side). If this were an MTU mismatch, a sniffer wouldn't
reveal anything other than missing packets anyway, which you could just
as easily deduce from a debug or from looking at the retransmit counters
on the BGP neighbor.

http://cisco.cluepon.net/index.php/Using_capture_buffer_with_ELAM
http://cisco.cluepon.net/index.php/6500_SPAN_the_RP

My money is still on MTU mismatch. Assume the simplest and most likely 
explanation until proved otherwise.

-- 
Richard A Steenbergen r...@e-gerbil.net   http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)



Contact w/ clue re: AT&T SMS email gateway?

2009-09-17 Thread Dave Pascoe
Recently something seems to have changed with the @txt.att.net email to
SMS gateway.  Messages sent through the gateway suffer from the following:

1) Long delay in reaching the phone (intermittent)
   (yes I know there is no latency guarantee)

and, even more crippling,

2) The message comes through as just "SMS Message" instead of the SMTP
message content.  And the sender is always 410-000-01x, where x
increments by 1 with each new incoming email-to-SMS gateway-handled message.

Phone is an iPhone 3GS.  This has worked fine for quite a while.  No
changes on the iPhone.

I have gone through normal AT&T Wireless Customer Service but there
isn't much clue there - I had to explain what an email-to-SMS gateway is.

Anyone else seeing this?  Anyone from AT&T Wireless here?

Please contact me off-list.

TIA,
Dave Pascoe



Re: Contact w/ clue re: AT&T SMS email gateway?

2009-09-17 Thread Seth Mattinen
Dave Pascoe wrote:
 Recently something seems to have changed with the @txt.att.net email to
 SMS gateway.  Messages sent through the gateway suffer from the following:
 
 1) Long delay in reaching the phone (intermittent)
(yes I know there is no latency guarantee)
 
 and, even more crippling,
 
 2) Message comes through as just SMS Message instead of the SMTP
 message content.  And the sender is always 410-000-01x, where x
 increments by 1 with each new incoming email-to-SMS gateway-handled message.
 
 Phone is an iPhone 3GS.  This has worked fine for quite a while.  No
 changes on the iPhone.
 
 I have gone through normal AT&T Wireless Customer Service but there
 isn't much clue there - had to explain what an email to SMS gateway is.
 
 Anyone else seeing this?  Anyone from AT&T Wireless here?
 

I've had better luck with SNPP (Simple Network Paging Protocol) these
days and completely abandoned anything using SMTP gateways. Better
delivery performance in my experience than the SMTP gateways, but the
sender always shows up as 1600. I use Sprint.
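
For anyone who hasn't seen it, SNPP (RFC 1861) is a small SMTP-like dialogue
on TCP port 444; a session looks roughly like this (pager ID and reply text
illustrative):

  220 SNPP gateway ready
  PAGE 5551234567
  250 Pager ID accepted
  MESS Router X down at 03:12
  250 Message OK
  SEND
  250 Message sent
  QUIT
  221 Goodbye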

~Seth



Contact w/ clue re: AT&T SMS email gateway?

2009-09-17 Thread Crist Clark
 On 9/17/2009 at 12:03 PM, Dave Pascoe davek...@gmail.com wrote:
 Recently something seems to have changed with the @txt.att.net email to
 SMS gateway.  Messages sent through the gateway suffer from the following:
 
 1) Long delay in reaching the phone (intermittent)
(yes I know there is no latency guarantee)
 
 and, even more crippling,
 
 2) Message comes through as just SMS Message instead of the SMTP
 message content.  And the sender is always 410-000-01x, where x
 increments by 1 with each new incoming email-to-SMS gateway-handled message.
 
 Phone is an iPhone 3GS.  This has worked fine for quite a while.  No
 changes on the iPhone.
 
 I have gone through normal AT&T Wireless Customer Service but there
 isn't much clue there - had to explain what an email to SMS gateway is.
 
 Anyone else seeing this?  Anyone from AT&T Wireless here?

If you do find anyone, can you ask them about the really annoying
reject-after-DATA problem?

That is, if 555-555-1234 for some reason is not authorized to receive
SMS, you get a 250 response after RCPT TO, but a 5xx after the DATA
is sent. So if the message had multiple recipients, some of which are
allowed to receive SMS, the message then fails for all of them.

Verizon also has this problem, BTW.
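
In other words the dialogue looks roughly like this (addresses and reply
codes illustrative), which is what makes it so painful for multi-recipient
alerts:

  MAIL FROM:<alerts@example.net>
  250 OK
  RCPT TO:<5555551234@txt.att.net>
  250 OK
  RCPT TO:<5555556789@txt.att.net>     (this one can't actually receive SMS)
  250 OK
  DATA
  354 Start mail input
  ...
  .
  550 Rejected
  (the whole message is now lost, including for the recipient that was fine)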




cross connect reliability

2009-09-17 Thread Michael J McCafferty
All,
Today I had yet another cross-connect fail at our colo provider. From
memory, this is the 6th cross-connect to fail while in service, in 4yrs
and recently there was a bad SFP on their end as well. This seemes like
a high failure rate to me. When I asked about the high failure rate,
they said that they run a lot of cables and there is a lot of jiggling
and wiggling... lots of chances to get bent out of whack from activity
near my patches and cables.
Until a few years ago my time was spent mostly in single tenant data
centers, and it may be true that we made fewer cabling changes and made
less of a ruckus when cabling... but this still seems like a pretty high
failure rate at the colo.
I am curious; what do you expect the average reliability of your FastE
or GigE copper cross-connects at a colo?

Thanks,
Mike

-- 

Michael J. McCafferty
Principal, Security Engineer
M5 Hosting
http://www.m5hosting.com

You can have your own custom Dedicated Server up and running today !
RedHat Enterprise, CentOS, Ubuntu, Debian, OpenBSD, FreeBSD, and more





Re: cross connect reliability

2009-09-17 Thread Seth Mattinen
Michael J McCafferty wrote:
 All,
   Today I had yet another cross-connect fail at our colo provider. From
 memory, this is the 6th cross-connect to fail while in service, in 4yrs
 and recently there was a bad SFP on their end as well. This seemes like
 a high failure rate to me. When I asked about the high failure rate,
 they said that they run a lot of cables and there is a lot of jiggling
 and wiggling... lots of chances to get bent out of whack from activity
 near my patches and cables.
   Until a few years ago my time was spent mostly in single tenant data
 centers, and it may be true that we made fewer cabling changes and made
 less of a ruckus when cabling... but this still seems like a pretty high
 failure rate at the colo.
   I am curious; what do you expect the average reliability of your FastE
 or GigE copper cross-connects at a colo?
 

Never to fail? Seriously; if you're talking about a passive connection
(optical or electrical) like a patch panel, I'd expect it to keep going
forever unless someone damages it.

~Seth



Re: cross connect reliability

2009-09-17 Thread William Pitcock
We have never had an xconnect fail, ever, and we have several; this is over a 
6-year period.

William
--Original Message--
From: Michael J McCafferty
To: nanog
Subject: cross connect reliability
Sent: Sep 17, 2009 4:45 PM

All,
Today I had yet another cross-connect fail at our colo provider. From
memory, this is the 6th cross-connect to fail while in service, in 4yrs
and recently there was a bad SFP on their end as well. This seemes like
a high failure rate to me. When I asked about the high failure rate,
they said that they run a lot of cables and there is a lot of jiggling
and wiggling... lots of chances to get bent out of whack from activity
near my patches and cables.
Until a few years ago my time was spent mostly in single tenant data
centers, and it may be true that we made fewer cabling changes and made
less of a ruckus when cabling... but this still seems like a pretty high
failure rate at the colo.
I am curious; what do you expect the average reliability of your FastE
or GigE copper cross-connects at a colo?

Thanks,
Mike

-- 

Michael J. McCafferty
Principal, Security Engineer
M5 Hosting
http://www.m5hosting.com

You can have your own custom Dedicated Server up and running today !
RedHat Enterprise, CentOS, Ubuntu, Debian, OpenBSD, FreeBSD, and more





-- 
William Pitcock
SystemInPlace - Simple Hosting Solutions
1-866-519-6149

RE: cross connect reliability

2009-09-17 Thread David Hubbard
From: Michael J McCafferty [mailto:m...@m5computersecurity.com] 
 
 All,
   Today I had yet another cross-connect fail at our colo 
 provider. From memory, this is the 6th cross-connect to
 fail while in service, in 4yrs and recently there was a
 bad SFP on their end as well. This seemes like a high
 failure rate to me. When I asked about the high failure
 rate, they said that they run a lot of cables and there
 is a lot of jiggling and wiggling... lots of chances to
 get bent out of whack from activity near my patches and
 cables.

   I am curious; what do you expect the average 
 reliability of your FastE
 or GigE copper cross-connects at a colo?

You're seeing these failures on copper?!  I've
worked in some places with absolute cabling
nightmares on the copper side of things in wiring
closets, but those were all enterprise situations
and we still rarely had failures of ports, and even
fewer on the wiring itself.  I'd never expect there
to be frequent copper issues in a colo, where you
would not have anywhere near the volume of moving
cables around that goes on in an enterprise.

On the fiber side, your experience still seems high,
although it's obviously much easier to damage fiber
cables.  Perhaps they have a big mess of a fiber
plant, have to fish-tape new runs through it,
and like to snag adjacent cables, etc.  It can only
get worse if that's the problem and they don't
clean it up.

David



RE: cross connect reliability

2009-09-17 Thread Michael K. Smith - Adhost
Hello Michael:

 -Original Message-
 From: Michael J McCafferty [mailto:m...@m5computersecurity.com]
 Sent: Thursday, September 17, 2009 2:46 PM
 To: nanog
 Subject: cross connect reliability
 
 All,
   Today I had yet another cross-connect fail at our colo provider.
 From
 memory, this is the 6th cross-connect to fail while in service, in
4yrs
 and recently there was a bad SFP on their end as well. This seemes
like
 a high failure rate to me. When I asked about the high failure rate,
 they said that they run a lot of cables and there is a lot of jiggling
 and wiggling... lots of chances to get bent out of whack from activity
 near my patches and cables.
   Until a few years ago my time was spent mostly in single tenant
 data
 centers, and it may be true that we made fewer cabling changes and
made
 less of a ruckus when cabling... but this still seems like a pretty
 high
 failure rate at the colo.
   I am curious; what do you expect the average reliability of your
 FastE
 or GigE copper cross-connects at a colo?
 
 Thanks,
 Mike
 
I agree with their Reason for Outage, but it sounds like a design issue.
We prewire all of our switches to patch panels so they don't get touched
once they're installed.  The patch panels are much more friendly to
insertions and removals than a 48 port 1-U switch.  We also have
multiple connections on the fiber side to avoid those failures.  With
all of that, we still have failures, but their effect and frequency are
minimized.

Mike

--
Michael K. Smith - CISSP, GISP
Chief Technical Officer - Adhost Internet LLC mksm...@adhost.com
w: +1 (206) 404-9500 f: +1 (206) 404-9050
PGP: B49A DDF5 8611 27F3  08B9 84BB E61E 38C0 (Key ID: 0x9A96777D)





Re: cross connect reliability

2009-09-17 Thread Alex Balashov

Seth Mattinen wrote:

Michael J McCafferty wrote:

All,
Today I had yet another cross-connect fail at our colo provider. From
memory, this is the 6th cross-connect to fail while in service, in 4yrs
and recently there was a bad SFP on their end as well. This seemes like
a high failure rate to me. When I asked about the high failure rate,
they said that they run a lot of cables and there is a lot of jiggling
and wiggling... lots of chances to get bent out of whack from activity
near my patches and cables.
Until a few years ago my time was spent mostly in single tenant data
centers, and it may be true that we made fewer cabling changes and made
less of a ruckus when cabling... but this still seems like a pretty high
failure rate at the colo.
I am curious; what do you expect the average reliability of your FastE
or GigE copper cross-connects at a colo?



Never to fail? Seriously; if you're talking about a passive connection
(optical or electrical) like a patch panel, I'd expect it to keep going
forever unless someone damages it.


That's truly wishful thinking, as are the assumptions that insulate it 
from damaging factors.  Nothing lasts forever.


--
Alex Balashov - Principal
Evariste Systems
Web : http://www.evaristesys.com/
Tel : (+1) (678) 954-0670
Direct  : (+1) (678) 954-0671



Re: cross connect reliability

2009-09-17 Thread Marshall Eubanks


On Sep 17, 2009, at 5:52 PM, Seth Mattinen wrote:


Michael J McCafferty wrote:

All,
Today I had yet another cross-connect fail at our colo provider. From
memory, this is the 6th cross-connect to fail while in service, in 4yrs
and recently there was a bad SFP on their end as well. This seemes like
a high failure rate to me. When I asked about the high failure rate,
they said that they run a lot of cables and there is a lot of jiggling
and wiggling... lots of chances to get bent out of whack from activity
near my patches and cables.
Until a few years ago my time was spent mostly in single tenant data
centers, and it may be true that we made fewer cabling changes and made
less of a ruckus when cabling... but this still seems like a pretty high
failure rate at the colo.
I am curious; what do you expect the average reliability of your FastE
or GigE copper cross-connects at a colo?


Never to fail? Seriously; if you're talking about a passive connection
(optical or electrical) like a patch panel, I'd expect it to keep going
forever unless someone damages it.



Or until someone pulls out the wrong cable (which has happened to me).

Regards
Marshall



~Seth







Re: cross connect reliability

2009-09-17 Thread Justin Wilson - MTIN

From: Michael J McCafferty m...@m5computersecurity.com
Organization: M5Hosting
Date: Thu, 17 Sep 2009 14:45:36 -0700
To: nanog nanog@nanog.org
Subject: cross connect reliability

All,
 Today I had yet another cross-connect fail at our colo provider. From
memory, this is the 6th cross-connect to fail while in service, in 4yrs
and recently there was a bad SFP on their end as well. This seemes like
a high failure rate to me. When I asked about the high failure rate,
they said that they run a lot of cables and there is a lot of jiggling
and wiggling... lots of chances to get bent out of whack from activity
near my patches and cables.
 Until a few years ago my time was spent mostly in single tenant data
centers, and it may be true that we made fewer cabling changes and made
less of a ruckus when cabling... but this still seems like a pretty high
failure rate at the colo.
 I am curious; what do you expect the average reliability of your FastE
or GigE copper cross-connects at a colo?

Thanks,
Mike



Does the colo let anyone run cables, or do they have approved
contractors?  It sounds like a design issue to me in the way the cables
are treated.  In 4 years at a busy colo we have had one copper cross-connect
not act right.  It would pass data but was flaky.  We replaced it, because it
was an easy run, just to rule it out.

I am assuming you are in shared space.  If so, I would investigate your
weak points (which I am sure you already are doing).

Justin



Re: cross connect reliability

2009-09-17 Thread Seth Mattinen
Alex Balashov wrote:
 Seth Mattinen wrote:
 Michael J McCafferty wrote:
 All,
 Today I had yet another cross-connect fail at our colo provider.
 From
 memory, this is the 6th cross-connect to fail while in service, in 4yrs
 and recently there was a bad SFP on their end as well. This seemes like
 a high failure rate to me. When I asked about the high failure rate,
 they said that they run a lot of cables and there is a lot of jiggling
 and wiggling... lots of chances to get bent out of whack from activity
 near my patches and cables.
 Until a few years ago my time was spent mostly in single tenant data
 centers, and it may be true that we made fewer cabling changes and made
 less of a ruckus when cabling... but this still seems like a pretty high
 failure rate at the colo.
 I am curious; what do you expect the average reliability of your
 FastE
 or GigE copper cross-connects at a colo?


 Never to fail? Seriously; if you're talking about a passive connection
 (optical or electrical) like a patch panel, I'd expect it to keep going
 forever unless someone damages it.
 
 That's truly wishful thinking, as are the assumptions that insulate it
 from damaging factors.  Nothing lasts forever.
 

What the OP is describing is abnormally high in my view.

Based purely on my own personal experience, the structured wiring in my
parents' house that I put in in the mid-90s has never suffered a failure, is
still in use today, and it's in a residential environment with dogs and
cats. I'd expect a properly managed environment to fare at least as well
as that.

~Seth




Re: cross connect reliability

2009-09-17 Thread Charles Wyble



Marshall Eubanks wrote:


On Sep 17, 2009, at 5:52 PM, Seth Mattinen wrote:


Michael J McCafferty wrote:

All,
Today I had yet another cross-connect fail at our colo provider. From
memory, this is the 6th cross-connect to fail while in service, in 4yrs
and recently there was a bad SFP on their end as well. This seemes like
a high failure rate to me. When I asked about the high failure rate,
they said that they run a lot of cables and there is a lot of jiggling
and wiggling... lots of chances to get bent out of whack from activity
near my patches and cables.
Until a few years ago my time was spent mostly in single tenant data
centers, and it may be true that we made fewer cabling changes and made
less of a ruckus when cabling... but this still seems like a pretty high
failure rate at the colo.
I am curious; what do you expect the average reliability of your FastE
or GigE copper cross-connects at a colo?



Never to fail? Seriously; if you're talking about a passive connection
(optical or electrical) like a patch panel, I'd expect it to keep going
forever unless someone damages it.



Or until someone pulls out the wrong cable (which has happened to me).



That's not a failure though. It's a disconnection. It happens but is 
readily attributable to a cause.


Random failures of a single port's connectivity are bizarre and annoying.
Whole switches? Seen it.
Whole panels? Seen it.
Whole blades? Seen it.

Single port on a switch or patch panel? Never.



RE: cross connect reliability

2009-09-17 Thread Deepak Jain

[lots of stuff deleted]. 

We've seen cross-connects fail at sites like E and others. Generally 
speaking, it is a human-error issue and not a component failure one. Either 
people are being sloppy and aren't reading labels, or the labels aren't there. 

In a cabinet situation, every cabinet does not necessarily home back to its own 
patch panel, so some trashing may occur -- it can be avoided with good design 
[cables in the back stay there, etc].

When you are talking about optics failing and they are providing smart 
cross-connects, almost anything is possible. 

The true telltale is whether you have to call when the cross-connect goes 
down, or if it just bounces. Either way, have them take you to their 
cross-connect room and show you their mess. Once you see it, you'll know what 
to expect going forward.

Deepak


Re: cross connect reliability

2009-09-17 Thread Mike Lieman
We have a winner!

On Thu, Sep 17, 2009 at 5:59 PM, Marshall Eubanks t...@americafree.tvwrote:


 Or until someone pulls out the wrong cable (which has happened to me).

 Regards
 Marshall


  ~Seth







MPLS Multi-vrf and IP Multicast

2009-09-17 Thread devang patel
Hello All,

Are there any scenarios where MPLS is used between PE and CE when the CE is a
multi-VRF router -- any deployments in the real world? Carrier Supporting
Carrier (CSC) is one case where you have MPLS on the PE-CE link.
If PE and CE are running MPLS between them, what will the impact be on
multicast between the two sites?
If the PE-CE connectivity is pure IP, then I think multicast will work
properly, right?

thanks,
Devang Patel


Re: cross connect reliability

2009-09-17 Thread Pete Carah
On 09/17/2009 06:37 PM, Deepak Jain wrote:
 
 [lots of stuff deleted]. 
 

A famous one that can happen with some techs is that they make jumpers
from solid wire with generic RJ45 plugs (yes, I've seen this recently
from several folks who should know better).  These will last somewhere
around a year (long enough to forget when they were installed), then
randomly fail from just fan vibration or slight breezes.  There are RJ45
plugs made for solid wire (they have 3 little prongs instead of 2, and they
are offset to straddle the wire), but I feel that even these can go bad.
I know that if the techs are properly educated this will never happen
(tm)...  (till someone needs a custom-length jumper on a Sunday...)
(for which one colo building has an Ace Hardware with most of the right
stuff, but unfortunately most don't).

As we all (should) know, all solid-wire cable should terminate in a
panel and proper short jumpers (pref. with molded strain-relief) are
used for the rest.

-- Pete



Re: cross connect reliability

2009-09-17 Thread Jon Lewis
Not really.  That's all too easy to diagnose and fix.  Poorly terminated 
and/or mistreated cabling is far more likely.  I wrote a long post about 
all the crap termination and poor treatment I've seen... but canceled the 
message.


On Thu, 17 Sep 2009, Mike Lieman wrote:


We have a winner!

On Thu, Sep 17, 2009 at 5:59 PM, Marshall Eubanks t...@americafree.tvwrote:



Or until someone pulls out the wrong cable (which has happened to me).

Regards
Marshall


 ~Seth











--
 Jon Lewis   |  I route
 Senior Network Engineer |  therefore you are
 Atlantic Net|
_ http://www.lewis.org/~jlewis/pgp for PGP public key_



Re: cross connect reliability

2009-09-17 Thread Mike Lieman
Because no-one is stealing pairs anymore?

On Thu, Sep 17, 2009 at 7:23 PM, Jon Lewis jle...@lewis.org wrote:

 Not really.  That's all too easy to diagnose and fix.  Poorly terminated
 and or mistreated cabling is far more likely.  I wrote a long post about all
 the crap termination and poor treatment I've seen...but canceled the
 message.

 On Thu, 17 Sep 2009, Mike Lieman wrote:

  We have a winner!

 On Thu, Sep 17, 2009 at 5:59 PM, Marshall Eubanks t...@americafree.tv
 wrote:


 Or until someone pulls out the wrong cable (which has happened to me).

 Regards
 Marshall


  ~Seth








 --
  Jon Lewis   |  I route
  Senior Network Engineer |  therefore you are
  Atlantic Net|
 _ http://www.lewis.org/~jlewis/pgp for PGP public key_



Re: cross connect reliability

2009-09-17 Thread Mark Andrews

In message 20090917234547.gt51...@gerbil.cluepon.net, Richard A Steenbergen
writes:
 On Thu, Sep 17, 2009 at 03:35:37PM -0700, Charles Wyble wrote:
  
  Random failures of a single ports connectivity bizzare and annoying. 
  Whole switches? Seen it.
  Whole panels? Seen it.
  Whole blades? Seen it.
  
  Single port on a switch or patch panel? Never.
 
 You've never seen a single port go bad on a switch? I can't even count
 the number of times I've seen that happen. Not that I'm suggesting 
 the OP wasn't the victim of a human error like unplugging the wrong port 
 and they just lied to him; that happens even more.
 
 My favorite bizarre random failure story is a toss-up between one of 
 these two:
 
 Story 1. Had a customer report that they weren't able to transfer this
 one particular file over their connection. The transfer would start and
 then at a certain point the tcp session would just lock up. After a lot
 of head scratching, it turned out that for 8 ports on a 24 port FastE
 switch blade, this certain combination of bytes caused the packet to be
 dropped on this otherwise perfectly normal and functioning card, thus
 stalling the tcp session while leaving everything around it unaffected.
 If you moved them to a different port outside this group of 8, or used
 https, or uuencoded it, it would go through fine.

Seen that more than once.  It's worse when it's in some router on the
other side of the planet and you're just a lowly customer.
 
 Story 2. Had a customer report that they were getting extremely slow 
 transfers to another network, despite not being able to find any packet 
 loss. Shifting the traffic to a different port to reach the same network 
 resolved the problem. After removing the traffic and attempting to ping 
 the far side, I got the following:
 
 drop
 64 bytes from x.x.x.x: icmp_seq=1 ttl=61 time=0.194 ms
 64 bytes from x.x.x.x: icmp_seq=2 ttl=61 time=0.196 ms
 64 bytes from x.x.x.x: icmp_seq=3 ttl=61 time=0.183 ms
 64 bytes from x.x.x.x: icmp_seq=0 ttl=61 time=4.159 ms
 drop
 64 bytes from x.x.x.x: icmp_seq=5 ttl=61 time=0.194 ms
 64 bytes from x.x.x.x: icmp_seq=6 ttl=61 time=0.196 ms
 64 bytes from x.x.x.x: icmp_seq=7 ttl=61 time=0.183 ms
 64 bytes from x.x.x.x: icmp_seq=4 ttl=61 time=4.159 ms
 
 After a little bit more testing, it turned out that every 4th packet
 that was being sent to the peers' router was being queued until another
 4th packet would come along and knock it out. If you increased the
 interval time of the ping, you would see the amount of time the packet
 spent in the queue increase. At one point I had it up to over 350
 seconds (not milliseconds) that the packet stayed in the other routers'
 queue before that 4th packet came along and knocked it free. I suspect
 it could have gone higher, but random scanning traffic on the internet
 was coming in. When there was a lot of traffic on the interface you
 would never see the packet loss, just reordering of every 4th packet and 
 thus slow tcp transfers. :)
 
 -- 
 Richard A Steenbergen r...@e-gerbil.net   http://www.e-gerbil.net/ras
 GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
 
-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742 INTERNET: ma...@isc.org