Re: TWC/Charter/Spectrum contact off-list ? (Reverse DNS issue)

2018-03-19 Thread Ross Vandegrift
On Thu, Oct 19, 2017 at 08:04:12AM -0400, Brandon Applegate wrote:
> I had success with this issue about 2 years ago when some TWC folks
> contacted me.  I don’t know if those folks are still with TWC/Charter
> here in the end of 2017 - hence posting on NANOG.  The tl;dr is IPv6
> reverse DNS issues.  It was broken, got fixed, and seems to have
> broken again recently.

Did you ever get a response or make progress?  I got a ticket escalated
to engineering in mid-December about 2606:6000::/32.  Just learned from
support that it was closed without a resolution.

Thanks,
Ross


Anyone around from AS7459?

2010-06-02 Thread Ross Vandegrift
Hey,

If you're from AS7459, you're announcing more specifics from one of
our prefixes.  Please drop me a line off-list, it's making my
afternoon a drag.

Ross
-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie




Re: Mitigating human error in the SP

2010-02-03 Thread Ross Vandegrift
On Mon, Feb 01, 2010 at 09:46:07PM -0500, Stefan Fouant wrote:
> Vijay Gill had some real interesting insights into this in a
> presentation he gave back at NANOG 44:
>
> http://www.nanog.org/meetings/nanog44/presentations/Monday/Gill_programatic_N44.pdf
>
> His Blog article on "Infrastructure is Software" further expounds
> upon the benefits of such an approach -
> http://vijaygill.wordpress.com/2009/07/22/infrastructure-is-software/
>
> That stuff is light years ahead of anything anybody is doing today
> (well, apart from maybe Vijay himself ;) ... but IMO it's where we
> need to start heading.

Vijay's stuff is fascinating.  The vision is great.  But in my
experience, the vendors and implementations basically ruin the dream
for anyone who doesn't have his pull.

I'm sure my software is nowhere close to being as sophisticated as
his, but my plans are pretty much in line with his suggestions.  Some
problems I've run into that I don't see any kind of solution for:

1) Forwarding-impacting bugs: IOS bugs that are triggered by SNMP are
easily the #1 cause of our accidental service impact.  Most seem to be
race conditions that require real-world config and forwarding load -
not something a small shop can afford to build a lab to reproduce.  If
we stuck to manual deployment, we might have made a few mistakes but
would it have been worse?  Maybe - but honestly, it could be a wash.

2) Vendor support is highly suspicious of automation: anytime I open a
ticket, even unrelated to an automated software process, the first
thing the vendor support demands is to disable all automation.
Juniper is by far the best about this, and they *still* don't actually
believe their own automation tools work.  Cisco TAC's answer has
always been "don't ever use SNMP if it causes crashes!"  Procurve
doesn't even bother to respond to tickets related to automation bugs,
even if they are remotely triggerable crashes in the default config.

3) Automation interfaces are largely unsupported: I imagine vendor
software development having one or two guys that are the masterminds
for SNMP/NETCONF/whatever - and that's it.  When I have a question on
how to find a particular tool, or find a bug in an automation
function, I can often go months on a ticket with people that have no
idea what I'm talking about.  What documentation exists is typically
incomplete or inconsistent across versions and product lines.

4) Related tools prevent reliable error reporting: as far as I can
tell, Net-SNMP returns random values if a request fails; if there's a
pattern, I've failed to discern it.  expect is similar.  ScreenOS's
SSH implementation always returns that a file copy failed.  Procurve
only this year implemented ssh key-based auth in combination with
remote authentication.  The best-of-breed seems to be an oft-pathetic
collection of tools.

5) Management support: developing automation software is hard - network
devices aren't nearly as easy to deal with as they should be.  When I
spend weeks developing a feature that later causes IOS to spontaneously
reload, people that don't understand the relation to operational
impact start to advocate dismantling the automation just like the
vendors above.
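
On point 4, one defensive pattern is to treat an external tool's result
as failed unless every available signal says it succeeded.  A minimal
Python sketch, assuming the tool is run as a subprocess; run_strict and
ToolError are hypothetical names, and this does not fix the Net-SNMP or
expect behaviour described above, it only refuses to trust ambiguous
results:

```python
import subprocess
import sys


class ToolError(RuntimeError):
    """Raised when an external tool cannot be shown to have succeeded."""


def run_strict(argv, timeout=10.0):
    """Run argv and return its stdout, raising ToolError on any sign of failure."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired as exc:
        raise ToolError(f"timed out after {timeout}s: {argv}") from exc
    except OSError as exc:
        raise ToolError(f"could not start {argv}: {exc}") from exc

    # Assumption: a well-behaved run exits 0, writes nothing to stderr,
    # and produces some output.  Anything else is treated as a failure.
    if proc.returncode != 0:
        raise ToolError(f"exit status {proc.returncode}: {proc.stderr.strip()}")
    if proc.stderr.strip():
        raise ToolError(f"unexpected stderr: {proc.stderr.strip()}")
    if not proc.stdout.strip():
        raise ToolError("empty output")
    return proc.stdout


if __name__ == "__main__":
    # Stand-in command; a real caller would pass its own tool invocation.
    print(run_strict([sys.executable, "-c", "print('ok')"]).strip())
```

The checks are deliberately pessimistic: a tool that legitimately writes
warnings to stderr would need that rule relaxed.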

I'm sure we'll continue to build automated policy and configuration
tools.  I'm just not convinced it's the panacea that everyone thinks.
Unless you're one of the biggest, it puts your network at someone
else's mercy - and that someone else doesn't care about your
operational expenses.

Ross

-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie




Re: OT: VSS + MEC - port-channel dynamically cloned?

2009-11-24 Thread Ross Vandegrift
On Tue, Nov 24, 2009 at 07:51:29AM +0100, Leland Vandervort wrote:
> Essentially, for all of the MEC connections, the VSS has created a clone
> of the configured port-channel to bind the actual physical connections,
> rather than binding them under the configured port-channel (and suffixed
> the port-channel number with A or B depending on which chassis was first
> to bind).

IOS does this when ethernet channel members cannot join the bundle due
to negotiation mismatch.  If the currently active elements are
incompatible with a new element, the A/B interfaces are created.
These are called secondary aggregators in IOS-speak.

http://www.cisco.com/en/US/tech/tk389/tk213/technologies_configuration_example09186a0080094470.shtml#po1a

-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie




Re: OT: VSS + MEC - port-channel dynamically cloned?

2009-11-24 Thread Ross Vandegrift
On Tue, Nov 24, 2009 at 10:19:33PM +0100, Leland Vandervort wrote:
> In this case, though I cannot see where the mismatch is given that the
> encapsulation, trunking (vlans allowed, etc.) and channel mode (LACP)
> are all configured identically across all ports and the channel itself.
>
> Just wondering if it's a left-over from before the VSS migration when
> the original trunks were two separate etherchannels and then migrated
> them live to MEC...

Check flow control between all of the elements.  The only time I've
seen this was inconsistent flow control settings between different
media types on an F5 BIG-IP - 6500 bundle.

show interfaces flowcontrol
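
Once the settings are collected, comparing every member against the
first one makes that kind of mismatch easy to spot.  A minimal Python
sketch; the port names, the attribute keys and the find_mismatches
helper are all made up for illustration, and it does not parse real
"show interfaces flowcontrol" output:

```python
def find_mismatches(members):
    """Return {port: {attr: (expected, actual)}} for ports that differ from the first member."""
    ports = list(members)
    if not ports:
        return {}
    reference = members[ports[0]]
    mismatches = {}
    for port in ports[1:]:
        settings = members[port]
        diffs = {
            attr: (reference.get(attr), settings.get(attr))
            for attr in set(reference) | set(settings)
            if reference.get(attr) != settings.get(attr)
        }
        if diffs:
            mismatches[port] = diffs
    return mismatches


# Hypothetical flow control settings for three bundle members.
bundle = {
    "Gi1/1": {"send": "off", "receive": "on"},
    "Gi1/2": {"send": "off", "receive": "on"},
    "Te2/1": {"send": "off", "receive": "off"},
}

if __name__ == "__main__":
    for port, diffs in find_mismatches(bundle).items():
        print(port, diffs)
```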

Ross

-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie




Re: Data Center testing

2009-08-26 Thread Ross Vandegrift
On Tue, Aug 25, 2009 at 12:53:10PM +, Jeff Aitken wrote:
> you have to have some way of describing the desired state of the network in
> machine-parsable format

Any suggested tools for describing the desired state of the network?

NDL, the only option I'm familiar with, is just a brute-force approach
to describing routers in XML.  This is hardly better than a
router-config, and the visualizations break down on any graph with
more than a few nodes or edges.  I'd need thousands to describe
customer routers.

Or do we just give up on describing all of those customer-facing
interfaces, and only manage descriptions for the service-provider part
of the network?  This seems to be what people actually do with network
descriptions (oversimplify), and that doesn't seem like much of a
description to me.

Is there a practical middle-ground between dismissing a multitude of
relevant customer configuration and the data overload created by
merely replicating the entire network config in a new language?

Ross

-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie



Re: DNS hardening, was Re: Dan Kaminsky

2009-08-06 Thread Ross Vandegrift
On Thu, Aug 06, 2009 at 03:16:25PM +, Paul Vixie wrote:
>> ...: Do loadbalancers, or loadbalanced deployments, deal with this
>> properly? (loadbalancers like F5, citrix, radware, cisco, etc...)
>
> as far as i know, no loadbalancer understands SCTP today.  if they can be
> made to pass SCTP through unmodified and only do their enhanced L4 on UDP
> and TCP as they do now, all will be well.  if not then a loadbalancer
> upgrade or removal will be nec'y for anyone who wants to deploy SCTP.

F5 BIG-IP 10.0 has support for load balancing SCTP.  I have not tested
or implemented it.  I do not know what feature parity exists with
other protocols.  But at least it's documented and supported.

-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie



Re: Network diagram software

2009-02-11 Thread Ross Vandegrift
On Wed, Feb 11, 2009 at 02:06:09PM +0100, Mathias Wolkert wrote:
> I'd like to know what software people are using to document networks.
> Visio is obvious but feels like a straight jacket to me.
> I liked netviz but it seems owned by CA and unsupported nowadays.
>
> What do you use?

I'd like to add a second request.  I often want to very quickly
mock up a diagram that I'm going to use for myself or for internal
purposes.

Is there any application that takes some kind of *simple* description
and produces a (possibly not so beautiful) picture?  For example, I
might say something like:

Router(rtr1) connects to vlan 100
Router(rtr2) connects to Router(rtr1) via T1
Switch(sw1) connects to vlan 100
Switch(sw2) connects to Router(rtr2)
A few hosts connect to Switch(sw1)
A few hosts connect to Switch(sw2)
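
To show how little machinery such a format would need, here is a rough
Python sketch that turns lines of that shape into Graphviz DOT text.
The grammar ("X connects to Y [via Z]"), the to_dot function and the
shape choices are my own assumptions, not an existing tool:

```python
import re

LINE = re.compile(r"^(?P<src>.+?) connects? to (?P<dst>.+?)(?: via (?P<via>.+))?$")
NODE = re.compile(r"^(?P<kind>\w+)\((?P<name>[^)]+)\)$")
SHAPES = {"router": "ellipse", "switch": "box"}  # arbitrary choices


def to_dot(description):
    """Convert 'X connects to Y [via Z]' lines into an undirected DOT graph."""
    nodes = {}  # node id -> (label, shape)
    edges = []  # (source id, destination id, optional link label)

    def add_node(text, unique_suffix=None):
        text = text.strip()
        match = NODE.match(text)
        if match:
            ident = label = match.group("name")
            shape = SHAPES.get(match.group("kind").lower(), "plaintext")
        else:
            # Free text such as "vlan 100" or "A few hosts".
            ident = re.sub(r"\s+", "", text.lower())
            if unique_suffix is not None:
                ident = f"{ident}_{unique_suffix}"
            label, shape = text, "plaintext"
        nodes.setdefault(ident, (label, shape))
        return ident

    for lineno, raw in enumerate(description.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        # Unnamed sources ("A few hosts") stay distinct per line;
        # unnamed destinations ("vlan 100") are shared by name.
        src = add_node(match.group("src"), unique_suffix=lineno)
        dst = add_node(match.group("dst"))
        edges.append((src, dst, match.group("via")))

    out = ["graph network {"]
    for ident, (label, shape) in nodes.items():
        out.append(f'  "{ident}" [label="{label}", shape={shape}];')
    for src, dst, via in edges:
        attr = f' [label="{via}"]' if via else ""
        out.append(f'  "{src}" -- "{dst}"{attr};')
    out.append("}")
    return "\n".join(out)


EXAMPLE = """\
Router(rtr1) connects to vlan 100
Router(rtr2) connects to Router(rtr1) via T1
Switch(sw1) connects to vlan 100
Switch(sw2) connects to Router(rtr2)
A few hosts connect to Switch(sw1)
A few hosts connect to Switch(sw2)
"""

if __name__ == "__main__":
    print(to_dot(EXAMPLE))
```

Saving the output to a file and running it through Graphviz (for
example "dot -Tpng net.dot -o net.png") gives the not-so-beautiful
picture.  Labels containing double quotes are not escaped here.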

-- 
Ross Vandegrift
r...@kallisti.us

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie



Re: Catalyst 6500 High Switch Proc

2008-11-17 Thread Ross Vandegrift
On Sat, Nov 15, 2008 at 04:35:28PM -0500, Philip L. wrote:
> One thing to note, is that our main ACL for ingress traffic is applied
> here due to historical reasons.  It's roughly 5000 single host entries
> at present.  We also use these devices for NDE.

On a SUP720-3BXL, if your ACL TCAM utilization is fine, this shouldn't
impact performance unless you're logging too much.  Since you've been
over the CPU utilization doc, I'm guessing you know that.

"show platform hardware capacity acl" will give you a breakdown of
your ACL TCAM usage.

> I'm probably missing some other key details, but what could influence
> the SP like this?  Any insight would be appreciated.

Cisco says that Netflow-based features always handle the first packet
of a flow in software, but I don't know if this is the RP or the SP.
It would make sense if a first-flow packet that didn't need punting
hit the SP and not the RP.  In that case, your traffic level with
netflow enabled could explain your high SP utilization.

-- 
Ross Vandegrift
[EMAIL PROTECTED]

If the fight gets hot, the songs get hotter.  If the going gets tough,
the songs get tougher.
--Woody Guthrie



Re: Replacement for Avaya CNA/RouteScience

2008-07-04 Thread Ross Vandegrift
On Thu, Jul 03, 2008 at 10:36:27PM -0400, Christian Koch wrote:
> i definitely see value in appliances like the fcp and route science box, i
> just think for a smaller provider it may not be necessary - or maybe i have
> it backwards,and it is a better solution for a smaller provider so they
> don't have to waste money on highly skilled engineers? maybe i am just
> thinking inside the box at the moment, from an engineers view..if so my
> apologies for steering off course

The FCP stinks at managing blackholing.  There's supposedly new code
on the way to help with some of the blackhole avoidance, but I'll
believe it when I see it.  It can only really control the outbound
path, so if someone else chooses a path to me that is blackholed
between us, there's not a lot it can do.

On the other hand, the best value of the FCP is commit management.  It
does a fantastic job of making sure we pay the least amount of money
to our transit providers.  No more manual balancing of traffic frees up
a lot of time, and having an automatic process for it means that we
never exceed commit on links that we don't have to.

The FCP produces lovely graphs and charts that describe this, which is
probably why people accuse it of being too PHB-friendly.  But Internap
wasn't stupid - one of those pretty charts is "cost savings the FCP has
accumulated this month vs. the natural BGP decision".

For a network with a heavy outbound bias, that quickly adds up to a
decent chunk of change.
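
To make the commit arithmetic concrete, here is a rough Python sketch.
It assumes 95th-percentile billing with a flat per-Mbps rate for usage
above the commit, which is a common transit arrangement but not
something I stated above; the rates, commits and samples are invented:

```python
def percentile_95(samples_mbps):
    """Sort the samples, discard the top 5%, and return the highest remaining one."""
    if not samples_mbps:
        raise ValueError("no samples")
    ordered = sorted(samples_mbps)
    # Integer form of ceil(0.95 * n) - 1, to avoid float rounding.
    index = (95 * len(ordered) + 99) // 100 - 1
    return ordered[index]


def monthly_bill(samples_mbps, commit_mbps, commit_rate, overage_rate):
    """Commit is always paid; usage above commit is billed at the overage rate."""
    billable = percentile_95(samples_mbps)
    overage = max(0, billable - commit_mbps)
    return commit_mbps * commit_rate + overage * overage_rate


if __name__ == "__main__":
    # Two links, 500 Mbps commit each, 1000 Mbps of total outbound traffic.
    unbalanced = (monthly_bill([900] * 20, 500, 2.0, 3.0)
                  + monthly_bill([100] * 20, 500, 2.0, 3.0))
    balanced = 2 * monthly_bill([500] * 20, 500, 2.0, 3.0)
    print("unbalanced:", unbalanced, "balanced:", balanced)
```

The same total traffic costs less when neither link runs over its
commit, which is the saving that chart is reporting.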

Ross

-- 
Ross Vandegrift
[EMAIL PROTECTED]

The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell.
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37



Re: Techniques for passive traffic capturing

2008-06-24 Thread Ross Vandegrift
On Mon, Jun 23, 2008 at 10:00:06PM -0500, Kevin Kadow wrote:
> We started out with SPAN ports, then moved on to Netoptics taps.
>
> Lately we've been using a combination of Cisco Netflow (from remote routers),
> and native Argus flows (from local taps) where we need more details.
>
> Flows are useful to answer "What happened X minutes/hours/days ago?",
> and where you do not need/want to capture full packet bodies
> (though with Argus you can choose whether to include payload data).
>
> http://qosient.com/argus/

Cool - good to know that the Netoptics gear is good.  Seems like
there are a few resounding approvals of them.

Netflow would be lovely to export from our border routers.
Unfortunately, we are somewhat married to the 6500 platform which has
absolutely awful netflow support.  Very small TCAM, export is CPU
expensive, and sampling makes both problems worse.  So a mirrored copy
of the transit link is being sent to a pmacct box for flow generation.

-- 
Ross Vandegrift
[EMAIL PROTECTED]

The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell.
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37



Re: Techniques for passive traffic capturing

2008-06-24 Thread Ross Vandegrift
On Tue, Jun 24, 2008 at 01:19:03PM +1200, Nathan Ward wrote:
> I see little point in aggregating tapped traffic, unless you have only
> a small amount of it and you're doing it to save cost on monitoring
> network interfaces - but is that saved cost still a saving when you
> factor in the cost of the extra 3750s in the middle? I'd guess no.

Thanks for all the info Nathan - lots of good leads in your email.
Let me include some more information.

The problem is finding a way to multiplex that traffic from the
optical tap to multiple things that want to peek at it.  The
remote-span trick solves that, as well as integrating media
converters.  3750 is nice since you can stack em up and mix/match the
SFP and copper ports.

For example - we have an FCP box from Internap.  It wants to see
mirrored traffic so it can watch for TCP setup problems and try to
find blackholes.  It takes 10G feeds of aggregated transit links.

Then, we want to do some passive IDS analysis.  But snort can
really only handle 600-800Mbps before it starts saturating CPU
(not multithreaded...) - so one collector per gigE transit seems
logical.

We'd like to generate flow data out of our forwarding plane since
we use 6500s to pull in border transit links.  The Netflow on those
boxes is terrible.  pmacct does a much better job, but it needs to see
all the traffic out of band.

> Note that for a single GE link, you'd need 2GE of remote span backhaul
> (one GE in each direction).

We're mostly a content network, very few eyeballs.  Our ingress
traffic is negligible compared to egress, which makes the problem
easier.
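
As a back-of-the-envelope check, the sizing can be sketched in Python.
The 600 Mbps per-sensor budget is the conservative end of the snort
range mentioned above; the function names and the traffic figures are
hypothetical:

```python
import math


def span_backhaul_mbps(ingress_mbps, egress_mbps):
    """Mirroring both directions of a link needs room for ingress plus egress."""
    return ingress_mbps + egress_mbps


def ge_ports_needed(mirrored_mbps):
    """Number of 1000 Mbps ports needed to carry the mirrored traffic."""
    return max(1, math.ceil(mirrored_mbps / 1000))


def collectors_needed(mirrored_mbps, per_collector_mbps=600):
    """Number of single-threaded sensors needed at the given per-box budget."""
    return max(1, math.ceil(mirrored_mbps / per_collector_mbps))


if __name__ == "__main__":
    # Worst case: a GE link saturated in both directions.
    worst = span_backhaul_mbps(1000, 1000)
    print("worst case:", worst, "Mbps,", ge_ports_needed(worst), "GE ports")

    # Content-heavy link: plenty of egress, very little ingress.
    ours = span_backhaul_mbps(30, 550)
    print("egress-heavy:", ours, "Mbps,", ge_ports_needed(ours), "GE port,",
          collectors_needed(ours), "collector")
```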

> Matrix switches aren't useful for your case, as you're talking about
> monitoring for trending etc. I think. Matrix switches are good when
> you have lots of links, and want to be able to switch between them. Is
> the cost of matrix switch ports worth the saving in GE interfaces on
> PCs?

I guess what made me look at them is their ability to multiplex the
stream of data.  Take it from an optical tap, spit the same data out
of multiple ports.

The remote-span trick seems to do the same thing, so I'm wondering
where the gotcha is.  If there's an advantage to using something like
the Matrix switches, I'd love to know that now.

> The above is based on the assumption you're using PCs for monitoring,
> the economics of aggregating tap traffic may make more sense if you're
> using some fancy monitoring platform.

Yea - the fact that we have both makes the aggregation method look
good.  The FCP takes 10G aggregated feeds.  The PCs will want single
gig views of the transit links.

> If you find that you need lots of GE interfaces per PC or something,
> and are saturating the PCI bus, look at DAG cards from Endace. They're
> designed for passive monitoring, and will send you only headers and do
> BPF in hardware. I looked at these for a similar project, but didn't
> bother as it was cheaper to buy more PC chassis' and commodity GE
> cards. They can do 10GE monitoring, so if you need several 10GE's per
> chassis I'd recommend these.

Ah the Endace gear looks really interesting.  Thanks for the pointer!

-- 
Ross Vandegrift
[EMAIL PROTECTED]

The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell.
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37



Techniques for passive traffic capturing

2008-06-23 Thread Ross Vandegrift
Hello everyone,

Over the past two years, there's been a trend toward doing more and
more analysis and reporting based on passive traffic analysis.

We started out using SPAN sessions to produce an extra copy of all of
our transit links for these purposes.  But the Cisco limit of two
SPAN sessions per device (on our platforms) is a major constraint.

Does anyone have a better solution for more flexible data collection?

I've been thinking about a move to a system based on optical taps of
each of the links.  I'd aggregate these links into something like a
3750 and use remote-span VLANs to pass the traffic on to servers that
are sniffing on their interfaces on that 3750.  Do products like the
NetOptics Matrix Switches offer a substantial advantage?

Comments or suggestions?


-- 
Ross Vandegrift
[EMAIL PROTECTED]

The good Christian should beware of mathematicians, and all those who
make empty prophecies. The danger already exists that the mathematicians
have made a covenant with the devil to darken the spirit and to confine
man in the bonds of Hell.
--St. Augustine, De Genesi ad Litteram, Book II, xviii, 37