Re: Open source Netflow analysis for monitoring AS-to-AS traffic

2024-03-29 Thread Peter Phaal
The sFlow frame_length field isn't intended to be vague. If you are seeing
non-conforming sFlow implementations, please raise the issue with the
vendor so they can fix it.

Verifying that the frame_length and stripped fields are correctly
implemented is one of the tests performed by the sFlow Test tool and
running the tool can be helpful in persuading a vendor that they are out of
compliance:

https://blog.sflow.com/2015/11/sflow-test.html

The following language is included in the sFlow Version 5 spec,
https://sflow.org/sflow_version_5.txt.

/* Raw Packet Header */
/* opaque = flow_data; enterprise = 0; format = 1 */

struct sampled_header {
   header_protocol protocol;   /* Format of sampled header */
   unsigned int frame_length;  /* Original length of packet before
                                  sampling.
                                  Note: For a layer 2 header_protocol,
                                        length is total number of octets
                                        of data received on the network
                                        (excluding framing bits but
                                        including FCS octets).
                                        Hardware limitations may
                                        prevent an exact reporting
                                        of the underlying frame length,
                                        but an agent should attempt to
                                        be as accurate as possible. Any
                                        octets added to the frame_length
                                        to compensate for encapsulations
                                        removed by the underlying hardware
                                        must also be added to the stripped
                                        count. */
   unsigned int stripped;      /* The number of octets removed from
                                  the packet before extracting the
                                  header<> octets. Trailing encapsulation
                                  data corresponding to any leading
                                  encapsulations that were stripped must
                                  also be stripped. Trailing encapsulation
                                  data for the outermost protocol layer
                                  included in the sampled header must be
                                  stripped.

                                  In the case of a non-encapsulated 802.3
                                  packet stripped >= 4 since VLAN tag
                                  information might have been stripped off
                                  in addition to the FCS.

                                  Outer encapsulations that are ambiguous,
                                  or not one of the standard header_protocol
                                  must be stripped. */
   opaque header<>;            /* Header bytes */
}
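
As a rough illustration of how the fields relate, here is a minimal Python
sketch that decodes the body of a sampled_header record and applies two
sanity checks derived from the comments above. It assumes the record body
has already been extracted from the datagram and uses standard XDR encoding
(big-endian 32-bit integers, length-prefixed opaque data). The function
names and the checks themselves are my own illustration, not the logic used
by the sFlow Test tool.

```python
import struct

ETHERNET = 1  # header_protocol value for ETHERNET-ISO8023 in the sFlow v5 spec


def decode_sampled_header(buf):
    """Decode an XDR-encoded sampled_header record body into a dict."""
    # protocol, frame_length, stripped, then the length prefix of header<>
    protocol, frame_length, stripped, header_len = struct.unpack_from(">IIII", buf, 0)
    header = bytes(buf[16:16 + header_len])
    if len(header) != header_len:
        raise ValueError("truncated header<> bytes")
    return {
        "protocol": protocol,
        "frame_length": frame_length,
        "stripped": stripped,
        "header": header,
    }


def check_lengths(rec):
    """Return a list of problems; an empty list means the lengths look consistent."""
    problems = []
    # A non-encapsulated 802.3 frame should report at least the 4 FCS octets.
    if rec["protocol"] == ETHERNET and rec["stripped"] < 4:
        problems.append("stripped < 4 for an Ethernet frame")
    # Header bytes plus stripped octets cannot exceed the original frame.
    if len(rec["header"]) + rec["stripped"] > rec["frame_length"]:
        problems.append("header + stripped exceeds frame_length")
    return problems
```

A record that fails either check is a reasonable candidate for a closer
look with the test tool before raising the issue with the vendor.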

On Fri, Mar 29, 2024 at 12:46 PM Steven Bakker wrote:

> To top it off, both the sFlow and IPFIX specs are sufficiently vague about
> the meaning of the "frame size", so vendors can implement whatever they
> want (include/exclude padding, include/exclude FCS). This implies that you
> shouldn't trust these fields.
>


Re: Open source Netflow analysis for monitoring AS-to-AS traffic

2024-03-28 Thread Peter Phaal
The documentation for IOS-XR suggests that enabling extended-router in the
sFlow configuration should export "Autonomous system path to the
destination", at least on the 8000 series routers:

https://www.cisco.com/c/en/us/td/docs/iosxr/cisco8000/netflow/command/reference/b-netflow-cr-cisco8k/m-sflow-commands.html

I couldn't find a similar option in the NetFlow/IPFIX configuration guide,
but I might have missed it.


On Thu, Mar 28, 2024 at 10:48 AM Saku Ytti  wrote:

> Hey,
>
> On Thu, 28 Mar 2024 at 17:49, Peter Phaal wrote:
>
> > sFlow was mentioned because I believe Brian's routers support the
> > feature and may well export the as-path data directly via sFlow (I am
> > not aware that it is a feature widely supported in vendor NetFlow/IPFIX
> > implementations?).
>
> Exporting AS information is wire-format agnostic feature, if it's
> supported or not, it can equally be injected into sFlow, NetflowV5
> (src and dst only), NetflowV9 and IPFIX. The cost is that you need to
> program in FIB entries the information, so that the information
> becomes available at look-up time for record creation.
>
> In OP's case (IOS-XR) this means enabling 'attribute-download' for
> BGP, and I believe IOS-XR will never download any other asn but src
> and dst, therefore full information cannot be injected into any
> emitted wire-format.
> --
>   ++ytti
>


Re: Open source Netflow analysis for monitoring AS-to-AS traffic

2024-03-28 Thread Peter Phaal
I hope my comments were useful. I was trying to raise awareness that BGP
as-path information is an option and might be helpful in addressing Brian's
requirements, "I want to see with which ASes I am exchanging the most
traffic across my transits and IX links. I want to look for opportunities
to peer so I can better sell expansion of peering to upper management."

Possible reports that could be of interest are:
1. destination AS numbers by traffic volume and as-path length
2. destination AS numbers by traffic volume and second to last AS in path
(AS of peering with destination).
3. traffic volume by transit AS
4. traffic volume passing through ASes on an allow / deny list.
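
To make the first three reports concrete, here is a minimal Python sketch.
It assumes each flow has already been reduced to an (as_path, byte_count)
pair, with the path ordered from the nearest AS to the destination AS; the
function names and the input shape are hypothetical, and consecutive
duplicate ASNs (prepending) are collapsed before measuring path length.

```python
from collections import Counter


def collapse_prepends(path):
    """Remove consecutive duplicate ASNs introduced by AS-path prepending."""
    out = []
    for asn in path:
        if not out or out[-1] != asn:
            out.append(asn)
    return out


def as_path_reports(flows):
    """Aggregate bytes for reports 1-3 from (as_path, byte_count) pairs."""
    by_dest_and_len = Counter()          # report 1: (dest AS, path length)
    by_dest_and_penultimate = Counter()  # report 2: (dest AS, second to last AS)
    by_transit = Counter()               # report 3: every AS before the destination
    for raw_path, nbytes in flows:
        path = collapse_prepends(raw_path)
        if not path:
            continue
        dest = path[-1]
        by_dest_and_len[(dest, len(path))] += nbytes
        penultimate = path[-2] if len(path) > 1 else None
        by_dest_and_penultimate[(dest, penultimate)] += nbytes
        for asn in set(path[:-1]):
            by_transit[asn] += nbytes
    return by_dest_and_len, by_dest_and_penultimate, by_transit
```

Calling most_common(10) on any of the returned counters gives a top-10
table; a penultimate value of None means the destination is directly
adjacent.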

What other types of report might be interesting?

sFlow was mentioned because I believe Brian's routers support the feature
and may well export the as-path data directly via sFlow (I am not aware
that it is a feature widely supported in vendor NetFlow/IPFIX
implementations?). However, some of the tools mentioned (pmacct, Kentik,
Akvorado) can enrich flow data downstream (through BGP / BMP peering
session with router) if it isn't present in the sFlow/Netflow/IPFIX
records, although downstream enrichment does add a level of operational
complexity.

On Wed, Mar 27, 2024 at 11:03 PM Saku Ytti  wrote:

> On Wed, 27 Mar 2024 at 21:02, Peter Phaal  wrote:
>
> > Brian, you may want to see if your routers support sFlow (vendors have
> > added the feature over the last few years).
>
> Why is this a solution, what does it solve for OP? Why is it
> meaningful what the wire-format of the records are? I read OP's
> question at a much higher level, about how to interact and reason
> about data, rather than how to emit it.
>
> Ultimately sFlow is a perfect subset of IPFIX, when you run IPFIX
> without caching you get the functional equivalent of sFlow (there is
> an IPFIX entity for emitting n bytes from frame as well as data).
>
> --
>   ++ytti
>


Re: Open source Netflow analysis for monitoring AS-to-AS traffic

2024-03-27 Thread Peter Phaal
Brian, you may want to see if your routers support sFlow (vendors have
added the feature over the last few years).

In particular, see if it includes support for the sFlow extended_gateway
structure:

/* Extended Gateway Data */
/* opaque = flow_data; enterprise = 0; format = 1003 */

struct extended_gateway {
   next_hop nexthop;            /* Address of the border router that should
                                   be used for the destination network */
   unsigned int as;             /* Autonomous system number of router */
   unsigned int src_as;         /* Autonomous system number of source */
   unsigned int src_peer_as;    /* Autonomous system number of source peer */
   as_path_type dst_as_path<>;  /* Autonomous system path to the destination */
   unsigned int communities<>;  /* Communities associated with this route */
   unsigned int localpref;      /* LocalPref associated with this route */
}

The dst_as_path field is particularly valuable since it allows you to see
who your customers are peering with.

While not a complete solution, you might want to take a look at sflowtool,
https://github.com/sflow/sflowtool, to decode the sFlow records and convert
them to JSON. It's not hard to write a Python script to calculate BGP
peering metrics and push the results into a time series database
(Prometheus, InfluxDB, etc) and build dashboards in Grafana. The following
article gives a few examples:

https://blog.sflow.com/2018/12/sflow-to-json.html
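
As a starting point, a script along the following lines could total
estimated bytes per key from sflowtool's JSON output. This is only a
sketch: it assumes one JSON datagram per input line, and the key names
("samples", "sampleType", "meanSkipCount", "elements",
"sampledPacketSize", "dstIP") are assumptions that should be checked
against the output of your sflowtool version. Each sampled packet is
scaled by the sampling rate to estimate the traffic it represents.

```python
import json
from collections import Counter


def estimate_bytes(lines, key="dstIP"):
    """Estimate total bytes per value of `key` from JSON datagram lines."""
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        datagram = json.loads(line)
        for sample in datagram.get("samples", []):
            if sample.get("sampleType") != "FLOWSAMPLE":
                continue  # skip counter samples
            rate = int(sample.get("meanSkipCount", 1))
            for element in sample.get("elements", []):
                if key in element and "sampledPacketSize" in element:
                    # one sampled packet stands in for `rate` packets
                    totals[element[key]] += rate * int(element["sampledPacketSize"])
    return totals
```

Passing sys.stdin as lines lets the script sit at the end of a pipe, and
swapping key for whatever field carries the AS path in your output turns
the same loop into a per-path traffic total that can be pushed to a time
series database.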

On Tue, Mar 26, 2024 at 5:06 PM Brian Knight via NANOG wrote:

> What's presently the most commonly used open source toolset for monitoring
> AS-to-AS traffic?
>
> I want to see with which ASes I am exchanging the most traffic across my
> transits and IX links. I want to look for opportunities to peer so I can
> better sell expansion of peering to upper management.
>
> Our routers are mostly $VENDOR_C_XR so Netflow support is key.
>
> In the past, I've used AS-Stats for this purpose. However, it is
> particularly CPU and disk IO intensive. Also, it has not been actively
> maintained since 2017.
>
> InfluxDB wants to sell me on Telegraf + InfluxDB + Chronograf + Kapacitor,
> but I can't find any clear guide on what hardware I would need for that,
> never mind how to set up the software. It does appear to have an open
> source option, however.
>
> pmacct seems to be good at gathering Netflow, but doesn't seem to analyze
> data. I don't see any concise howto guides for setting this up for my
> purpose, however.
>
> I'm aware Kentik does this very well, but I have no budget at the moment,
> my testing window is longer than the 30 day trial, and we are not prepared
> to share our Netflow data with a third party.
>
> Elastiflow appears to have been open source at one time in the past, but
> no longer. Since it too appears to be hosted, I have the same objections
> as I do with Kentik above.
>
> On-list and off-list replies are welcome.
>
> Thanks,
>
> -Brian
>
>


Re: Flow Tools AS-Path

2023-04-04 Thread Peter Phaal
Export of destination AS-Path is supported in the sFlow extended_gateway
structure.

/* Extended Gateway Data */
/* opaque = flow_data; enterprise = 0; format = 1003 */

struct extended_gateway {
   next_hop nexthop;            /* Address of the border router that should
                                   be used for the destination network */
   unsigned int as;             /* Autonomous system number of router */
   unsigned int src_as;         /* Autonomous system number of source */
   unsigned int src_peer_as;    /* Autonomous system number of source peer */
   as_path_type dst_as_path<>;  /* Autonomous system path to the destination */
   unsigned int communities<>;  /* Communities associated with this route */
   unsigned int localpref;      /* LocalPref associated with this route */
}

Arista EOS supports aspath if you enable sflow extension bgp. Cisco
also claims to support the feature on IOS XR platforms.

In addition to BGP, there are sFlow extended structures for MPLS, tunnel
encap/decap, etc., as well as optical interface metrics, dropped packet
notifications, and more:

https://sflow.org/developers/specifications.php


On Tue, Apr 4, 2023 at 6:06 AM Mike Hammett  wrote:

> One of the reasons to analyze flow data is to make purchase\peering
> decisions. The sFlow standard seems to only include source and destination
> AS, though I know some route platforms have extensions to provide
> additional data.
>
> 1) How common is it to have the additional extensions to include that data
> for analysis?
> 2) I have seen flow tools that show the entire AS path. Are they just
> cherry picking which platforms they showcase for the best marketing, or are
> they enriching the data they receive from "lesser" platforms from an
> outside source?
>
> For that purpose, knowing what ASes your data goes to is useful. It's even
> more useful to find an upstream network that includes a bunch of those.
>
>
>
> -
> Mike Hammett
> Intelligent Computing Solutions
> http://www.ics-il.com
>
> Midwest-IX
> http://www.midwest-ix.com
>
>


Re: SDN Internet Router (sir)

2023-01-03 Thread Peter Phaal
https://github.com/sflow-rt/active-routes

Inspired by SIR, but uses Bird multi-table capability to separate RIB/FIB
routes.

On Tue, Jan 3, 2023 at 7:47 AM Mike Hammett  wrote:

> https://github.com/dbarrosop/sir
>
> I came across this over the weekend. Given that the project was abandoned
> six years ago, are there any other efforts with a similar goal (more
> intelligently placing routes into FIBs of low-FIB capacity devices)?
>
>
>
> -
> Mike Hammett
> Intelligent Computing Solutions
> Midwest Internet Exchange
> The Brothers WISP
>


Re: Sflow/netflow/ipfix open source security projects

2022-08-10 Thread Peter Phaal
Sounds like an interesting project. You might want to take a look at
sflowtool to get started. The following article shows how to use sflowtool
to decode sFlow datagrams and includes a simple Python script matching IP
addresses against a known threat database.

https://blog.sflow.com/2018/12/sflow-to-json.html

On Wed, Aug 10, 2022 at 7:19 AM Drew Weaver  wrote:

> Hello,
>
> I am interested in getting involved with an open source project in my
> spare time.
>
> I thought that it may be useful to contribute to an open source project
> that uses flow data to check for lateral movement inside of networks and
> also to check for known bads in remote connections.
>
> This seems like really low hanging fruit from a defense scenario.
>
> I’ve tried googling around for something like this and I have come up
> short.
>
> Is anyone aware of any such projects?
>
> Thanks,
>
> -Drew
>


Re: Free-ish Linux Netflow collector/analyser options

2022-05-17 Thread Peter Phaal
Juniper added sFlow support to MX routers in Junos 18.1R1,
https://blog.sflow.com/2018/04/sflow-available-on-juniper-mx-series.html

You might want to consider deploying sFlow instead of IPFIX, particularly
if you are interested in DDoS mitigation where low latency and visibility
into packet headers can be helpful.

-Peter

On Mon, May 16, 2022 at 11:36 AM Matthew Crocker wrote:

> I’m looking for a free-ish Linux open source Netflow collector/analyser.
> I have 5 Juniper MX routers that will send IPFIX flows to it for an ISP
> network. I’m hoping it is something I can run in AWS/EC2 as I don’t want
> to worry about storage again in my lifetime. Does anyone have any
> recommendations?
>
> For reporting I would like to generate basic usage reports to/from
> IP/Subnet/ASN. It would be great if it could also detect DDoS and activate
> flowspec back into my core routers but that isn’t a requirement.
>
> Thanks
>
> -Matt
>


Re: sflow -> aggregated aspath visualization?

2020-03-16 Thread Peter Phaal
You could use Prometheus / Grafana to build the dashboards.

The following example is a starting point (top ASNs / Countries by traffic
volume):
https://grafana.com/grafana/dashboards/11146

The example could be modified to make the router / interface
selectable, or cloned to create separate per router / interface dashboards.

On Sat, Mar 14, 2020 at 12:33 PM Adam Thompson wrote:

> I’m looking for product recommendations:
>
> We’ve noticed that about 20% of our traffic here lately has decamped from
> the free (or, at least, flat-rate) connection to CANARIE (our R&E network)
> and its various connected content-delivery networks, and onto our
> commercial provider.
>
> While this is presumptively a legitimate shift, we’d like to better
> understand these changes when they occur, in a way that our executive can
> understand at a glance.
>
> We do have sFlow (et al.) going to an Arbor PeakFlow box for analysis, but
> it’s lacklustre at best at understanding changes like this.
>
> I want:
>
>    - Top #n ASNs by traffic volume, per router/interface, stacked chart
>    - Some way to visualize large jumps in that dataset, e.g. if
>      Cloudflare ditched their CANARIE connection and now that traffic all
>      goes commercial, I don’t know what sort of graphic would be useful,
>      maybe a stacked polar chart so you could see when an AS jumped from
>      one sector to another?  Even stacked bar charts could be useful.
>
> If anyone knows of tools capable of generating easy-to-understand reports,
> dashboards, including historical “what changed this week”-type data, please
> let me know.
>
> For that matter, if you have a technique of collecting this data and using
> Excel to do the reporting, that would work too.
>
> (Yes, I could theoretically build this off of existing open source tools…
> eventually)
>


Re: Sflow billing or usage calculation software

2019-04-17 Thread Peter Phaal
On Tue, Apr 16, 2019 at 8:35 PM Deepak Jain  wrote:

> Now I know I'm pushing my luck... but do certain vendors more fully
> embrace sFlow than others? maybe one of the whitebox vendors if not one
> of the majors?
>
> Hacking support into something isn't the worst thing in the world, but
> if there is any experience on this to leverage off of, that is helpful.
>

Unfortunately, there isn't a publicly available list showing how well or
completely vendors have implemented the sFlow specifications:
https://sflow.org/developers/specifications.php

I have been working on an sFlow test suite to try and address this problem:
https://blog.sflow.com/2015/11/sflow-test.html

The source code for the tests is on GitHub:
https://github.com/sflow-rt/sflow-test

The easiest way to run the software is using Docker:
https://hub.docker.com/r/sflow/sflow-test

The goal is to compile a list of equipment and network operating systems
that pass the tests and publish the results on sFlow.org. Failed tests can
be passed to vendors to help them improve their implementations. In
addition to identifying feature support, there are also stress tests to
ensure accurate results under production workloads (rapid detection of DDoS
etc).

Involvement of operators would be great. If there are tests that are
missing from the suite, please submit an enhancement request, or even
better, a pull request, on GitHub. If you have a test lab and can run the
tests on your own hardware, please share the results.

The open source Host sFlow agent, https://sflow.net/, has been ported to a
number of white box network operating systems and provides an opportunity
for the community to extend sFlow functionality and address issues in the
white box ecosystem. Operator involvement in this project would be most
welcome.

Peter


Re: Sflow billing or usage calculation software

2019-04-13 Thread Peter Phaal
Tony,

You might find the following article useful in identifying features to
consider when evaluating sFlow analyzers:
https://blog.sflow.com/2009/05/choosing-sflow-analyzer.html

The following white paper discusses accuracy of packet sampling for usage
accounting:
https://inmon.com/pdf/sFlowBilling.pdf

Peter



On Sat, Apr 13, 2019 at 2:07 PM Tony C  wrote:

> Hi All
>
> I am looking for Sflow analytical software that can tell me automatically
> over say a period of 24 hours (or any time period I select) the average
> mbit of data consumed for any IP address within our entire AS.
>
> (Without configuring a rule or billing group for each IP address or
> customer within our network)
>
> The purpose is to help quickly work out IP addresses which are using more
> bandwidth (in or out) than what we consider to be acceptable usage.
>
> For example, I would like to review a report or be automatically alerted
> to any IP address using more than an average of 50mbit within the past 24
> hour plus have the capability to review data say over a month.
>
> Any names of software or suggestions would be great which I can
> investigate, happy to look at both commercial software and open source or
> if you have a Sflow billing solution for data consumption which is simple
> and easy to use please let me know
>
> Thanks
>
> Tony
>


Juniper releases sFlow support for MX routers

2018-04-06 Thread Peter Phaal
Hi All,

I thought there might be interest in availability of sFlow in Junos OS
Release 18.1R1 for MX routers:

https://blog.sflow.com/2018/04/sflow-available-on-juniper-mx-series.html

Peter


Re: Open Souce Network Operating Systems

2018-01-20 Thread Peter Phaal
On Sat, Jan 20, 2018 at 11:26 AM, Colton Conor wrote:
>
> Thanks for the information. Do you have a recommendation of which
> distribution of Linux to use for this? Is there one that is more network
> centric than another?
>

Cumulus Linux, OpenSwitch, and Open Network Linux are all Debian based so
there is probably a greater ecosystem of developers and network centric
tools being built for Debian.


Re: Open Souce Network Operating Systems

2018-01-20 Thread Peter Phaal
On Sat, Jan 20, 2018 at 9:32 AM, Colton Conor wrote:
>
> My understanding is Free Range Routing is a package of software that runs
> in Linux, but not a full and true NOS, right?
>

Why not consider Linux a NOS? Installing Free Range Routing adds control
plane protocols: BGP, OSPF, ISIS, etc.


> I looked into Cumulus Linux, but it seems to only run on the supported
> hardware which is white box switches. Can you run Cumulus Linux on an X86
> server with Intel NICs? Can you run Cumulus on a Raspberry Pi?
>

Cumulus Linux is basically Debian with Free Range Routing pre-installed
along with a daemon that offloads forwarding from the Linux kernel to an
ASIC. CumulusVX is a free Cumulus Linux virtual machine that is useful for
staging / testing configurations since it has the same behavior as the
hardware switch.

On X86 servers with Intel NICs, just run Linux. Cumulus Host Pack can be
installed to add Free Range Routing and other Cumulus tools on the server.
Alternatively, you can choose any Linux control plane, automation, or
monitoring tools and install them on the hosts and Cumulus Linux switches
to unify management and control, e.g. Bird, collectd, telegraf, Puppet,
Chef, Ansible, etc.

Linux distros (including Ubuntu) are available for non-X86 hardware like
Raspberry Pi etc.


>
> Ideally I think I am looking to a Linux operating system that can run on
> multiple CPU architectures, has device support for Broadcom and other
> Merchant silicon switching and wifi adapters.


If you consider Linux as the NOS then it already meets these requirements.


Re: tracking TCP session hop by hop

2017-11-29 Thread Peter Phaal
On Wed, Nov 29, 2017 at 9:06 AM, William Herrin  wrote:

> On Tue, Nov 28, 2017 at 3:48 PM, Yifeng Zhou 
> wrote:
>
> > Is there any way that we can track TCP session hop by hop?
> >
> > Say we have 10 ECMP between A and Z point, what's the easiest way to
> track
> > specific session is using which path? How we can check between
> > servers(Linux/Unix) and between Routers(Cisco/Juniper etc)?
> >
>
> A TCP connection is uniquely identified by the combination of four numbers:
> The source IP address, the source port, the destination IP address and the
> destination port. You used the word session, but sessions happen above TCP
> in the stack and may use more than one TCP connection.  Every packet in the
> connection contains all four numbers and no packet from any other
> connection contains the same four numbers.
>
> If you want to track the connections, you capture the packets at each point
> in the path (router products have vendor-specific ways of doing this) and
> see which unique sets of the four numbers went through which router and
> router interface.
>
>
> If you want to -test- which path a TCP connection -would- take, Ruairi's
> afore-mentioned tcptraceroute is the way to go. The regular traceroute with
> modern Linux servers also supports the "-T" flag which does the same thing.
> It works just like regular traceroute but uses synthetic TCP SYN packets
> instead of ICMP or UDP packets, allowing the packets to pass firewalls
> which would otherwise block the trace.
>
> Bear in mind that in each case you will likely only see the path taken at
> the IP level. Underlying transits at the Ethernet or MPLS level are
> intentionally invisible to the endpoints.
>
>
In the data center context, enabling sFlow continuously captures packets
from all paths and can be used to trace multi-path packet flows, whether
layer 2 (MLAG/LAG), or layer 3 (ECMP). sFlow reports physical switch ports
and captures Ethernet packet headers, so you can relate paths to MPLS
labels, Ethernet headers, IP headers, TCP/UDP headers, VxLAN tunnels, etc.

The following article provides an example:
http://blog.sflow.com/2017/09/troubleshooting-connectivity-problems.html
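
The basic idea can be sketched in a few lines of Python. The sketch assumes
the packet samples have already been decoded into dicts; the key names
(agent, inputPort, outputPort, srcIP, srcPort, dstIP, dstPort) are
hypothetical and depend on the decoder you use. Grouping samples by
connection and collecting the reporting switch and ports shows which of
the parallel paths a given connection is taking.

```python
from collections import defaultdict


def hops_by_connection(samples):
    """Map each TCP/UDP connection to the set of (agent, in, out) hops seen."""
    hops = defaultdict(set)
    for s in samples:
        conn = (s["srcIP"], s["srcPort"], s["dstIP"], s["dstPort"])
        hops[conn].add((s["agent"], s["inputPort"], s["outputPort"]))
    return hops
```

Because the data is sampled, a short-lived connection may only show up on
some of the switches along its path; long-lived or high volume connections
fill in quickly.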


Re: Netflow/sFlow generator for Linux with BGP support

2017-01-28 Thread Peter Phaal
Patrick,

You might want to try pmacct:
http://www.pmacct.net/

Peter

On Sat, Jan 28, 2017 at 8:17 AM, Patrick Velder  wrote:

> Hi there
>
> I'm currently switching from MikroTik CCR 1009 to SuperMicro 5018D-FN8T as
> small router. Now I'd love to integrate BGP infos into netflow/sflow, as
> MikroTik still doesn't have any support for that.
>
> Are there any alternatives to nProbe (which supports BGP but is ways too
> expensive with its 300€)?
>
> Regards
> Patrick
>
>


Re: Barefoot "Tofino": 6.4 Tbps whitebox switch silicon?

2016-06-16 Thread Peter Phaal
On Thu, Jun 16, 2016 at 1:19 AM, Saku Ytti  wrote:
> On 16 June 2016 at 06:21, Eric Kuhnke  wrote:
>> Based on their investors, could have interesting results for much lower
>> cost 100GbE whitebox switches.
>
> Why lower cost? The BOM isn't the expensive part, the code is the
> expensive part. Only way I see this happening, is if we get open
> source routing suite for the box, i.e. 0 cost software.

It shouldn't be long before we see open source routing suites (Bird,
Quagga, GoBGP, etc.) running on Linux (ONL, OS10, OpenSwitch). There is
a P4 program that implements the Switch Abstraction Interface (SAI),
which provides a common, device-independent interface to merchant
silicon.

http://p4.org/p4/an-open-source-p4-switch-with-sai-support/

A quick way to do interesting things with the programmable data plane
is to use P4 to augment the basic switching / routing behavior
provided by SAI: moving resources from layer 2 tables to layer 3
tables, adding telemetry, adding additional control capabilities,
etc.:

http://blog.sflow.com/2016/06/programmable-hardware-barefoot-networks.html


Re: Network Weathermap

2016-04-28 Thread Peter Phaal
Many drawing tools support SVG as a file export format. Exporting or
converting the map to SVG format allows the map attributes (link
colors, widths, etc) to be modulated using JavaScript embedded in the
web page.
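
The same idea can also be applied outside the browser. The following is
a minimal Python sketch (not the JavaScript approach described above)
that sets the stroke color and width of SVG elements by id, assuming each
link in the exported drawing carries an id attribute that can be matched
to a utilization value between 0 and 1. The function names, color
thresholds, and width scaling are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def color_for(utilization):
    """Pick a link color from utilization (0.0 - 1.0); thresholds are arbitrary."""
    if utilization < 0.5:
        return "green"
    if utilization < 0.8:
        return "orange"
    return "red"


def style_links(svg_text, utilization_by_id):
    """Return the SVG text with stroke color and width set on matching ids."""
    ET.register_namespace("", SVG_NS)  # keep the default SVG namespace on output
    root = ET.fromstring(svg_text)
    for el in root.iter():
        link_id = el.get("id")
        if link_id in utilization_by_id:
            u = utilization_by_id[link_id]
            el.set("stroke", color_for(u))
            el.set("stroke-width", str(1 + round(4 * u)))
    return ET.tostring(root, encoding="unicode")
```

If the drawing tool writes colors into a style attribute or a stylesheet,
those may take precedence over the stroke attributes set here, so the
exported file is worth inspecting first.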

As an example, the following SC15 weathermap was created by converting
a PDF diagram of the network into an SVG file:

http://blog.sflow.com/2015/11/sc15-live-real-time-weathermap.html

The code is on GitHub and it wouldn't be hard to re-purpose:

https://github.com/pphaal/sc15-weather

The ESnet weathermap is very cool and they have open sourced the code:

https://my.es.net/
http://www.hpcwire.com/2015/10/05/esnet-releases-software-for-building-interactive-network-portals/

On Thu, Apr 28, 2016 at 11:32 AM, James Bensley  wrote:
> Hi all,
>
> I know its been a while since I posted this thread, I've been swamped.
> Finally I'm getting time to look back at this. I think I had 0 on-list
> replies and about 10 off-list private replies, so clearly others are having
> the same problem but not speaking openly about it.
>
> There were two main themes in the off list replies;
>
> 1. Several people are drawing in a tool like Visio and then importing the
> picture as a background to the weathermap plugin and adding the links and
> nodes over the top.
>
> 2. A couple of people were drawing in something else other than Visio that
> would spit out files containing objects and coordinates and then had
> written scripts to convert those coordinates to Weathermap plugin file
> format.
>
> Method 1 is OK, I really want it to be less hassle than that so 2 seems
> like the best idea. Only one person would share their conversion script
> with me briefly on PasteBin then it expired and it wasn't for Visio format
> files, so I didn't save it.
>
> Having a quick play in Visio just now the files are saved as XML formatted
> X/Y axis values. Bit of a Python novice but I'm thinking I could basically
> ingest a Visio file and parse the the XML and then iterate over it
> converting each "object" into weathermap syntax.
>
> That isn't too difficult however for the maps to be any good I need to
> think about the "via" feature for links in Weathermap to map them  more
> clearly if they cross over each other. There might still also be a lot of
> hackery when it comes to mapping the imported nodes and links to actual
> ones in Cacti. It might be that you have to match all the imported nodes
> and links to RRDs the first time you import the diagram then on all future
> imports just new links and nodes.
>
> Before I commit the time to this, has anyone done this already or is anyone
> a absolute Lord of Python who wants to do it quicker than I can do it? :)
>
> Cheers,
> James.


Re: collectd as alternative to RTG for high-resolution polling and long term storage?

2016-03-19 Thread Peter Phaal
On Wed, Mar 16, 2016 at 11:45 AM, Eric Kuhnke  wrote:
> Would anyone care to share their experience using collectd as an
> alternative to rtg for high-resolution polling of interface traffic and
> long term storage?
>
> I am investigating the various options for large data set size, lossless
> long term traffic charting (not RRAs which lose precision over time). One
> possible use is precision 95th billing.
>
> https://collectd.org/

Devices that support sFlow natively implement collectd-type
functionality for streaming interface counters to a time series
database (InfluxDB, Graphite, OpenTSDB, etc.). Tools like Grafana can be
used to query the database and build dashboards.

Host sFlow (http://sflow.net) is very similar to collectd in the
metrics it exports, but with the added ability to export flow data
from host adapters, bridges, vSwitches, firewalls, routing, VMs,
containers etc.


Re: sFlow vs netFlow/IPFIX

2016-03-03 Thread Peter Phaal
On Thu, Mar 3, 2016 at 9:16 AM, Nick Hilliard  wrote:
> The beauty of sflow is that you can do anything in the collector, but
> most people aren't going to do this because it means maintaining two
> sets of data about your flow configuration: one set on the switch and
> one set in your collector code which you've now diverged from the
> mainstream distribution, thereby creating a requirement for future
> maintenance, with associated costs.

I completely agree that you don't want to maintain two sets of
configurations (switch and collector) that need to be updated.
However, it's much better to focus on minimizing switch configuration
complexity than it is to focus on reducing analyzer software
configuration complexity.

If you push the problem of de-duplication to the switch configurations
then you end up with VxT sets of switch configuration in a
multi-vendor network (where T is the number of topologically different
wiring configurations used for the switches and V is the number of
vendors - actually it can be even worse, since each vendor product
line might have different configuration options, CatOS vs NX-OS for
example). Adopting a standard sFlow agent configuration in which
monitoring is enabled on all switch ports with policy based default
sampling rates (a good default is 1-in-N, where N = port speed in gigabits
per second x 1000, e.g. 1-in-10,000 on a 10G port) greatly simplifies
large scale sFlow deployments. Now instead of maintaining and updating
VxT configurations in thousands of switches, there are only V switch
configurations that are installed when the switches are initially
provisioned and remain static over the lifetime of the network.
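
The arithmetic is simple enough to sketch in Python. The function names
are my own, and the second function only restates the V x T versus V
comparison made above.

```python
def default_sampling_rate(port_speed_gbps):
    """Rule of thumb: sample 1-in-N packets, where N = speed in Gbit/s x 1000."""
    return max(1, round(port_speed_gbps * 1000))


def switch_config_variants(vendors, topologies, dedup_on_switch):
    """Number of distinct switch configurations that have to be maintained."""
    # De-duplicating on the switches ties each config to the wiring (V x T);
    # sampling everywhere and filtering at the collector leaves one per vendor.
    return vendors * topologies if dedup_on_switch else vendors
```

For example, default_sampling_rate(10) gives 10000 and
default_sampling_rate(100) gives 100000, while a network with 3 vendors
and 5 wiring variations goes from 15 switch configurations to 3.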

Updating the single, central configuration of the sFlow analyzer
software is much simpler and easily automated. It also makes it much
easier to roll out new analytics capabilities since they are simply
configuration and software updates to the central collector and don't
require building, testing and deploying configurations to all the
devices.


Re: sFlow vs netFlow/IPFIX

2016-03-03 Thread Peter Phaal
While it would be nice if the Nexus switches supported ingress
sampling, you can get exactly the same result at the receiving end by
dropping the egress samples. The following sflowtool output shows some
of the metadata contained in the packet sample:

startSample --
sampleType_tag 0:1
sampleType FLOWSAMPLE
sampleSequenceNo 1022129
sourceId 0:7
meanSkipCount 128
samplePool 130832512
dropEvents 0
inputPort 7
outputPort 10

The two fields of interest are the sourceId (0:7) indicating that this
measurement came from a data source of type ifIndex (0) and that the
ifIndex of the data source is 7. The inputPort is the ifIndex of the
port that received the packet. In this case because the dataSource
ifIndex and the inputPort ifIndex are the same, this is an ingress
sampled packet. A simple filter along the lines:

if ( sourceId.split(':')[1] != inputPort) return;

would allow your sFlow analyzer to eliminate the unwanted samples. You
could also enable / disable ports on your switches to ensure that each
path is sampled once, but that does limit the types of analysis you
can do with the data. A better approach is to simply add additional
input filters to specify which edge data sources you want to include /
exclude in your traffic accounting application since this would allow
the full sFlow feed to be used for other purposes as well (identifying
traffic on busy links, etc.)
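
As a rough sketch of the same ingress test on the collector side,
assuming each flow sample has already been parsed into a dict keyed by
the sflowtool field names shown above (the dict layout and function
name are assumptions for illustration):

```python
def is_ingress_sample(sample: dict) -> bool:
    """True when the sampling data source is the port that received the packet."""
    source_type, source_index = sample["sourceId"].split(":")
    # A sourceId type of 0 means the index is an ifIndex.
    return source_type == "0" and int(source_index) == int(sample["inputPort"])


samples = [
    {"sourceId": "0:7", "inputPort": "7", "outputPort": "10"},   # sampled on ingress port 7
    {"sourceId": "0:10", "inputPort": "7", "outputPort": "10"},  # sampled on egress port 10
]

# Keep only ingress samples so each packet path is counted once per switch.
ingress_only = [s for s in samples if is_ingress_sample(s)]
print(ingress_only)
```

The unfiltered list stays available for other uses, such as looking at
traffic on busy links.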

The overhead of enabling sFlow on all ports and all devices is
generally quite small since packets are sampled in hardware and
production sampling rates tend to be in the range 1-in-1,000 to
1-in-50,000, so very little measurement traffic is actually generated. A more
important consideration is operational complexity. If you have
thousands of switches, designing customized configurations for each
one doesn't make a lot of sense. It's much better if the intelligence
is applied at the collecting end. Taking this approach and including
sensible defaults in the agents can get the sFlow agent configuration
down to something as simple as:

sflow {
  DNSSD = off
  collector {
    ip = 10.0.0.162
  }
}

And you could go even simpler if you use DNS SRV records to identify
the sFlow collector(s)

sflow {
  DNSSD=on
}

These configurations are from Cumulus Linux.

One of the trends in merchant silicon based platforms is inclusion of
the ONIE boot loader. If you don't like the network operating system,
you can install a different operating system to better suit your
requirements without ripping and replacing hardware. There are many
virtually identical switches built around the Broadcom ASICs, giving a
lot of choice in hardware and network operating system vendor.

On Thu, Mar 3, 2016 at 3:53 AM, Nick Hilliard <n...@foobar.org> wrote:
> Peter Phaal wrote:
>> I think "pathologically broken" somewhat overstates the case.
>> Bidirectional sampling is allowed by the sFlow spec and other vendors
>> have made that choice. Another vendor used to implement egress only
>> sampling (also allowed, but unusual). I agree that ingress is the most
>> common and easiest to deal with, but a decent sFlow analyzer should be
>> able to handle all three cases without over / under counting.
>
> Bidirectional sampling doesn't allow you to define a sampling perimeter
> on your switch topology.  This means that if you have anything
> other than a trivial topology, you will end up double-counting your
> traffic.  The only way to work around this is to get the collector to
> discard 50% of the samples or otherwise write down the amount of traffic
> by 50%, assuming a standard accounting perimeter configuration.  This is
> broken.
>
> The thing is, this is ridiculously easy to fix in code.  The hooks are
> already there.
>
> Nick


Re: sFlow vs netFlow/IPFIX

2016-03-02 Thread Peter Phaal
On Wed, Mar 2, 2016 at 2:45 PM, Nick Hilliard <n...@foobar.org> wrote:
> Peter Phaal wrote:
>> Monitoring ingress and egress in the switch is wasteful of resources.
>
> It's more than a waste of resources:  it's pathologically broken and
> Cisco decline to fix it, despite the fact that enabling ingress-only or
> egress-only is fully supported via API in the Broadcom SDKs, and
> consequently the amount of configuration glue required to fix it in
> NX-OS is nearly zero.
>
> Broadcom chipsets don't support netflow, so sflow is the only game in
> town if you need data telemetry on broadcom-based ToR boxes.
>
> As I said in a previous email on this thread, refusing to support this
> properly is a harmful and short sighted approach to customers' requirements.

I think "pathologically broken" somewhat overstates the case.
Bidirectional sampling is allowed by the sFlow spec and other vendors
have made that choice. Another vendor used to implement egress only
sampling (also allowed, but unusual). I agree that ingress is the most
common and easiest to deal with, but a decent sFlow analyzer should be
able to handle all three cases without over / under counting.

More annoying are the differences in how vendors report telemetry from
LAG / MLAG topologies. The "sFlow LAG Counters Structure" extension was
published in 2012 and defines how counters and samples should be
generated on LAGs. Anyone using LAG / MLAG topologies might want
to ask their vendor if they support / plan to support the extension.


Re: sFlow vs netFlow/IPFIX

2016-03-02 Thread Peter Phaal
On Wed, Mar 2, 2016 at 9:30 AM, Nick Hilliard <n...@foobar.org> wrote:
> Peter Phaal wrote:
>> The Nexus 3200 should work well with 100G flows - I believe it's
>> based on the latest Broadcom Tomahawk ASIC. The older Trident II
>> ASICs in the Nexus 9k are 40g parts
>
> does nx-os still force ingress-and-egress sflow?  sflow is pretty
> useless unless you can define an accounting perimeter, which means that you
> need either ingress across the board, or egress.  ingress-and-egress is
> basically useless because you end up double counting everything.

Monitoring ingress and egress in the switch is wasteful of resources.
In most switch use cases (a leaf / spine fabric for example) the
next hop switch will also be reporting ingress sFlow and so when you
combine sFlow streams from both switches you get bi-directional
visibility into every link. Enabling ingress only sFlow on all switch
ports catches all packet paths and halves the overhead of
bi-directional sampling.

The sFlow architecture shifts intelligence from the devices to
external software. The goal is to have a general purpose telemetry
stream that can be used for a variety of purposes. Rather than having
the complexity of configuring sFlow selectively at the sender, the
receiver is responsible for de-duplicating the sFlow stream for
accounting (the packet stream selection and elimination you are doing
in the switch configuration can equally be applied on receipt).
Shifting the decision to the collector means you can also use the
stream to diagnose performance problems (for example identifying top
flows on a busy link), traffic engineering of large flows, etc. If the
sender is configured to suit one application, you limit the value of
the measurements for other applications.

An often overlooked feature of sFlow is that the agent also
periodically sends interface counters (reducing or eliminating the
need for SNMP polling in many use cases). The counters and packet
samples are tied together in the sFlow data model (for example, you
can use the interface speed information from the counter samples to
compute utilizations based on the packet sample stream). Broadcom
also defined sFlow metrics to provide additional visibility into the
ASIC forwarding pipeline (layer 2 / layer 3 / ACL table utilization,
buffer utilization, microburst detection) and the inclusion of these
metrics with the sampled packet data in the sFlow telemetry stream
provides a way to identify the traffic that is consuming the hardware
resources.
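
To make the utilization point concrete, here is a minimal sketch of
the calculation. It assumes simple scaling of sampled frame lengths by
the sampling rate and ignores preamble and inter-frame gap; the
function and parameter names are illustrative only:

```python
def estimated_utilization(frame_lengths, sampling_rate, interval_s, if_speed_bps):
    """Scale sampled frame lengths by the sampling rate, then divide by link capacity."""
    estimated_bits = sum(frame_lengths) * 8 * sampling_rate
    return estimated_bits / (interval_s * if_speed_bps)


# 125 sampled frames of 1,000 bytes seen in 10 seconds,
# 1-in-10,000 sampling, ifSpeed of 10 Gbit/s taken from the counter sample.
u = estimated_utilization([1000] * 125, 10_000, 10.0, 10_000_000_000)
print(f"estimated utilization: {u:.1%}")
```

The ifSpeed value comes from the counter samples and the frame lengths
from the packet samples, so no separate SNMP poll is needed.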


Re: sFlow vs netFlow/IPFIX

2016-03-02 Thread Peter Phaal
> 
> On Mar 1, 2016, at 10:12 PM, Mark Tinka  wrote:
> 
> 
> 
>> On 2/Mar/16 08:04, Mark Tinka wrote:
>> 
>> We were initially looking at the Nexus 9000, but then moved to the
>> 7700 because the Broadcom chip on the 7700 cannot do single flows larger
>> than 40Gbps on the 100Gbps ports.
> 
> The Broadcom chip on the 9000, I meant...
> 
> Mark.

The Nexus 3200 should work well with 100G flows - I believe it's based on the 
latest Broadcom Tomahawk ASIC. The older Trident II ASICs in the Nexus 9k are 
40G parts.

Re: sFlow vs netFlow/IPFIX

2016-03-01 Thread Peter Phaal
On Tue, Mar 1, 2016 at 6:13 AM, Mark Tinka  wrote:
>
>
> On 29/Feb/16 12:15, Nikolay Shopik wrote:
>
>> Cisco Nexus switches support sflow, since they are broadcom based.
>
> Not all of them, just the Nexus 9000, IIRC.
>

The situation in the Cisco Nexus line is confusing. In addition to
the Nexus 9000 series, the Nexus 3000 series and 3100 series are also
Broadcom based and also support sFlow. The Nexus 3500 series and 6000
series use Cisco ASICs and don't have sFlow or NetFlow support.

It also appears that Cisco's merchant silicon based switches have a
greater variety of orchestration capabilities, Python, NX-API,
Ansible, etc.


Re: mrtg alternative

2016-02-27 Thread Peter Phaal
InfluxDB + Grafana are a modern alternative from the DevOps space:

http://lkhill.com/using-influxdb-grafana-to-display-network-statistics/

On Fri, Feb 26, 2016 at 3:18 PM, Baldur Norddahl
 wrote:
> Hi
>
> I am currently using MRTG and RRD to make traffic graphs. I am searching
> for more modern alternatives that allows the user to dynamically zoom and
> scroll the timeline.
>
> Bonus points if the user can customize the graphs directly in the
> webbrowse. For example he might be able to add or remove individual peers
> from the graph by simply clicking a checkbox.
>
> What is the 2016 tool for this?
>
> Regards,
>
> Baldur


Re: DDOS, IDS, RTBH, and Rate limiting

2014-11-21 Thread Peter Phaal
 Actually, sFlow from many vendors is pretty good (per your points about
 flow
 burstiness and delays), and is good enough for dDoS detection.  Not for
 security forensics, or billing at 99.99% accuracy, but good enough for
 traffic visibility, peering analytics, and (d)DoS detection.

 Well, if it is available, except hardware limitations, there is second
 obstacle,
 software licensing cost. On latest JunOS, for example on EX2200, you need
 to purchase license (EFL), and if am not wrong it is $3000 for 48port units.
 So if only sFlow feature is on stake, it worth to think, to purchase
 license,
 or to purchase server.

Juniper no longer charges for sFlow on the EX2200 (as of Junos 11.2):

http://www.juniper.net/techpubs/en_US/junos11.2/information-products/topic-collections/release-notes/11.2/junos-release-notes-11.2.pdf

I am not aware of any vendor requiring an additional license to enable sFlow.

sFlow (packet sampling) works extremely well for the DDoS flood
detection / mitigation use case. The measurements are built into low
cost commodity switch hardware and can be enabled operationally
without adversely impacting switch performance.  A flood attack
generates high packet rates and sampling a 10G port at 1-in-10,000
will reliably detect flood attacks within seconds.

For most use cases, it is much less expensive to use switches to
perform measurement than to attach taps / mirror port probes. If your
switches don't already support sFlow, you can buy a 10G capable white
box switch for a few thousand dollars that will let you monitor 1.2
Terabits/sec. If you go with an open platform such as Cumulus Linux,
you could even run your DDoS mitigation software on the switch and
dispense with the external server. Embedded instrumentation is simple
to deploy and reduces operational complexity and cost when compared to
add on probe solutions.

Peter Phaal
InMon Corp.


Re: Filter NTP traffic by packet size?

2014-02-23 Thread Peter Phaal
What is the business model for the IX? Unauthorized filtering of
incoming traffic risks collateral damage and outing exchange members
seems problematic.

The business model seems clearer when offering filtering as a service
to downstream networks, the effects are narrowly scoped, and members
have control over the traffic they accept from the exchange, e.g. I
don't want to accept NTP traffic to any destination that exceeds
1Gbit/s, or is sourced from an NTP server on my blacklist. Giving
policy control to the downstream allows them to protect their networks
and make business decisions about how they want to prioritize services
and customers when resources are constrained.

Would exchange members pay for this type of control? DDoS mitigation
appears to be less of a technical problem than an issue of misaligned
costs and benefits. How do you create incentives for upstream
providers to invest in solutions when the benefits accrue downstream?

On Sun, Feb 23, 2014 at 7:14 AM, Mikael Abrahamsson swm...@swm.pp.se wrote:
 On Sun, 23 Feb 2014, Chris Laffin wrote:

 Ive talked to some major peering exchanges and they refuse to take any
 action. Possibly if the requests come from many peering participants it will
 be taken more seriously?


 If only there was more focus on the BCP38 offenders who are the real root
 cause of this problem, I would be more happy.

 I would be more impressed if the IXes would start to use their sFlow
 capabilities to find out what IX ports the NTP queries are coming to
 backtrace the traffic to the BCP38 offendors than try to block the NTP
 packets resulting from these src address forged queries.

 --
 Mikael Abrahamsson    email: swm...@swm.pp.se



Re: Filter NTP traffic by packet size?

2014-02-22 Thread Peter Phaal
Brocade demonstrated how peering exchanges can selectively filter
large NTP reflection flows using the sFlow monitoring and hybrid port
OpenFlow capabilities of their MLXe switches at last week's Network
Field Day event.

http://blog.sflow.com/2014/02/nfd7-real-time-sdn-and-nfv-analytics_1986.html

On Sat, Feb 22, 2014 at 4:43 PM, Chris Laffin claf...@peer1.com wrote:
 Has anyone talked about policing ntp everywhere. Normal traffic levels are 
 extremely low but the ddos traffic is very high. It would be really cool if 
 peering exchanges could police ntp on their connected members.

 On Feb 22, 2014, at 8:05, Paul Ferguson fergdawgs...@mykolab.com wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA256

 On 2/22/2014 7:06 AM, Nick Hilliard wrote:

 On 22/02/2014 09:07, Cb B wrote:
 Summary IETF response:  The problem i described is already solved
 by bcp38, nothing to see here, carry on with UDP

 udp is here to stay.  Denying this is no more useful than trying to
 push the tide back with a teaspoon.

 Yes, udp is here to stay, and I quote Randy Bush on this, I encourage
 my competitors to block udp.  :-p

 - - ferg


 - --
 Paul Ferguson
 VP Threat Intelligence, IID
 PGP Public Key ID: 0x54DC85B2

 -BEGIN PGP SIGNATURE-
 Version: GnuPG v2.0.22 (MingW32)
 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

 iF4EAREIAAYFAlMIynoACgkQKJasdVTchbJsqQD/ZVz5vYaIAEv/z2kbU6kEM+KS
 OQx2XcSkU7r02wNDytoBANVkgZQalF40vhQED+6KyKv7xL1VfxQg1W8T4drh+6/M
 =FTxg
 -END PGP SIGNATURE-





Re: TWC (AS11351) blocking all NTP?

2014-02-03 Thread Peter Phaal
Why burn the village when only one house is the problem? I thought
there might be some interest in hearing about work being done to use
SDN to automatically configure filtering in existing switches and
routers to mitigate flood attacks.

Real-time analytics based on measurements from switches/routers
(sFlow/PSAMP/IPFIX) can identify large UDP flows and integrated hybrid
OpenFlow, I2RS, REST, NETCONF APIs, etc. can be used to program the
switches/routers to selectively filter traffic based on UDP port and
IP source / destination. By deploying a DDoS mitigation SDN
application,  providers can use their existing infrastructure to
protect their own and their customers networks from flood attacks, and
generate additional revenue by delivering flood protection as a value
added service.

https://datatracker.ietf.org/doc/draft-krishnan-i2rs-large-flow-use-case/
http://events.linuxfoundation.org/sites/events/files/slides/flow-aware-real-time-sdn-analytics-odl-summit-v2.pdf

Specifically looking at sFlow, large flood attacks can be detected
within a second. The following article describes a simple example
using integrated hybrid OpenFlow in a 10/40G ToR switch:

http://blog.sflow.com/2014/01/physical-switch-hybrid-openflow-example.html

The example can be modified to target NTP mon_getlist requests and
responses using the following sFlow-RT flow definition:

{keys:'ipdestination,udpsourceport',value:'ntppvtbytes',filter:'ntppvtreq=20,42'}

or to target DNS ANY requests:

{keys:'ipdestination,udpsourceport',value:'frames',filter:'dnsqr=true&dnsqtype=255'}

The OpenFlow block control can be modified to selectively filter UDP
traffic based on the identified UDP source port and destination IP
address.
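
A hedged sketch of how a flow definition like the NTP one might be
pushed to the analyzer over HTTP. The REST path (/flow/<name>/json),
the port 8008, and the flow name are assumptions, not details taken
from this thread; check them against the sFlow-RT documentation. The
request is built but not sent, so the sketch runs offline:

```python
import json
import urllib.request


def flow_definition_request(base_url: str, name: str, definition: dict) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT carrying a flow definition as JSON."""
    body = json.dumps(definition).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/flow/{name}/json",  # assumed REST path
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


ntp_flow = {
    "keys": "ipdestination,udpsourceport",
    "value": "ntppvtbytes",
    "filter": "ntppvtreq=20,42",
}

# "ntp_reflection" is a hypothetical flow name; localhost:8008 is an assumed address.
req = flow_definition_request("http://localhost:8008", "ntp_reflection", ntp_flow)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would send it to a running analyzer.
```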

Vendors are adding new SDN capabilities to their platforms (often as
software upgrades), so it's worth taking a look and seeing what is
possible.

Peter

On Sun, Feb 2, 2014 at 7:38 PM, Larry Sheldon larryshel...@cox.net wrote:
 On 2/2/2014 9:17 PM, ryang...@gmail.com wrote:

 I'd hate to think that NetOps would be so heavy handed in blocking
 all of UDP, as this would essentially halt quite a bit of audio/video
 traffic. That being said, there's still quite the need for protocol
 improvement when making use of UDP, but blocking UDP as a whole is
 definitely not a resolution, and simply creating a wall that not only
 keeps the abusive traffic out, but keeps legitimate traffic from
 flowing freely as it should.


 We had to burn down the village to save it.


 --
 Requiescas in pace o email   Two identifying characteristics
 of System Administrators:
 Ex turpi causa non oritur actio  Infallibility, and the ability to
 learn from their mistakes.
   (Adapted from Stephen Pinker)




Re: TWC (AS11351) blocking all NTP?

2014-02-03 Thread Peter Phaal
On Mon, Feb 3, 2014 at 10:16 AM, Christopher Morrow
morrowc.li...@gmail.com wrote:
 On Mon, Feb 3, 2014 at 12:42 PM, Peter Phaal peter.ph...@gmail.com wrote:
 Why burn the village when only one house is the problem? I thought
 there might be some interest in hearing about work being done to use
 SDN to automatically configure filtering in existing switches and
 routers to mitigate flood attacks.


 that's great... who's got sdn capable gear in deployments today? with
 code and OSS stuff to deal with random SDN pokery? and who has spare
 tcam/etc to deal with said pokery of 'block the attack-du-jour' ?

 There's certainly the case that you could drop acls/something on
 equipment to selectively block the traffic that matters... I suspect
 in some cases the choice was: 50% of the edge box customers on this
 location are a problem, block it across the board here instead of X00
 times (see concern about tcam/etc problems)

I agree that managing limited TCAM space is critical to the
scaleability of any mitigation solution. However, tying up TCAM space
on every edge device with filters to prevent each new threat is likely
to be less scaleable than a measurement driven control that only takes
a TCAM slot on a device when an active attack is detected transiting
that device.

 Real-time analytics based on measurements from switches/routers
 (sFlow/PSAMP/IPFIX) can identify large UDP flows and integrated hybrid
 OpenFlow, I2RS, REST, NETCONF APIs, etc. can be used to program the
 switches/routers to selectively filter traffic based on UDP port and
 IP source / destination. By deploying a DDoS mitigation SDN
 application,  providers can use their existing infrastructure to
 protect their own and their customers networks from flood attacks, and
 generate additional revenue by delivering flood protection as a value
 added service.

 yup, that sounds wonderous... and I'm sure that in the future utopian
 world (like 7-10 years from now, based on age-out of gear and OSS IT
 change requirements) we'll see more of this. I don't think you'll see
 much (in terms of edge ports on the network today) of this happening
 'right now' though.

The current 10G upgrade cycle provides an opportunity to deploy
equipment that is SDN capable. The functionality required for this use
case is supported by current generation merchant silicon and is widely
available right now in inexpensive switches.

 Specifically looking at sFlow, large flood attacks can be detected
 within a second. The following article describes a simple example
 using integrated hybrid OpenFlow in a 10/40G ToR switch:

 hopefully there's some clamp on how much change per device/port you
 plan too? :) I'd hate to see the RP/RE/etc get so busy programming
 tcam that bgp/isis/ospf/etc flaps :(

With integrated hybrid OpenFlow, there is very little activity on the
OpenFlow control plane. The normal BGP, ECMP, LAG, etc. control planes
handle forwarding of packets. OpenFlow is only used to selectively
override specific FIB entries.

I2RS provides a similar capability to selectively override RIB entries
and implement controls. However, I don't know if any vendors are
shipping I2RS capable routers today.

Typical networks probably only see a few DDoS attacks an hour at the
most, so pushing a few rules an hour to mitigate them should have
little impact on the switch control plane.

A good working definition of a large flow is 10% of a link's
bandwidth. If you only trigger actions for large flows then in the
worst case you would only require 10 rules per port to change how
these flows are treated.



 http://blog.sflow.com/2014/01/physical-switch-hybrid-openflow-example.html

 The example can be modified to target NTP mon_getlist requests and
 responses using the following sFlow-RT flow definition:

 {keys:'ipdestination,udpsourceport',value:'ntppvtbytes',filter:'ntppvtreq=20,42'}

 or to target DNS ANY requests:

 {keys:'ipdestination,udpsourceport',value:'frames',filter:'dnsqr=true&dnsqtype=255'}


 this also assume almost 1:1 sampling... which might not be feasible
 either...otherwise you'll be seeing fairly lossy results, right?

Actually, to detect large flows (defined as 10% of link bandwidth)
within a second, you would only require the following sampling rates:

1G link, sampling rate = 1-in-1,000 (large flow = 100M bit/s)
10G link, sampling rate = 1-in-10,000 (large flow = 1G bit/s)
40G link, sampling rate = 1-in-40,000 (large flow = 4G bit/s)
100G link, sampling rate = 1-in-100,000 (large flow = 10G bit/s)

These sampling rates are realistically achievable in production
networks (enabling monitoring on all ports) and would allow you to
detect the specific IP destination and UDP source port associated with
a flood attack, and the switches in the attack path, within a second.
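
A back-of-envelope check of why these rates are enough. This sketch
assumes an average packet size of 1,000 bytes, which is not a figure
from this thread; smaller packets would yield more samples per second:

```python
def samples_per_second(link_gbps, sampling_n, avg_packet_bytes=1000, large_flow_fraction=0.10):
    """Expected sFlow samples per second generated by one large flow."""
    flow_bps = link_gbps * 1e9 * large_flow_fraction
    packets_per_second = flow_bps / (avg_packet_bytes * 8)
    return packets_per_second / sampling_n


# Link speed in Gbit/s paired with the sampling rate suggested above.
for gbps, n in [(1, 1_000), (10, 10_000), (40, 40_000), (100, 100_000)]:
    rate = samples_per_second(gbps, n)
    print(f"{gbps}G, 1-in-{n:,}: {rate:.1f} samples/s")
```

Because the sampling rate scales with link speed, a flow at 10% of
link bandwidth produces the same number of samples per second on every
link type, roughly a dozen under the packet size assumed here.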


 The OpenFlow block control can be modified to selectively filter UDP
 traffic based on the identified UDP source port and destination IP
 address.


 hopefully your OSS and netflow/sflow collection isn't

Re: TWC (AS11351) blocking all NTP?

2014-02-03 Thread Peter Phaal
On Mon, Feb 3, 2014 at 12:38 PM, Christopher Morrow
morrowc.li...@gmail.com wrote:
 On Mon, Feb 3, 2014 at 2:42 PM, Peter Phaal peter.ph...@gmail.com wrote:
 On Mon, Feb 3, 2014 at 10:16 AM, Christopher Morrow
 morrowc.li...@gmail.com wrote:
 On Mon, Feb 3, 2014 at 12:42 PM, Peter Phaal peter.ph...@gmail.com wrote:

 There's certainly the case that you could drop acls/something on
 equipment to selectively block the traffic that matters... I suspect
 in some cases the choice was: 50% of the edge box customers on this
 location are a problem, block it across the board here instead of X00
 times (see concern about tcam/etc problems)

 I agree that managing limited TCAM space is critical to the
 scaleability of any mitigation solution. However, tying up TCAM space
 on every edge device with filters to prevent each new threat is likely

 yup, there's a tradeoff, today it's being made one way, tomorrow
 perhaps a different way. My point was that today the percentage of sdn
 capable devices is small enough that you still need a decimal point to
 measure it. (I bet, based on total devices deployed) The percentage of
 oss backend work done to do what you want is likely smaller...

 the folk in NZ-land (Citylink, reannz ... others - find josh baily /
 cardigan) are making some strides, but only in the exchange areas so
 far. fun stuff... but not the deployed gear as an L2/L3 device in
 TWC/Comcast/Verizon.

I agree that today most networks aren't SDN ready, but there are
inexpensive switches on the market that can perform these functions
and for providers that have them in their network, this is an option
today. In some environments, it could also make sense to drop in a
layer of switches to monitor and control traffic entering / exiting the
network.

 The current 10G upgrade cycle provides an opportunity to deploy

 by 'current 10g upgrade cycle' you mean the one that happened 2-5 yrs
 ago? or somethign newer? did you mean 100G?

I was referring to the current upgrade cycle in data centers, with
servers connected with 10G rather than 1G adapters. The high volumes
are driving down the cost of 10/40/100G switches.


 equipment that is SDN capable. The functionality required for this use
 case is supported by current generation merchant silicon and is widely
 available right now in inexpensive switches.


 right... and everyone is removing their vendor supported gear and
 replacing it with pica8 boxes? The reality is that as speeds/feeds
 have increased over the last while basic operations techiques really
 haven't. Should they? maybe? will they? probably? is that going to
 happen on a dime? nope. Again, I suspect you'll see smaller
 deployments of sdn-like stuff 'soon' and larger deployments when
 people are more comfortable with the operations/failure modes that
 change.

Not just Pica8, most vendors (branded or white box) are using the same
Broadcom merchant silicon, including Cisco, Juniper, Arista,
Dell/Force10, Extreme etc.:

http://blog.sflow.com/2014/01/drivers-for-growth.html


 Specifically looking at sFlow, large flood attacks can be detected
 within a second. The following article describes a simple example
 using integrated hybrid OpenFlow in a 10/40G ToR switch:

 hopefully there's some clamp on how much change per device/port you
 plan too? :) I'd hate to see the RP/RE/etc get so busy programming
 tcam that bgp/isis/ospf/etc flaps :(

 With integrated hybrid OpenFlow, there is very little activity on the
 OpenFlow control plane. The normal BGP, ECMP, LAG, etc. control planes
 handle forwarding of packets. OpenFlow is only used to selectively
 override specific FIB entries.

 that didn't really answer the question :) if I have 10k customers
 behind the edge box and some of them NOW start being abused, then more
 later and that mix changes... if it changes a bunch because the
 attacker is really attackers. how fast do I change before I can't do
 normal ops anymore?

Good point - the proposed solution is most effective for protecting
customers that are targeted by DDoS attacks. While trying to prevent
attackers entering the network is good citizenship, the value and
effectiveness of the mitigation service increases as you get closer to
the target of the attack. In this case there typically aren't very
many targets and so a single rule filtering on destination IP address
and protocol would typically be effective (and less disruptive to the
victim than null routing).


 Typical networks probably only see a few DDoS attacks an hour at the
 most, so pushing a few rules an hour to mitigate them should have
 little impact on the switch control plane.

 based on what math did you get 'few per hour?' As an endpoint (focal
 point) or as a contributor? The problem that started this discussion
 was being a contributor...which I bet happens a lot more often than
 /few an hour/.

I am sorry, I should have been clearer, the SDN solution I was
describing is aimed at protecting the target's links, rather than
mitigating

Re: TWC (AS11351) blocking all NTP?

2014-02-03 Thread Peter Phaal
On Mon, Feb 3, 2014 at 2:58 PM, Christopher Morrow
morrowc.li...@gmail.com wrote:
 wait, so the whole of the thread is about stopping participants in the
 attack, and you're suggesting that removing/changing end-system
 switch/routing gear and doing something more complex than:
   deny udp any 123 any
   deny udp any 123 any 123
   permit ip any any

 is a good plan?

 I'd direct you at:
   https://www.nanog.org/resources/tutorials

 and particularly at:
  Tutorial: ISP Security - Real World Techniques II
  https://www.nanog.org/meetings/nanog23/presentations/greene.pdf

Thanks for the links. Many SDN solutions can be replicated using
manual processes (or are ways of automating currently manual
processes). Programmatic APIs allow the speed and accuracy of the
response to be increased and the solution to be delivered at scale and
at lower cost.

 it's probably not a good plan to forklift your edge, for dos targets
 where all you really need is a 3 line acl.

For many networks it doesn't need to be a forklift upgrade - vendors are
adding programmatic APIs to their existing products (OpenFlow, Arista
eAPI, NETCONF, ALU Web Services ...) - so a firmware upgrade may be
all that is required.

I do think that there are operational advantages to using protocols
like OpenFlow, I2RS, BGP FlowSpec for these soft controls since they
allow the configuration to remain relatively static and they avoid
problems of split control (for example, an operator makes a config
change and saves, locking in a temporary control from the SDN system).

I would argue that the more specific the ACL can be the less
collateral damage. Built-in measurement allows for a more targeted
response.

 Good point - the proposed solution is most effective for protecting
 customers that are targeted by DDoS attacks. While trying to prevent

 Oh, so the 3 line acl is not an option? or (for a lot of customers a
 fine answer) null route? Some things have changed in the world of dos
 mitigation, but a bunch of the basics still apply. I do know that in
 the unfortunate event that your network is the transit or terminus of
 a dos attack at high volume you want to do the least configuration
 that'll satisfy the 2 parties involved (you and your customer)...
 doing a bunch of hardware replacement and/or sdn things when you can
 get the job done with some acls or routing changes is really going to
 be risky.

I think an automatic system using a programmatic API to install as
narrowly scoped a filter as possible is the most conservative and
least risky option. Manual processes are error prone, slow, and blunt
instruments like a null route can cause collateral damage to services.

 Typical networks probably only see a few DDoS attacks an hour at the
 most, so pushing a few rules an hour to mitigate them should have
 little impact on the switch control plane.

 based on what math did you get 'few per hour?' As an endpoint (focal
 point) or as a contributor? The problem that started this discussion
 was being a contributor...which I bet happens a lot more often than
 /few an hour/.

 I am sorry, I should have been clearer, the SDN solution I was
 describing is aimed at protecting the target's links, rather than
 mitigating the botnet and amplification layers.

 and i'd say that today sdn is out of reach for most deployments, and
 that the simplest answer is already available.

 The number of attacks was from the perspective of DDoS targets and
 their service providers.  If you are considering each participant in
 the attack the number goes up considerably.

 I bet roland has some good round-numbers on number of dos attacks per
 day... I bet it's higher than a few per hour globally, for the ones
 that get noticed.

The few per hour number isn't a global statistic. This is the number
that a large hosting data center might experience. The global number
is much larger, but not very relevant to a specific provider looking
to size a mitigation solution.

 note that the focus of the original thread was on the contributors. I
 think the target part of the problem has been solved since before the
 slides in the pdf link at the top...

Do most service providers allow their customers to control ACLs in the
upstream routers? Do they automatically monitor traffic and insert the
filters themselves when there is an attack? I don't believe so - while
the slides describe a solution, automation is needed to make it
available at large scale.

 you're getting pretty complicated for the target side:
   ip access-list 150 permit ip any any log

 (note this is basically taken verbatim from the slides)

 view logs, see the overwhelming majority are to hostX port Y proto
 Z... filter, done.
 you can do that in about 5 mins time, quicker if you care to rush a bit.

An automated system can perform the analysis and apply the filter in a
second with no human intervention. What if you have to manage
thousands of customer links?
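
The manual "view logs, find the dominant target, filter" procedure can be sketched in a few lines. This is only an illustration: the sample dictionaries, the 50% share cut-off and the generated ACL text are assumptions made for the example, not output from any particular router or tool.

```python
from collections import Counter


def dominant_target(samples, share=0.5):
    """Return the (dst_ip, dst_port, proto) tuple that accounts for more than
    `share` of the samples, or None if no single target dominates."""
    if not samples:
        return None
    counts = Counter((s["dst_ip"], s["dst_port"], s["proto"]) for s in samples)
    target, hits = counts.most_common(1)[0]
    return target if hits / len(samples) > share else None


def acl_line(target, acl=150):
    """Format a Cisco-style deny entry for the target (illustrative syntax only)."""
    dst_ip, dst_port, proto = target
    return f"access-list {acl} deny {proto} any host {dst_ip} eq {dst_port}"
```

Run against a stream of logged or sampled packets for each customer link, the same two functions replace the five-minute manual step; pushing the resulting line to a device is left out because it depends entirely on the environment.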

 This brings up an interesting point use case for an OpenFlow 

Re: ddos attacks

2013-12-18 Thread Peter Phaal
Dan,

If you are using sFlow for your measurements, then you might want to take a
look at sFlow-RT for DDoS mitigation. The following case study describes how
sFlow and null routing are being used to mitigate flood attacks:

http://blog.sflow.com/2013/03/ddos.html

The analytics engine will detect flood attacks in less than a second and
you can use the embedded scripting API to initiate automated responses. The
following articles contain basic DDoS mitigation scripts - you just need to
replace the block() and allow() functions with calls to expect scripts,
OpenFlow rules, or REST API calls - whatever makes sense in your
environment.

http://blog.sflow.com/search/label/DoS
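
The general shape of such a script, with pluggable block() and allow() actions, might look like the sketch below. It is written in Python for illustration and is not the sFlow-RT scripting API; the class name, the packets-per-second threshold and the callback signatures are all assumptions.

```python
class FloodDetector:
    """Call block(target) when a target's rate crosses the threshold and
    allow(target) once the rate falls back to or below it."""

    def __init__(self, threshold_pps, block, allow):
        self.threshold_pps = threshold_pps
        self.block = block      # hypothetical hook: null route, ACL, REST call ...
        self.allow = allow      # hypothetical hook: remove whatever block() added
        self.blocked = set()

    def update(self, target, pps):
        """Feed the latest measured rate for one target address."""
        if pps > self.threshold_pps and target not in self.blocked:
            self.blocked.add(target)
            self.block(target)
        elif pps <= self.threshold_pps and target in self.blocked:
            self.blocked.remove(target)
            self.allow(target)
```

A production script would probably add a hold-down timer before calling allow(), since traffic to a null-routed address may stop being visible to the measurement system.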

This is a commercial product, but it's free to try out (no registration
required):

http://inmon.com/products/sFlow-RT.php

Cheers,
Peter


On Wed, Dec 18, 2013 at 8:36 AM, Dan White dwh...@olp.net wrote:

 Can anyone recommend a vendor solution for DDOS mitigation? We are looking
 for a solution that detects DDOS attacks from sflow information and
 automatically announces BGP /32 blackhole routes to our upstream providers,
 or a similar solution.

 Thank You.


 On 08/05/13 21:09 +1000, Ahad Aboss wrote:

 Scott,

 Use a DDOS detection and mitigation system with DPI capabilities to deal
 with traditional DDOS attack and anomalous behaviour such as worm
 propagation, botnet attacks and malicious subscriber activity such as
 flooding and probing. There are only a few vendors who successfully play in
 this space who provide a self healing/self defending system.

 Cheers
 Ahad
 -Original Message-
 From: sgr...@airstreamcomm.net [mailto:sgr...@airstreamcomm.net]
 Sent: Friday, 2 August 2013 11:37 PM
 To: nanog@nanog.org
 Subject: ddos attacks

 I’m curious to know what other service providers are doing to
 alleviate/prevent ddos attacks from happening in your network.  Are you
 completely reactive and block as many addresses as possible or null0 traffic
 to the affected host until it stops or do you block certain ports to prevent
 them.  What’s the best way people are dealing with them?

 Scott


 --
 Dan White




Re: Looking for Netflow analysis package

2013-05-14 Thread Peter Phaal
You might want to take a look at pmacct, http://www.pmacct.net/. It
includes an embedded version of Quagga, allowing BGP AS Path data to be
efficiently joined with flow records.

Peter


On Tue, May 14, 2013 at 3:59 PM, Erik Sundberg esundb...@nitelusa.com wrote:

 Does anyone know of a netflow collector that will do the following.
 *Graph/List Destination Networks By Top AS
 *Graph/List Destination Networks By Top IP Address
 *AS Path Analysis
 *Traffic Type (ICMP, TCP, UDP, IPSEC, HTTP, SSH, SMTP, etc..)

 We will be using this to help us decide who to Peer with and what transit
 Providers to look at.

 I am familiar with Arbor Network's Peak Flow utility but it's a little too
 pricy.
 I also found AS-Stats https://neon1.net/as-stats/ which looks promising from
 the power point on their page.

 Thanks
 Erik


 




Re: SDN - Killer Apps

2013-02-25 Thread Peter Phaal
On Mon, Feb 25, 2013 at 2:10 AM, Saku Ytti s...@ytti.fi wrote:
 On (2013-02-25 13:53 +0530), Glen Kent wrote:

 I understand that this is just some bit of what we can do with SDN. The
 amount of what all can be done is limitless. So, a question to all out
 there - Is my understanding of what can be achieved with SDN, is correct?

 Frankly I don't think there is single answer.

 From my point of view I don't see much use for it as general purpose SP.

There is potential for load balancing to be a killer application for SDN in
the service provider space:

http://blog.sflow.com/2013/02/sdn-and-large-flows.html

What do people think?



Re: Anyone know of a good InfiniBand vendor in the US?

2013-02-21 Thread Peter Phaal
I wanted to bring attention to the following draft proposal from
Mellanox to export traffic information from InfiniBand switches:

http://sflow.org/draft_sflow_infiniband.txt

If you are an InfiniBand user, this is a great opportunity to think
about the types of metrics that you would want from your switches in
order to better understand performance. The operational sensibility
that the NANOG audience brings is particularly valuable.

Comments on the proposal are welcome on the sFlow discussion group:

http://groups.google.com/group/sflow

On Wed, Feb 20, 2013 at 2:25 PM, Tom Ammon thomasam...@gmail.com wrote:
 IPoIB looks more like an application than a network protocol to Infiniband.
 The IB fabric doesn't have a concept of broadcast, so ARP works much
 differently than it does in IPv4/ethernet world - basically an all-nodes
 multicast group handles the distribution of ARP messages. That said, the ib
 drivers that come with redhat/centos are pretty good, and you can always
 download the official OFED drivers from the OFA at
 https://www.openfabrics.org/linux-sources.html if the stuff in your linux
 distribution is missing something.

 I've set up IPoIB routers running 10G NICs on the ethernet side and QDR
 HCAs on the IB side, using quagga to plug in to the rest of my OSPF
 network, and it works fine. Basically you just need to set up quagga like
 you would if you were going to turn a linux box into an ethernet router and
 don't worry about the fact that it's actually IB on one side of the router
 - your network statements, etc., in OSPF in quagga won't change at all.

 You'll find that some things in IB have no equivalent to ethernet. For
 example, if you want to have gateway redundancy for traffic exiting the IB
 fabric, your first instinct will be to look for VRRP for IB, but you won't
 find it, because of the ARP differences I talked about above. To get around
 this you can set up linux-ha or some other type of heartbeat arrangement
 and bring up a virtual IP on the active gateway, which can be shifted over
 to the standby gateway when the ha scripts detect a problem. Some vendors
 also have proprietary solutions to this problem but they tend to be
 expensive.

 So, I'd say, read up on quagga and give that a try, and I think you'll find
 that as long as the IB drivers are up to snuff (the sminfo command returns
 valid results, etc.) it'll pretty much just work for you. I'm also happy to
 discuss more offline if you prefer.

 Tom



 On Tue, Feb 19, 2013 at 5:55 PM, Jon Lewis jle...@lewis.org wrote:

 On Tue, 19 Feb 2013, Landon Stewart wrote:

  Oh by vendor I mean VAR I guess.  Mostly I'm also wondering how an IB
 network handles IPoIB and how one uses IB with a gateway to layer 3
 Ethernet switches or edge routers.  If anyone has any resources that
 provide details on how this works and how ethernet VLANs are handled I'd
 appreciate it.


 My limited IB experience has been that the IB switch acts much like a dumb
 ethernet switch, caring only about which IB hardware addresses are
 reachable via which port.  Routing between IPoIB and IP over ethernet can
 be done by any host with interfaces on both networks and IP forwarding
 enabled.  In our setups, we've used IPoIB, but with 1918 addresses and not
 routed beyond the IB network.

 --**--**--
  Jon Lewis, MCP :)   |  I route
  Senior Network Engineer |  therefore you are
  Atlantic Net            |
 http://www.lewis.org/~jlewis/pgp for PGP public key




 --
 -
 Tom Ammon
 Network Engineer
 M: (801) 674-9273
 t...@tomsbox.net
 -



Re: switch 10G standalone TOR, core to DC

2013-02-19 Thread Peter Phaal
On Tue, Feb 19, 2013 at 8:21 PM, Bao Nguyen ngq...@gmail.com wrote:
 Has anyone worked with the switching vendor Quanta for their 10GE switching as
 TOR? [1] Their spec looked interesting and they are quite cheap.


 [1]
 http://www.quantaqct.com/en/01_product/02_detail.php?mid=30&sid=114&id=116&qs=63


 -bn
 0216331C


Based on the specs, the Quanta switches look like they use Broadcom
merchant silicon and should have similar performance to other switches
based on the same chipset:

http://blog.sflow.com/2011/12/merchant-silicon.html

While many vendors use merchant silicon, there is variability in
firmware, exposed features, CLI etc.



Re: switch 10G standalone TOR, core to DC

2013-01-29 Thread Peter Phaal
Peter,

Network visibility wasn't mentioned as a requirement, but it is worth
considering since the ToR switches are the best place to monitor server
network I/O, tunneled traffic (VxLAN, GRE etc), storage (iSCSI, FCoE,
HDFS etc).

The Nexus 5548 switch does not include monitoring (i.e. no
NetFlow/sFlow). The Nexus 3048, along with all the other 10G ToR
switches so far mentioned on this thread, supports sFlow and provides
wire speed 10G/40G monitoring.

The following article provides additional background:

http://blog.sflow.com/2012/02/10-gigabit-ethernet.html

Cheers,
Peter

On Tue, Jan 29, 2013 at 7:15 AM, Steven Fischer sfischer1...@gmail.com wrote:
 although everyone here seems to hold Cisco in contempt, the Nexus 5548 is a
 rock-solid switch - at least that has been my experience with it.


 On Tue, Jan 29, 2013 at 6:27 AM, Piotr piotr.1...@interia.pl wrote:


 Hello,

 I am looking for some 10G switches, it should work as TOR or core in DC. It
 should have more than 40 port 10G in one unit, wirespeed L2 L3, with
 virtual routers and some other ip functions like some BGP, OSPF, policy
 routing, 1-2U, MLAG, g.8032 (ERPS) trill-like ?

 Other important features are  big port buffers ( something similar to
 Juniper EX8200 - 512 MB per slot), defined counters accessible via snmp
 (like in junos), L3 statistics  accessible via snmp


 Extreme 670 looks good but they have small port buffers. It can be also
 some small chassis with line cards but the cost per 10G ports is too big..

 What vendor, model You prefer or suggest as a solution ?

 thanks for help
 best,
 Peter






 --
 To him who is able to keep you from falling and to present you before his
 glorious presence without fault and with great joy



Re: Detection of Rogue Access Points

2012-10-14 Thread Peter Phaal
Do the layer 2 switches include sFlow instrumentation?

http://sflow.org/products/network.php

The following paper describes how IP TTL values can help identify
unauthorized NAT devices.

http://www.sflow.org/detectNAT/
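
The TTL idea can be sketched as follows. Hosts attached directly to the monitored switch should send packets whose TTL still equals the operating system's initial value, while packets that have already passed through a NAT router arrive with the TTL reduced by one or more. The list of initial TTLs and the (src_ip, ttl) sample format are assumptions for the example; the paper describes the actual method.

```python
# Initial TTL values commonly used by host operating systems (an assumption).
COMMON_INITIAL_TTLS = (32, 64, 128, 255)


def hops_from_source(ttl):
    """Estimate router hops as the distance below the nearest initial TTL."""
    initial = min(t for t in COMMON_INITIAL_TTLS if t >= ttl)
    return initial - ttl


def suspected_nat(samples):
    """samples: iterable of (src_ip, ttl) pairs seen on access ports where every
    host is expected to be directly attached. Returns suspicious source IPs."""
    return sorted({ip for ip, ttl in samples if hops_from_source(ttl) > 0})
```

This only helps with the consumer router case (situation #2); a transparent bridge does not decrement the TTL.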

Peter

On Sun, Oct 14, 2012 at 1:59 PM, Jonathan Rogers quantumf...@gmail.com wrote:
 Gentlemen,

 An issue has come up in my organization recently with rogue access points.
 So far it has manifested itself two ways:

 1. A WAP that was set up specifically to be transparent and provided
 unprotected wireless access to our network.

 2. A consumer-grade wireless router that was plugged in and just worked
 because it got an address from DHCP and then handed out addresses on its
 own little network.

 These are at remote sites that are on their own subnets (10.100.x.0/24;
 about 130 of them so far). Each site has a decent Cisco router at the
 demarc that we control. The edge is relatively low-quality managed layer 2
 switches that we could turn off ports on if we needed to, but we have to
 know where to look, first.

 I'm looking for innovative ideas on how to find such a rogue device,
 ideally as soon as it is plugged in to the network. With situation #2 we
 may be able to detect NAT going on that should not be there. Situation #1
 is much more difficult, although I've seen some research material on how
 frames that originate from 802.11 networks look different from regular
 ethernet frames. Installation of an advanced monitoring device at each site
 is not really practical, but we may be able to run some software on a
 Windows PC in each office. One idea put forth was checking for NTP traffic
 that was not going to our authorized NTP server, but NTP isn't necessarily
 turned on by default, especially on consumer-grade hardware.

 Any ideas?

 Thank you for your time,

 Jonathan Rogers



Re: Real world sflow vs netflow?

2012-09-24 Thread Peter Phaal
On Mon, Sep 24, 2012 at 5:48 AM, Joe Loiacono jloia...@csc.com wrote:
 Peter Phaal peter.ph...@gmail.com wrote on 09/23/2012 12:23:57 PM:


 Exporting packet oriented measurements doesn't mean that you have to
 lose ingress/egress interface data. In the specific example being
 discussed (sFlow export), detailed forwarding information from the
 router forwarding plane is exported with each sampled packet header
 (full AS-path if you are using BGP).


 Wrt AS-path, I don't get how this happens. Since this is important to this
 community, could you explain?

Sure. I think it's worth discussing in some detail since this is
relevant to the NANOG community and it is important to understand how
it works.

When a switch/router decides to sample a packet it records the
ingress/egress interfaces and accumulates information about how it
decided to forward the packet by examining its FIB tables. Each packet
may take a different path, some may be switched at layer 2, others may
be forwarded based on a local routing protocol like OSPF, and still
others may be forwarded based on BGP.

The forwarding data associated with each packet is irregular (e.g. a
switched packet won't have BGP information), and so sFlow doesn't try
to flatten it into tables, but instead encodes the data using XDR (RFC
1832), expressing each element of the forwarding decision as a tag,
length, value encoded structure that contains attributes relevant to
each type of forwarding decision. The AS-Path itself is a fairly
complicated, variable length structure and again, this is encoded as
XDR.
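
As a rough illustration of the tag, length, value layout, the sketch below walks a buffer of XDR-encoded records and decodes the extended_switch structure (format 1001). It assumes the sFlow version 5 convention that the tag carries the enterprise number in the upper 20 bits and the format in the lower 12; the function names are made up for the example.

```python
import struct


def iter_records(buf):
    """Yield (enterprise, fmt, payload) for each tag-length-value record in buf."""
    offset = 0
    while offset + 8 <= len(buf):
        # Every XDR integer is a 4-byte big-endian value.
        data_format, length = struct.unpack_from("!II", buf, offset)
        offset += 8
        payload = buf[offset:offset + length]
        # XDR pads opaque data to a 4-byte boundary.
        offset += (length + 3) & ~3
        # Enterprise in the top 20 bits, structure format in the low 12 bits.
        yield data_format >> 12, data_format & 0xFFF, payload


def decode_extended_switch(payload):
    """Decode the four unsigned ints of extended_switch (format 1001)."""
    src_vlan, src_priority, dst_vlan, dst_priority = struct.unpack("!4I", payload)
    return {"src_vlan": src_vlan, "src_priority": src_priority,
            "dst_vlan": dst_vlan, "dst_priority": dst_priority}
```

Because each record carries its own length, a collector can step over structures it does not understand, which is what lets the forwarding data stay irregular from packet to packet.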

These are all optional fields in sFlow, so you should check with your
switch vendor to see which ones they support. If they don't currently
export the FIB data you are looking for, you should ask them to
upgrade their agent because as Jeroen pointed out, populating each
structure is just an extra lookup performed by the management CPU on
the router.

FYI I have seen full AS-path data exported from a busy 100G router, so
there should be no problem collecting these measurements in a
production setting.

The following extract from the sFlow version 5 specification shows
what forwarding information is exported:

/* Extended Flow Data

   Extended data types provide supplementary information about the
   sampled packet. All applicable extended flow records should be
   included with each flow sample. */

/* Extended Switch Data */
/* opaque = flow_data; enterprise = 0; format = 1001 */
/* Note: For untagged ingress ports, use the assigned vlan and priority
 of the port for the src_vlan and src_priority values.
 For untagged egress ports, use the values for dst_vlan and
 dst_priority that would have been placed in the 802.Q tag
 had the egress port been a tagged member of the VLAN instead
 of an untagged member. */

struct extended_switch {
   unsigned int src_vlan; /* The 802.1Q VLAN id of incoming frame */
   unsigned int src_priority; /* The 802.1p priority of incoming frame */
   unsigned int dst_vlan; /* The 802.1Q VLAN id of outgoing frame */
   unsigned int dst_priority; /* The 802.1p priority of outgoing frame */
}

/* IP Route Next Hop
   ipForwardNextHop (RFC 2096) for IPv4 routes.
   ipv6RouteNextHop (RFC 2465) for IPv6 routes. */

typedef next_hop address;

/* Extended Router Data */
/* opaque = flow_data; enterprise = 0; format = 1002 */

struct extended_router {
   next_hop nexthop;            /* IP address of next hop router */
   unsigned int src_mask_len;   /* Source address prefix mask
                                   (expressed as number of bits) */
   unsigned int dst_mask_len;   /* Destination address prefix mask
                                   (expressed as number of bits) */
}

enum as_path_segment_type {
   AS_SET      = 1,            /* Unordered set of ASs */
   AS_SEQUENCE = 2             /* Ordered set of ASs */
}

union as_path_type (as_path_segment_type) {
   case AS_SET:
      unsigned int as_set<>;
   case AS_SEQUENCE:
      unsigned int as_sequence<>;
}

/* Extended Gateway Data */
/* opaque = flow_data; enterprise = 0; format = 1003 */

struct extended_gateway {
   next_hop nexthop;           /* Address of the border router that should
                                  be used for the destination network */
   unsigned int as;            /* Autonomous system number of router */
   unsigned int src_as;        /* Autonomous system number of source */
   unsigned int src_peer_as;   /* Autonomous system number of source peer */
   as_path_type dst_as_path<>; /* Autonomous system path to the destination */
   unsigned int communities<>; /* Communities associated with this route */
   unsigned int localpref;     /* LocalPref associated with this route */
}
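
For readers who want to see how a collector might pull the AS path out of this structure, here is a minimal decoding sketch. It assumes that dst_as_path, each path segment and communities are XDR variable-length arrays (a count followed by that many elements), as in the published sFlow version 5 specification, and that an address is encoded as a type word (1 = IPv4, 2 = IPv6) followed by the raw bytes. The Reader helper is hypothetical.

```python
import ipaddress
import struct

SEGMENT_TYPES = {1: "AS_SET", 2: "AS_SEQUENCE"}


class Reader:
    """Sequential reader for big-endian XDR data (hypothetical helper)."""

    def __init__(self, buf):
        self.buf, self.pos = buf, 0

    def u32(self):
        (value,) = struct.unpack_from("!I", self.buf, self.pos)
        self.pos += 4
        return value

    def u32_array(self):
        count = self.u32()
        return [self.u32() for _ in range(count)]

    def raw(self, n):
        data = self.buf[self.pos:self.pos + n]
        self.pos += n
        return data


def decode_extended_gateway(payload):
    r = Reader(payload)
    addr_type = r.u32()
    if addr_type == 1:
        nexthop = str(ipaddress.IPv4Address(r.raw(4)))
    elif addr_type == 2:
        nexthop = str(ipaddress.IPv6Address(r.raw(16)))
    else:
        nexthop = None  # unknown address type carries no data
    router_as, src_as, src_peer_as = r.u32(), r.u32(), r.u32()
    path = []
    for _ in range(r.u32()):  # number of path segments
        seg_type = SEGMENT_TYPES.get(r.u32(), "UNKNOWN")
        path.append((seg_type, r.u32_array()))
    return {"nexthop": nexthop, "as": router_as, "src_as": src_as,
            "src_peer_as": src_peer_as, "dst_as_path": path,
            "communities": r.u32_array(), "localpref": r.u32()}
```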



Re: Real world sflow vs netflow?

2012-09-24 Thread Peter Phaal
On Mon, Sep 24, 2012 at 11:19 AM, Joe Loiacono jloia...@csc.com wrote:
 OK, Well I guess I was thinking sFlow was primarily a switch oriented
 technology versus on a layer-3 peering router.

The sFlow technology is a good fit for any device that performs a
packet forwarding function (including routers) and the sFlow.org web
site maintains a list of switches and routers that implement the
technology,

http://sflow.org/products/network.php

However, you are correct that today sFlow is more broadly implemented
in switching platforms than routing platforms, but I expect this will
change as network speeds increase and platforms converge.



Re: Real world sflow vs netflow?

2012-09-23 Thread Peter Phaal
On Sun, Sep 23, 2012 at 8:16 AM, Dobbins, Roland rdobb...@arbor.net wrote:

 On Sep 23, 2012, at 7:55 PM, Danny McPherson wrote:

 If the *flow generation process is not performed on the router (or otherwise
 conveyed by some metadata outside of raw [sampled] packet headers) then
 you lose visibility to ingress and egress ifIndex (interface) information --
 information which is required if/when deploying controls on those systems to
 squelch various traffic flows.

 Thanks, Danny - I guess I should've spelled it out, thanks for clarifying, heh.

 It should also be noted that generating the flows directly from the data plane
 of the router/switch or doing it offboard (as long as sufficient ingress/egress
 ifindex metadata are collected and exported, as you note) is just an
 implementation detail - it isn't inherent to s/Flow, NetFlow, IPFIX, et. al.
 So, claiming this as some kind of advantage for a particular flow telemetry
 format is a non sequitur.

Exporting packet oriented measurements doesn't mean that you have to
lose ingress/egress interface data. In the specific example being
discussed (sFlow export), detailed forwarding information from the
router forwarding plane is exported with each sampled packet header
(full AS-path if you are using BGP). An external flow generator in
this case can produce flow records that are identical to those that
the device would produce, i.e. include ingress/egress ports.

The difference between packet oriented or flow oriented export is an
implementation detail if your only requirement is to obtain layer IP
flow records, but becomes significant if you want to create customized
flow records or create packet oriented metrics. Applications for
packet oriented metrics mentioned earlier in this thread included
route analytics, analysis of ECMP/LAG/TRILL forwarding, packet size
distribution vs. DSCP, DDoS mitigation.

The problem with having the router perform the flow analysis is that
once data is aggregated, it can't be disaggregated. It's like the
difference between receiving eggs or an omelette. If you like the
omelette, great! But if you want a different omelette or would like
to poach, boil, scramble or bake your eggs then getting the raw eggs
is a lot more versatile.



Re: Real world sflow vs netflow?

2012-09-22 Thread Peter Phaal
On Fri, Sep 21, 2012 at 10:02 PM, Dobbins, Roland rdobb...@arbor.net wrote:

 On Sep 22, 2012, at 12:40 AM, Peter Phaal wrote:

  However, moving the flow generation out of the router gives a lot of
  flexibility.

 Actually, moving it out of the router creates huge problems and destroys a lot
 of the value of the flow telemetry - it nullifies your ability to traceback
 where traffic is ingressing your network, which is key for both security as
 well as traffic engineering, peering analysis, etc.

 It is far, far better to get your flow telemetry from your various edge
 routers, if at all possible, rather than probes.  Scales better, too - and is
 less expensive in terms of both capex and opex.

Roland,

I probably wasn't as clear as I should have been in describing how
sFlow works. Here are some comments and links to additional
information that address each of your concerns:

1. There are no probes involved when using sFlow, the architecture
looks very similar to NetFlow with UDP records streaming from multiple
routers to a software collector.

http://blog.sflow.com/2009/05/choosing-sflow-analyzer.html

2. The sFlow records exported by the router include telemetry that
allows you to trace traffic paths through the network (ingress port,
egress port, FIB entry etc.).

http://blog.sflow.com/2009/05/packet-paths.html

3. sFlow has lower CAPEX: the flow cache resides in inexpensive
memory on a commodity server instead of limited, expensive, TCAM
memory on the router. The sFlow instrumentation is included in ASICs
and is a base feature of the device; unlike NetFlow which often
requires upgraded supervisor cards etc. sFlow is widely supported in
merchant silicon, further reducing costs and increasing multi-vendor
interoperability - Cisco supports sFlow in the merchant silicon based
Nexus 3k series.

http://blog.sflow.com/2010/09/superlinear.html
http://blog.sflow.com/2011/12/merchant-silicon.html
http://blog.sflow.com/2012/09/vendor-support.html
http://blog.sflow.com/2012/08/cisco-adds-sflow-support.html

4. sFlow has lower OPEX: the architecture is simpler, has lower
operational complexity and provides much greater scalability.

http://blog.sflow.com/2010/11/complexity-kills.html
http://blog.sflow.com/2010/09/superlinear.html

Peter



Re: Real world sflow vs netflow?

2012-09-22 Thread Peter Phaal
On Sat, Sep 22, 2012 at 4:41 PM, Dobbins, Roland rdobb...@arbor.net wrote:
 You have misinterpreted what I said.  I was saying that flow telemetry of any
 variety must be exported from edge devices, which in most cases are routers
 (in some cases layer-3 switches), in response to your 'move it out of the
 router' comment.

I am sorry I misunderstood your comment, I agree that it is important
to gather telemetry directly from your edge devices. The comment "move
it out of the router" referred to the location of the flow-cache in
the following scenario.

On Thu, Sep 20, 2012 at 11:21 AM, Mikael Abrahamsson swm...@swm.pp.se wrote:
 Most of the platforms I know of do sampled netflow at 1:100-1:1000 or so,
 and then I don't really see the fundamental difference in doing the flow
 analysis on the router itself (classic netflow) or doing the same but at the
 sFlow collector.

In both cases the router is generating the telemetry, in the netflow
case, packets are sampled on the router, the router builds flow
records based on the contents of the sampled packets, and the flow
records are exported. In the sFlow case, the raw sampled packet
headers are exported to external software which builds flow records.
In both cases the router is making the primary measurements and you
end up with the same measurements.
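
A collector-side flow generator along these lines can be sketched briefly. The sample field names are assumptions about how a collector might represent a decoded sFlow flow sample; the point is that the ingress and egress ports reported by the router are part of each sample and can be kept in the flow key.

```python
from collections import defaultdict


def build_flows(samples):
    """Aggregate sampled packets into flow records keyed by interface pair
    and 5-tuple, scaling each sample by its sampling rate."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for s in samples:
        key = (s["in_port"], s["out_port"], s["src_ip"], s["dst_ip"],
               s["proto"], s["src_port"], s["dst_port"])
        rate = s["sampling_rate"]          # 1-in-N, reported by the agent
        flows[key]["packets"] += rate
        flows[key]["bytes"] += rate * s["frame_length"]
    return dict(flows)
```

Changing the key (for example to inner tunnel addresses) is a software change on the collector, which is the flexibility being argued for in this thread.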

On Fri, Sep 21, 2012 at 10:02 PM, Dobbins, Roland rdobb...@arbor.net wrote:
 Actually, moving it out of the router creates huge problems and destroys a lot
 of the value of the flow telemetry - it nullifies your ability to traceback
 where traffic is ingressing your network, which is key for both security as
 well as traffic engineering, peering analysis, etc.

 It is far, far better to get your flow telemetry from your various edge
 routers, if at all possible, rather than probes.  Scales better, too - and is
 less expensive in terms of both capex and opex.

I agree completely, probes are expensive, difficult to manage and
can't accurately tell you how the traffic passed through the router.



Re: Real world sflow vs netflow?

2012-09-21 Thread Peter Phaal
On Thu, Sep 20, 2012 at 11:21 AM, Mikael Abrahamsson swm...@swm.pp.se wrote:
 Most of the platforms I know of do sampled netflow at 1:100-1:1000 or so,
 and then I don't really see the fundamental difference in doing the flow
 analysis on the router itself (classic netflow) or doing the same but at the
 sFlow collector.

There is no difference in the flow records you would obtain in either
case. However, moving the flow generation out of the router gives a
lot of flexibility. You can now choose how you want to generate flows,
rather than depend on the router vendor. You are also guaranteed
multi-vendor interoperability since problems associated with
differences in how each vendor generates flows are eliminated.

For a real world example on the need for flexibility in monitoring,
consider the challenge posed by IPv6 migration and virtualization as
they greatly increase the amount of layer 2, 3 and 4 tunneled traffic.
With external software-based flow generation you can easily upgrade
the software to report flows within the tunnels etc.

http://blog.sflow.com/2012/05/tunnels.html

There are many other things you can do with packet oriented (sFlow)
data besides flow generation and analysis that I think are worth being
aware of:

1. Route analytics. Packet forwarding decisions are made on a packet
by packet basis and sFlow accurately captures the forwarding decision
made for each sampled packet (flows are not a good way to report
forwarding decisions since you are forced to assume that all the
packets in the flow took the same forwarding path, which may not be
the case). With packet oriented measurements you can build a route
cache and use it to understand traffic forwarding based on AS-path,
next hop router etc.

2. Analysis of multi-path forwarding. Detailed visibility into
per-packet forwarding lets you diagnose issues with unbalanced LAG
groups, ECMP paths, TRILL paths etc.

3. Packet sizes. With packet oriented data you can easily calculate
packet size distributions by protocol, DSCP class, egress port etc.

4. DDoS detection and mitigation. Analysis of the sampled packet
stream can detect DDoS attacks within seconds and an automatic
response can be constructed using packet forwarding and header
information to find a signature for the attack, point of ingress etc.
You can also use packet analyzers like Wireshark and tcpdump to look
at the sFlow packet header records,
http://blog.sflow.com/2011/11/wireshark.html

5. Packet counters. MIB-2 interface counters are included in the set
of measurements that sFlow exports. Eliminating SNMP polling reduces
CPU load on the router (I have seen very high router CPU loads
associated with SNMP) and provides much faster updates on link
utilizations, packet discard rates etc.
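
To make item 3 concrete, a packet size distribution by DSCP class takes only a few lines once the per-packet samples are available. The bucket boundaries and the (dscp, frame_length) sample format are arbitrary choices for this sketch.

```python
from collections import Counter, defaultdict

# Upper bounds of the size buckets in bytes (arbitrary choice for the sketch).
BUCKETS = (64, 128, 256, 512, 1024, 1518)


def size_bucket(frame_length):
    """Return the label of the first bucket that can hold the frame."""
    for upper in BUCKETS:
        if frame_length <= upper:
            return f"<={upper}"
    return f">{BUCKETS[-1]}"


def size_distribution(samples):
    """samples: iterable of (dscp, frame_length) taken from sampled headers.
    Returns {dscp: Counter({bucket_label: sample_count})}."""
    dist = defaultdict(Counter)
    for dscp, frame_length in samples:
        dist[dscp][size_bucket(frame_length)] += 1
    return dist
```

The same measurement is hard to recover from flow records, which report only total packets and bytes per flow.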

I think Nick Hilliard put it well:

 Flows are good for measuring some things; raw packet sampling is good for
 measuring others.

 Decide on what you're trying to measure, then pick the best tool for the job.

However, to choose intelligently requires an understanding of the
fundamental differences between packet oriented and flow oriented
measurements, particularly as to how those differences relate to the
problem you are trying to solve. The two types of measurement are
related, but not the same.



Re: Real world sflow vs netflow?

2012-09-20 Thread Peter Phaal
On Sat, Jul 14, 2012 at 1:30 AM, Łukasz Bromirski luk...@bromirski.net wrote:
 sFlow is really sPacket, as it doesn't deal with flows.

 NetFlow, jFlow, IPFIX deal with flows.

I am puzzled by the orthodoxy that seems to prevail around the value
of flows as a measure of network traffic in packet switched networks.

The following article contains some thoughts on flow oriented and
packet oriented measurements. Apologies to NANOG readers for the
simplistic analogies used to describe packet switching, the article is
also intended for server administrators and application developers who
often don't really know what happens when they write some bytes to a
TCP socket.

http://blog.sflow.com/2012/09/packets-and-flows.html

The article positions flows as a useful abstraction for characterizing
host and application performance, but as a poor fit for understanding
packet traffic and measuring the performance of packet switches and
routers. This isn't really an issue of sFlow vs. NetFlow/IPFIX etc.
Either protocol can be used to export both types of measurements; the
question is what types of measurement should be exported.

What do people think?

Peter



Re: Real world sflow vs netflow?

2012-07-17 Thread Peter Phaal
In the case of sFlow, the collector determines how to report bytes.
The sFlow agent reports the size of the sampled layer 2 frame (along
with the first 128 bytes of the frame) and the collector can choose
whether to report L2 bytes, L3 bytes, L4 bytes etc. by subtracting the
sizes of the headers. It seems likely that the sFlow collector used in
the tests was reporting L3 bytes since the numbers were in agreement
with the numbers reported by NetFlow.
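
The subtraction a collector performs can be sketched as below, assuming plain Ethernet framing and a frame length that includes the 4-byte FCS (as the sFlow frame_length field does for layer 2 headers). The function name and the vlan_tags parameter are made up for the example.

```python
ETHERNET_HEADER = 14   # destination MAC + source MAC + EtherType
VLAN_TAG = 4           # one 802.1Q tag
FCS = 4                # frame check sequence


def l3_bytes(frame_length, vlan_tags=0):
    """Layer 3 bytes for a sampled Ethernet frame of the given total length."""
    return frame_length - ETHERNET_HEADER - VLAN_TAG * vlan_tags - FCS
```

For an untagged frame the difference is 18 bytes, which on an average frame of around 600 bytes is 3%, so layer 2 overhead on its own could account for a gap of that size.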

Peter

On Tue, Jul 17, 2012 at 8:32 AM, Simon Leinen simon.lei...@switch.ch wrote:
 James Braunegg writes:
 That being said both netflow and sflow both under read by about 3%
 when compared to snmp port counters, which we put to the conclusion
 was broadcast traffic etc which the routers didn't see / flow.

 That's one reason, but another reason would be that at least in Netflow
 (but sFlow may be similar depending on how you use it), the reported
 byte counts only include the sizes of the L3 packets, i.e. starting at
 the IP header, while the SNMP interface counters (ifInOctets etc.)
 include L2 overhead such as Ethernet frame headers and such.
 --
 Simon.




Re: Real world sflow vs netflow?

2012-07-13 Thread Peter Phaal
Hi David,

The main architectural difference between sFlow and Netflow is the
location of the flow cache:

1. NetFlow: Packets are decoded on the router, flow keys are extracted
and used to lookup/create an entry in a flow cache which is then
updated based on values in the packet. Records are exported from the
flow cache in the form of Netflow datagrams when the flow completes or
based on a timeout.
2. sFlow: Packets are randomly sampled in hardware and the packet
headers are immediately exported as sFlow datagrams - there is no flow
cache on the switch/router. In addition to exporting the packet
header, the sFlow agent captures the FIB state associated with
forwarding the sampled packet, exporting information such as next hop
router, AS-path, communities etc. An sFlow agent also periodically
sends all the MIB-II interface counters, eliminating the need for SNMP
polling - this isn't very important if you are only monitoring a few
links, but makes a big difference if you are monitoring large chassis
switches or tens or hundreds of thousands of ports in a data center or
campus environment.

Moving the flow cache off the router has a number of benefits:
1. You are no longer limited by the hardware/firmware capabilities of
the router - your analysis software decides which fields to decode and
how to accumulate results. For example, if you are managing a mixed
IPv4/IPv6 environment you can decide to use sFlow to look into v6 over
v4 and v4 over v6 tunnels (to do the same thing with Netflow would
likely require a hardware upgrade). You can even feed sFlow into
Wireshark for detailed analysis of protocols and packet headers.
2. Operational complexity is greatly reduced since the configuration
options and resource management issues associated with the flow cache
are eliminated.
3. Low latency. Measurements aren't delayed by the flow cache - you
can detect DDoS attacks/large flows within seconds.
4. Scalability - you can turn on sFlow on every link (even 100G
links), on every device for a comprehensive view of traffic.
5. Multi-vendor interoperability. The sFlow measurements are
interoperable across vendors (since very little processing is
performed on the devices). With NetFlow, different vendors and devices
have different hardware limitations affecting the fields that they can
export.
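
As an illustration of point 1, a collector-side sketch in which the
analysis software, not the router, chooses the flow key. Each sample is
assumed to be a dict of already-decoded header fields; the dict layout
and the function name are hypothetical, with "frame_length" and
"sampling_rate" standing in for the corresponding values carried in the
sFlow records.

```python
from collections import Counter


def top_keys(samples, key_fields, n=5):
    """Accumulate scaled bytes per key and return the n largest.

    samples: iterable of dicts of decoded fields (hypothetical layout).
    key_fields: which fields form the flow key, chosen by the analyzer.
    """
    totals = Counter()
    for s in samples:
        key = tuple(s[f] for f in key_fields)
        # Each sample stands for roughly sampling_rate packets of this size.
        totals[key] += s["frame_length"] * s["sampling_rate"]
    return totals.most_common(n)
```

The same samples can be re-keyed at will, e.g. top_keys(samples,
("src",)) for top sources and top_keys(samples, ("dst",)) for top
destinations, without touching the device configuration.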

Unsampled NetFlow is only practical for moderate traffic levels. If
you carry significant traffic you would want to enable sampling
anyway, even with NetFlow. However, there is a wide range of NetFlow
sampling implementations, many of which yield questionable results. In
contrast, the sFlow standard specifies how sampling must be performed
and ensures that information is included that allows the sampled data
to be correctly scaled to produce unbiased measurements.
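
A small sketch of what "correctly scaled" means for 1-in-N random
packet sampling: multiply the number of samples in a traffic class by
the sampling rate. The error bound used here, 196 * sqrt(1/c) percent
for c samples at 95% confidence, is the commonly quoted approximation
for packet sampling and is added for illustration; the function names
are assumptions.

```python
import math


def scale_samples(sample_count, sampling_rate):
    """Estimate total packets in a class from samples taken at 1-in-sampling_rate."""
    return sample_count * sampling_rate


def pct_error_95(sample_count):
    """Approximate upper bound on relative error (%) at 95% confidence."""
    if sample_count <= 0:
        raise ValueError("need at least one sample")
    return 196.0 * math.sqrt(1.0 / sample_count)
```

Under this approximation the accuracy depends on how many samples fall
in the class, not on the sampling rate itself: 100 samples give an
estimate within roughly 20%, 10,000 samples within roughly 2%.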

Cheers,
Peter

On Fri, Jul 13, 2012 at 10:30 AM, David Hubbard
dhubb...@dino.hostasaurus.com wrote:
 Can anyone on or off list give me some real world
 thoughts on sflow vs netflow for border
 routers? (multi-homed, BGP, straight v4 & v6 only
 for web hosting, no mpls, vpns, vlans, etc.)

 Finding it hard to decipher the vendor version
 of the answer to that question.  We use
 netflow v9 currently but are considering hardware
 that would be sflow.  We don't use it for
 billing purposes, mostly for spotting malicious
 remote hosts doing things like scans, spotting
 traffic such as weird ports in use in either
 direction that warrant further investigation,
 watching for ddos/dos destinations to act on
 mitigation, or investigating the nature of unusual
 levels of traffic on switch ports that set off
 alarms.  I'm concerned things like port scans,
 etc. won't be picked up by the NMS if fed by
 sflow due to the sampling nature, or similar
 concern if 500 ssh connections by the same remote
 host are sampled as 1 connection, etc.  Of course
 these concerns were put in my head by someone
 interested in me continuing to use equipment that
 happens to output netflow data, hence me wanting some
 real people answers. :-)

 Thanks!





Re: Network Traffic Collection

2012-02-23 Thread Peter Phaal
On Thu, Feb 23, 2012 at 1:59 PM, Justin M. Streiner
strei...@cluebyfour.org wrote:
 On Thu, 23 Feb 2012, Maverick wrote:

 I want to be able to see information like how much traffic an IP sends
 over a period of time, what machines it talked to, etc. From this
 perspective it should be IP based, but I would really like to know how
 other people do it.


 Truth is that most people probably don't do it, beyond temporary, ad-hoc
 deployments, to solve a specific problem at a specific point in time.
 Traffic capture and analysis doesn't scale too well into multi-Gb/s service
 provider environments.

 Netflow tools are an option if 'reasonably accurate' is good enough for your
 needs.

 jms


For high speed switched Ethernet environments, consider using sFlow.

You can treat sFlow as remote packet capture and use Wireshark/tcpdump
for troubleshooting network traffic:

http://blog.sflow.com/2011/11/wireshark.html

Or use sFlow reporting tools to find IP sources, protocols etc.:

http://sflow.org/products/collectors.php

Which tool to choose depends on your requirements.



Re: What sflow software - Manage Engine Net flow analyzer or Plixer Scrutinizer with Analyzer

2011-01-01 Thread Peter Phaal
sFlowTrend is free for up to five routers and should meet your requirement to 
quickly see top flows:

http://inmon.com/products/sFlowTrend.php

sFlowTrend is InMon's entry level product; if you need more features you might
want to try sFlowTrend-Pro or Traffic Sentinel.

When selecting an sFlow analyzer, it is important to understand the sFlow 
architecture and the functional requirements it places on the analyzer - many 
products are principally NetFlow analyzers and do a poor job with sFlow:

http://blog.sflow.com/2009/05/choosing-sflow-analyzer.html

Peter

On Jan 1, 2011, at 2:56 AM, Alex Pinto alex.pint...@hotmail.com wrote:

 
 Hi everyone, we are currently looking at sflow options for a commercial
 collector and analyzer. The core use is for visibility on our network, for
 quickly detecting source / destination IP addresses, i.e. where the traffic is
 going and where it is coming from. The type of traffic would be interesting
 also, but to be honest all that really matters is source / destination.
 
 The requirement of the sflow software is to give us options and data very 
 quickly in the event of a DDOS attack so mitigation can occur quickly once we 
 understand what’s happening on the network. The last thing we want is for the 
 software not to work under a DDOS (too much data) thus leaving us blind upon 
 an attack. The quicker the software can report on issues, the quicker we can 
 do something about it. 
 Our current routers are fully sflow capable and both export nicely to both 
 packages.
 
 Our findings so far
 
 Manage Engine Net flow analyzer has both a Linux and a Windows version. The
 software is very light and seems to perform very fast, although it is light on
 additional features such as custom reporting, alerting, and in-depth packet
 information. The concern: is this software too simple, and will it work under
 heavy load?
 Based on our needs Manage Engine Net flow costs $2000.00
 
 Plixer Scrutinizer – based on Windows, the software seems resource intensive
 but has a MASSIVE amount of extra visibility built in, including automatic
 alerts. That being said, the software does seem much more complex to
 configure and understand, reports seem to take longer to produce, and the
 information doesn’t seem to be reported as quickly (i.e. it lags by minutes
 or so compared to Manage Engine).
 Based on our needs Plixer Scrutinizer Costs $4000.00
 
 Does anyone have any real life experience with either package? The cost
 difference between the two packages doesn’t worry us; it’s all about selecting
 the correct package, knowing that the one time we need to access the flow
 information and get it quickly, the package we choose performs quickly and
 works.
 
 I’d also like to hear from anyone else using another commercial solution, 
 which they would recommend.
 
 Thanks in advance
 
 Alex 



ipfix/netflow/sflow generator for Linux

2010-12-27 Thread Peter Phaal
The latest version of Host sFlow adds support for ULOG traffic
monitoring (with ingress/egress ifIndex numbers):
http://host-sflow.sourceforge.net/

Cheers,
Peter


 My only issue is that I can't seem to find any good software for Linux that
 works with multiple interfaces to generate the flow information. I've tried
 ndsad, nprobe, softflowd, host sflow, and ipcad without much luck. Most of
 the software only works on one interface (which is useless as I need to do
 accounting for numerous interfaces).