RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-10 Thread c b
If you need that kind of density, I recommend a Clos fabric. Arista, Juniper, 
Brocade, Big Switch BCF and Cisco all have solutions that would allow you to 
build a high-density leaf/spine. You can build the Cisco solution with NXOS or 
ACI, depending on which models you choose. The prices on these solutions are all 
somewhat in the same ballpark based on list pricing I've seen... even Cisco 
(the Nexus 9k is surprisingly in the same range as branded whitebox). There is 
also Pluribus which offers a fabric, but their niche is having server procs on 
board the switches and it seems like your project involves physical rather than 
virtual servers. Still, the Pluribus could be used without taking advantage of 
the on board server compute I suppose.
I also recommend looking into a solution that supports VXLAN (or GENEVE, or 
whatever overlay works for your needs), simply because MAC reachability is 
carried over Layer 3, so you won't have to deal with spanning tree or monstrous 
MAC tables. But you don't need an overlay if you just segment with traditional 
VLANs.
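As a rough illustration of what "MAC carried over Layer 3" means on the Linux 
side (this is a generic multicast-learning VTEP sketch, not any vendor's 
fabric; the interface names, VNI and group address are made up for the 
example):

  import subprocess

  # Hypothetical single VTEP: MACs ride inside UDP/IP (port 4789), so the
  # spine only ever sees routed L3 traffic, never the tenant MAC table.
  cmds = [
      "ip link add vxlan42 type vxlan id 42 dev eth0 dstport 4789 group 239.1.1.42",
      "ip addr add 192.168.42.1/24 dev vxlan42",
      "ip link set vxlan42 up",
  ]
  for cmd in cmds:
      subprocess.run(cmd.split(), check=True)
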
I'm guessing you don't need HA (A/B uplinks utilizing LACP) for these servers?
Also, do you need line rate forwarding? Having 1,000 devices with 1Gb uplinks 
doesn't necessarily mean that full throughput is required... the clustering and 
the applications may be sporadic and bursty? I have seen load-testing clusters, 
hadoop and data warehousing pushing high volumes but the individual NICs in the 
clusters never actually hit capacity... If you need line-rate, then you need to 
do a deep dive with several of the vendors because there are significant 
differences in buffers on some models.
And... what support do you need? Just one spare on the shelf or full vendor 
support on every switch? That will impact which vendor you choose.
I'd like to hear more about this effort once you get it going: which vendor you 
went with, how you tuned it, why you selected the one you did, and how well it 
works.
LFoD
 Date: Sun, 10 May 2015 01:17:07 +
 From: jo...@iecc.com
 To: nanog@nanog.org
 Subject: Re: Thousands of hosts on a gigabit LAN, maybe not
 
 In article 
 cahf3uwypqn1ns_umjz-znuk3i5ufczbu9l39b-crovg6yum...@mail.gmail.com you 
 write:
 Juniper OCX1100 has 72 ports in 1U.
 
 Yeah, too bad it costs $32,000.  Other than that it'd be perfect.
 
 R's,
 John
  

RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-10 Thread John R. Levine
Also, do you need line rate forwarding? Having 1,000 devices with 1Gb 
uplinks doesn't necessarily mean that full throughput is required... the 
clustering and the applications may be sporadic and bursty?


It's definitely sporadic and bursty.  There's another network for high 
speed traffic among the nodes.  The Ethernet is for stuff like program 
loading from NFS servers.


And... what support do you need? Just one spare on the shelf or full 
vendor support on every switch?


Spare on the shelf, definitely.

R's,
John


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-10 Thread Nick Hilliard
On 10/05/2015 00:33, Karl Auer wrote:
 Would be interesting to see how IPv6 performed, since this is one of the
 things it was supposed to be able to deliver - massively scalable links
 (equivalent to an IPv4 broadcast domain) via massively reduced protocol
 chatter (IPv6 multicast groups vs IPv4 broadcast), plus fully automated
 L3 address assignment.

It will perform badly because putting large numbers of hosts in a single
broadcast domain is a bad idea, no matter what the protocol.

If you have a very large L2 domain and if you use router advertisements to
handle your default gateway announcement, you'll probably end up trashing
your routers due to periodic neighbor solicitation messages.  If you don't
use tight timers, your failover convergence time will be trash.  On the
other hand, the tighter the timers, the more you'll trash your routers,
particularly if there is a failover event - in other words, exactly when
you don't want to stress the network.
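A crude back-of-the-envelope for that refresh load (the host count and the
one-probe-per-interval model are assumptions for illustration; 30s is RFC
4861's default base ReachableTime):

  # Every active host re-validates its gateway neighbor entry roughly once
  # per ReachableTime interval while traffic is flowing.
  hosts = 5000                  # assumed size of the big L2 domain
  reachable_time_s = 30         # RFC 4861 default base (randomized 0.5-1.5x)
  ns_per_sec = hosts / reachable_time_s
  print(f"~{ns_per_sec:.0f} neighbor solicitations/sec aimed at the router")
  # ~167/sec steady state -- plus a synchronized burst after a failover
  # event, which is exactly the moment described above.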

In the best case, the gateway-unavailability MTTR will be around 5-10
seconds and it will be non-deterministic.  This means that if you want
router failover which actually works, you will need to use a first-hop
redundancy protocol like VRRP or similar.

You will probably want to disable all multicast snooping on your network
because of IPv6 chatter.  Pushing state requirements into the L2 forwarding
mechanism turns out not to be a good idea, especially at scale - see the
bimajority.org URL that someone else posted on this thread, which is as
much about poor switch implementation as it is about poor protocol design
and solving problems that are a lot less relevant on today's networks.
This also means that you will need to manually prune the scope of your
dot1q network domain, because otherwise the multicast chatter will be
spammed network-wide across all VLANs on which it's defined.

RA gives the operator no way of controlling which IP address is assigned to
which host, which means that the operator of a large L2 domain is likely
to want to disable SLAAC if they plan to have any input on what IP address
is assigned to what host.  This may or may not be important to the
operator.  If it's hosts on a hot-seated corporate LAN, it probably doesn't
matter too much.  If it's a service provider selling IPv6 services, it
matters a lot.
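To see why the operator has no say, here is a sketch of classic EUI-64 SLAAC
(the MAC address is a made-up example; modern hosts often use privacy or
stable-privacy addresses instead, which are even less predictable):

  import ipaddress

  def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
      # EUI-64: flip the universal/local bit, insert ff:fe mid-MAC
      b = bytearray(int(x, 16) for x in mac.split(":"))
      b[0] ^= 0x02
      iid = bytes(b[:3]) + b"\xff\xfe" + bytes(b[3:])
      return ipaddress.IPv6Network(prefix)[int.from_bytes(iid, "big")]

  # hypothetical host: the address falls out of the NIC, not out of a plan
  print(slaac_address("2001:db8:0:1::/64", "00:11:22:33:44:55"))
  # -> 2001:db8:0:1:211:22ff:fe33:4455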

Regardless of whether this is the case, RA guard on each end-point is a
necessity, and if you don't have it, your control plane will be compromised.
RA guard is more complicated than ARP / DHCP guard and is not well
supported on a lot of hardware.

Finally, if you have a connectivity problem with your large L2 domain, your
problem surface area is much greater than if you had segmented your network
into smaller chunks, so the scope of any outage will be a lot larger.

Nick



Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread charles

On 2015-05-09 11:57, Baldur Norddahl wrote:
The standard 48 port with 2 port uplink 1U switch is far from full depth.
You put them in the back of the rack and have the small computers in the
front. You might even turn the switches around, so the ports face inwards
into the rack. The network cables would be very short and go directly from
the mini computers (Raspberry Pi?) to the switch, all within the one unit
shelf.


Yes this.

I presumed Raspberry Pi, but those don't have gigabit Ethernet.

Then I realized:  http://www.parallella.org/ (I've got one of these 
sitting on my standby shelf to be racked, which is what made me think of 
it).


To the OP: please do tell us more about what you are doing, it sounds 
very interesting.


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Eduardo Schoedler
Juniper OCX1100 has 72 ports in 1U.

And you can tune Linux IPv4 neighbor:
https://ams-ix.net/technical/specifications-descriptions/config-guide#11

--
Eduardo Schoedler



On Saturday, May 9, 2015, Lamar Owen lo...@pari.edu wrote:

 On 05/08/2015 02:53 PM, John Levine wrote:

 ...
 Most of the traffic will be from one node to another, with
 considerably less to the outside.  Physical distance shouldn't be a
 problem since everything's in the same room, maybe the same rack.

 What's the rule of thumb for number of hosts per switch, cascaded
 switches vs. routers, and whatever else one needs to design a dense
 network like this?  TIA

 You know, I read this post and immediately thought 'SGI Altix'...
 scalable to 512 CPUs per system image and 20 images per cluster (NASA's
 Columbia supercomputer had 10,240 CPUs in that configuration... twelve
 years ago, using 1.5GHz 64-bit RISC CPUs running Linux.  My, how we've
 come full circle (today's equivalent has less power consumption, at
 least)).  The NUMA technology in those Altix CPUs is a de-facto
 'memory-area network' and thus can have some interesting topologies.

 Clusters can be made using nodes with at least two NICs in them, and no
 switching.  With four or eight ports you can do some nice mesh topologies.
 This wouldn't be L2 bridging, either, but a L3 mesh could be made that
 could be rather efficient, with no switches, as long as you have at least
 three ports per node, and you can do something reasonably efficient with a
 switch or two and some chains of nodes, with two NICs per node.  L3 keeps
 the broadcast domain size small, and broadcast overhead becomes small.

 If you only have one NIC per node, well, time to get some seriously
 high-density switches... but even then, how many nodes are going to be per
 42U rack?  A top-of-rack switch may only need 192 ports, and that's only
 4U with 1U 48-port switches; in 8U you can do 384 ports, and three racks will
 do a bit over 1,000.  Octopus cables going from an RJ21 to 8P8C modular are
 available, so you could use high-density blades; Cisco claims you could do
 576 10/100/1000 ports in a 13-slot 6500.  That's half the rack space for
 the switching.  If 10/100 is enough, you could do 12 of the WS-X6196-21AF
 cards (or the RJ-45 'two-ports-per-plug' WS-X6148X2-45AF) and get in theory
 1,152 ports in a 6513 (one SUP; drop 96 ports from that to get a redundant
 SUP).

 Looking at another post in the thread, these moonshot rigs sound
 interesting... 45 server blades in 4.3U.  4.3U?!?!?  Heh, some custom
 rails, I guess, to get ten in 47U.  They claim a quad-server blade, so
 1,800 servers (with networking) in a 47U rack.  Yow.  Cost of several
 hundred thousand dollars for that setup.

 The effective limit on subnet size would be of course broadcast overhead;
 1,000 nodes on a /22 would likely be painfully slow due to broadcast
 overhead alone.



-- 
Eduardo Schoedler


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Eduardo Schoedler
You did not mention low cost before ;)



On Saturday, May 9, 2015, John Levine jo...@iecc.com wrote:

 In article 
 cahf3uwypqn1ns_umjz-znuk3i5ufczbu9l39b-crovg6yum...@mail.gmail.com
 you write:
 Juniper OCX1100 has 72 ports in 1U.

 Yeah, too bad it costs $32,000.  Other than that it'd be perfect.

 R's,
 John



-- 
Eduardo Schoedler


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread John Levine
To the OP: please do tell us more about what you are doing, it sounds 
very interesting.

There's a conference paper in preparation.  I'll send a pointer when I can.

R's,
John




Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Karl Auer
On Sat, 2015-05-09 at 17:06 -0400, Lamar Owen wrote:
 The effective limit on subnet size would be of course broadcast 
 overhead; 1,000 nodes on a /22 would likely be painfully slow due to 
 broadcast overhead alone.

Would be interesting to see how IPv6 performed, since this is one of the
things it was supposed to be able to deliver - massively scalable links
(equivalent to an IPv4 broadcast domain) via massively reduced protocol
chatter (IPv6 multicast groups vs IPv4 broadcast), plus fully automated
L3 address assignment.

IPv4 ARP, for example, hits every on-subnet neighbour; the IPv6
equivalent uses multicast to hit only those neighbours that happen to
share the same 24 low-end L3 address bits as the desired target - a
statistically much smaller subset of on-link neighbours, and in normal
subnets typically only one host. Only chatter that really should go to
all hosts does so - such as router advertisements.
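A quick sketch of that mapping (the host address is a made-up example): the
solicited-node group is just ff02::1:ff00:0 plus the target's low 24 bits, so
only hosts sharing those bits ever see the solicitation.

  import ipaddress

  # Hypothetical target address; only its low 24 bits matter here
  target = ipaddress.IPv6Address("2001:db8::8aae:21ff:fe4d:1a2b")

  # ff02::1:ffXX:XXXX, where XX:XXXX are the low 24 bits of the target
  base = ipaddress.IPv6Address("ff02::1:ff00:0")
  group = ipaddress.IPv6Address(int(base) | (int(target) & 0xFFFFFF))
  print(group)  # ff02::1:ff4d:1a2b -- typically joined by exactly one host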

Regards, K.



-- 
~~~
Karl Auer (ka...@biplane.com.au)
http://www.biplane.com.au/kauer
http://twitter.com/kauer389

GPG fingerprint: 3C41 82BE A9E7 99A1 B931 5AE7 7638 0147 2C3C 2AC4
Old fingerprint: EC67 61E2 C2F6 EB55 884B E129 072B 0AF0 72AA 9882




Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread John Levine
In article cahf3uwypqn1ns_umjz-znuk3i5ufczbu9l39b-crovg6yum...@mail.gmail.com 
you write:
Juniper OCX1100 has 72 ports in 1U.

Yeah, too bad it costs $32,000.  Other than that it'd be perfect.

R's,
John


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Lamar Owen

On 05/08/2015 02:53 PM, John Levine wrote:

...
Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA

You know, I read this post and immediately thought 'SGI Altix'... 
scalable to 512 CPUs per system image and 20 images per cluster 
(NASA's Columbia supercomputer had 10,240 CPUs in that 
configuration... twelve years ago, using 1.5GHz 64-bit RISC CPUs 
running Linux.  My, how we've come full circle (today's equivalent 
has less power consumption, at least)).  The NUMA technology in 
those Altix CPUs is a de-facto 'memory-area network' and thus can have 
some interesting topologies.


Clusters can be made using nodes with at least two NICs in them, and no 
switching.  With four or eight ports you can do some nice mesh 
topologies.  This wouldn't be L2 bridging, either, but an L3 mesh could 
be made that could be rather efficient, with no switches, as long as you 
have at least three ports per node; and you can do something reasonably 
efficient with a switch or two and some chains of nodes, with two NICs 
per node.  L3 keeps the broadcast domain size small, and broadcast 
overhead becomes small.


If you only have one NIC per node, well, time to get some seriously 
high-density switches... but even then, how many nodes are going to be 
per 42U rack?  A top-of-rack switch may only need 192 ports, and that's 
only 4U with 1U 48-port switches; in 8U you can do 384 ports, and three 
racks will do a bit over 1,000.  Octopus cables going from an RJ21 to 
8P8C modular are available, so you could use high-density blades; Cisco 
claims you could do 576 10/100/1000 ports in a 13-slot 6500.  That's 
half the rack space for the switching.  If 10/100 is enough, you could 
do 12 of the WS-X6196-21AF cards (or the RJ-45 'two-ports-per-plug' 
WS-X6148X2-45AF) and get in theory 1,152 ports in a 6513 (one SUP; drop 
96 ports from that to get a redundant SUP).
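Sanity-checking those chassis numbers (taken straight from this paragraph):

  # 6513 with one supervisor slot burned and 96-port 10/100 cards elsewhere
  print((13 - 1) * 96)   # 1152 ports, matching the in-theory figure above
  # a redundant-SUP build gives up one more slot
  print((13 - 2) * 96)   # 1056 ports, i.e. "drop 96 ports from that"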


Looking at another post in the thread, these moonshot rigs sound 
interesting... 45 server blades in 4.3U.  4.3U?!?!?  Heh, some custom 
rails, I guess, to get ten in 47U.  They claim a quad-server blade, so 
1,800 servers (with networking) in a 47U rack.  Yow.  Cost of several 
hundred thousand dollars for that setup.


The effective limit on subnet size would be of course broadcast 
overhead; 1,000 nodes on a /22 would likely be painfully slow due to 
broadcast overhead alone.
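A rough model of that overhead (every per-host rate here is an assumption for
illustration; note the bandwidth is trivial, the per-host interrupt load is
the real cost):

  # Hypothetical steady-state ARP chatter on one big broadcast domain
  hosts = 1000
  active_peers = 5            # assumed concurrent talk partners per host
  rearp_interval_s = 60       # assumed cache-refresh rate per peer
  arps_per_sec = hosts * active_peers / rearp_interval_s
  frame_bits = 64 * 8         # minimum-ish Ethernet frame carrying an ARP
  print(f"{arps_per_sec:.0f} broadcasts/sec, "
        f"{arps_per_sec * frame_bits / 1e3:.0f} kbit/s hitting every NIC")
  # ~83/sec: negligible bandwidth, but every single frame wakes every host.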




Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Bruce Simpson

On 09/05/2015 23:33, Karl Auer wrote:

IPv4 ARP, for example, hits every on-subnet neighbour; the IPv6
equivalent uses multicast to hit only those neighbours that happen to
share the same 24 low-end L3 address bits as the desired target - a
statistically much smaller subset of on-link neighbours, and in normal
subnets typically only one host. Only chatter that really should go to
all hosts does so - such as router advertisements.



Except when the IPv6 solicited-node multicast groups cause $VENDOR 
switch meltdown:

http://blog.bimajority.org/2014/09/05/the-network-nightmare-that-ate-my-week/


RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Jerry J. Anderson, CCIE #5000
 Some people I know (yes really) are building a system that will have
 several thousand little computers in some racks.  Each of the
 computers runs Linux and has a gigabit ethernet interface.  It occurs
 to me that it is unlikely that I can buy an ethernet switch with
 thousands of ports, and even if I could, would I want a Linux system
 to have 10,000 entries or more in its ARP table.

 Most of the traffic will be from one node to another, with
 considerably less to the outside.  Physical distance shouldn't be a
 problem since everything's in the same room, maybe the same rack.

 What's the rule of thumb for number of hosts per switch, cascaded
 switches vs. routers, and whatever else one needs to design a dense
 network like this?  TIA

Brocade's Virtual Cluster Switching (VCS) fabric on their VDX switches is a 
good solution for large, flat data center networks like this.  It's based on 
TRILL, so no STP or tree structure is required.  All ports are live, as is all 
inter-switch bandwidth.  Cisco has a similar solution, as do other vendors.

Thank you,
Jerry

-- 
Jerry J. Anderson, CCIE #5000
Member, Anderson Consulting, LLC
800 Ridgeview Ave, Broomfield, CO  80020-6618
Office: 650-523-2132 Mobile: 773-793-7717
www.linkedin.com/in/AndersonConsultingLLC



Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-09 Thread Baldur Norddahl
The standard 48 port with 2 port uplink 1U switch is far from full depth.
You put them in the back of the rack and have the small computers in the
front. You might even turn the switches around, so the ports face inwards
into the rack. The network cables would be very short and go directly from
the mini computers (Raspberry Pi?) to the switch, all within the one unit
shelf.

Assume a max-sized rack with a depth of 90 cm, of which the switches might
take 30 cm.  That leaves 60 cm to mount mini computers, which is approximately
12,000 cubic cm of space per rack unit.  A Raspberry Pi is approximately 120
cubic cm, so you might be able to fit 48 of them in that space.  It would be a
very tight fit indeed, but maybe not impossible.
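Checking that arithmetic (the usable width and 1U height are my assumptions;
the packing factor is why 48, rather than the raw ~100, is the realistic
ceiling):

  rack_depth_cm, switch_depth_cm = 90, 30
  usable_width_cm, ru_height_cm = 45, 4.45   # assumed 19" rail opening, 1U
  vol_per_ru = usable_width_cm * ru_height_cm * (rack_depth_cm - switch_depth_cm)
  pi_vol_cm3 = 120                           # cased Raspberry Pi, roughly
  print(f"{vol_per_ru:.0f} cm^3/RU, raw fit {vol_per_ru / pi_vol_cm3:.0f} boards")
  # ~12,000 cm^3 and ~100 boards by volume alone; cabling, power and
  # airflow cut that to something like the 48 suggested above.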

As to the original question, I would have 48 computers in a subnet. This is
the correct number because you would connect each shelf switch to a top of
rack switch, and spend a few extra bucks on the ToR so that it can do layer
3 routing between shelves.
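A sketch of what that plan looks like as an addressing layout (the
10.0.0.0/16 pool and the 40-shelves-per-rack figure are invented for the
example; a /26 leaves room for 48 hosts plus the ToR gateway):

  import ipaddress

  # Hypothetical pool and layout: one /26 per shelf, routed at the ToR
  pool = ipaddress.ip_network("10.0.0.0/16")
  shelf_nets = list(pool.subnets(new_prefix=26))   # 1024 shelf subnets

  shelves_per_rack = 40
  for rack in range(2):                            # print the first two racks
      for shelf in range(3):                       # ...three shelves each
          net = shelf_nets[rack * shelves_per_rack + shelf]
          print(f"rack {rack} shelf {shelf}: {net}  gw {net.network_address + 1}")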

Regards,

Baldur


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Brandon Martin

On 05/08/2015 02:53 PM, John Levine wrote:

Some people I know (yes really) are building a system that will have
several thousand little computers in some racks.  Each of the
computers runs Linux and has a gigabit ethernet interface.  It occurs
to me that it is unlikely that I can buy an ethernet switch with
thousands of ports, and even if I could, would I want a Linux system
to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA


Unless you have some dire need to get these all on the same broadcast 
domain, those kinds of numbers on a single L2 would send me running for 
the hills, for lots of reasons, some of which you've identified.


I'd find a good L3 switch, put no more than ~200-500 IPs on each L2, and 
let the switch handle gluing it together at L3.  With the proper 
hardware, this is a fully line-rate operation and should have no real 
downsides aside from splitting up the broadcast domains (if you do need 
multicast, make sure your gear can do it).  With a divide-and-conquer 
approach, you shouldn't have problems fitting the L2+L3 tables into even 
a pretty modest L3 switch.


Densest chassis switches I know of are going to get you about 96 ports 
per RU (48 ports each on a half-width blade, but you need breakout 
panels to get standard RJ45 8P8C connectors as the blades have MRJ21s), 
less rack overhead for power supplies, management, etc.  That should 
get you ~2000 ports per rack [1].  Such switches can be quite expensive. 
The trend seems to be toward stacking pizza boxes these days, though. 
Get the number of ports you need per rack (you're presumably not 
putting all 10,000 nodes in a single rack) and aggregate up one or two 
layers.  This gives you a pretty good candidate for your L2/L3 split.


[1] Purely as an example, you can cram 3x Brocade MLX-16 chassis into a 
42U rack (with 0RU to spare).  That gives you 48 slots for line cards. 
Leaving at least one slot in each chassis for 10Gb or 100Gb uplinks to 
something else, 45x48 = 2160 1000BASE-T ports (electrically) in a 42U 
rack, and you'll need 45 more RU somewhere for breakout patch panels!

--
Brandon Martin


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Rafael Possamai
- The more switches a packet has to go through, the higher the latency, so
your response times may deteriorate if you cascade too many switches.
Legend says up to 4 is a good number; any further and you risk creating a big
mess.

- The more switches you add, the higher your bandwidth utilized by
broadcasts in the same subnet.
http://en.wikipedia.org/wiki/Broadcast_radiation

- If you have only one connection between each switch, each switch is going
to be limited to that rate (1 Gbps in this case), possibly creating a
bottleneck depending on your application and how exactly it behaves.
Consider aggregating uplinks (see the oversubscription sketch after this
list).

- Bundling too many Ethernet cables will cause interference (cross-talk),
so keep that in mind. I'd purchase F/S/FTP cables and the like.
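As promised above, a quick oversubscription check (the 2x10G LACP bundle is a
made-up example):

  # Edge bandwidth vs. trunk bandwidth for one access switch
  ports, port_gbps = 48, 1
  uplinks, uplink_gbps = 2, 10        # hypothetical 2x10G LACP bundle
  ratio = (ports * port_gbps) / (uplinks * uplink_gbps)
  print(f"oversubscription {ratio:.1f}:1")
  # 2.4:1 -- usually fine for bursty traffic, a bottleneck at sustained
  # line rate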

Here I am going off on a tangent: if your friends want to build a
supercomputer, then there's a way to calculate the most efficient number of
nodes given your constraints (e.g. linear optimization). This could save
you time, money and headaches. An example: maximize the number of TFLOPS
while minimizing the number of nodes (i.e. the number of switch ports).
Just a quick thought.






On Fri, May 8, 2015 at 1:53 PM, John Levine jo...@iecc.com wrote:

 Some people I know (yes really) are building a system that will have
 several thousand little computers in some racks.  Each of the
 computers runs Linux and has a gigabit ethernet interface.  It occurs
 to me that it is unlikely that I can buy an ethernet switch with
 thousands of ports, and even if I could, would I want a Linux system
 to have 10,000 entries or more in its ARP table.

 Most of the traffic will be from one node to another, with
 considerably less to the outside.  Physical distance shouldn't be a
 problem since everything's in the same room, maybe the same rack.

 What's the rule of thumb for number of hosts per switch, cascaded
 switches vs. routers, and whatever else one needs to design a dense
 network like this?  TIA

 R's,
 John



Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Miles Fidelman

John Levine wrote:

Some people I know (yes really) are building a system that will have
several thousand little computers in some racks.  Each of the
computers runs Linux and has a gigabit ethernet interface.  It occurs
to me that it is unlikely that I can buy an ethernet switch with
thousands of ports, and even if I could, would I want a Linux system
to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA




It's become fairly commonplace to build supercomputers out of clusters 
of 100s, or 1000s of commodity PCs, see, for example:

www.rocksclusters.org
http://www.rocksclusters.org/presentations/tutorial/tutorial-1.pdf
or
http://www.dodlive.mil/files/2010/12/CondorSupercomputerbrochure_101117_kb-3.pdf 
(a cluster of 1760 playstations at AFRL Rome Labs)


Interestingly, all the documentation I can find is heavy on the software 
layers used to cluster resources - but there's little about hardware 
configuration other than pretty pictures of racks with lots of CPUs and 
lots of wires.


If the people you know are trying to do something similar - it might be 
worth some nosing around the Rocks community, or some phone calls.  I 
expect that interconnect architecture and latency might be a bit of an 
issue for this sort of application.


Miles Fidelman




--
In theory, there is no difference between theory and practice.
In practice, there is. -- Yogi Berra



Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Christopher Morrow
On Fri, May 8, 2015 at 2:53 PM, John Levine jo...@iecc.com wrote:
 Some people I know (yes really) are building a system that will have
 several thousand little computers in some racks.  Each of the
 computers runs Linux and has a gigabit ethernet interface.  It occurs
 to me that it is unlikely that I can buy an ethernet switch with
 thousands of ports, and even if I could, would I want a Linux system
 to have 10,000 entries or more in its ARP table.

 Most of the traffic will be from one node to another, with
 considerably less to the outside.  Physical distance shouldn't be a
 problem since everything's in the same room, maybe the same rack.

 What's the rule of thumb for number of hosts per switch, cascaded
 switches vs. routers, and whatever else one needs to design a dense
 network like this?  TIA

Consider also the pain of IPv6's link-local gamery.
Look at the nvo3 WG and its predecessor (which shouldn't have really
existed anyway, but whatever, and apparently my mind helped me forget
about the pain involved with this WG).

I think: why one LAN? Why not just small (/26 or /24 max?) subnet
sizes... or do it all in v6 on /64s with 1/rack or 1/~200 hosts.


Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread John Levine
Some people I know (yes really) are building a system that will have
several thousand little computers in some racks.  Each of the
computers runs Linux and has a gigabit ethernet interface.  It occurs
to me that it is unlikely that I can buy an ethernet switch with
thousands of ports, and even if I could, would I want a Linux system
to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA

R's,
John


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Dave Taht
On Fri, May 8, 2015 at 11:53 AM, John Levine jo...@iecc.com wrote:
 Some people I know (yes really) are building a system that will have
 several thousand little computers in some racks.

Very cool-ly crazy.

 Each of the
 computers runs Linux and has a gigabit ethernet interface.  It occurs
 to me that it is unlikely that I can buy an ethernet switch with
 thousands of ports, and even if I could, would I want a Linux system
 to have 10,000 entries or more in its ARP table.

Agreed. :) You don't really want 10,000 entries in a routing FIB
table either, but I was seriously encouraged by the work going
on in Linux 4.0 and 4.1 to improve those lookups.

https://netdev01.org/docs/duyck-fib-trie.pdf

I'd love to know the actual scalability of some modern
routing protocols (IS-IS, Babel, OSPFv3, OLSRv2, RPL) with that
many nodes too.

 Most of the traffic will be from one node to another, with
 considerably less to the outside.  Physical distance shouldn't be a
 problem since everything's in the same room, maybe the same rack.

That is an awful lot of ports to fit in a rack (48 ports, 36 2U slots
in the rack (and is that too high?) = 1728 ports). A thought is you could
make it meshier using multiple interfaces per tiny Linux box: put, say,
3-6 interfaces in each and have a very few switches interconnecting given
clusters (and multiple paths to each switch). That would reduce your ARP
table (and FIB table) by a lot, at the cost of adding hops...

 What's the rule of thumb for number of hosts per switch, cascaded
 switches vs. routers, and whatever else one needs to design a dense
 network like this?  TIA

Max per VLAN: 4096. Still a lot.

Another approach might be max density on a switch (48?) per cluster,
routed (not switched) 10GigE to another 10GigE+ switch.

I'd love to know the rules of thumb here also; I imagine some rules
must exist for those in the VM or VXLAN worlds.

 R's,
 John



-- 
Dave Täht
Open Networking needs **Open Source Hardware**

https://plus.google.com/u/0/+EricRaymond/posts/JqxCe2pFr67


RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Chuck Church
Sounds interesting.  I wouldn't do more than a /23 (assuming IPv4) per subnet.  
Join them all together with a fast L3 switch.  I'm still trying to visualize 
what several thousand tiny computers in a single rack might look like.  Other 
than a cabling nightmare.  1,000 RJ-45 switch ports is a good chunk of a rack 
by itself.

Chuck

-Original Message-
From: NANOG [mailto:nanog-boun...@nanog.org] On Behalf Of John Levine
Sent: Friday, May 08, 2015 2:53 PM
To: nanog@nanog.org
Subject: Thousands of hosts on a gigabit LAN, maybe not

Some people I know (yes really) are building a system that will have several 
thousand little computers in some racks.  Each of the computers runs Linux and 
has a gigabit ethernet interface.  It occurs to me that it is unlikely that I 
can buy an ethernet switch with thousands of ports, and even if I could, would 
I want a Linux system to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with considerably less to 
the outside.  Physical distance shouldn't be a problem since everything's in 
the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded switches vs. 
routers, and whatever else one needs to design a dense network like this?  TIA

R's,
John



RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Sameer Khosla
You may want to look at a Clos / leaf-spine architecture.  This design tends to 
be optimized for east-west traffic, scales easily as bandwidth needs grow, and 
keeps things simple: L2/L3 boundary on the ToR switch, L3 ECMP from leaf to 
spine.  Not a lot of complexity, and it scales fairly high on both leaves and 
spines.

Sk.

-Original Message-
From: NANOG [mailto:nanog-boun...@nanog.org] On Behalf Of John Levine
Sent: Friday, May 08, 2015 2:53 PM
To: nanog@nanog.org
Subject: Thousands of hosts on a gigabit LAN, maybe not

Some people I know (yes really) are building a system that will have several 
thousand little computers in some racks.  Each of the computers runs Linux and 
has a gigabit ethernet interface.  It occurs to me that it is unlikely that I 
can buy an ethernet switch with thousands of ports, and even if I could, would 
I want a Linux system to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with considerably less to 
the outside.  Physical distance shouldn't be a problem since everything's in 
the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded switches vs. 
routers, and whatever else one needs to design a dense network like this?  TIA

R's,
John


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Niels Bakker

* lists.na...@monmotha.net (Brandon Martin) [Fri 08 May 2015, 21:42 CEST]:
[1] Purely as an example, you can cram 3x Brocade MLX-16 chassis into 
a 42U rack (with 0RU to spare).  That gives you 48 slots for line cards.


You really can't.  Cables need to come from the top, not from the 
sides, or they'll block the path of other linecards.



-- Niels.


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread John Levine
 to have 10,000 entries or more in its ARP table.

Agreed. :) You don't really want 10,000 entries in a routing FIB
table either, but I was seriously encouraged by the work going
on in linux 4.0 and 4.1 to improve those lookups.

One obvious way to deal with that is to put some manageable number of
hosts on a subnet and route traffic between the subnets.  I think we
can assume they'll all have 10/8 addresses, and I'm not too worried
about performance to the outside world, just within the network.

R's,
John


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Blake Hudson
Linux has a (configurable) limit on the neighbor table. I know in RHEL 
variants, the default has been 1024 neighbors for a while.


net.ipv4.neigh.default.gc_thresh3
net.ipv4.neigh.default.gc_thresh2
net.ipv4.neigh.default.gc_thresh1

net.ipv6.neigh.default.gc_thresh3
net.ipv6.neigh.default.gc_thresh2
net.ipv6.neigh.default.gc_thresh1

These may be rough guidelines for performance or arbitrary limits 
someone thought would be a good idea. Either way, you'll need to 
increase these numbers if you're using IP on Linux at this scale (a 
sketch follows below).
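A minimal sketch of bumping those thresholds by writing the procfs entries
directly (equivalent to sysctl -w; the values are illustrative for a
~1,000-host segment, not tuned recommendations, and this needs root):

  from pathlib import Path

  # gc_thresh1: below this, the GC leaves the neighbor table alone
  # gc_thresh2: soft ceiling; entries above it are collected after ~5s
  # gc_thresh3: hard maximum number of neighbor entries
  thresholds = {"gc_thresh1": 4096, "gc_thresh2": 8192, "gc_thresh3": 16384}

  for family in ("ipv4", "ipv6"):
      for name, value in thresholds.items():
          Path(f"/proc/sys/net/{family}/neigh/default/{name}").write_text(f"{value}\n")
  # persist the same keys in /etc/sysctl.conf so they survive a reboot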


Although not explicitly stated, I would assume that these computers may 
be virtualized or inside some sort of blade chassis (which reduces the 
number of physical cables to a switch). Strictly speaking, I see no 
hardware limitation in your way, as most top-of-rack switches will 
easily do a few thousand or tens of thousands of MAC entries, and a few 
thousand hosts can fit inside a single IPv4 or IPv6 subnet. There are some 
pretty dense switches if you actually do need 1,000 ports, but as others 
have stated, you'll utilize a good portion of the rack in cable and 
connectors.


--Blake


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Miles Fidelman
Forgot to mention - you might also want to check out Beowulf clusters - 
there's an email list at http://www.beowulf.org/ - probably some useful 
info in the list archives, maybe a good place to post your query.


Miles

Miles Fidelman wrote:

John Levine wrote:

Some people I know (yes really) are building a system that will have
several thousand little computers in some racks.  Each of the
computers runs Linux and has a gigabit ethernet interface.  It occurs
to me that it is unlikely that I can buy an ethernet switch with
thousands of ports, and even if I could, would I want a Linux system
to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA




It's become fairly commonplace to build supercomputers out of clusters 
of 100s, or 1000s of commodity PCs, see, for example:

www.rocksclusters.org
http://www.rocksclusters.org/presentations/tutorial/tutorial-1.pdf
or
http://www.dodlive.mil/files/2010/12/CondorSupercomputerbrochure_101117_kb-3.pdf 
(a cluster of 1760 playstations at AFRL Rome Labs)


Interestingly, all the documentation I can find is heavy on the 
software layers used to cluster resources - but there's little about 
hardware configuration other than pretty pictures of racks with lots 
of CPUs and lots of wires.


If the people you know are trying to do something similar - it might 
be worth some nosing around the Rocks community, or some phone calls.  
I expect that interconnect architecture and latency might be a bit of 
an issue for this sort of application.


Miles Fidelman







--
In theory, there is no difference between theory and practice.
In practice, there is. -- Yogi Berra



RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Brian R
Agree with many of the other comments.  Smaller subnets (the /23 suggestion 
sounds good) with L3 between the subnets.
 
off topic
The first thing that came to mind was "Bitcoin farm!", then "Ask Bitmaintech", 
and then I'd be more worried about the number of fans and A/C units.
 /off topic
 
Brian
 
 Date: Fri, 8 May 2015 18:53:03 +
 From: jo...@iecc.com
 To: nanog@nanog.org
 Subject: Thousands of hosts on a gigabit LAN, maybe not
 
 Some people I know (yes really) are building a system that will have
 several thousand little computers in some racks.  Each of the
 computers runs Linux and has a gigabit ethernet interface.  It occurs
 to me that it is unlikely that I can buy an ethernet switch with
 thousands of ports, and even if I could, would I want a Linux system
 to have 10,000 entries or more in its ARP table.
 
 Most of the traffic will be from one node to another, with
 considerably less to the outside.  Physical distance shouldn't be a
 problem since everything's in the same room, maybe the same rack.
 
 What's the rule of thumb for number of hosts per switch, cascaded
 switches vs. routers, and whatever else one needs to design a dense
 network like this?  TIA
 
 R's,
 John
  

Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Brandon Martin

On 05/08/2015 04:17 PM, Niels Bakker wrote:

* lists.na...@monmotha.net (Brandon Martin) [Fri 08 May 2015, 21:42 CEST]:

[1] Purely as an example, you can cram 3x Brocade MLX-16 chassis into
a 42U rack (with 0RU to spare).  That gives you 48 slots for line cards.


You really can't.  Cables need to come from the top, not from the sides,
or they'll block the path of other linecards.


Hmm, good point.  Cram may not be a strong enough term :)  It'd work 
on the horizontal-slot chassis types (4/8 slot), but not the vertical 
(16/32 slot).


You might be able to make it fit if you didn't care about 
maintainability, I guess.  There's some room to maneuver if you don't 
care about being able to get the power supplies out, too.  I don't 
recommend this approach...  Those MRJ21 cables are not easy to work with 
as it is.

--
Brandon Martin


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread charles

On 2015-05-08 13:53, John Levine wrote:

Some people I know (yes really) are building a system that will have
several thousand little computers in some racks.



How many racks?
How many computers per rack unit? How many computers per rack?
(How are you handling power?)
How big is each computer?

Do you want network cabling to be contained to each rack? Or do you want 
to run the cable to a central networking/switching rack?


Hmmm... even a 6513 fully populated with PoE 48-port line cards (which 
could let you do power and network in the same cable... I think? Does PoE 
work on gigabit these days?) would get you 12*48 = 576 ports.


So a 48U rack minus 15U (I think the 6513 is 15U total) leaves you 33U. 
Can you fit 576 systems in 33U?



Each of the computers runs Linux and has a gigabit ethernet interface.




Copper?

It occurs to me that it is unlikely that I can buy an ethernet switch with 
thousands of ports


6513?


, and even if I could, would I want a Linux system 
to have 10,000 entries or more in its ARP table.



Add more ram. That's always the answer. LOL.



Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA



We need more data.



Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Roland Dobbins


On 9 May 2015, at 1:53, John Levine wrote:

What's the rule of thumb for number of hosts per switch, cascaded 
switches vs. routers, and whatever else one needs to design a dense 
network like this?


Most of the major switch vendors have design guides and other examples 
like this available (this one is Cisco-specific):


http://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Data_Center/VMDC/3-0-1/DG/VMDC_3-0-1_DG/VMDC301_DG3.html

Some organizations like Facebook have also taken the time to write up 
their approaches and make them publicly available:


https://code.facebook.com/posts/360346274145943/introducing-data-center-fabric-the-next-generation-facebook-data-center-network/

---
Roland Dobbins rdobb...@arbor.net


RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Phil Bedard
The real answer to this is being able to cram them into a single chassis which 
can multiplex the network through a backplane.  Something like the HP Moonshot 
ARM system or the way others like Google build high density compute with 
integrated Ethernet switching. 

Phil

-Original Message-
From: John Levine jo...@iecc.com
Sent: 5/8/2015 2:59 PM
To: nanog@nanog.org nanog@nanog.org
Subject: Thousands of hosts on a gigabit LAN, maybe not

Some people I know (yes really) are building a system that will have
several thousand little computers in some racks.  Each of the
computers runs Linux and has a gigabit ethernet interface.  It occurs
to me that it is unlikely that I can buy an ethernet switch with
thousands of ports, and even if I could, would I want a Linux system
to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with
considerably less to the outside.  Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA

R's,
John


RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread charles

On 2015-05-08 18:20, Phil Bedard wrote:

The real answer to this is being able to cram them into a single
chassis which can multiplex the network through a backplane.
Something like the HP Moonshot ARM system or the way others like
Google build high density compute with integrated Ethernet switching.




I was going to suggest Moonshot myself (I walk by a number of Moonshot 
units daily). However, it seemed like the systems were already selected, 
and then someone was like oh yeah, better ask netops how to hook these 
things we bought and didn't tell anyone about to the interwebz. (I mean, 
that's not a 100% accurate description of my $DAYJOB at all.)


In which case, the standard response is: well gee whizz buddy, ya should've 
bought Moonshot jigs. But now you have to buy pallet loads of chassis 
switches. Hope you have some money left over in your budget.


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Joe Hamelin
On Fri, May 8, 2015 at 11:53 AM, John Levine jo...@iecc.com wrote:

 Some people I know (yes really) are building a system that will have
 several thousand little computers in some racks.  Each of the
 computers runs Linux and has a gigabit ethernet interface.


Though a bit off-topic, I ran into this project at the CascadeIT
conference.  I'm currently in corp IT that is Notes/Windows-based, so I
haven't had a good place to test it, but the concept is very interesting.
The distributed way they monitor would greatly reduce bandwidth overhead.

http://assimproj.org

The Assimilation Project is designed to discover and monitor
infrastructure, services, and dependencies on a network of potentially
unlimited size, without significant growth in centralized resources. The
work of discovery and monitoring is delegated uniformly in tiny pieces to
the various machines in a network-aware topology - minimizing network
overhead and being naturally geographically sensitive.

The main ideas are:

   - distribute discovery throughout the network, doing most discovery
   locally
   - distribute the monitoring as broadly as possible in a network-aware
   fashion.
   - use autoconfiguration and zero-network-footprint discovery techniques
   to monitor most resources automatically, during the initial installation
   and during ongoing system addition and maintenance.



--
Joe Hamelin, W7COM, Tulalip, WA, 360-474-7474


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Jima

On 2015-05-08 12:53, John Levine wrote:

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this?  TIA


 I won't pretend to know best practices, but my inclination would be to 
connect the devices to 48-port L2 ToR switches with 2-4 SFP+ uplink 
ports (a number of vendors have options for this), with the 10gbit ports 
aggregated to a 10gbit core L2/L3 switch stack (ditto).  I'm not sure 
I'd attempt this without 10gbit to the edge switches, due to Rafael's 
aforementioned point about the bottleneck/loss of multiple ports for trunking.


 Not knowing the architectural constraints, I'd probably go with 
others' advice of limiting L2 zones to 200-500 hosts, which would 
probably amount to 4-10 edge switches per VLAN.


 Dang.  The more I think about this project, the more expensive it sounds.

 Jima


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Joe Hamelin
On Fri, May 8, 2015 at 5:19 PM, Jima na...@jima.us wrote:
   Dang.  The more I think about this project, the more expensive it sounds.

Naw, just use WiFi.  ;)

--
Joe Hamelin, W7COM, Tulalip, WA, 360-474-7474


RE: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread John R. Levine

off topic
The first thing that came to mind was Bitcoin farm! then Ask Bitmaintech and then 
I'd be more worried about the number of fans and A/C units.
/off topic


I promise, no bitcoins involved.

R's,
John


Re: Thousands of hosts on a gigabit LAN, maybe not

2015-05-08 Thread Benson Schliesser
Morrow's comment about the ARMD WG notwithstanding, there might be some 
useful context in https://tools.ietf.org/html/draft-karir-armd-statistics-01


Cheers,
-Benson


Christopher Morrow morrowc.li...@gmail.com
May 8, 2015 at 12:19 PM

Consider also the pain of IPv6's link-local gamery.
Look at the nvo3 WG and its predecessor (which shouldn't have really
existed anyway, but whatever, and apparently my mind helped me forget
about the pain involved with this WG).

I think: why one LAN? Why not just small (/26 or /24 max?) subnet
sizes... or do it all in v6 on /64s with 1/rack or 1/~200 hosts.
John Levine jo...@iecc.com
May 8, 2015 at 11:53 AM
Some people I know (yes really) are building a system that will have
several thousand little computers in some racks. Each of the
computers runs Linux and has a gigabit ethernet interface. It occurs
to me that it is unlikely that I can buy an ethernet switch with
thousands of ports, and even if I could, would I want a Linux system
to have 10,000 entries or more in its ARP table.

Most of the traffic will be from one node to another, with
considerably less to the outside. Physical distance shouldn't be a
problem since everything's in the same room, maybe the same rack.

What's the rule of thumb for number of hosts per switch, cascaded
switches vs. routers, and whatever else one needs to design a dense
network like this? TIA

R's,
John