Tom, Adrian, et al.,

I posted about this issue a few weeks ago. Apparently it affects more than just VirtualBox or VMware: I am seeing this *EXACT* behavior on Hyper-V as well. I have not tried it on bare metal.

My network looks like this:

    (Customer VMs)<--->(Hyper-V OpenBSD 6.4 PE)<--->(CISCO ASR P)<--->(CISCO ME3750 PE)<--->(CE)

The Layer 2 device between the Hyper-V host and the Cisco ASR is a Cisco Nexus 5672. I am using L3VPN instead of pseudowire.

Some of the ARP entries time out, and when they do, LDP crashes (ldp engine terminated; signal 10):

    100.92.64.37   (incomplete)        hvn0   expired   <---- ARP timed out
    100.92.64.68   00:b7:71:93:32:95   hvn0   6m8s      <---- ARP about to time out

The ARP entries timing out makes no sense to me, as these devices are all running OSPF with each other. Granted, OSPF uses multicast to 224.0.0.5, but I would think that would be enough to keep ARP up.

In my environment I use Salt to manage my systems, and my PE formula adds static ARP entries, but that is a workaround rather than a fix.

    100.92.64.37   (incomplete)        hvn0   expired   <--- ARP still missing for this guy
    100.92.64.68   00:b7:71:93:32:95   hvn0   expired   <--- ARP timed out while writing this

    # ping 100.92.64.68
    PING 100.92.64.68 (100.92.64.68): 56 data bytes
    ping: sendmsg: Host is down
    ping: wrote 100.92.64.68 64 chars, ret=-1
    ping: sendmsg: Host is down

Other ARP entries stay up: the ones for hosts that do not run LDP and, oddly enough, other OpenBSD systems. It seems like this only happens with OpenBSD LDP against Cisco IOS/IOS-XE (in my environment, anyway).

-Henry

On Wed, Mar 13, 2019 at 7:28 PM Tom Smyth <tom.sm...@wirelessconnect.eu> wrote:
> Adrian,
> sorry I only saw this now ... when trying to go through old unread mails
>
> I would be very wary of vmware virtual networking and Layer 2 Forwarding
> I loved vmware before I discovered the ridiculous short comings in
> their virtual networks
>
> Vmware Virtual Switches vmxnet
> they are not switches or bridges... :( they (vmware) over optimised
> and the virtual switches forward too and from vms by default based on
> macs learned
> via each attached machines vmx config file.
> the workaround is promiscuous mode for the virtual switch...
> (turns your virtual switch into
> a crappy hub)
> but this copies packets (frames) that are destined
> for 1 machine virtual machine attached to the virtual switch, so if
> you have high traffic volumes
> and alot of machines attached to the virtual lan ... your perf is
> going to suck ...
> also you need to allow forged transmits on the virtual switch (macs
> that dont match the vmx machine
> mac configuration (which all bridged packets from behind your openbsd
> guest will appear as ...
>
> if you are desperate to use vmware .. .check out the labs... they had
> an "improved"
> virtual switch with mac learning capabilities ... (only down side is
> that particular virtual switch
> has no mac ageing on the switch your virtual switch FIB wont flush
> without rebooting the host
>
> apparently vmware have a switch that has proper mac learning from
> virtual machines that
> are bridging , but this requires the super duper awesome license (the
> enterprise + or something like that,
>
> If you still need to use vmware on a lesser license perhaps a
> multiport card + sriov and avoid their poor virtual switches
>
> basically you will have a lot of hassle with that,
>
> I hope this helps ... 352 days later :/
>
> Tom Smyth
> PS
> Einstein once said " you should make things as simple as possible but
> no simpler" it would appear vmware
> did not heed this advice... and you dont have to be a genius to work
> that out ... :) (because I did :) )
>
> On Fri, 16 Mar 2018 at 04:55, Adrian Close <adr...@close.wattle.id.au> wrote:
> >
> > Hi,
> >
> > I'm looking at doing some MPLS/VPLS stuff with OpenBSD, in particular
> > using 'mpw' pseudowires. I've created a test network comprising two
> > "PE" and two "P" hosts, to transport Ethernet traffic between service
> > ports on the PE hosts across the MPLS network, based on an example I
> > found online. I'm using a 6.3 snapshot from March 11th.
> >
> > [firewall] = [em0][mpw0][PE1][em1] - [em0][P1][em1] - [em1][P2][em0]
> > - [em1][PE2][mpw0][em0] = [host]
> >
> > PE1 em0 and mpw0 are in a bridge, PE1 em1 is MPLS, P1 em0/1 are MPLS etc.
> >
> > This is all working great, except for short outages which turn out to
> > coincide with the ARP cache expiry time for the P router's IP address on
> > the PE host.
> >
> > When the ARP entry times out (or is manually deleted), the PE host
> > doesn't ARP for the P router IP, but instead sends ARP who-has queries
> > for other, definitely non-local things, such as the IP address for the
> > other PE host's router-id. After a minute or so it finally ARPs for the
> > P router IP and things work again.
> >
> > This only happens when "ldpd" is running (and I think only when the
> > pseudowires are actually up). If I stop "ldpd" on the PE host, ARP
> > works fine as expected every time.
> >
> > I guess I could fix this with static ARP entries, but that doesn't seem
> > like quite the right thing. My test setup is running in Virtualbox
> > VMs. I also replicated the issue under VMWare ESX using 'vic' interfaces.
> >
> > Does anyone have any clues on this?
> >
> > Thanks in advance,
> >
> > Adrian Close
> >
>
> --
> Kindest regards,
> Tom Smyth
>
> Mobile: +353 87 6193172
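
A minimal sketch, in Python, of how the stale-ARP condition shown in the arp output at the top of this message could be flagged for the LDP neighbors. It assumes the four-column layout seen in that output (host, Ethernet address, interface, expiry). The names ArpEntry, parse_arp_line and stale_entries are hypothetical, and the watched set is just the two addresses from the example. It does not address the underlying ldpd behavior; it only detects the symptom.

```python
from typing import Iterable, List, NamedTuple, Optional, Set


class ArpEntry(NamedTuple):
    host: str
    mac: Optional[str]  # None when the table shows "(incomplete)"
    netif: str
    expire: str         # e.g. "6m8s" or "expired"


def parse_arp_line(line: str) -> Optional[ArpEntry]:
    """Parse one line of ARP table output; return None for headers or short lines."""
    fields = line.split()
    if len(fields) < 4 or fields[0] == "Host":
        return None
    host, mac, netif, expire = fields[:4]
    return ArpEntry(host, None if mac == "(incomplete)" else mac, netif, expire)


def stale_entries(lines: Iterable[str], watched: Set[str]) -> List[ArpEntry]:
    """Return watched neighbors whose entry is incomplete or marked expired."""
    stale = []
    for line in lines:
        entry = parse_arp_line(line)
        if entry is None or entry.host not in watched:
            continue
        if entry.mac is None or entry.expire == "expired":
            stale.append(entry)
    return stale


if __name__ == "__main__":
    # Sample lines copied from the output quoted in this message.
    sample = [
        "100.92.64.37 (incomplete) hvn0 expired",
        "100.92.64.68 00:b7:71:93:32:95 hvn0 6m8s",
    ]
    for entry in stale_entries(sample, {"100.92.64.37", "100.92.64.68"}):
        print(f"stale ARP entry for {entry.host} on {entry.netif}")
```

In practice the lines would come from the ARP table dump on the PE host rather than a hard-coded list, and the result could drive an alert or the static-entry workaround mentioned above.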