Hi Folks,

I've been contacted off-list with a request for further updates and information, 
and tonight I discovered something really weird that's well worth sharing. 
Apologies for another long post!

First, an update on the changes to my test environment:
I created a new vnic on e1000g1 called dnsvnic0, and created/cloned my 
sparse-template zone into a new sparse-root zone named dns, which uses dnsvnic0 
with the IP address 192.168.1.62.
The zone booted and I was straight into this problem again. I hadn't been able 
to get my sparse-template zone to fault again, but immediately after creating a 
new vnic/zone, I was back to having this elusive yet frustrating issue.
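
For completeness, the wiring is essentially the following (a rough sketch rather 
than a transcript; zonepath and the rest of the zone config are omitted, and 
older Crossbow builds use -d rather than -l for create-vnic):

# dladm create-vnic -l e1000g1 dnsvnic0
# zonecfg -z dns
zonecfg:dns> create
zonecfg:dns> set ip-type=exclusive
zonecfg:dns> add net
zonecfg:dns:net> set physical=dnsvnic0
zonecfg:dns:net> end
zonecfg:dns> commit
zonecfg:dns> exit
# zoneadm -z dns clone sparse-template
# zoneadm -z dns boot

The 192.168.1.62 address is configured inside the zone itself, since it's an 
exclusive-IP zone.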

Just as a refresher, my Solaris server here is a VM running under VMware ESXi 
3.5u3 (with all current patches). An extra layer of virtualisation does add 
extra questions, so I tried a ping test that would be entirely internal to the 
ESX host: pinging the global zone from the non-global [dns] zone.

Traffic test #1
From within the dns zone:
bash-3.2# ping 192.168.1.60
no answer from 192.168.1.60
bash-3.2# arp -an
Net to Media Table: IPv4
Device   IP Address               Mask      Flags      Phys Addr
------ -------------------- --------------- -------- ---------------
dnsvnic0 192.168.1.61         255.255.255.255 o        02:08:20:be:66:8e
dnsvnic0 192.168.1.60         255.255.255.255          00:0c:29:60:4e:c2
dnsvnic0 192.168.1.62         255.255.255.255 SPLA     02:08:20:ff:77:4f
dnsvnic0 192.168.1.133        255.255.255.255 o        00:15:f2:1d:48:c2
dnsvnic0 224.0.0.0            240.0.0.0       SM       01:00:5e:00:00:00
ARP packets *are* returning. ICMP packets, however, are *not*.
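
(The entry flagged SPLA is the zone's own vnic. A quick sanity check from the 
global zone is to confirm that the MAC the zone is publishing really is the one 
Crossbow assigned to dnsvnic0 -- the column layout varies a little between 
builds, but the MACADDRESS field should match the 02:08:20:ff:77:4f above:

# dladm show-vnic dnsvnic0

If it didn't match, that would point at a stale config rather than anything on 
the ESX side.)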

snoop from the global zone on the e1000g1 interface (which the vnic is running 
on):
# snoop -d e1000g1 arp or icmp
Using device e1000g1 (promiscuous mode)
192.168.1.62 -> (broadcast)  ARP C Who is 192.168.1.60, persephone ?
  persephone -> 192.168.1.62 ARP R 192.168.1.60, persephone is 0:c:29:60:4e:c2
192.168.1.62 -> persephone   ICMP Echo request (ID: 23182 Sequence number: 0)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23182 Sequence number: 1)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23182 Sequence number: 2)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23182 Sequence number: 3)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23182 Sequence number: 4)
(and so on...)

snoop from the global zone on e1000g0 (which only the global zone is using):
# snoop -d e1000g0 arp or icmp
Using device e1000g0 (promiscuous mode)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23212 Sequence number: 0)
  persephone -> 192.168.1.62 ICMP Echo reply (ID: 23212 Sequence number: 0)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23212 Sequence number: 1)
  persephone -> 192.168.1.62 ICMP Echo reply (ID: 23212 Sequence number: 1)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23212 Sequence number: 2)
  persephone -> 192.168.1.62 ICMP Echo reply (ID: 23212 Sequence number: 2)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23212 Sequence number: 3)
  persephone -> 192.168.1.62 ICMP Echo reply (ID: 23212 Sequence number: 3)
192.168.1.62 -> persephone   ICMP Echo request (ID: 23212 Sequence number: 4)
  persephone -> 192.168.1.62 ICMP Echo reply (ID: 23212 Sequence number: 4)

So the global zone is replying to the non-global zone; 'dns' just isn't seeing 
the replies.
This is sounding a lot like a weird vSwitch bug.

Next I decided to try zone-to-zone traffic:
Zone - vnic - IP
Zone-template - zonevnic0 (via e1000g1) - 192.168.1.61
DNS - dnsvnic0 (via e1000g1) - 192.168.1.62

This worked... DNS could ping Zone-template.
What really surprised me was that my snoop on e1000g1 was showing the traffic. 
It was my understanding that traffic between vnics attached to the same pNIC 
never actually goes across the wire, so why is snoop on a physical interface 
showing vnic <> vnic traffic?

A) Something in Crossbow isn't working properly.
B) I'm misunderstanding how vnics talk to each other. I understand etherstubs, 
but it just makes sense that inter-zone traffic shouldn't be sent down a 
bottleneck like a pNIC when it's all *internal* anyway (see the etherstub 
sketch just after this list).
C) The traffic isn't actually going out the physical interface across the wire, 
but it is passing through the logical e1000g1 interface that snoop reports on - 
which is rather confusing to an end user like me trying to diagnose this with 
snoop :(
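
For the record, the etherstub setup I'm alluding to in (B) would look something 
like the following -- stub0/zvnic0/zvnic1 are just example names, one vnic per 
zone (Zone-template and dns), and the zones would still need their existing 
vnics over e1000g1 to reach the rest of the LAN:

# dladm create-etherstub stub0
# dladm create-vnic -l stub0 zvnic0
# dladm create-vnic -l stub0 zvnic1

That would keep inter-zone traffic off the pNIC entirely, but it sidesteps the 
question rather than answering it.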

Can anyone clarify this one for me?

The WTF moment of the night was this:
vSwitch security in ESX is configured like this by default:
Promiscuous Mode: Disabled
MAC Address Changes: Accept
Forged Transmits: Accept

These sound like reasonable defaults to me; to my understanding, toggling the 
Promiscuous flag would pretty much turn the vSwitch into a "vHub"!

I left a [non-returning] ping running between dns and the global zone, and 
decided to try enabling Promiscuous mode anyway.
No change.

I started a snoop on e1000g1, and suddenly the sparse-template <> dns ping that 
I had started in another terminal moments earlier began working. I stopped the 
snoop, and it stopped working again.

!!!?

Enabling the promiscuous flag on the e1000g1 driver is suddenly "fixing" my 
traffic problem.
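
One more data point I can chase: snoop puts the NIC into promiscuous mode 
itself when it starts capturing, but it also has a -P flag to capture in 
non-promiscuous mode. Something like:

# snoop -P -d e1000g1 icmp

should separate "snoop is running" from "the interface is promiscuous" -- if 
the ping stays dead under -P and springs back to life without it, then it 
really is the promiscuous flag doing the "fixing".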

My best interpretation of this data is that one of three things isn't working, 
and I'm starting to get out of my depth here fast.

A) Crossbow itself is doing something 'funny' with the way traffic is being 
passed on to the vSwitch, which is causing the vSwitch not to send traffic for 
this MAC address down the correct virtual port. ARP/MAC spoofing is common 
enough, and both of the relevant vSwitch options (MAC Address Changes and 
Forged Transmits) are already set to Accept, so it would seem something else is 
confusing it. Sadly there isn't any interface to the vSwitch that I'm aware of 
for pulling stats/logs.
Funny promiscuous ARPs? Sending traffic down both pNICs? Something else to 
confuse the vSwitch? I'm out of skills to troubleshoot this option any further.
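
About the only thing I can still do from the Solaris side is watch the link 
counters while the failing ping runs, to see whether the replies die before or 
after the vnic. Something like this (the -s statistics option to show-link may 
differ between builds):

# dladm show-link -s -i 1 e1000g1
# dladm show-link -s -i 1 dnsvnic0

If e1000g1's receive counters don't move either, the vSwitch is eating the 
replies before they ever reach the VM; if e1000g1 counts them but dnsvnic0 
doesn't, the drop is somewhere in Crossbow's classification.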

B) The vSwitch in ESXi has a bug. If so, why is it only affecting Crossbow? 
ESX is very widely used, so if there were a glaring bug in the vSwitch Ethernet 
implementation it would be common and public knowledge by now. Crossbow is new 
enough; is it possible that I'm the first to have tried this configuration 
under ESX, and thus the first to notice the issue?
There aren't any other options within ESX that I'm aware of for getting further 
data on the vSwitch itself, so I'm at a loss as to how to troubleshoot this one 
further.
I'm also just using the free ESXi, so I can't contact VMware for support on 
this, and at this point it would be a pretty vague bug report anyway :/

C) The Intel PRO/1000 vNIC that ESX is exposing to the VM has a bug in it, or 
the Solaris e1000g driver has a bug when sending Crossbow traffic across it (or 
a combination of the two).
The Intel PRO/1000 is a very common server NIC, and I'd be gobsmacked if there 
were a bug with a real (non-virtual) e1000g adapter that the Sun folk hadn't 
picked up in their pre-release testing.
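
If anyone can point me at which e1000g driver counters would be telling here, I 
can dump them easily enough while the failing ping runs, e.g.:

# kstat -m e1000g -i 1

(instance 1 being e1000g1) and watch for receive-side errors or drops.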

The only vNIC option within ESX for a 64-bit Solaris guest is the e1000 NIC. 
I'm trying to set up a 32-bit guest to see which NIC that ends up with. If that 
produces a different result, it at least gives us some better information on 
where to start looking!
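
(Once that guest is up, dladm show-phys should show which driver the emulated 
NIC binds to -- the DEVICE column names the driver instance:

# dladm show-phys

assuming the 32-bit build has the same dladm bits, of course.)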

Any further directions or feedback would be most welcome. If I'm heading in the 
wrong direction, please do tell me :)

Jonathan