At 4:41 PM +0800 4/9/10, Min Miles Xu wrote:
Maurice Volaski wrote:
At 5:02 PM +0800 4/8/10, Min Miles Xu wrote:
Hi Maurice,

Did you notice how much time it took for the ping request packet to arrive at the Windows machine and how much time for the echo packet? And could you run the dtrace script in the attachment while you ping? Just type "dtrace -s e1000g_tx.d"


At first, running ping from the OpenSolaris VM doesn't display anything. Eventually it does, like so:

64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7944. time=1549727.719 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7947. time=1547915.398 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7951. time=1546165.354 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7952. time=1546227.851 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7954. time=1545352.856 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7955. time=1545415.344 ms

On the Windows side, I see the pings being received and replied to. This suggests to me it's the same problem described in the other thread. And if you are right about that, it implies that even if I could try a different NIC, it might still exhibit this behavior. What's puzzling is why I can't seem to trigger it by running multiple rsh threads and why many other users aren't experiencing it. It's not a new problem. I've seen it in much earlier builds, possibly b110!


Hmm, according to the dtrace output, an ICMP round trip takes no more than 1799 ms as seen by the kernel. But the ping output shows times several hundred times longer (about 1,549,728 ms at the worst). Something seems to be wrong inside the VM. Does anybody else have any ideas?
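
For anyone who wants to repeat the comparison, here is a minimal sketch in Python that pulls the reported times out of ping output like the above and sets them against the 1799 ms upper bound from the dtrace run. The regular expression and the names parse_ping and KERNEL_RTT_MS are made up for illustration; they are not part of e1000g_tx.d or of ping itself.

```python
import re

# Matches the Solaris-style ping lines quoted in this thread, e.g.
# "... icmp_seq=7944. time=1549727.719 ms"
PING_LINE = re.compile(r"icmp_seq=(\d+)\.\s+time=([\d.]+)\s*ms")

# Upper bound on the round trip as observed via dtrace (from this thread).
KERNEL_RTT_MS = 1799.0


def parse_ping(text):
    """Return a list of (icmp_seq, reported_rtt_ms) pairs found in text."""
    return [(int(seq), float(ms)) for seq, ms in PING_LINE.findall(text)]


sample = """\
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7944. time=1549727.719 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7947. time=1547915.398 ms
64 bytes from thevaultinternal (172.16.1.2): icmp_seq=7951. time=1546165.354 ms
"""

for seq, rtt_ms in parse_ping(sample):
    ratio = rtt_ms / KERNEL_RTT_MS
    print(f"seq={seq} reported={rtt_ms / 1000:.1f} s ratio={ratio:.0f}x")
```

On the lines quoted above, the reported times come out at roughly 860 times the dtrace figure, i.e. about 26 minutes per ping instead of under 2 seconds.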


I should have clarified that the actual ping time is, as you say, less than 2 seconds or so, nothing like the crazy million-plus ms figures being misreported by OpenSolaris. Presumably, that is part of the bug. The question is what is causing that and the already slow 1799 ms response times.

Anyway, I have some interesting news to report. I tested with OpenSolaris CIFS, the included Samba, and Samba 3.5.1 and they all poison and cripple the interface. I also tried on the virtual switch associated with a real NIC and it happens there too.

But since I couldn't get rsh from my Mac to do it, I wondered whether the problem is somehow being triggered by some pattern intrinsic to CIFS/SMB network traffic, so I am trying again by exporting the OpenSolaris shares via NFS to a Linux VM on the same computer that in turn re-exports them via Samba to the Windows VM. Guess what? This is working!
--

Maurice Volaski, [email protected]
Computing Support, Rose F. Kennedy Center
Albert Einstein College of Medicine of Yeshiva University
_______________________________________________
networking-discuss mailing list
[email protected]
