No, I'm not doing anything in particular in the virtual machine. The media file is played on another computer in the (physical) network over CIFS. Over the network I also access the server using Remote Desktop/Terminal Services to communicate with the virtual machine (using the VirtualBox RDP interface, i.e. not the guest OS RDP), VNC (to access OI using vncserver) and SSH (to OI).

I wouldn't say that the entire server stops responding, only the CIFS and SSH connections. I wasn't running VNC when it happened yesterday so I don't know about it, but the RDP connection and the virtual machine inside this server were unaffected while CIFS and SSH were frozen.

I tried today to start the virtual machine but it failed because it could not find the connection (e1000g2):

"Error: failed to start machine. Error message: Failed to open/create the internal network 'HostInterfaceNetworking-e1000g2 - Intel PRO/1000 Gigabit Ethernet' (VERR_SUPDRV_COMPONENT_NOT_FOUND).
Failed to attach the network LUN (VERR_SUPDRV_COMPONENT_NOT_FOUND).
Unknown error creating VM (VERR_SUPDRV_COMPONENT_NOT_FOUND)"

ifconfig -a returns:
...
e1000g1: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 2
        inet 10.40.137.185 netmask ffffff00 broadcast 10.40.137.255
e1000g2: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 3
        inet 10.40.137.196 netmask ffffff00 broadcast 10.40.137.255
rge0: flags=1004843<UP,BROADCAST,RUNNING,MULTICAST,DHCP,IPv4> mtu 1500 index 4
        inet 0.0.0.0 netmask ff000000
...

i.e. e1000g1 and e1000g2 appear to be running just fine, wtf!?! I found the following entries in /var/adm/messages:

Jan 23 13:50:49 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
Jan 23 13:56:59 <computername> last message repeated 75 times
Jan 23 13:57:04 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
Jan 23 13:58:19 <computername> last message repeated 15 times
Jan 23 13:58:22 <computername> gnome-session[916]: [ID 702911 daemon.warning] WARNING: Unable to determine session: Unable to lookup session information for process '916'
Jan 23 13:58:24 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
Jan 23 14:03:24 <computername> last message repeated 60 times
Jan 23 14:03:26 <computername> gnome-session[916]: [ID 702911 daemon.warning] WARNING: Unable to determine session: Unable to lookup session information for process '916'
Jan 23 14:03:29 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
Jan 23 14:03:34 <computername> last message repeated 1 time
Jan 23 14:03:39 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
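
To get a feel for how often each process is logging, entries like the ones above can be tallied with a short script. This is only a rough sketch: the function name tally and the SAMPLE excerpt are made up for illustration, and it assumes that each "last message repeated N times" line refers to the entry directly before it.

```python
import re
from collections import Counter

# "Mon DD HH:MM:SS host rest-of-line", as seen in /var/adm/messages.
ENTRY = re.compile(r"^(\w{3}\s+\d+\s[\d:]{8})\s+(\S+)\s+(.*)$")
# syslogd's summary line for consecutive duplicates.
REPEAT = re.compile(r"^last message repeated (\d+) times?$")
# Leading "process[pid]:" tag of a normal entry.
PROC = re.compile(r"^([\w.-]+)\[\d+\]:")


def tally(lines):
    """Count log entries per process, expanding 'repeated N times' lines."""
    counts = Counter()
    last = None  # process name of the most recent normal entry
    for line in lines:
        m = ENTRY.match(line.strip())
        if not m:
            continue  # skip blank or unparseable lines
        rest = m.group(3)
        rep = REPEAT.match(rest)
        if rep:
            # Assumption: the repeats belong to the previous entry.
            if last is not None:
                counts[last] += int(rep.group(1))
            continue
        p = PROC.match(rest)
        last = p.group(1) if p else "unknown"
        counts[last] += 1
    return counts


# Shortened excerpt of the entries quoted above (illustration only).
SAMPLE = """
Jan 23 13:50:49 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
Jan 23 13:56:59 <computername> last message repeated 75 times
Jan 23 13:57:04 <computername> nwamd[95]: [ID 234669 daemon.error] 3: nwamd_door_switch: need solaris.network.autoconf.read for request type 1
Jan 23 13:58:19 <computername> last message repeated 15 times
Jan 23 13:58:22 <computername> gnome-session[916]: [ID 702911 daemon.warning] WARNING: Unable to determine session: Unable to lookup session information for process '916'
"""

if __name__ == "__main__":
    for proc, n in tally(SAMPLE.splitlines()).most_common():
        print(proc, n)
```

Reading the lines from the real file instead of SAMPLE (e.g. with open("/var/adm/messages")) should work the same way, provided the entries follow the format shown here.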

Some errors here... I looked into the log of the nwam service (/var/svc/log/network-physical\:nwam.log):

[ Jan 23 13:03:15 Enabled. ]
[ Jan 23 13:03:16 Executing start method ("/lib/svc/method/net-nwam start"). ]
/lib/svc/method/net-nwam[548]: /sbin/ibd_upgrade: not found [No such file or directory]
[ Jan 23 13:03:17 Method "start" exited with status 0. ]
[ Jan 23 13:03:17 Rereading configuration. ]
[ Jan 23 13:03:17 Executing refresh method ("/lib/svc/method/net-nwam refresh"). ]
[ Jan 23 13:03:17 Method "refresh" exited with status 0. ]

Nothing remarkable here... I investigated the issue on the VBox forums, where it was resolved with the rem_drv/add_drv vboxflt commands. It's not the first time I've had this issue, and one of the people on the forums claims that it occurs after every third power cycle/reboot.

It was hinted that VBox doesn't like dynamic IP addresses, so I have also given e1000g2 a fixed address in the router (I configured the DHCP server in the router to always give the same IP to the MAC address of the e1000g2 connection). I had already done this for e1000g1, since otherwise it would be impossible to ssh to the server from the "outside world".

Robin.


On 2012-01-23 11:40, Open Indiana wrote:
Ok,

So if I read it correctly your virtual machine is playing an audio file and
then the server stops responding. That could mean the hardware that
VirtualBox uses to play the sound file is flooded or that the drivers of the
sound card in your server/PC are not working very well?
What sound card are you using?


-----Original Message-----
From: Robin Axelsson [mailto:gu99r...@student.chalmers.se]
Sent: Sunday 22 January 2012 23:38
To: openindiana-discuss@openindiana.org
Subject: Re: [OpenIndiana-discuss] CIFS performance issues

I don't understand what you mean by PCI-x settings or where to check them.
The hardware is not PCI-X, it is PCIe. The affected LSI HBA is a
discrete PCIe card that operates in IT mode. By system logs I assume you
mean /var/adm/messages, and I could not find anything there.

If this was only a hard disk controller issue (I made sure that there are
enough lanes for it) then I wouldn't expect applications such as SSH to be
affected by it.

The settings of the Intel NIC are not in the BIOS, at least not as far as I
can see (i.e. there is no visible BIOS of the discrete NIC like there is for
the LSI SAS controller during POST). So, I'm not entirely sure what settings
for the NIC you are referring to.
Robin.


On 2012-01-22 20:28, Open Indiana wrote:
A very stupid answer, but have you looked at the BIOS and inspected
the settings of the network devices and/or PCIx? How is your BIOS
set up (AHCI or RAID or ??)?

Do you see any error in the system logs?

In my opinion your system is choking on the data transfers, either on the
NIC<->motherboard side or on the motherboard<->hard disk controller
side.
Do your extra NICs and the LSI share the same PCI-x settings? Do they
both support all settings?

B,

Roelof
-----Original Message-----
From: Robin Axelsson [mailto:gu99r...@student.chalmers.se]
Sent: Sunday 22 January 2012 19:38
To: OpenIndiana-discuss@openindiana.org
Subject: [OpenIndiana-discuss] CIFS performance issues

In the past, I used OpenSolaris b134, which I then updated to
OpenIndiana b148, and never did I experience performance issues related
to the network connection (and that was when using two of the "infamous"
RTL8111DL onboard ports). After swapping the motherboard and the hard
drive and later adding a 2-port Intel EXPI9402PT NIC (because of driver
issues with the Realtek NIC that weren't there before), I performed a
fresh install of OpenIndiana.

Since then I have experienced intermittent network freeze-ups that I cannot
link to faults of the storage pool (iostat -E returns 0 errors). I
have had this issue both with the dual port Intel controller as well
as with a single port Intel controller (EXPI9400PT) and the Realtek
8111E OnBoard NIC. The storage pool is behind an LSI MegaRAID 1068e
based controller using no port extenders.

In detail (9400PT+8111E):
-------------------------
I was running a virtual machine with VirtualBox 3.2.14 with (1) a
bridged network connection; it was accessed over the network using (2)
the VBox RDP connection. There was also (3) a ZFS-based CIFS share
accessed from a Windows computer over the network. These applications
were administered over both (4) SSH (port 2244) and (5) VNC (using
vncserver). A typical start of the VM was done with
'screen VBoxHeadless --startvm ...'

I assigned the network ports the following way:

e1000g: VBox RDP, VNC, SSH
rge0: Virtual Machine Network Connection (Bridged)

I tried various combinations but the connection froze intermittently
for all applications. The bridged network connection was worst. When I
SSHed over rge0, the connection was frequently severed, which it was not
over e1000.
So I pulled the plug on rge0 and let everything go through the
e1000 connection. Freeze-ups became more frequent and it seemed like
the bridged connection was causing this issue because the connection
didn't freeze like that when the VM wasn't running.

Note that I didn't assign the CIFS share to any particular port but
calls to <computername> were assigned to the e1000 port in the
/etc/inet/hosts file.
-------------------------

In detail (9402PT):
-------------------
In this setup I run essentially the same applications but all through
the 9402PT which has two ports (e1000g1 and e1000g2). So I assign the
applications the following way:

e1000g1: VBox RDP, SSH, <computername> (in /etc/inet/hosts)
e1000g2: Bridged connection to the virtual machine

So while running the virtual machine on the server, having an open SSH
connection to it and a command prompt pointing (cd x:\) at the CIFS
share (which is mapped as a network drive, say "X:") I started a media
player and played an audio file over the CIFS share which made the
connection freeze.
The freezing affected the media player and the command prompt but the
RDP connection worked and access to internet inside the VM was flawless.
The SSH connection was frozen as well. After a few minutes it became
responsive and iostat -E reported no errors. The command prompt and
the media player were still frozen but "ls <path to CIFS shared contents>"
worked fine over the SSH connection. Shortly after that the CIFS
connection came back and things seemed to run OK.

So in conclusion the freeze-ups are still there but less frequent. I
have tried VirtualBox 4.1.8 but the ethernet connection is worse with
that version which is why I downgraded to 3.2.14 (which was published
_after_ 4.1.8).
-------------------

These issues occur on server grade hardware using drivers that
are/were certified by Sun (as I understand it). Moreover, CIFS and ZFS
are the core functionality of OpenIndiana so it is quite essential
that the network works properly and is stable.

I'm sorely tempted to issue a bug report but I would want some advice
on how to troubleshoot and provide relevant bug reports. There are no
entries in the /var/adm/messages that are related to the latest
freeze-up mentioned above and I couldn't find any when running the
prior setups. These freeze-ups don't happen all the time so it isn't
easy to consistently reproduce them.

Robin.




_______________________________________________
OpenIndiana-discuss mailing list
OpenIndiana-discuss@openindiana.org
http://openindiana.org/mailman/listinfo/openindiana-discuss


