Hi,

The PDC system disk is not on the storage; only a 150GB partition for databases is.
That's why I can't see why Windows would not let me in, even on the VMware console.
The requirement to have several DCs is a very nice trick from Microsoft to sell more licenses...

There is no zfs command running in my .bashrc but, now that you have opened my eyes:
just before entering the system via ssh, I tried to check the storage via our web interface, and it was responding correctly until I went to the Pool management page, where the web interface issued a "zpool list" and showed me the available pools.
Then I opened the tree to see the filesystem... and there it stopped responding.

At least I now understand why I could not enter the system anymore (not even on the console...).

Last questions:
- shouldn't I find some logs in the svc/logs of the iscsi services? (I don't...)
- should I raise the swap space? (it's now 4.5GB, physical memory is 8GB)
- what may be the reasons for the pool failing? A zpool status shows it's all fine.
- is there any other way I can prevent this from happening?

Thanks so much!
Gabriele.

----------------------------------------------------------------------------------
From: Jim Klimov
To: [email protected]
Cc: Gabriele Bulfon, Aldo Fornoni
Date: 9 November 2012 19:04:09 CET
Subject: Re: [discuss] illumos based ZFS storage failure

On 2012-11-09 18:01, Gabriele Bulfon wrote:
> The Windows PDC for CIFS resides inside a VMware 5 machine.
> This PDC also has a disk mounted via a VMware-iSCSI resource on the storage itself.
> ...
> There is a kind of loop in this situation, but I suspect that the whole problem started with iSCSI not responding from the storage:
> - the PDC did not let me in because it was angry about the missing disk from VMware-iSCSI.
> If this is the case, is it normal that a Windows server gives me no way to enter when a disk is not responding??
> Or should I just think that the PDC was freezing by itself (but in that case, I shouldn't see iSCSI timeouts on VMware).

Do I understand correctly that the PDC VM itself is not hosted on this storage (i.e. the OS disk is not iSCSIed from illumos)? Just to clarify...

If the OS disk hung (due to iSCSI), it might cause the Windows VM to hang, like any other OS. Inaccessibility due to a non-OS drive does seem strange... it should have been kicked off after a timeout, I think.

> - the CIFS was not responding because the PDC was timing out

That's why there should be several DCs ;)

> I don't understand why the storage was not giving me the bash, though!
> And I could not find any trace of errors, logs or anything else about iSCSI problems.

Sometimes I have hit conditions where ZFS/ZPOOL commands began hanging and never exiting, holding up a shell. Ultimately this was solved either by a reboot or (rarely) by ZFS completing its internal housekeeping (like those deferred deletions).

If your .bashrc does something like "zfs list" or "df -k", as I like doing on my systems to get a quick overview upon login, this can cause the shell to never appear.

Also, the hanging processes can pile up and exhaust the address space or even the process table, and the OS becomes inaccessible due either to lack of memory ("can't fork", if it is able to write that) or to constant in-kernel context switching between myriads of processes. For me, Solaris has survived about 10x as many sleeping processes as Linux in similar setups, but nothing is infinitely sturdy.

Finally, the address space could also be exhausted by something big in /tmp or /var/run or other "swap"-based filesystems (if your tmpfs is virtual-memory-backed, and no limit is artificially imposed by the mount options on /tmp). This by itself might be the cause, precluding programs from working and maybe hanging the iSCSI server (at least, the userspace program parts).

Whatever the reason, it is likely that the storage server could not write to its log files during the problem (i.e. syslog died), hence no reports to be found.

HTH,
//Jim Klimov
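
To illustrate the login-overview point above (a .bashrc that runs "zfs list" or "df -k" and never returns), here is a minimal sketch in Python of running such commands under a deadline so that a hung command cannot hold the login forever. It is not taken from the thread: the names run_with_deadline and login_overview and the 5-second limit are assumptions made for the example. It deliberately does not wait for the child after the deadline, because a process stuck inside the kernel may ignore the kill signal, and waiting for it would hang the caller as well.

```python
import subprocess
import sys


def run_with_deadline(argv, seconds=5.0):
    """Run argv and return its stdout, or None if it misses the deadline.

    Hypothetical helper for illustration. On timeout the child is sent a
    kill signal but is NOT waited for: a command blocked in the kernel may
    never exit, and waiting on it would block us too.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    try:
        out, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # best effort only; do not wait for the child
        return None
    return out


def login_overview(seconds=5.0):
    """Print a quick storage overview, skipping commands that hang."""
    for argv in (["zpool", "list"], ["zfs", "list"], ["df", "-k"]):
        name = " ".join(argv)
        try:
            out = run_with_deadline(argv, seconds)
        except FileNotFoundError:
            print(f"{name}: command not found", file=sys.stderr)
            continue
        if out is None:
            print(f"{name}: no answer after {seconds}s, skipped", file=sys.stderr)
        else:
            print(out, end="")
```

Calling login_overview() from a login script would then print whatever answers in time and move on. Note that this only keeps the shell usable; any command that stays stuck is left behind as a process, so the pile-up problem described above still applies if it is run repeatedly.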
