Thanks Alasdair, interesting ones.

About the /usr/sbin/quota in /etc/profile: why not change it so that it does not run when the login is root or another administrative user? That way I could keep quota management, while administrative logins would still be able to get a bash prompt in these odd situations.

Gabriele.

----------------------------------------------------------------------------------
From: Alasdair Lumsden
To: [email protected]
Date: 10 November 2012 14:37:20 CET
Subject: Re: [discuss] illumos based ZFS storage failure

I haven't read the whole thread, but the next time it happens you'll want to invoke a panic and make the dump file available. You'll want to ensure that:

1. Multithreaded dump is disabled in /etc/system with:

     * Disable MT dump
     set dump_plat_mincpu=0

   Without this there is a risk of your dump not saving correctly.

2. You have a dump device and it's big enough to hold a dump of your kernel (zfs set volsize=X rpool/dump)

3. dumpadm is happy and set to save cores etc:

     dumpadm -y -z on -c kernel -d /dev/zvol/dsk/rpool/dump

There's lots of good info here:

http://wiki.illumos.org/display/illumos/How+To+Report+Problems

You can also inspect things with mdb while the system is up, but if it's a production system you normally want to get it rebooted and back into production ASAP. So in that situation, you can take a dump of the running system with:

     savecore -L

One thing to keep in mind is that /etc/profile runs /usr/sbin/quota, which can screw over logins when the zfs subsystem is unhappy. I really think it should be removed by default, since on most systems quotas aren't even used. So comment it out - we do so on all our systems. This will give you a better chance of logging in when things go wrong. I think there's a way to SSH in bypassing /etc/profile but I can't remember what it is - perhaps someone can chime in.

Good luck. Centralised storage is difficult to do, and when it goes wrong everything that depends on it goes down. It's an "all your eggs in one giant failbasket".
Doing it homebrew with ZFS is cost effective and can be fast, but it is also risky. This is why there are companies like Nexenta out there with certified combinations of hardware and software engineered to work together. This extends to validating firmware combinations of disks/HBAs/etc.

Cheers,

Alasdair

-------------------------------------------
illumos-discuss
Archives: https://www.listbox.com/member/archive/182180/=now
RSS Feed: https://www.listbox.com/member/archive/rss/182180/21175541-02f10c6f
Modify Your Subscription: https://www.listbox.com/member/?&id;secret=21175541-29e3e0ee
Powered by Listbox: http://www.listbox.com
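On Gabriele's question about skipping /usr/sbin/quota for administrative logins, something along these lines might work as a rough sketch. The ADMIN_USERS list and the should_run_quota helper are hypothetical names made up for illustration; they are not part of the stock /etc/profile, and the list would need adjusting per site.

```shell
# Hypothetical, site-specific list of logins that should skip the quota check.
# (Assumption for illustration; not part of the stock /etc/profile.)
ADMIN_USERS="root admin"

# Hypothetical helper: returns 0 if the quota check should run for login $1,
# and 1 if $1 is listed in ADMIN_USERS.
should_run_quota() {
    for u in $ADMIN_USERS; do
        if [ "$1" = "$u" ]; then
            return 1    # administrative login: skip quota
        fi
    done
    return 0            # ordinary login: run quota as before
}

# In place of the bare /usr/sbin/quota line in /etc/profile:
if should_run_quota "${LOGNAME:-}" && [ -x /usr/sbin/quota ]; then
    /usr/sbin/quota
fi
```

This keeps the quota report for ordinary users while letting root and the other listed logins reach a shell without touching quota at all. It relies on LOGNAME being set by the time /etc/profile runs, and it does nothing for non-administrative logins, which would still run quota when the zfs subsystem is unhappy.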
