Re: [zfs-discuss] System started crashing hard after zpool reconfigure and OI upgrade
On Wed, Mar 20, 2013 at 08:50:40AM -0700, Peter Wood wrote:
> I'm sorry. I should have mentioned that I can't find any errors in the logs. The last entry in /var/adm/messages is that I removed the keyboard after the last reboot, and then it shows the new boot-up messages when I boot the system after the crash. The BIOS log is empty. I'm not sure how to check the IPMI, but IPMI is not configured and I'm not using it.

You definitely should! Plug a cable into the dedicated network port and configure it (the easiest way for you is probably to jump into the BIOS and assign the appropriate IP address etc.). Then, for a quick look, point your browser at the given IP on port 80 (default login is ADMIN/ADMIN). You may also want to configure some other details there (accounts/passwords/roles).

To track the problem, either write a script which polls the parameters in question periodically, or just install the latest ipmiViewer and use it to monitor your sensors ad hoc, see ftp://ftp.supermicro.com/utility/IPMIView/

> Just another observation - the crashes are more intense the more data the system serves (NFS). I'm looking into firmware upgrades for the LSI now.

The latest LSI FW should be P15; for this MB, type 217 (2.17), MB-BIOS C28 (1.0b). However, I doubt that your problem has anything to do with the SAS controller, OI or ZFS. My guess is that either your MB is broken (we had an X9DRH-iF which instantly disappeared as soon as it got some real load) or you have a heat problem (watch your CPU temperature, e.g. via ipmiViewer). With 2GHz that's not very likely, but worth a try (socket placement on this board is not really smart, IMHO).
To test quickly:
- disable all additional, unneeded services in OI which may put some load on the machine (like the NFS service, http and so on), and perhaps even export unneeded pools (just to be sure)
- fire up your ipmiViewer and look at the sensors (set the update interval to 10s), or refresh manually often
- start 'openssl speed -multi 32' and keep watching your CPU temperature sensors (with 2GHz I guess it takes ~12min)

I guess your machine disappears before the CPUs get really hot (broken MB). If the CPUs switch off (usually CPU2 first and CPU1 a little bit later), you have a cooling problem. If nothing happens, well, then it could be an OI or ZFS problem ;-)

Have fun, jel.
--
Otto-von-Guericke University http://www.cs.uni-magdeburg.de/
Department of Computer Science
Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany
Tel: +49 391 67 52768
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Changing rpool device paths/drivers
On Thu, Oct 04, 2012 at 07:57:34PM -0500, Jerry Kemp wrote:
> I remember a similar video that was up on YouTube, done by some of the Sun guys employed in Germany. They built a big array from USB drives, then exported the pool. Once the system was down, they re-arranged all the drives in random order and ZFS was able to figure out how to put the raid all back together. I need to go find that video.

http://constantin.glez.de/blog/2011/01/how-save-world-zfs-and-12-usb-sticks-4th-anniversary-video-re-release-edition ?

Have fun, jel.
Re: [zfs-discuss] Is there an actual newsgroup for zfs-discuss?
On Tue, Jun 12, 2012 at 08:12:48AM +1000, Alan Hargreaves wrote:
> There is a ZFS Community on the Oracle Communities that was just kicked off this month - https://communities.oracle.com/portal/server.pt/community/oracle_solaris_zfs_file_system/526

Ohh, another censored forum/crappy thing - no thanks!

Regards, jel.
Re: [zfs-discuss] bad seagate drive?
On Mon, Sep 12, 2011 at 12:52:42AM +0100, Matt Harrison wrote:
> On 11/09/2011 18:32, Krunal Desai wrote:
>> On Sep 11, 2011, at 13:01, Richard Elling wrote:
>>> The removed state can be the result of a transport issue. If this is a Solaris-based OS, then look at fmadm faulty for a diagnosis leading to a removal. If none, then look at fmdump -eV for errors relating to the disk. Last, check the zpool history to make sure one of those little imps didn't issue a zpool remove command.
>> Definitely check your cabling; a few of my drives disappeared like this as 'REMOVED', turned out to be some loose SATA cables on my backplane. --khd
> Thanks guys, I reinstalled the drive after testing on the Windows machine and it looks fine now. By the time I'd got on to the console it had already started resilvering. All done now and hopefully it will stay like that for a while.

Hmmm, at least if S11x, a ZFS mirror, ICH10 and the cmdk (IDE) driver are involved, I'm 99.9% confident that "a while" turns out to be some days or weeks only - no matter what Platinum-Enterprise-HDDs you use ;-)

Regards, jel.
Re: [zfs-discuss] bad seagate drive?
On Sun, Sep 11, 2011 at 11:41:32AM +0100, Matt Harrison wrote:
> Hi, I've got a system with 3 WD and 3 Seagate drives. Today I got an email that zpool status indicated one of the Seagate drives as REMOVED. I've tried clearing the error but the pool becomes faulted again. I've taken out the offending drive and plugged it into a Windows box with SeaTools installed. Unfortunately SeaTools finds nothing wrong with the drive.

Wondering: which OS version, driver and which controller? Also, is this always the 2nd drive of a 2-way mirror?

Regards, jel.
Re: [zfs-discuss] Gen-ATA read sector errors
On Thu, Jul 28, 2011 at 01:55:27PM +0200, Koopmann, Jan-Peter wrote:
> Hi, my system is running oi148 on a Super Micro X8SIL-F board. I have two pools (2-disc mirror, 4-disc RAIDZ) with RAID-level SATA drives (Hitachi HUA72205 and Samsung HE103UJ). The system runs as expected, however every few days (sometimes weeks) the system comes to a halt due to these errors:
>
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.warning] WARNING: /pci@0,0/pci-ide@1f,2/ide@1/cmdk@0,0 (Disk1):
> Dec 3 13:51:20 nasjpk Error for command 'read sector' Error Level: Fatal
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.notice] Requested Block 5503936, Error Block: 5503936
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.notice] Sense Key: uncorrectable data error
> Dec 3 13:51:20 nasjpk gda: [ID 107833 kern.notice] Vendor 'Gen-ATA ' error code: XX7
>
> It is not related to this one disk. It happens on all disks. Sometimes several are listed before the system crashes, sometimes just one. I cannot

I tend to agree that the IDE driver seems to have a problem: e.g. on our machines (HP Z400 with a 0B4Ch-D MB with an 82801JI (ICH10 Family) controller, using an rpool 2-way mirror of WDC WD5000AAKS HDDs) we also sometimes see that one drive gets disabled due to too many errors. zpool clear revives the pool (i.e. the HDD gets resilvered very quickly without any problem) 'til it occurs again (i.e. after some days, weeks, or months). Unfortunately we couldn't find a procedure to reproduce the problem (e.g. like for the Marvell ctrl in the early days).

Regards, jel.
Re: [zfs-discuss] SATA disk perf question
On Wed, Jun 01, 2011 at 06:17:08PM -0700, Erik Trimble wrote:
> On Wed, 2011-06-01 at 12:54 -0400, Paul Kraus wrote:
> Here's how you calculate (average) how long a random IOPS takes: seek time + ((60 / RPMs) / 2). A truly sequential IOPS is: (60 / RPMs) / 2. For that series of drives, seek time averages 8.5ms (per Seagate). So you get: 1 random IOPS takes [8.5ms + 4.13ms] = 12.6ms, which translates to 78 IOPS; 1 sequential IOPS takes 4.13ms, which gives 120 IOPS. Note that due to averaging, the above numbers may be slightly higher or lower for any actual workload.

Nahh, shouldn't that read "numbers may be _significantly_ higher or lower ..."? ;-)

Regards, jel.
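A quick sanity check of the quoted rule of thumb (my own arithmetic, assuming a 7200 RPM drive, whose half rotation of (60/7200)/2 ~ 4.17 ms is close to the quoted 4.13 ms; note the quoted 120 IOPS figure actually matches one *full* rotation of ~8.33 ms, not the 4.13 ms half rotation):

```shell
# Random IOPS: each I/O pays an average seek plus half a rotation.
# Sequential IOPS here: one full rotation per I/O (matches the 120 figure).
awk -v seek_ms=8.5 -v rpm=7200 'BEGIN {
    half_rot_ms = (60 / rpm) / 2 * 1000
    printf "random:     %.0f IOPS\n", 1000 / (seek_ms + half_rot_ms)
    printf "sequential: %.0f IOPS\n", 1000 / ((60 / rpm) * 1000)
}'
```

prints "random: 79 IOPS" and "sequential: 120 IOPS" - close to the 78/120 quoted above.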
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, Apr 08, 2011 at 08:29:31PM +1200, Ian Collins wrote:
> On 04/ 8/11 08:08 PM, Mark Sandrock wrote:
>> ... I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? ...
> No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a failover box (both data and applications) for the other.

Same thing here, plus several zones (source code repositories, documentation, even a real Samba server to avoid the MS crap, an install server, shared installs (i.e. relocatable packages shared via NFS, e.g. as /local/usr ...)). So yes, the 7xxx is a no-go for us as well. If there are no X45xx anymore, we'll find alternatives from other companies ...

>> Guess I'm slow. :-)

Maybe - flexibility/dependencies are some of the keywords ;-)

Regards, jel.
Re: [zfs-discuss] ZFS and TRIM
On Fri, Feb 04, 2011 at 03:30:45PM +0100, Pawel Jakub Dawidek wrote:
> On Sat, Jan 29, 2011 at 11:31:59AM -0500, Edward Ned Harvey wrote:
>> What is the status of ZFS support for TRIM? [...]
> My initial idea was to implement 100% reliable TRIM, so that I can implement secure delete using it, e.g. if ZFS is placed on top of disk

Hmmm - IIRC, ZFS load-balances ZIL ops over the available devices. Furthermore I guess almost everyone who considers SSDs for the ZIL will use at least 2 devices (i.e. since there is no big benefit in having a mirrored device, many people will probably use one SSD per device; some more paranoid people will choose to have an N-way mirror per device). So why not turn the 2nd device temporarily off (freeze), do what you want (trim/reformat/etc.) and then turn it on again? I assume, of course, that the load balancer recognizes when a device goes offline/online and automatically uses only the available ones ... If one doesn't have at least 2 devices, don't care about this home-optimized setup ;-)

Regards, jel.
Re: [zfs-discuss] Ext. UPS-backed SATA SSD ZIL?
On Sat, Nov 27, 2010 at 03:04:27AM -0800, Erik Trimble wrote:
> Hi, I haven't had a chance to test a Vertex 2 PRO against my 2 EX, and I'd be interested if anyone else has. The EX is SLC-based, and the PRO is MLC-based, but the claimed performance numbers are similar. If the PRO works well, it's less than half the cost, and would be a nice solution for most users who don't need ultra-super-performance from their ZIL.

Well, we'll get some toys to play with in the next couple of weeks (i.e. before x-mas), which probably allows us to mimic your env and do some testing before they go into production (probably in March 2011). So if you or anybody else has a special setup to test, feel free to send me a note ...

HW details: a new X4540 + 9x OCZSSD2-2VTXP50G + 3x OCZSSD2-2VTXP50G + an SM server with an LSI 620J and 24x 15K SAS2 drives (see http://iws.cs.uni-magdeburg.de/~elkner/supermicro/server.html) as well as a bunch of HP Z400 Xeon W3680 based WS. Estimated delivery of the 10G components is end of January 2011 (Nexus 5010 + some 3560X-nT-Ls).

> The DDRdrive is still the way to go for the ultimate ZIL acceleration,

Well, not for us, since full-height cards are a no-go for us, and PCIe 1.x x1 ...

Regards, jel.
Re: [zfs-discuss] Running on Dell hardware?
On Tue, Oct 26, 2010 at 08:06:53AM +1300, Ian Collins wrote:
> On 10/26/10 01:38 AM, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Ian Collins
>>> Sun hardware? Then you get all your support from one vendor.
>> +1 Sun hardware costs more, but it's worth it, if you want to simply assume your stuff will work. In my case, I'd say the Sun hardware was approx 50% to 2x higher cost than the equivalent Dell setup.
> I find that claim odd. Whenever we bought kit down here in NZ, Sun has been the best on price. Maybe that's changed under the new order.

Add about 50% to the last price list from Sun and you will get the price it costs now ...

Have fun, jel.
Re: [zfs-discuss] Legality and the future of zfs...
On Mon, Jul 12, 2010 at 05:05:41PM +0100, Andrew Gabriel wrote:
> Linder, Doug wrote:
>> Out of sheer curiosity - and I'm not disagreeing with you, just wondering - how does ZFS make money for Oracle when they don't charge for it? Do you think it's such an important feature that it's a big factor in customers picking Solaris over other platforms?
> Yes, it is one of many significant factors in customers choosing Solaris over other OS's. Having chosen Solaris, customers then tend to buy Sun/Oracle systems to run it on.

Hit the nail on the head, twice. But only if one doesn't have to sell one's kingdom to get recommended/security patches. Otherwise the windooze nerds take over ...

Regards, jel.
Re: [zfs-discuss] Homegrown Hybrid Storage
On Sun, Jun 06, 2010 at 09:16:56PM -0700, Ken wrote:
> I'm looking at VMware, ESXi 4, but I'll take any advice offered. ... I'm looking to build a virtualized web hosting server environment accessing files on a hybrid storage SAN. I was looking at using the Sun X-Fire X4540 with the following configuration:

IMHO Solaris zones with LOFS-mounted ZFS filesystems give you the highest flexibility in all directions, probably the best performance and least resource consumption, fine-grained resource management (CPU, memory, storage space), less maintenance stress, etc. ...

Have fun, jel.
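As an illustration of the zones + LOFS approach (the zone and dataset names below are made up; syntax per zonecfg(1M)): a dataset managed in the global zone, say mounted at /tank/web, can be loopback-mounted into a zone like this:

```shell
# Hypothetical names: zone "webzone", host-side dataset mounted at /tank/web.
zonecfg -z webzone <<'EOF'
add fs
set dir=/export/web
set special=/tank/web
set type=lofs
end
commit
EOF
```

The zone then sees the host's /tank/web at /export/web, while quotas, snapshots etc. stay managed via ZFS in the global zone.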
Re: [zfs-discuss] Sun X4500 disk drives
On Wed, May 12, 2010 at 09:34:28AM -0700, Doug wrote:
> We have a 2006 Sun X4500 with Hitachi 500G disk drives. It's been running for over four years and just now fmadm/zpool reports a disk has failed. No data was lost (RAIDZ2 + hot spares worked as expected.) But, the server is out of warranty and we have no hardware support on it.

Well - had the same thing here (X4500, Q1 2007) 2-3 times a couple of months ago. The 'too many errors' msg rang some bells: do you remember the race condition problems in the marvell driver (IIRC especially late u3, u4) which caused many 'bad ...' errors in the logs? So I simply checked the drive in question (quick and dirty: 2x dd over the whole disk, checking whether an error occurred). Since there was not a single error nor bad performance, I put it back, and no wonder, it is still working ;-). Your situation might be different, but checking may not hurt - your disk might be the victim of an SW aka ZFS error counter ...

Have fun, jel.
Re: [zfs-discuss] Large scale ZFS deployments out there (200 disks)
On Fri, Feb 26, 2010 at 09:25:57PM -0700, Eric D. Mudama wrote:
> ... I agree with the above, but the best practices guide: http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#ZFS_file_service_for_SMB_.28CIFS.29_or_SAMBA states in the SAMBA section that "Beware that mounting 1000s of file systems will impact your boot time." I'd say going from a 2-3 minute boot time to a 4+ hour boot time is more than just "impact". That's getting hit by a train.

At least on S10u8 it's not that bad. Last time I patched and rebooted an X4500 with ~350 ZFS filesystems, it took about 10min to come up; an X4600 with a 3510 and ~2350 ZFS filesystems took about 20min (almost all are shared via NFS). Shutting down/unsharing them takes roughly the same time ... On the X4600, creating|destroying a single ZFS (no matter on which pool or how many ZFS belong to the same pool!) takes about 20 sec, renaming about 40 sec - that's really a pain ...

Regards, jel.
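A back-of-the-envelope check of those numbers (my own arithmetic, not measured) gives the average per-filesystem mount/share cost at boot:

```shell
# ~350 filesystems in ~10 min vs. ~2350 filesystems in ~20 min
awk 'BEGIN {
    printf "X4500: %.1f s per filesystem\n", 10 * 60 / 350
    printf "X4600: %.2f s per filesystem\n", 20 * 60 / 2350
}'
```

So the bigger box actually spends less time per filesystem, but either way the cost grows linearly with the number of filesystems, which is why thousands of them hurt.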
Re: [zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 10:29:18AM -0500, Frank Cusack wrote:
> On February 3, 2010 12:04:07 PM +0200 Henu henrik.he...@tut.fi wrote:
>> Is there a possibility to get a list of changed files between two snapshots?
> Great timing, as I just looked this up last night; I wanted to verify that an install program was only changing the files on disk that it claimed to be changing. So I have to say, come on. It took me but one google search and the answer was one of the top 3 hits. http://forums.freebsd.org/showthread.php?p=65632
>
> # newer files
> find /file/system -newer /file/system/.zfs/snapshot/snapname -type f
>
> # deleted files
> cd /file/system/.zfs/snapshot/snapname
> find . -type f -exec sh -c 'test -f "/file/system/$1" || echo "$1"' sh {} \;
>
> The above requires GNU find (for -newer), and obviously it only finds files. If you need symlinks or directory names, modify as appropriate. The above is also obviously to compare a snapshot to the current filesystem. To compare two snapshots, make the obvious modifications.

Perhaps http://iws.cs.uni-magdeburg.de/~elkner/ddiff/ wrt. dir2dir cmp may help as well (should be faster).

Have fun, jel.
Re: [zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 12:19:50PM -0500, Frank Cusack wrote:
> On February 3, 2010 6:02:52 PM +0100 Jens Elkner jel+...@cs.uni-magdeburg.de wrote:
>> On Wed, Feb 03, 2010 at 10:29:18AM -0500, Frank Cusack wrote:
>>> # newer files
>>> find /file/system -newer /file/system/.zfs/snapshot/snapname -type f
>>> # deleted files
>>> cd /file/system/.zfs/snapshot/snapname
>>> find . -type f -exec test -f /file/system/{} || echo {} \;
>>> The above requires GNU find (for -newer), and obviously it only finds files. If you need symlinks or directory names, modify as appropriate. The above is also obviously to compare a snapshot to the current filesystem. To compare two snapshots, make the obvious modifications.
>> Perhaps http://iws.cs.uni-magdeburg.de/~elkner/ddiff/ wrt. dir2dir cmp may help as well (should be faster).
> If you don't need to know about deleted files, it wouldn't be. It's hard to be faster than walking through a single directory tree if ddiff has to walk through 2 directory trees.

Yepp, but I guess the 'test ...' invocation for each file alone is much more time consuming, and IIRC the test -f path has to do several stats as well 'til it reaches its final target. So a lot of overhead again. However, just finding newer files via 'find' is probably unbeatable ;-)

> If you do need to know about deleted files, the find method still may be faster depending on how ddiff determines whether or not to do a file diff. The docs don't explain the heuristics so I wouldn't want to guess on that.

ddiff is a single process and basically travels recursively through the directories via a DirectoryStream (side by side) and stops at the point where no more information is required to make the final decision (depends on the cmd line options). So for very deep dirs with a lot of entries it needs [much] more memory than find, yes. Not sure how DirectoryStream is implemented, but I guess it gets mapped to readdir(3C) and friends ...

Regards, jel.
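The side-by-side walk described above can also be approximated with standard tools; a rough sketch (mine, not ddiff's actual algorithm) that compares two trees by path names only, ignores timestamps, and reports added and deleted files:

```shell
# Names-only diff of two directory trees; content comparison omitted.
snapshot_diff() {
    old=$1 new=$2
    ( cd "$old" && find . -type f | sort ) > /tmp/sdiff_old.$$
    ( cd "$new" && find . -type f | sort ) > /tmp/sdiff_new.$$
    echo "added:";   comm -13 /tmp/sdiff_old.$$ /tmp/sdiff_new.$$
    echo "deleted:"; comm -23 /tmp/sdiff_old.$$ /tmp/sdiff_new.$$
    rm -f /tmp/sdiff_old.$$ /tmp/sdiff_new.$$
}
# e.g.: snapshot_diff /tank/home/.zfs/snapshot/monday /tank/home
```

Unlike the per-file 'test -f' one-liner, this forks no extra process per file; the trade-off is holding both sorted name lists in /tmp, roughly the memory behaviour discussed for ddiff.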
Re: [zfs-discuss] How to get a list of changed files between two snapshots?
On Wed, Feb 03, 2010 at 06:46:57PM -0500, Ross Walker wrote:
> On Feb 3, 2010, at 12:35 PM, Frank Cusack frank+lists/z...@linetwo.net wrote:
>> On February 3, 2010 12:19:50 PM -0500 Frank Cusack frank+lists/z...@linetwo.net wrote:
>>> If you do need to know about deleted files, the find method still may be faster depending on how ddiff determines whether or not to do a file diff. The docs don't explain the heuristics so I wouldn't want to guess on that.
>> An improvement on finding deleted files with the find method would be to not limit your find criteria to files. Directories with deleted files will be newer than in the snapshot, so you only need to look at those directories. I think this would be faster than ddiff in most cases.
> So was there a final consensus on the best way to find the difference between two snapshots (files/directories added, files/directories deleted and files/directories changed)? Find won't do it, ddiff won't do it,

ddiff does exactly this. However, it never looks at any timestamp, since that is the most unimportant/unreliable path-component tag wrt. what has been changed, and it also does not take file permissions and xattrs into account. So ddiff is all about path names, types and content. Not more, but also not less ;-)

> I think the only real option is rsync. Of course you can zfs send the snap to another system and do the rsync there against a local previous version.

Probably the worst of all suggested alternatives ...

Have fun, jel.
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Mon, Dec 14, 2009 at 01:29:50PM +0300, Andrey Kuzmin wrote:
> On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner jel+...@cs.uni-magdeburg.de wrote:
>> ... Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice ...
> Flash-based read cache should help here by minimizing (metadata) read latency, and a flash-based log would bring down write latency.

Hmmm, not yet sure - I think writing via NFS is the biggest problem. Anyway, I have almost finished the work on a 'generic collector' and data visualizer, which allows us to better correlate the numbers to each other on the fly (i.e. no rrd pain) and hopefully understand them a little bit better ;-).

> The only drawback of using a single F20 is that you're trying to minimize both with the same device.

Yepp. But would that scenario change much when one puts 4 SSDs into HDD slots instead? I guess not really, or it would even be worse, because it disturbs the data path from/to the HDD controllers. Anyway, I'll try that out next year, when those neat toys are officially supported (and the budget for this got its final approval, of course).

Regards, jel.
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Sat, Dec 12, 2009 at 03:28:29PM +0000, Robert Milkowski wrote:
> Jens Elkner wrote:
>> Hi Robert, just got a quote from our campus reseller, that readzilla and logzilla are not available for the X4540 - hmm, strange ... Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs? If so, is it possible to partition the F20, e.g. into a 36 GB logzilla and a 60 GB readzilla (also interesting for other X servers)?
> IIRC the card presents 4x LUNs, so you could use each of them for a different purpose. You could also use different slices.

Oh, cool - IMHO this would be sufficient for our purposes (see next posting).

>> ... it doesn't matter, whether the log cache is protected for a short time or not. Is this correct?
> It still does. The capacitor is not for flushing data to the disk drives! The card has a small amount of DRAM memory on it which is being flushed to FLASH. The capacitor is to make sure that actually happens if the power is lost.

Yepp - found the specs. (BTW: was probably too late to think about the term "Flash Accelerator" having DRAM Prestoserve in mind ;-)).

Thanks, jel.
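For the slice variant, the commands would look roughly like this (a sketch only: the pool name and device names are made up, and the ~36/~60 GB slices would have to be laid out on the F20 LUN beforehand, e.g. with format(1M)):

```shell
# s0 (~36 GB) as separate intent log ("logzilla"),
# s1 (~60 GB) as L2ARC read cache ("readzilla")
zpool add tank log /dev/dsk/c4t0d0s0
zpool add tank cache /dev/dsk/c4t0d0s1
```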
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Sat, Dec 12, 2009 at 04:23:21PM +0000, Andrey Kuzmin wrote:
> As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-)

Hmm - good point. What I'm trying to accomplish: actually our current prototype thumper setup is:

- root pool (1x 2-way mirror SATA)
- hotspare (2x SATA, shared)
- pool1 (12x 2-way mirror SATA) ~25% used - user homes
- pool2 (10x 2-way mirror SATA) ~25% used - mm files, archives, ISOs

So pool2 is not really a problem - it delivers about 600MB/s uncached, about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8GB ISO) and is not continuously stressed. However, sync write is ~200 MB/s, i.e. 20 MB/s per mirror, only. The problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice, usually via NFS and a little bit via Samba - a lot of more or less small files, probably widely spread over the platters. E.g. checking out a project from a svn|* repository into a home takes hours. Also, having one's workspace on NFS isn't fun (compared to a local Linux xfs-driven soft 2-way mirror).

Data currently come in/go out via 1Gbps aggregated NICs; for the X4540 we plan to use one 10 Gbps NIC (and may experiment with two some time later). So max. 2 GB/s read and write. This still leaves 2GB/s in and out for the last PCIe x8 slot - the F20. Since the IO55 is bound with 4GB/s bidirectional HT to Mezzanine Connector 1, in theory those 2 GB/s to and from the F20 should be possible. So IMHO wrt. bandwidth it basically makes no real difference whether one puts 4 SSDs into HDD slots or uses the 4 flash modules on the F20 (even when distributing the SSDs over the IO55(2) and MCP55). However, having it on a separate HT link than the HDDs might be an advantage. Also, one would be much more flexible and able to scale immediately, i.e. one doesn't need to re-organize the pools because of now-unavailable slots and is still able to use all HDD slots with normal HDDs. (We are certainly going to upgrade the X4500 to an X4540 next year ...)

(And if Sun makes an F40 - dropping the SAS ports and putting 4 more flash modules on it - or is able to get flash modules with double the speed, one could probably really get ~1.2 GB/s write and ~2 GB/s read.)

So this seems to be a really interesting thing, and I expect at least wrt. user homes a real improvement, no matter how the final configuration will look. Maybe the experts at the source are able to do some 4x SSD vs. 1x F20 benchmarks? I guess, at least if they turn out to be good enough, it wouldn't hurt ;-)

> Jens Elkner wrote:
>> ... whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs?

Regards, jel.
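One caveat on the link-budget arithmetic above (my own numbers; raw line rates, protocol overhead ignored): a single 10 Gbps NIC tops out around 1.25 GB/s per direction, so the "max. 2 GB/s" figure matches the PCIe 1.x x8 slot limit, or two NICs, rather than one NIC:

```shell
awk 'BEGIN {
    printf "10 GbE:      %.2f GB/s per direction\n", 10 / 8
    printf "PCIe 1.x x8: %.1f GB/s per direction\n", 8 * 0.25   # ~250 MB/s/lane
}'
```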
[zfs-discuss] X4540 + SFA F20 PCIe?
Hi, just got a quote from our campus reseller that readzilla and logzilla are not available for the X4540 - hmm, strange ... Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs? If so, is it possible to partition the F20, e.g. into a 36 GB logzilla and a 60 GB readzilla (also interesting for other X servers)?

Wrt. supercapacitors: I would guess that, at least wrt. the X4540, it doesn't give one more protection, since if power is lost, the HDDs do not respond anymore, and thus it doesn't matter whether the log cache is protected for a short time or not. Is this correct?

Regards, jel.
Re: [zfs-discuss] Liveupgrade'd to U8 and now can't boot previous U6 BE :(
On Wed, Oct 28, 2009 at 01:55:57AM -0700, Ben Middleton wrote:
> Hi,
> $ ludelete 10_05-09
> System has findroot enabled GRUB
> Checking if last BE on any disk...
> ERROR: cannot mount '/.alt.10_05-09/var': directory is not empty
> ERROR: cannot mount mount point /.alt.10_05-09/var device rpool/ROOT/s10x_u7wos_08/var
> ERROR: failed to mount file system rpool/ROOT/s10x_u7wos_08/var on /.alt.10_05-09/var
> ERROR: unmounting partially mounted boot environment file systems
> ...
> rpool/ROOT/s10x_u7wos_08      17.4M 4.26G 4.10G /.alt.10_05-09
> rpool/ROOT/s10x_u7wos_08/var  9.05M 4.26G 2.11G /.alt.10_05-09/var

luumount /.alt.10_05-09
mount -p | grep /.alt.10_05-09
# if it lists something (e.g. tmp, swap, etc.), reboot first, and then:
zfs set mountpoint=/mnt rpool/ROOT/s10x_u7wos_08
zfs mount rpool/ROOT/s10x_u7wos_08
rm -rf /mnt/var/* /mnt/var/.???*
zfs umount /mnt
# now that should work:
lumount 10_05-09 /mnt
luumount /mnt
# if not, send the output of: mount -p | grep ' /mnt'

Have fun, jel.
[zfs-discuss] strange results ...
Hmmm, wondering about IMHO strange ZFS results ...

X4440: 4x6 2.8GHz cores (Opteron 8439 SE), 64 GB RAM, 6x Sun STK RAID INT V1.0 (Hitachi H103012SCSUN146G SAS), Nevada b124.

Started with a simple test using zfs on c1t0d0s0:

cd /var/tmp
(1) time sh -c 'mkfile 32g bla ; sync'
    0.16u 19.88s 5:04.15 6.5%
(2) time sh -c 'mkfile 32g blabla ; sync'
    0.13u 46.41s 5:22.65 14.4%
(3) time sh -c 'mkfile 32g blablabla ; sync'
    0.19u 26.88s 5:38.07 8.0%
chmod 644 b*
(4) time dd if=bla of=/dev/null bs=128k
    262144+0 records in
    262144+0 records out
    0.26u 25.34s 6:06.16 6.9%
(5) time dd if=blabla of=/dev/null bs=128k
    262144+0 records in
    262144+0 records out
    0.15u 26.67s 4:46.63 9.3%
(6) time dd if=blablabla of=/dev/null bs=128k
    262144+0 records in
    262144+0 records out
    0.10u 20.56s 0:20.68 99.9%

So 1-3 is more or less as expected (~97..108 MB/s write). However, 4-6 looks strange: 89, 114 and 1585 MB/s read! Since the ARC size is ~55+-2 GB (at least arcstat.pl says so), I guess (6) reads from memory completely. Hmm - maybe. However, I would expect that when repeating 5-6, 'blablabla' gets replaced in the cache by 'bla' or 'blabla'. But the numbers say that 'blablabla' is kept in the cache, since I get almost the same results as in the first run (and zpool iostat/arcstat.pl show almost no activity at all for blablabla). So is this a ZFS bug? Or does the OS do some magic here?

2nd) Never had a Sun STK RAID INT before. Actually my intention was to create a zpool mirror of sd0 and sd1 for boot and logs, and a 2x2-way zpool mirror with the 4 remaining disks. However, the controller does not seem to support JBODs :( - which is also bad, since we can't simply put those disks into another machine with a different controller without data loss, because the controller seems to use its own format under the hood. Also the 256 MB battery-backed cache seems to be a little bit small for the ZIL, even if one knew how to configure it for that ...

So what would you recommend?
Creating 2 appropriate STK INT arrays and using each as a single zpool device, i.e. without ZFS mirror devs and 2nd copies? The intended workload is MySQL DBs + VBox images wrt. the 4-disk mirror, and logs + OS for the 2-disk mirror; the machine should also act as a sunray server (user homes and additional apps are coming from another server via NFS). Any hints?

Regards, jel.
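The write rates quoted above can be sanity-checked from the mkfile wall-clock times: 32 GiB (= 32*1024 MiB) divided by the three elapsed times (5:04.15, 5:22.65, 5:38.07 converted to seconds). A quick sketch:

```shell
# Back-of-envelope check of the mkfile write numbers above:
# 32 GiB over each run's wall-clock time, printed as MB/s.
for t in 304.15 322.65 338.07; do
    awk -v sec="$t" 'BEGIN { printf "%.0f MB/s\n", 32 * 1024 / sec }'
done
# prints: 108 MB/s, 102 MB/s, 97 MB/s
```

That reproduces the "~97..108 MB/s write" range the post mentions; the same arithmetic on the dd read times gives the 89/114/1585 MB/s figures.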
Re: [zfs-discuss] Liveupgrade'd to U8 and now can't boot previous U6 BE :(
On Fri, Oct 16, 2009 at 07:36:04PM -0700, Paul B. Henson wrote:
> I used live upgrade to update a U6+lots'o'patches system to vanilla U8. I ran across CR 6884728, which results in extraneous lines in vfstab preventing successful boot. I logged in with maintenance mode and deleted

Having a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/solaris-upgrade.txt shouldn't hurt ;-)

> those lines, and the U8 BE came up ok. I wasn't sure if there were any other problems from that, so I tried to activate and boot back into my previous U6 BE. That now fails with this error:
>
> This device is not bootable!
> It is either offlined or detached or faulted.
> Please try to boot from a different device.
>
> NOTICE: spa_import_rootpool: error 22
> Cannot mount root on /p...@1,0/pci1022,7...@4/pci11ab,1...@1/d...@0,0:a fstype zfs
> panic[cpu0]/thread=fbc283a0: vfs_mountroot: cannot mount root
>
> I can still boot fine into the new U8 BE, but so far have found no way to recover and boot into my previously existing U6 BE.

Hmm - haven't done thumper upgrades yet, but on sparc there is no problem booting into the old BE as long as the zpool hasn't been upgraded to U8's v15. So the first thing to check is whether the pool is still at <= v10 (U7 used v10, not sure about U6).

Regards, jel.
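The version check suggested above can be sketched as a plain-sh guard. The values here are placeholders for what `zpool upgrade` would actually report on the box - U7's v10 and U8's v15 are from the post, everything else is assumption:

```shell
#!/bin/sh
# Hypothetical guard before activating an older BE: make sure the pool
# was not upgraded past the highest version the old release understands.
pool_ver=10     # placeholder: would be parsed from `zpool upgrade` output
old_be_max=10   # U7 shipped zpool v10 (per the post); U8 is v15

if [ "$pool_ver" -le "$old_be_max" ]; then
    echo "old BE can still import the pool"
else
    echo "pool upgraded beyond the old BE - do not activate it"
fi
```

If the pool is already at v15, the old BE's kernel cannot import it and you get exactly the spa_import_rootpool panic quoted above.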
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, Oct 13, 2009 at 10:59:37PM -0600, Drew Balfour wrote:
> ... For Opensolaris, Solaris CIFS != samba. Solaris now has a native in-kernel CIFS server which has nothing to do with samba. Apart from having its commands start with smb, which can be confusing. http://www.opensolaris.org/os/project/cifs-server/

Ah ok. Thanx for the clarification!

Regards, jel.
Re: [zfs-discuss] Solaris 10 samba in AD mode broken when user in 32 AD groups
On Tue, Oct 13, 2009 at 09:20:23AM -0700, Paul B. Henson wrote:
> We're currently using the Sun bundled Samba to provide CIFS access to our ZFS user/group directories. ... Evidently the samba engineering group is in Prague. I don't know if it is a language problem, or where the confusion is coming from, but even after escalating this through our regional support manager, they are still refusing to fix this bug and claiming it is an RFE.

Haven't tested the bundled samba stuff for a long time, since I don't trust it: the bundled stuff didn't work when tested; the packages are IMHO awfully assembled; problems are not understood by the involved engineers (or they are not willing to understand them); and the team seems to follow the dogma "fix the symptoms, not the root cause". So at least if the bundled stuff is modified according to their RFEs on bugzilla, don't be surprised if your environment gets screwed up - especially when you have a mixed user group, i.e. Windows and *ix based users, which use workgroup directories for sharing their stuff.

So we still use the original samba and it causes no headaches. Once we had a problem when switching some desktops to Vista and MS Office 2007, due to the new win strategy "save changes to a tmp file, then rename to the original file" - wrong ACLs. However, this was fixed within ONE DAY: just did some code scanning, talked to Jeremy Allison via the smb IRC channel, and voilà, he came up with a fix pretty fast. So I didn't need to waste my time explaining the problem again and again to SUN support, creating explorer archives (which usually hang the NFS services - which couldn't be fixed without a reboot!), and waiting several months to get it fixed (BTW: IIRC, I opened a case for this via sun support, so if it hasn't been silently closed, it's probably still open ...).

Since we guess that CIFS gets screwed up by the same team, we don't use it either (well, and can't, because we've no ADS ;-)).

My 10¢. Regards, jel.
Re: [zfs-discuss] alternative hardware configurations for zfs
On Sat, Sep 12, 2009 at 02:37:35PM -0500, Tim Cook wrote:
> On Sat, Sep 12, 2009 at 10:17 AM, Damjan Perenic ...
> > I shopped for 1TB 7200rpm drives recently and I noticed Seagate Barracuda ES.2 has a 1TB version with SATA and SAS interface.
> On the flip side, according to storage review, the SATA version trumps the SAS version in pretty much everything but throughput (which is still negligible).
> [5]http://www.storagereview.com/php/benchmark/suite_v4.php?typeID=10&testbedID=4&osID=6&raidconfigID=1&numDrives=1&devID_0=354&devID_1=362&devCnt=2
> --Tim

Just in case you are interested in SATA, perhaps this helps (made on an almost idle system):

elkner.sol /pool2 uname -a
SunOS sol 5.11 snv_98 i86pc i386 i86xpv

elkner.sol /rpool prtdiag
System Configuration: Intel S5000PAL
BIOS Configuration: Intel Corporation S5000.86B.10.00.0091.081520081046 08/15/2008
BMC Configuration: IPMI 2.0 (KCS: Keyboard Controller Style)
Processor Sockets
Version Location Tag
--
Intel(R) Xeon(R) CPU E5440 @ 2.83GHz CPU1
Intel(R) Xeon(R) CPU E5440 @ 2.83GHz CPU2
...
elkner.sol /pool2 + /usr/X11/bin/scanpci | grep -i sata
Intel Corporation 631xESB/632xESB SATA AHCI Controller

elkner.sol ~ iostat -E | \
  awk '/^sd/ { print $1; getline; print; getline; print }'
sd0
Vendor: ATA Product: ST3250310NS Revision: SN05 Serial No:
Size: 250.06GB 250059350016 bytes
sd1
Vendor: ATA Product: ST3250310NS Revision: SN04 Serial No:
Size: 250.06GB 250059350016 bytes
sd2
Vendor: ATA Product: ST3250310NS Revision: SN04 Serial No:
Size: 250.06GB 250059350016 bytes
sd3
Vendor: ATA Product: ST3250310NS Revision: SN05 Serial No:
Size: 250.06GB 250059350016 bytes
sd5
Vendor: ATA Product: ST31000340NS Revision: SN06 Serial No:
Size: 1000.20GB 1000204886016 bytes
sd6
Vendor: ATA Product: ST31000340NS Revision: SN06 Serial No:
Size: 1000.20GB 1000204886016 bytes

elkner.sol ~ zpool status | grep ONLINE
 state: ONLINE
        pool1         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t2d0    ONLINE       0     0     0
            c1t3d0    ONLINE       0     0     0
 state: ONLINE
        pool2         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t4d0    ONLINE       0     0     0
            c1t5d0    ONLINE       0     0     0
 state: ONLINE
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c1t0d0s0  ONLINE       0     0     0
            c1t1d0s0  ONLINE       0     0     0

elkner.sol /pool2 + time sh -c 'mkfile 4g xx; sync; echo ST31000340NS'
ST31000340NS
real 3:55.2
user 0.0
sys 1.9

elkner.sol ~ iostat -zmnx c1t4d0 c1t5d0 5 | grep -v device
0.0 154.2 0.0 19739.4 3.0 32.0 19.4 207.5 100 100 c1t4d0
0.0 125.8 0.0 16103.9 3.0 32.0 23.8 254.3 100 100 c1t5d0
0.0 133.0 0.0 16366.9 2.4 25.9 17.9 194.4  80  82 c1t4d0
0.0 158.0 0.0 19592.5 2.8 30.3 17.6 191.7  93  96 c1t5d0
0.0 159.4 0.0 20054.8 2.8 30.3 17.7 190.2  94  95 c1t4d0
0.0 140.2 0.0 17597.2 2.8 30.3 20.1 216.4  94  95 c1t5d0
0.0 134.8 0.0 16298.7 2.0 23.0 15.2 170.8  68  76 c1t4d0
0.0 154.4 0.0 18807.5 2.7 29.3 17.3 189.9  89  94 c1t5d0
0.0 188.4 0.0 24115.5 3.0 32.0 15.9 169.8 100 100 c1t4d0
0.0 159.8 0.0 20454.6 3.0 32.0 18.8 200.2 100 100 c1t5d0
0.0 120.0 0.0 14328.3 2.0 22.2 16.4 184.9  66  71 c1t4d0
0.0 143.2 0.0 17169.9 2.6 28.2 18.0 197.1  86  93 c1t5d0
0.0 157.0 0.0 19140.9 2.6 29.3 16.5 186.9  87  96 c1t4d0
0.0 169.2 0.0 20676.9 2.2 24.8 13.2 146.6  75  79 c1t5d0
0.0 156.2 0.0 19993.8 3.0 32.0 19.2 204.8 100 100 c1t4d0
0.0 140.4 0.0 17971.3 3.0 32.0 21.3 227.9 100 100 c1t5d0
0.0 138.8 0.0 16759.6 2.6 29.3 18.4 210.9  86  94 c1t4d0
0.0 146.6 0.0 17809.2 2.7 29.6 18.4 201.7  90  94 c1t5d0
0.0 133.8 0.0 16196.8 2.5 28.0 18.9 209.3  85  90 c1t4d0
0.0 134.0 0.0 16222.4 2.6 28.7 19.5 214.3  87  94 c1t5d0
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device

elkner.sol /pool1 + time sh -c 'mkfile 4g xx; sync; echo ST3250310NS'
ST3250310NS
real 1:33.5
user 0.0
sys 2.0

elkner.sol ~ iostat -zmnx c1t2d0 c1t3d0 5 | grep -v device
0.2 408.6  1.6 49336.8 25.7 0.8 62.8 1.9  79  79 c1t3d0
0.2 432.6  1.6 53284.4 29.9 0.9 69.0 2.1  89  89 c1t2d0
0.2 456.0  1.6 56280.0 28.6 0.9 62.6 1.9  86  86 c1t3d0
0.8 389.8 17.6 45360.7 25.8 0.8 66.0 2.1  81  80 c1t2d0
0.4 368.6  3.2 42698.0 21.1 0.6 57.3 1.8  65  65 c1t3d0
1.0 432.4  8.0 52615.8 30.2 0.9 69.6 2.1  91  91 c1t2d0
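To compare the two drive types from raw iostat output like the above, a short awk pass can average the kw/s column (field 4) per device name (last field). A sketch, fed with three sample lines copied from the ST31000340NS run:

```shell
# Average kw/s per device from `iostat -xn`-style lines.
awk '{ kw[$NF] += $4; n[$NF]++ }
     END { for (d in kw) printf "%s: %.0f kw/s avg\n", d, kw[d] / n[d] }' <<'EOF'
0.0 154.2 0.0 19739.4 3.0 32.0 19.4 207.5 100 100 c1t4d0
0.0 125.8 0.0 16103.9 3.0 32.0 23.8 254.3 100 100 c1t5d0
0.0 133.0 0.0 16366.9 2.4 25.9 17.9 194.4 80 82 c1t4d0
EOF
```

In practice one would pipe `iostat -zmnx <disks> 5 | grep -v device` straight into the awk instead of a here-document.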
Re: [zfs-discuss] live upgrade with lots of zfs filesystems
On Thu, Aug 27, 2009 at 10:59:16PM -0700, Paul B. Henson wrote:
> On Thu, 27 Aug 2009, Paul B. Henson wrote:
> > However, I went to create a new boot environment to install the patches into, and so far that's been running for about an hour and a half :(, which was not expected or planned for. [...] I don't think I'm going to make my downtime window :(, and will probably need to reschedule the patching. I never considered I might have to start the patch process six hours before the window.
> Well, so far lucreate took 3.5 hours, lumount took 1.5 hours, applying the patches took all of 10 minutes, luumount took about 20 minutes, and luactivate has been running for about 45 minutes. I'm assuming it will

Have a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.10.patch or http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.11.patch ...

So first install the most recent LU patches and then one of the above. Since I'm still on vacation (for ~8 weeks), I haven't checked whether there are new LU patches out there and whether the patches still match (usually they do). If not, adjusting the files manually shouldn't be a problem ;-)

There are also versions for pre snv_b107 and pre 121430-36,121431-37: see http://iws.cs.uni-magdeburg.de/~elkner/

More info: http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html#luslow

Have fun, jel.
Re: [zfs-discuss] Boot error
On Thu, Aug 27, 2009 at 04:05:15PM -0700, Grant Lowe wrote:
> I've got a 240z with Solaris 10 Update 7, all the latest patches from Sunsolve. I've installed a boot drive with ZFS. I mirrored the drive with zpool. I installed the boot block. The system had been working just fine. But for some reason, when I try to boot, I get the error:
>
> {1} ok boot -s
> Boot device: /p...@1c,60/s...@2/d...@0,0 File and args: -s
> SunOS Release 5.10 Version Generic_141414-08 64-bit
> Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
> Use is subject to license terms.
> Division by Zero
> {1} ok

My guess: s0 was too small when updating the boot archive. So booting from a jumpstart dir/CD, mounting s0 (e.g. to /a) and running

bootadm update-archive -R /a

should fix the problem. If you are low on space on /, manually

rm -f /a/platform/sun4u/boot_archive

before doing the update-archive. If there is still not enough space, try to move some other stuff temporarily away, e.g. /core, /etc/mail/cf ...

Good luck, jel.
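A quick way to see whether the "not enough space" scenario applies before re-running update-archive is to check free space on the target slice. A minimal sketch; the 150 MB threshold is an illustrative guess for a boot_archive, not an official number:

```shell
#!/bin/sh
# Rough pre-check before `bootadm update-archive -R /a`:
# is there room for a fresh boot_archive on the slice?
need_kb=150000   # illustrative threshold, ~150 MB
avail_kb=$(df -k / | awk 'NR==2 { print $4 }')

if [ "$avail_kb" -lt "$need_kb" ]; then
    echo "low space - move /core, /etc/mail/cf etc. aside first"
else
    echo "enough space for update-archive"
fi
```

Point the `df -k` at the mounted slice (e.g. /a) when run from the jumpstart/CD environment.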
[zfs-discuss] bug access
Hi,

> This CR has been marked as incomplete by User 1-UM-1502 for the reason Need More Info. Please update the CR providing the information requested in the Evaluation and/or Comments field.

hmmm - wondering how to find out what 'more info' means and how to provide this info. There is no URL in the bug response, and it even seems to be impossible to obtain the current state via http://bugs.opensolaris.org/view_bug.do?bug_id= (the Bug Database Search is even more bogus - in general it doesn't find any bugs by ID).

Should I continue to ignore these responses and mark those bugs internally as 'gets probably never fixed'?

Regards, jel.
[zfs-discuss] v440 - root mirror lost after LU
Hmmm, just upgraded some servers to U7. Unfortunately one server's primary disk died during the upgrade, so luactivate was not able to activate the s10u7 BE ("Unable to determine the configuration ..."). Since the rpool is a 2-way mirror, boot-device=/p...@1f,70/s...@2/d...@1,0:a was simply set to /p...@1f,70/s...@2/d...@0,0:a, and it was checked whether the machine still reboots unattended. As expected - no problem.

In the evening the faulty disk was replaced and the mirror resilvered via 'zpool replace rpool c1t1d0s0' (see below). Since there was no error and everything was stated to be healthy, the s10u7 BE was luactivated (no error message here either) and 'init 6' issued. Unfortunately, now the server was gone and no known recipe helped to revive it (I guess LU damaged the zpool.cache?).

Any hints how to get the rpool back?

Regards, jel.

What has been tried 'til now:

{3} ok boot
Boot device: /p...@1f,70/s...@2/d...@1,0:a File and args:
Bad magic number in disk label
Can't open disk label package
Can't open boot device
{3} ok
{3} ok boot /p...@1f,70/s...@2/d...@0,0:a
Boot device: /p...@1f,70/s...@2/d...@0,0:a File and args:
SunOS Release 5.10 Version Generic_13-08 64-bit
Copyright 1983-2009 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
NOTICE: spa_import_rootpool: error 22
Cannot mount root on /p...@1f,70/s...@2/d...@0,0:a fstype zfs
panic[cpu3]/thread=180e000: vfs_mountroot: cannot mount root
0180b950 genunix:vfs_mountroot+358 (800, 200, 0, 1875c00, 189f800, 18ca000)
  %l0-3: 010ba000 010ba208 0187bba8 011e8400
  %l4-7: 011e8400 018cc400 0600 0200
0180ba10 genunix:main+a0 (1815178, 180c000, 18397b0, 18c6800, 181b578, 1815000)
  %l0-3: 01015400 0001 70002000
  %l4-7: 0183ec00 0003 0180c000
skipping system dump - no dump device configured
rebooting...

SC Alert: Host System has Reset

{3} ok boot net -s
Boot device: /p...@1c,60/netw...@2 File and args: -s
1000 Mbps FDX Link up
Timeout waiting for ARP/RARP packet
3a000 1000 Mbps FDX Link up
SunOS Release 5.10 Version Generic_137137-09 64-bit
Copyright 1983-2008 Sun Microsystems, Inc. All rights reserved.
Use is subject to license terms.
Hardware watchdog enabled
Booting to milestone milestone/single-user:default.
Configuring devices.
Using RPC Bootparams for network configuration information.
Attempting to configure interface ce1... Skipped interface ce1
Attempting to configure interface ce0... Configured interface ce0
Requesting System Maintenance Mode
SINGLE USER MODE

# mount -F zfs /dev/dsk/c1t1d0s0 /mnt
cannot open '/dev/dsk/c1t1d0s0': invalid dataset name
# mount -F zfs /dev/dsk/c1t0d0s0 /mnt
cannot open '/dev/dsk/c1t0d0s0': invalid dataset name
# zpool import
  pool: pool1
    id: 5088500955966129017
 state: ONLINE
action: The pool can be imported using its name or numeric identifier.
config:
        pool1       ONLINE
          mirror    ONLINE
            c1t2d0  ONLINE
            c1t3d0  ONLINE
  pool: rpool
    id: 5910200402071733373
 state: UNAVAIL
action: The pool cannot be imported due to damaged devices or data.
config:
        rpool         UNAVAIL  insufficient replicas
          mirror      UNAVAIL  corrupted data
            c1t1d0s0  ONLINE
            c1t0d0s0  ONLINE
# dd if=/dev/rdsk/c1t1d0s0 of=/tmp/bb bs=1b iseek=1 count=15
15+0 records in
15+0 records out
# dd if=/dev/rdsk/c1t1d0s0 of=/tmp/bb bs=1b iseek=1024 oseek=15 count=16
16+0 records in
16+0 records out
# cmp /tmp/bb /usr/platform/`uname -i`/lib/fs/zfs/bootblk
# echo $?
0
# dd if=/dev/rdsk/c1t0d0s0 of=/tmp/ab bs=1b iseek=1 count=15
15+0 records in
15+0 records out
# dd if=/dev/rdsk/c1t0d0s0 of=/tmp/ab bs=1b iseek=1024 oseek=15 count=16
16+0 records in
16+0 records out
# cmp /tmp/ab /usr/platform/`uname -i`/lib/fs/zfs/bootblk
# echo $?
0

# pre-history:
admin.tpol ~ # zpool status -xv
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h28m, 98.10% done, 0h0m to go
config:
        NAME               STATE     READ WRITE CKSUM
        rpool              DEGRADED     0     0     0
          mirror           DEGRADED     0     0     0
            replacing      DEGRADED     0     0     0
              c1t1d0s0/old FAULTED      0     0     0  corrupted data
              c1t1d0s0     ONLINE       0     0     0
            c1t0d0s0       ONLINE       0     0     0
errors: No known data errors
admin.tpol ~ # zpool status -xv
all pools are healthy
admin.tpol ~ # zpool status
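The dd/cmp sequence above stitches the two halves of the on-disk bootblk (sectors 1-15 and 1024-1039 of the slice) back together and compares the result against the reference bootblk. A self-contained replay of the same seek arithmetic on throwaway files (using the GNU dd spelling skip=/seek= where Solaris dd says iseek=/oseek=):

```shell
#!/bin/sh
# Rebuild and verify a two-part "bootblk" the way the post's dd/cmp check does.
ref=$(mktemp); disk=$(mktemp); got=$(mktemp)

dd if=/dev/urandom of="$ref" bs=512 count=31 2>/dev/null    # fake bootblk: 15+16 blocks
dd if=/dev/zero of="$disk" bs=512 count=1040 2>/dev/null    # fake disk slice

# "install" it: part 1 at block 1, part 2 at block 1024
dd if="$ref" of="$disk" bs=512 count=15 seek=1 conv=notrunc 2>/dev/null
dd if="$ref" of="$disk" bs=512 skip=15 seek=1024 conv=notrunc 2>/dev/null

# the check itself: reassemble both parts and compare with the reference
dd if="$disk" of="$got" bs=512 skip=1 count=15 2>/dev/null
dd if="$disk" of="$got" bs=512 skip=1024 seek=15 count=16 conv=notrunc 2>/dev/null
cmp "$got" "$ref" && echo "bootblk intact"

rm -f "$ref" "$disk" "$got"
```

An exit status of 0 from cmp (as in the post) means the installed boot blocks match the reference copy byte for byte, so the boot failure was not a damaged bootblk.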
Re: [zfs-discuss] v440 - root mirror lost after LU
On Tue, Jun 16, 2009 at 05:58:00PM -0600, Lori Alt wrote:

First: Thanx a lot, Lori, for the quick help!!!

> On 06/16/09 16:32, Jens Elkner wrote:
> > At the evening the faulty disk was replaced and the mirror resilvered via 'zpool replace rpool c1t1d0s0' (see below). Since there was no error and everything stated to be healthy, the s10u7 BE was luactivated (no error message here as well) and 'init 6'. Unfortunately, now the server was gone and no known recipe helped to revive it (I guess, LU damaged the zpool.cache?) :
> The other suggestion I have is to remove the /p...@1f,70/s...@2/d...@1,0:a

Indeed, physically removing the new c1t1d0 was the key to solving the problem (a netboot of s10u7 gave the same import error as s10u6). Just in case somebody is interested in the details:

Since 'cfgadm -c unconfigure c1::dsk/c1t1d0' didn't work (no blue LED), the machine was 'poweroff'ed, the disk removed and 'poweron'ed. Really strange: it came back with the s10u6 BE instead of the s10u7 BE and took quite a while until it gave up trying to get HDD1:

WARNING: /p...@1d,70/s...@2,1 (mpt3): Disconnected command timeout for Target 1
Hardware watchdog enabled
...

Now cfgadm did not show c1::dsk/c1t1d0. So I re-inserted HDD1, which was properly logged ('SC Alert: DISK @ HDD1 has been inserted.'). Unfortunately cfgadm didn't show it, and 'cfgadm -c configure c1::dsk/c1t1d0' answered: cfgadm: Attachment point not found. However, 'format -e':

AVAILABLE DISK SELECTIONS:
0. c1t0d0 SUN36G cyl 24620 alt 2 hd 27 sec 107 /p...@1f,70/s...@2/s...@0,0
1. c1t1d0 HITACHI-DK32EJ36NSUN36G-PQ08-33.92GB /p...@1f,70/s...@2/s...@1,0
2. c1t2d0 SEAGATE-ST3146707LC-0005-136.73GB /p...@1f,70/s...@2/s...@2,0
3. c1t3d0 SEAGATE-ST3146707LC-0005-136.73GB /p...@1f,70/s...@2/s...@3,0
Specify disk (enter its number): 1
selecting c1t1d0
[disk formatted]

and now the machine seemed to be stalled. Neither ssh nor login via the console was possible.
So I 'poweroff'ed again, removed the disk, 'poweron'ed, and this time the first thing done was 'zpool detach rpool c1t1d0' and scrubbing the pool (completed after 0h22m with 0 errors). After that, a 'cfgadm -x insert_device c1' got c1::dsk/c1t1d0 back, and 'format -e' worked as expected (1p showed an EFI partition table)! So the rest was trivial: SMI label, repartition, label, reattach c1t1d0 to the rpool (resilver took about 31m), installboot, verify boot 'ok' into s10u6, luactivate s10u7 and finally verify boot 'ok' again.

Once again, thanx a lot Lori for your quick help!!!

Regards, jel.
Re: [zfs-discuss] zfs reliability under xen
On Wed, May 20, 2009 at 12:06:49AM +0300, Ahmed Kamal wrote:
> Is anyone even using ZFS under Xen in production in some form. If so, what's your impression of reliability ?

Hmm, somebody needs to out himself. Short answer: yes.

Details: Well, I've installed an Intel server (2x QuadCore E5440, 2.8 GHz) with 4x 300GB SATA (2x 2-way mirrors) at a small company (one of the ebay-rated Top10 sellers in Germany) in October 2008, running snv_b98 as dom0 ;-). domU1 is a win2003 32bit small business server running an MS SQL server; domU2 is a win2008 32bit standard server, which is used as a terminal server (<=10 users), basically to run several Sage[.de] products. Both domUs are using the latest PV driver, which - depending on the recordsize to write - reaches about the same xfer rates as in dom0 (see http://iws.cs.uni-magdeburg.de/~elkner/xVM/ for more info: the benchmarks were made on an X4600M, however the Intel server produces about the same numbers).

pool1/win2003sbs.dsk volsize 48G -
pool1/win2003sbs.dsk volblocksize 8K -
pool1/win2008ss.dsk volsize 24G -
pool1/win2008ss.dsk volblocksize 8K -

It has been in production since ~ February 2009 and is as stable as expected. Sometimes, when the win2003 domU is idle for too long, it doesn't wake up anymore. No problem: I wrote a simple CGI script so that the users have a simple UI to check the state of the domUs (basically a ping), to pull the cable (virsh destroy) as well as start/suspend/resume them. Usually they use this bookmark ~2-3 times a week when they start working. Users are quite happy with it, since clicking on a link is not much more work than switching on their own PC - so not annoying/painful at all.

Initially I gave win2003 2x 16 GB partitions (C:, D: for SQL data) and win2008 a single 16 GB partition. When the Sage products were installed, it turned out that the 16 GB was almost filled, so the volsize of pool1/win2008ss.dsk was increased to 24 GB and the C: partition dynamically extended in win2008 - no problem.
Last month the SQL server filled up its D: partition (16 GB), so it started to 'reboot' several times a day. However, raising the pool1/win2003sbs.dsk volsize to 48 GB (i.e. D: to 32 GB) solved that problem. The only little hurdle here was that one could not extend the partition via the win partition manager's context menu: one had to use the command line tool ...

One thing which is a little bit annoying is the ZFS send command. It takes pretty long, but since it usually runs at night, it is not a real problem.

BTW: The server was installed from remote, misusing a company-internal linux server (jumpstart of course). Since the serial port was not connected, I had to ask an on-site user to initiate the PXE boot, but this was not a problem either.

So the summary: all people (incl. admins) are happy. However, if you need to decide whether to use Xen: test your setup before going into production, and ask your boss whether he can live with "innovative ..." solutions ;-)

Regards, jel.
Re: [zfs-discuss] LU snv_93 - snv_101a (ZFS - ZFS )
On Thu, May 21, 2009 at 02:51:53PM -0700, Nandini Mocherla wrote:
> Here is the short story about my Live Upgrade problem. This is not ...
> # mount -F zfs /dev/dsk/c1t2d0s0 /mnt
> cannot open '/dev/dsk/c1t2d0s0': invalid dataset name

Have seen this when LUing from b110 to b114 on a V240 (well-known Fcode error). The mount command didn't work either. The fix was to boot back into b110 (see 'boot -L'), call luactivate once more and 'init 6'.

Regards, jel.
[zfs-discuss] swat java source?
Hi,

does anybody know whether it is possible to get the java source for swat (where/how)?

Thanx, jel.
Re: [zfs-discuss] How recoverable is an 'unrecoverable error'?
On Wed, Apr 15, 2009 at 10:32:13PM +0800, Uwe Dippel wrote:
> status: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.
> ...
> errors: No known data errors
> Now I wonder where that error came from. It was just a single checksum

Hmmm, about 2 weeks ago I also had a curious thing with a StorEdge 3510 (2x 2Gbps FC MP, 1 controller, 2x 6 HDDs mirrored and exported as a single device, no ZIL etc. tricks) connected to an X4600: since grill party time has started, the 3510 decided at a room temp of 33°C to go offline and take part in the party ;-). The result was that during the offline time everything blocked (i.e. didn't get a timeout or error) which tried to access a ZFS on that pool - from this POV more or less expected. After the 3510 came back, a 'zpool status ..' showed something like this:

NAME                            STATE    READ  WRITE CKSUM
pool2                           FAULTED  289K  4.03M     0
  c4t600C0FF0099C790E0144EC00d0 FAULTED  289K  4.03M     0  too many errors
errors: Permanent errors have been detected in the following files:
pool2/home/stud/inf/foobar:0x0

Still everything was blocking. After a 'zpool clear', all ZFS (~2300 on that pool) except the listed one were accessible, but the status message stayed unchanged. Curious - I thought that blocking/waiting for the device to come back and the ZFS transaction stuff is actually made for a situation like this, aka re-committing un-ACKed actions ... Anyway, finally scrubbing the pool brought it back to normal ONLINE state without any errors. To be sure, I compared the ZFS in question with the backup from some hours before - no difference. So, same question as in the subject.

BTW: Some days later we had an even bigger grill party (~38°C) - this time the X4xxx machines in this room decided to go offline and take part as well (the v4xx's kept running ;-)). So first the 3510, and some time later the X4600.
This time, after going back online, the pool was in DEGRADED state and had some more errors like the one above, plus:

metadata:0x103
metadata:0x4007
...

Clearing and scrubbing it again brought it back to normal ONLINE state without any errors. Spot checks on the files noted with errors showed no damage ... Everything nice (wrt. data loss), but curious ...

Regards, jel.
Re: [zfs-discuss] Error ZFS-8000-9P
On Fri, Apr 03, 2009 at 10:41:40AM -0700, Joe S wrote:
> Today, I noticed this: ... According to http://www.sun.com/msg/ZFS-8000-9P: The Message ID: ZFS-8000-9P indicates a device has exceeded the acceptable limit of errors allowed by the system. See document 203768 for additional information. ...

I had the same on a thumper with S10u6 1-2 months ago. Since the logs did not show any disk error/warning for the last 6 months, I just cleared the pool, finally scrubbed it, and put the 'tmp hotspare' that had been pulled in back into the hot spare pool. No errors or warnings since then for that disk, so it was obviously a false/brain-damaged alarm ...

Regards, jel.
[zfs-discuss] strange 'too many errors' msg
Hi,

just found on an X4500 with S10u6:

fmd: [ID 441519 daemon.error] SUNW-MSG-ID: ZFS-8000-GH, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Wed Feb 11 16:03:26 CET 2009
PLATFORM: Sun Fire X4500, CSN: 00:14:4F:20:E0:2C , HOSTNAME: peng
SOURCE: zfs-diagnosis, REV: 1.0
EVENT-ID: 74e6f0ec-b1e7-e49b-8d71-dc1c9b68ad2b
DESC: The number of checksum errors associated with a ZFS device exceeded acceptable levels. Refer to http://sun.com/msg/ZFS-8000-GH for more information.
AUTO-RESPONSE: The device has been marked as degraded. An attempt will be made to activate a hot spare if available.
IMPACT: Fault tolerance of the pool may be compromised.
REC-ACTION: Run 'zpool status -x' and replace the bad device.

zpool status -x
...
    mirror      DEGRADED     0     0     0
      spare     DEGRADED     0     0     0
        c6t6d0  DEGRADED     0     0     0  too many errors
        c4t0d0  ONLINE       0     0     0
      c7t6d0    ONLINE       0     0     0
...
    spares
      c4t0d0    INUSE     currently in use
      c4t4d0    AVAIL

The strange thing is that for more than 3 months there was no single error logged for any drive. IIRC, before u4 I occasionally saw a bad checksum error message, but that was obviously the result of the well-known race condition in the marvell driver when heavy writes took place. So I tend to interpret this as a false alarm and think about a 'zpool ... clear c6t6d0'. What do you think? Is this a good idea?

Regards, jel.

BTW: the zpool status -x message refers to http://www.sun.com/msg/ZFS-8000-9P, the event to http://sun.com/msg/ZFS-8000-GH - a little bit inconsistent, I think.
Re: [zfs-discuss] s10u6 ludelete issues with zones on zfs root
On Fri, Jan 16, 2009 at 02:08:09PM -0500, amy.r...@tufts.edu wrote: I've installed an s10u6 machine with no UFS partitions at all. I've created a dataset for zones and one for a zone named default. I then do an lucreate and luactivate and a subsequent boot off the new BE. All of that appears to go just fine (though I've found that I MUST call the zone dataset zoneds for some reason, or it will rename it to that for me). When I try to delete the old BE, it fails with the following message: It's a LU bug. Have a look at http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html The following patch fixes it and provides an opportunity to speed up lucreate/lumount/luactivate and friends dramatically on a machine with lots of LU-unrelated filesystems (e.g. user homes). http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.10.patch or http://iws.cs.uni-magdeburg.de/~elkner/luc/lu-5.11.patch Have fun, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Dec 02, 2008 at 12:22:49PM -0800, Vincent Fox wrote: Reviving this thread. We have a Solaris 10u4 system recently patched with 137137-09. Unfortunately the patch was applied from multi-user mode, I wonder if this may have been the original poster's problem as well? Anyhow we are now stuck No - in my case it was a 'not enough space' on / problem, not the multi-user mode ;-). Regards, jel.
Re: [zfs-discuss] `zfs list` doesn't show my snapshot
On Tue, Nov 25, 2008 at 06:34:47PM -0500, Richard Morris - Sun Microsystems - Burlington United States wrote: option to list all datasets. So 6734907 added -t all which produces the same output as -t filesystem,volume,snapshot. 1. http://bugs.opensolaris.org/view_bug.do?bug_id=6734907 Hmmm - very strange: when I run 'zfs list -t all' on b101 it says: invalid type 'all' ... But the bug report says: Fixed In snv_99 Release Fixed solaris_nevada(snv_99) So, what do those fields really mean? Regards, jel.
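A workaround sketch for builds in this situation: per the CR text quoted above, `-t all` is only shorthand for the spelled-out type list, so the long form should give the same output on builds that still reject 'all' (echoed only; needs a live system):

```shell
#!/bin/sh
# 'zfs list -t all' was added as shorthand for the full type list
# (CR 6734907). On builds that reject 'all' (like b101 above), the
# long form below should be equivalent. Echoed only.
TYPES=filesystem,volume,snapshot
echo "zfs list -t $TYPES"
echo "zfs list -t all    # equivalent on builds that have the fix"
```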
Re: [zfs-discuss] `zfs list` doesn't show my snapshot
On Fri, Nov 21, 2008 at 03:42:17PM -0800, David Pacheco wrote: Pawel Tecza wrote: But I still don't understand why `zfs list` doesn't display snapshots by default. I saw it on the Net many times in examples of zfs usage. This was PSARC/2008/469 - excluding snapshot info from 'zfs list' http://opensolaris.org/os/community/on/flag-days/pages/2008091003/ The incomplete one - where is the '-t all' option? It's really annoying, error-prone and time-consuming to type stories on the command line ... Does anybody remember the keep-it-small-and-simple thing? Regards, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Tue, Nov 18, 2008 at 07:44:56PM -0800, Ed Clark wrote: Hi Ed, messages from the underlying pkging commands are captured in the /var/sadm/patch/PID/log file messages from patchadd itself and patch level scripts (prepatch, postpatch, etc) go to stdout/stderr these are two distinct sets of messages -- not really optimal, just the way patchadd has always been Yepp - thanks for making that clear. So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (incl. creating the appropriate links) was sufficient to get it to work. nice trick, but unfortunately it won't do -- officially you must _never_ make such changes that alter the type of system files ; if you do, the changes are at your own risk and completely unsupported Yes, I was considering it for a temporary change only. However, pkgtools are usually robust enough (as long as the dir is not empty) to handle such relocations properly - that's why I really like pkgtools (giving me the freedom I need ;-)). the basic reason for this is that patching can not be guaranteed to behave in a deterministic manner when it encounters such changes -- this does cause real problems too, ie. a case where changing sendmail.cf to a symlink caused a kernel patch to only half apply, leading to long term outages Oops - that's a big bummer [OT: and probably the result of the bad packaging strategy of Solaris (i.e. not software-oriented, aka merging different sw into one sol package ...)]. Good to know that this has already happened... removing the old corrupt boot archive is a simple and safe way to free up some space, best part is on reboot the system will automatically rebuild it Hmmm - maybe I'm wrong, but IMHO if there is not enough space for a new boot_archive, bootadm should not corrupt anything but leave the old one in place - I would guess, in 95% of all cases one gets away with it, since very often updates are not really required ... hmm ...
something of a double-edged sword, at least the way it works currently we know with certainty when there was a problem building the archive and can go about rectifying it ; the problem with keeping the old boot archive is that the system may have the appearance of booting and possibly even running ok, but there is absolutely no guarantee of nominal operation, could be very confusing Yes, I understand your point of view. However, I didn't mean to silently ignore the 'unable to update boot archive' condition, but to give the user a simple way to fix the problem. So I would prefer to keep the old archive as long as it cannot be updated, but issue big warnings on reboot/activation so one gets informed that a fix is needed. At least in my case the system would have been offline for at most 30min, but because of the bug it was offline for several days, and without your help probably several weeks/months (i.e. my experience wrt. German Sun support) ... Anyway, thanks a lot again, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Sun, Nov 16, 2008 at 09:27:32AM -0800, Ed Clark wrote: Hi Ed, 1. a copy of the 137137-09 patchadd log if you have http://iws.cs.uni-magdeburg.de/~elkner/137137-09/ thanks for info - what you provided here is the patch pkg installation log, Yes, actually the only one I have/could find. what i was actually after was the patchadd log (ie. the messages output to terminal) Up to now I thought that stderr and stdout are redirected from patchadd to the patchlog, but I never checked in detail, since the log always had the info I needed ... -- both the patchadd log and the console log on reboot should have shown errors which would have provided hints as to what the problem was Haven't seen anything unusual. But maybe I've overlooked it :( now the df/prtvtoc output was most useful : 137137-09 delivers sparc newboot, and the problem here appears to be that a root fs slice of 256M falls well below the minimum size required for sparc newboot to operate nominally -- due to the lack of space in /, i suspect that 137137-09 postpatch failed to copy the ~180MB failsafe archive (/platform/sun4u/failsafe) to your system, and that the ~80M boot archive (/platform/sun4u/boot_archive) was not created correctly on the reboot after applying 137137-09 the 'seek failed' error message you see on boot is coming from the ufs bootblk fcode, which i suspect is due to not being able to load the corrupt boot_archive Yes - that makes sense. you should be able to get your system to boot by doing the following 1. net/CD/DVD boot the system using a recent update release, u5/u6 should work, not sure about u4 or earlier 2. mount the root fs slice, cd to root-fs-mount-point 3. ls -l platform/sun4u 4. rm -f platform/sun4u/boot_archive 5.
sbin/bootadm -a update_all Yepp - now I can see the problem: Creating boot_archive for /a updating /a/platform/sun4u/boot_archive 15+0 records in 15+0 records out cat: write error: No space left on device bootadm: write to file failed: /a/boot/solaris/filestat.ramdisk.tmp: No space left on device So moving /etc/gconf to /opt and /etc/mail/cf to /usr/lib/mail (incl. creating the appropriate links) was sufficient to get it to work. 6. ls -l platform/sun4u total 136770 -rw-r--r-- 1 root root 68716544 Nov 18 03:22 boot_archive -rw-r--r-- 1 root sys 71808 Oct 3 23:28 bootlst -rw-r--r-- 1 root sys 79976 Oct 3 23:34 cprboot drwxr-xr-x 11 root sys 512 Mar 19 2007 kernel drwxr-xr-x 4 root bin 512 Nov 12 22:17 lib drwxr-xr-x 2 root bin 512 Mar 19 2007 sbin -rw-r--r-- 1 root sys 1084048 Oct 3 23:28 wanboot Filesystem 1024-blocks Used Available Capacity Mounted on /dev/dsk/c0t0d0s0 245947 205343 16010 93% / boot_archive corruption will be a recurrent problem on your configuration, every time the system determines that boot_archive needs to be rebuilt on reboot -- a very inelegant workaround would be to 'rm -f /platform/sun4u/boot_archive' every time before rebooting the system Hmmm - maybe I'm wrong, but IMHO if there is not enough space for a new boot_archive, bootadm should not corrupt anything but leave the old one in place - I would guess, in 95% of all cases one gets away with it, since very often updates are not really required ... better option would be to reinstall the system, choosing a disk layout adequate for newboot Well, the 2nd exercise is to test zfs boot (all systems have at least a 2nd HDD). If this works, just converting to zfs is probably the better option ... Anyway, thanks a lot for your help! Regards, jel.
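The recovery steps from this exchange, collected into one sketch (the mount point /a is the one used above; the commands are echoed rather than run, since they must be executed from a net/CD/DVD failsafe boot):

```shell
#!/bin/sh
# Recovery sketch for a corrupt boot_archive, following the steps above.
# Echoed only: these must be run from a net/CD/DVD boot environment with
# the root fs slice of the broken system mounted at $ROOT.
ROOT=/a                          # where the root fs slice is mounted
echo "ls -l $ROOT/platform/sun4u"                    # inspect the archive first
echo "rm -f $ROOT/platform/sun4u/boot_archive"       # drop the corrupt archive
echo "$ROOT/sbin/bootadm -a update_all"              # rebuild it, as quoted above
```

Note the rebuild will fail again with "No space left on device" unless space is freed on / first, which is why the thread ends up recommending a larger root slice (or ZFS root) as the real fix.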
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Thu, Nov 13, 2008 at 04:54:57PM -0800, Gerry Haskins wrote: Jens, http://www.sun.com/bigadmin/patches/firmware/release_history.jsp on the Big Admin Patching center, http://www.sun.com/bigadmin/patches/ list firmware revisions. Thanks a lot. Dug around there and found that 121683-06 aka OBP 4.22.33 seems to be the most recent one for the V240. So in theory it should be ok in my case. If it's the same as a V490, then I think the current firmware version is 121689-04, http://sunsolve.sun.com/search/advsearch.do?collection=PATCH&type=collections&queryKey5=121689&toDocument=yes OK - so the OBPs are all the latest ones on my machines. Unfortunately I don't have a 2nd V490 to test whether the problem occurs there as well - so I'll better postpone its upgrade :( Anyway, thanks a lot Gerry, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Fri, Nov 14, 2008 at 01:07:29PM -0800, Ed Clark wrote: hi, is the system still in the same state initially reported ? Yes. ie. you have not manually run any commands (ie. installboot) that would have altered the slice containing the root fs where 137137-09 was applied could you please provide the following 1. a copy of the 137137-09 patchadd log if you have one available cp it to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/ Can't spot anything unusual. 2. an indication of anything particular about the system configuration, ie. mirrored root No mirrors/raid: # format Searching for disks...done AVAILABLE DISK SELECTIONS: 0. c0t0d0 FUJITSU-MAN3367MC-0109 cyl 24343 alt 2 hd 4 sec 737 /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 1. c0t1d0 SEAGATE-ST336737LC-0102 cyl 29773 alt 2 hd 4 sec 606 /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 2. c0t2d0 FUJITSU-MAT3073N SUN72G-0602-68.37GB /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 3. c0t3d0 FUJITSU-MAT3073N SUN72G-0602-68.37GB /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 3. output from the following commands run against root fs where 137137-09 was applied ls -l usr/platform/sun4u/lib/fs/*/bootblk ls -l platform/sun4u/lib/fs/*/bootblk sum usr/platform/sun4u/lib/fs/*/bootblk sum platform/sun4u/lib/fs/*/bootblk dd if=/dev/rdsk/rootdsk of=/tmp/bb bs=1b iseek=1 count=15 cmp /tmp/bb usr/platform/sun4u/lib/fs/ufs/bootblk cmp /tmp/bb platform/sun4u/lib/fs/ufs/bootblk prtvtoc /dev/rdsk/rootdsk also cp to http://iws.cs.uni-magdeburg.de/~elkner/137137-09/ Seems to be ok, too. Regards, jel.
Re: [zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
On Thu, Nov 13, 2008 at 10:50:02AM -0800, Enda wrote: What hardware are you on, and what firmware are you at. Issue is coming from firmware. Sun Fire V240 with OpenBoot 4.22.23 Tried to find out, whether there is an OBP patch available, but haven't found anything wrt. V240, V440 and V490 :( Regards, jel.
[zfs-discuss] zfs boot - U6 kernel patch breaks sparc boot
Hi, in preparation to try zfs boot on sparc I installed all recent patches incl. feature patches coming from s10s_u3wos_10 and after reboot finally 137137-09 (still having everything on UFS). Now it doesn't boot anymore: ### Sun Fire V240, No Keyboard Copyright 2006 Sun Microsystems, Inc. All rights reserved. OpenBoot 4.22.23, 2048 MB memory installed, Serial #63729301. Ethernet address 0:3:ba:cc:6e:95, Host ID: 83cc6e95. Rebooting with command: boot Boot device: /[EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0:a File and args: | seek failed Warning: Fcode sequence resulted in a net stack depth change of 1 Evaluating: Evaluating: The file just loaded does not appear to be executable. |1} ok | ### fsck /dev/rdsk/c0t0d0s0 doesn't find any problems. So I mounted this slice on /tmp/a, and # find /tmp/a/boot /tmp/a/boot /tmp/a/boot/solaris /tmp/a/boot/solaris/bin /tmp/a/boot/solaris/bin/extract_boot_filelist /tmp/a/boot/solaris/bin/create_ramdisk /tmp/a/boot/solaris/bin/root_archive /tmp/a/boot/solaris/filelist.ramdisk /tmp/a/boot/solaris/filelist.safe /tmp/a/boot/solaris/filestat.ramdisk # cat /tmp/a/boot/solaris/filelist.ramdisk etc/cluster/nodeid etc/dacf.conf etc/mach kernel platform It looks different than on x86 (no kernels), so is it possible that the patch didn't install all required files, or is it simply broken? Or did somebody forget to mention that an OBP update is required before installing this patch? Any hints? Regards, jel.
Re: [zfs-discuss] FYI - proposing storage pm project
On Mon, Nov 03, 2008 at 02:54:10PM -0800, Yuan Chu wrote: Hi, a disk may take seconds or even tens of seconds to come online if it needs to be powered up and spin up. Yes - I really hate this on my U40 and tried to disable PM for the HDDs completely. However, I haven't found a way to do this (I thought /etc/power.conf was the right place, but either it doesn't work as explained or it is not the right place). The HDDs are HITACHI HDS7225S Revision: A9CA Any hints on how to switch off PM for this HDD? Regards, jel.
Re: [zfs-discuss] [Fwd: Re: ZSF Solaris]
On Tue, Oct 07, 2008 at 11:35:47AM +0530, Pramod Batni wrote: The reason why the (implicit) truncation could be taking long might be due to 6723423 [6]UFS slow following large file deletion with fix for 6513858 installed To overcome this problem for S10, the offending patch 127866-03 can be removed. It is not yet fixed in snv. A fix is being developed, not sure which build it would be available in. OK - thanx for your answer. Since the fixes in 03-05 seem to be important, I'll try to initiate an escalation of the case - would that help to get the fix in a little bit earlier? Regards, jel.
Re: [zfs-discuss] zpool imports are slow when importing multiple storage pools
On Mon, Oct 06, 2008 at 05:08:13PM -0700, Richard Elling wrote: Scott Williamson wrote: Speaking of this, is there a list anywhere that details what we can expect to see for (zfs) updates in S10U6? The official release name is Solaris 10 10/08 Ooops - no beta this time? Regards, jel.
Re: [zfs-discuss] [Fwd: Re: ZSF Solaris]
On Mon, Oct 06, 2008 at 08:01:39PM +0530, Pramod Batni wrote: On Tue, Sep 30, 2008 at 09:44:21PM -0500, Al Hopper wrote: This behavior is common to tmpfs, UFS and I tested it on early ZFS releases. I have no idea why - I have not made the time to figure it out. What I have observed is that all operations on your (victim) test directory will max out (100% utilization) one CPU or one CPU core - and all directory operations become single-threaded and limited by the performance of one CPU (or core). And sometimes it's just a little bug: E.g. with a recent version of Solaris (i.e. >= snv_95 || >= S10U5) on UFS: SunOS graf 5.10 Generic_137112-07 i86pc i386 i86pc (X4600, S10U5) = admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 9.78s 0:29.42 33.4% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 293.37s 5:13.67 93.5% SunOS q 5.11 snv_98 i86pc i386 i86pc (U40, S11b98) = elkner.q /var/tmp time mkfile 2g xx 0.05u 3.63s 0:42.91 8.5% elkner.q /var/tmp time mkfile 2g xx 0.04u 315.15s 5:54.12 89.0% The reason why the (implicit) truncation could be taking so long might be due to 6723423 [6]UFS slow following large file deletion with fix for 6513858 installed To overcome this problem for S10, the offending patch 127866-03 can be removed. Yes - removing 127867-05 (x86, i.e. going back to 127867-02) resolved the problem. On sparc, removing 127866-05 brought me back to 127866-01, which didn't seem to solve the problem (maybe because I didn't init 6 before). However, installing 127866-02 and init 6 fixed it on sparc as well. Any hints in which snv release it is fixed? Thanx a lot, jel.
Re: [zfs-discuss] ZSF Solaris
On Tue, Sep 30, 2008 at 09:44:21PM -0500, Al Hopper wrote: This behavior is common to tmpfs, UFS and I tested it on early ZFS releases. I have no idea why - I have not made the time to figure it out. What I have observed is that all operations on your (victim) test directory will max out (100% utilization) one CPU or one CPU core - and all directory operations become single-threaded and limited by the performance of one CPU (or core). And sometimes it's just a little bug: E.g. with a recent version of Solaris (i.e. >= snv_95 || >= S10U5) on UFS: SunOS graf 5.10 Generic_137112-07 i86pc i386 i86pc (X4600, S10U5) = admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 9.78s 0:29.42 33.4% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 293.37s 5:13.67 93.5% admin.graf /var/tmp rm xx admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 9.92s 0:31.75 31.4% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 305.15s 5:28.67 92.8% admin.graf /var/tmp time dd if=/dev/zero of=xx bs=1k count=2048 2048+0 records in 2048+0 records out 0.00u 298.40s 4:58.46 99.9% admin.graf /var/tmp time sh -c 'mkfile 2g xx ; sync' 0.05u 394.06s 6:52.79 95.4% SunOS kaiser 5.10 Generic_137111-07 sun4u sparc SUNW,Sun-Fire-V440 (S10, U5) = admin.kaiser /var/tmp time mkfile 1g xx 0.14u 5.24s 0:26.72 20.1% admin.kaiser /var/tmp time mkfile 1g xx 0.13u 64.23s 1:25.67 75.1% admin.kaiser /var/tmp time mkfile 1g xx 0.13u 68.36s 1:30.12 75.9% admin.kaiser /var/tmp rm xx admin.kaiser /var/tmp time mkfile 1g xx 0.14u 5.79s 0:29.93 19.8% admin.kaiser /var/tmp time mkfile 1g xx 0.13u 66.37s 1:28.06 75.5% SunOS q 5.11 snv_98 i86pc i386 i86pc (U40, S11b98) = elkner.q /var/tmp time mkfile 2g xx 0.05u 3.63s 0:42.91 8.5% elkner.q /var/tmp time mkfile 2g xx 0.04u 315.15s 5:54.12 89.0% SunOS dax 5.11 snv_79a i86pc i386 i86pc (U40, S11b79) = elkner.dax /var/tmp time mkfile 2g xx 0.05u 3.09s 0:43.09 7.2% elkner.dax /var/tmp time mkfile 2g xx 0.05u 4.95s 0:43.62 11.4% Regards, jel.
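The timings above boil down to one pattern: the first mkfile over a fresh path is fast, while the second mkfile over an existing file (which triggers an implicit truncation) is an order of magnitude slower on affected patch levels. A sketch of the reproduction (mkfile is Solaris-only, so the steps are only echoed here; the size is illustrative):

```shell
#!/bin/sh
# Reproduction sketch for the rewrite slowdown (CR 6723423): time the
# second mkfile over an existing file and compare against the first.
# mkfile is Solaris-only, so the steps are echoed, not executed.
F=/var/tmp/xx
for run in 1 2; do
  echo "run $run: time sh -c 'mkfile 2g $F ; sync'"
done
echo "rm $F   # cleanup; the next run 1 will be fast again"
```

On an affected system, run 2 should show the multi-minute sys time seen in the transcripts; on a fixed build, both runs should take roughly the same wall time.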
Re: [zfs-discuss] Greenbytes/Cypress
On Tue, Sep 23, 2008 at 01:04:34PM -0500, Bob Friesenhahn wrote: On Tue, 23 Sep 2008, Eric Schrock wrote: http://www.opensolaris.org/jive/thread.jspa?threadID=73740&tstart=0 I must apologize for annoying everyone. When Richard Elling posted the GreenBytes link without saying what it was I completely ignored it. I assumed that it would be Windows-centric content that I can not view since of course I am a dedicated Solaris user. I see that someone else mentioned that the content does not work for Solaris users. As a result I ignored the entire discussion as being about some silly animation of gumballs. Don't apologize - it's not your fault! BTW: I have exactly the same problem/assumption ... Have fun, jel.
Re: [zfs-discuss] ZFS hangs/freezes after disk failure,
On Mon, Aug 25, 2008 at 08:17:55PM +1200, Ian Collins wrote: John Sonnenschein wrote: Look, yanking the drives like that can seriously damage the drives or your motherboard. Solaris doesn't let you do it ... Haven't seen an android/universal soldier shipping with Solaris ... ;-) and assumes that something's gone seriously wrong if you try it. That Linux ignores the behavior and lets you do it sounds more like a bug in linux than anything else. Not sure whether everything that can't be understood is necessarily a bug - maybe it is more forgiving and tries its best to solve the problem without taking you out of business (see below), even if it requires some hacks not in line with specifications ... One point that's been overlooked in all the chest thumping - PCs vibrate and cables fall out. I had this happen with a SCSI connector. Luckily Yes - and a colleague told me that he'd had the same problem once. Also he managed a Siemens-Fujitsu server where the SCSI-controller card had a tiny hairline crack: very odd behavior, usually not reproducible; IIRC, the 4th service engineer finally replaced the card ... So pulling a drive is a possible, if rare, failure mode. Definitely! And tolerating strange controller (or in general hardware) behavior is possibly a big + for an OS which targets SMEs and home users as well (everybody knows about far-east and other cheap HW producers, which sometimes seem to say, let's ship it, later we build a special driver for MS Windows which works around the bug/problem ...). Similar story: ~ 2000+ we had a WG server with 4 PATA IDE channels, one HDD on each. HDD0 on CH0 mirrored to HDD2 on CH2, HDD1 on CH1 mirrored to HDD3 on CH3, using the Linux Softraid driver. We found out that when HDD1 on CH1 got on the blink, for some reason the controller got on the blink as well, i.e. took CH0 and vice versa down too.
After reboot, we were able to force the md raid to re-take the drives marked as bad and even found out that the problem started when a certain part of a partition was accessed (which made the ops on that raid really slow for some minutes - but after the driver marked the drive(s) as bad, performance was back). Thus disabling the partition gave us the time to get a new drive... During all these ops nobody (except the sysadmins) realized that we had a problem - thanx to the md raid1 (with xfs btw.). And also we did not have any data corruption (at least, nobody has complained about it ;-)). Wrt. what I've experienced and read on the zfs-discuss etc. lists, I have the __feeling__ that we would have gotten into real trouble using Solaris (even the most recent one) on that system ... So if one asks me whether to run Solaris+ZFS on a production system, I usually say: definitely, but only if it is a Sun server ... My 2¢ ;-) Regards, jel. PS: And yes, all the vendor-specific workarounds/hacks are a problem for the Linux kernel folks as well - at least discouraged on Torvalds' side, IIRC ...
[zfs-discuss] Jumpstart + ZFS boot: profile?
Hi, I wanna try to set up a machine via jumpstart with ZFS boot using snv_95. Usually (UFS) I use a profile like this for it:

install_type    initial_install
system_type     standalone
usedisk         c1t0d0
partitioning    explicit
filesys         c1t0d0s0    256     /
filesys         c1t0d0s1    16384   swap
filesys         c1t0d0s3    4096    /var
filesys         c1t0d0s4    8192    /usr
filesys         c1t0d0s5    4096    /opt
filesys         c1t0d0s7    free    /joker
...
# c1t0 gets replaced by the real bootdisk path via begin script

Is something similar now possible wrt. ZFS, i.e. something like

zpool create boot c1t0d0 swap=16G
zfs create boot/root ; zfs set reservation=512M boot/root
zfs create boot/var  ; zfs set reservation=4G boot/var
zfs create boot/usr  ; zfs set reservation=8G boot/usr
zfs create boot/opt  ; zfs set reservation=4G boot/opt

??? Regards, jel.
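For comparison, the ZFS-root JumpStart syntax that the docs describe uses a single `pool` keyword (pool name, pool size, swap size, dump size, vdev list) plus `bootenv installbe`, rather than per-dataset reservations - a hedged sketch with illustrative names and sizes (separate /usr and /opt datasets are not expressible this way):

```
install_type  initial_install
pool          rpool  auto  16g  auto  c1t0d0s0
bootenv       installbe  bename  s10be
```

So the UFS-style fine-grained slice layout above has no direct ZFS-profile equivalent.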
Re: [zfs-discuss] Jumpstart + ZFS boot: profile?
On Thu, Aug 14, 2008 at 02:33:19PM -0700, Richard Elling wrote: There is a section on jumpstart for root ZFS in the ZFS Administration Guide. http://www.opensolaris.org/os/community/zfs/docs/zfsadmin.pdf Ah ok - thanx for the link. Seems to be almost the same as on the web pages (which I thought were out of date ...) Is something similar now possible wrt. ZFS, i.e. something like zpool create boot c1t0d0 swap=16G zfs create boot/root ; zfs set reservation=512M boot/root zfs create boot/var ; zfs set reservation=4G boot/var zfs create boot/usr ; zfs set reservation=8G boot/usr zfs create boot/opt ; zfs set reservation=4G boot/opt ??? So the answer is NO :( Regards, jel.
Re: [zfs-discuss] Jumpstart + ZFS boot: profile?
On Thu, Aug 14, 2008 at 10:49:54PM -0400, Ellis, Mike wrote: You can break out just var, not the others. Yepp - and that's not sufficient :( Regards, jel.
Re: [zfs-discuss] ZFS ACLs/Samba integration
On Thu, Mar 13, 2008 at 11:33:57AM +0000, Darren J Moffat wrote: Paul B. Henson wrote: I'm currently prototyping a Solaris file server that will dish out user home directories and group project directories via NFSv4 and Samba. Why not the in-kernel CIFS server ? E.g., how would one mimic:

[office]
    comment = office
    path = /export/vol1/office
    valid users = @office
    force group = office
    create mode = 660
    directory mode = 770
...

We already lost this functionality with the introduction of the NFSv4 ACL crap on ZFS and earned a lot of 'we hate you' feedback. Anyway, most users and staff have switched/are switching over to Windows (we do not support Linux yet and Solaris is wrt. desktop at least 5 years behind the scene), so the last 5% of *x users need to live with it. However, if we switched to Solaris CIFS (which AFAIK cannot accomplish what is required) we would have no friends anymore ... Regards, jel.
Re: [zfs-discuss] ZFS array NVRAM cache?
On Tue, Sep 25, 2007 at 10:14:57AM -0700, Vincent Fox wrote: Where is ZFS with regards to the NVRAM cache present on arrays? I have a pile of 3310 with 512 megs cache, and even some 3510FC with 1-gig cache. It seems silly that it's going to waste. These are dual-controller units so I have no worry about loss of cache information. It looks like OpenSolaris has a way to force arguably correct behavior, but Solaris 10u3/4 do not. I see some threads from early this year about it, and nothing since. Made some simple tests wrt. continuous sequential writes/reads for a 3510 (single controller), single host (V490) with 2 FC-HBAs - so, yes, I'm now running ZFS single-disk over HW RAID10 (10 disks) ... Haven't had the time to test all combinations or mixed-load cases; however, in case you wanna check what I got: http://iws.cs.uni-magdeburg.de/~elkner/3510.txt Have fun, jel.
Re: [zfs-discuss] ZFS problems in dCache
On Wed, Aug 01, 2007 at 09:49:26AM -0700, Sergey Chechelnitskiy wrote: Hi Sergey, I have a flat directory with a lot of small files inside. And I have a Java application that reads all these files when it starts. If this directory is located on ZFS, the application starts fast (15 mins) when the number of files is around 300,000 and starts very slowly (more than 24 hours) when the number of files is around 400,000. The question is why? Let's set aside the question of why this application is designed this way. I still needed to run this application. So I installed a Linux box with XFS, mounted this XFS directory on the Solaris box, and moved my flat directory there. Then my application started fast (< 30 mins) even when the number of files (in the Linux-hosted XFS directory mounted through NFS on the Solaris box) was 400,000 or more. Basically, what I want to do is run this application on a Solaris box. Right now I cannot. Just a rough guess - this might be a Solaris threading problem. See http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6518490 So perhaps starting the app with -XX:-UseThreadPriorities may help ... Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
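For the record, the suggested workaround goes on the java command line; a hypothetical invocation could look like this (the jar name and path are placeholders, not from the original post — only the -XX:-UseThreadPriorities flag comes from the bug report above):

```shell
# Hypothetical launcher: disable JVM use of thread priorities
# (workaround for Solaris threading bug 6518490).
java -XX:-UseThreadPriorities -jar app.jar /path/to/flat-directory
```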
Re: [zfs-discuss] New german white paper on ZFS
On Tue, Jun 19, 2007 at 05:19:05PM +0200, Constantin Gonzalez wrote: Hi, http://blogs.sun.com/constantin/entry/new_zfs_white_paper_in Excellent!!! I think it would be a pretty good idea to put the links for the paper and slides on the ZFS Documentation page, aka http://www.opensolaris.org/os/community/zfs/docs/ Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] Slashdot Article: Does ZFS Obsolete Expensive NAS/SANs?
On Mon, Jun 04, 2007 at 12:10:18PM -0700, eric kustarz wrote: There's going to be some very good stuff for ZFS in s10u4, can you please update the issues *and* features when it comes out? Yes, and don't forget to add that the POSIX ACLs have been dropped/replaced by the braindamaged NFSv4 ACLs. Now (when one migrates/cp's UFS to ZFS) one needs to

    chmod -R A- workgroup_dir

and

    find workgroup_dir -type d -exec chmod 0770 {} +

and similarly

    find workgroup_dir -type f -exec chmod 0660 {} +

to make files accessible again. Last but not least, one needs to teach the *x nerds to revert to the same procedures as 20 years ago: create/copy files/dirs to workgroup_dir and then chmod g+rw etc. Yes, the Windows people are laughing now about the *x nerds, and the nerds start hating admins who use ZFS for workgroup dirs and get bad marks because they keep forgetting the chmod etc. thing ... My 2 ¢. Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
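The reset steps above can be tried end-to-end on a throw-away tree; a minimal sketch (the directory name is a demo placeholder, and the `chmod -R A-` step, which strips NFSv4 ACL entries, only exists on Solaris, hence the uname guard):

```shell
# Demo: reset modes after a UFS -> ZFS copy, on a scratch tree.
dir=/tmp/workgroup_dir.demo
mkdir -p "$dir/sub" && touch "$dir/sub/report.txt"
# Strip NFSv4 ACL entries first (Solaris-only chmod syntax).
if [ "$(uname)" = SunOS ]; then chmod -R A- "$dir"; fi
# Restore group-writable modes for dirs and files.
find "$dir" -type d -exec chmod 0770 {} +
find "$dir" -type f -exec chmod 0660 {} +
```

In practice, substitute the real workgroup directory for the demo path.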
Re: [zfs-discuss] setup_install_server, cpio and zfs : fix needed ?
On Sun, May 13, 2007 at 12:24:48PM -0500, Gael wrote: As I saw the same issue with the previous release, I'm going to post this one here and not on the U4 beta forum. I'm trying to create a miniroot image for wanboot from the S10u4 media (the same issue occurred with the S10u3 media) on a ZFS filesystem on a U3 server. ... If checking in parallel, I can see a cpio running but not doing a lot... 30 mins without activity yet... I have the same problem on a thumper with u3: creating a boot image there takes about 41 min; on a U40 with b55b, 2.5 min.

    + /boot/solaris/bin/root_archive packmedia $DST $SCRATCH

On both systems $SCRATCH is local UFS. $DST is ZFS on the thumper, and the same directory NFS-mounted on the U40.

    21537 /bin/ksh /boot/solaris/bin/root_archive packmedia /home/elkner/tmp/miniroot/Sol
    21576 newfs /dev/rlofi/1
    21577 sh -c mkfs -F ufs /dev/rlofi/1 385800 600 1 8192 1024 16 10 120 2048 t 0 -1 8 1
    21578 mkfs -F ufs /dev/rlofi/1 385800 600 1 8192 1024 16 10 120 2048 t 0 -1 8 1 n

Here mkfs seems to take the most time, even though prstat says load 0.02 and less (mkfs is doing nothing) ... Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] learn to quote
On Sat, Apr 28, 2007 at 01:20:45PM -0400, Christine Tran wrote: Jens Elkner wrote: So please: http://learn.to/quote We apparently need to learn German as well. -CT Not really. There is also a Dutch version ;-) For your convenience (and all the people who can't find the English | Dutch link right below "Revision" at the top of the page): http://www.netmeister.org/news/learn2quote.html http://www.briachons.org/art/quote/ Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] learn to quote
Hi, why is it so time-consuming being on an opensolaris discuss mailing list? Obviously because many people never learned (or forgot) how to quote. So please: http://learn.to/quote Thanx, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] migration/acl4 problem
On Thu, Mar 22, 2007 at 01:34:15PM -0600, Mark Shellenbaum wrote: There is one big difference which you see here. ZFS always honors the user's umask, and that is why the file was created with 644 permissions rather than 664 as UFS did. ZFS has to always apply the user's umask because of POSIX. Wow, that's a big show stopper! If I tell the users that after the transition they have to toggle their umask before/after writing to certain directories, or need to do a chmod, I'm sure they'll want to hang me from the next tree and get their OS changed to Linux/Windooze... Only if your goal is to ignore a user's intent on what permissions their files should be created with. Think about users who set their umask to 077. They will be upset when their files are created with a more permissive mode. The ZFS way is much more secure. Nope - you're talking about a different thing. I did not say that these ACLs would be set on every possible fs|directory on the system! We and several companies I worked for use it to have a shared data dir - you might think of it as a kind of workgroup-based CVS, where the members of the owning workgroup are in the role of committers. The rationale for this is obvious and actually the same as for CVS: the only thing that counts is what one can find in /data/$workgroup/**. So there is no need to waste time asking who finally has the latest version of a document, or which version should be used wrt. communication with non-internal entities, etc., and furthermore it allows one to reduce the huge pile of redundant data extremely... We have used this pattern/policy successfully for more than 10 years: for Windows users it was achieved easily by using Samba, on Linux servers using XFS ACLs, and on Solaris servers using UFS ACLs. ZFS breaks it. And since Solaris has no smbmnt, we can't even get a workaround which makes more or less sense... What is your real desired goal?
Are you just wanting anybody in a specific group to be able to read/write all files in a certain directory tree? If so, then there are other ways to achieve this, with file and directory inheritance. Maybe I didn't use the right settings, but I played around with it before sending the original posting (zfs aclmode intentionally set to passthrough and fd flags added), but this didn't work either. So a working example/demo would be helpful ... Isn't there a flag/property for zfs to get back the old behavior, or to enable POSIX ACLs instead of ZFS ACLs? A force_directory_create_mode=0770,force_file_create_mode=0660 property (like for Samba shares) would be even better - no need to fight with ACLs... That would be bad. That would mean that every file in a file system would be forced to be created with a forced set of permissions. And that's exactly the business requirement. And even more a practical experience: assume users always have to change their umask before writing to /data/workgroup/**. Since people are usually a little bit lazy and focused on getting the job done, it doesn't take very long until they have added umask 007 to their .login/.profile or whatever. But now anybody in the same workgroup is also able to read the user's private data in their $HOME, e.g. $HOME/Mail/* ... So in theory you might be right, but in practice it turns out that you achieve exactly the opposite... Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
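For reference, the inheritance-based setup alluded to above would presumably be configured along these lines on Solaris (dataset, directory, and property values are assumptions for illustration; whether this actually survives the umask handling discussed in the thread is exactly the open question):

```shell
# Sketch (Solaris only): inheritable group-rw ACEs on a workgroup directory.
# Let local ACLs/ACEs pass through instead of being masked by mode bits.
zfs set aclmode=passthrough pool1/data
zfs set aclinherit=passthrough pool1/data
# A= replaces the ACL; f = inherit to new files, d = inherit to new dirs.
chmod A=owner@:rwxpdDaARWcCos:fd:allow,group@:rwxpdDaARWcCos:fd:allow /pool1/data/office
```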
[zfs-discuss] migration/acl4 problem
Hi, S10U3: It seems that UFS POSIX ACLs are not properly translated to ZFS ACL4 entries when one transfers a directory tree from UFS to ZFS. Test case: assume one has users A and B, both belonging to group G and having their umask set to 022:

1) On UFS - as user A do:

    mkdir /dir
    chmod 0775 /dir
    setfacl -m d:u::rwx,d:g::rwx,d:o:r-x,d:m:rwx /dir
    # samba would say: force create mask = 0664; directory mode = 0775

- as user B do:

    cd /dir
    touch x
    ls -alv

- as user A do:

    cd /dir
    echo bla > x

- results in:

    drwxrwxr-x+ 3 A G 512 Mar 22 01:20 .
        0:user::rwx
        1:group::rwx #effective:rwx
        2:mask:rwx
        3:other:r-x
        4:default:user::rwx
        5:default:group::rwx
        6:default:mask:rwx
        7:default:other:r-x
    ...
    -rw-rw-r-- 1 B G 4 Mar 22 01:22 x
        0:user::rw-
        1:group::rw- #effective:rw-
        2:mask:rw-
        3:other:r--

2) On ZFS - e.g. as root do:

    cp -P -r -p /dir /pool1/zfsdir    # cp: Insufficient memory to save acl entry
    cp -r -p /dir /pool1/zfsdir       # cp: Insufficient memory to save acl entry
    find dir | cpio -puvmdP /pool1/docs/

- as user B do:

    cd /pool1/zfsdir/dir
    touch y

- as user A do:

    cd /pool1/zfsdir/dir
    echo bla > y    # y: Permission denied.

- result:

    drwxrwxr-x+ 2 A G 4 Mar 22 01:36 .
        owner@:--:fdi---:deny
        owner@:--:--:deny
        owner@:rwxp---A-W-Co-:fdi---:allow
        owner@:---A-W-Co-:--:allow
        group@:--:fdi---:deny
        group@:--:--:deny
        group@:rwxp---A-W-Co-:fdi---:allow
        group@:---A-W-Co-:--:allow
        everyone@:-w-p---A-W-Co-:fdi---:deny
        everyone@:---A-W-Co-:--:deny
        everyone@:r-x---a-R-c--s:fdi---:allow
        everyone@:--a-R-c--s:--:allow
        owner@:--:--:deny
        owner@:rwxp---A-W-Co-:--:allow
        group@:--:--:deny
        group@:rwxp--:--:allow
        everyone@:-w-p---A-W-Co-:--:deny
        everyone@:r-x---a-R-c--s:--:allow
    ...
    -rw-r--r--+ 1 B G 0 Mar 22 01:36 y
        owner@:--:--:deny
        owner@:---A-W-Co-:--:allow
        group@:--:--:deny
        group@:---A-W-Co-:--:allow
        everyone@:---A-W-Co-:--:deny
        everyone@:--a-R-c--s:--:allow
        owner@:--x---:--:deny
        owner@:rw-p---A-W-Co-:--:allow
        group@:-wxp--:--:deny

So, has anybody a clue how one is able to migrate directories from UFS to ZFS without losing functionality? I've read that it is always possible to translate POSIX ACLs to ACL4, but it doesn't seem to work. So I have a big migration problem ... :((( Also I haven't found anything which explains how ACL4 really works on Solaris, i.e. how the rules are applied. Yes, in order, and only the entries whose 'who' matches. But what does 'who matches' mean, what purpose do the 'owner@:--:--:deny' entries have, and what takes precedence (allow | deny | first match | last match)? Also I remember hearing that once an allow has matched, everything else is ignored - but then I'm asking why the order of the ACEs is important. Last but not least, what purpose do the standard perms, e.g. 0644, have - completely ignored if ACEs are present? Or used as a fallback if no ACE matches, or if ACEs match but nowhere set e.g. the r bit? Any hints? Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] understanding zfs/thumper bottlenecks?
On Wed, Feb 28, 2007 at 11:45:35AM +0100, Roch - PAE wrote: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6460622 Any estimations when we'll see a [feature] fix for U3? Should I open a call, to perhaps raise the priority of the fix? The bug applies to checksum as well. Although the fix now in the gate only addresses compression. There is a per-pool limit on throughput due to checksumming. Multiple pools may help. Yepp. However, pooling the disks also means limiting the I/O for a single task to max. #disksOfPool*IOperDisk/2. So pooling would make sense to me if one has a lot of tasks and is able to force them to a dedicated pool... So my conclusion is: the more pools, the more aggregate bandwidth (if one is able to distribute the work properly over all disks), but the less bandwidth for a single task :(( This performance feature was fixed in Nevada last week. Workaround is to create multiple pools with fewer disks. Does this make sense for mirrors only as well? Yep. OK, since I can't get out more than ~1GB/s (only one PCI-X slot left for a 10Gbps NIC), I decided to split into 2m*12 + 2m*10 + s*2 (see below). But I do not wanna rise the write perf limit: it already dropped to an average of ~345 MB/s :((( http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6415647 is degrading the perf a bit (guesstimate of anywhere up to 10-20%). I would guess even up to 45% ... Check out iostat 1 and you will see the '0s': not good. Yes - saw even 5 consecutive 0s ... :( Here the layout (incl. test results) I actually have in mind for production:

1. pool for big files (source tarballs, multimedia, ISO images):

    zpool create -n pool1 \
        mirror c0t0d0 c1t0d0  mirror c6t0d0 c7t0d0  mirror c4t1d0 c5t1d0 \
        mirror c0t2d0 c1t2d0  mirror c6t2d0 c7t2d0  mirror c4t3d0 c5t3d0 \
        mirror c0t4d0 c1t4d0  mirror c6t4d0 c7t4d0  mirror c4t5d0 c5t5d0 \
        mirror c0t6d0 c1t6d0  mirror c6t6d0 c7t6d0  mirror c4t7d0 c5t7d0 \
        spare c4t0d0 c4t4d0

    (2x 256G) write(min/max/aver): 0 674 343.7

2. pool for mixed stuff (homes, apps):

    zpool create -n pool2 \
        mirror c0t1d0 c1t1d0  mirror c6t1d0 c7t1d0  mirror c4t2d0 c5t2d0 \
        mirror c0t3d0 c1t3d0  mirror c6t3d0 c7t3d0 \
        mirror c0t5d0 c1t5d0  mirror c6t5d0 c7t5d0  mirror c4t6d0 c5t6d0 \
        mirror c0t7d0 c1t7d0  mirror c6t7d0 c7t7d0 \
        spare c4t0d0 c4t4d0

    (2x 256G) write(min/max/aver): 0 600 386.0

1. + 2. (2x 256G) write(min/max/aver): 0   1440 637.9
1. + 2. (4x 128G) write(min/max/aver): 3.5 1268 709.5 (381+328.5)

Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] understanding zfs/thumper bottlenecks?
On Mon, Feb 26, 2007 at 06:36:47PM -0800, Richard Elling wrote: Jens Elkner wrote: Currently I'm trying to figure out the best zfs layout for a thumper wrt. read AND write performance. First things first. What is the expected workload? Random, sequential, lots of little files, few big files, 1-byte iops, synchronous data, constantly changing access times, ??? Mixed. I.e.: 1) as a home server for students' and staff's ~, so small and big files (BTW: what is small and what is big?) as well as compressed/text files (you know, the more space people have, the messier they get ...) - targeted at Samba and NFS. 2) app server in the sense of shared NFS space, where applications get installed once and can be used everywhere, e.g. eclipse, soffice, jdk*, TeX, Pro Engineer, Studio 11 and the like. Later I wanna have the same functionality for firefox, thunderbird, etc. for Windows clients via Samba, but this requires a little bit more tweaking to get it to work, aka time I do not have right now ... Anyway, when ~30 students start their monster apps like eclipse, oxygen, soffice at once (which happens in seminars quite frequently), I would be lucky to get the same performance via NFS as from a local HDD ... 3) Video streaming, i.e. capturing as well as broadcasting/editing via smb/nfs. In general, striped mirror is the best bet for good performance with redundancy. Yes - thought about doing a

    mirror c0t0d0 c1t0d0  mirror c4t0d0 c6t0d0  mirror c7t0d0 c0t4d0 \
    mirror c0t1d0 c1t1d0  mirror c4t1d0 c5t1d0  mirror c6t1d0 c7t1d0 \
    mirror c0t2d0 c1t2d0  mirror c4t2d0 c5t2d0  mirror c6t2d0 c7t2d0 \
    mirror c0t3d0 c1t3d0  mirror c4t3d0 c5t3d0  mirror c6t3d0 c7t3d0 \
    mirror c1t4d0 c7t4d0  mirror c4t4d0 c6t4d0 \
    mirror c0t5d0 c1t5d0  mirror c4t5d0 c5t5d0  mirror c6t5d0 c7t5d0 \
    mirror c0t6d0 c1t6d0  mirror c4t6d0 c5t6d0  mirror c6t6d0 c7t6d0 \
    mirror c0t7d0 c1t7d0  mirror c4t7d0 c5t7d0  mirror c6t7d0 c7t7d0

(probably removing the 5th line and using those drives as hot spares). But perhaps it might be better to split the mirrors into 3 different pools (though I'm not sure why: my brain says no, my belly says yes ;-)). I did some simple mkfile 512G tests and found out that on average ~500 MB/s seems to be the maximum one can reach (tried the initial default setup, all 46 HDDs as R0, etc.). How many threads? One mkfile thread may be CPU bound. Very good point! Using 2 x mkfile 256G I got (min/max/aver) 473/750/630 MB/s (via zpool iostat 10) with the layout shown above and no compression enabled. Just to prove it: with 4 x mkfile 128G I got 407/815/588, with 3 x mkfile 170G 401/788/525; 1 x mkfile 512G was 397/557/476. Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
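The multi-writer experiment above can be reproduced with plain dd standing in for Solaris' mkfile; a minimal sketch (sizes scaled way down so it is cheap to run — on the thumper it was e.g. 2 x 256G):

```shell
# Sketch: N parallel sequential writers, as in the 2x/4x mkfile runs above.
N=2
for i in $(seq 1 $N); do
    # each writer streams zeros into its own file in the background
    dd if=/dev/zero of=/tmp/seqwrite.$i bs=1M count=8 2>/dev/null &
done
wait                       # block until all writers finish
ls -l /tmp/seqwrite.*
```

While this runs against a real pool, `zpool iostat 10` in a second terminal shows the aggregate write bandwidth, as in the numbers quoted above.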
Re: [zfs-discuss] understanding zfs/thumper bottlenecks?
On Tue, Feb 27, 2007 at 11:35:37AM +0100, Roch - PAE wrote: That might be a per-pool limitation due to http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6460622 Not sure - did not use the compression feature... This performance feature was fixed in Nevada last week. Workaround is to create multiple pools with fewer disks. Does this make sense for mirrors only as well? Also this http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6415647 is degrading the perf a bit (guesstimate of anywhere up to 10-20%). Hmm - sounds similar (zpool iostat 10):

                 capacity     operations    bandwidth
    pool        used  avail   read  write   read  write
    pool1      4.36G  10.4T      0  5.04K      0   636M
    pool1      9.18G  10.4T      0  4.71K    204   591M
    pool1      17.8G  10.4T      0  5.21K      0   650M
    pool1      24.0G  10.4T      0  5.65K      0   710M
    pool1      30.9G  10.4T      0  6.26K      0   786M
    pool1      36.3G  10.4T      0  2.74K      0   339M
    pool1      41.5G  10.4T      0  4.27K  1.60K   533M
    pool1      46.7G  10.4T      0  4.19K      0   527M
    pool1      46.7G  10.4T      0  2.28K  1.60K   290M
    pool1      55.7G  10.4T      0  5.18K      0   644M
    pool1      59.9G  10.4T      0  6.17K      0   781M
    pool1      68.8G  10.4T      0  5.63K      0   702M
    pool1      73.8G  10.3T      0  3.93K      0   492M
    pool1      78.7G  10.3T      0  2.96K      0   366M
    pool1      83.2G  10.3T      0  5.58K      0   706M
    pool1      91.5G  10.3T      4  6.09K  6.54K   762M
    pool1      96.4G  10.3T      0  2.74K      0   338M
    pool1       101G  10.3T      0  3.88K  1.75K   485M
    pool1       106G  10.3T      0  3.85K      0   484M
    pool1       106G  10.3T      0  2.79K  1.60K   355M
    pool1       110G  10.3T      0  2.97K      0   369M
    pool1       119G  10.3T      0  5.20K      0   647M
    pool1       124G  10.3T      0  3.64K  1.80K   455M
    pool1       124G  10.3T      0  3.54K      0   453M
    pool1       128G  10.3T      0  2.77K      0   343M
    pool1       133G  10.3T      0  3.92K    102   491M
    pool1       137G  10.3T      0  2.43K      0   300M
    pool1       141G  10.3T      0  3.26K      0   407M
    pool1       148G  10.3T      0  5.35K      0   669M
    pool1       152G  10.3T      0  3.14K      0   392M
    pool1       156G  10.3T      0  3.01K      0   374M
    pool1       160G  10.3T      0  4.47K      0   562M
    pool1       164G  10.3T      0  3.04K      0   379M
    pool1       168G  10.3T      0  3.39K      0   424M
    pool1       172G  10.3T      0  3.67K      0   459M
    pool1       176G  10.2T      0  3.91K      0   490M
    pool1       183G  10.2T      4  5.58K  6.34K   699M
    pool1       187G  10.2T      0  3.30K  1.65K   406M
    pool1       195G  10.2T      0  3.24K      0   401M
    pool1       198G  10.2T      0  3.21K      0   401M
    pool1       203G  10.2T      0  3.87K      0   486M
    pool1       206G  10.2T      0  4.92K      0   623M
    pool1       214G  10.2T      0  5.13K      0   642M
    pool1       222G  10.2T      0  5.02K      0   624M
    pool1       225G  10.2T      0  4.19K      0   530M
    pool1       234G  10.2T      0  5.62K      0   700M
    pool1       238G  10.2T      0  6.21K      0   787M
    pool1       247G  10.2T      0  5.47K      0   681M
    pool1       254G  10.2T      0  3.94K      0   488M
    pool1       258G  10.2T      0  3.54K      0   442M
    pool1       262G  10.2T      0  3.53K      0   442M
    pool1       267G  10.2T      0  4.01K      0   504M
    pool1       274G  10.2T      0  5.32K      0   664M
    pool1       274G  10.2T      4  3.42K  6.69K   438M
    pool1       278G  10.2T      0  3.44K  1.70K   428M
    pool1       282G  10.1T      0  3.44K      0   429M
    pool1       289G  10.1T      0  5.43K      0   680M
    pool1       293G  10.1T      0  3.36K      0   419M
    pool1       297G  10.1T      0  3.39K    306   423M
    pool1       301G  10.1T      0  3.33K      0   416M
    pool1       308G  10.1T      0  5.48K      0   685M
    pool1       312G  10.1T      0  2.89K      0   360M
    pool1       316G  10.1T      0  3.65K      0   457M
    pool1       320G  10.1T      0  3.10K      0   386M
    pool1       327G  10.1T      0  5.48K      0   686M
    pool1       334G  10.1T      0  3.31K      0   406M
    pool1       337G  10.1T      0  5.28K      0   669M
    pool1       345G  10.1T      0  3.30K      0   402M
    pool1       349G  10.1T      0  3.48K  1.60K   437M
    pool1       349G  10.1T      0  3.42K      0   436M
    pool1       353G  10.1T      0  3.05K      0   379M
    pool1       358G  10.1T      0  3.81K      0   477M
    pool1       362G  10.1T      0  3.40K      0   425M
    pool1       366G  10.1T      4  3.23K  6.59K   401M
    pool1       370G  10.1T      0  3.47K  1.65K   432M
    pool1       376G  10.1T      0  4.98K      0   623M
    pool1       380G  10.1T      0  2.97K      0   369M
    pool1       384G  10.0T      0  3.52K    409   439M
    pool1       390G  10.0T      0  5.00K      0   626M
    pool1       398G  10.0T      0  3.38K      0   414M
    pool1       404G  10.0T      0  5.09K      0   637M
    pool1       408G  10.0T      0  3.18K      0   397M
    pool1       412G  10.0T      0  3.19K      0   397M
[zfs-discuss] Re: zfs bogus (10 u3)?
Hi Wire ;-), What's the output of

    zpool list
    zfs list

? Ooops, already destroyed the pool. Anyway, slept a night over it and found a possible explanation: the files were created with mkfile, and mkfile has an option -n. It was not used to create the files; however, I interrupted mkfile (^C). So I guess mkfile always creates a sparse file first, and when no '-n' is given, it then starts to allocate the blocks ... Regards, jel. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
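The sparse-file theory is easy to check with dd as a portable stand-in for an interrupted mkfile: seeking past EOF before writing sets the logical size without allocating data blocks, which is exactly the state an interrupted mkfile (without -n) would leave behind.

```shell
# Create a ~100MB-sized file that occupies almost no disk blocks (sparse).
dd if=/dev/zero of=/tmp/sparse.demo bs=1 count=1 seek=104857600 2>/dev/null
ls -l /tmp/sparse.demo   # logical size: ~100 MB
du -k /tmp/sparse.demo   # allocated size: a few KB at most
```

The gap between `ls -l` and `du` output is the tell-tale sign of a sparse file.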
[zfs-discuss] understanding zfs/thumper bottlenecks?
Currently I'm trying to figure out the best zfs layout for a thumper wrt. read AND write performance. I did some simple mkfile 512G tests and found out that on average ~500 MB/s seems to be the maximum one can reach (tried the initial default setup, all 46 HDDs as R0, etc.). According to http://www.amd.com/us-en/assets/content_type/DownloadableAssets/ArchitectureWP_062806.pdf I would assume that much more, and in theory a max. of ~2.5 GB/s, should be possible with R0 (assuming the throughput of a single thumper HDD is ~54 MB/s)... Is somebody able to enlighten me? Thanx, jel. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
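The ~2.5 GB/s figure is pure back-of-the-envelope arithmetic, assuming all 46 disks stream sequentially at once:

```shell
# 46 disks x ~54 MB/s sequential each => theoretical aggregate write rate
echo "$((46 * 54)) MB/s"   # 2484 MB/s, i.e. roughly 2.5 GB/s
```

The gap between that and the measured ~500 MB/s is what the thread is about.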
[zfs-discuss] zone with lofs zfs - why legacy
I've created a zone which should mount the /pool1/flexlm.ulo zfs via lofs:

    + zfs create pool1/flexlm.ulo
    + zfs set atime=off pool1/flexlm.ulo
    + zfs set sharenfs=off pool1/flexlm.ulo
    + zonecfg -z flexlm
    ...
    add fs
    set dir=/usr/local
    set special=/pool1/flexlm.ulo
    set type=lofs
    add options rw,nodevices
    end
    ...

This seems to work. However, the manual zfs(5) says that the mountpoint property has to be set to legacy.
1) Why?
2) If one sets the zfs property to legacy, the manual does not say how an /etc/vfstab entry should look (and a mount_zfs man page doesn't exist)...
3) The /pool1/flexlm.ulo property is set to atime=off. Do I need to specify this option or something similar when creating the zone?
4) Wrt. best performance only, what should one prefer: add fs:dir or add fs:dataset?
This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
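Regarding question 2): from what I can tell, a legacy-mounted dataset is listed in /etc/vfstab with fstype zfs and dashes for the fields that don't apply — a sketch from memory, not an authoritative answer:

```shell
# Switch the dataset to legacy mounting (Solaris), so that mount(1M)
# and /etc/vfstab control it instead of the zfs mountpoint property:
zfs set mountpoint=legacy pool1/flexlm.ulo
# Matching /etc/vfstab entry (device-to-fsck and fsck-pass don't apply):
#
#   pool1/flexlm.ulo  -  /usr/local  zfs  -  yes  -
```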
[zfs-discuss] Re: zone with lofs zfs - why legacy
'Robert Milkowski wrote:' Hi Robert, Monday, October 23, 2006, 7:15:39 PM, you wrote: JE 3) the /pool1/flexlm.ulo property is set to atime=off. Do I need JE to specify this option or something similar, when creating the zone? no, you don't. OK. JE 4) Wrt. best performance only, what should one prefer: add fs:dir or add fs:dataset? The performance should be the same in both cases - only a difference in features. ps. of course you do realize that mounting a filesystem over lofs will degrade performance Yes, I guessed that, but hopefully not that much ... Thinking about it, it would suggest to me (if I need abs. max. perf) that the best thing to do is to create a pool inside the zone and use zfs on it? Regards, jens. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] Re: Re: zone with lofs zfs - why legacy
Using a ZFS filesystem within a zone will go just as fast as in the global zone, so there's no need to create multiple pools. So Robert is actually wrong (at least in theory): using a zfs via add fs:dir=...,type=lofs probably gives less performance than using it via add dataset:name. Correct? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
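If raw performance is the goal, delegating the dataset to the zone (rather than lofs-mounting it) would presumably look like this — a sketch, with the zone and dataset names taken from the earlier posting:

```shell
# Sketch (Solaris): delegate the ZFS dataset to the zone instead of
# lofs-mounting it; the zone then manages the dataset itself.
zonecfg -z flexlm <<EOF
add dataset
set name=pool1/flexlm.ulo
end
commit
EOF
```

With a delegated dataset the zone administrator can set properties (atime, quota, etc.) from inside the zone, which also answers question 3) of the original posting.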