Re: [zfs-discuss] how to set up solaris os and cache within one SSD
2011/11/11 Jeff Savit jeff.sa...@oracle.com wrote:

On 11/10/2011 06:38 AM, Edward Ned Harvey wrote:
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jeff Savit

Also, not a good idea for performance to partition the disks as you suggest.

Not totally true. By default, if you partition the disks, the disk write cache gets disabled. But it's trivial to force-enable it, which solves the problem.

Granted - I just didn't want to get into a long story. With a self-described 'newbie' building a storage server, I felt the best advice was to keep it as simple as possible without adding steps (and without a digression about the cache on partitioned disks - but now that you've brought it up, yes, he can certainly do that). Besides, there's always a way to fill up the 1TB disks :-) Besides the OS image, they could also store gold images for the guest virtual machines, maintained separately from the operational images.

How big a partition do you suggest for the Solaris OS?

regards, Jeff

-- *Jeff Savit* | Principal Sales Consultant Phone: 602.824.6275 | Email: jeff.sa...@oracle.com | Blog: http://blogs.oracle.com/jsavit Oracle North America Commercial Hardware Operating Environments Infrastructure S/W Pillar 2355 E Camelback Rd | Phoenix, AZ 85016

___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
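For anyone following along, the "force enable" Edward mentions is typically done in format(1M) expert mode. A hedged, interactive sketch only - c0t1d0 is a placeholder device name and the exact menu entries can vary by controller/driver:

```shell
# Interactive session (cannot be scripted verbatim): run format in
# expert mode, select the disk, then walk the cache menus.
format -e
#   (select the disk, e.g. c0t1d0)
#   format> cache
#   cache> write_cache
#   write_cache> display    # check the current state first
#   write_cache> enable     # force the write cache on
```

Note this is a sketch under the assumption of a SCSI/SAS-style cache menu; verify against your own hardware before relying on it.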
Re: [zfs-discuss] how to set up solaris os and cache within one SSD
On Fri, Nov 11, 2011 at 2:52 PM, darkblue darkblue2...@gmail.com wrote:

I recommend buying either the Oracle hardware or Nexenta on whatever they recommend for hardware. Definitely DO NOT run the free version of Solaris without updates and expect it to be reliable.

That's a bit strong. Yes, I do regularly update my supported (Oracle) systems, but I've never had problems with my own-build Solaris Express systems. I waste far more time on (now luckily legacy) fully supported Solaris 10 boxes!

what does it mean?

It means some people have experienced problems on both supported and unsupported Solaris boxes, but using Oracle hardware gives you a better chance of having fewer problems, since Oracle (supposedly) tests their software on their hardware regularly to make sure it all works nicely.

I am going to install Solaris 10 u10 on this server. Is there any compatibility problem?

As mentioned earlier, if you want a fully-tested configuration, running Solaris on Oracle hardware is a no-brainer choice. Another alternative is using Nexenta on hardware they certify, like http://www.nexenta.com/corp/newsflashes/86-2010/728-nexenta-announces-supermicro-partnership , since they've run enough tests on the combination. Also, if you look at posts on this list, the usual recommendation is to use SAS disks instead of SATA for the best performance and reliability.

And which version of Solaris or a Solaris derivative do you suggest for building storage with the above hardware?

Why not the recently-released Solaris 11? And while we're on the subject, if using legal software is among your concerns and you don't have Solaris support (something like $2k/socket/year, which is the only legal way to license Solaris for non-Oracle hardware), why not use OpenIndiana?

-- Fajar
Re: [zfs-discuss] how to set up solaris os and cache within one SSD
On 11/11/11 08:52 PM, darkblue wrote:

2011/11/11 Ian Collins i...@ianshome.com wrote:

On 11/11/11 02:42 AM, Edward Ned Harvey wrote:
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of darkblue

1 * Xeon 5606
1 * Supermicro X8DT3-LN4F
6 * 4G RECC RAM
22 * WD RE3 1TB hard disks
4 * Intel 320 (160GB) SSDs
1 * Supermicro 846E1-900B chassis

I just want to say, this isn't supported hardware, and although many people will say they do this without problems, I've heard just as many people (including myself) saying it's unstable that way.

I've never had issues with Supermicro boards. I'm using a similar model and everything on the board is supported.

I recommend buying either the Oracle hardware or Nexenta on whatever they recommend for hardware. Definitely DO NOT run the free version of Solaris without updates and expect it to be reliable.

That's a bit strong. Yes, I do regularly update my supported (Oracle) systems, but I've never had problems with my own-build Solaris Express systems. I waste far more time on (now luckily legacy) fully supported Solaris 10 boxes!

what does it mean?

Solaris 10 live upgrade is a pain in the arse! It gets confused when you have lots of filesystems, clones and zones.

I am going to install Solaris 10 u10 on this server. Is there any compatibility problem? And which version of Solaris or a Solaris derivative do you suggest for building storage with the above hardware?

I'm running 11 Express now, upgrading to Solaris 11 this weekend. Unless you have a good reason to use Solaris 10, use Solaris 11 or OpenIndiana.

-- Ian.
Re: [zfs-discuss] how to set up solaris os and cache within one SSD
2011/11/11 Ian Collins i...@ianshome.com wrote:

[earlier thread quoted in full; quoted text elided]

I'm running 11 Express now, upgrading to Solaris 11 this weekend. Unless you have a good reason to use Solaris 10, use Solaris 11 or OpenIndiana.
I had considered OpenIndiana, but it's still at the development stage; I don't know whether this release (oi_151a) is stable enough for production use.
Re: [zfs-discuss] how to set up solaris os and cache within one SSD
On 11/11/2011 01:02 AM, darkblue wrote:

2011/11/11 Jeff Savit jeff.sa...@oracle.com wrote:

On 11/10/2011 06:38 AM, Edward Ned Harvey wrote:
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jeff Savit

Also, not a good idea for performance to partition the disks as you suggest.

Not totally true. By default, if you partition the disks, the disk write cache gets disabled. But it's trivial to force-enable it, which solves the problem.

Granted - I just didn't want to get into a long story. With a self-described 'newbie' building a storage server, I felt the best advice was to keep it as simple as possible without adding steps (and without a digression about the cache on partitioned disks - but now that you've brought it up, yes, he can certainly do that). Besides, there's always a way to fill up the 1TB disks :-) Besides the OS image, they could also store gold images for the guest virtual machines, maintained separately from the operational images.

How big a partition do you suggest for the Solaris OS?

That's one of the best things about ZFS and *not* putting separate pools on the same disk - you don't have to worry about sizing partitions. Use two of the rotating disks to install Solaris on a mirrored root pool (rpool). The OS build will take up a small portion of the 1TB of usable space (and you don't want to go above 80% full, so it's really 800GB effectively). You can use the remaining space in that pool for additional ZFS datasets to hold golden OS images, iTunes, backups, whatever. Or simply don't worry about it and let there be unused space. Disk space is relatively cheap - complexity and effort are not. For all we know, the disk space you're buying is more than ample for the application, and it might not even be worth devising the most space-efficient layout.
If that's not the case, then the next topic would be how to stretch capacity via clones, compression, and RAIDZn.

Along with several others posting here, I recommend you use Solaris 11 rather than Solaris 10. A lot of things are much easier, such as managing boot environments and sharing file systems via NFS, CIFS, and iSCSI, and there's a lot of added functionality. I further (and strongly) endorse the suggestion of using a system from Oracle with supported OS and hardware, but I don't want to get into any arguments about hardware or licensing, please.

regards, Jeff
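The 80% rule of thumb Jeff applies above works out to exactly the figure he quotes. A trivial shell-arithmetic sanity check (the 1 TB is the example disk size from this thread, in decimal disk-vendor bytes):

```shell
# ~1 TB of usable mirrored space (decimal, disk-vendor terabyte).
pool_bytes=1000000000000

# List rule of thumb: keep a ZFS pool under roughly 80% full.
ceiling=$((pool_bytes * 80 / 100))

echo "stay below: $ceiling bytes"   # prints: stay below: 800000000000 bytes
```

Nothing Solaris-specific here; it only makes the "really 800GB effectively" arithmetic explicit.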
Re: [zfs-discuss] zfs sync=disabled property
disk. This behavior is what makes NFS over ZFS slow without a slog: NFS does everything O_SYNC by default,

No, it doesn't. However, VMware by default issues all writes as SYNC.
Re: [zfs-discuss] zfs sync=disabled property
On Thu, 10 Nov 2011, Tomas Forsman wrote:

Loss of data as seen by the client can definitely occur. When a client writes something, and something else ends up on disk - I call that corruption. Doesn't matter whose fault it is or the technical details; the wrong data was stored despite the client being careful when writing.

Unlike many filesystems, ZFS does not prioritize sync data over async data when it comes to finally writing the data to the main store. Sync data is written to an intent log, which is replayed (as required) when the server reboots. Disabling sync disables this intent log, so data should be consistently set back in time if sync is disabled and the server does an unclean reboot. From this standpoint, the filesystem does not become corrupted. Regardless, data formats like databases could become internally corrupted because the data written in a ZFS transaction group may not represent a coherent database transaction.

Bob

-- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
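For reference, the behavior being debated is controlled per dataset via the ZFS `sync` property. A minimal configuration sketch - the dataset name tank/build is a placeholder:

```shell
# Trade intent-log (ZIL) durability for speed, e.g. on data you can
# regenerate. Up to one transaction group (~30s) of writes is at risk.
zfs set sync=disabled tank/build

# Check the current setting; valid values are standard|always|disabled.
zfs get sync tank/build

# Revert to the inherited default (standard) when done.
zfs inherit sync tank/build
```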
Re: [zfs-discuss] about btrfs and zfs
Paul Kraus wrote:

My main reasons for using zfs are pretty basic compared to some here

What are they? (the reasons for using ZFS)

All technical reasons aside, I can tell you one huge reason I love ZFS, and it's one that is clearly being completely ignored by btrfs: ease of use. The zfs command set is wonderful and very English-like (for a unix command set). It's simple, clear, and logical. The grammar makes sense. I almost never have to refer to the man page. The last time I looked, the commands for btrfs were the usual incomprehensible gibberish with a thousand squiggles and numbers. It looked like a real freaking headache, to be honest. With zfs I can do really complex operations off the top of my head. It's very clear to me that someone spent a lot of time making the commands work that way, and that the commands have a lot of intelligence behind the scenes. After many years spent poring over manuals for SVM and VxFS and writing meter-long commands with a thousand fiddly little parameters, it is SUCH a relief. It's a pleasure to use. Like swimming in crystal clear water after years in murky soup.

I haven't used btrfs. But just from what I've heard, I have two suggestions for it:

1) Change the stupid name. Btrfs is neither a pronounceable word nor a good acronym. ButterFS sounds stupid. Just call it BFS or something, please.

2) After renaming it BFS, steal the entire ZFS command set and change the z's to b's. Have 'bpool' and 'bfs' commands, and exactly copy their syntax. The source code underneath may be copyrighted, but I doubt you can copyright command names, and probably even Oracle wouldn't be petty enough to raise a legal stink (though you never know with them).

It would be nice if, for once, people writing software actually took usability into account, and the ulcers of sysadmins. Kudos to ZFS for bucking the horrible trend of impossibly complex syntax.
[zfs-discuss] zfs xattr not supported prevents smb mount
Hello. I have some ZFS filesystems shared via CIFS. Some of them I can mount and others I can't. They appear identical in properties and ACLs; the only difference I've found is that the successful ones have the xattr set {A--m} and the others have {}. But I can't set that xattr on the share to see if it allows it to be mounted:

chmod S+cA share
chmod: ERROR: extended system attributes not supported for share

(even though it has the xattr=on property)

(The mount fails (after entering the correct password) with "tree connect failed: syserr = Permission denied" and the log message "access denied: share ACL".)

Maybe I'm barking up the wrong tree and it's not the xattr of the share which is causing the problem, but I'd be grateful for some enlightenment. This is Solaris 11 (and is a 'regression' from Solaris 11 Express).

Thanks
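For anyone comparing shares the same way, system attributes on Solaris 11 can be inspected and (where supported) set roughly like this - a hedged sketch, with /pool/share as a placeholder path:

```shell
# Show the system-attribute set for a file or directory,
# e.g. {A--m} on the working shares vs {} on the failing ones.
ls -/ c /pool/share

# Try to set the archive (A) attribute, as in the post above; this only
# succeeds if the filesystem supports extended system attributes.
chmod S+cA /pool/share

# Confirm the dataset allows extended attributes at all.
zfs get xattr pool/share
```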
Re: [zfs-discuss] about btrfs and zfs
On Fri, Nov 11, 2011 at 1:39 PM, Linder, Doug doug.lin...@merchantlink.com wrote:

All technical reasons aside, I can tell you one huge reason I love ZFS, and it's one that is clearly being completely ignored by btrfs: ease of use. [quoted text elided]

The command syntax paradigm of zfs (command sub-command object parameters) is not unique to zfs, but seems to have been the way of doing things in Solaris 10. The _new_ functions of Solaris 10 were all this way (to the best of my knowledge)... zonecfg, zoneadm, svcadm, svccfg ... and many others are written this way. To boot the zone named foo you use the command zoneadm -z foo boot; to disable the service named sendmail, svcadm disable sendmail, etc. Someone at Sun was thinking :-)

-- Paul Kraus - Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ ) - Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ ) - Technical Advisor, RPI Players
Re: [zfs-discuss] about btrfs and zfs
On Fri, Nov 11, 2011 at 4:27 PM, Paul Kraus p...@kraus-haus.org wrote:

The command syntax paradigm of zfs (command sub-command object parameters) is not unique to zfs, but seems to have been the way of doing things in Solaris 10. [quoted text elided]

I'd have preferred zoneadm boot foo. The "-z zone command" thing is a bit of a sore point, IMO. But yes, all these new *adm(1M) and *cfg(1M) commands in S10 are wonderful, especially when compared to past and present alternatives in the Unix/Linux world.

Nico
--
Re: [zfs-discuss] about btrfs and zfs
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Linder, Doug

All technical reasons aside, I can tell you one huge reason I love ZFS, and it's one that is clearly being completely ignored by btrfs: ease of use. [quoted text elided]

Maybe you're doing different things from me. btrfs subvol create, delete, snapshot, mkfs, ... For me, both ZFS and BTRFS have normal user interfaces and/or command syntax.

1) Change the stupid name. Btrfs is neither a pronounceable word nor a good acronym. ButterFS sounds stupid. Just call it BFS or something, please.

LOL. Well, for what it's worth, there are three common pronunciations for btrfs: Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees). Check Wikipedia. (This isn't really true, but I like to joke, after saying something like that, that I wrote the Wikipedia page just now.) ;-)

Speaking of which: zettabyte filesystem. ;-) Is it just a dumb filesystem with a lot of address bits? Or is it something that offers functionality that other filesystems don't have? ;-)
Re: [zfs-discuss] zfs sync=disabled property
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Bob Friesenhahn

data formats like databases could become internally corrupted due to the data written in a zfs transaction group not being representative of a coherent database transaction.

Although this is true, it's only true to the extent that corruption is a term applicable to power loss. If some database application is performing operations all over the place, unaware of what ZFS or any other filesystem sync policy is in force... then ZFS ungracefully crashing with sync disabled and rewinding to some previous state would be just like ext4 having the power yanked out suddenly. If your application is able to survive power loss, then it's able to survive an ungraceful crash with zfs sync disabled.
Re: [zfs-discuss] about btrfs and zfs
On Sat, Nov 12, 2011 at 9:25 AM, Edward Ned Harvey opensolarisisdeadlongliveopensola...@nedharvey.com wrote:

Maybe you're doing different things from me. btrfs subvol create, delete, snapshot, mkfs, ... For me, both ZFS and BTRFS have normal user interfaces and/or command syntax. [quoted text elided]

The grammatically-correct syntax would be btrfs create subvolume, but the current tool/syntax is an improvement over the old ones (btrfsctl, btrfs-vol, etc).

LOL. Well, for what it's worth, there are three common pronunciations for btrfs: Butterfs, Betterfs, and B-Tree FS (because it's based on b-trees).

... as long as you don't call it BiTterly bRoken FS :)

-- Fajar
Re: [zfs-discuss] weird bug with Seagate 3TB USB3 drive
On Nov 10, 2011, at 7:47 PM, David Magda wrote:

On Nov 10, 2011, at 18:41, Daniel Carosone wrote:

On Tue, Oct 11, 2011 at 08:17:55PM -0400, John D Groenveld wrote: Under both Solaris 10 and Solaris 11x, I receive the evil message:
| I/O request is not aligned with 4096 disk sector size.
| It is handled through Read Modify Write but the performance is very low.

I got similar with 4k-sector 'disks' (as a COMSTAR target with blk=4096) when trying to use them to force a pool to ashift=12. The labels are found at the wrong offset when the block numbers change, and maybe the GPT label has issues too.

Anyone know if Solaris 11 has better support for detecting the native block size of the underlying storage?

Better than ? If the disks advertise 512 bytes, the only way around it is with a whitelist. I would be rather surprised if Oracle sells 4KB-sector disks for Solaris systems…

-- richard

-- ZFS and performance consulting http://www.RichardElling.com LISA '11, Boston, MA, December 4-9
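For reference, one way to see which sector alignment ZFS actually chose for a pool is to look at the vdev's ashift in the pool configuration - a hedged sketch, with tank as a placeholder pool name:

```shell
# Display the cached pool configuration and pick out the ashift value.
#   ashift: 9  -> 2^9  = 512-byte alignment
#   ashift: 12 -> 2^12 = 4096-byte alignment
zdb -C tank | grep ashift
```

If the disks advertise 512-byte sectors, expect ashift 9 here even on 4KB-native drives, which is exactly the mismatch discussed above.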
Re: [zfs-discuss] zfs sync=disabled property
Generally, there should not be corruption, only a roll-back to a previous state. *HOWEVER*, it's possible that an application which has state outside of the filesystem (such as effects on network peers, or even state written to *other* filesystems) will encounter a consistency problem, as the application will not be expecting this potentially partial rollback of state. This state *could* be state tracked in remote systems, or in VMs, for example.

Generally, I discourage disabling the sync unless you know *exactly* what you are doing. On my build filesystems I do it, because I can regenerate all the data, and a loss of up to 30 seconds of data is no problem for me. But I don't do this on home directories, or on filesystems used for arbitrary application storage. And I would *never* do this for a filesystem that is backing a database. As they say, better safe than sorry.

- Garrett

On Nov 10, 2011, at 11:12 AM, Tomas Forsman wrote:

On 10 November, 2011 - Bob Friesenhahn sent me these 1,6K bytes:

On Wed, 9 Nov 2011, Tomas Forsman wrote:

At all times, if there's a server crash, ZFS will come back along at next boot or mount, and the filesystem will be in a consistent state that was indeed a valid state which the filesystem actually passed through at some moment in time. So as long as all the applications you're running can accept the possibility of going back in time as much as 30 sec, following an ungraceful ZFS crash, then it's safe to disable the ZIL (set sync=disabled).

Client writes block 0, server says OK and writes it to disk. Client writes block 1, server says OK and crashes before it's on disk. Client writes block 2... waits... waits... server comes back up, says OK, and writes it to disk. Now, from the view of the client, blocks 0-2 are all OK'd by the server with no visible errors. On the server, block 1 never arrived on disk, and you've got silent corruption.
Silent corruption (of ZFS itself) does not occur, for the simple reason that all of the block writes are flushed and acknowledged by the disks before a new transaction occurs to start the next transaction group. The previous transaction is not closed until the next transaction has been successfully started by writing the previous TXG group record to disk. Given properly working hardware, the worst-case scenario is losing the whole transaction group, and no corruption occurs.

Loss of data as seen by the client can definitely occur. When a client writes something, and something else ends up on disk - I call that corruption. Doesn't matter whose fault it is or the technical details; the wrong data was stored despite the client being careful when writing.

/Tomas

-- Tomas Forsman, st...@acc.umu.se, http://www.acc.umu.se/~stric/ |- Student at Computing Science, University of Umeå `- Sysadmin at {cs,acc}.umu.se