Re: [zfs-discuss] X4540 no next-gen product?
Sounds like many of us are in a similar situation. To clarify my original post: the goal here was to continue with what was a cost-effective solution to some of our storage requirements. I'm looking for hardware that wouldn't cause me to get the runaround from the Oracle support folks, finger-pointing between vendors, or lots of grief from an untested combination of parts. If this isn't possible we'll certainly find another solution. I already know it won't be the 7000 series.

Thank you,
Chris Banal

Marion Hakanson wrote:

jp...@cam.ac.uk said:
I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a 10G connection between your storage and your front ends and you'll have the bandwidth[1]. Actually, if you really were hitting 1000x8Mbits I'd put in 2, but that is just a question of scale. In a different situation I have boxes which peak at around 7 Gb/s down a 10G link (in reality I don't need that much because it is all about the IOPS for me). That is with just twelve 15k disks. Your situation appears to be pretty ideal for storage hardware, so perfectly achievable from an appliance.

Depending on usage, I disagree with your bandwidth and latency figures above. An X4540, or an X4170 with J4000 JBODs, has more bandwidth to its disks than 10Gbit Ethernet. You would need three 10GbE interfaces between your CPU and the storage appliance to equal the bandwidth of a single 8-port 3Gb/s SAS HBA (five of them for 6Gb/s SAS). It's also the case that the Unified Storage platform doesn't have enough bandwidth to drive more than four 10GbE ports at their full speed:
http://dtrace.org/blogs/brendan/2009/09/22/7410-hardware-update-and-analyzing-the-hypertransport/

We have a customer (internal to the university here) that does high-throughput gene sequencing. They like a server which can hold the large amounts of data, do a first-pass analysis on it, and then serve it up over the network to a compute cluster for further computation. Oracle has nothing in their product line (anymore) to meet that need. They ended up ordering an 8U chassis w/ 40x 2TB drives in it, and are willing to pay the $2k/yr retail ransom to Oracle to run Solaris (ZFS) on it, at least for the first year. Maybe OpenIndiana next year, we'll see. Bye Oracle.

Regards,
Marion
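For reference, the rough arithmetic behind Marion's HBA comparison (a sketch that ignores protocol overhead on both sides): an 8-port 3Gb/s SAS HBA offers 8 x 3 Gb/s = 24 Gb/s to the disks, or roughly three 10GbE links' worth; at 6Gb/s SAS that becomes 8 x 6 Gb/s = 48 Gb/s, roughly five 10GbE links.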
Re: [zfs-discuss] X4540 no next-gen product?
Can anyone comment on Solaris with ZFS on HP systems? Do things work reliably? When there is trouble, how many hoops does HP make you jump through (how painful is it to get a part replaced that isn't flat-out smokin')? Have you gotten bounced between vendors?

Thanks,
Chris

Erik Trimble wrote:
Talk to HP, then. They still sell Officially Supported Solaris servers and disk storage systems in more varieties than Oracle does. The StorageWorks 600 Modular Disk System may be what you're looking for (70 x 2.5" drives per enclosure, 5U, SAS/SATA/FC attachment to any server, $35k list price for 70TB). Or the StorageWorks 70 Modular Disk Array (25 x 2.5" drives, 1U, SAS attachment, $11k list price for 12.5TB).
-Erik
[zfs-discuss] X4540 no next-gen product?
While I understand everything at Oracle is top secret these days, does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun?

http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html

What do X4500 / X4540 owners use if they'd like more comparable ZFS-based storage and full Oracle support? I'm aware of Nexenta and other clone products but am specifically asking about Oracle-supported hardware. However, does anyone know if these types of vendors will be at NAB this year? I'd like to talk to a few if they are...

--
Thank you,
Chris Banal
Re: [zfs-discuss] How well does zfs mirror handle temporary disk offlines?
Erik Trimble wrote:
On Tue, 2011-01-18 at 14:51 -0500, Torrey McMahon wrote:
On 1/18/2011 2:46 PM, Philip Brown wrote:
My specific question is, how easily does ZFS handle *temporary* SAN disconnects, to one side of the mirror? What if the outage is only 60 seconds? 3 minutes? 10 minutes? An hour?

No idea how well it will reconnect the device, but we had an X4500 that would randomly boot up and one or two disks would be missing. Reboot again and one or two other disks would be missing. While we were troubleshooting this problem it happened dozens and dozens of times, and ZFS had no trouble with it as far as I could tell. It would only resilver the data that was changed while that drive was offline. We had no data loss.

Thank you,
Chris Banal
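For anyone watching this kind of event live, resilver progress shows up in the pool status (a sketch; 'tank' is a placeholder pool name):

    # show pool health, any degraded devices, and resilver progress
    zpool status -v tank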
[zfs-discuss] zpool iostat / how to tell if you're IOPS bound
What is the best way to tell if you're bound by the number of individual operations per second / random I/O? zpool iostat has an operations column, but this doesn't really tell me if my disks are saturated. Traditional iostat doesn't seem to be the greatest place to look when utilizing ZFS.

Thanks,
Chris
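One approach that may help (a sketch, not a definitive method): Solaris iostat still reports per-device utilization and service times under ZFS, and zpool iostat -v breaks the load down per vdev. Roughly, sustained %b near 100 with high actv and rising asvc_t on the member disks suggests you are IOPS-bound rather than bandwidth-bound:

    # per-device extended stats at 5-second intervals
    iostat -xn 5

    # per-vdev breakdown of operations and bandwidth
    zpool iostat -v 5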
[zfs-discuss] full backup == scrub?
Assuming no snapshots, do full backups (i.e. tar or cpio) eliminate the need for a scrub?

Thanks,
Chris
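For comparison, the scrub side of the question (a sketch; 'tank' is a placeholder pool name). Note that reading files via tar/cpio only verifies one copy of the live data, while a scrub also checks the redundant copies (mirror sides, parity) and metadata:

    # verify every block, including redundant copies
    zpool scrub tank

    # check progress and any checksum errors found
    zpool status tank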
[zfs-discuss] improve metadata performance
We have a SunFire X4500 running Solaris 10U5 which does about 5-8k NFS ops, of which about 90% are metadata. In hindsight it would have been significantly better to use a mirrored configuration, but we opted for 4 x (9+2) raidz2 at the time. We cannot take the downtime necessary to change the zpool configuration. We need to improve the metadata performance with little to no money. Does anyone have any suggestions? Is there such a thing as a Sun-supported NVRAM PCI-X card compatible with the X4500 which can be used as an L2ARC?

Thanks,
Chris
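Two no-cost knobs that might be worth testing first (a sketch only; availability depends on your Solaris 10 update, the value below is illustrative rather than a recommendation, and 'tank/export' is a placeholder dataset):

    # /etc/system: raise the cap on how much of the ARC may hold
    # metadata (example value of 4 GB; takes effect after a reboot)
    set zfs:zfs_arc_meta_limit=0x100000000

    # per-dataset: devote the ARC to metadata only, if your release
    # has the primarycache property
    zfs set primarycache=metadata tank/export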
Re: [zfs-discuss] bigger zfs arc
On Sat, Oct 3, 2009 at 11:33 AM, Richard Elling <richard.ell...@gmail.com> wrote:
On Oct 3, 2009, at 10:26 AM, Chris Banal wrote:
On Fri, Oct 2, 2009 at 10:57 PM, Richard Elling <richard.ell...@gmail.com> wrote:
c is the current size of the ARC. c will change dynamically, as memory pressure and demand change.

How is the relative greediness of c determined? Is there a way to make it more greedy on systems with lots of free memory?

AFAIK, there is no throttle on the ARC, so c will increase as the I/O demand dictates. The L2ARC has a fill throttle because those IOPS can compete with the other devices on the system.

Other than memory pressure, what would cause c to decrease? On a system that does nightly backups many times the size of physical memory and does nothing but NFS, why would we see c well below zfs_arc_max with plenty of free memory?

Thanks,
Chris
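The live values can be pulled straight from the ARC kstats to see where c sits relative to the cap (read-only, so safe on a production box):

    # current target (c), hard cap (c_max), and actual ARC size
    kstat -p zfs:0:arcstats:c zfs:0:arcstats:c_max zfs:0:arcstats:size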
Re: [zfs-discuss] bigger zfs arc
On Fri, Oct 2, 2009 at 10:57 PM, Richard Elling <richard.ell...@gmail.com> wrote:
c is the current size of the ARC. c will change dynamically, as memory pressure and demand change.

How is the relative greediness of c determined? Is there a way to make it more greedy on systems with lots of free memory? When an L2ARC is attached, does it get used if there is no memory pressure? My guess is no, for the same reason an L2ARC takes so long to fill. arc_summary.pl from the same system is

You want to cache stuff closer to where it is being used. Expect the L2ARC to contain ARC evictions.

If c is much smaller than zfs_arc_max and there is no memory pressure, can we reasonably expect that the L2ARC is not likely to be used often? Do items get evicted from the L2ARC before the L2ARC is full?

Thanks,
Chris
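Whether the L2ARC is actually seeing traffic can also be read from the kstats (read-only):

    # L2ARC size plus hit/miss counters
    kstat -p zfs:0:arcstats:l2_size zfs:0:arcstats:l2_hits zfs:0:arcstats:l2_misses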
[zfs-discuss] bigger zfs arc
We have a production server which does nothing but NFS from ZFS. This particular machine has plenty of free memory. Blogs and documentation state that ZFS will use as much memory as is necessary, but how is "necessary" calculated? If the memory is free and unused, would it not be beneficial to increase the relative "necessary" size calculation of the ARC, even if the extra cache isn't likely to get hit often? When an L2ARC is attached, does it get used if there is no memory pressure?

Thanks,
Chris
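For reference, the usual ceiling is the zfs_arc_max tunable, set on Solaris 10 in /etc/system (the value below is purely illustrative; note this raises the cap rather than making the ARC grow more eagerly):

    # /etc/system: allow the ARC to grow to 16 GB (example value; reboot required)
    set zfs:zfs_arc_max=0x400000000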
[zfs-discuss] NLM_DENIED_NOLOCKS Solaris 10u5 X4500
This was previously posted to the sun-managers mailing list, but the only reply I received recommended I post here as well. We have a production Solaris 10u5 / ZFS X4500 file server which is reporting NLM_DENIED_NOLOCKS immediately for any NFS locking request. The lockd does not appear to be busy, so is it possible we have hit some sort of limit on the number of files that can be locked? Are there any items to check before restarting lockd / statd? Restarting them appears to have at least temporarily cleared up the issue.

Thanks,
Chris
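On Solaris 10 both daemons are under SMF control, so a restart looks like this (a sketch; it may be worth capturing svcs -x output first, before clearing any diagnostic state):

    # inspect current state and recent failures
    svcs -x svc:/network/nfs/nlockmgr svc:/network/nfs/status

    # restart lockd and statd
    svcadm restart svc:/network/nfs/nlockmgr
    svcadm restart svc:/network/nfs/status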
[zfs-discuss] Directory size value
It appears as though ZFS reports the size of a directory to be one byte per file. Traditional file systems such as ufs or ext3 report the actual size of the data needed to store the directory. This causes some trouble with the default behavior of some NFS clients (Linux), which decide to use a readdirplus call when directory contents are small vs. a readdir call when the contents are large. We've found this particularly troublesome with Maildir-style mail folders. The speedup from not using readdirplus is a factor of 100 in this particular situation. While we have a workaround, it would seem that this non-standard behavior might cause trouble for others and in other areas. Are there any suggestions for dealing with this difference and/or why ZFS does not represent its directory sizes in a more traditional manner?

From the Linux kernel source:

    ./fs/nfs/inode.c:#define NFS_LIMIT_READDIRPLUS (8*PAGE_SIZE)

i.e. ZFS:

    zfshost:~/testdir ls -1 | wc -l
    330
    zfshost:~/testdir stat .
      File: `.'
      Size: 332        Blocks: 486        IO Block: 32768  directory
    Device: 29h/41d    Inode: 540058      Links: 2
    Access: (0775/drwxrwxr-x)  Uid: ( 2891/ banal)  Gid: ( 101/ film)
    Access: 2008-11-05 18:40:16.0 -0800
    Modify: 2009-09-01 16:09:52.782674099 -0700
    Change: 2009-09-01 16:09:52.782674099 -0700

i.e. ext3:

    ext3host:~/testdir ls -1 | wc -l
    330
    ext3host:~/testdir stat .
      File: `.'
      Size: 36864      Blocks: 72         IO Block: 4096   directory
    Device: 807h/2055d Inode: 23887981    Links: 2
    Access: (0775/drwxrwxr-x)  Uid: ( 2891/ banal)  Gid: ( 101/ film)
    Access: 2009-09-21 13:44:00.0 -0700
    Modify: 2009-09-21 13:44:31.0 -0700
    Change: 2009-09-21 13:44:31.0 -0700

Thanks,
Chris
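If the clients are Linux and recent enough, one possible workaround (an assumption about your environment, not a tested fix) is to disable readdirplus per mount with the nordirplus option:

    # NFSv3 mount with readdirplus disabled (host and paths are placeholders)
    mount -t nfs -o nordirplus zfshost:/export/testdir /mnt/testdir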
[zfs-discuss] snv_XXX features / fixes - Solaris 10 version
Since most ZFS features / fixes are reported in snv_XXX terms, is there some sort of way to figure out which versions of Solaris 10 have the equivalent features / fixes?

Thanks,
Chris
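One coarse cross-check (a partial proxy only, since not every fix bumps the on-disk version): zpool upgrade -v lists the ZFS pool versions the running kernel supports, and those version numbers can be matched against the build in which each feature first appeared:

    # list pool versions (and thus feature sets) this system supports
    zpool upgrade -v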