Re: [zfs-discuss] X4540 no next-gen product?
On Fri, Apr 8 at 22:03, Erik Trimble wrote: I want my J4000's back, too. And, I still want something like HP's MSA 70 (25 x 2.5" drive JBOD in a 2U form factor) Just noticed that SuperMicro is now selling a 4U 72-bay 2.5" 6Gbit/s SAS chassis, the SC417. Unclear from the documentation how many 6Gbit/s SAS lanes are connected for that many devices though. Maybe that plus a support contract from Sun would be a worthy replacement, though you definitely won't have a single vendor to contact for service issues. --eric -- Eric D. Mudama edmud...@bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 8 Apr 2011, at 19:43, Marion Hakanson hakan...@ohsu.edu wrote: which peak at around 7 Gb/s down a 10G link (in reality I don't need that much because it is all about the IOPS for me). That is with just twelve 15k disks. Depending on usage, I disagree with your bandwidth and latency figures above. An X4540, or an X4170 with J4000 JBOD's, has more bandwidth to its disks than 10Gbit ethernet. Actually I think our figures more or less agree. 12 disks = 7 mbits 48 disks = 4x7mbits What is actually required in practice depends on a lot of factors Julian ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Julian King Actually I think our figures more or less agree. 12 disks = 7 mbits 48 disks = 4x7mbits I know that sounds like terrible performance to me. Any time I benchmark disks, a cheap generic SATA can easily sustain 500Mbit, and any decent drive can easily sustain 1Gbit. Of course it's lower when there's significant random seeking happening... But if you have a data model which is able to stream sequentially, the above is certainly true. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
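For anyone wanting to reproduce that kind of per-drive number, the usual quick check is a raw sequential read with dd against an idle disk. The device path below is a placeholder; substitute one of your own data disks, and avoid a disk that is busy in a pool if you want meaningful numbers:

    # read 1 GiB straight off the raw device; divide by the elapsed time ptime reports
    # (Solaris dd doesn't print a throughput figure itself)
    ptime dd if=/dev/rdsk/c0t1d0s0 of=/dev/null bs=1024k count=1024

Sequential figures like this are the best case; the random-seek workloads discussed below will come in far lower.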
Re: [zfs-discuss] X4540 no next-gen product?
On 04/09/2011 01:41 PM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Julian King Actually I think our figures more or less agree. 12 disks = 7 mbits 48 disks = 4x7mbits I know that sounds like terrible performance to me. Any time I benchmark disks, a cheap generic SATA can easily sustain 500Mbit, and any decent drive can easily sustain 1Gbit. I think he mistyped and meant 7gbit/s. Of course it's lower when there's significant random seeking happening... But if you have a data model which is able to stream sequentially, the above is certainly true. Unfortunately, this is exactly my scenario, where I want to stream large volumes of data in many concurrent threads over large datasets which have no hope of fitting in RAM or L2ARC and with generally very little locality. -- Saso ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 9 Apr 2011, at 12:59, Sašo Kiselkov skiselkov...@gmail.com wrote: On 04/09/2011 01:41 PM, Edward Ned Harvey wrote: From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss- boun...@opensolaris.org] On Behalf Of Julian King Actually I think our figures more or less agree. 12 disks = 7 mbits 48 disks = 4x7mbits I know that sounds like terrible performance to me. Any time I benchmark disks, a cheap generic SATA can easily sustain 500Mbit, and any decent drive can easily sustain 1Gbit. I think he mistyped and meant 7gbit/s. Oops. Yes I did! Of course it's lower when there's significant random seeking happening... But if you have a data model which is able to stream sequentially, the above is certainly true. Unfortunately, this is exactly my scenario, where I want to stream large volumes of data in many concurrent threads over large datasets which have no hope of fitting in RAM or L2ARC and with generally very little locality. Clearly one of those situations where any setup will struggle. -- Saso Julian ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 4/7/2011 10:25 AM, Chris Banal wrote: While I understand everything at Oracle is top secret these days. Does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/ 8/11 06:30 PM, Erik Trimble wrote: On 4/7/2011 10:25 AM, Chris Banal wrote: While I understand everything at Oracle is top secret these days. Does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances. We have had to take the retrograde step of adding more, smaller servers (like the ones we consolidated on the X4540s!). -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 4/8/2011 12:37 AM, Ian Collins wrote: On 04/ 8/11 06:30 PM, Erik Trimble wrote: On 4/7/2011 10:25 AM, Chris Banal wrote: While I understand everything at Oracle is top secret these days. Does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances. We have had to take the retrograde step of adding more, smaller servers (like the ones we consolidated on the X4540s!). Sorry, I read the question differently, as in I have X4500/X4540 now, and want more of them, but Oracle doesn't sell them anymore, what can I buy?. The 7000-series (now: Unified Storage) *are* storage appliances. If you have an X4540/X4500 (and some cash burning a hole in your pocket), Oracle will be happy to sell you a support license (which should include later versions of ZFS software). But, don't quote me on that - talk to a Sales Rep if you want a Quote. wink -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 2:37 AM, Ian Collins i...@ianshome.com wrote: On 04/ 8/11 06:30 PM, Erik Trimble wrote: On 4/7/2011 10:25 AM, Chris Banal wrote: While I understand everything at Oracle is top secret these days. Does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances. Can you elaborate briefly on what exactly the problem is? I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? Guess I'm slow. :-) Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/ 8/11 08:08 PM, Mark Sandrock wrote: On Apr 8, 2011, at 2:37 AM, Ian Collinsi...@ianshome.com wrote: On 04/ 8/11 06:30 PM, Erik Trimble wrote: On 4/7/2011 10:25 AM, Chris Banal wrote: While I understand everything at Oracle is top secret these days. Does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances. Can you elaborate briefly on what exactly the problem is? I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? Guess I'm slow. :-) No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a fail over box (both data and applications) for the other. They replaced several application servers backed by a SAN for a fraction the price of a new SAN. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 3:29 AM, Ian Collins i...@ianshome.com wrote: On 04/ 8/11 08:08 PM, Mark Sandrock wrote: On Apr 8, 2011, at 2:37 AM, Ian Collinsi...@ianshome.com wrote: On 04/ 8/11 06:30 PM, Erik Trimble wrote: On 4/7/2011 10:25 AM, Chris Banal wrote: While I understand everything at Oracle is top secret these days. Does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances. Can you elaborate briefly on what exactly the problem is? I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? Guess I'm slow. :-) No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a fail over box (both data and applications) for the other. You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Anything's a fraction of the price of a SAN, isn't it? :-) Mark They replaced several application servers backed by a SAN for a fraction the price of a new SAN. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/ 8/11 09:49 PM, Mark Sandrock wrote: On Apr 8, 2011, at 3:29 AM, Ian Collinsi...@ianshome.com wrote: On 04/ 8/11 08:08 PM, Mark Sandrock wrote: On Apr 8, 2011, at 2:37 AM, Ian Collinsi...@ianshome.com wrote: On 04/ 8/11 06:30 PM, Erik Trimble wrote: The move seems to be to the Unified Storage (aka ZFS Storage) line, which is a successor to the 7000-series OpenStorage stuff. http://www.oracle.com/us/products/servers-storage/storage/unified-storage/index.html Which is not a lot of use to those of us who use X4540s for what they were intended: storage appliances. Can you elaborate briefly on what exactly the problem is? I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? Guess I'm slow. :-) No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a fail over box (both data and applications) for the other. You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more x4540s! -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/ 8/11 01:14 PM, Ian Collins wrote: You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more x4540s! Hi, same here, it's sad news that Oracle decided to stop the x4540 production line. Before, ZFS geeks had a choice - buy the 7000 series if you want quick out-of-the-box storage with a nice GUI, or build your own storage with the x4540 line, which by the way has a brilliant engineering design; that choice is gone now. Regards, ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, 8 Apr 2011, Mark Sandrock wrote: And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it The rather extreme loss of I/O performance (at least several orders of magnitude) to the application, along with increased I/O latency, seems like quite a drawback. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, 8 Apr 2011, Erik Trimble wrote: Sorry, I read the question differently, as in I have X4500/X4540 now, and want more of them, but Oracle doesn't sell them anymore, what can I buy?. The 7000-series (now: Unified Storage) *are* storage appliances. They may be storage appliances, but the user can not put their own software on them. This limits the appliance to only the features that Oracle decides to put on it. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 08/04/2011 14:59, Bob Friesenhahn wrote: On Fri, 8 Apr 2011, Erik Trimble wrote: Sorry, I read the question differently, as in I have X4500/X4540 now, and want more of them, but Oracle doesn't sell them anymore, what can I buy?. The 7000-series (now: Unified Storage) *are* storage appliances. They may be storage appliances, but the user can not put their own software on them. This limits the appliance to only the features that Oracle decides to put on it. Isn't that the very definition of an Appliance ? -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, April 8, 2011 10:06, Darren J Moffat wrote: They may be storage appliances, but the user can not put their own software on them. This limits the appliance to only the features that Oracle decides to put on it. Isn't that the very definition of an Appliance ? Yes, but the OP wasn't looking for an appliance, he was looking for a (general) server that could hold lots of disks. The X4540 was well-designed and suited their need for storage and CPU (as it did Greenplum as well); it was fairly unique as a design. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 7:50 AM, Evaldas Auryla evaldas.aur...@edqm.eu wrote: On 04/ 8/11 01:14 PM, Ian Collins wrote: You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more x4540s! Hi, same here, it's a sad news that Oracle decided to stop x4540s production line. Before, ZFS geeks had choice - buy 7000 series if you want quick out of the box storage with nice GUI, or build your own storage with x4540 line, which by the way has brilliant engineering design, the choice is gone now. Okay, so what is the great advantage of an X4540 versus X86 server plus disk array(s)? Mark ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, Apr 08, 2011 at 08:29:31PM +1200, Ian Collins wrote: On 04/ 8/11 08:08 PM, Mark Sandrock wrote: ... I don't follow? What else would an X4540 or a 7xxx box be used for, other than a storage appliance? ... No, I just wasn't clear - we use ours as storage/application servers. They run Samba, Apache and various other applications and P2V zones that access the large pool of data. Each also acts as a fail over box (both data and applications) for the other. Same thing here + several zones (source code repositories, documentation, even a real samba server to avoid the MS crap, install server, shared installs (i.e. relocatable packages shared via NFS e.g. as /local/usr ...)). So yes, 7xxx is a no-go for us as well. If there are no X45xx, we'll find alternatives from other companies ... Guess I'm slow. :-) May be - flexibility/dependencies are some of the keywords ;-) Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/08/2011 05:20 PM, Mark Sandrock wrote: On Apr 8, 2011, at 7:50 AM, Evaldas Auryla evaldas.aur...@edqm.eu wrote: On 04/ 8/11 01:14 PM, Ian Collins wrote: You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more x4540s! Hi, same here, it's a sad news that Oracle decided to stop x4540s production line. Before, ZFS geeks had choice - buy 7000 series if you want quick out of the box storage with nice GUI, or build your own storage with x4540 line, which by the way has brilliant engineering design, the choice is gone now. Okay, so what is the great advantage of an X4540 versus X86 server plus disk array(s)? Mark Several: 1) Density: The X4540 has far greater density than 1U server + Sun's J4200 or J4400 storage arrays. The X4540 did 12 disks / 1RU, whereas a 1U + 2xJ4400 only manages ~5.3 disks / 1RU. 2) Number of components involved: server + disk enclosure means you have more PSUs which can die on you, more cabling to accidentally disconnect and generally more hassle with installation. 3) Spare management: With the X4540 you only have to have one kind of spare component: the server. With servers + enclosures, you might need to keep multiple. I agree that besides 1), both 2) and 3) are a relatively trivial problem to solve. Of course, server + enclosure builds do have their place, such as when you might need to scale, but even then you could just hook them up to an X4540 (or purchase a new one - I never quite understood why the storage-enclosure-only variant of the X4540 case was more expensive than an identical server). In short, I think the X4540 was an elegant and powerful system that definitely had its market, especially in my area of work (digital video processing - heavy on latency, throughput and IOPS - an area, where the 7000-series with its over-the-network access would just be a totally useless brick). -- Saso ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 08/04/2011 17:47, Sašo Kiselkov wrote: In short, I think the X4540 was an elegant and powerful system that definitely had its market, especially in my area of work (digital video processing - heavy on latency, throughput and IOPS - an area, where the 7000-series with its over-the-network access would just be a totally useless brick). As an engineer I'm curious: have you actually tried a suitably sized S7000, or are you assuming it won't perform suitably for you? -- Darren J Moffat ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/08/2011 06:59 PM, Darren J Moffat wrote: On 08/04/2011 17:47, Sašo Kiselkov wrote: In short, I think the X4540 was an elegant and powerful system that definitely had its market, especially in my area of work (digital video processing - heavy on latency, throughput and IOPS - an area, where the 7000-series with its over-the-network access would just be a totally useless brick). As an engineer I'm curious have you actually tried a suitably sized S7000 or are you assuming it won't perform suitably for you ? No, I haven't tried a S7000, but I've tried other kinds of network storage and from a design perspective, for my applications, it doesn't even make a single bit of sense. I'm talking about high-volume real-time video streaming, where you stream 500-1000 (x 8Mbit/s) live streams from a machine over UDP. Having to go over the network to fetch the data from a different machine is kind of like building a proxy which doesn't really do anything - if the data is available from a different machine over the network, then why the heck should I just put another machine in the processing path? For my applications, I need a machine with as few processing components between the disks and network as possible, to maximize throughput, maximize IOPS and minimize latency and jitter. Cheers, -- Saso ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
No, I haven't tried a S7000, but I've tried other kinds of network storage and from a design perspective, for my applications, it doesn't even make a single bit of sense. I'm talking about high-volume real-time video streaming, where you stream 500-1000 (x 8Mbit/s) live streams from a machine over UDP. Having to go over the network to fetch the data from a different machine is kind of like building a proxy which doesn't really do anything - if the data is available from a different machine over the network, then why the heck should I just put another machine in the processing path? For my applications, I need a machine with as few processing components between the disks and network as possible, to maximize throughput, maximize IOPS and minimize latency and jitter. I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a 10G connection between your storage and your front ends and you'll have the bandwidth[1]. Actually if you really were hitting 1000x8Mbits I'd put 2, but that is just a question of scale. In a different situation I have boxes which peak at around 7 Gb/s down a 10G link (in reality I don't need that much because it is all about the IOPS for me). That is with just twelve 15k disks. Your situation appears to be pretty ideal for storage hardware, so perfectly achievable from an appliance. I can't speak for the S7000 range. I ignored that entire product line because when I asked about it the markup was insane compared to just buying X4500/X4540s. The price for Oracle kit isn't remotely tenable, so the death of the X45xx range is a moot point for me anyway, since I couldn't afford it. [1] Just in case, you also shouldn't be adding any particularly significant latency either. Jitter, maybe, depending on the specifics of the streams involved. Saso Julian -- Julian King Computer Officer, University of Cambridge, Unix Support ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
jp...@cam.ac.uk said: I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a 10G connection between your storage and your front ends and you'll have the bandwidth[1]. Actually if you really were hitting 1000x8Mbits I'd put 2, but that is just a question of scale. In a different situation I have boxes which peak at around 7 Gb/s down a 10G link (in reality I don't need that much because it is all about the IOPS for me). That is with just twelve 15k disks. Your situation appears to be pretty ideal for storage hardware, so perfectly achievable from an appliance. Depending on usage, I disagree with your bandwidth and latency figures above. An X4540, or an X4170 with J4000 JBOD's, has more bandwidth to its disks than 10Gbit ethernet. You would need three 10GbE interfaces between your CPU and the storage appliance to equal the bandwidth of a single 8-port 3Gb/s SAS HBA (five of them for 6Gb/s SAS). It's also the case that the Unified Storage platform doesn't have enough bandwidth to drive more than four 10GbE ports at their full speed: http://dtrace.org/blogs/brendan/2009/09/22/7410-hardware-update-and-analyzing-the-hypertransport/ We have a customer (internal to the university here) that does high throughput gene sequencing. They like a server which can hold the large amounts of data, do a first pass analysis on it, and then serve it up over the network to a compute cluster for further computation. Oracle has nothing in their product line (anymore) to meet that need. They ended up ordering an 8U chassis w/40x 2TB drives in it, and are willing to pay the $2k/yr retail ransom to Oracle to run Solaris (ZFS) on it, at least for the first year. Maybe OpenIndiana next year, we'll see. Bye Oracle Regards, Marion ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
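The arithmetic behind the HBA-versus-Ethernet comparison is easy to redo as a sanity check. These are raw link rates, before SAS and Ethernet protocol overhead; the rounding to three and five 10GbE links follows from the roughly 9 Gbit/s you can realistically push through each interface:

    echo '8 * 3' | bc      # 8-port 3Gb/s SAS HBA: 24 Gbit/s on the disk side
    echo '8 * 6' | bc      # same HBA at 6Gb/s SAS: 48 Gbit/s
    echo '24 / 10' | bc -l # vs. one 10GbE port: ~2.4 raw, i.e. about three usable links
                           # (roughly five for the 6Gb/s case)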
Re: [zfs-discuss] X4540 no next-gen product?
Sounds like many of us are in a similar situation. To clarify my original post. The goal here was to continue with what was a cost effective solution to some of our Storage requirements. I'm looking for hardware that wouldn't cause me to get the run around from the Oracle support folks, finger pointing between vendors, or have lots of grief from an untested combination of parts. If this isn't possible we'll certainly find a another solution. I already know it won't be the 7000 series. Thank you, Chris Banal Marion Hakanson wrote: jp...@cam.ac.uk said: I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a 10G connection between your storage and your front ends and you'll have the bandwidth[1]. Actually if you really were hitting 1000x8Mbits I'd put 2, but that is just a question of scale. In a different situation I have boxes which peak at around 7 Gb/s down a 10G link (in reality I don't need that much because it is all about the IOPS for me). That is with just twelve 15k disks. Your situation appears to be pretty ideal for storage hardware, so perfectly achievable from an appliance. Depending on usage, I disagree with your bandwidth and latency figures above. An X4540, or an X4170 with J4000 JBOD's, has more bandwidth to its disks than 10Gbit ethernet. You would need three 10GbE interfaces between your CPU and the storage appliance to equal the bandwidth of a single 8-port 3Gb/s SAS HBA (five of them for 6Gb/s SAS). It's also the case that the Unified Storage platform doesn't have enough bandwidth to drive more than four 10GbE ports at their full speed: http://dtrace.org/blogs/brendan/2009/09/22/7410-hardware-update-and-analyzing-t he-hypertransport/ We have a customer (internal to the university here) that does high throughput gene sequencing. They like a server which can hold the large amounts of data, do a first pass analysis on it, and then serve it up over the network to a compute cluster for further computation. Oracle has nothing in their product line (anymore) to meet that need. They ended up ordering an 8U chassis w/40x 2TB drives in it, and are willing to pay the $2k/yr retail ransom to Oracle to run Solaris (ZFS) on it, at least for the first year. Maybe OpenIndiana next year, we'll see. Bye Oracle Regards, Marion ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 4/8/2011 1:58 PM, Chris Banal wrote: Sounds like many of us are in a similar situation. To clarify my original post. The goal here was to continue with what was a cost effective solution to some of our Storage requirements. I'm looking for hardware that wouldn't cause me to get the run around from the Oracle support folks, finger pointing between vendors, or have lots of grief from an untested combination of parts. If this isn't possible we'll certainly find a another solution. I already know it won't be the 7000 series. Thank you, Chris Banal Talk to HP then. They still sell Officially Supported Solaris servers and disk storage systems in more varieties than Oracle does. The StorageWorks 600 Modular Disk System may be what you're looking for (70 x 2.5 drives per enclosure, 5U, SAS/SATA/FC attachment to any server, $35k list price for 70TB). Or the StorageWorks 70 Modular Disk Array (25 x 2.5 drives, 1U, SAS attachment, $11k list price for 12.5TB) -Erik Marion Hakanson wrote: jp...@cam.ac.uk said: I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a 10G connection between your storage and your front ends and you'll have the bandwidth[1]. Actually if you really were hitting 1000x8Mbits I'd put 2, but that is just a question of scale. In a different situation I have boxes which peak at around 7 Gb/s down a 10G link (in reality I don't need that much because it is all about the IOPS for me). That is with just twelve 15k disks. Your situation appears to be pretty ideal for storage hardware, so perfectly achievable from an appliance. Depending on usage, I disagree with your bandwidth and latency figures above. An X4540, or an X4170 with J4000 JBOD's, has more bandwidth to its disks than 10Gbit ethernet. You would need three 10GbE interfaces between your CPU and the storage appliance to equal the bandwidth of a single 8-port 3Gb/s SAS HBA (five of them for 6Gb/s SAS). It's also the case that the Unified Storage platform doesn't have enough bandwidth to drive more than four 10GbE ports at their full speed: http://dtrace.org/blogs/brendan/2009/09/22/7410-hardware-update-and-analyzing-t he-hypertransport/ We have a customer (internal to the university here) that does high throughput gene sequencing. They like a server which can hold the large amounts of data, do a first pass analysis on it, and then serve it up over the network to a compute cluster for further computation. Oracle has nothing in their product line (anymore) to meet that need. They ended up ordering an 8U chassis w/40x 2TB drives in it, and are willing to pay the $2k/yr retail ransom to Oracle to run Solaris (ZFS) on it, at least for the first year. Maybe OpenIndiana next year, we'll see. Bye Oracle Regards, Marion ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, 8 Apr 2011, J.P. King wrote: I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a But memory is much faster than either. In most situations the data would already be buffered in the X4540's memory so that it is instantly available. Bob -- Bob Friesenhahn bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
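Bob's point about the data already sitting in memory is easy to verify on a live box: the ARC counters are exported through kstat. A rough look (field names are from the standard zfs:0:arcstats kstat; exact fields vary a little between releases):

    # ratio of reads served from the ARC vs. reads that had to go to disk
    kstat -p zfs:0:arcstats:hits zfs:0:arcstats:misses
    # current ARC size and its configured ceiling
    kstat -p zfs:0:arcstats:size zfs:0:arcstats:c_max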
Re: [zfs-discuss] X4540 no next-gen product?
On 4/8/2011 4:50 PM, Bob Friesenhahn wrote: On Fri, 8 Apr 2011, J.P. King wrote: I can't speak for this particular situation or solution, but I think in principle you are wrong. Networks are fast. Hard drives are slow. Put a But memory is much faster than either. In most situations the data would already be buffered in the X4540's memory so that it is instantly available. Bob Certainly, as a low-end product, the X4540 (and X4500) offered unmatched flexibility and performance per dollar. It *is* sad to see them go. But, given Oracle's strategic direction, is anyone really surprised? PS - Nexenta, I think you've got a product position opportunity here... PPS - about the closest thing Oracle makes to the X4540 now is the X4270 M2 in the 2.5" drive config - 24 x 2.5" drives, 2 x Westmere-EP CPUs, in a 2U rack cabinet, somewhere around $25k (list) for the 24x500GB SATA model with (2) 6-core Westmeres + 16GB RAM. -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
Can anyone comment on Solaris with zfs on HP systems? Do things work reliably? When there is trouble how many hoops does HP make you jump through (how painful is it to get a part replaced that isn't flat out smokin')? Have you gotten bounced between vendors? Thanks, Chris Erik Trimble wrote: Talk to HP then. They still sell Officially Supported Solaris servers and disk storage systems in more varieties than Oracle does. The StorageWorks 600 Modular Disk System may be what you're looking for (70 x 2.5 drives per enclosure, 5U, SAS/SATA/FC attachment to any server, $35k list price for 70TB). Or the StorageWorks 70 Modular Disk Array (25 x 2.5 drives, 1U, SAS attachment, $11k list price for 12.5TB) -Erik ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/ 9/11 03:20 AM, Mark Sandrock wrote: On Apr 8, 2011, at 7:50 AM, Evaldas Aurylaevaldas.aur...@edqm.eu wrote: On 04/ 8/11 01:14 PM, Ian Collins wrote: You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more x4540s! Hi, same here, it's a sad news that Oracle decided to stop x4540s production line. Before, ZFS geeks had choice - buy 7000 series if you want quick out of the box storage with nice GUI, or build your own storage with x4540 line, which by the way has brilliant engineering design, the choice is gone now. Okay, so what is the great advantage of an X4540 versus X86 server plus disk array(s)? One less x86 box (even more of an issue now we have to mortgage the children for support), a lot less $. Not to mention an existing infrastructure built using X4540s and me looking a fool explaining to the client they can't get any more so the systems we have spent two years building up are a dead end. One size does not fit all, choice is good for business. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
Sounds like many of us are in a similar situation. To clarify my original post. The goal here was to continue with what was a cost effective solution to some of our Storage requirements. I'm looking for hardware that wouldn't cause me to get the run around from the Oracle support folks, finger pointing between vendors, or have lots of grief from an untested combination of parts. If this isn't possible we'll certainly find a another solution. I already know it won't be the 7000 series. Thank you, Chris Banal For us the unfortunate answer to the situation was to abandon Oracle/Sun and ZFS entirely. Despite evaluating and considering ZFS on other platforms it just wasn't worth the trouble; we need storage today. While we will likely expand our existing fleet of X4540's as much as possible with JBOD that will be the end of that solution and our use of ZFS. Ultimately a large storage vendor (EMC) came to the table with a solution similar to the X4540 at a $/GB and $/iop level that no other vendor could even get close to. We will revisit this decision later depending on the progress of Illumos and others but for now things are still too uncertain to make the financial commitment. - Adam ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 9:39 PM, Ian Collins i...@ianshome.com wrote: On 04/ 9/11 03:20 AM, Mark Sandrock wrote: On Apr 8, 2011, at 7:50 AM, Evaldas Aurylaevaldas.aur...@edqm.eu wrote: On 04/ 8/11 01:14 PM, Ian Collins wrote: You have built-in storage failover with an AR cluster; and they do NFS, CIFS, iSCSI, HTTP and WebDav out of the box. And you have fairly unlimited options for application servers, once they are decoupled from the storage servers. It doesn't seem like much of a drawback -- although it may be for some smaller sites. I see AR clusters going in in local high schools and small universities. Which is all fine and dandy if you have a green field, or are willing to re-architect your systems. We just wanted to add a couple more x4540s! Hi, same here, it's a sad news that Oracle decided to stop x4540s production line. Before, ZFS geeks had choice - buy 7000 series if you want quick out of the box storage with nice GUI, or build your own storage with x4540 line, which by the way has brilliant engineering design, the choice is gone now. Okay, so what is the great advantage of an X4540 versus X86 server plus disk array(s)? One less x86 box (even more of an issue now we have to mortgage the children for support), a lot less $. Not to mention an existing infrastructure built using X4540s and me looking a fool explaining to the client they can't get any more so the systems we have spent two years building up are a dead end. One size does not fit all, choice is good for business. I'm not arguing. If it were up to me, we'd still be selling those boxes. Mark -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 04/ 9/11 03:53 PM, Mark Sandrock wrote: I'm not arguing. If it were up to me, we'd still be selling those boxes. Maybe you could whisper in the right ear? :) -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Apr 8, 2011, at 11:19 PM, Ian Collins i...@ianshome.com wrote: On 04/ 9/11 03:53 PM, Mark Sandrock wrote: I'm not arguing. If it were up to me, we'd still be selling those boxes. Maybe you could whisper in the right ear? I wish. I'd have a long list if I could do that. Mark :) -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On 4/8/2011 9:19 PM, Ian Collins wrote: On 04/ 9/11 03:53 PM, Mark Sandrock wrote: I'm not arguing. If it were up to me, we'd still be selling those boxes. Maybe you could whisper in the right ear? :) Three little words are all that Oracle Product Managers hear: "Business case justification" wry smile I want my J4000's back, too. And, I still want something like HP's MSA 70 (25 x 2.5" drive JBOD in a 2U form factor) -- Erik Trimble Java System Support Mailstop: usca22-123 Phone: x17195 Santa Clara, CA Timezone: US/Pacific (GMT-0800) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 no next-gen product?
On Fri, Apr 8 at 18:08, Chris Banal wrote: Can anyone comment on Solaris with zfs on HP systems? Do things work reliably? When there is trouble how many hoops does HP make you jump through (how painful is it to get a part replaced that isn't flat out smokin')? Have you gotten bounced between vendors? When I was choosing between HP and Dell about two years ago, the HP RAID adapter wasn't supported out-of-the-box by solaris, while the Dell T410/610/710 systems were using the Dell SAS-6i/R, which is a rebranded LSI 1068i-R adapter. I believe Dell's H200 is basically an LSI 9211-8i, which also works well. I can't comment on HP's support, I have no experience with it. We now self-support our software (OpenIndiana b148) --eric -- Eric D. Mudama edmud...@bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
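One way to avoid the out-of-the-box surprise Eric describes is to check, on a running Solaris or OpenIndiana box, which driver has actually bound to the HBA before standardizing on the hardware. A rough check; the driver names here are the common LSI ones (mpt for the 1068 family, mpt_sas for the 2008 family), so adjust for your adapter:

    prtconf -D | grep -i mpt          # device nodes bound to the mpt / mpt_sas drivers
    grep -w mpt /etc/driver_aliases   # PCI IDs the bundled driver claims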
[zfs-discuss] X4540 no next-gen product?
While I understand everything at Oracle is top secret these days, does anyone have any insight into a next-gen X4500 / X4540? Does some other Oracle / Sun partner make a comparable system that is fully supported by Oracle / Sun? http://www.oracle.com/us/products/servers-storage/servers/previous-products/index.html What do X4500 / X4540 owners use if they'd like more comparable zfs based storage and full Oracle support? I'm aware of Nexenta and other cloned products but am specifically asking about Oracle supported hardware. However, does anyone know if these type of vendors will be at NAB this year? I'd like to talk to a few if they are... -- Thank you, Chris Banal ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 RIP
On Mon, Nov 08, 2010 at 11:51:02PM -0800, matthew patton wrote: I have this with 36 2TB drives (and 2 separate boot drives). http://www.colfax-intl.com/jlrid/SpotLight_more_Acc.asp?L=134&S=58&B=2267 That's just a Supermicro SC847. http://www.supermicro.com/products/chassis/4U/?chs=847 Stay away from the 24 port expander backplanes. I've gone thru several and they still don't work right - timeout and dropped drives under load. The 12-port works just fine connected to a variety of controllers. If you insist on the 24-port expander backplane, use a non-expander equipped LSI controller to drive it. What do you mean by non-expander equipped LSI controller? I got fed up with the 24-port expander board and went with -A1 (all independent) and that's worked much more reliably. Ray ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 RIP
On Nov 9, 2010, at 12:24 PM, Maurice Volaski wrote: http://www.supermicro.com/products/chassis/4U/?chs=847 Stay away from the 24 port expander backplanes. I've gone thru several and they still don't work right - timeout and dropped drives under load. The 12-port works just fine connected to a variety of controllers. If you insist on the 24-port expander backplane, use a non-expander equipped LSI controller to drive it. I was wondering if you can clarify. Isn't the case that all 24-port backplane utilize expander chips directly on the backplane to support their 24 ports or are they utilized only when something else, such as another 12-port backplane, is connected to one of the cascade ports in the back? I think he is referring to the different flavors of the 847, namely the one that uses expanders (E1, E2, E16, E26) vs. the one that does not (the 847A). This page about a storage server build does a very good job of detailing all the different versions of the 847: http://www.natecarlson.com/2010/05/07/review-supermicros-sc847a-4u-chassis-with-36-drive-bays/ --Ware ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] X4540 RIP
Oracle have deleted the best ZFS platform I know, the X4540. Does anyone know of an equivalent system? None of the current Oracle/Sun offerings come close. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 RIP
I have this with 36 2TB drives (and 2 separate boot drives). http://www.colfax-intl.com/jlrid/SpotLight_more_Acc.asp?L=134&S=58&B=2267 It's not exactly the same (it has cons/pros), but it is definitely less expensive. I'm running b147 on it with an LSI controller. -Moazam On Mon, Nov 8, 2010 at 7:22 PM, Ian Collins i...@ianshome.com wrote: Oracle have deleted the best ZFS platform I know, the X4540. Does anyone know of an equivalent system? None of the current Oracle/Sun offerings come close. -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 RIP
I have this with 36 2TB drives (and 2 separate boot drives). http://www.colfax-intl.com/jlrid/SpotLight_more_Acc.asp?L=134&S=58&B=2267 That's just a Supermicro SC847. http://www.supermicro.com/products/chassis/4U/?chs=847 Stay away from the 24 port expander backplanes. I've gone thru several and they still don't work right - timeout and dropped drives under load. The 12-port works just fine connected to a variety of controllers. If you insist on the 24-port expander backplane, use a non-expander equipped LSI controller to drive it. I got fed up with the 24-port expander board and went with -A1 (all independent) and that's worked much more reliably. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner jel+...@cs.uni-magdeburg.de wrote: On Sat, Dec 12, 2009 at 04:23:21PM +, Andrey Kuzmin wrote: As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-) Hmm - good point. What I'm trying to accomplish: Actually our current prototype thumper setup is: root pool (1x 2-way mirror SATA) hotspare (2x SATA shared) pool1 (12x 2-way mirror SATA) ~25% used user homes pool2 (10x 2-way mirror SATA) ~25% used mm files, archives, ISOs So pool2 is not really a problem - delivers about 600MB/s uncached, about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8GB iso) and is not contineously stressed. However sync write is ~ 200 MB/s or 20 MB/s and mirror, only. Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice usually via NFS and a litle bit via samba - a lot of more or less small files, probably widely spread over the platters. E.g. checkin' out a project from a svn|* repository into a home takes hours. Also having its workspace on NFS isn't fun (compared to linux xfs driven local soft 2-way mirror). Flash-based read cache should help here by minimizing (metadata) read latency, and flash-based log would bring down write latency. The only drawback of using single F20 is that you're trying to minimize both with the same device. So, seems to be a really interesting thing and I expect at least wrt. user homes a real improvement, no matter, how the final configuration will look like. Maybe the experts at the source are able to do some 4x SSD vs. 1xF20 benchmarks? I guess at least if they turn out to be good enough, it wouldn't hurt ;-) Would be interesting indeed. Regards, Andrey Jens Elkner wrote: ... whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5 SSDs? Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Mon, Dec 14, 2009 at 01:29:50PM +0300, Andrey Kuzmin wrote: On Mon, Dec 14, 2009 at 4:04 AM, Jens Elkner jel+...@cs.uni-magdeburg.de wrote: ... Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice ... Flash-based read cache should help here by minimizing (metadata) read latency, and flash-based log would bring down write latency. The only Hmmm not yet sure - I think writing via NFS is the biggest problem. Anyway, almost finished the work for a 'generic collector' and data visualizer which allows us to better correlate them to each other on the fly (i.e. no rrd pain) and understand the numbers hopefully a little bit better ;-). drawback of using single F20 is that you're trying to minimize both with the same device. Yepp. But would that scenario change much, when one puts 4 SSDs at HDD slots instead? I guess, not really or would be even worse because it disturbs the data path from/to HDD controllers. Anyway, I'll try that out next year, when those neat toys are officially supported (and the budget for this got its final approval of course). Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Sat, Dec 12, 2009 at 03:28:29PM +, Robert Milkowski wrote: Jens Elkner wrote: Hi Robert, just got a quote from our campus reseller, that readzilla and logzilla are not available for the X4540 - hmm strange Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5" SSDs? If so, is it possible to partition the F20, e.g. into 36 GB logzilla, 60GB readzilla (also interesting for other X servers)? IIRC the card presents 4x LUNs so you could use each of them for different purpose. You could also use different slices. oh. coool - IMHO this would be sufficient for our purposes (see next posting). me or not. Is this correct? It still does. The capacitor is not for flushing data to disk drives! The card has a small amount of DRAM memory on it which is being flushed to FLASH. Capacitor is to make sure it actually happens if the power is lost. Yepp - found the specs. (BTW: Was probably too late to think about the term Flash Accelerator having DRAM prestoserv in mind ;-)). Thanx, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
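Assuming the card does show up as four LUNs as Robert recalls, splitting it between intent log and read cache is just two zpool additions against the existing home-directory pool. The device names below are placeholders, and slices of a single LUN would be added the same way:

    # mirrored slog to absorb the synchronous NFS writes from the home directories
    zpool add pool1 log mirror c3t0d0 c3t1d0
    # remaining LUNs as L2ARC for the small-file read traffic
    zpool add pool1 cache c3t2d0 c3t3d0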
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Sat, Dec 12, 2009 at 04:23:21PM +, Andrey Kuzmin wrote: As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-) Hmm - good point. What I'm trying to accomplish: Actually our current prototype thumper setup is: root pool (1x 2-way mirror SATA) hotspare (2x SATA shared) pool1 (12x 2-way mirror SATA) ~25% used user homes pool2 (10x 2-way mirror SATA) ~25% used mm files, archives, ISOs So pool2 is not really a problem - it delivers about 600MB/s uncached, about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8GB iso) and is not continuously stressed. However sync write is ~ 200 MB/s or 20 MB/s and mirror, only. Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/soffice usually via NFS and a little bit via samba - a lot of more or less small files, probably widely spread over the platters. E.g. checking out a project from a svn|* repository into a home takes hours. Also having one's workspace on NFS isn't fun (compared to a Linux xfs-driven local soft 2-way mirror). So data are coming in/going out currently via 1Gbps aggregated NICs; for the X4540 we plan to use one (and maybe experiment with two some time later) 10 Gbps NIC. So max. 2 GB/s read and write. This still leaves 2GB/s in and out for the last PCIe 8x slot - the F20. Since the IO55 is bound with 4GB/s bidirectional HT to the Mezzanine Connector1, in theory those 2 GB/s to and from the F20 should be possible. So IMHO wrt. bandwidth it basically doesn't make much of a difference whether one puts 4 SSDs into HDD slots or uses the 4 Flash-Modules on the F20 (even when distributing the SSDs over the IO55(2) and MCP55). However, having it on a separate HT link than the HDDs might be an advantage. Also one would be much more flexible/able to scale immediately, i.e. one doesn't need to re-organize the pools because of now-unavailable slots and is still able to use all HDD slots with normal HDDs. (we are certainly going to upgrade x4500 to x4540 next year ...) (And if Sun makes an F40 - dropping the SAS ports and putting 4 more Flash-Modules on it, or is able to get flash modules with double the speed - one could probably really get ~ 1.2 GB/s write and ~ 2GB/s read). So, this seems to be a really interesting thing and I expect at least wrt. user homes a real improvement, no matter how the final configuration will look. Maybe the experts at the source are able to do some 4x SSD vs. 1xF20 benchmarks? I guess at least if they turn out to be good enough, it wouldn't hurt ;-) Jens Elkner wrote: ... whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in an X4540 instead of 2.5 SSDs? Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
On Dec 13, 2009, at 5:04 PM, Jens Elkner wrote: On Sat, Dec 12, 2009 at 04:23:21PM +, Andrey Kuzmin wrote: As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-) Hmm - good point. What I'm trying to accomplish: Actually our current prototype thumper setup is: root pool (1x 2-way mirror SATA) hotspare (2x SATA shared) pool1 (12x 2-way mirror SATA) ~25% used user homes pool2 (10x 2-way mirror SATA) ~25% used mm files, archives, ISOs So pool2 is not really a problem - delivers about 600MB/s uncached, about 1.8 GB/s cached (i.e. read a 2nd time, tested with a 3.8GB iso) and is not contineously stressed. However sync write is ~ 200 MB/s or 20 MB/s and mirror, only. Problem is pool1 - user homes! So GNOME/firefox/eclipse/subversion/ soffice usually via NFS and a litle bit via samba - a lot of more or less small files, probably widely spread over the platters. E.g. checkin' out a project from a svn|* repository into a home takes hours. Also having its workspace on NFS isn't fun (compared to linux xfs driven local soft 2-way mirror). This is probably a latency problem, not a bandwidth problem. Use zilstat to see how much ZIL traffic you have and, if the number is significant, consider using the F20 for a separate log device. -- richard So data are coming in/going out currently via 1Gbps aggregated NICs, for X4540 we plan to use one (may be experiment with two some time later) 10 Gbps NIC. So max. 2 GB/s read and write. This leaves still 2GB/s in and out for the last PCIe 8x Slot - the F20. Since IO55 is bound with 4GB/s bidirectional HT to the Mezzanine Connector1, in theory those 2 GB/s to and from the F20 should be possible. So IMHO wrt. bandwith basically it makes not really a difference, whether one puts 4 SSDs into HDD slots or using the 4 Flash-Modules on the F20 (even when distributing the SSDs over the IO55(2) and MCP55). However, having it on a separate HT than the HDDs might be an advantage. Also one would be much more flexible/able to scale immediately, i.e. don't need to re-organize the pools because of the now unavailable slots/ is still able to use all HDD slots with normal HDDs. (we are certainly going to upgrade x4500 to x4540 next year ...) (And if Sun makes a F40 - dropping the SAS ports and putting 4 other Flash-Modules on it or is able to get flashMods with double speed , one could probably really get ~ 1.2 GB write and ~ 2GB/s read). So, seems to be a really interesting thing and I expect at least wrt. user homes a real improvement, no matter, how the final configuration will look like. Maybe the experts at the source are able to do some 4x SSD vs. 1xF20 benchmarks? I guess at least if they turn out to be good enough, it wouldn't hurt ;-) Jens Elkner wrote: ... whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5 SSDs? Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
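As a rough sketch of that check, assuming the zilstat.ksh script (a DTrace-based tool, so run it as root) and the pool name pool1; the interval, count, and device name below are placeholders:

./zilstat.ksh 10 30          # sample ZIL traffic every 10 seconds, 30 samples, while a checkout runs
zpool add pool1 log c6t0d0   # if the per-interval byte counts are significant, try a dedicated log device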
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
Jens Elkner wrote: Hi, just got a quote from our campus reseller, that readzilla and logzilla are not available for the X4540 - hmm strange Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in an X4540 instead of 2.5 SSDs? If so, is it possible to partition the F20, e.g. into 36 GB logzilla, 60GB readzilla (also interesting for other X servers)? IIRC the card presents 4x LUNs so you could use each of them for a different purpose. You could also use different slices. me or not. Is this correct? It still does. The capacitor is not for flushing data to disk drives! The card has a small amount of DRAM memory on it which is being flushed to FLASH. Capacitor is to make sure it actually happens if the power is lost. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-) On 12/12/09, Robert Milkowski mi...@task.gda.pl wrote: Jens Elkner wrote: Hi, just got a quote from our campus reseller, that readzilla and logzilla are not available for the X4540 - hmm strange Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in a X4540 instead of 2.5 SSDs? If so, is it possible to partition the F20, e.g. into 36 GB logzilla, 60GB readzilla (also interesting for other X servers)? IIRC the card presents 4x LUNs so you could use each of them for different purpose. You could also use different slices. me or not. Is this correct? It still does. The capacitor is not for flushing data to disks drives! The card has a small amount of DRAM memory on it which is being flushed to FLASH. Capacitor is to make sure it actually happens if the power is lost. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Regards, Andrey ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 + SFA F20 PCIe?
Andrey Kuzmin wrote: As to whether it makes sense (as opposed to two distinct physical devices), you would have read cache hits competing with log writes for bandwidth. I doubt both will be pleased :-) As usual it depends on your workload. In many real-life scenarios the bandwidth probably won't be an issue. Then also keep in mind that you can put up to 4 SSD modules on it and each module IIRC is presented as a separate device anyway. So in order to get all the performance, you need to make sure to issue I/O to all modules. -- Robert Milkowski http://milek.blogspot.com ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
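For example, a sketch of spreading the load over all four modules, with hypothetical device names (listing several devices after the log or cache keyword, without mirror, stripes across them):

zpool add tank log c6t0d0 c6t1d0
zpool add tank cache c6t2d0 c6t3d0
zpool iostat -v tank 5     # per-vdev view to confirm the log writes really spread across both modules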
[zfs-discuss] X4540 + SFA F20 PCIe?
Hi, just got a quote from our campus reseller saying that readzilla and logzilla are not available for the X4540 - hmm, strange. Anyway, wondering whether it is possible/supported/would make sense to use a Sun Flash Accelerator F20 PCIe Card in an X4540 instead of 2.5 SSDs? If so, is it possible to partition the F20, e.g. into a 36 GB logzilla and a 60GB readzilla (also interesting for other X servers)? Wrt. super capacitors: I would guess that, at least wrt. the X4540, it doesn't give one more protection, since if power is lost, the HDDs do not respond anymore and thus it doesn't matter whether the log cache is protected for a short time or not. Is this correct? Regards, jel. -- Otto-von-Guericke University http://www.cs.uni-magdeburg.de/ Department of Computer Science Geb. 29 R 027, Universitaetsplatz 2 39106 Magdeburg, Germany Tel: +49 391 67 12768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 boot flash
CFs designed for the professional photography market have better specifications than CFs designed for the consumer market. CF is pretty cheap, you can pick up 16GB-32GB from $80-$200 depending on brand/quality. Assuming they do incorporate wear leveling, and considering even a fairly busy server isn't going to use up *that* much space (I have a couple E3000's still running which have 4GB disk mirrors for the OS), if you get a decent CF card I suppose it would quite possibly outlast the server. I think the dig against CF is that they tend to have a low write speed for small iops. They are optimized for writing large files, like photos. Would a 32GB SanDisk Extreme® CompactFlash® Card 60MB/s (SDCFX-032G-P61) or a 16GB SanDisk Extreme® CompactFlash® Card 60MB/s (SDCFX-016G-A61) qualify as a decent card, or is there another brand I should look for? Are 32GB cards supported at this point? How about UDMA 400x? -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
I have exactly these symptoms on 3 thumpers now: 2 x X4540 and 1 x X4500. Rebooting/power cycling doesn't even bring them back. The only thing I found is that if I boot from the osol.2009.06 CD, I can see all the drives. I had to reinstall the OS on one box. I've only just recently upgraded them to snv_122. Before that, I could change disks without problems. Could it be something introduced since snv_111? John -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
Jorgen Lundman wrote: Finally came to the reboot maintenance to reboot the x4540 to make it see the newly replaced HDD. I tried, reboot, then power-cycle, and reboot -- -r, but I can not make the x4540 accept any HDD in that bay. I'm starting to think that perhaps we did not lose the original HDD, but rather the slot, and there is a hardware problem. This is what I see after a reboot, the disk is c1t5d0, sd37, s...@5,0 or slot 13. c1::dsk/c1t4d0 disk connectedconfigured unknown c1::dsk/c1t5d0 disk connectedconfigured unknown c1::dsk/c1t6d0 disk connectedconfigured unknown Does format show it? -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
Nope, that it does not. Ian Collins wrote: Jorgen Lundman wrote: Finally came to the reboot maintenance to reboot the x4540 to make it see the newly replaced HDD. I tried, reboot, then power-cycle, and reboot -- -r, but I can not make the x4540 accept any HDD in that bay. I'm starting to think that perhaps we did not lose the original HDD, but rather the slot, and there is a hardware problem. This is what I see after a reboot, the disk is c1t5d0, sd37, s...@5,0 or slot 13. c1::dsk/c1t4d0 disk connectedconfigured unknown c1::dsk/c1t5d0 disk connectedconfigured unknown c1::dsk/c1t6d0 disk connectedconfigured unknown Does format show it? -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
Jorgen Lundman wrote: Ian Collins wrote: Jorgen Lundman wrote: Finally came to the reboot maintenance to reboot the x4540 to make it see the newly replaced HDD. I tried, reboot, then power-cycle, and reboot -- -r, but I can not make the x4540 accept any HDD in that bay. I'm starting to think that perhaps we did not lose the original HDD, but rather the slot, and there is a hardware problem. This is what I see after a reboot, the disk is c1t5d0, sd37, s...@5,0 or slot 13. c1::dsk/c1t4d0 disk connectedconfigured unknown c1::dsk/c1t5d0 disk connectedconfigured unknown c1::dsk/c1t6d0 disk connectedconfigured unknown Does format show it? Nope, that it does not. Time to call the repair man! -- Ian. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
Finally came to the reboot maintenance to reboot the x4540 to make it see the newly replaced HDD. I tried, reboot, then power-cycle, and reboot -- -r, but I can not make the x4540 accept any HDD in that bay. I'm starting to think that perhaps we did not lose the original HDD, but rather the slot, and there is a hardware problem. This is what I see after a reboot, the disk is c1t5d0, sd37, s...@5,0 or slot 13. c1::dsk/c1t4d0 disk connectedconfigured unknown c1::dsk/c1t5d0 disk connectedconfigured unknown c1::dsk/c1t6d0 disk connectedconfigured unknown # devfsadm -v devfsadm[893]: verbose: no devfs node or mismatched dev_t for /devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0:a devfsadm[893]: verbose: symlink /dev/dsk/c1t5d0s0 - ../../devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0:a devfsadm[893]: verbose: no devfs node or mismatched dev_t for /devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0:b devfsadm[893]: verbose: symlink /dev/dsk/c1t5d0s1 - ../../devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0:b [snip] Only messages in dmesg are: Aug 20 02:23:05 x4500-10.unix rootnex: [ID 349649 kern.info] xsvc1 at root: space 0 offset 0 Aug 20 02:23:05 x4500-10.unix genunix: [ID 936769 kern.info] xsvc1 is /x...@0,0 Aug 20 02:23:09 x4500-10.unix scsi: [ID 583861 kern.info] sd37 at mpt1: target 5 lun 0 Aug 20 02:23:09 x4500-10.unix genunix: [ID 936769 kern.info] sd37 is /p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0 Aug 20 02:23:09 x4500-10.unix pseudo: [ID 129642 kern.info] pseudo-device: devinfo0 Aug 20 02:23:09 x4500-10.unix genunix: [ID 936769 kern.info] devinfo0 is /pseudo/devi...@0 r...@x4500-10.unix # Aug 20 02:23:12 x400-10.unix genunix: WARNING: constraints forbid retire: /p...@3c,0/pci10de,3...@f/pci1000,1...@0/s...@7,0 # cd ../../devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/ r...@x4500-10.unix # ls -l ./s...@5,0:a: No such device or address ./s...@5,0:a,raw: No such device or address ./s...@5,0:b: No such device or address ./s...@5,0:b,raw: No such device or address ./s...@5,0:c: No such device or address [snip lots, these errors only show up the first time you ls] total 24 drwxr-xr-x 2 root sys2 Apr 17 17:52 s...@0,0 brw-r- 1 root sys 30, 2048 Jul 6 09:34 s...@0,0:a crw-r- 1 root sys 30, 2048 Jul 6 09:34 s...@0,0:a,raw brw-r- 1 root sys 30, 2049 Jul 6 09:34 s...@0,0:b crw-r- 1 root sys 30, 2049 Jul 6 09:34 s...@0,0:b,raw [snip] crw-r- 1 root sys 30, 2067 Jul 6 09:44 s...@0,0:t,raw brw-r- 1 root sys 30, 2068 Jul 6 09:50 s...@0,0:u crw-r- 1 root sys 30, 2068 Jul 6 09:44 s...@0,0:u,raw drwxr-xr-x 2 root sys2 Apr 17 17:52 s...@1,0 brw-r- 1 root sys 30, 2112 Jul 6 09:50 s...@1,0:a crw-r- 1 root sys 30, 2112 Jul 6 09:48 s...@1,0:a,raw brw-r- 1 root sys 30, 2113 Jul 6 09:50 s...@1,0:b [snip] brw-r- 1 root sys 30, 2132 Jul 6 09:50 s...@1,0:u crw-r- 1 root sys 30, 2132 Jul 6 09:48 s...@1,0:u,raw brw-r- 1 root sys 30, 2119 Aug 20 02:23 s...@1,0:wd crw-r- 1 root sys 30, 2119 Aug 20 02:23 s...@1,0:wd,raw drwxr-xr-x 2 root sys2 Apr 17 17:52 s...@2,0 brw-r- 1 root sys 30, 2176 Jul 6 09:50 s...@2,0:a crw-r- 1 root sys 30, 2176 Jul 6 09:48 s...@2,0:a,raw brw-r- 1 root sys 30, 2177 Jul 6 09:50 s...@2,0:b [snip] brw-r- 1 root sys 30, 2196 Jul 6 09:50 s...@2,0:u crw-r- 1 root sys 30, 2196 Jul 6 09:48 s...@2,0:u,raw brw-r- 1 root sys 30, 2183 Aug 20 02:23 s...@2,0:wd crw-r- 1 root sys 30, 2183 Aug 20 02:23 s...@2,0:wd,raw drwxr-xr-x 2 root sys2 Apr 17 17:52 s...@3,0 brw-r- 1 root sys 30, 2240 Jul 2 15:30 s...@3,0:a crw-r- 1 root sys 30, 2240 Jul 6 09:48 s...@3,0:a,raw brw-r- 1 
root sys 30, 2241 Jul 6 09:50 s...@3,0:b [snip] brw-r- 1 root sys 30, 2260 Jul 6 09:50 s...@3,0:u crw-r- 1 root sys 30, 2260 Jul 6 09:48 s...@3,0:u,raw brw-r- 1 root sys 30, 2247 Jul 6 09:50 s...@3,0:wd crw-r- 1 root sys 30, 2247 Jul 6 09:43 s...@3,0:wd,raw drwxr-xr-x 2 root sys2 Apr 17 17:52 s...@4,0 brw-r- 1 root sys 30, 2304 Jul 6 09:50 s...@4,0:a crw-r- 1 root sys 30, 2304 Jul 6 09:48 s...@4,0:a,raw brw-r- 1 root sys 30, 2305 Jul 6 09:50 s...@4,0:b [snip] brw-r- 1 root sys 30, 2324 Jul 6 09:50 s...@4,0:u crw-r- 1 root sys 30, 2324 Jul 6 09:48 s...@4,0:u,raw brw-r- 1 root sys 30, 2311 Aug 20 02:23 s...@4,0:wd crw-r- 1 root sys
[zfs-discuss] x4540 dead HDD replacement, remains configured.
x4540 snv_117 We lost a HDD last night, and it seemed to take out most of the bus or something and forced us to reboot. (We have yet to experience losing a disk that didn't force a reboot mind you). So today, I'm looking at replacing the broken HDD, but no amount of work makes it turn on the blue LED. After trying that for an hour, we just replaced the HDD anyway. But no amount of work will make it use/recognise it. (We tried more than one working spare HDD too). For example: # zpool status raidz1 DEGRADED 0 0 0 c5t1d0ONLINE 0 0 0 c0t5d0ONLINE 0 0 0 spare DEGRADED 0 0 285K c1t5d0 UNAVAIL 0 0 0 cannot open c4t7d0 ONLINE 0 0 0 4.13G resilvered c2t5d0ONLINE 0 0 0 c3t5d0ONLINE 0 0 0 spares c4t7d0 INUSE currently in use # zpool offline zpool1 c1t5d0 raidz1 DEGRADED 0 0 0 c5t1d0ONLINE 0 0 0 c0t5d0ONLINE 0 0 0 spare DEGRADED 0 0 285K c1t5d0 OFFLINE 0 0 0 c4t7d0 ONLINE 0 0 0 4.13G resilvered c2t5d0ONLINE 0 0 0 c3t5d0ONLINE 0 0 0 # cfgadm -al Ap_Id Type Receptacle Occupant Condition c1 scsi-bus connectedconfigured unknown c1::dsk/c1t5d0 disk connectedconfigured failed # cfgadm -c unconfigure c1::dsk/c1t5d0 # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # cfgadm -c unconfigure c1::dsk/c1t5d0 # cfgadm -c unconfigure c1::dsk/c1t5d0 # cfgadm -fc unconfigure c1::dsk/c1t5d0 # cfgadm -fc unconfigure c1::dsk/c1t5d0 # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # hdadm offline slot 13 1:5:9: 13: 17: 21: 25: 29: 33: 37: 41: 45: c0t1 c0t5 c1t1 c1t5 c2t1 c2t5 c3t1 c3t5 c4t1 c4t5 c5t1 c5t5 ^b+ ^++ ^b+ ^-- ^++ ^++ ^++ ^++ ^++ ^++ ^++ ^++ # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # fmadm faulty FRU : HD_ID_47 (hc://:product-id=Sun-Fire-X4540:chassis-id=0915AMR048:server-id=x4500-10.unix:serial=9QMB024K:part=SEAGATE-ST35002NSSUN500G-09107B024K:revision=SU0D/chassis=0/bay=47/disk=0) faulty # fmadm repair HD_ID_47 fmadm: recorded repair to HD_ID_47 # format | grep c1t5d0 # # hdadm offline slot 13 1:5:9: 13: 17: 21: 25: 29: 33: 37: 41: 45: c0t1 c0t5 c1t1 c1t5 c2t1 c2t5 c3t1 c3t5 c4t1 c4t5 c5t1 c5t5 ^b+ ^++ ^b+ ^-- ^++ ^++ ^++ ^++ ^++ ^++ ^++ ^++ # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # ipmitool sunoem led get|grep 13 hdd13.fail.led | ON hdd13.ok2rm.led | OFF # zpool online zpool1 c1t5d0 warning: device 'c1t5d0' onlined, but remains in faulted state use 'zpool replace' to replace devices that are no longer present # cfgadm -c disconnect c1::dsk/c1t5d0 cfgadm: Hardware specific failure: operation not supported for SCSI device Bah, why were they changed to SCSI? Increasing the size of the hammer... # cfgadm -x replace_device c1::sd37 Replacing SCSI device: /devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0 This operation will suspend activity on SCSI bus: c1 Continue (yes/no)? y SCSI bus quiesced successfully. It is now safe to proceed with hotplug operation. Enter y if operation is complete or n to abort (yes/no)? y # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed I am fairly certain that if I reboot, it will all come back ok again. But I would like to believe that I should be able to replace a disk without rebooting on a X4540. Any other commands I should try? Lund -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
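For reference, a sketch of the sequence that normally suffices when the slot itself is healthy (which, per the above, may not be the case here); pool and device names are the ones from this report:

zpool offline zpool1 c1t5d0
cfgadm -c unconfigure c1::dsk/c1t5d0
# physically swap the drive, then:
cfgadm -c configure c1::dsk/c1t5d0
zpool replace zpool1 c1t5d0
zpool status zpool1          # watch the resilver complete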
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
I suspect this is what it is all about: # devfsadm -v devfsadm[16283]: verbose: no devfs node or mismatched dev_t for /devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0:a [snip] and indeed: brw-r- 1 root sys 30, 2311 Aug 6 15:34 s...@4,0:wd crw-r- 1 root sys 30, 2311 Aug 6 15:24 s...@4,0:wd,raw drwxr-xr-x 2 root sys2 Aug 6 14:31 s...@5,0 drwxr-xr-x 2 root sys2 Apr 17 17:52 s...@6,0 brw-r- 1 root sys 30, 2432 Jul 6 09:50 s...@6,0:a crw-r- 1 root sys 30, 2432 Jul 6 09:48 s...@6,0:a,raw Perhaps because it was booted with the dead disk in place, it never configured the entire sd5 mpt driver. Why the other hard-disks work I don't know. I suspect the only way to fix this, is to reboot again. Lund Jorgen Lundman wrote: x4540 snv_117 We lost a HDD last night, and it seemed to take out most of the bus or something and forced us to reboot. (We have yet to experience losing a disk that didn't force a reboot mind you). So today, I'm looking at replacing the broken HDD, but no amount of work makes it turn on the blue LED. After trying that for an hour, we just replaced the HDD anyway. But no amount of work will make it use/recognise it. (We tried more than one working spare HDD too). For example: # zpool status raidz1 DEGRADED 0 0 0 c5t1d0ONLINE 0 0 0 c0t5d0ONLINE 0 0 0 spare DEGRADED 0 0 285K c1t5d0 UNAVAIL 0 0 0 cannot open c4t7d0 ONLINE 0 0 0 4.13G resilvered c2t5d0ONLINE 0 0 0 c3t5d0ONLINE 0 0 0 spares c4t7d0 INUSE currently in use # zpool offline zpool1 c1t5d0 raidz1 DEGRADED 0 0 0 c5t1d0ONLINE 0 0 0 c0t5d0ONLINE 0 0 0 spare DEGRADED 0 0 285K c1t5d0 OFFLINE 0 0 0 c4t7d0 ONLINE 0 0 0 4.13G resilvered c2t5d0ONLINE 0 0 0 c3t5d0ONLINE 0 0 0 # cfgadm -al Ap_Id Type Receptacle Occupant Condition c1 scsi-bus connectedconfigured unknown c1::dsk/c1t5d0 disk connectedconfigured failed # cfgadm -c unconfigure c1::dsk/c1t5d0 # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # cfgadm -c unconfigure c1::dsk/c1t5d0 # cfgadm -c unconfigure c1::dsk/c1t5d0 # cfgadm -fc unconfigure c1::dsk/c1t5d0 # cfgadm -fc unconfigure c1::dsk/c1t5d0 # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # hdadm offline slot 13 1:5:9: 13: 17: 21: 25: 29: 33: 37: 41: 45: c0t1 c0t5 c1t1 c1t5 c2t1 c2t5 c3t1 c3t5 c4t1 c4t5 c5t1 c5t5 ^b+ ^++ ^b+ ^-- ^++ ^++ ^++ ^++ ^++ ^++ ^++ ^++ # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # fmadm faulty FRU : HD_ID_47 (hc://:product-id=Sun-Fire-X4540:chassis-id=0915AMR048:server-id=x4500-10.unix:serial=9QMB024K:part=SEAGATE-ST35002NSSUN500G-09107B024K:revision=SU0D/chassis=0/bay=47/disk=0) faulty # fmadm repair HD_ID_47 fmadm: recorded repair to HD_ID_47 # format | grep c1t5d0 # # hdadm offline slot 13 1:5:9: 13: 17: 21: 25: 29: 33: 37: 41: 45: c0t1 c0t5 c1t1 c1t5 c2t1 c2t5 c3t1 c3t5 c4t1 c4t5 c5t1 c5t5 ^b+ ^++ ^b+ ^-- ^++ ^++ ^++ ^++ ^++ ^++ ^++ ^++ # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed # ipmitool sunoem led get|grep 13 hdd13.fail.led | ON hdd13.ok2rm.led | OFF # zpool online zpool1 c1t5d0 warning: device 'c1t5d0' onlined, but remains in faulted state use 'zpool replace' to replace devices that are no longer present # cfgadm -c disconnect c1::dsk/c1t5d0 cfgadm: Hardware specific failure: operation not supported for SCSI device Bah, why were they changed to SCSI? Increasing the size of the hammer... # cfgadm -x replace_device c1::sd37 Replacing SCSI device: /devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0 This operation will suspend activity on SCSI bus: c1 Continue (yes/no)? y SCSI bus quiesced successfully. 
It is now safe to proceed with hotplug operation. Enter y if operation is complete or n to abort (yes/no)? y # cfgadm -al c1::dsk/c1t5d0 disk connectedconfigured failed I am fairly certain that if I reboot, it will all come back ok again. But I would like to believe that I should be able to replace a disk without rebooting on a X4540. Any other commands I should try? Lund -- Jorgen
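A minimal sketch of the cleanup one might try before falling back to a reboot, assuming stale /dev links are the only problem (which the mismatched dev_t messages above suggest, but do not prove):

devfsadm -Cv                           # remove dangling /dev links and rebuild them
cfgadm -c configure c1::dsk/c1t5d0
format                                 # check whether c1t5d0 is now visible
zpool replace zpool1 c1t5d0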
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
On Wed, Aug 5, 2009 at 11:48 PM, Jorgen Lundmanlund...@gmo.jp wrote: I suspect this is what it is all about: # devfsadm -v devfsadm[16283]: verbose: no devfs node or mismatched dev_t for /devices/p...@0,0/pci10de,3...@b/pci1000,1...@0/s...@5,0:a [snip] and indeed: brw-r- 1 root sys 30, 2311 Aug 6 15:34 s...@4,0:wd crw-r- 1 root sys 30, 2311 Aug 6 15:24 s...@4,0:wd,raw drwxr-xr-x 2 root sys 2 Aug 6 14:31 s...@5,0 drwxr-xr-x 2 root sys 2 Apr 17 17:52 s...@6,0 brw-r- 1 root sys 30, 2432 Jul 6 09:50 s...@6,0:a crw-r- 1 root sys 30, 2432 Jul 6 09:48 s...@6,0:a,raw Perhaps because it was booted with the dead disk in place, it never configured the entire sd5 mpt driver. Why the other hard-disks work I don't know. I suspect the only way to fix this, is to reboot again. Lund I have a pair of X4540's also, and getting any kind of drive status, or failure alert is a lost cause. I've opened several cases with Sun with the following issues: ILOM/BMC can't see any drives (status, FRU, firmware, etc) FMA cannot see a drive failure (you can pull a drive, and it could be hours before 'zpool status' will show a failed drive, even during a 'zfs scrub') Hot swapping drives rarely works, system will not see new drive until a reboot Things I've tried that Sun has suggested: New BIOS New controller firmware New ILOM firmware Upgrading to new releases of Osol (currently on 118, no luck) Replacing ILOM card Custom FMA configs Nothing works, and my cases with Sun have been open for about 6 months now, with no resolution in sight. Given that Sun now makes the 7000, I can only assume their support on the more whitebox version, AKA X4540, is either near an end, or they don't intend to support any advanced monitoring whatsoever. Sad, really.. as my $900 Dell and HP servers can send SMS, Jabber messages, SNMP traps, etc, on ANY IPMI event, hardware issue, and what have you without any tinkering or excuses. -- Brent Jones br...@servuhome.net ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
Whoah! We have yet to experience losing a disk that didn't force a reboot Do you have any notes on how many times this has happened Jorgen, or what steps you've taken each time? I appreciate you're probably more concerned with getting an answer to your question, but if ZFS needs a reboot to cope with failures on even an x4540, that's an absolute deal breaker for everything we want to do with ZFS. Ross -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 dead HDD replacement, remains configured.
Well, to be fair, there were some special cases. I know we had 3 separate occasions with broken HDDs when we were using UFS. 2 of these appeared to hang, and the 3rd only hung once we replaced the disk. This is most likely due to us using UFS on a zvol (for quotas). We got an IDR patch, and eventually this was released as UFS 3-way deadlock writing log with zvol. I forget the number right now, but the patch is out. This is the very first time we have lost a disk in a purely-ZFS system, and I was somewhat hoping that this would be the time everything went smoothly. But it did not. However, I have also experienced (once) a disk dying in such a way that it took out the chain in a NetApp, so perhaps the disk died like this here too (it is really dead). But still disappointing. Power cycling the x4540 takes about 7 minutes (service to service), but with snv_116(?) and up it can do quiesce-reboots, which take about 57 seconds. In this case, we had to power cycle. Ross wrote: Whoah! We have yet to experience losing a disk that didn't force a reboot Do you have any notes on how many times this has happened Jorgen, or what steps you've taken each time? I appreciate you're probably more concerned with getting an answer to your question, but if ZFS needs a reboot to cope with failures on even an x4540, that's an absolute deal breaker for everything we want to do with ZFS. Ross -- Jorgen Lundman | lund...@lundman.net Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work) Shibuya-ku, Tokyo| +81 (0)90-5578-8500 (cell) Japan| +81 (0)3 -3375-1767 (home) ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] x4540 boot flash
On Sat, 6 Jun 2009, Richard Elling wrote: The presumption is that you are using UFS for the CF, not ZFS. UFS is not COW, so there is a potential endurance problem for blocks which are known to be rewritten many times. ZFS will not have this problem, so if you use ZFS root, you are better served by ignoring the previous advice. My understanding was that all modern CF cards incorporate wear leveling, and I was interpreting the recommendation as trying to prevent wearing out the entire card, not necessarily particular blocks. of writes to the swap device. For OpenSolaris (enterprise support contracts now available!) which uses ZFS for swap, don't worry, be As of U6, even luddite S10 users can avail of zfs for boot/swap/dump: r...@ike ~ # uname -a SunOS ike 5.10 Generic_138889-08 i86pc i386 i86pc r...@ike ~ # swap -l swapfile dev swaplo blocks free /dev/zvol/dsk/ospool/swap 181,2 8 8388600 8388600 In short, if you use ZFS for root, ignore the warnings. How about the lack of redundancy? Is the failure rate for CF so low there's no risk in running a critical server without a mirrored root pool? And what about bit rot? Without redundancy zfs can only detect but not correct read errors (unless, I suppose, configured with copies1). How much more would it have cost to include two CF slots that it wasn't warranted? 5 GBytes seems pretty large for a slog, but yes, I think this is a good idea. What is the best formula to calculate slog size? I found a recent thread: http://jp.opensolaris.org/jive/thread.jspa?threadID=78758tstart=1 in which a Sun engineer (presumably unofficially of course ;) ) mentioned 10-18GB as more than sufficent. On the other hand: http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29 says A rule of thumb is that you should size the separate log to be able to handle 10 seconds of your expected synchronous write workload. It would be rare to need more than 100 MBytes in a separate log device, but the separate log must be at least 64 MBytes. Big gap between 100MB and 10-18GB. The first thread also mentioned in passing that splitting up an SSD between slog and root pool might have undesirable performance issues, although I don't think that was discussed to resolution. CFs designed for the professional photography market have better specifications than CFs designed for the consumer market. CF is pretty cheap, you can pick up 16GB-32GB from $80-$200 depending on brand/quality. Assuming they do incorporate wear leveling, and considering even a fairly busy server isn't going to use up *that* much space (I have a couple E3000's still running which have 4GB disk mirrors for the OS), if you get a decent CF card I suppose it would quite possibly outlast the server. But I think I'd still rather have two 8-/. Show of hands, anybody with an x4540 that's booting off non-redundant CF? This is not an accurate statement. Enterprise-class SSDs (eg. STEC Zeus) have DRAM write buffers. The Flash Mini-DIMMs Sun uses also have DRAM write buffers. These offer very low write latency for slogs. Yah, that misconception has already been pointed out to me offlist. I actually came upon it in correspondence with you, I had asked about using a slice of an SSD for a slog rather than the whole disk, and you mentioned that the advice for using the whole disk rather than a slice was only for traditional spinning hard drives and didn't apply to SSD's, I thought because of something to do with the write cache but I guess I misunderstood. 
I didn't save that message, perhaps you could be kind enough to refresh my memory as to why slices of SSD's are ok while slices of hard disks are best avoided? -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
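As a worked example of the 10-second rule of thumb (the workload figures are purely illustrative): a server absorbing ~100 MB/s of synchronous writes needs roughly 100 MB/s x 10 s = ~1 GB of slog; even a saturated 10 GbE link (~1.25 GB/s) works out to ~12.5 GB, which is roughly where the quoted 10-18 GB figure lands. The ~100 MB ballpark from the tuning guide corresponds to sync write loads of around 10 MB/s or less.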
Re: [zfs-discuss] x4540 boot flash
Paul B. Henson wrote: On Sat, 6 Jun 2009, Richard Elling wrote: The presumption is that you are using UFS for the CF, not ZFS. UFS is not COW, so there is a potential endurance problem for blocks which are known to be rewritten many times. ZFS will not have this problem, so if you use ZFS root, you are better served by ignoring the previous advice. My understanding was that all modern CF cards incorporate wear leveling, and I was interpreting the recommendation as trying to prevent wearing out the entire card, not necessarily particular blocks. Wear leveling is an attempt to solve the problem of multiple writes to the same physical block. of writes to the swap device. For OpenSolaris (enterprise support contracts now available!) which uses ZFS for swap, don't worry, be As of U6, even luddite S10 users can avail of zfs for boot/swap/dump: r...@ike ~ # uname -a SunOS ike 5.10 Generic_138889-08 i86pc i386 i86pc r...@ike ~ # swap -l swapfile dev swaplo blocks free /dev/zvol/dsk/ospool/swap 181,2 8 8388600 8388600 Yes, and as you can see, my attempts to get the verbiage changed have failed :-( In short, if you use ZFS for root, ignore the warnings. How about the lack of redundancy? Is the failure rate for CF so low there's no risk in running a critical server without a mirrored root pool? And what about bit rot? Without redundancy zfs can only detect but not correct read errors (unless, I suppose, configured with copies1). How much more would it have cost to include two CF slots that it wasn't warranted? The failure rate is much lower than disks, with the exception of the endurance problem. Flash memory is not susceptible to the bit rot that plaques magnetic media. Nor is flash memory susceptible to the radiation-induced bit flips that plague DRAMs. Or, to look at this another way, billions of consumer electronics devices use a single flash boot disk and there doesn't seem to be many people complaining they aren't mirrored. Indeed, even if you have a mirrored OS on flash, you don't have a mirrored OBP or BIOS (which is also on flash). So, the risk here is significantly lower than HDDs. 5 GBytes seems pretty large for a slog, but yes, I think this is a good idea. What is the best formula to calculate slog size? I found a recent thread: http://jp.opensolaris.org/jive/thread.jspa?threadID=78758tstart=1 in which a Sun engineer (presumably unofficially of course ;) ) mentioned 10-18GB as more than sufficent. On the other hand: http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29 says A rule of thumb is that you should size the separate log to be able to handle 10 seconds of your expected synchronous write workload. It would be rare to need more than 100 MBytes in a separate log device, but the separate log must be at least 64 MBytes. This was a ROT when the default txg sync time was 5 seconds... I'll update this soon because that is no longer the case. Big gap between 100MB and 10-18GB. The first thread also mentioned in passing that splitting up an SSD between slog and root pool might have undesirable performance issues, although I don't think that was discussed to resolution. Yep, big gap. This is why I wrote zilstat, so that you can see what your workload might use before committing to a slog. There may be a good zilstat RFE here: I can see when the txg commits, so zilstat should be able to collect per-txg rather than per-time-period. Consider it added to my todo list. 
http://www.richardelling.com/Home/scripts-and-programs-1/zilstat CFs designed for the professional photography market have better specifications than CFs designed for the consumer market. CF is pretty cheap, you can pick up 16GB-32GB from $80-$200 depending on brand/quality. Assuming they do incorporate wear leveling, and considering even a fairly busy server isn't going to use up *that* much space (I have a couple E3000's still running which have 4GB disk mirrors for the OS), if you get a decent CF card I suppose it would quite possibly outlast the server. I think the dig against CF is that they tend to have a low write speed for small iops. They are optimized for writing large files, like photos. But I think I'd still rather have two 8-/. Show of hands, anybody with an x4540 that's booting off non-redundant CF? This is not an accurate statement. Enterprise-class SSDs (eg. STEC Zeus) have DRAM write buffers. The Flash Mini-DIMMs Sun uses also have DRAM write buffers. These offer very low write latency for slogs. Yah, that misconception has already been pointed out to me offlist. I actually came upon it in correspondence with you, I had asked about using a slice of an SSD for a slog rather than the whole disk, and you mentioned that the advice for using the whole disk rather than a slice was only for traditional spinning hard drives and didn't apply
Re: [zfs-discuss] x4540 boot flash
Paul B. Henson wrote: So I was looking into the boot flash feature of the newer x4540, and evidently it is simply a CompactFlash slot, with all of the disadvantages and limitations of that type of media. The sun deployment guide recommends minimizing writes to a CF boot device, in particular by NFS mounting /var from a different server, disabling swap or swapping to a different device, and doing all logging over the network. argv. So we've had the discussion many times over the past 4 years about why these recommendations are largely bogus. Alas, once published, they seem to live forever. The presumption is that you are using UFS for the CF, not ZFS. UFS is not COW, so there is a potential endurance problem for blocks which are known to be rewritten many times. ZFS will not have this problem, so if you use ZFS root, you are better served by ignoring the previous advice. For additional background, if you worry about UFS and endurance, then you want to avoid all writes, because metadata is at fixed locations, and you could potentially hit endurance problems at those locations. Some people think that /var collects a lot of writes, and it might if you happen to be running a high-volume e-mail server using sendmail. Since almost nobody does that in today's internet, the risk is quite small. The second thought was that you will be swapping often and therefore you want to avoid the endurance problem which affects swap (where the swap device is raw, not a file system). In practice, if you have a lot of swap activity, then your performance will stink and you will be more likely to actually buy some RAM to solve the problem. Also, most modern machines are overconfigured for RAM, so the actual swap device usage for modern machines is typically low. I had some data which validated this assumption, about 4 years ago. It is easy to monitor swap usage, so see for yourself if your workload does a lot of writes to the swap device. For OpenSolaris (enterprise support contracts now available!) which uses ZFS for swap, don't worry, be happy. In short, if you use ZFS for root, ignore the warnings. Not exactly a configuration I would prefer. My sales SE said most people weren't utilizing the CF boot feature. The concept is nice, but an implementation with SSD quality flash rather than basic CF (also, preferably redundant devices) would have been better. It depends on the market. In telco, many people use CF for boot because they are much more reliable under much more diverse environmental conditions than magnetic media. If I had an x4540 (which I don't, unfortunately, we picked up a half dozen x4500's just before they were end of sale'd), what I think would be interesting to do would be install two of the 32GB SSD disks in the boot slots, use a 1-5GB sliced mirror as a slog, and the remaining 27-31GB as a sliced mirrored root pool. 5 GBytes seems pretty large for a slog, but yes, I think this is a good idea. From what I understand you don't need very much space for an effective slog, and SSD's don't have the write failure limitations of CF. CFs designed for the professional photography market have better specifications than CFs designed for the consumer market. Also, the recommendation for giving ZFS entire discs rather than slices evidently isn't applicable to SSD's as they don't have a write cache. It seems this approach would give you a blazing fast slog, as well as a redundant boot mirror without having to waste an additional two SATA slots. This is not an accurate statement. Enterprise-class SSDs (eg. 
STEC Zeus) have DRAM write buffers. The Flash Mini-DIMMs Sun uses also have DRAM write buffers. These offer very low write latency for slogs. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
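A rough sketch of the split described above, assuming two hypothetical 32 GB SSDs at c4t0d0 and c4t1d0, each carved with format into a small s0 slice for the log and a large s1 slice for root (root pool creation is normally done by the installer; it is shown here only for shape):

zpool add tank log mirror c4t0d0s0 c4t1d0s0     # mirrored slog from the small slices
zpool create rpool mirror c4t0d0s1 c4t1d0s1     # mirrored root pool on the remaining space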
[zfs-discuss] x4540 boot flash
So I was looking into the boot flash feature of the newer x4540, and evidently it is simply a CompactFlash slot, with all of the disadvantages and limitations of that type of media. The sun deployment guide recommends minimizing writes to a CF boot device, in particular by NFS mounting /var from a different server, disabling swap or swapping to a different device, and doing all logging over the network. Not exactly a configuration I would prefer. My sales SE said most people weren't utilizing the CF boot feature. The concept is nice, but an implementation with SSD quality flash rather than basic CF (also, preferably redundant devices) would have been better. If I had an x4540 (which I don't, unfortunately, we picked up a half dozen x4500's just before they were end of sale'd), what I think would be interesting to do would be install two of the 32GB SSD disks in the boot slots, use a 1-5GB sliced mirror as a slog, and the remaining 27-31GB as a sliced mirrored root pool. From what I understand you don't need very much space for an effective slog, and SSD's don't have the write failure limitations of CF. Also, the recommendation for giving ZFS entire discs rather than slices evidently isn't applicable to SSD's as they don't have a write cache. It seems this approach would give you a blazing fast slog, as well as a redundant boot mirror without having to waste an additional two SATA slots. If anybody would like to donate an x4540 to a budget stricken California State University I'd be happy to test it out and report back ;). Given we just found out today that the entire summer quarter schedule of classes has been canceled due to budget cuts :(, I don't see new hardware in our future anytime soon sigh... -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 32GB SSD in x4500 as slog
Paul B. Henson wrote: On Wed, 13 May 2009, Richard Elling wrote: If I wanted to swap between a 32GB SSD and a 1TB SATA drive, I guess I would need to make a partition/slice on the TB drive of exactly the size of the SSD? Yes, but note that an SMI label hangs onto the outdated notion of cylinders and you can't make a slice except on cylinder boundaries. Hmm... So I probably wouldn't be able to use the entire SSD, but instead create a partition on both the SSD and the SATA drive of the same size? They wouldn't necessarily have the same cylinder size, right? So I'd have to find the least common multiple of the cylinder sizes and create partitions appropriately. You can always change the cylinder sizes to suit, or use EFI labels. In general I know it is recommended to give ZFS the entire disk, in the specific case of the ZIL will there be any performance degradation if it is on a slice of the SSD rather than the entire disk? The use full disk recommendation should make zero difference for an SSD. It only applies to HDDs with volatile write buffers (caches). In that case, the pool knows the log device is failed. So, if I understand correctly, if the log device fails while the pool is active, the log device is marked faulty, logging returns to in-pool, and everything works perfectly fine and happy like until the log device is replaced? It would seem the only difference between a pool without a log device and one with a failed log device is that the latter knows it used to have a log device? If so, it would seem trivial to support removing a log device from a pool, unless I'm misunderstanding why has that not been implemented? This is CR6574286 http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6574286 -- richard Just as in the disabled ZIL case, the on-disk format is still correct. It is client applications that may be inconsistent. There may be a way to recover the pool, Sun Service will have a more definitive stance. Eh, Sun Service doesn't necessarily like definitiveness :). However, I do have multiple service contracts, and probably will open a ticket requesting further details on upcoming log improvements and recovery modes. It would be nicer to hear it straight from the source (hint hint hint ;) ), but barring that hopefully I can get it escalated to someone who can fill in the gaps. Thanks much... ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
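A sketch of lining the two slices up exactly, sizing both in sectors under an EFI label so cylinder rounding never enters into it (device names and the sector count are hypothetical):

format -e c5t4d0               # label -> EFI, partition -> create s0 with a fixed sector count
format -e c5t7d0               # repeat with the same sector count on the 1TB drive
prtvtoc /dev/rdsk/c5t4d0s0     # confirm both slices report identical sizes
prtvtoc /dev/rdsk/c5t7d0s0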
Re: [zfs-discuss] X4540 32GB SSD in x4500 as slog
On Wed, 13 May 2009, Richard Elling wrote: If I wanted to swap between a 32GB SSD and a 1TB SATA drive, I guess I would need to make a partition/slice on the TB drive of exactly the size of the SSD? Yes, but note that an SMI label hangs onto the outdated notion of cylinders and you can't make a slice except on cylinder boundaries. Hmm... So I probably wouldn't be able to use the entire SSD, but instead create a partition on both the SSD and the SATA drive of the same size? They wouldn't necessarily have the same cylinder size, right? So I'd have to find the least common multiple of the cylinder sizes and create partitions appropriately. In general I know it is recommended to give ZFS the entire disk, in the specific case of the ZIL will there be any performance degradation if it is on a slice of the SSD rather than the entire disk? In that case, the pool knows the log device is failed. So, if I understand correctly, if the log device fails while the pool is active, the log device is marked faulty, logging returns to in-pool, and everything works perfectly fine and happy like until the log device is replaced? It would seem the only difference between a pool without a log device and one with a failed log device is that the latter knows it used to have a log device? If so, it would seem trivial to support removing a log device from a pool, unless I'm misunderstanding why has that not been implemented? Just as in the disabled ZIL case, the on-disk format is still correct. It is client applications that may be inconsistent. There may be a way to recover the pool, Sun Service will have a more definitive stance. Eh, Sun Service doesn't necessarily like definitiveness :). However, I do have multiple service contracts, and probably will open a ticket requesting further details on upcoming log improvements and recovery modes. It would be nicer to hear it straight from the source (hint hint hint ;) ), but barring that hopefully I can get it escalated to someone who can fill in the gaps. Thanks much... -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540 32GB SSD in x4500 as slog
Paul B. Henson wrote: I see Sun has recently released part number XRA-ST1CH-32G2SSD, a 32GB SATA SSD for the x4540 server. I didn't find that exact part number, but I notice that manufacturing part 371-4196 32GB Solid State Drive, SATA Interface is showing up in a number of systems. IIRC, this would be an Intel X25-E. (shock rated at 1,000 Gs @ 0.5ms, so it should still work if I fall off my horse ;-) We have five x4500's we purchased last year that we are deploying to provide file and web services to our users. One issue that we have had is horrible performance for the single threaded process creating lots of small files over NFS scenario. The bottleneck in that case is fairly clear, and to verify it we temporarily disabled the ZIL on one of the servers. Extraction time for a large tarball into an NFSv4 mounted filesystem dropped from 20 minutes to 2 minutes. Obviously, it is strongly recommended not to run with the ZIL disabled, and we don't particularly want to do so in production. However, for some of our users, performance is simply unacceptable for various usage cases (including not only tar extracts, but other common software development processes such as svn checkouts). Yep. Same sort of workload. As such, we have been investigating the possibility of improving performance via a slog, preferably on some type of NVRAM or SSD. We haven't really found anything appropriate, and now we see Sun has officially released something very possibly like what we have been looking for. My sales rep tells me the drive is only qualified for use in an x4540. However, as a standard SATA interface SSD there is theoretically no reason why it would not work in an x4500, they even share the exact same drive sleds. I was told Sun just didn't want to spend the time/effort to qualify it for the older hardware (kind of sucks that servers we bought less than a year ago are being abandoned). We are considering using them anyway, in the worst case if Sun support complains that they are installed and refuses to continue any diagnostic efforts, presumably we can simply swap them out for standard hard drives. slog devices can be replaced like any other zfs vdev, correct? Or alternatively, what is the state of removing a slog device and reverting back to a pool embedded log? Generally, Sun doesn't qualify new devices with EOLed systems. Today, you can remove a cache device, but not a log device. You can replace a log device. Before you start down this path, you should take a look at the workload using zilstat, which will show you the kind of work the ZIL is doing. If you don't see any ZIL activity, no need to worry about a separate log. http://www.richardelling.com/Home/scripts-and-programs-1/zilstat If you decide you need a log device... read on. Usually, the log device does not need to be very big. A good strategy would be to create a small partition or slice, say 1 GByte, on an idle disk. Add this as a log device to the pool. If this device is a HDD, then you might not see much of a performance boost. But now that you have a log device setup, you can experiment with replacing the log device with another. You won't be able to remove the log device, but you can relocate or grow it on the fly. So, has anyone played with this new SSD in an x4500 and can comment on whether or not they seemed to work okay? I can't imagine no one inside of Sun, regardless of official support level, hasn't tried it :). Feel free to post anonymously or reply off list if you don't want anything on the record ;). 
From reviewing the Sun hybrid storage documentation, it describes two different flash devices, the Logzilla, optimized for blindingly fast writes and intended as a ZIL slog, and the Cachezilla, optimized for fast reads and intended for use as L2ARC. Is this one of those, or some other device? If the latter, what are its technical read/write performance characteristics? Intel claims 3,300 4kByte random write iops. A really fast HDD may reach 300 4kByte random write iops, but there are no really fast SATA HDDs. http://www.intel.com/design/flash/nand/extreme/index.htm We currently have all 48 drives allocated, 23 mirror pairs and two hot spares. Is there any timeline on the availability of removing an active vdev from a pool, which would allow us to swap out a couple of devices without destroying and having to rebuild our pool? My rule of thumb is to have a hot spare. Having lots of hot spares only makes a big difference for sites where you cannot service the systems within a few days, such as remote locations. But you can remove a hot spare, so that could be a source of your experimental 1 GByte log. What is the current state of behavior in the face of slog failure? It depends on both the failure and event tree... Theoretically, if a dedicated slog device failed, the pool could simply revert to logging embedded in the pool. Yes, and this is what would happen in the case where the log
Re: [zfs-discuss] X4540 32GB SSD in x4500 as slog
On Wed, 13 May 2009, Richard Elling wrote: I didn't find that exact part number, but I notice that manufacturing part 371-4196 32GB Solid State Drive, SATA Interface is showing up in a number of systems. IIRC, this would be an Intel X25-E. Hmm, the part number I provided was off an official quote from our authorized reseller, googling it comes up with one sun.com link: http://www.sun.com/executives/iforce/mysun/docs/Support2a_ReleaseContentInfo.html and a bunch of Japanese sites. List price was $1500, if it is actually an OEM'd Intel X25-E that's quite a markup, street price on that has dropped below $500. If it's not, it sure would be nice to see some specs. Generally, Sun doesn't qualify new devices with EOLed systems. Understood, it just sucks to have bought a system on its deathbed without prior knowledge thereof. Today, you can remove a cache device, but not a log device. You can replace a log device. I guess if we ended up going this way replacing the log device with a standard hard drive in case of support issues would be the only way to go. Those log device replacement also require the replacement device be of equal or greater size? If I wanted to swap between a 32GB SSD and a 1TB SATA drive, I guess I would need to make a partition/slice on the TB drive of exactly the size of the SSD? Before you start down this path, you should take a look at the workload using zilstat, which will show you the kind of work the ZIL is doing. If you don't see any ZIL activity, no need to worry about a separate log. http://www.richardelling.com/Home/scripts-and-programs-1/zilstat Would a dramatic increase in performance when disabling the ZIL also be sufficient evidence? Even with only me as the only person using our test x4500 disabling the ZIL provides markedly better performance as originally described for certain use cases. Usually, the log device does not need to be very big. A good strategy would be to create a small partition or slice, say 1 GByte, on an idle disk. If the log device was too small, you potentially could end up bottlenecked waiting for transactions to be committed to free up log device blocks? Intel claims 3,300 4kByte random write iops. Is that before after the device gets full and starts needing to erase whole pages to write new blocks 8-/? My rule of thumb is to have a hot spare. Having lots of hot spares only makes a big difference for sites where you cannot service the systems within a few days, such as remote locations. Eh, they're just downstairs, and we have 7x24 gold on them. Plus I have 5, each with 2 hot spares. I wouldn't have an issue trading a hot spare for a log device other than potential issues with the log device failing if not mirrored. Yes, and this is what would happen in the case where the log device completely failed while the pool was operational -- the ZIL will revert to using the main pool. But would then go belly up if the system ever rebooted? You said currently you cannot remove a log device, if the pool reverts to an embedded log upon slog failure, and continues to work after a reboot, you've effectively removed the slog, other than I guess it might keep complaining and showing a dead slog device. This is the case where the log device fails completely while the pool is not operational. Upon import, the pool will look for an operational log device and will not find it. This means that any committed transactions that would have been in the log device are not recoverable *and* the pool won't know the extent of this missing information. 
So is there simply no recovery available for such a pool? Presumably the majority of the data in the pool would probably be fine. OTOH, if you are paranoid and feel very strongly about CYA, then by all means, mirror the log :-). That all depends on the outcome in that case, rare as it might be, where the log device fails and the pool is inaccessible. If it's just a matter of some manual intervention to reset the pool to a happy state and the potential loss of any uncommitted transactions (which, according to the evil ZFS tuning guide, don't result in a corrupted zfs filesystem, only in potentially unhappy nfs clients), I could live with that. If all of the data in the pool is trashed and must be restored from backup, that would be problematic. [editorial comment: it would be to Sun's benefit if Sun people would respond to Sun product questions. Harrrummppff.] Maybe they're too busy running in circles trying to figure out what life under Oracle dominion is going to be like :(. -- Paul B. Henson | (909) 979-6361 | http://www.csupomona.edu/~henson/ Operating Systems and Network Analyst | hen...@csupomona.edu California State Polytechnic University | Pomona CA 91768 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
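A minimal sketch of the workflow being discussed, assuming a pool named tank and placeholder device names (c4t0d0s0, c5t0d0); zilstat is the script from Richard's site and its exact invocation may vary by version:

  # sample ZIL activity before buying a slog (e.g. 10-second intervals, 6 samples)
  ./zilstat.ksh 10 6

  # carve a small slice (a few GB) on a spare disk with format(1M), then
  # add it to the pool as a separate intent log
  zpool add tank log c4t0d0s0

  # later, swap the slice for an SSD; the new device must be of equal or greater size
  zpool replace tank c4t0d0s0 c5t0d0
  zpool status tank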
Re: [zfs-discuss] X4540 32GB SSD in x4500 as slog
On Wed, May 13 at 17:27, Paul B. Henson wrote: On Wed, 13 May 2009, Richard Elling wrote: Intel claims 3,300 4kByte random write iops. Is that before or after the device gets full and starts needing to erase whole pages to write new blocks 8-/? The quoted numbers are minimums, not "up to" figures like on the X25-M devices. I believe that they're measuring sustained 4k full-pack random writes, long after the device has filled and needs to be doing garbage collection, wear leveling, etc. --eric -- Eric D. Mudama edmud...@mail.bounceswoosh.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
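For a rough sense of scale, 3,300 sustained 4-kByte writes per second works out to only about 3,300 x 4 KiB ≈ 13 MByte/s of raw throughput, so the appeal of such a device as a slog is its low write latency rather than its bandwidth -- as Richard notes elsewhere in the thread, a separate log solves a latency problem, not a bandwidth problem.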
Re: [zfs-discuss] X4540 32GB SSD in x4500 as slog
Paul B. Henson wrote: On Wed, 13 May 2009, Richard Elling wrote: I didn't find that exact part number, but I notice that manufacturing part 371-4196 32GB Solid State Drive, SATA Interface is showing up in a number of systems. IIRC, this would be an Intel X25-E. Hmm, the part number I provided was off an official quote from our authorized reseller, googling it comes up with one sun.com link: http://www.sun.com/executives/iforce/mysun/docs/Support2a_ReleaseContentInfo.html and a bunch of Japanese sites. List price was $1500; if it is actually an OEM'd Intel X25-E that's quite a markup, as street price on that has dropped below $500. If it's not, it sure would be nice to see some specs. Generally, Sun doesn't qualify new devices with EOLed systems. Understood, it just sucks to have bought a system on its deathbed without prior knowledge thereof. Since it costs real $$ to do such things, given the current state of the economy, I don't think you'll find anyone in the computer business not trying to sell new product. Today, you can remove a cache device, but not a log device. You can replace a log device. I guess if we ended up going this way, replacing the log device with a standard hard drive in case of support issues would be the only way to go. Does log device replacement also require the replacement device to be of equal or greater size? Yes, standard mirror rules apply. This is why I try to make it known that you don't generally need much size for the log device. They are solving a latency problem, not a space or bandwidth problem. If I wanted to swap between a 32GB SSD and a 1TB SATA drive, I guess I would need to make a partition/slice on the TB drive of exactly the size of the SSD? Yes, but note that an SMI label hangs onto the outdated notion of cylinders and you can't make a slice except on cylinder boundaries. Before you start down this path, you should take a look at the workload using zilstat, which will show you the kind of work the ZIL is doing. If you don't see any ZIL activity, no need to worry about a separate log. http://www.richardelling.com/Home/scripts-and-programs-1/zilstat Would a dramatic increase in performance when disabling the ZIL also be sufficient evidence? Even with me as the only person using our test x4500, disabling the ZIL provides markedly better performance as originally described for certain use cases. Yes. If the latency through the data path to write to the log was zero, then it would perform the same as disabling the ZIL. Usually, the log device does not need to be very big. A good strategy would be to create a small partition or slice, say 1 GByte, on an idle disk. If the log device was too small, could you potentially end up bottlenecked waiting for transactions to be committed to free up log device blocks? zilstat can give you an idea of how much data is being written to the log, so you can make that decision. Of course you can always grow the log, or add another. But I think you will find that if a txg commits in 30 seconds or less (less as it becomes more busy), then the amount of data sent to the log will be substantially less than 1 GByte per txg commit. Once the txg commits, then the log space is freed. Intel claims 3,300 4kByte random write iops. Is that before or after the device gets full and starts needing to erase whole pages to write new blocks 8-/? Buy two; if you add two log devices, then the data is striped across them (add != attach). My rule of thumb is to have a hot spare. 
Having lots of hot spares only makes a big difference for sites where you cannot service the systems within a few days, such as remote locations. Eh, they're just downstairs, and we have 7x24 gold on them. Plus I have 5, each with 2 hot spares. I wouldn't have an issue trading a hot spare for a log device other than potential issues with the log device failing if not mirrored. Yes, and this is what would happen in the case where the log device completely failed while the pool was operational -- the ZIL will revert to using the main pool. But would it then go belly up if the system ever rebooted? You said currently you cannot remove a log device; if the pool reverts to an embedded log upon slog failure, and continues to work after a reboot, you've effectively removed the slog, other than I guess it might keep complaining and showing a dead slog device. In that case, the pool knows the log device has failed. This is the case where the log device fails completely while the pool is not operational. Upon import, the pool will look for an operational log device and will not find it. This means that any committed transactions that would have been in the log device are not recoverable *and* the pool won't know the extent of this missing information. So is there simply no recovery available for such a pool? Presumably the majority of the data in the pool would probably be fine.
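A minimal sketch of the log-device layouts Richard describes (pool and device names are placeholders):

  # two log devices added separately are striped (add != attach)
  zpool add tank log c5t0d0 c5t1d0

  # alternatively, a mirrored log protects against a single slog failure
  zpool add tank log mirror c5t0d0 c5t1d0

  # standard mirror rules: the replacement must be of equal or greater size
  zpool replace tank c5t0d0 c5t2d0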
Re: [zfs-discuss] X4540
You are right that the X4500 has a single point of failure, but keeping a spare server module is not that expensive. As there are no cables, replacing it will take a few seconds and after the boot everything will be OK. Besides, cluster support for JBODs will come shortly; that setup will eliminate the SPOF. Mertol Mertol Ozyoney Storage Practice - Sales Manager Sun Microsystems, TR Istanbul TR Phone +902123352200 Mobile +905339310752 Fax +90212335 Email [EMAIL PROTECTED] -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Bob Friesenhahn Sent: Monday, July 14, 2008 3:58 AM To: Moore, Joe Cc: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] X4540 On Fri, 11 Jul 2008, Moore, Joe wrote: Bob Friesenhahn I expect that Sun is realizing that it is already undercutting much of the rest of its product line. These minor updates would allow the X4540 to compete against much more expensive StorageTek SAN hardware. Assuming, of course that the requirements for the more expensive SAN hardware don't include, for example, surviving a controller or motherboard failure (or gracefully a RAM chip failure) without requiring an extensive downtime for replacement, or other extended downtime because there's only 1 set of chips that can talk to those disks. I am totally with you here since today I can not access my storage pool due to server motherboard failure and I don't know when Sun will successfully fix it. Since I use an external RAID array for my file server, there would not be so much hardship except that I do not have a spare file server available. Bob ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Fri, 11 Jul 2008, Moore, Joe wrote: Bob Friesenhahn I expect that Sun is realizing that it is already undercutting much of the rest of its product line. These minor updates would allow the X4540 to compete against much more expensive StorageTek SAN hardware. Assuming, of course that the requirements for the more expensive SAN hardware don't include, for example, surviving a controller or motherboard failure (or gracefully a RAM chip failure) without requiring an extensive downtime for replacement, or other extended downtime because there's only 1 set of chips that can talk to those disks. I am totally with you here since today I can not access my storage pool due to server motherboard failure and I don't know when Sun will successfully fix it. Since I use an external RAID array for my file server, there would not be so much hardship except that I do not have a spare file server available. Bob ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Well, I'm not holding out much hope of Sun working with these suppliers any time soon. I asked Vmetro why they don't work with Sun considering how well ZFS seems to fit with their products, and this was the reply I got: Micro Memory has a long history of working with Sun, and I worked at Sun for almost 10 years developing Solaris x86. We have tried to get various Sun Product Managers responsible for these servers (Thumper) to work with us on this and they have said no. We have tried to get Sun's integration group to work with us (where they would integrate upon customer request, charging the customer for integration and support), and they have also said no. They don't feel there is an adequate business case to justify it as all of the opportunities are so small. This is an incredibly frustrating response for all the Sun customers who could have really benefited from these cards. Why develop the ability to move the ZIL to nvram devices, benchmark the Thumper on one of them, and then refuse to work with the manufacturer to offer the card to customers? Maybe post this to Jonathan's blog. When the stock is down so much, it's bad that some guy somewhere is not properly doing his/her job of providing something the customers want. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Jul 11, 2008, at 5:32 PM, Richard Elling wrote: Yes, of course. But there is only one CF slot. Cool coincidence that the following article on CF cards and DMA transfers was posted to /. http://hardware.slashdot.org/article.pl?sid=08/07/12/1851251 I take it that Sun's going to ship/sell OEM'd CF cards of some sort for Loki. Hopefully they're ones that don't crap out on DMA transfers. /dale ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Well, I'm not holding out much hope of Sun working with these suppliers any time soon. I asked Vmetro why they don't work with Sun considering how well ZFS seems to fit with their products, and this was the reply I got: Micro Memory has a long history of working with Sun, and I worked at Sun for almost 10 years developing Solaris x86. We have tried to get various Sun Product Managers responsible for these servers (Thumper) to work with us on this and they have said no. We have tried to get Sun's integration group to work with us (where they would integrate upon customer request, charging the customer for integration and support), and they have also said no. They don't feel there is an adequate business case to justify it as all of the opportunities are so small. This is an incredibly frustrating response for all the Sun customers who could have really benefited from these cards. Why develop the ability to move the ZIL to nvram devices, benchmark the Thumper on one of them, and then refuse to work with the manufacturer to offer the card to customers? I appreciate Sun are working on their own flash memory solutions, but surely it's to their benefit and ours to take advantage of the technology already on the market with years of tried and tested use behind it? This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Jul 10, 2008, at 12:42, Tim wrote: It's the same reason you don't see HDS or EMC rushing to adjust the price of the SYM or USP-V based on Sun releasing the thumpers. No one ever got fired for buying EMC/HDS/NTAP. I know my company has corporate standards for various aspects of IT, and if someone purchases something outside of that (which is frowned upon) then you're on your own. If you open a service / trouble ticket for it they'll just close it, saying it's not supported. ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Bob Friesenhahn I expect that Sun is realizing that it is already undercutting much of the rest of its product line. These minor updates would allow the X4540 to compete against much more expensive StorageTek SAN hardware. Assuming, of course that the requirements for the more expensive SAN hardware don't include, for example, surviving a controller or motherboard failure (or gracefully a RAM chip failure) without requiring an extensive downtime for replacement, or other extended downtime because there's only 1 set of chips that can talk to those disks. Real SAN storage is dual-ported to dual controller nodes so that you can replace a motherboard without taking down access to the disk. Or install a new OS version without waiting for the system to POST. How can other products remain profitable when competing against such a star performer? Features. RAS. Simplicity. Corporate Inertia (having storage admins who don't know OpenSolaris). Executive outings with StorageTek-logo'd golfballs. The last 2 aren't something I'd build a business case around, but they're a reality. --Joe ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Fri, Jul 11, 2008 at 9:25 AM, Moore, Joe [EMAIL PROTECTED] wrote: Features. RAS. Simplicity. Corporate Inertia (having storage admins who don't know OpenSolaris). Executive outings with StorageTek-logo'd golfballs. The last 2 aren't something I'd build a business case around, but they're a reality. --Joe Why not? There are several in the market today who I suspect have done just that :D I won't name names, but for anyone in the industry I doubt I have to. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Richard Elling wrote: The best news, for many folks, is that you can boot from an (externally pluggable) CF card, so that you don't have to burn two disks for the OS. Can these be mirrored? I've been bitten by these cards failing (in a camera). Ian ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Ian Collins wrote: Richard Elling wrote: The best news, for many folks, is that you can boot from an (externally pluggable) CF card, so that you don't have to burn two disks for the OS. Can these be mirrored? I've been bitten by these cards failing (in a camera). Yes, of course. But there is only one CF slot. If you are worried about data loss, zfs set copies=2. If you are worried about CF loss, mirror to something else. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
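A quick sketch of the two options Richard mentions, assuming a ZFS root pool named rpool and placeholder device names; note that copies=2 applies only to blocks written after the property is set:

  # option 1: keep two copies of every newly written block on the single CF card
  zfs set copies=2 rpool

  # option 2: mirror the boot device onto another device
  zpool attach rpool c0t0d0s0 c1t0d0s0
  # (on x86 the second device also needs boot blocks installed, e.g. with
  #  installgrub, before the system can actually boot from it)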
Re: [zfs-discuss] X4540
I think it's a cracking upgrade, Richard. I was hoping Sun would do something like this, so it's great to see it arrive. As others have said though, I think Sun are missing a trick by not working with Vmetro or Fusion-io to add nvram cards to the range now. In particular, if Sun were to work with Fusion-io and add Solaris drivers for the ioDrive, you'd be in a position right now to offer a 48TB server with 64GB of read cache, and 80GB of write cache. You could even offer the same card on the smaller x4240. Can you imagine how well those machines would work as NFS servers? Either one would make a superb NFS storage platform for VMware: You've got incredible performance, ZFS snapshots for backups, and ZFS send/receive to replicate the data elsewhere. NetApp and EMC charge a small fortune for a NAS that can do all that, and they don't offer anywhere near that amount of fast cache. Both servers would take Infiniband too, which is dirt cheap these days at $125 a card, is supported by VMware, and particularly on the smaller server, is way faster than anything EMC or NetApp offer. As an NFS storage platform, you'd be beating EMC and NetApp on price, spindle count, features and performance. I really hope somebody at Sun considers this, and thinks about expanding the "What can you do with an x4540" section on the website to include VMware. Ross This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Oh god, I hope not. A patent on fitting a card in a PCI-E slot, or using nvram with RAID (which raid controllers have been doing for years) would just be ridiculous. This is nothing more than cache, and even with the American patent system I'd have thought it hard to get that past the obviousness test. This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Jul 10, 2008, at 7:05 AM, Ross wrote: Oh god, I hope not. A patent on fitting a card in a PCI-E slot, or using nvram with RAID (which raid controllers have been doing for years) would just be ridiculous. This is nothing more than cache, and even with the American patent system I'd have thought it hard to get that past the obviousness test. How quickly they forget. Take a look at the Prestoserve User's Guide for a refresher... http://docs.sun.com/app/docs/doc/801-4896-11 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Spencer Shepler wrote: On Jul 10, 2008, at 7:05 AM, Ross wrote: Oh god, I hope not. A patent on fitting a card in a PCI-E slot, or using nvram with RAID (which raid controllers have been doing for years) would just be ridiculous. This is nothing more than cache, and even with the American patent system I'd have thought it hard to get that past the obviousness test. How quickly they forget. Take a look at the Prestoserve User's Guide for a refresher... http://docs.sun.com/app/docs/doc/801-4896-11 Or Fast Write Cache http://docs.sun.com/app/docs/coll/fast-write-cache2.0 ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Thu, 10 Jul 2008, Ross wrote: As a NFS storage platform, you'd be beating EMC and NetApp on price, spindle count, features and performance. I really hope somebody at Sun considers this, and thinks about expanding the What can you do with an x4540 section on the website to include VMware. I expect that Sun is realizing that it is already undercutting much of the rest of its product line. These minor updates would allow the X4540 to compete against much more expensive StorageTek SAN hardware. How can other products remain profitable when competing against such a star performer? Bob == Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Thu, Jul 10, 2008 at 10:20 AM, Bob Friesenhahn [EMAIL PROTECTED] wrote: On Thu, 10 Jul 2008, Ross wrote: As a NFS storage platform, you'd be beating EMC and NetApp on price, spindle count, features and performance. I really hope somebody at Sun considers this, and thinks about expanding the What can you do with an x4540 section on the website to include VMware. I expect that Sun is realizing that it is already undercutting much of the rest of its product line. These minor updates would allow the X4540 to compete against much more expensive StorageTek SAN hardware. How can other products remain profitable when competing against such a star performer? Bob == Bob Friesenhahn [EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer,http://www.GraphicsMagick.org/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss Because at the end of the day, the x4540 still isn't *there* (and probably never will be) for 24/7 SAN/LUN access. AFAIK, nothing in the storagetek line-up is worth a damn as far as NAS goes that would compete with this. I honestly don't believe anyone looking at a home-grown x4540 is TRULY in the market for a high end STK SAN anyways. It's the same reason you don't see HDS or EMC rushing to adjust the price of the SYM or USP-V based on Sun releasing the thumpers. --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Torrey McMahon wrote: Spencer Shepler wrote: On Jul 10, 2008, at 7:05 AM, Ross wrote: Oh god, I hope not. A patent on fitting a card in a PCI-E slot, or using nvram with RAID (which raid controllers have been doing for years) would just be ridiculous. This is nothing more than cache, and even with the American patent system I'd have thought it hard to get that past the obviousness test. How quickly they forget. Take a look at the Prestoserve User's Guide for a refresher... http://docs.sun.com/app/docs/doc/801-4896-11 Or Fast Write Cache http://docs.sun.com/app/docs/coll/fast-write-cache2.0 Yeah, the J-shaped scar just below my right shoulder blade... For the benefit of the alias, these sorts of products have a very limited market because they store state inside the server and use batteries. RAS guys hate batteries, especially those which are sitting on non-hot-pluggable I/O cards. While there are some specific cards which do allow hardware assisted remote replication (a previous Sun technology called reflective memory, as used by VAXclusters) most of the issues are with serviceability and not availability. It is really bad juju to leave state in the wrong place during a service event. Where I think the jury is deadlocked is whether these are actually faster than RAID cards like http://www.sun.com/storagetek/storage_networking/hba/raid/ But from a performability perspective, the question is whether or not such cards perform significantly better than SSDs. Thoughts? -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
[zfs-discuss] X4540
So, I see Sun finally updated the Thumper, and it appears they're now using a PCI-E backplane. Anyone happen to know what the chipset is? Any chance we'll see an 8-port PCI-E SATA card finally?? The new Sun Fire X4540 server uses PCI Express IO technology for more than triple the system IO-to-network bandwidth. http://www.sun.com/servers/x64/x4540/ --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
The X4540 uses on-board LSI SAS controllers (C1068E). - Eric On Wed, Jul 09, 2008 at 02:59:26PM -0500, Tim wrote: So, I see Sun finally updated the Thumper, and it appears they're now using a PCI-E backplane. Anyone happen to know what the chipset is? Any chance we'll see an 8-port PCI-E SATA card finally?? The new Sun Fire X4540 server uses PCI Express IO technology for more than triple the system IO-to-network bandwidth. http://www.sun.com/servers/x64/x4540/ --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Eric Schrock, Fishworkshttp://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Wed, Jul 9, 2008 at 3:09 PM, Eric Schrock [EMAIL PROTECTED] wrote: The X4540 uses on-board LSI SAS controllers (C1068E). - Eric On Wed, Jul 09, 2008 at 02:59:26PM -0500, Tim wrote: So, I see Sun finally updated the Thumper, and it appears they're now using a PCI-E backplane. Anyone happen to know what the chipset is? Any chance we'll see an 8-port PCI-E SATA card finally?? The new Sun Fire X4540 server uses PCI Express IO technology for more than triple the system IO-to-network bandwidth. http://www.sun.com/servers/x64/x4540/ --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss -- Eric Schrock, Fishworks http://blogs.sun.com/eschrock Perfect. Which means good ol' supermicro would come through :) WOHOO! AOC-USAS-L8i http://www.supermicro.com/products/accessories/addon/AOC-USAS-L8i.cfm --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Wed, Jul 9, 2008 at 2:59 PM, Tim [EMAIL PROTECTED] wrote: So, I see Sun finally updated the Thumper, and it appears they're now using a PCI-E backplane. Anyone happen to know what the chipset is? Any chance we'll see an 8-port PCI-E SATA card finally?? The new Sun Fire X4540 server uses PCI Express IO technology for more than triple the system IO-to-network bandwidth. http://www.sun.com/servers/x64/x4540/ Any word on why PCI-Express was not extended to the expansion slots? I put PCI-Express cards in every other server that I connect to 10 gigabit Ethernet or the SAN (FC tape drives). -- Mike Gerdts http://mgerdts.blogspot.com/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Tim wrote: So, I see Sun finally updated the Thumper, and it appears they're now using a PCI-E backplane. Anyone happen to know what the chipset is? Any chance we'll see an 8-port PCI-E SATA card finally?? One NVidia MCP-55 and two NVidia IO-55s replace the thumper's AMD-8132 HT to PCI-X bridges. The new configuration is such that the expandable PCI-E slots have their own IO-55. The MCP-55 and one IO-55 connect to 3 LSI 1068E and provide 2x GbE each. This should be a better balance than the thumper's configuration. LSI 1068E SAS/SATA controllers replace thumper's Marvell SAS/SATA controllers. You might recognize the LSI 1068, and its smaller cousin, the 1064, as being used in many other Sun servers from the T1000 to the M9000. 8-port PCI-E SAS/SATA card is supported for additional expansion, such as a J4500 (the JBOD-only version) http://www.sun.com/storagetek/storage_networking/hba/sas/specs.xml The best news, for many folks, is that you can boot from an (externally pluggable) CF card, so that you don't have to burn two disks for the OS. I think we have solved many of the deficiencies noted in the thumper, including more CPU and memory capacity. Please let us know what you think :-) -- richard The new Sun Fire X4540 server uses PCI Express IO technology for more than triple the system IO-to-network bandwidth. http://www.sun.com/servers/x64/x4540/ --Tim ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Wed, Jul 09, 2008 at 03:19:53PM -0500, Mike Gerdts wrote: Any word on why PCI-Express was not extended to the expansion slots? I put PCI-Express cards in every other server that I connect to 10 gigabit Ethernet or the SAN (FC tape drives). The webpage is incorrect. There are three 8x PCI-E half-height slots on the X4540. - Eric -- Eric Schrock, Fishworkshttp://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Might also want to have them talk to byteandswitch. "We went to the next-generation Intel processors [and] we have used the latest generation of our Solaris ZFS software," he explains, adding that the J4000 JBODs can also be connected to the X4540. Either the 4540 is using XEON's now, someone was misquoted, or someone was confused :) http://www.byteandswitch.com/document.asp?doc_id=158533WT.svl=news1_1 --Tim On Wed, Jul 9, 2008 at 3:44 PM, Richard Elling [EMAIL PROTECTED] wrote: Yes, thanks for catching this. I'm sure it is just a copy-n-paste mistake. I've alerted the product manager to get it fixed. -- richard Mike Gerdts wrote: On Wed, Jul 9, 2008 at 3:29 PM, Richard Elling [EMAIL PROTECTED] wrote: 8-port PCI-E SAS/SATA card is supported for additional expansion, such as a J4500 (the JBOD-only version) http://www.sun.com/storagetek/storage_networking/hba/sas/specs.xml Based upon my previous message, this message, and Joerg Moellenkamp's blog entry[1], I think that the hardware specifications page[2] needs to be updated so that the expansion slots say PCI-Express rather than PCI-X. 1. http://www.c0t0d0s0.org/archives/4605-New-storage-from-Sun-J420044004500-and-X4540-Storage-Server.html 2. http://www.sun.com/servers/x64/x4540/specs.xml ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Wed, Jul 09, 2008 at 03:52:27PM -0500, Tim wrote: Is the 4540 still running a rageXL? I find that somewhat humorous if it's an Nvidia chipset with ATI video :) According to SMBIOS there is an on-board device of type AST2000 VGA. - Eric -- Eric Schrock, Fishworkshttp://blogs.sun.com/eschrock ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Tim wrote: Is the 4540 still running a rageXL? I find that somewhat humorous if it's an Nvidia chipset with ATI video :) Yes, it is part of the chip which handles the management interface. I don't find this to be a contradiction, though. AMD bought ATI and we're using AMD Quad-core CPUs. -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
Eric Schrock wrote: On Wed, Jul 09, 2008 at 03:52:27PM -0500, Tim wrote: Is the 4540 still running a rageXL? I find that somewhat humorous if it's an Nvidia chipset with ATI video :) According to SMBIOS there is an on-board device of type AST2000 VGA. Yes, I think I found another copy-n-paste error in some docs :-( It does appear to be an AST2000, something like: http://www.aspeedtech.com/ast2000.html -- richard ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] X4540
On Wed, Jul 9, 2008 at 3:29 PM, Richard Elling [EMAIL PROTECTED] wrote: Tim wrote: So, I see Sun finally updated the Thumper, and it appears they're now using a PCI-E backplane. Anyone happen to know what the chipset is? Any chance we'll see an 8-port PCI-E SATA card finally?? One NVidia MCP-55 and two NVidia IO-55s replace the thumper's AMD-8132 HT to PCI-X bridges. The new configuration is such that the expandable PCI-E slots have their own IO-55. The MCP-55 and one IO-55 connect to 3 LSI 1068E and provide 2x GbE each. This should be a better balance than the thumper's configuration. LSI 1068E SAS/SATA controllers replace thumper's Marvell SAS/SATA controllers. You might recognize the LSI 1068, and its smaller cousin, the 1064, as being used in many other Sun servers from the T1000 to the M9000. 8-port PCI-E SAS/SATA card is supported for additional expansion, such as a J4500 (the JBOD-only version) http://www.sun.com/storagetek/storage_networking/hba/sas/specs.xml The best news, for many folks, is that you can boot from an (externally pluggable) CF card, so that you don't have to burn two disks for the OS. I think we have solved many of the deficiencies noted in the thumper, including more CPU and memory capacity. Please let us know what you think :-) Not that I'm in the market for one - but I think a version with (possibly fewer) 15k RPM SAS disks would be a best seller - especially for applications that require more IOPS, like RDBMS for example. And yes, I realize that one could install a SAS card into the 4540 and attach it to one of the SAS-based J4nnn boxes - but that's not the same physical density that a 4540 with SAS disks would offer. Or even a mixture of SATA and SAS drives. And it would be great if Sun would OEM the Micro Memory (aka Vmetro) cards. Obviously it's only a question of time before Sun will bring its own RAM/flash cards to the market - but an OEM deal would make product available now and probably won't compete with what Sun has in mind (based entirely on my own crystal ball gazing). We all know how big a win this is for NFS shares! Congrats to Sun, Team ZFS and open storage. The new x45xx and J4xxx boxes are *great* additions to Sun's product line. -- richard The new Sun Fire X4540 server uses PCI Express IO technology for more than triple the system IO-to-network bandwidth. http://www.sun.com/servers/x64/x4540/ --Tim Regards, -- Al Hopper Logical Approach Inc,Plano,TX [EMAIL PROTECTED] Voice: 972.379.2133 Timezone: US CDT OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007 http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/ ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss