Re: Storage server HW advice/feedback req for setup overall & in particular reliability/QoS of SATA, to protect from controller- or BIOS-induced system crashes? Dedicated PCI SATA HBA needed??
On Fri, Feb 19, 2016 at 05:01:19PM +0700, Tinker wrote:
> On 2016-02-17 01:01, j...@bitminer.ca wrote:
> ..
> >Why do you think you need to build such a device? Why don't you buy it?
> >
> >(Dell PowerEdge VRTX, HP hyper converged, etc)
>
> Colocation requires rack servers, but thanks for thinking about it.
>
> >Some important things:
> >
> > - what is the purpose of this collection of clients, servers,
> >   networks and software?
> > - who will judge, and how will they judge, the effectiveness of it?
> >   How fast/correctly it performs, not just reliability
> > - what is their budget?
> > - how much time will they give you?
> > - how will you spend your time?
> > - how will you prove to yourself that you have finished? How can you
> >   prove to your users/customer that it works?
>
> I feel the general take-home from this conversation and Nick Holland's
> suggestions is that really anything might break and everything needs
> to be set up to handle that.
>
> So like, all of
>
> * data integrity verifications,
> * checksumming of everything,
> * automated routines to take a node out of use in case of IO slowdown
>   or failure, or any other error, and
> * live syncing of important stuff to another datacenter for the case
>   of power failures
>
> need to be in place for there to be any real data integrity + QoS
> guarantees.
>
> Thanks!

You keep forgetting "planning for the manual stuff that will need to be
done when all automated stuff fails".
Re: Storage server HW advice/feedback req for setup overall & in particular reliability/QoS of SATA, to protect from controller- or BIOS-induced system crashes? Dedicated PCI SATA HBA needed??
On 2016-02-17 01:01, j...@bitminer.ca wrote:
..
> Why do you think you need to build such a device? Why don't you buy it?
>
> (Dell PowerEdge VRTX, HP hyper converged, etc)

Colocation requires rack servers, but thanks for thinking about it.

> Some important things:
>
>  - what is the purpose of this collection of clients, servers,
>    networks and software?
>  - who will judge, and how will they judge, the effectiveness of it?
>    How fast/correctly it performs, not just reliability
>  - what is their budget?
>  - how much time will they give you?
>  - how will you spend your time?
>  - how will you prove to yourself that you have finished? How can you
>    prove to your users/customer that it works?

I feel the general take-home from this conversation and Nick Holland's
suggestions is that really anything might break and everything needs to
be set up to handle that.

So like, all of

* data integrity verifications,
* checksumming of everything,
* automated routines to take a node out of use in case of IO slowdown
  or failure, or any other error (a minimal sketch of such a routine
  follows below), and
* live syncing of important stuff to another datacenter for the case of
  power failures

need to be in place for there to be any real data integrity + QoS
guarantees.

Thanks!
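A minimal sketch of that "take a node out of use" routine, assuming a
health probe that times one small synchronous write. PROBE_PATH,
LATENCY_LIMIT and drain_node() are illustrative placeholders, not any
existing tool or API:

    #!/usr/bin/env python3
    # Sketch: pull a storage node out of rotation when a test write
    # fails outright or takes too long. The probe file path, latency
    # limit and drain_node() are hypothetical placeholders.
    import os
    import time

    PROBE_PATH = "/var/db/health-probe"   # scratch file on the array
    LATENCY_LIMIT = 10.0                  # seconds; example threshold

    def probe_ok() -> bool:
        """Write and fsync one 4 KB block; fail on error or slowness."""
        start = time.monotonic()
        try:
            fd = os.open(PROBE_PATH, os.O_WRONLY | os.O_CREAT, 0o600)
            try:
                os.write(fd, b"x" * 4096)
                os.fsync(fd)              # push the write through to disk
            finally:
                os.close(fd)
        except OSError:
            return False                  # any IO error counts as failure
        return (time.monotonic() - start) < LATENCY_LIMIT

    def drain_node() -> None:
        """Placeholder: tell the cluster to stop sending work here."""
        print("node unhealthy, taking it out of use")

    if __name__ == "__main__":
        if not probe_ok():
            drain_node()

Run from cron (or similar) on each node, so a slow or dying disk gets
the node drained instead of dragging down the whole cluster's QoS.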
Re: Storage server HW advice/feedback req for setup overall & in particular reliability/QoS of SATA, to protect from controller- or BIOS-induced system crashes? Dedicated PCI SATA HBA needed??
> Hi, This is to ask you for your thoughts/advice on the best hardware
> setup for an OpenBSD server.

Oh, where to start. You clearly have a lot of enthusiasm but not a lot
of experience. OK, I'll bite.

"best" is subjective. The server(s) will be surrounded by clients (they
are servers, after all). What is the best client for this best server?

What is the purpose of this collection of servers and clients? What is
your budget? Who will evaluate this system, and on what basis will they
describe it as successful or not?

> This email ultimately reduces to the question, "What HW & config do
> you suggest for minimizing the possibility of IO freeze or system
> crash from BIOS or SATA card, in the event of SSD/HDD malfunction?",
> however I'll take the whole reasoning around the HW choice from the
> ground up with you just to see that you feel that I got it all right.

This post and others seem to show you are very concerned with I/O
freeze. Yet that is a rare occurrence compared to hundreds of other
possibilities for system failure. AC power failure, for instance.

> I hope this email will serve as general advice for others re. best
> practice for OpenBSD server hardware choices.
>
> GOAL
>
> I am setting up an SSD-based storage facility that needs high data
> integrity guarantees and high performance (random reads/writes). The
> goal is to be able to safely store and constantly process something
> about as important as, say, medical records.

"high" and "guarantee" are mutually incompatible. You either get a
guarantee or you don't. (Any guarantee is unlikely to be credible.)

Now, if they said "perfect" and "guarantee" then your statement would
be correct, however still unbelievable. There is a disconnect here in
the logic.

> Needless to say, at some point such a storage server *will* fail, and
> the only way to get to any sense of a pretty-much-100% uptime
> guarantee is to set up the facility in the form of multiple servers
> in a redundant cluster.

OK, now you have a choice: do you want to spend lots of money on highly
reliable servers and cluster them, or spend less money on less reliable
servers and rely on the clustering for overall reliability?

OpenBSD does not support clustered filesystems, so here you must be
assuming some other non-OpenBSD package, such as from ports, to
implement "clusters". Is this right?

> What the individual server can do then is to never ever deliver
> broken data. And, locally and collectively there needs to be a well
> working mechanism for detecting when a node needs maintenance & take
> it out of use then.

Another error in logic. "never ever" is incompatible with "*will*
fail".

You might want to review how Netflix manages failure. Look up "chaos
monkey". The gist of it is that, based on a "will fail" assumption,
they constantly test their handling of failures.

> What I want to ask you about now, then, is your thoughts on what
> would be the most suitable hardware configuration for the individual
> server, for them to function for as long as possible without need
> for physical administrator intervention.

Why do you think you need to build such a device? Why don't you buy it?

(Dell PowerEdge VRTX, HP hyper converged, etc)

> (And for when physical admin intervention would be needed, to reduce
> competence need for that maintenance if possible, to only involve
> hotswapping or adding a physical disk - so that is to minimize need
> of reboots due to SATA controller issues, weird BIOS behavior, or
> other reasons.)
> GENERAL PROBLEM SURFACE OF SERVER HARDWARE
>
> It seems to me that the accumulated experience with respect to why
> servers break, is 1) anything storage-related, 2) PSU, 3) other.

You don't give any source for this claim. Check out various
publications by Google and other at-scale users about their
experience.

> So then, stability aspects should be given consideration in that
> order.
>
> For 2), the PSU can be made redundant easily, and PSU failures are
> fairly rare anyhow, so that is pretty much what is reasonable to do
> for that.

You omit AC power failures, distribution panel faults, uninterruptible
power systems, power cables, unintended pressure by fingers roaming
on/off buttons, feet kicking power cables, and so on. Why do you leave
these risks out?

> For 3), the "other" category would either be because of bad thermal
> conditions (so that needs to be given proper consideration), or
> happen anyhow, for which no safeguards exist anyhow, so we just need
> to take that.
>
> The rest of this post will discuss 1), the storage aspect, only.
>
> THE STORAGE SOLUTION
>
> Originally I thought RAID 5/6 would provide data integrity guarantees
> and performance well. Then I saw the benchmark for a high-end RAID
> card showing 25MB/sec write (= 95% overhead) and 80% overhead on
> reads (http://www.storagereview.com/lsi_megaraid_sas3_93618i_review)
> per disk

The reference you cite says no such thing. The word "overhead" does
not appear in the article. That reference has some flaky methodology
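On the "chaos monkey" point above: if failure handling is only
exercised during real emergencies, it will be broken exactly when it is
needed. A toy sketch of the idea, where NODES and fail_node() are
hypothetical placeholders rather than any real Netflix tooling:

    #!/usr/bin/env python3
    # Toy chaos-monkey sketch: pick one node at random and inject a
    # failure, so the failover path runs routinely instead of only
    # during real outages. NODES and fail_node() are placeholders.
    import random

    NODES = ["store1", "store2", "store3"]   # hypothetical storage nodes

    def fail_node(node: str) -> None:
        """Placeholder: a real version might power-cycle the node or
        block its switch port; this one only reports the victim."""
        print(f"injecting failure on {node}")

    if __name__ == "__main__":
        fail_node(random.choice(NODES))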
Storage server HW advice/feedback req for setup overall & in particular reliability/QoS of SATA, to protect from controller- or BIOS-induced system crashes? Dedicated PCI SATA HBA needed??
Hi,

This is to ask you for your thoughts/advice on the best hardware setup
for an OpenBSD server.

This email ultimately reduces to the question, "What HW & config do you
suggest for minimizing the possibility of IO freeze or system crash
from BIOS or SATA card, in the event of SSD/HDD malfunction?", however
I'll take the whole reasoning around the HW choice from the ground up
with you just to see that you feel that I got it all right.

I hope this email will serve as general advice for others re. best
practice for OpenBSD server hardware choices.

GOAL

I am setting up an SSD-based storage facility that needs high data
integrity guarantees and high performance (random reads/writes). The
goal is to be able to safely store and constantly process something
about as important as, say, medical records.

Needless to say, at some point such a storage server *will* fail, and
the only way to get to any sense of a pretty-much-100% uptime guarantee
is to set up the facility in the form of multiple servers in a
redundant cluster.

What the individual server can do then is to never ever deliver broken
data. And, locally and collectively, there needs to be a well-working
mechanism for detecting when a node needs maintenance and taking it out
of use then.

What I want to ask you about now, then, is your thoughts on what would
be the most suitable hardware configuration for the individual servers,
for them to function for as long as possible without need for physical
administrator intervention.

(And for when physical admin intervention would be needed: to reduce
the competence needed for that maintenance, if possible, to only
involve hotswapping or adding a physical disk - that is, to minimize
the need for reboots due to SATA controller issues, weird BIOS
behavior, or other reasons.)

GENERAL PROBLEM SURFACE OF SERVER HARDWARE

It seems to me that the accumulated experience with respect to why
servers break is 1) anything storage-related, 2) PSU, 3) other. So
then, stability aspects should be given consideration in that order.

For 2), the PSU can be made redundant easily, and PSU failures are
fairly rare anyhow, so that is pretty much what is reasonable to do for
that.

For 3), the "other" category would either be because of bad thermal
conditions (so that needs to be given proper consideration), or happen
anyhow, for which no safeguards exist anyhow, so we just need to take
that.

The rest of this post will discuss 1), the storage aspect, only.

THE STORAGE SOLUTION

Originally I thought RAID 5/6 would provide data integrity guarantees
and performance well. Then I saw the benchmark for a high-end RAID card
showing 25MB/sec write (= 95% overhead) and 80% overhead on reads
(http://www.storagereview.com/lsi_megaraid_sas3_93618i_review) per disk
set, which is enough to make me understand that the upcoming softraid
RAID1C with 2-4 drives will be far better at delivering those
qualities. Of course I didn't see any benchmarks on RAID1C, but I guess
its overhead for both read and write will be <<10-15% on average, at
least with its default CRC32C.

(Perhaps RAID1C needs to be fortified with a better checksumming
algorithm, and perhaps also double mirror reads on any read (depending
on how the scrubbing works - didn't check this yet), though that is a
separate conversation.)
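Since RAID1C's default CRC32C comes up here: the checksum itself is
small enough to show in full. A bitwise sketch follows, only to pin
down which checksum is meant; real implementations are table-driven or
use the SSE4.2 crc32 instruction:

    # CRC-32C (Castagnoli): reflected polynomial 0x82F63B78,
    # initial value and final XOR both 0xFFFFFFFF.
    def crc32c(data: bytes, crc: int = 0) -> int:
        crc ^= 0xFFFFFFFF
        for byte in data:
            crc ^= byte
            for _ in range(8):
                if crc & 1:
                    crc = (crc >> 1) ^ 0x82F63B78
                else:
                    crc >>= 1
        return crc ^ 0xFFFFFFFF

    # Standard check value: CRC-32C of "123456789" is 0xE3069283.
    assert crc32c(b"123456789") == 0xE3069283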
Of course, to really know how well RAID1C will perform, I would need to
benchmark it, but there seems to be a general consensus in the RAID
community that checksummed mirroring is preferable to RAID 5/6, so,
like, I perceive that this preliminary understanding of mine - that
RAID1C will be the winning option - is well founded.

The SSDs would be enterprise grade and hence *should* shut down
immediately if they start malfunctioning, so there should be
essentially no QoS dumps in the softraid from any IO operations that
take ultra-long to complete, e.g. >>>10 seconds.

For the RAID1C to really deliver then (now that PSU, CPU, RAM, and SSD
all work), all that would be needed is that the remaining factors
deliver well: that is, the SATA connectivity, and that the BIOS
operates transparently.

HARDWARE BUDGET

A good Xeon Supermicro server with onboard SATA and ethernet, with a
decent PSU, RAM, and CPU, is some 1000s of USD. 2TB x 2-3 enterprise
SSDs is around 2700-4000 USD. Any specialized SATA controllers, if
needed, would be below 2000 USD anyhow.

QUESTION

Someone with 30 years of admin experience warned me that in the case
that an individual storage drive dies, the SATA controller could crash,
or the BIOS could kill the whole system.

Also, he warned me that if any disk in the boot softraid RAID1 would
break, then the BIOS could get so confused that the system wouldn't
even want to boot - and for that reason I guess the boot disks should
be separated altogether from the "data disks", as the former will have
a much, much lower turnover.

A SATA-controller- or BIOS-induced system crash, freeze, or other need
to reboot the system because of their malfunction, would be really