Re: Pros and cons of using RAID or different RAIDS?
On Mon, 2013-04-22 at 02:04 +0200, Shawn Heisey wrote:

> Aside from cost, the main reason that I have not seriously investigated SSD drives is because I have not come across a solution for any level of RAID (even RAID1) with SSDs that exposes TRIM to the operating system. Without reliable TRIM support, an SSD solution is not viable for a long-term setup.

Why not? Enterprise-oriented benchmarks start by hammering the drives until they are fragmented enough that performance does not degrade any further with subsequent writes. Even in that state they have vastly superior latency compared to spinning drives.

- Toke Eskildsen, State and University Library, Denmark
Re: Pros and cons of using RAID or different RAIDS?
When I read the HBase documentation, it says RAID is not recommended in many cases. With SolrCloud, if a machine goes down, the replicas already provide failover. Given the purposes RAID is normally used for, are the following true?

* *fault tolerance*: RAID does not make sense for this, because SolrCloud already has a mechanism for it. Instead of spending disks on RAID redundancy, I could use them somewhere else for a replica.
* *read and write performance*: I should select a RAID level with good read/write performance in mind.

All in all, maybe I should consider non-RAID drive architectures such as JBOD? What do you think about ignoring the RAID levels that are chosen for fault tolerance, selecting only for read/write performance, and possibly using a non-RAID drive architecture?
Re: Pros and cons of using RAID or different RAIDS?
On 4/21/2013 4:23 PM, Furkan KAMACI wrote:

> When I read the HBase documentation, it says RAID is not recommended in many cases. With SolrCloud, if a machine goes down, the replicas already provide failover. Given the purposes RAID is normally used for, are the following true?
>
> * *fault tolerance*: RAID does not make sense for this, because SolrCloud already has a mechanism for it. Instead of spending disks on RAID redundancy, I could use them somewhere else for a replica.
> * *read and write performance*: I should select a RAID level with good read/write performance in mind.
>
> All in all, maybe I should consider non-RAID drive architectures such as JBOD? What do you think about ignoring the RAID levels that are chosen for fault tolerance, selecting only for read/write performance, and possibly using a non-RAID drive architecture?

I never build a server without at least RAID1. I figure the cost of an extra disk is more than worth the time and hassle involved in the initial setup of the server, so that I don't have to reinstall everything when a disk fails.

If you have a lot of servers in your SolrCloud cluster and you've worked out a way to build a new one extremely quickly, then JBOD might be a good solution. The I/O performance of an individual server will not be as high as it would be with RAID10, but you may see good performance from the cluster as a whole, especially if each node has plenty of RAM.

Aside from cost, the main reason that I have not seriously investigated SSD drives is that I have not come across a solution for any level of RAID (even RAID1) with SSDs that exposes TRIM to the operating system. Without reliable TRIM support, an SSD solution is not viable for a long-term setup.

Thanks,
Shawn
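For anyone who wants to check the TRIM point on their own hardware: on Linux, a block device that accepts discard (TRIM) requests normally reports a non-zero value in its `queue/discard_max_bytes` sysfs attribute, and this applies to RAID devices as well as plain disks. The following is only a minimal sketch under that assumption; the function name `supports_discard` and the device names are made up for illustration, and whether a particular RAID controller or driver passes TRIM through still has to be verified on the actual setup.

```python
from pathlib import Path


def supports_discard(device: str, sysfs_root: str = "/sys/block") -> bool:
    """Return True if the block device advertises discard (TRIM) support.

    `device` is a kernel block device name such as "sda" or "md0".
    `sysfs_root` is a parameter only so the function can be tested
    against a fake directory tree; it is not something the thread mentions.
    """
    attr = Path(sysfs_root) / device / "queue" / "discard_max_bytes"
    try:
        # A value of 0 means the device does not accept discard requests.
        return int(attr.read_text().strip()) > 0
    except (OSError, ValueError):
        # Missing device, unreadable attribute, or unexpected content.
        return False
```

Calling `supports_discard("md0")` on a software RAID volume and `supports_discard("sda")` on one of its member disks shows whether TRIM is visible at the disk level but lost at the RAID level.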
Re: Pros and cons of using RAID or different RAIDS?
On Apr 20, 2013, at 05:03, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

> Yeah, but as far as I know, there is nothing Solr-specific about that. See http://www.acnc.com/raid

There's a hardware-specific dimension to this, too: for my company's enterprise search solution, we had to replace our initial RAID setup (RAID 5, I think, using some sort of integrated RAID controller) because the performance was much lower than expected (and required). We switched to a SAN connection, and that has given us the performance we needed.
RE: Pros and cons of using RAID or different RAIDS?
Furkan KAMACI [furkankam...@gmail.com]:

> Is there any documentation that explains pros and cons of using RAID or different RAIDS?

There's plenty for RAID in general, but I do not know of any in-depth Solr-specific guides.

For index updates, you want high bulk read and write speed. That makes the parity-striped levels, such as RAID 5 and 6, poor choices for a heavily updated index.

For searching you want low latency and high throughput for small random-access reads. All the common RAID levels give you higher throughput for those.

- Toke Eskildsen
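One common way to put a number on why the parity levels handle heavy writes poorly is the rule-of-thumb "write penalty": the number of physical disk operations behind each small random write. The sketch below uses the conventional factors (2 for RAID10, 4 for RAID5, 6 for RAID6), which are general rules of thumb and not figures from this thread. The function name and the per-disk IOPS value are hypothetical, and large sequential writes or a controller cache can behave quite differently.

```python
# Conventional rule-of-thumb write penalties for small random writes:
# physical I/O operations needed per logical write.
WRITE_PENALTY = {
    "raid0": 1,   # plain striping, no redundancy
    "raid10": 2,  # each write goes to both halves of a mirror
    "raid5": 4,   # read data, read parity, write data, write parity
    "raid6": 6,   # as RAID5, but with two parity blocks
}


def random_write_iops(level: str, n_disks: int, iops_per_disk: float) -> float:
    """Estimate the random-write IOPS of an array (a rough sketch only)."""
    return n_disks * iops_per_disk / WRITE_PENALTY[level]


if __name__ == "__main__":
    # Hypothetical example: six disks at an assumed 100 IOPS each.
    for level in ("raid10", "raid5", "raid6"):
        print(level, random_write_iops(level, 6, 100))
```

With the same six disks, this estimate gives RAID10 twice the random-write rate of RAID5 and three times that of RAID6, which matches the direction of the advice above even if the real numbers depend heavily on the workload.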
Re: Pros and cons of using RAID or different RAIDS?
On 4/20/2013 7:36 AM, Toke Eskildsen wrote:

> Furkan KAMACI [furkankam...@gmail.com]:
>
> > Is there any documentation that explains pros and cons of using RAID or different RAIDS?
>
> There's plenty for RAID in general, but I do not know of any in-depth Solr-specific guides.
>
> For index updates, you want high bulk read and write speed. That makes the parity-striped levels, such as RAID 5 and 6, poor choices for a heavily updated index.
>
> For searching you want low latency and high throughput for small random-access reads. All the common RAID levels give you higher throughput for those.

The only RAID level I'm aware of that satisfies speed requirements for both indexing and queries is RAID10, striping across mirror sets. The speed goes up with each pair of disks you add. The only problem with RAID10 is that you lose half of your raw disk space, just like with RAID1. This is the RAID level that I use for my Solr servers. I have six 1TB SATA drives, giving me a usable volume of 3TB. I notice a significant disk speed increase compared to a server with single or mirrored disks. It is faster on both random and contiguous reads.

RAID 5 and 6 (striping with parity) don't lose as much disk space: one or two disks, depending on which one you choose. Read speed is very good with these levels, but unfortunately there is a penalty for writes due to the parity stripes, and that penalty can be quite severe. If you have a caching RAID controller, the write penalty is mitigated for writes that fit in the cache (usually up to 1GB), but once you start writing continuously, the penalty comes back.

In the event of a disk failure, all RAID levels will have lower performance during rebuild. RAID10 will have no performance impact before you replace the disk, and will have a mild and short-lived performance impact while the rebuild is happening. RAID5/6 has a major performance impact as soon as a disk fails, and an even higher performance impact during the rebuild, which can take a very long time. Rebuilding a failed disk on a RAID6 volume that has 23 1TB disks is a process that takes about 24 hours, and I can say that from personal experience.

Thanks,
Shawn
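The capacity figures in this message follow from simple arithmetic: RAID10 keeps half the disks' worth of space, RAID5 gives up one disk, and RAID6 gives up two. Here is a small sketch of that calculation, plus the average rebuild rate implied by "1TB disk, about 24 hours". The function names are made up for illustration, equal-sized disks and decimal units (1TB = 1,000,000MB) are assumed, and real usable space will be slightly lower because of filesystem and metadata overhead.

```python
def usable_tb(level: str, n_disks: int, disk_tb: float) -> float:
    """Usable capacity in TB for an array of equal-sized disks."""
    if level == "raid10":
        if n_disks < 4 or n_disks % 2:
            raise ValueError("RAID10 needs an even number of disks, at least 4")
        data_disks = n_disks // 2   # half the disks hold mirror copies
    elif level == "raid5":
        if n_disks < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        data_disks = n_disks - 1    # one disk's worth of parity
    elif level == "raid6":
        if n_disks < 4:
            raise ValueError("RAID6 needs at least 4 disks")
        data_disks = n_disks - 2    # two disks' worth of parity
    elif level == "jbod":
        data_disks = n_disks        # no redundancy at all
    else:
        raise ValueError(f"unknown level: {level}")
    return data_disks * disk_tb


def rebuild_rate_mb_s(disk_tb: float, hours: float) -> float:
    """Average write rate needed to refill one replaced disk in `hours`."""
    return disk_tb * 1_000_000 / (hours * 3600)


if __name__ == "__main__":
    print(usable_tb("raid10", 6, 1.0))   # the six-drive RAID10 setup
    print(usable_tb("raid6", 23, 1.0))   # the 23-drive RAID6 volume
    print(rebuild_rate_mb_s(1.0, 24))    # implied average rebuild rate
```

For the six 1TB drives in RAID10 this gives the 3TB mentioned above. The 24-hour rebuild of a 1TB disk works out to an average of under 12MB/s, far below what a single SATA drive can write sequentially, which illustrates how much work the parity reconstruction adds.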
Pros and cons of using RAID or different RAIDS?
Is there any documentation that explains pros and cons of using RAID or different RAIDS?
Re: Pros and cons of using RAID or different RAIDS?
Yeah, but as far as I know, there is nothing Solr-specific about that. See http://www.acnc.com/raid

Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Fri, Apr 19, 2013 at 11:19 AM, Furkan KAMACI furkankam...@gmail.com wrote:

> Is there any documentation that explains pros and cons of using RAID or different RAIDS?