Re: Pros and cons of using RAID or different RAIDS?

2013-04-22 Thread Toke Eskildsen
On Mon, 2013-04-22 at 02:04 +0200, Shawn Heisey wrote:
 Aside from cost, the main reason that I have not seriously investigated
 SSD drives is because I have not come across a solution for any level of
 RAID (even RAID1) with SSDs that exposes TRIM to the operating system.
 Without reliable TRIM support, an SSD solution is not viable for a
 long-term setup.

Why not? Enterprise-oriented benchmarks start by hammering the drives
until they are fragmented enough that performance no longer degrades
under subsequent writes. Even in that steady state they have vastly
superior latency compared to spinning drives.
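
The preconditioning step looks roughly like the sketch below (assumed
test-file path and sizes; a real benchmark tool such as fio preconditions
the whole device and bypasses the page cache, which this file-level
version only approximates):

import os
import random
import time

# Sketch only: overwrite a test file on the SSD with random 4 KB blocks to
# approximate the drive's fragmented steady state, then sample the latency
# of small random reads -- the access pattern a search index mostly sees.
PATH = "/ssd/precondition.dat"   # hypothetical path on the SSD under test
FILE_SIZE = 4 * 1024 ** 3        # 4 GB test file
BLOCK = 4096                     # 4 KB blocks

def precondition(passes=2):
    """Fill the file once sequentially, then overwrite it randomly."""
    blocks = FILE_SIZE // BLOCK
    with open(PATH, "wb") as f:
        for _ in range(blocks):
            f.write(os.urandom(BLOCK))
    with open(PATH, "r+b") as f:
        for _ in range(passes):
            for _ in range(blocks):
                f.seek(random.randrange(blocks) * BLOCK)
                f.write(os.urandom(BLOCK))
            f.flush()
            os.fsync(f.fileno())

def random_read_latency(samples=10000):
    """Median and 99th-percentile latency of 4 KB random reads, in seconds.
    Drop the OS page cache first, otherwise this mostly measures RAM."""
    blocks = FILE_SIZE // BLOCK
    fd = os.open(PATH, os.O_RDONLY)
    times = []
    try:
        for _ in range(samples):
            offset = random.randrange(blocks) * BLOCK
            start = time.perf_counter()
            os.pread(fd, BLOCK, offset)
            times.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    times.sort()
    return times[len(times) // 2], times[int(len(times) * 0.99)]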

- Toke Eskildsen, State and University Library, Denmark



Re: Pros and cons of using RAID or different RAIDS?

2013-04-21 Thread Furkan KAMACI
When I read the HBase documentation, it says RAID is not recommended in
many cases. When we talk about SolrCloud (where a machine going down is
already handled by replicas) and think about what the different RAID
levels are for, is the following reasoning correct?

Using RAID for:

* *fault tolerance*: does not make sense, because SolrCloud already
provides this via replicas, and instead of dedicating disks to RAID
redundancy I could use them somewhere else for another replica?

* *read and write performance*: I should choose a RAID level purely for
good read/write performance.

All in all, maybe I should consider a non-RAID drive architecture such as
JBOD?

What do you think about ignoring the RAID levels that give good fault
tolerance, choosing one only for read/write performance, and maybe going
with a non-RAID drive architecture instead?



2013/4/20 Shawn Heisey s...@elyograg.org

 On 4/20/2013 7:36 AM, Toke Eskildsen wrote:
  Furkan KAMACI [furkankam...@gmail.com]:
  Is there any documentation that explains pros and cons of using RAID or
  different RAIDS?
 
  There's plenty for RAID in general, but I do not know of any in-depth
 Solr-specific guides.
 
  For index updates, you want high bulk read- and write-speed. That makes
 the striped versions, such as RAID 5 & 6, poor choices for a heavily
 updated index.
 
  For searching you want low latency and high throughput for small random
 access reads. All the common RAIDs give you higher throughput for those.

 The only RAID level I'm aware of that satisfies speed requirements for
 both indexing and queries is RAID10, striping across mirror sets.  The
 speed goes up with each pair of disks you add.  The only problem with
 RAID10 is that you lose half of your raw disk space, just like with
 RAID1.  This is the RAID level that I use for my Solr servers.  I have
 six 1TB SATA drives, giving me a usable volume of 3TB.  I notice a
 significant disk speed increase compared to a server with single or
 mirrored disks.  It is faster on both random and contiguous reads.

 RAID 5 and 6 (striping with parity) don't lose as much disk space; one
 or two disks depending on which one you choose.  Read speed is very good
 with these levels, but unfortunately there is a penalty for writes due
 to the parity stripes, and that penalty can be quite severe.  If you
 have a caching RAID controller, the write penalty is mitigated for
 writes that fit in the cache (usually up to 1GB), but once you start
 writing continuously, the penalty comes back.

 In the event of a disk failure, all RAID levels will have lower
 performance during rebuild.  RAID10 will have no performance impact
 before you replace the disk, and will have a mild and short-lived
 performance impact while the rebuild is happening.  RAID5/6 has a major
 performance impact as soon as a disk fails, and an even higher
 performance impact during the rebuild, which can take a very long time.
  Rebuilding a failed disk on a RAID6 volume that has 23 1TB disks is a
 process that takes about 24 hours, and I can say that from personal
 experience.

 Thanks,
 Shawn




Re: Pros and cons of using RAID or different RAIDS?

2013-04-21 Thread Shawn Heisey
On 4/21/2013 4:23 PM, Furkan KAMACI wrote:
 When I read the HBase documentation, it says RAID is not recommended in
 many cases. When we talk about SolrCloud (where a machine going down is
 already handled by replicas) and think about what the different RAID
 levels are for, is the following reasoning correct?

 Using RAID for:

 * *fault tolerance*: does not make sense, because SolrCloud already
 provides this via replicas, and instead of dedicating disks to RAID
 redundancy I could use them somewhere else for another replica?

 * *read and write performance*: I should choose a RAID level purely for
 good read/write performance.

 All in all, maybe I should consider a non-RAID drive architecture such as
 JBOD?

 What do you think about ignoring the RAID levels that give good fault
 tolerance, choosing one only for read/write performance, and maybe going
 with a non-RAID drive architecture instead?

I never build a server without at least RAID1.  I figure the cost of an
extra disk is well worth it, given the time and hassle involved in the
initial setup of a server, so that I don't have to reinstall everything
when a disk fails.

If you have a lot of servers in your SolrCloud cluster and you've worked
out a way to build a new one extremely quickly, then JBOD might be a
good solution.  The I/O performance of an individual server will not be
as high as it would be with RAID10, but you may see good performance
from the cluster as a whole, especially if each node has plenty of RAM.
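
For example, redundancy can live at the Solr layer instead of the disk
layer by creating the collection with more than one replica per shard.
A minimal sketch against an assumed Solr 4.x node on localhost, with
hypothetical collection and shard counts:

# Sketch: ask SolrCloud for two copies of every shard, so a failed node
# (or disk) is covered by a replica rather than by RAID mirroring.
from urllib.parse import urlencode
from urllib.request import urlopen

params = urlencode({
    "action": "CREATE",
    "name": "mycollection",     # hypothetical collection name
    "numShards": 2,
    "replicationFactor": 2,     # two replicas of each shard, on different nodes
    "maxShardsPerNode": 2,
})
url = "http://localhost:8983/solr/admin/collections?" + params
print(urlopen(url).read().decode())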

Aside from cost, the main reason that I have not seriously investigated
SSD drives is because I have not come across a solution for any level of
RAID (even RAID1) with SSDs that exposes TRIM to the operating system.
Without reliable TRIM support, an SSD solution is not viable for a
long-term setup.
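
As an aside, on Linux you can at least check whether a given block device
(including an md or hardware RAID device) exposes discard/TRIM by reading
the queue attributes in sysfs; a small sketch, assuming the standard
/sys/block layout:

import os

# Report whether each block device advertises discard (TRIM) support.
# A device whose discard_max_bytes is 0 cannot pass TRIM through, which
# is the situation described above for most RAID layers on top of SSDs.
def discard_supported(dev):
    path = "/sys/block/{}/queue/discard_max_bytes".format(dev)
    try:
        with open(path) as f:
            return int(f.read().strip()) > 0
    except (OSError, ValueError):
        return None   # attribute missing or unreadable

for dev in sorted(os.listdir("/sys/block")):
    status = discard_supported(dev)
    print(dev, "TRIM exposed" if status else "no TRIM (or unknown)")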

Thanks,
Shawn



Re: Pros and cons of using RAID or different RAIDS?

2013-04-20 Thread Raymond Wiker
On Apr 20, 2013, at 05:03, Otis Gospodnetic otis.gospodne...@gmail.com wrote:
 Yeah, but as far as I know, there is nothing Solr-specific about that.
 
 See http://www.acnc.com/raid
 

There's a hardware-specific dimension to this, too: for my company's enterprise 
search solution, we had to replace our initial RAID setup (RAID 5, I think, 
using some sort of integrated RAID controller) because the performance was 
much lower than expected (and required). We switched to a SAN connection, and 
that has given us the performance we needed.



RE: Pros and cons of using RAID or different RAIDS?

2013-04-20 Thread Toke Eskildsen
Furkan KAMACI [furkankam...@gmail.com]:
 Is there any documentation that explains pros and cons of using RAID or
 different RAIDS?

There's plenty for RAID in general, but I do not know of any in-depth 
Solr-specific guides.

For index updates, you want high bulk read- and write-speed. That makes the 
striped versions, such as RAID 5 & 6, poor choices for a heavily updated index.

For searching you want low latency and high throughput for small random access 
reads. All the common RAIDs give you higher throughput for those.

- Toke Eskildsen

Re: Pros and cons of using RAID or different RAIDS?

2013-04-20 Thread Shawn Heisey
On 4/20/2013 7:36 AM, Toke Eskildsen wrote:
 Furkan KAMACI [furkankam...@gmail.com]:
 Is there any documentation that explains pros and cons of using RAID or
 different RAIDS?
 
 There's plenty for RAID in general, but I do not know of any in-depth 
 Solr-specific guides.
 
 For index updates, you want high bulk read- and write-speed. That makes the 
 striped versions, such as RAID 5 & 6, poor choices for a heavily updated 
 index.
 
 For searching you want low latency and high throughput for small random 
 access reads. All the common RAIDs give you higher throughput for those.

The only RAID level I'm aware of that satisfies speed requirements for
both indexing and queries is RAID10, striping across mirror sets.  The
speed goes up with each pair of disks you add.  The only problem with
RAID10 is that you lose half of your raw disk space, just like with
RAID1.  This is the RAID level that I use for my Solr servers.  I have
six 1TB SATA drives, giving me a usable volume of 3TB.  I notice a
significant disk speed increase compared to a server with single or
mirrored disks.  It is faster on both random and contiguous reads.

RAID 5 and 6 (striping with parity) don't lose as much disk space; one
or two disks depending on which one you choose.  Read speed is very good
with these levels, but unfortunately there is a penalty for writes due
to the parity stripes, and that penalty can be quite severe.  If you
have a caching RAID controller, the write penalty is mitigated for
writes that fit in the cache (usually up to 1GB), but once you start
writing continuously, the penalty comes back.
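
To put rough numbers on the parity penalty, here is a back-of-the-envelope
sketch using the standard per-write IO counts and an assumed per-disk IOPS
figure (not measurements from any particular controller):

# Rough random-write IOPS and usable capacity per RAID level, assuming
# N identical disks that each sustain DISK_IOPS small random IOs/sec.
DISK_IOPS = 150   # assumed figure for a 7200 rpm SATA drive
N = 6             # e.g. a six-drive array of 1 TB disks

def write_iops(penalty):
    # penalty = physical IOs caused by one logical small random write
    return N * DISK_IOPS / penalty

print("RAID10 random writes/sec:", write_iops(2))  # each write hits both mirrors
print("RAID5  random writes/sec:", write_iops(4))  # read data+parity, write data+parity
print("RAID6  random writes/sec:", write_iops(6))  # same, but with two parity blocks

print("Usable TB -> RAID10:", N // 2, " RAID5:", N - 1, " RAID6:", N - 2)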

In the event of a disk failure, all RAID levels will have lower
performance during rebuild.  RAID10 will have no performance impact
before you replace the disk, and will have a mild and short-lived
performance impact while the rebuild is happening.  RAID5/6 has a major
performance impact as soon as a disk fails, and an even higher
performance impact during the rebuild, which can take a very long time.
 Rebuilding a failed disk on a RAID6 volume that has 23 1TB disks is a
process that takes about 24 hours, and I can say that from personal
experience.

Thanks,
Shawn



Pros and cons of using RAID or different RAIDS?

2013-04-19 Thread Furkan KAMACI
Is there any documentation that explains pros and cons of using RAID or
different RAIDS?


Re: Pros and cons of using RAID or different RAIDS?

2013-04-19 Thread Otis Gospodnetic
Yeah, but as far as I know, there is nothing Solr-specific about that.

See http://www.acnc.com/raid

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Fri, Apr 19, 2013 at 11:19 AM, Furkan KAMACI furkankam...@gmail.com wrote:
 Is there any documentation that explains pros and cons of using RAID or
 different RAIDS?