Re: weird behavior with RAID 0 on EC2

2013-03-31 Thread aaron morton
> Ok, if you're going to look into it, please keep me/us posted.

It's not on my radar.

Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 2:43 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:
> Ok, if you're

Re: weird behavior with RAID 0 on EC2

2013-03-31 Thread Alexis Lê-Quôc
Alain,

Can you post your mdadm --detail /dev/md0 output here, as well as your iostat -x -d output when that happens? A bad ephemeral drive on EC2 is not unheard of.

Alexis | @alq | http://datadog.com

P.S. Also, disk utilization is not a reliable metric; iostat's await and svctm are more useful imho.
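
A minimal sketch of the diagnostics being asked for here, run on the affected node (the 5-second interval and the idea of comparing the four ephemeral devices against each other are assumptions, not something from the thread):

    mdadm --detail /dev/md0   # array state, and whether any member is failed or degraded
    iostat -x -d 5            # extended per-device stats every 5 seconds; compare await
                              # and svctm across xvdb..xvde instead of relying on %util

If one device shows await/svctm far above its siblings while the others look normal, that is consistent with a single bad ephemeral disk rather than a Cassandra-level problem.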

Re: weird behavior with RAID 0 on EC2

2013-03-31 Thread Rudolf van der Leeden
I've seen the same behaviour (a SLOW ephemeral disk) a few times. You can't do anything with a single slow disk except stop using it. Our solution was always: replace the m1.xlarge instance asap, and everything is good.

-Rudolf.

On 31.03.2013, at 18:58, Alexis Lê-Quôc wrote:
> Alain, Can you
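
A minimal sketch of one way to do that replacement when the slow node is still reachable (the decommission-then-bootstrap approach is an assumption here, not something Rudolf describes, and the exact steps depend on your Cassandra version):

    # 1. On the affected node, hand its ranges back to the rest of the ring.
    nodetool decommission

    # 2. Launch a fresh m1.xlarge with the same Cassandra version and configuration,
    #    and let it bootstrap into the cluster.

    # 3. From any live node, confirm the ring looks healthy again.
    nodetool ring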

Re: weird behavior with RAID 0 on EC2

2013-03-28 Thread Alain RODRIGUEZ
Ok, if you're going to look into it, please keep me/us posted.

It happened twice for me on the same day, within a few hours, on the same node, and it only happened to 1 node out of 12, making that node almost unreachable.

2013/3/28 aaron morton aa...@thelastpickle.com
> I noticed this on an m1.xlarge

Re: weird behavior with RAID 0 on EC2

2013-03-27 Thread aaron morton
I noticed this on an m1.xlarge (Cassandra 1.1.10) instance today as well: 1 or 2 disks in a RAID 0 running at 85 to 100%, the others at 35 to 50-ish. Have not looked into it.

Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton

weird behavior with RAID 0 on EC2

2013-03-26 Thread Alain RODRIGUEZ
We use C* on m1.xlarge AWS EC2 servers, with 4 disks (xvdb, xvdc, xvdd, xvde) forming a logical RAID 0 (md0). I usually see their utilization increase in the same way. This morning there was a normal minor compaction, followed by dropped messages on one node (out of 12). Looking closely at this node I saw
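
For reference, a minimal sketch of how a 4-disk ephemeral RAID 0 like this is typically assembled on an m1.xlarge (the device names come from the message above; the filesystem and mount point are assumptions):

    mdadm --create /dev/md0 --level=0 --raid-devices=4 \
          /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde
    mkfs.ext4 /dev/md0                 # or xfs, depending on your setup
    mount /dev/md0 /var/lib/cassandra  # assuming this is the Cassandra data directory

With RAID 0, I/O is striped evenly across the four members, which is why all four devices normally show the same utilization pattern and why a single member falling behind stands out.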