Sounds like you're simply throwing more seq scans at it via m/r than
your disk can handle. iostat could confirm that disk is the
bottleneck. But real monitoring would be better.
http://www.datastax.com/products/opscenter
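To make the iostat suggestion concrete: a minimal sketch, assuming sysstat's iostat is installed on the nodes. The device line below is fabricated for illustration; in `iostat -x` output the last column is %util, and values pinned near 100 while the job runs mean the disks are the bottleneck.

```shell
# Watch extended device stats while the m/r job is running (5s interval, 3 samples):
#   iostat -x 5 3
# Parsing a fabricated sample device line; the last field is %util:
sample="sda 0.00 12.40 310.20 45.60 39705.60 463.20 112.90 8.40 23.60 2.80 99.70"
util=$(echo "$sample" | awk '{print $NF}')
# Treat sustained %util above ~90 as disk saturation:
awk -v u="$util" 'BEGIN { exit !(u > 90) }' && echo "disk looks saturated (%util=$util)"
```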
On Thu, Dec 8, 2011 at 1:02 AM, Patrik Modesto patrik.mode...@gmail.com wrote:
Where do you see the timeout exceptions? in the mappers?
How many mappers reducers slots are you using? What does your disk setup
look like? do you have HDFS on same disk as cassandra data dir?
-Jake
On Tue, Dec 6, 2011 at 4:50 AM, Patrik Modesto patrik.mode...@gmail.com wrote:
If you're getting lots of timeout exceptions with mapreduce, you might take a
look at http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
We saw that and tweaked a variety of things - all of which are listed there.
Ultimately, we also boosted hadoop's tolerance for them as well.
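For readers hitting the same thing, the wiki tweaks usually come down to the Cassandra-side RPC timeout (the values below are illustrative examples, not recommendations):

```yaml
# cassandra.yaml (Cassandra 0.8.x) -- raise the server-side RPC timeout so
# wide range scans issued by Hadoop mappers are less likely to time out.
# 10000 is the 0.8 default; 30000 here is only an example value.
rpc_timeout_in_ms: 30000
```

The other common knob is the job-side `cassandra.range.batch.size` property (settable via `ConfigHelper.setRangeBatchSize`), lowering it from the default of 4096 so each range request returns faster.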
Thank you Jeremy, I've already changed the max.*.failures to 20; it helps
jobs finish but doesn't solve the source of the timeouts. I'll try the
other tips.
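For reference, the failure-tolerance settings being referred to are most likely these Hadoop 0.20 job properties (an assumption on my part, since the exact names aren't spelled out above):

```xml
<!-- mapred-site.xml (Hadoop 0.20.x), illustrative values: let a job
     succeed even if up to 20% of its map/reduce tasks fail. -->
<property>
  <name>mapred.max.map.failures.percent</name>
  <value>20</value>
</property>
<property>
  <name>mapred.max.reduce.failures.percent</name>
  <value>20</value>
</property>
```

As noted, this only masks the timeouts by tolerating failed tasks; it doesn't address their cause.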
Regards,
Patrik
On Wed, Dec 7, 2011 at 17:29, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:
Hi Jake,
I see the timeouts in the mappers as well as in the random-access backend
daemons (for web services). There are now 10 mappers and 2 reducers on
each node. There is one big 4-disk RAID 10 array on each node, on which
Cassandra lives together with HDFS. We store just a few GB of files
on HDFS,
Hi,
I'm quite desperate about Cassandra's performance in our production
cluster. We have 8 real-HW nodes, 32core CPU, 32GB memory, 4 disks in
raid10, cassandra 0.8.8, RF=3 and Hadoop.
We have four keyspaces; one is the large one, it has 2 CFs: one is a kind
of index, the other holds the data. There are