[jira] [Comment Edited] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-24 Thread Sean Bridges (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183783#comment-14183783
 ] 

Sean Bridges edited comment on CASSANDRA-8177 at 10/25/14 12:18 AM:


{quote}
My guess for sequential repair generating lots of IO is that, when reading from 
snapshot, it is hitting disk for each snapshot SSTable to read its bloom 
filters, index files etc
{quote}

When you snapshot, you are hardlinking to the original SSTables. The snapshot 
and the originals are the same files, so the OS cache shouldn't be the difference.
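
To illustrate the hardlink point, here is a minimal Python sketch (not Cassandra code). It assumes a POSIX-style filesystem, and the file names are made up for the example. It creates a file, hard links it the way a snapshot would, and checks that both names resolve to the same inode, which is why reads through either name should hit the same cached pages.

```python
import os
import tempfile


def hardlink_shares_inode(directory):
    """Create a file plus a hard link to it and report whether they are one file."""
    # Hypothetical file names, chosen only for this illustration.
    original = os.path.join(directory, "ks-cf-jb-1-Data.db")
    snapshot = os.path.join(directory, "snapshot-ks-cf-jb-1-Data.db")

    with open(original, "wb") as f:
        f.write(b"x" * 4096)

    # A hard link adds a second directory entry for the same inode; no data is copied.
    os.link(original, snapshot)

    a, b = os.stat(original), os.stat(snapshot)
    return {
        "same_file": os.path.samefile(original, snapshot),
        "same_inode": (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino),
        "link_count": a.st_nlink,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(hardlink_shares_inode(d))
```

If both names report the same device and inode, the kernel treats them as a single file, so reading the snapshot copy would not by itself require re-reading data that is already cached for the original.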


was (Author: sgbridges):
{quote}
My guess for sequential repair generating lots of IO is that, when reading from 
snapshot, it is hitting disk for each snapshot SSTable to read its bloom 
filters, index files etc
{quote}

When you snapshot you are hardlinking the old and original sstables, they are 
the same file, so the os cache shouldn't be the difference

 sequential repair is much more expensive than parallel repair
 -------------------------------------------------------------

         Key: CASSANDRA-8177
         URL: https://issues.apache.org/jira/browse/CASSANDRA-8177
     Project: Cassandra
  Issue Type: Bug
    Reporter: Sean Bridges
    Assignee: Yuki Morishita
 Attachments: cassc-week.png, iostats.png


 This is with Cassandra 2.0.10.
 The attached graph shows I/O read/write throughput (as measured with iostat) 
 during repairs.
 The large hump on the left is a sequential repair of one node. The two much 
 smaller peaks on the right are parallel repairs.
 This is a 3-node cluster using vnodes (I know vnodes on small clusters aren't 
 recommended). Cassandra reports a load of 40 GB.
 We noticed a similar problem on a larger cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8177) sequential repair is much more expensive than parallel repair

2014-10-24 Thread Yuki Morishita (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183824#comment-14183824
 ] 

Yuki Morishita edited comment on CASSANDRA-8177 at 10/25/14 12:59 AM:

One more thing that is likely related: when snapshotting, *all* SSTables are 
snapshotted and opened, even if only part of them is validated.
(Fixed in CASSANDRA-7024 for 2.1.)
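
As a rough illustration of why opening every SSTable is wasteful, the sketch below selects only the SSTables whose token span overlaps the range being validated. This is not the CASSANDRA-7024 patch. The `SSTable` class, its fields, and the integer tokens are assumptions made for the example, and ring wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an SSTable: a name plus its first and last token."""
    name: str
    first_token: int
    last_token: int


def overlapping(sstables, range_start, range_end):
    """Return SSTables whose [first_token, last_token] span intersects (range_start, range_end].

    The range is left-exclusive and right-inclusive; wraparound ranges are not handled.
    """
    return [
        s for s in sstables
        if s.last_token > range_start and s.first_token <= range_end
    ]


if __name__ == "__main__":
    tables = [
        SSTable("a", 0, 100),
        SSTable("b", 150, 300),
        SSTable("c", 400, 500),
    ]
    # Only SSTables intersecting the validated range would need to be snapshotted and opened.
    print([s.name for s in overlapping(tables, 100, 200)])
```

With a filter like this, only the matching SSTables would be snapshotted and opened for a given range. With vnodes a single SSTable may span much of the ring, so how much this saves in practice depends on the data layout.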


was (Author: yukim):
One more thing that is likely related is when snapshotting, *all* SSTables are 
snapshot and opened even if the part of them are validated.
(Fixed in CASSANDRA-7024)
