[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-30 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13808820#comment-13808820
 ] 

Marcus Eriksson commented on CASSANDRA-6142:


ok, looks good to me

how about 1.2?

 Remove multithreaded compaction
 ---

 Key: CASSANDRA-6142
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6142
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 2.1


 There is at best a very small sweet spot for multithreaded compaction 
 (ParallelCompactionIterable, PCI).  For large rows, we stall the pipeline and 
 fall back to a single LazilyCompactedRow (LCR) pass.  For small rows, the 
 coordination overhead outweighs the benefits of parallelization (45s to 
 compact 2x1M stress rows with multithreading enabled, vs 35s with it disabled).



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809198#comment-13809198
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

The 2.0 backport was bad enough, I don't even want to think about 1.2.  They're 
all pretty rare corner cases, so I'm fine with telling people to upgrade to 2.0 
if they care.



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-30 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13809201#comment-13809201
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

Split that out to CASSANDRA-6274 to keep CHANGES clean when I tag it 2.0.3.



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-24 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13804861#comment-13804861
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

Posted 2.0 fixes to https://github.com/jbellis/cassandra/commits/6142-2.0.  
Note that b959e8ff3bccd3437de70d33da91307ab9c12a19 is a different, 
less-invasive approach than the one taken for trunk.



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-22 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801548#comment-13801548
 ] 

Marcus Eriksson commented on CASSANDRA-6142:


looks good to me

Regarding saveOutOfOrderRows, I guess a solution would be to flush a new 
sstable from the TreeSet when its size exceeds some limit? Unsure how common 
this is, though.
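
The flush-on-limit idea above could look roughly like the following. This is a 
minimal sketch and not code from the patch: the class name BoundedSortedBuffer, 
the String keys, the row-count limit, and the Consumer that stands in for 
"write a new sstable" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.Consumer;

// Hypothetical sketch: collects out-of-order keys in sorted order and hands
// them off in batches once the buffer grows past a limit.
final class BoundedSortedBuffer {
    private final TreeSet<String> buffer = new TreeSet<>();
    private final int maxSize;                    // assumed limit, counted in rows
    private final Consumer<List<String>> flusher; // stand-in for writing a new sstable

    BoundedSortedBuffer(int maxSize, Consumer<List<String>> flusher) {
        this.maxSize = maxSize;
        this.flusher = flusher;
    }

    // Adds a key and flushes as soon as the buffer size exceeds the limit.
    void add(String key) {
        buffer.add(key);
        if (buffer.size() > maxSize)
            flush();
    }

    // Hands the buffered keys to the flusher in sorted order, then empties the buffer.
    void flush() {
        if (buffer.isEmpty())
            return;
        flusher.accept(new ArrayList<>(buffer)); // TreeSet iterates in sorted order
        buffer.clear();
    }
}
```

Each flushed batch is sorted on its own, but separate batches may overlap in 
key range, so this trades bounded memory for more (smaller) output files.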



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13801998#comment-13801998
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

I'm guessing not super common because the existing code will just break if it 
hits that case.  (An LCR object will throw errors if you try to use it after 
advancing the underlying stream to another row.)

I guess the next step is probably for me to pull the fixes out for application 
to 2.0.



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-19 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1370#comment-1370
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

Pushed fixes for these to the same branch.  (They are indeed existing bugs in 
LCR.)



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-17 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13798480#comment-13798480
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

Damn, not sure how I missed that.  Suspect another existing bug.  Will 
investigate.



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-16 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13796515#comment-13796515
 ] 

Marcus Eriksson commented on CASSANDRA-6142:


CompactionsPurgeTest fails:
{noformat}
[junit] Testsuite: org.apache.cassandra.db.compaction.CompactionsPurgeTest
[junit] Tests run: 6, Failures: 2, Errors: 0, Skipped: 0, Time elapsed: 9.365 sec
[junit] 
[junit] Testcase: testMinTimestampPurge(org.apache.cassandra.db.compaction.CompactionsPurgeTest): FAILED
[junit] expected:2 but was:1
[junit] junit.framework.AssertionFailedError: expected:2 but was:1
[junit] at org.apache.cassandra.db.compaction.CompactionsPurgeTest.testMinTimestampPurge(CompactionsPurgeTest.java:185)
[junit] 
[junit] 
[junit] Testcase: testCompactionPurgeTombstonedRow(org.apache.cassandra.db.compaction.CompactionsPurgeTest): FAILED
[junit] expected:10 but was:5
[junit] junit.framework.AssertionFailedError: expected:10 but was:5
[junit] at org.apache.cassandra.db.compaction.CompactionsPurgeTest.testCompactionPurgeTombstonedRow(CompactionsPurgeTest.java:313)
[junit] 
[junit] 
[junit] Test org.apache.cassandra.db.compaction.CompactionsPurgeTest FAILED
{noformat}
and a few nits:
in LazilyCompactedRow:
* make reducer and merger final
* remove comment about reducer being null on row 123




[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13785287#comment-13785287
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

I tried parallelizing at the OnDiskAtomIterator level instead 
(thread-per-iterator-per-partition, buffering into a queue), and for small 
partitions the performance is ridiculously bad, easily 100x worse than 
single-threaded mode.

Any better ideas [~krummas] [~yukim] [~iamaleksey] [~slebresne]?  If not I will 
post a patch to rip out PCI.
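
For readers unfamiliar with the pattern, the thread-per-iterator approach 
described above has roughly the following shape. This is a generic sketch, not 
the code that was benchmarked: QueuedIteratorSketch, drainViaQueue, and the END 
sentinel are hypothetical names, and elements are assumed to be non-null 
because ArrayBlockingQueue rejects nulls.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: one producer thread per source iterator, with every
// element handed to the consumer through a bounded queue.
final class QueuedIteratorSketch {
    private static final Object END = new Object(); // sentinel: source is exhausted

    static <T> List<T> drainViaQueue(Iterator<T> source, int capacity) {
        BlockingQueue<Object> queue = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                while (source.hasNext())
                    queue.put(source.next()); // blocks while the queue is full
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.setDaemon(true);
        producer.start();

        List<T> out = new ArrayList<>();
        try {
            // Every element costs one put/take hand-off between two threads.
            for (Object next = queue.take(); next != END; next = queue.take()) {
                @SuppressWarnings("unchecked")
                T item = (T) next;
                out.add(item);
            }
            producer.join();
        } catch (InterruptedException e) {
            producer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return out;
    }
}
```

When a partition holds only a handful of elements, starting the thread and 
synchronizing on the queue can cost far more than the iteration itself, which 
is consistent with the slowdown reported above for small partitions.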



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-03 Thread Marcus Eriksson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13785321#comment-13785321
 ] 

Marcus Eriksson commented on CASSANDRA-6142:


I tried improving it a while back as well and got basically the same results, 
so yes, we should remove it.

I concluded that the best way to improve the speed was to do more compactions 
in parallel (CASSANDRA-5936 - I should finish that up).
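
As a generic illustration of that alternative (several independent jobs running 
at once, each one single-threaded internally), here is a minimal sketch. It is 
not the CASSANDRA-5936 patch: ParallelTasksSketch and runAll are hypothetical 
names, and plain Callable tasks stand in for compactions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: parallelism across independent tasks rather than
// inside a single task.
final class ParallelTasksSketch {
    // Runs all tasks on a fixed-size pool and returns results in task order.
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed.
            for (Future<T> future : pool.invokeAll(tasks))
                results.add(future.get());
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the tasks share no state, the only coordination is at submission and 
completion, not per row.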



[jira] [Commented] (CASSANDRA-6142) Remove multithreaded compaction

2013-10-03 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13785612#comment-13785612
 ] 

Jonathan Ellis commented on CASSANDRA-6142:
---

Pushed removal to https://github.com/jbellis/cassandra/commits/6142.

Also removes PrecompactedRow, which is no longer necessary, and fixes a couple 
of existing bugs in LCR and Scrub that this revealed (last two commits).
