[jira] [Commented] (CASSANDRA-2252) off-heap memtables

2011-07-14 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065135#comment-13065135
 ] 

Stu Hood commented on CASSANDRA-2252:
-

bq. And if they are so large they do not, then your rate of key allocation is 
glacial and again it shouldn't matter.
Compaction builds up an IndexSummary slowly enough that I theorized it might be 
causing fragmentation... didn't get a chance to prove it though.

bq. There is no logical unit of slabbing for key cache, we shouldn't be doing 
that at all.
Agreed. We actually ended up disabling the key cache and saw a nice boost in 
time-to-promotion-failure, but I would love to find an actual solution.

bq. Once you promoted a slab in old gen, it stays there, instead of being GC'd 
and replaced with a slab in new gen again.
The bookkeeping might be worth it, yes.
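The slab approach under discussion can be sketched as a bump-pointer allocator over one large byte[]. This is an illustrative sketch, not Cassandra's actual SlabAllocator API; the class and method names are mine:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// One slab: a single large byte[] carved up by advancing an offset, so many
// small values share one GC-visible object. Once the slab is promoted to the
// old generation, every value later allocated from it effectively lives in
// old gen too, which is the bookkeeping trade-off discussed above.
final class Slab
{
    private final byte[] region;
    private final AtomicInteger nextOffset = new AtomicInteger(0);

    Slab(int sizeInBytes) { region = new byte[sizeInBytes]; }

    /** Returns a view of {@code size} bytes, or null when the slab is full. */
    ByteBuffer allocate(int size)
    {
        while (true)
        {
            int offset = nextOffset.get();
            if (offset + size > region.length)
                return null; // caller must swap in a fresh slab
            if (nextOffset.compareAndSet(offset, offset + size))
                return ByteBuffer.wrap(region, offset, size).slice();
        }
    }
}
```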


 off-heap memtables
 --

 Key: CASSANDRA-2252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
 Fix For: 1.0

 Attachments: 0001-add-MemtableAllocator.txt, 
 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, merged-2252.tgz

   Original Estimate: 0.4h
  Remaining Estimate: 0.4h

 The memtable design practically actively fights Java's GC design.  Todd 
 Lipcon gave a good explanation over on HBASE-3455.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2868) Native Memory Leak

2011-07-14 Thread Chris Burroughs (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Burroughs updated CASSANDRA-2868:
---

Attachment: low-load-36-hours-initial-results.png

Initial results.  Graph of VmRSS from /proc/PID/status at 10 second intervals 
from my last comment to now.  Box on the left has GCInspector disabled.  These 
are on two test boxes under trivial load so this is all still *very* tentative. 
 Will start testing under real load by early next week.
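For reference, sampling VmRSS from /proc/PID/status as described can be done with something along these lines. This is a sketch (Linux-only, field format per proc(5)), not the script actually used for the graph:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class RssSampler
{
    /** Parse the VmRSS field (resident set size in kB) from /proc/self/status.
     *  To watch another process, substitute its PID for "self". */
    static long vmRssKb() throws IOException
    {
        for (String line : Files.readAllLines(Paths.get("/proc/self/status")))
            if (line.startsWith("VmRSS:"))
                return Long.parseLong(line.replaceAll("\\D", ""));
        return -1; // field missing, or not running on Linux
    }

    public static void main(String[] args) throws Exception
    {
        // The run described above sampled every 10 seconds; shortened here.
        for (int i = 0; i < 3; i++)
        {
            System.out.println(System.currentTimeMillis() + " VmRSS(kB)=" + vmRssKb());
            Thread.sleep(1000);
        }
    }
}
```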

 Native Memory Leak
 --

 Key: CASSANDRA-2868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6
Reporter: Daniel Doubleday
Priority: Minor
 Attachments: 2868-v1.txt, low-load-36-hours-initial-results.png


 We have memory issues with long-running servers. These have been confirmed by 
 several users on the user list; that's why I'm reporting it.
 The memory consumption of the Cassandra java process increases steadily until 
 it's killed by the OS because of OOM (with no swap).
 Our server is started with -Xmx3000M and has been running for around 23 days.
 pmap -x shows:
 Total SST: 1961616 (mem mapped data and index files)
 Anon  RSS: 6499640
 Total RSS: 8478376
 This shows that more than 3 GB are 'overallocated'.
 We will use BRAF on one of our less important nodes to check whether it is 
 related to mmap and report back.





[jira] [Commented] (CASSANDRA-2868) Native Memory Leak

2011-07-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065247#comment-13065247
 ] 

Jonathan Ellis commented on CASSANDRA-2868:
---

Promising!

 Native Memory Leak
 --

 Key: CASSANDRA-2868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6
Reporter: Daniel Doubleday
Priority: Minor
 Attachments: 2868-v1.txt, low-load-36-hours-initial-results.png







[jira] [Updated] (CASSANDRA-2753) Capture the max client timestamp for an SSTable

2011-07-14 Thread Daniel Doubleday (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daniel Doubleday updated CASSANDRA-2753:


Attachment: SSTableWriterTest.patch

Dunno if SSTableWriterTest is the right place but the added test would break.

 Capture the max client timestamp for an SSTable
 ---

 Key: CASSANDRA-2753
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2753
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
 Fix For: 1.0

 Attachments: 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V2.patch, 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V3.patch, 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to.patch, 
 0003-capture-max-timestamp-for-sstable-and-introduced-SST.patch, 
 SSTableWriterTest.patch, supercolumn.patch








svn commit: r1146732 - /cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java

2011-07-14 Thread jbellis
Author: jbellis
Date: Thu Jul 14 14:32:16 2011
New Revision: 1146732

URL: http://svn.apache.org/viewvc?rev=1146732&view=rev
Log:
add test for including supercolumn tombstone time in max timestamp computation
patch by Daniel Doubleday; reviewed by jbellis for CASSANDRA-2753

Modified:

cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java

Modified: 
cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java?rev=1146732&r1=1146731&r2=1146732&view=diff
==
--- 
cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java
 (original)
+++ 
cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java
 Thu Jul 14 14:32:16 2011
@@ -21,16 +21,15 @@ package org.apache.cassandra.io.sstable;
  */
 
 
+import static org.apache.cassandra.Util.addMutation;
 import static org.junit.Assert.*;
 
 import java.io.IOException;
 import java.nio.ByteBuffer;
-import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
+import java.util.*;
 import java.util.concurrent.ExecutionException;
 
+import org.apache.cassandra.Util;
 import org.junit.Test;
 
 import org.apache.cassandra.CleanupHelper;
@@ -137,4 +136,38 @@ public class SSTableWriterTest extends C
 // ensure max timestamp is captured during rebuild
 assert sstr.getMaxTimestamp() == 4321L;
 }
+
+    @Test
+    public void testSuperColumnMaxTimestamp() throws IOException, ExecutionException, InterruptedException
+    {
+        ColumnFamilyStore store = Table.open("Keyspace1").getColumnFamilyStore("Super1");
+        RowMutation rm;
+        DecoratedKey dk = Util.dk("key1");
+
+        // add data
+        rm = new RowMutation("Keyspace1", dk.key);
+        addMutation(rm, "Super1", "SC1", 1, "val1", 0);
+        rm.apply();
+        store.forceBlockingFlush();
+
+        validateMinTimeStamp(store.getSSTables(), 0);
+
+        // remove
+        rm = new RowMutation("Keyspace1", dk.key);
+        rm.delete(new QueryPath("Super1", ByteBufferUtil.bytes("SC1")), 1);
+        rm.apply();
+        store.forceBlockingFlush();
+
+        validateMinTimeStamp(store.getSSTables(), 0);
+
+        CompactionManager.instance.performMaximal(store);
+        assertEquals(1, store.getSSTables().size());
+        validateMinTimeStamp(store.getSSTables(), 1);
+    }
+
+    private void validateMinTimeStamp(Collection<SSTableReader> ssTables, int timestamp)
+    {
+        for (SSTableReader ssTable : ssTables)
+            assertTrue(ssTable.getMaxTimestamp() >= timestamp);
+    }
 }




[jira] [Commented] (CASSANDRA-2753) Capture the max client timestamp for an SSTable

2011-07-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065292#comment-13065292
 ] 

Jonathan Ellis commented on CASSANDRA-2753:
---

lgtm, thanks!

 Capture the max client timestamp for an SSTable
 ---

 Key: CASSANDRA-2753
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2753
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
 Fix For: 1.0

 Attachments: 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V2.patch, 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V3.patch, 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to.patch, 
 0003-capture-max-timestamp-for-sstable-and-introduced-SST.patch, 
 SSTableWriterTest.patch, supercolumn.patch








[jira] [Updated] (CASSANDRA-47) SSTable compression

2011-07-14 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-47:
-

Attachment: CASSANDRA-47-v2.patch

v2 eliminates the need for sparse files. An index section holding the chunk 
sizes is added at the end of the file, so the header size increased to 18 
bytes: 2 control bytes, 8 bytes for the uncompressed size, and 8 bytes for the 
offset of the index section. This approach uses no additional space except the 
4 bytes required to store the index section's length (at the header of that 
section); the chunk sizes are important information, so I don't count the 
space needed to store them as overhead.

Also tried importing/exporting large files with sstable2json and json2sstable 
to make sure that it works.
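The 18-byte header described above could be laid out roughly like this. The field names, the ordering of the two control bytes, and the big-endian encoding are my assumptions, not details taken from the patch:

```java
import java.nio.ByteBuffer;

final class CompressedFileHeader
{
    static final int SIZE = 18; // 2 control bytes + 8 (uncompressed size) + 8 (index offset)

    static ByteBuffer write(byte control1, byte control2, long uncompressedSize, long indexSectionOffset)
    {
        ByteBuffer header = ByteBuffer.allocate(SIZE); // big-endian by default
        header.put(control1).put(control2);
        header.putLong(uncompressedSize);
        header.putLong(indexSectionOffset);
        header.flip(); // ready for writing out to the data file
        return header;
    }
}
```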

 SSTable compression
 ---

 Key: CASSANDRA-47
 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
  Labels: compression
 Fix For: 1.0

 Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, 
 snappy-java-1.0.3-rc4.jar


 We should be able to do SSTable compression which would trade CPU for I/O 
 (almost always a good trade).





[jira] [Commented] (CASSANDRA-47) SSTable compression

2011-07-14 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065300#comment-13065300
 ] 

Pavel Yaskevich commented on CASSANDRA-47:
--

Forgot to mention that this is rebased with latest trunk (latest commit 
4629648899e637e8e03938935f126689cce5ad48)

 SSTable compression
 ---

 Key: CASSANDRA-47
 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
  Labels: compression
 Fix For: 1.0

 Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, 
 snappy-java-1.0.3-rc4.jar







[jira] [Commented] (CASSANDRA-47) SSTable compression

2011-07-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065311#comment-13065311
 ] 

Jonathan Ellis commented on CASSANDRA-47:
-

bq. index section added at the end of the file to hold chunk sizes

We used to do this with the index entries, but keeping that in memory until 
you're done can cause a lot of memory pressure.  I like the suggestion of 
moving the index entry to (key, compressed-chunk-offset, 
uncompressed-offset-within-chunk) better.

 SSTable compression
 ---

 Key: CASSANDRA-47
 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
  Labels: compression
 Fix For: 1.0

 Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, 
 snappy-java-1.0.3-rc4.jar







[jira] [Commented] (CASSANDRA-2753) Capture the max client timestamp for an SSTable

2011-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065316#comment-13065316
 ] 

Hudson commented on CASSANDRA-2753:
---

Integrated in Cassandra #958 (See 
[https://builds.apache.org/job/Cassandra/958/])
add test for including supercolumn tombstone time in max timestamp 
computation
patch by Daniel Doubleday; reviewed by jbellis for CASSANDRA-2753

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146732
Files : 
* 
/cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java


 Capture the max client timestamp for an SSTable
 ---

 Key: CASSANDRA-2753
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2753
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Alan Liang
Assignee: Alan Liang
Priority: Minor
 Fix For: 1.0

 Attachments: 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V2.patch, 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V3.patch, 
 0001-capture-max-timestamp-and-created-SSTableMetadata-to.patch, 
 0003-capture-max-timestamp-for-sstable-and-introduced-SST.patch, 
 SSTableWriterTest.patch, supercolumn.patch








[jira] [Commented] (CASSANDRA-47) SSTable compression

2011-07-14 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065321#comment-13065321
 ] 

Pavel Yaskevich commented on CASSANDRA-47:
--

I did that for the flexibility of using CompressedDataFile without relying on 
the index file. That index is read only once, upon CompressedSegmentedFile 
completion, and then just gets passed to the constructor in CSF.getSegment(), 
so even a very big file of 5-7 GB adds only about 1 megabyte of overhead from 
keeping the index in memory. The index also lets us skip reading an additional 
4 bytes of chunk length from the file every time we re-buffer.
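The "about 1 megabyte" figure checks out on a back-of-the-envelope basis, assuming 64 KB compression chunks and one 8-byte size entry per chunk (the chunk size here is my assumption, not a value from the patch):

```java
final class IndexOverhead
{
    /** In-memory cost of the chunk-size index: one 8-byte entry per chunk. */
    static long indexBytes(long fileSize, long chunkSize)
    {
        long entries = (fileSize + chunkSize - 1) / chunkSize; // one entry per chunk
        return entries * 8;
    }

    public static void main(String[] args)
    {
        // A 7 GB data file with 64 KB chunks: ~114k entries, under 1 MB of index.
        System.out.println(indexBytes(7L << 30, 64 << 10));
    }
}
```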

 SSTable compression
 ---

 Key: CASSANDRA-47
 URL: https://issues.apache.org/jira/browse/CASSANDRA-47
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
  Labels: compression
 Fix For: 1.0

 Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, 
 snappy-java-1.0.3-rc4.jar







[Cassandra Wiki] Trivial Update of ThirdPartySupport by Michael Weir

2011-07-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The ThirdPartySupport page has been changed by Michael Weir:
http://wiki.apache.org/cassandra/ThirdPartySupport?action=diff&rev1=14&rev2=15

  Companies providing support for Apache Cassandra are not endorsed by the 
Apache Software Foundation, although some of these companies employ 
[[Committers]] to the Apache project.
  
- {{http://media.acunu.com/library/logo.png}} [[http://www.acunu.com|Acunu]] 
offers the Acunu Data Platform for faster, more consistent performance from 
your Cassandra applications (free to use for up to 2 nodes). Acunu also 
provides Cassandra training, support and professional services.  
+ Companies that employ Apache Cassandra Committers:
  
  {{http://www.datastax.com/sites/all/themes/datastax20110201/logo.png}} 
[[http://datastax.com|Datastax]] DataStax, the commercial leader in Apache 
Cassandra™ offers products and services that make it easy for customers to 
build, deploy and operate elastically scalable and cloud-optimized applications 
and data services. The company has over 90 customers, including leaders such as 
Netflix, Cisco, Rackspace and Constant Contact, and spanning verticals 
including web, financial services, telecommunications, logistics and government.
+ 
+ 
+ Other companies:
+ 
+ {{http://media.acunu.com/library/logo.png}} [[http://www.acunu.com|Acunu]] 
offers the Acunu Data Platform for faster, more consistent performance from 
your Cassandra applications (free to use for up to 2 nodes). Acunu also 
provides Cassandra training, support and professional services.  
  
  {{http://www.impetus.com/sites/impetus.com/impetus/gifs/logo_impetus.png}} 
[[http://www.impetus.com/ |Impetus]] provides expertise in Cassandra, Hbase, 
  


[Cassandra Wiki] Trivial Update of ThirdPartySupport by Michael Weir

2011-07-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The ThirdPartySupport page has been changed by Michael Weir:
http://wiki.apache.org/cassandra/ThirdPartySupport?action=diff&rev1=15&rev2=16

  Companies providing support for Apache Cassandra are not endorsed by the 
Apache Software Foundation, although some of these companies employ 
[[Committers]] to the Apache project.
  
- Companies that employ Apache Cassandra Committers:
+ Companies that employ Apache Cassandra [[Committers]]:
  
  {{http://www.datastax.com/sites/all/themes/datastax20110201/logo.png}} 
[[http://datastax.com|Datastax]] DataStax, the commercial leader in Apache 
Cassandra™ offers products and services that make it easy for customers to 
build, deploy and operate elastically scalable and cloud-optimized applications 
and data services. The company has over 90 customers, including leaders such as 
Netflix, Cisco, Rackspace and Constant Contact, and spanning verticals 
including web, financial services, telecommunications, logistics and government.
  


[jira] [Commented] (CASSANDRA-2888) CQL support for JDBC DatabaseMetaData

2011-07-14 Thread Dave Carlson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065383#comment-13065383
 ] 

Dave Carlson commented on CASSANDRA-2888:
-

1.0.4-SNAPSHOT works for me - I can work with that and I'll close this.  The 
toolset is a large proprietary/commercial J2EE document storage and retrieval 
system.

 CQL support for JDBC DatabaseMetaData
 -

 Key: CASSANDRA-2888
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2888
 Project: Cassandra
  Issue Type: Improvement
  Components: Drivers
Affects Versions: 0.8.1
 Environment: J2SE 1.6.0_22 x64 on Fedora 15
Reporter: Dave Carlson
Assignee: Rick Shaw
Priority: Minor
  Labels: cql, newbie
 Fix For: 0.8.2

   Original Estimate: 96h
  Remaining Estimate: 96h

 In order to increase the drop-in capability of CQL to existing JDBC app 
 bases, CQL must be updated to include at least semi-valid responses to the 
 JDBC metadata portion.
 without enhancement:
  com.largecompany.JDBCManager.getConnection(<vague Cassandra JNDI pointer>)
 Resource has error: java.lang.UnsupportedOperationException: method not 
 supported
 ...
 with enhancement:
  com.largecompany.JDBCManager.getConnection(<vague Cassandra JNDI pointer>)
 org.apache.cassandra.cql.jdbc.CassandraConnection@1915470e





[jira] [Resolved] (CASSANDRA-2888) CQL support for JDBC DatabaseMetaData

2011-07-14 Thread Dave Carlson (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dave Carlson resolved CASSANDRA-2888.
-

   Resolution: Fixed
Fix Version/s: 0.8.2

works in 1.0.4-SNAPSHOT of CQL

 CQL support for JDBC DatabaseMetaData
 -

 Key: CASSANDRA-2888
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2888
 Project: Cassandra
  Issue Type: Improvement
  Components: Drivers
Affects Versions: 0.8.1
 Environment: J2SE 1.6.0_22 x64 on Fedora 15
Reporter: Dave Carlson
Assignee: Rick Shaw
Priority: Minor
  Labels: cql, newbie
 Fix For: 0.8.2

   Original Estimate: 96h
  Remaining Estimate: 96h






[Cassandra Wiki] Trivial Update of ThirdPartySupport by BrandonWilliams

2011-07-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The ThirdPartySupport page has been changed by BrandonWilliams:
http://wiki.apache.org/cassandra/ThirdPartySupport?action=diff&rev1=16&rev2=17

Comment:
Fix transposed content

  
  {{http://media.acunu.com/library/logo.png}} [[http://www.acunu.com|Acunu]] 
offers the Acunu Data Platform for faster, more consistent performance from 
your Cassandra applications (free to use for up to 2 nodes). Acunu also 
provides Cassandra training, support and professional services.  
  
- {{http://www.impetus.com/sites/impetus.com/impetus/gifs/logo_impetus.png}} 
[[http://www.impetus.com/ |Impetus]] provides expertise in Cassandra, Hbase, 
+ {{http://www.impetus.com/sites/impetus.com/impetus/gifs/logo_impetus.png}} 
[[http://www.impetus.com/ |Impetus]] provides expertise in Cassandra, Hbase, 
MongoDB, and Other databases like Riak, Redis, Membase, Tokyocabinet, etc 
[[http://bigdata.impetus.com/# | More info about BigData @Impetus]]
  
  {{http://www.onzra.com/images/Small-Logo.gif}} [[http://www.ONZRA.com|ONZRA]] 
has been around for over 10 years and specializes on enterprise grade 
architecture, development and security consulting services utilizing many large 
scale database technologies such as Cassandra, Oracle, Alegro Graph, and much 
more.
  
- MongoDB, and Other databases like Riak, Redis, Membase, Tokyocabinet, etc 
[[http://bigdata.impetus.com/# | More info about BigData @Impetus]]
+ 
  
  
  


[jira] [Commented] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065404#comment-13065404
 ] 

Brandon Williams commented on CASSANDRA-2843:
-

2843_b.patch fails to apply a chunk to trunk in 
src/java/org/apache/cassandra/db/ColumnFamily, can you rebase?

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_b.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains more than 1000 columns, the run time becomes 
 considerably slow: my test of a row with 3000 columns (standard, regular), 
 each with 8 bytes in name and 40 bytes in value, is about 16ms.
 this is all running in memory; no disk read is involved.
 through debugging we can find that most of this time is spent in:
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 ConcurrentSkipListMap that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization, and 
 it needs to maintain a more complex map structure.
 but if we look at the whole read path, thrift already defines the read output 
 to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury 
 map data structure in the interim and finally convert it to a list. on the 
 synchronization side, since the returned CF is never going to be 
 shared/modified by other threads, we know the access is always 
 single-threaded, so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly writes. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so that getTopLevelColumnFamily no longer 
 always creates the standard ColumnFamily but takes a provided returnCF, whose 
 cost is much cheaper.
 the provided patch is for demonstration now; I will work on it further once 
 we agree on the general direction.
 CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. 
 the main work is to let the FastColumnFamily use an array for internal 
 storage. at first I used binary search to insert new columns in addColumn(), 
 but later I found that even this is not necessary, since all calling 
 scenarios of ColumnFamily.addColumn() have an invariant that the inserted 
 columns come in sorted order (I still have an issue to resolve, descending 
 vs. ascending, but ascending works now). so the current logic is simply to 
 compare the new column against the last column in the array: if the names 
 are not equal, append; if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine; no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about a 
 50% reduction in read time in the 3000-column case.
 thanks
 Yang
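The compare-against-the-last-column strategy described above can be sketched as follows; the names are illustrative, not the attached FastColumnFamily:

```java
import java.util.ArrayList;

// Columns arrive already sorted, so insertion never searches: compare the new
// element against the last one and either reconcile (equal names) or append.
final class SortedAppendList<T extends Comparable<T>>
{
    private final ArrayList<T> columns = new ArrayList<>();

    void add(T column)
    {
        int last = columns.size() - 1;
        if (last >= 0 && columns.get(last).compareTo(column) == 0)
            columns.set(last, column); // equal: reconcile (here, last write wins)
        else
            columns.add(column);       // sorted input: plain append, no tree, no locks
    }

    int size() { return columns.size(); }
}
```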





[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

2011-07-14 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065409#comment-13065409
 ] 

Brandon Williams commented on CASSANDRA-2496:
-

I see two more things to be done with this patch.  First, when re-replicating 
nodes report back to the removal coordinator, if the coordinator has restarted 
it won't understand them, and they will infinitely loop retrying the 
confirmation.  Second, since we're holding dead states, we need to make sure 
that bootstrapping/moving nodes can take over these dead tokens.

 Gossip should handle 'dead' states
 --

 Key: CASSANDRA-2496
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
 Attachments: 0001-Rework-token-removal-process.txt, 
 0002-add-2115-back.txt


 For background, see CASSANDRA-2371





[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths

2011-07-14 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1788:
--

Attachment: 1788-v5.txt

Part of the old sendOneWay (the packbody copy) looks like this:

{code}
DataOutputBuffer buffer = new DataOutputBuffer();
buffer.writeUTF(id);
Message.serializer().serialize(message, buffer, message.getVersion());
data = buffer.getData();
{code}

byte[] data is NOT restricted to just the serialized bytes in the buffer -- it 
will include any unused bytes at the end, as well.

v5 skips garbage bytes like this for backwards compatibility.
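The bug in miniature, using a hypothetical growable buffer (DataOutputBuffer's real API differs): a growable buffer's backing array is usually larger than its logical length, so returning the raw array leaks trailing garbage unless the caller also tracks the length.

```java
import java.util.Arrays;

final class GrowableBuffer
{
    private byte[] data = new byte[16]; // capacity, not logical size
    private int length = 0;

    void write(byte[] src)
    {
        if (length + src.length > data.length)
            data = Arrays.copyOf(data, Math.max(data.length * 2, length + src.length));
        System.arraycopy(src, 0, data, length, src.length);
        length += src.length;
    }

    byte[] rawData() { return data; }                        // includes unused tail bytes
    byte[] trimmed() { return Arrays.copyOf(data, length); } // exactly the written bytes
}
```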

 reduce copies on read, write paths
 --

 Key: CASSANDRA-1788
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 1.0

 Attachments: 0001-setup.txt, 
 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 
 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt

   Original Estimate: 24h
  Remaining Estimate: 24h

 Currently, we do _three_ unnecessary copies (that is, writing to the socket 
 is necessary; any other copies made are overhead) for each message:
 - constructing the Message body byte[] (this is typically a call to a 
 ICompactSerializer[2] serialize method, but sometimes we cheat e.g. in 
 SchemaCheckVerbHandler's reply)
 - which is copied to a buffer containing the entire Message (i.e. including 
 Header) when sendOneWay calls Message.serializer.serialize()
 - which is copied to a newly-allocated ByteBuffer when sendOneWay calls packIt
 - which is what we write to the socket
 For deserialize we perform a similar orgy of copies:
 - IncomingTcpConnection reads the Message length, allocates a byte[], and 
 reads the serialized Message into it
 - ITcpC then calls Message.serializer().deserialize, which allocates a new 
 byte[] for the body and copies that part
 - finally, the verbHandler (determined by the now-deserialized Message 
 header) deserializes the actual object represented by the body
 Most of these are out of scope for 0.7 but I think we can at least elide the 
 last copy on the write path and the first on the read.





[jira] [Commented] (CASSANDRA-2860) Versioning works *too* well

2011-07-14 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065423#comment-13065423
 ] 

Brandon Williams commented on CASSANDRA-2860:
-

+1

 Versioning works *too* well
 ---

 Key: CASSANDRA-2860
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2860
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.1
Reporter: Brandon Williams
Assignee: Jonathan Ellis
 Fix For: 0.8.2

 Attachments: 2860-v2.txt, 2860.txt


 The scenario goes something like this: you upgrade from 0.7 to 0.8, but all 
 the nodes remember that the remote side is 0.7, so they in turn speak 0.7, 
 causing the local node to also think the remote is 0.7, even though both are 
 really 0.8.





[jira] [Created] (CASSANDRA-2899) cli silently fails when classes are quoted

2011-07-14 Thread Brandon Williams (JIRA)
cli silently fails when classes are quoted
--

 Key: CASSANDRA-2899
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2899
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Pavel Yaskevich
Priority: Minor


For example: CREATE COLUMN FAMILY autocomplete_meta WITH comparator = 
'UTF8Type' AND default_validation_class = 'UTF8Type' AND key_validation_class = 
'UTF8Type'

Neither validation class is actually set, but if you remove the quotes 
everything works.





[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_b.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch


 currently if a row contains > 1000 columns, reads become considerably slow 
 (my test of a row with 3000 columns (standard, regular), each with 8 bytes 
 in name and 40 bytes in value, takes about 16ms).
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map 
 data structure in the interim and finally convert it to a list. on the 
 synchronization side, since the returned CF is never going to be 
 shared/modified by other threads, we know access is always single-threaded, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly writes. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily but takes a provided returnCF, which is 
 much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() have an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve, 
 descending or ascending, but ascending works). so the current logic is 
 simply to compare the new column against the last column in the array: if 
 the names are not equal, append; if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang
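The append-or-reconcile logic described above can be sketched as follows. This is a hypothetical simplification (plain Strings instead of IColumn, last-write-wins reconcile, and an invented class name), not the patch itself:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of FastColumnFamily's addColumn(): because callers
// supply columns in ascending name order, each new column is either equal
// to the last one stored (reconcile) or strictly greater (append in O(1)).
// No ConcurrentSkipListMap, no per-insert synchronization.
final class FastColumnList
{
    private final List<String> names = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    void addColumn(String name, String value)
    {
        int last = names.size() - 1;
        if (last >= 0 && names.get(last).equals(name))
        {
            values.set(last, value);   // equal name: reconcile (last write wins here)
        }
        else
        {
            names.add(name);           // greater name: append at the end
            values.add(value);
        }
    }

    int size() { return names.size(); }

    String value(String name)
    {
        int i = names.indexOf(name);
        return i < 0 ? null : values.get(i);
    }
}
```

Since the returned CF is consumed by a single thread and immediately converted to a list for Thrift, the array form also avoids the final map-to-list conversion.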





[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_c.patch

rebased against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch






[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_c.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch






[jira] [Issue Comment Edited] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13065483#comment-13065483
 ] 

Yang Yang edited comment on CASSANDRA-2843 at 7/14/11 7:25 PM:
---



rebased against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)


also fixed a bug in my newly added test; the DeletionInfo class in 
AbstractColumnContainer somehow gives a compile error in eclipse, so I had to 
change it to protected. 

  was (Author: yangyangyyy):
fixed a bug in my newly added test; also the DeletionInfo class in 
AbstractColumnContainer somehow gives compile error in eclipse, had to change 
that into protected.
  
 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch






[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_c.patch

fixed a bug in my newly added test; also, the DeletionInfo class in 
AbstractColumnContainer somehow gives a compile error in eclipse, so I had to 
change it to protected.

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch






[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: rebased , against HEAD of trunk 
(4629648899e637e8e03938935f126689cce5ad48))

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch






[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_c.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch






[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: 

rebased against against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)


also fixed a bug in my newly added test; also the DeletionInfo class in 
AbstractColumnContainer somehow gives compile error in eclipse, had to change 
that into protected. )

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() have an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve, 
 descending or ascending, but ascending works). so the current logic is 
 simply to compare the new column against the last column in the array: if 
 the names are not equal, append; if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang
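The append-or-reconcile logic described above can be sketched roughly as follows. This is a minimal, hypothetical Java sketch, not the patch's actual FastColumnFamily API; the class, the `Column` struct, and timestamp-based reconciliation are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;

// Array-backed column container: callers append columns in sorted order, so
// addColumn() only compares against the last element -- append on a new name,
// reconcile (keep the newer value) on a duplicate name. Single-threaded read
// path, so no synchronization and no ConcurrentSkipListMap.
public class FastColumnsSketch {
    static final class Column {
        final String name;
        final String value;
        final long timestamp;
        Column(String name, String value, long timestamp) {
            this.name = name; this.value = value; this.timestamp = timestamp;
        }
    }

    private final List<Column> columns = new ArrayList<Column>();

    public void addColumn(Column c) {
        if (columns.isEmpty()) {
            columns.add(c);
            return;
        }
        Column last = columns.get(columns.size() - 1);
        int cmp = last.name.compareTo(c.name);
        if (cmp < 0) {
            columns.add(c);                  // strictly greater name: append
        } else if (cmp == 0) {
            // same name: reconcile, keeping the column with the newer timestamp
            if (c.timestamp > last.timestamp)
                columns.set(columns.size() - 1, c);
        } else {
            // out-of-order input would need the binary-search fallback the
            // reporter mentions; the sorted-input invariant rules it out here
            throw new IllegalArgumentException("columns must arrive sorted");
        }
    }

    public int size() { return columns.size(); }

    public static void main(String[] args) {
        FastColumnsSketch cf = new FastColumnsSketch();
        cf.addColumn(new Column("a", "1", 10));
        cf.addColumn(new Column("b", "2", 10));
        cf.addColumn(new Column("b", "3", 20));   // reconciles, does not append
        if (cf.size() != 2) throw new AssertionError("expected 2 columns");
        System.out.println("ok");
    }
}
```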

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2882) describe_ring should include datacenter/topology information

2011-07-14 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065496#comment-13065496
 ] 

Nate McCall commented on CASSANDRA-2882:


This should be considered related to CASSANDRA-1777

I think both of these are crucial to providing clients that can take advantage 
of topologies. 

 describe_ring should include datacenter/topology information
 

 Key: CASSANDRA-2882
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2882
 Project: Cassandra
  Issue Type: Improvement
  Components: API, Core
Reporter: Mark Guzman
Priority: Minor

 describe_ring is great for getting a list of nodes in the cluster, but it 
 doesn't provide any information about the network topology, which prevents 
 its use in a multi-dc setup. It would be nice if we added another list to 
 the TokenRange object containing the DC information. 
 Optimally I could ask any Cassandra node for this information and on the 
 client-side prefer local nodes but be able to fail over to remote nodes without 
 requiring another lookup.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2129) removetoken after removetoken rf error fails to work

2011-07-14 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2129:


Reviewer: xedin  (was: thepaul)

 removetoken after removetoken rf error fails to work
 

 Key: CASSANDRA-2129
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2129
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Mike Bulman
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2129-v2.txt, 2129.txt

   Original Estimate: 4h
  Remaining Estimate: 4h

 2 node cluster, a keyspace existed with rf=2.  Tried removetoken and got:
 mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ 
 bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
 Exception in thread "main" java.lang.IllegalStateException: replication 
 factor (2) exceeds number of endpoints (1)
 Deleted the keyspace, and tried again:
 mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ 
 bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
 Exception in thread "main" java.lang.UnsupportedOperationException: This node 
 is already processing a removal. Wait for it to complete.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

2011-07-14 Thread paul cannon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul cannon updated CASSANDRA-2496:
---

Attachment: 0003-update-gossip-related-comments.patch.txt

These small patches build on the others.

0003-update-gossip-related-comments.patch.txt: updates gossip-related comments 
derp derp.

 Gossip should handle 'dead' states
 --

 Key: CASSANDRA-2496
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
 Attachments: 0001-Rework-token-removal-process.txt, 
 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt


 For background, see CASSANDRA-2371

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

2011-07-14 Thread paul cannon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul cannon updated CASSANDRA-2496:
---

Attachment: 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt

0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt: use REMOVED_TOKEN instead of 
STATUS_LEFT (would probably be ok either way, but otherwise, the REMOVED_TOKEN 
state would not be used). Seems this is more the way it was intended.

 Gossip should handle 'dead' states
 --

 Key: CASSANDRA-2496
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
 Attachments: 0001-Rework-token-removal-process.txt, 
 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 
 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt


 For background, see CASSANDRA-2371

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states

2011-07-14 Thread paul cannon (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

paul cannon updated CASSANDRA-2496:
---

Attachment: 0005-drain-self-if-removetoken-d-elsewhere.patch.txt

0005-drain-self-if-removetoken-d-elsewhere.patch.txt: when node X was 
partitioned and removetoken'd but then shows up again, it should shut itself 
down rather than becoming a zombie.

 Gossip should handle 'dead' states
 --

 Key: CASSANDRA-2496
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
 Attachments: 0001-Rework-token-removal-process.txt, 
 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 
 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 
 0005-drain-self-if-removetoken-d-elsewhere.patch.txt


 For background, see CASSANDRA-2371

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states

2011-07-14 Thread paul cannon (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065531#comment-13065531
 ] 

paul cannon commented on CASSANDRA-2496:


I'll see what I can do to test the "infinitely loop retrying the confirmation" 
and "bootstrapping/moving nodes can take over these dead tokens" situations.

 Gossip should handle 'dead' states
 --

 Key: CASSANDRA-2496
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
 Attachments: 0001-Rework-token-removal-process.txt, 
 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 
 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 
 0005-drain-self-if-removetoken-d-elsewhere.patch.txt


 For background, see CASSANDRA-2371

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-14 Thread Yang Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_c.patch

rebased against 4629648899e637e8e03938935f126689cce5ad48

also fixed a bug in my test,
the AbstractColumnContainer.DeletionInfo has to be protected; otherwise Eclipse 
gives a compile error

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains > 1000 columns, the run time becomes considerably 
 slow (my test of a row with 3000 columns (standard, regular), each with 8 bytes 
 in name and 40 bytes in value, takes about 16ms).
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be List&lt;ColumnOrSuperColumn&gt;, so it does not make sense to use a luxury map 
 data structure in the interim and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single-threaded, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() have an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve, 
 descending or ascending, but ascending works). so the current logic is 
 simply to compare the new column against the last column in the array: if 
 the names are not equal, append; if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2129) removetoken after removetoken rf error fails to work

2011-07-14 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065542#comment-13065542
 ] 

Pavel Yaskevich commented on CASSANDRA-2129:


+1

 removetoken after removetoken rf error fails to work
 

 Key: CASSANDRA-2129
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2129
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Mike Bulman
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2129-v2.txt, 2129.txt

   Original Estimate: 4h
  Remaining Estimate: 4h

 2 node cluster, a keyspace existed with rf=2.  Tried removetoken and got:
 mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ 
 bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
 Exception in thread "main" java.lang.IllegalStateException: replication 
 factor (2) exceeds number of endpoints (1)
 Deleted the keyspace, and tried again:
 mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ 
 bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
 Exception in thread "main" java.lang.UnsupportedOperationException: This node 
 is already processing a removal. Wait for it to complete.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1146900 - in /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra: locator/ service/

2011-07-14 Thread brandonwilliams
Author: brandonwilliams
Date: Thu Jul 14 21:24:11 2011
New Revision: 1146900

URL: http://svn.apache.org/viewvc?rev=1146900&view=rev
Log:
Allow RF to exceed the number of nodes (but disallow writes)
Patch by brandonwilliams, reviewed by Pavel Yaskevich for CASSANDRA-2129

Modified:

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/OldNetworkTopologyStrategy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/SimpleStrategy.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/WriteResponseHandler.java

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java?rev=1146900&r1=1146899&r2=1146900&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
 Thu Jul 14 21:24:11 2011
@@ -87,9 +87,8 @@ public abstract class AbstractReplicatio
      * we return a List to avoid an extra allocation when sorting by proximity later
      * @param searchToken the token the natural endpoints are requested for
      * @return a copy of the natural endpoints for the given token
-     * @throws IllegalStateException if the number of requested replicas is greater than the number of known endpoints
      */
-    public ArrayList<InetAddress> getNaturalEndpoints(Token searchToken) throws IllegalStateException
+    public ArrayList<InetAddress> getNaturalEndpoints(Token searchToken)
     {
         Token keyToken = TokenMetadata.firstToken(tokenMetadata.sortedTokens(), searchToken);
         ArrayList<InetAddress> endpoints = getCachedEndpoints(keyToken);
@@ -99,10 +98,6 @@ public abstract class AbstractReplicatio
             keyToken = TokenMetadata.firstToken(tokenMetadataClone.sortedTokens(), searchToken);
             endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tokenMetadataClone));
             cacheEndpoint(keyToken, endpoints);
-            // calculateNaturalEndpoints should have checked this already, this is a safety
-            assert getReplicationFactor() <= endpoints.size() : String.format("endpoints %s generated for RF of %s",
-                                                                              Arrays.toString(endpoints.toArray()),
-                                                                              getReplicationFactor());
         }
 
         return new ArrayList<InetAddress>(endpoints);
@@ -115,9 +110,8 @@ public abstract class AbstractReplicatio
      *
      * @param searchToken the token the natural endpoints are requested for
      * @return a copy of the natural endpoints for the given token
-     * @throws IllegalStateException if the number of requested replicas is greater than the number of known endpoints
      */
-    public abstract List<InetAddress> calculateNaturalEndpoints(Token searchToken, TokenMetadata tokenMetadata) throws IllegalStateException;
+    public abstract List<InetAddress> calculateNaturalEndpoints(Token searchToken, TokenMetadata tokenMetadata);
 
     public IWriteResponseHandler getWriteResponseHandler(Collection<InetAddress> writeEndpoints,
                                                          Multimap<InetAddress, InetAddress> hintedEndpoints,

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java?rev=1146900&r1=1146899&r2=1146900&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java
 Thu Jul 14 21:24:11 2011
@@ -120,9 +120,6 @@ public class NetworkTopologyStrategy ext
                     dcEndpoints.add(endpoint);
             }
 
-            if (dcEndpoints.size() < dcReplicas)
-                throw new IllegalStateException(String.format("datacenter (%s) has no more endpoints, (%s) replicas still needed",
-                                                              dcName, dcReplicas - dcEndpoints.size()));
             if (logger.isDebugEnabled())
                 logger.debug("{} endpoints in datacenter {} for token {} ",

[Cassandra Wiki] Update of JmxInterface by defmikekoh

2011-07-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The JmxInterface page has been changed by defmikekoh:
http://wiki.apache.org/cassandra/JmxInterface?action=diff&rev1=24&rev2=25

- If you start it using the standard startup script, Cassandra will listen for 
connections on port 8080 to view and tweak variables which it exposes via 
[[http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/|JMX]]. 
This may be helpful for debugging and monitoring.
+ If you start it using the standard startup script, Cassandra will listen for 
connections on port 8080 (port 7199 starting in 0.8.0-beta1) to view and tweak 
variables which it exposes via 
[[http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/|JMX]]. 
This may be helpful for debugging and monitoring.  See also [[JmxGotchas]].
  
  The MemtableThresholds page describes how to use 
[[http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html|Jconsole]] 
as a client for this.
  


[Cassandra Wiki] Update of RunningCassandra by defmikekoh

2011-07-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The RunningCassandra page has been changed by defmikekoh:
http://wiki.apache.org/cassandra/RunningCassandra?action=diff&rev1=15&rev2=16

  $ CASSANDRA_INCLUDE=/tmp/new.in.sh bin/cassandra
  }}}
  
- Among other things, the defaults in `bin/cassandra.in.sh` include a maximum 
heap size (-Xmx) of 1GB, which you'll almost certainly want to consider 
tailoring for your environment. The port to access Cassandra's JmxInterface is 
also configured here through the `com.sun.management.jmxremote.port` property 
and defaults to 8080. 
+ Among other things, the defaults in `bin/cassandra.in.sh` include a maximum 
heap size (-Xmx) of 1GB, which you'll almost certainly want to consider 
tailoring for your environment. The port to access Cassandra's JmxInterface is 
also configured here through the `com.sun.management.jmxremote.port` property 
and defaults to 8080 (7199 starting in v0.8.0-beta1). 
  
  Additionally, the script recognizes a number of command line arguments, 
invoking the script with the `-h` option prints a brief summary of them.
  


[Cassandra Wiki] Update of MemtableThresholds by defmikekoh

2011-07-14 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The MemtableThresholds page has been changed by defmikekoh:
http://wiki.apache.org/cassandra/MemtableThresholds?action=diff&rev1=21&rev2=22

  == Using Jconsole To Optimize Thresholds ==
  Cassandra's column-family mbeans have a number of attributes that can prove 
invaluable in determining optimal thresholds. One way to access this 
instrumentation is by using Jconsole, a graphical monitoring and management 
application that ships with your JDK.
  
- Launching Jconsole with no arguments will display the New Connection dialog 
box. If you are running Jconsole on the same machine that  Cassandra is running 
on, then you can connect using the PID, otherwise you will need to connect 
remotely. The default startup scripts for Cassandra cause the VM to listen on 
port 8080 using the JVM option:
+ Launching Jconsole with no arguments will display the New Connection dialog 
box. If you are running Jconsole on the same machine that  Cassandra is running 
on, then you can connect using the PID, otherwise you will need to connect 
remotely. The default startup scripts for Cassandra cause the VM to listen on 
port 8080 (7199 starting in v0.8.0-beta1) using the JVM option:
  
   . -Dcom.sun.management.jmxremote.port=8080
  


svn commit: r1146923 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java

2011-07-14 Thread brandonwilliams
Author: brandonwilliams
Date: Thu Jul 14 23:42:11 2011
New Revision: 1146923

URL: http://svn.apache.org/viewvc?rev=1146923view=rev
Log:
Do not allow extra params to nodetool commands to prevent confusion.
Patch by Jon Hermes, reviewed by brandonwilliams for CASSANDRA-2740

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java?rev=1146923&r1=1146922&r2=1146923&view=diff
==
--- 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java
 (original)
+++ 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java
 Thu Jul 14 23:42:11 2011
@@ -539,19 +539,19 @@ public class NodeCmd
 
 switch (command)
 {
-case RING: nodeCmd.printRing(System.out); break;
-case INFO: nodeCmd.printInfo(System.out); break;
-case CFSTATS : nodeCmd.printColumnFamilyStats(System.out); 
break;
-case DECOMMISSION: probe.decommission(); break;
-case LOADBALANCE : probe.loadBalance(); break;
-case CLEARSNAPSHOT   : probe.clearSnapshot(); break;
-case TPSTATS : nodeCmd.printThreadPoolStats(System.out); 
break;
-case VERSION : nodeCmd.printReleaseVersion(System.out); 
break;
-case COMPACTIONSTATS : nodeCmd.printCompactionStats(System.out); 
break;
-case DISABLEGOSSIP   : probe.stopGossiping(); break;
-case ENABLEGOSSIP: probe.startGossiping(); break;
-case DISABLETHRIFT   : probe.stopThriftServer(); break;
-case ENABLETHRIFT: probe.startThriftServer(); break;
+case RING: complainNonzeroArgs(arguments, command); 
nodeCmd.printRing(System.out); break;
+case INFO: complainNonzeroArgs(arguments, command); 
nodeCmd.printInfo(System.out); break;
+case CFSTATS : complainNonzeroArgs(arguments, command); 
nodeCmd.printColumnFamilyStats(System.out); break;
+case DECOMMISSION: complainNonzeroArgs(arguments, command); 
probe.decommission(); break;
+case LOADBALANCE : complainNonzeroArgs(arguments, command); 
probe.loadBalance(); break;
+case CLEARSNAPSHOT   : complainNonzeroArgs(arguments, command); 
probe.clearSnapshot(); break;
+case TPSTATS : complainNonzeroArgs(arguments, command); 
nodeCmd.printThreadPoolStats(System.out); break;
+case VERSION : complainNonzeroArgs(arguments, command); 
nodeCmd.printReleaseVersion(System.out); break;
+case COMPACTIONSTATS : complainNonzeroArgs(arguments, command); 
nodeCmd.printCompactionStats(System.out); break;
+case DISABLEGOSSIP   : complainNonzeroArgs(arguments, command); 
probe.stopGossiping(); break;
+case ENABLEGOSSIP: complainNonzeroArgs(arguments, command); 
probe.startGossiping(); break;
+case DISABLETHRIFT   : complainNonzeroArgs(arguments, command); 
probe.stopThriftServer(); break;
+case ENABLETHRIFT: complainNonzeroArgs(arguments, command); 
probe.startThriftServer(); break;
 
 case DRAIN :
 try { probe.drain(); }
@@ -647,6 +647,15 @@ public class NodeCmd
 System.exit(3);
 }
 
+    private static void complainNonzeroArgs(String[] args, NodeCommand cmd)
+    {
+        if (args.length > 0) {
+            System.err.println("Too many arguments for command '" + cmd.toString() + "'.");
+            printUsage();
+            System.exit(1);
+        }
+    }
+
 private static void optionalKSandCFs(NodeCommand nc, String[] cmdArgs, 
NodeProbe probe) throws InterruptedException, IOException
 {
 // if there is one additional arg, it's the keyspace; more are 
columnfamilies




[jira] [Commented] (CASSANDRA-2740) nodetool decommission should throw an error when there are extra params

2011-07-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065624#comment-13065624
 ] 

Hudson commented on CASSANDRA-2740:
---

Integrated in Cassandra-0.7 #528 (See 
[https://builds.apache.org/job/Cassandra-0.7/528/])
Do not allow extra params to nodetool commands to prevent confusion.
Patch by Jon Hermes, reviewed by brandonwilliams for CASSANDRA-2740

brandonwilliams : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146923
Files : 
* 
/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java


 nodetool decommission should throw an error when there are extra params
 ---

 Key: CASSANDRA-2740
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2740
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Jon Hermes
Priority: Trivial
 Fix For: 0.7.8

 Attachments: 2740.txt


 removetoken takes a token parameter, but decommission works against the node 
 where the call is issued.  This allows confusion such as 'nodetool -h 
 localhost decommission <ip or token>' actually decommissioning the local 
 node, instead of whatever was passed to it.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (CASSANDRA-2894) add paging to get_count

2011-07-14 Thread Vijay (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vijay reassigned CASSANDRA-2894:


Assignee: Vijay

 add paging to get_count
 ---

 Key: CASSANDRA-2894
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2894
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Reporter: Jonathan Ellis
Assignee: Vijay
Priority: Minor
  Labels: lhf
 Fix For: 1.0


 It is non-intuitive that get_count materializes the entire slice-to-count on 
 the coordinator node (to perform read repair and > CL.ONE consistency).  Even 
 experienced users have been known to cause memory problems by requesting 
 large counts.
 The user cannot page the count himself, because you need a start and stop 
 column to do that, and get_count only returns an integer.
 So the best fix is for us to do the paging under the hood, in 
 CassandraServer.  Add a limit to the slicepredicate they specify, and page 
 through it.
 We could add a global setting for count_slice_size, and document that counts 
 of more columns than that will have higher latency (because they make 
 multiple calls through StorageProxy for the pages).
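 The under-the-hood paging proposed here could look roughly like the following
 minimal sketch. `ColumnSource`, its `fetch()` signature, and the in-memory
 stand-in are assumptions for illustration, not CassandraServer's real API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of server-side count paging: repeatedly issue a bounded slice and
// restart each page just after the last column seen, so at most `pageSize`
// columns are ever materialized on the coordinator at once.
public class PagedCountSketch {
    interface ColumnSource {
        // up to `limit` column names strictly after `startExclusive`
        // (null = from the first column); stands in for a bounded
        // SliceQueryFilter issued through StorageProxy
        List<String> fetch(String startExclusive, int limit);
    }

    static int countSlice(ColumnSource source, int pageSize) {
        int total = 0;
        String lastSeen = null;
        while (true) {
            List<String> page = source.fetch(lastSeen, pageSize);
            total += page.size();
            if (page.size() < pageSize)
                break;                        // short page: row exhausted
            lastSeen = page.get(page.size() - 1);
        }
        return total;
    }

    // in-memory stand-in so the sketch is runnable
    static ColumnSource of(final List<String> sortedNames) {
        return (start, limit) -> {
            List<String> out = new ArrayList<>();
            for (String name : sortedNames)
                if ((start == null || name.compareTo(start) > 0) && out.size() < limit)
                    out.add(name);
            return out;
        };
    }

    public static void main(String[] args) {
        List<String> cols = new ArrayList<>();
        for (int i = 0; i < 2500; i++)
            cols.add(String.format("col%06d", i));
        // a hypothetical count_slice_size of 1000 -> three pages here
        int count = countSlice(of(cols), 1000);
        if (count != 2500) throw new AssertionError("got " + count);
        System.out.println(count);   // 2500
    }
}
```

 Each page costs one round trip through the storage layer, which is the
 documented latency trade-off for large counts.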

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-1876) Allow minor Parallel Compaction

2011-07-14 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood resolved CASSANDRA-1876.
-

Resolution: Fixed

I think this one has been sufficiently resolved in trunk.

 Allow minor Parallel Compaction
 ---

 Key: CASSANDRA-1876
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876
 Project: Cassandra
  Issue Type: Improvement
Reporter: Germán Kondolf
Priority: Minor
 Attachments: 1876-reformatted.txt, compactionPatch-V2.txt, 
 compactionPatch-V3.txt


 Hi,
 According to the dev's list discussion (1) I've patched the CompactionManager 
 to allow parallel compaction.
 Mainly it splits the sstables to compact in the desired buckets, configured 
 by a new parameter: compaction_parallelism with the current default of 1.
 Then, it just submits the units of work to a new executor and waits for the 
 finalization.
 The patch was created in the trunk, so I don't know the exact affected 
 version, I assume that is 0.8.
 I'll try to apply this patch to 0.6.X also for my current production 
 installation, and then reattach it.
 (1) http://markmail.org/thread/cldnqfh3s3nufnke

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-2901) Allow taking advantage of multiple cores while compacting a single CF

2011-07-14 Thread Jonathan Ellis (JIRA)
Allow taking advantage of multiple cores while compacting a single CF
-

 Key: CASSANDRA-2901
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2901
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor


Moved from CASSANDRA-1876:

There are five stages: read, deserialize, merge, serialize, and write. We 
probably want to continue doing read+deserialize and serialize+write together, 
or you waste a lot of time copying to/from buffers.

So, what I would suggest is: one thread per input sstable doing read + 
deserialize (a row at a time). One thread merging corresponding rows from each 
input sstable. One thread doing serialize + writing the output. This should 
give us between 2x and 3x speedup (depending on how much we save by doing the 
merge on a separate thread from the write).

This will require roughly 2x the memory, to allow the reader threads to work 
ahead of the merge stage. (I.e. for each input sstable you will have up to one 
row in a queue waiting to be merged, and the reader thread working on the 
next.) Seems quite reasonable on that front.

Multithreaded compaction should be either on or off. It doesn't make sense to 
try to do things halfway (by doing the reads with a thread pool whose size you 
can grow/shrink, for instance): we still have compaction threads tuned to low 
priority by default, so the impact on the rest of the system won't be very 
different. Nor do we expect to have so many input sstables that we lose a lot 
in context switching between reader threads. (If this is a concern, we already 
have a tunable to limit the number of sstables merged at a time in a single 
CF.)

IMO it's acceptable to punt completely on rows that are larger than memory, and 
fall back to the old non-parallel code there. I don't see any sane way to 
parallelize large-row compactions.





[jira] [Commented] (CASSANDRA-1876) Allow minor Parallel Compaction

2011-07-14 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065688#comment-13065688
 ] 

Jonathan Ellis commented on CASSANDRA-1876:
---

created CASSANDRA-2901 to follow up on concurrency-for-single-merge.

 Allow minor Parallel Compaction
 ---

 Key: CASSANDRA-1876
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876
 Project: Cassandra
  Issue Type: Improvement
Reporter: Germán Kondolf
Priority: Minor
 Attachments: 1876-reformatted.txt, compactionPatch-V2.txt, 
 compactionPatch-V3.txt


 Hi,
 According to the dev's list discussion (1) I've patched the CompactionManager 
 to allow parallel compaction.
 Mainly, it splits the sstables to compact into the desired buckets, configured 
 by a new parameter, compaction_parallelism, with a current default of 1.
 Then it just submits the units of work to a new executor and waits for them 
 to finish.
 The patch was created against trunk, so I don't know the exact affected 
 version; I assume it is 0.8.
 I'll also try to apply this patch to 0.6.x for my current production 
 installation, and then reattach it.
 (1) http://markmail.org/thread/cldnqfh3s3nufnke
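The scheme the quoted description outlines (bucket the work, submit each bucket to an executor sized by compaction_parallelism, wait for everything to finish) can be sketched roughly as follows. This is an invented illustration, not the patch code: BucketedCompaction and its names are hypothetical, and integer lists stand in for groups of sstables, with a sum standing in for compacting one bucket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of bucketed submission: one task per bucket on a pool
// whose size plays the role of the compaction_parallelism setting.
public class BucketedCompaction {
    public static List<Integer> compactAll(List<List<Integer>> buckets, int parallelism) {
        ExecutorService pool = Executors.newFixedThreadPool(parallelism);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (List<Integer> bucket : buckets) {
                // Stand-in for "compact one bucket": sum its elements.
                tasks.add(() -> bucket.stream().mapToInt(Integer::intValue).sum());
            }
            List<Integer> results = new ArrayList<>();
            // invokeAll submits every task and blocks until all complete
            // ("waits for them to finish"), preserving task order.
            for (Future<Integer> f : pool.invokeAll(tasks)) results.add(f.get());
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With parallelism greater than 1, independent buckets compact concurrently; with the default of 1, the pool degenerates to today's single-threaded behavior.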
