[jira] [Commented] (CASSANDRA-2252) off-heap memtables
[ https://issues.apache.org/jira/browse/CASSANDRA-2252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065135#comment-13065135 ]

Stu Hood commented on CASSANDRA-2252:
-------------------------------------

bq. And if they are so large they do not, then your rate of key allocation is glacial and again it shouldn't matter.

Compaction builds up an IndexSummary slowly enough that I theorized it might be causing fragmentation... didn't get a chance to prove it, though.

bq. There is no logical unit of slabbing for key cache, we shouldn't be doing that at all.

Agreed. We actually ended up disabling the key cache and saw a nice boost in time-to-promotion-failure, but I would love to find an actual solution.

bq. Once you promoted a slab in old gen, it stays there, instead of being GC'd and replaced with a slab in new gen again.

The bookkeeping might be worth it, yes.

off-heap memtables
------------------

                Key: CASSANDRA-2252
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2252
            Project: Cassandra
         Issue Type: Improvement
           Reporter: Jonathan Ellis
           Assignee: Jonathan Ellis
            Fix For: 1.0
        Attachments: 0001-add-MemtableAllocator.txt, 0002-add-off-heap-MemtableAllocator-support.txt, 2252-v3.txt, merged-2252.tgz
  Original Estimate: 0.4h
 Remaining Estimate: 0.4h

The memtable design practically actively fights Java's GC design. Todd Lipcon gave a good explanation over on HBASE-3455.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2868) Native Memory Leak
[ https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Burroughs updated CASSANDRA-2868:
---------------------------------------

    Attachment: low-load-36-hours-initial-results.png

Initial results: a graph of VmRSS from /proc/PID/status at 10-second intervals, from my last comment to now. The box on the left has GCInspector disabled. These are two test boxes under trivial load, so this is all still *very* tentative. Will start testing under real load by early next week.

Native Memory Leak
------------------

                Key: CASSANDRA-2868
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
            Project: Cassandra
         Issue Type: Bug
         Components: Core
   Affects Versions: 0.7.6
           Reporter: Daniel Doubleday
           Priority: Minor
        Attachments: 2868-v1.txt, low-load-36-hours-initial-results.png

We have memory issues with long-running servers. These have been confirmed by several users on the user list; that's why I'm reporting it. The memory consumption of the Cassandra Java process increases steadily until it's killed by the OS because of OOM (with no swap).

Our server is started with -Xmx3000M and has been running for around 23 days. pmap -x shows:

  Total SST: 1961616 (mem-mapped data and index files)
  Anon RSS:  6499640
  Total RSS: 8478376

This shows that 3G are 'overallocated'. We will use BRAF on one of our less important nodes to check whether it is related to mmap and report back.
[jira] [Commented] (CASSANDRA-2868) Native Memory Leak
[ https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065247#comment-13065247 ]

Jonathan Ellis commented on CASSANDRA-2868:
-------------------------------------------

Promising!

Native Memory Leak
------------------

                Key: CASSANDRA-2868
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
[jira] [Updated] (CASSANDRA-2753) Capture the max client timestamp for an SSTable
[ https://issues.apache.org/jira/browse/CASSANDRA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Doubleday updated CASSANDRA-2753:
----------------------------------------

    Attachment: SSTableWriterTest.patch

Not sure if SSTableWriterTest is the right place, but the added test would break.

Capture the max client timestamp for an SSTable
-----------------------------------------------

                Key: CASSANDRA-2753
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2753
            Project: Cassandra
         Issue Type: New Feature
         Components: Core
           Reporter: Alan Liang
           Assignee: Alan Liang
           Priority: Minor
            Fix For: 1.0
        Attachments: 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V2.patch, 0001-capture-max-timestamp-and-created-SSTableMetadata-to-V3.patch, 0001-capture-max-timestamp-and-created-SSTableMetadata-to.patch, 0003-capture-max-timestamp-for-sstable-and-introduced-SST.patch, SSTableWriterTest.patch, supercolumn.patch
svn commit: r1146732 - /cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java
Author: jbellis
Date: Thu Jul 14 14:32:16 2011
New Revision: 1146732

URL: http://svn.apache.org/viewvc?rev=1146732&view=rev
Log:
add test for including supercolumn tombstone time in max timestamp computation
patch by Daniel Doubleday; reviewed by jbellis for CASSANDRA-2753

Modified:
    cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java

Modified: cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java?rev=1146732&r1=1146731&r2=1146732&view=diff
==============================================================================
--- cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java (original)
+++ cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java Thu Jul 14 14:32:16 2011
@@ -21,16 +21,15 @@ package org.apache.cassandra.io.sstable;
  */

+import static org.apache.cassandra.Util.addMutation;
 import static org.junit.Assert.*;

 import java.io.IOException;
 import java.nio.ByteBuffer;
-import java.util.Arrays;
-import java.util.HashMap;
-import java.util.List;
-import java.util.Map;
+import java.util.*;
 import java.util.concurrent.ExecutionException;

+import org.apache.cassandra.Util;
 import org.junit.Test;

 import org.apache.cassandra.CleanupHelper;
@@ -137,4 +136,38 @@ public class SSTableWriterTest extends C
         // ensure max timestamp is captured during rebuild
         assert sstr.getMaxTimestamp() == 4321L;
     }
+
+    @Test
+    public void testSuperColumnMaxTimestamp() throws IOException, ExecutionException, InterruptedException
+    {
+        ColumnFamilyStore store = Table.open("Keyspace1").getColumnFamilyStore("Super1");
+        RowMutation rm;
+        DecoratedKey dk = Util.dk("key1");
+
+        // add data
+        rm = new RowMutation("Keyspace1", dk.key);
+        addMutation(rm, "Super1", "SC1", 1, "val1", 0);
+        rm.apply();
+        store.forceBlockingFlush();
+
+        validateMinTimeStamp(store.getSSTables(), 0);
+
+        // remove
+        rm = new RowMutation("Keyspace1", dk.key);
+        rm.delete(new QueryPath("Super1", ByteBufferUtil.bytes("SC1")), 1);
+        rm.apply();
+        store.forceBlockingFlush();
+
+        validateMinTimeStamp(store.getSSTables(), 0);
+
+        CompactionManager.instance.performMaximal(store);
+        assertEquals(1, store.getSSTables().size());
+        validateMinTimeStamp(store.getSSTables(), 1);
+    }
+
+    private void validateMinTimeStamp(Collection<SSTableReader> ssTables, int timestamp)
+    {
+        for (SSTableReader ssTable : ssTables)
+            assertTrue(ssTable.getMaxTimestamp() >= timestamp);
+    }
 }
[jira] [Commented] (CASSANDRA-2753) Capture the max client timestamp for an SSTable
[ https://issues.apache.org/jira/browse/CASSANDRA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065292#comment-13065292 ]

Jonathan Ellis commented on CASSANDRA-2753:
-------------------------------------------

lgtm, thanks!

Capture the max client timestamp for an SSTable
-----------------------------------------------

                Key: CASSANDRA-2753
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2753
[jira] [Updated] (CASSANDRA-47) SSTable compression
[ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-47:
-------------------------------------

    Attachment: CASSANDRA-47-v2.patch

v2 eliminates the need for sparse files: an index section holding the chunk sizes is added at the end of the file, so the header grows to 18 bytes: 2 control bytes, 8 bytes for the uncompressed size, and 8 bytes for the offset of the index section. That approach uses no additional space except for the 4 bytes required to store the index section length (at the head of that section); chunk sizes are essential information, so I don't count the space needed to store them as overhead. Also tried importing/exporting large files using sstable2json and json2sstable to make sure it works.

SSTable compression
-------------------

                Key: CASSANDRA-47
                URL: https://issues.apache.org/jira/browse/CASSANDRA-47
            Project: Cassandra
         Issue Type: New Feature
         Components: Core
           Reporter: Jonathan Ellis
           Assignee: Pavel Yaskevich
             Labels: compression
            Fix For: 1.0
        Attachments: CASSANDRA-47-v2.patch, CASSANDRA-47.patch, snappy-java-1.0.3-rc4.jar

We should be able to do SSTable compression, which would trade CPU for I/O (almost always a good trade).
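The 18-byte header described in the comment above can be sketched as follows; the class and method names are illustrative stand-ins, not code from the patch:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CompressedFileHeader
{
    // 2 control bytes + 8-byte uncompressed size + 8-byte index offset = 18 bytes
    public static byte[] write(byte control1, byte control2,
                               long uncompressedSize, long indexOffset) throws IOException
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(18);
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeByte(control1);
        out.writeByte(control2);
        out.writeLong(uncompressedSize);  // 8 bytes: total uncompressed data length
        out.writeLong(indexOffset);       // 8 bytes: where the chunk-size index section starts
        out.flush();
        return bytes.toByteArray();
    }
}
```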
[jira] [Commented] (CASSANDRA-47) SSTable compression
[ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065300#comment-13065300 ]

Pavel Yaskevich commented on CASSANDRA-47:
------------------------------------------

Forgot to mention that this is rebased against the latest trunk (latest commit 4629648899e637e8e03938935f126689cce5ad48).

SSTable compression
-------------------

                Key: CASSANDRA-47
                URL: https://issues.apache.org/jira/browse/CASSANDRA-47
[jira] [Commented] (CASSANDRA-47) SSTable compression
[ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065311#comment-13065311 ]

Jonathan Ellis commented on CASSANDRA-47:
-----------------------------------------

bq. index section added at the end of the file to hold chunk sizes

We used to do this with the index entries, but keeping that in memory until you're done can cause a lot of memory pressure. I like the suggestion of moving the index entry to (key, compressed-chunk-offset, uncompressed-offset-within-chunk) better.

SSTable compression
-------------------

                Key: CASSANDRA-47
                URL: https://issues.apache.org/jira/browse/CASSANDRA-47
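Assuming fixed-size uncompressed chunks, the suggested (compressed-chunk-offset, uncompressed-offset-within-chunk) entry can be derived from a position in the uncompressed stream plus the per-chunk compressed offsets. This is a hypothetical sketch of the idea, not code from either patch:

```java
public class ChunkIndexEntry
{
    // Translate an uncompressed stream position into
    // { compressed file offset of the chunk, offset within the decompressed chunk }.
    public static long[] locate(long uncompressedPosition, int chunkLength,
                                long[] compressedChunkOffsets)
    {
        int chunk = (int) (uncompressedPosition / chunkLength);
        long withinChunk = uncompressedPosition % chunkLength;
        return new long[] { compressedChunkOffsets[chunk], withinChunk };
    }
}
```

To read a row, seek to the compressed chunk offset, decompress one chunk, and skip the within-chunk offset in the decompressed buffer; no full chunk-size index is needed at write time.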
[jira] [Commented] (CASSANDRA-2753) Capture the max client timestamp for an SSTable
[ https://issues.apache.org/jira/browse/CASSANDRA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065316#comment-13065316 ]

Hudson commented on CASSANDRA-2753:
-----------------------------------

Integrated in Cassandra #958 (see [https://builds.apache.org/job/Cassandra/958/]):

    add test for including supercolumn tombstone time in max timestamp computation
    patch by Daniel Doubleday; reviewed by jbellis for CASSANDRA-2753

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146732
Files :
* /cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java

Capture the max client timestamp for an SSTable
-----------------------------------------------

                Key: CASSANDRA-2753
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2753
[jira] [Commented] (CASSANDRA-47) SSTable compression
[ https://issues.apache.org/jira/browse/CASSANDRA-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065321#comment-13065321 ]

Pavel Yaskevich commented on CASSANDRA-47:
------------------------------------------

I did that for the flexibility of using CompressedDataFile without relying on the index file. That index is read only once, upon CompressedSegmentedFile completion, and then just gets passed to the constructor in CSF.getSegment(), so even for a very big file of 5-7 GB it only costs about 1 megabyte of overhead to keep the index in memory. The index also allows us to skip reading an additional 4 bytes of chunk length from the file every time we re-buffer.

SSTable compression
-------------------

                Key: CASSANDRA-47
                URL: https://issues.apache.org/jira/browse/CASSANDRA-47
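As a back-of-the-envelope check on the "about 1 megabyte" figure above, assuming (hypothetically) 64 KB chunks and 8-byte index entries; the actual chunk size and entry width in the patch may differ:

```java
public class IndexOverhead
{
    // In-memory index size: one entry per chunk of the uncompressed file.
    public static long indexBytes(long fileSize, int chunkSize, int bytesPerEntry)
    {
        long chunks = (fileSize + chunkSize - 1) / chunkSize;  // ceiling division
        return chunks * bytesPerEntry;
    }

    public static void main(String[] args)
    {
        long sevenGb = 7L * 1024 * 1024 * 1024;
        // 114688 chunks * 8 bytes = 917504 bytes, just under 1 MB
        System.out.println(indexBytes(sevenGb, 64 * 1024, 8));
    }
}
```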
[Cassandra Wiki] Trivial Update of ThirdPartySupport by Michael Weir
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The ThirdPartySupport page has been changed by Michael Weir:
http://wiki.apache.org/cassandra/ThirdPartySupport?action=diff&rev1=14&rev2=15

  Companies providing support for Apache Cassandra are not endorsed by the Apache Software Foundation, although some of these companies employ [[Committers]] to the Apache project.

- {{http://media.acunu.com/library/logo.png}} [[http://www.acunu.com|Acunu]] offers the Acunu Data Platform for faster, more consistent performance from your Cassandra applications (free to use for up to 2 nodes). Acunu also provides Cassandra training, support and professional services.
+ Companies that employ Apache Cassandra Committers:

  {{http://www.datastax.com/sites/all/themes/datastax20110201/logo.png}} [[http://datastax.com|Datastax]] DataStax, the commercial leader in Apache Cassandra™, offers products and services that make it easy for customers to build, deploy and operate elastically scalable and cloud-optimized applications and data services. The company has over 90 customers, including leaders such as Netflix, Cisco, Rackspace and Constant Contact, and spanning verticals including web, financial services, telecommunications, logistics and government.
+
+ Other companies:
+
+ {{http://media.acunu.com/library/logo.png}} [[http://www.acunu.com|Acunu]] offers the Acunu Data Platform for faster, more consistent performance from your Cassandra applications (free to use for up to 2 nodes). Acunu also provides Cassandra training, support and professional services.

  {{http://www.impetus.com/sites/impetus.com/impetus/gifs/logo_impetus.png}} [[http://www.impetus.com/ |Impetus]] provides expertise in Cassandra, Hbase,
[Cassandra Wiki] Trivial Update of ThirdPartySupport by Michael Weir
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The ThirdPartySupport page has been changed by Michael Weir:
http://wiki.apache.org/cassandra/ThirdPartySupport?action=diff&rev1=15&rev2=16

  Companies providing support for Apache Cassandra are not endorsed by the Apache Software Foundation, although some of these companies employ [[Committers]] to the Apache project.

- Companies that employ Apache Cassandra Committers:
+ Companies that employ Apache Cassandra [[Committers]]:

  {{http://www.datastax.com/sites/all/themes/datastax20110201/logo.png}} [[http://datastax.com|Datastax]] DataStax, the commercial leader in Apache Cassandra™, offers products and services that make it easy for customers to build, deploy and operate elastically scalable and cloud-optimized applications and data services. The company has over 90 customers, including leaders such as Netflix, Cisco, Rackspace and Constant Contact, and spanning verticals including web, financial services, telecommunications, logistics and government.
[jira] [Commented] (CASSANDRA-2888) CQL support for JDBC DatabaseMetaData
[ https://issues.apache.org/jira/browse/CASSANDRA-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065383#comment-13065383 ]

Dave Carlson commented on CASSANDRA-2888:
-----------------------------------------

1.0.4-SNAPSHOT works for me; I can work with that, and I'll close this. The toolset is a large proprietary/commercial J2EE document storage and retrieval system.

CQL support for JDBC DatabaseMetaData
-------------------------------------

                Key: CASSANDRA-2888
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2888
            Project: Cassandra
         Issue Type: Improvement
         Components: Drivers
   Affects Versions: 0.8.1
        Environment: J2SE 1.6.0_22 x64 on Fedora 15
           Reporter: Dave Carlson
           Assignee: Rick Shaw
           Priority: Minor
             Labels: cql, newbie
            Fix For: 0.8.2
  Original Estimate: 96h
 Remaining Estimate: 96h

In order to increase the drop-in capability of CQL for existing JDBC app bases, CQL must be updated to include at least semi-valid responses to the JDBC metadata portion.

Without the enhancement:

  com.largecompany.JDBCManager.getConnection(vague Cassandra JNDI pointer)
  Resource has error: java.lang.UnsupportedOperationException: method not supported
  ...

With the enhancement:

  com.largecompany.JDBCManager.getConnection(vague Cassandra JNDI pointer)
  org.apache.cassandra.cql.jdbc.CassandraConnection@1915470e
[jira] [Resolved] (CASSANDRA-2888) CQL support for JDBC DatabaseMetaData
[ https://issues.apache.org/jira/browse/CASSANDRA-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Carlson resolved CASSANDRA-2888.
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 0.8.2

Works in 1.0.4-SNAPSHOT of CQL.

CQL support for JDBC DatabaseMetaData
-------------------------------------

                Key: CASSANDRA-2888
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2888
[Cassandra Wiki] Trivial Update of ThirdPartySupport by BrandonWilliams
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The ThirdPartySupport page has been changed by BrandonWilliams:
http://wiki.apache.org/cassandra/ThirdPartySupport?action=diff&rev1=16&rev2=17

Comment:
Fix transposed content

  {{http://media.acunu.com/library/logo.png}} [[http://www.acunu.com|Acunu]] offers the Acunu Data Platform for faster, more consistent performance from your Cassandra applications (free to use for up to 2 nodes). Acunu also provides Cassandra training, support and professional services.

- {{http://www.impetus.com/sites/impetus.com/impetus/gifs/logo_impetus.png}} [[http://www.impetus.com/ |Impetus]] provides expertise in Cassandra, Hbase,
+ {{http://www.impetus.com/sites/impetus.com/impetus/gifs/logo_impetus.png}} [[http://www.impetus.com/ |Impetus]] provides expertise in Cassandra, Hbase, MongoDB, and Other databases like Riak, Redis, Membase, Tokyocabinet, etc [[http://bigdata.impetus.com/# | More info about BigData @Impetus]]

  {{http://www.onzra.com/images/Small-Logo.gif}} [[http://www.ONZRA.com|ONZRA]] has been around for over 10 years and specializes on enterprise grade architecture, development and security consulting services utilizing many large scale database technologies such as Cassandra, Oracle, Alegro Graph, and much more.

- MongoDB, and Other databases like Riak, Redis, Membase, Tokyocabinet, etc [[http://bigdata.impetus.com/# | More info about BigData @Impetus]]
+
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065404#comment-13065404 ]

Brandon Williams commented on CASSANDRA-2843:
---------------------------------------------

2843_b.patch fails to apply a chunk to trunk in src/java/org/apache/cassandra/db/ColumnFamily; can you rebase?

better performance on long row read
-----------------------------------

                Key: CASSANDRA-2843
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
            Project: Cassandra
         Issue Type: New Feature
           Reporter: Yang Yang
        Attachments: 2843.patch, 2843_b.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch

Currently, if a row contains over 1000 columns, reads become considerably slow: my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, takes about 16ms. This all runs in memory; no disk read is involved. Through profiling we can find that most of this time is spent in:

  [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
  [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
  [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
  [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
  [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
  [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
  [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)

ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. This structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure.

But if we look at the whole read path, Thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. On the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know access is always single-threaded, so no synchronization is needed. These two features are indeed needed for ColumnFamily in other cases, particularly writes. So we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(): getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper.

The provided patch is for demonstration now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. The main work is to let the FastColumnFamily use an array for internal storage. At first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve with descending vs. ascending; for now, ascending works). So the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if equal, reconcile.

Slight temporary hacks are made on getTopLevelColumnFamily so we have two flavors of the method, one accepting a returnCF, but we could definitely think about a better way to provide this returnCF.

This patch compiles fine; no tests are provided yet, but I tested it in my application and the performance improvement is dramatic: it offers about 50% reduction in read time in the 3000-column case.

Thanks,
Yang
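The append-or-reconcile idea described in the issue can be sketched like this: a simplified stand-in for FastColumnFamily, assuming sorted insertion order; real reconciliation would compare timestamps rather than simply overwriting:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;

class SortedColumnList
{
    private final ArrayList<ByteBuffer> names = new ArrayList<>();
    private final ArrayList<ByteBuffer> values = new ArrayList<>();

    // Columns arrive in sorted order, so only the last element needs checking:
    // equal name -> reconcile (here: overwrite), otherwise append at the end.
    void addColumn(ByteBuffer name, ByteBuffer value)
    {
        int last = names.size() - 1;
        if (last >= 0 && names.get(last).equals(name))
        {
            values.set(last, value);
        }
        else
        {
            names.add(name);
            values.add(value);
        }
    }

    int size()
    {
        return names.size();
    }
}
```

The appeal is that every addColumn is an O(1) array append with no locking, versus an O(log n) synchronized skip-list insert.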
[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065409#comment-13065409 ]

Brandon Williams commented on CASSANDRA-2496:
---------------------------------------------

I see two more things to be done with this patch. First, when re-replicating nodes report back to the removal coordinator, if the coordinator has restarted it won't understand them, and they will loop infinitely retrying the confirmation. Second, since we're holding dead states, we need to make sure that bootstrapping/moving nodes can take over these dead tokens.

Gossip should handle 'dead' states
----------------------------------

                Key: CASSANDRA-2496
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2496
            Project: Cassandra
         Issue Type: Bug
         Components: Core
           Reporter: Brandon Williams
           Assignee: Brandon Williams
        Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt

For background, see CASSANDRA-2371.
[jira] [Updated] (CASSANDRA-1788) reduce copies on read, write paths
[ https://issues.apache.org/jira/browse/CASSANDRA-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-1788:
--------------------------------------

    Attachment: 1788-v5.txt

Part of the old sendOneWay (the packbody copy) looks like this:

{code}
DataOutputBuffer buffer = new DataOutputBuffer();
buffer.writeUTF(id);
Message.serializer().serialize(message, buffer, message.getVersion());
data = buffer.getData();
{code}

byte[] data is NOT restricted to just the serialized bytes in the buffer -- it will include any unused bytes at the end, as well. v5 skips garbage bytes like this for backwards compatibility.

reduce copies on read, write paths
----------------------------------

                Key: CASSANDRA-1788
                URL: https://issues.apache.org/jira/browse/CASSANDRA-1788
            Project: Cassandra
         Issue Type: Improvement
         Components: Core
           Reporter: Jonathan Ellis
           Assignee: Jonathan Ellis
           Priority: Minor
            Fix For: 1.0
        Attachments: 0001-setup.txt, 0002-remove-copies-from-network-path.txt, 1788-v2.txt, 1788-v3.txt, 1788-v4.txt, 1788-v5.txt, 1788-v5.txt, 1788.txt
  Original Estimate: 24h
 Remaining Estimate: 24h

Currently, we do _three_ unnecessary copies for each message (writing to the socket is necessary; any other copies made are overhead):

- constructing the Message body byte[] (this is typically a call to an ICompactSerializer[2] serialize method, but sometimes we cheat, e.g. in SchemaCheckVerbHandler's reply)
- which is copied to a buffer containing the entire Message (i.e. including Header) when sendOneWay calls Message.serializer.serialize()
- which is copied to a newly-allocated ByteBuffer when sendOneWay calls packIt
- which is what we write to the socket

For deserialize we perform a similar orgy of copies:

- IncomingTcpConnection reads the Message length, allocates a byte[], and reads the serialized Message into it
- ITcpC then calls Message.serializer().deserialize, which allocates a new byte[] for the body and copies that part
- finally, the verbHandler (determined by the now-deserialized Message header) deserializes the actual object represented by the body

Most of these are out of scope for 0.7, but I think we can at least elide the last copy on the write path and the first on the read.
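For contrast with the packbody snippet above, here is a standard-library analogue: java.io.ByteArrayOutputStream has an oversized internal buffer just as DataOutputBuffer does, but toByteArray() copies only the bytes actually written, so no trailing garbage leaks out. The class and method names below are illustrative, not Cassandra's:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class PackBody
{
    // Pack a message id plus a pre-serialized body into one byte[].
    public static byte[] pack(String id, byte[] serializedMessage) throws IOException
    {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeUTF(id);                // 2-byte length prefix + modified UTF-8 bytes
        out.write(serializedMessage);
        out.flush();
        // Unlike DataOutputBuffer.getData(), this copies exactly the written
        // bytes, never the unused tail of the backing array.
        return buffer.toByteArray();
    }
}
```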
[jira] [Commented] (CASSANDRA-2860) Versioning works *too* well
[ https://issues.apache.org/jira/browse/CASSANDRA-2860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065423#comment-13065423 ]

Brandon Williams commented on CASSANDRA-2860:
---------------------------------------------

+1

Versioning works *too* well
---------------------------

                Key: CASSANDRA-2860
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2860
            Project: Cassandra
         Issue Type: Bug
         Components: Core
   Affects Versions: 0.7.1
           Reporter: Brandon Williams
           Assignee: Jonathan Ellis
            Fix For: 0.8.2
        Attachments: 2860-v2.txt, 2860.txt

The scenario goes something like this: you upgrade from 0.7 to 0.8, but all the nodes remember that the remote side is 0.7, so they in turn speak 0.7, causing the local node to also think the remote is 0.7, even though both are really 0.8.
[jira] [Created] (CASSANDRA-2899) cli silently fails when classes are quoted
cli silently fails when classes are quoted
------------------------------------------

                Key: CASSANDRA-2899
                URL: https://issues.apache.org/jira/browse/CASSANDRA-2899
            Project: Cassandra
         Issue Type: Bug
         Components: Core
           Reporter: Brandon Williams
           Assignee: Pavel Yaskevich
           Priority: Minor

For example:

  CREATE COLUMN FAMILY autocomplete_meta
    WITH comparator = 'UTF8Type'
    AND default_validation_class = 'UTF8Type'
    AND key_validation_class = 'UTF8Type'

Neither validation class is actually set, but if you remove the quotes everything works.
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_b.patch) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch currently if a row contains 1000 columns, the run time becomes considerably slow (my test of a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 40 bytes in value, is about 16ms. this is all running in memory, no disk read is involved. through debugging we can find most of this time is spent on [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily) [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int) [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int) [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn) ColumnFamily.addColumn() is slow because it inserts into an internal concurrentSkipListMap() that maps column names to values. this structure is slow for two reasons: it needs to do synchronization; it needs to maintain a more complex structure of map. but if we look at the whole read path, thrift already defines the read output to be ListColumnOrSuperColumn so it does not make sense to use a luxury map data structure in the interium and finally convert it to a list. 
on the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know access is always single-threaded, so no synchronization is needed. but these 2 features are indeed needed for ColumnFamily in other cases, particularly writes. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(): getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper. the provided patch is for demonstration now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve with descending vs. ascending order, but ascending works). so the current logic simply compares the new column against the last column in the array: if the names are not equal, append; if equal, reconcile. slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF, but we could definitely think about a better way to provide this returnCF. this patch compiles fine; no tests are provided yet, but I tested it in my application, and the performance improvement is dramatic: about a 50% reduction in read time in the 3000-column case. thanks Yang
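The append-or-reconcile logic described above can be sketched roughly as follows. This is a minimal illustration only, assuming a last-write-wins reconcile; the class and method names here are invented for the sketch and are not the patch's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical array-backed column container exploiting the sorted-insertion
// invariant: addColumn() only compares the new name against the last stored
// name, so each insert is O(1) with no synchronization and no skip-list.
class FastColumns {
    private final List<String> names = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    void addColumn(String name, String value) {
        int last = names.size() - 1;
        if (last >= 0 && names.get(last).equals(name)) {
            values.set(last, value);   // equal names: reconcile in place
        } else {
            names.add(name);           // sorted input: safe to just append
            values.add(value);
        }
    }

    int size() { return names.size(); }

    String valueOf(String name) {
        int i = names.indexOf(name);
        return i < 0 ? null : values.get(i);
    }
}
```

The key design point is that correctness depends entirely on callers feeding columns in sorted order; if that invariant is ever violated, the container silently produces an unsorted result, which is why the patch restricts this path to the read-return CF.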
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_c.patch rebased against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_c.patch)
[jira] [Issue Comment Edited] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13065483#comment-13065483 ] Yang Yang edited comment on CASSANDRA-2843 at 7/14/11 7:25 PM: --- rebased against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48); also fixed a bug in my newly added test; also the DeletionInfo class in AbstractColumnContainer somehow gives a compile error in eclipse, had to change that to protected. was (Author: yangyangyyy): fixed a bug in my newly added test; also the DeletionInfo class in AbstractColumnContainer somehow gives compile error in eclipse, had to change that into protected.
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_c.patch fixed a bug in my newly added test; also the DeletionInfo class in AbstractColumnContainer somehow gives compile error in eclipse, had to change that into protected.
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Comment: was deleted (was: rebased , against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48))
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_c.patch)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Comment: was deleted (was: rebased against against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48) also fixed a bug in my newly added test; also the DeletionInfo class in AbstractColumnContainer somehow gives compile error in eclipse, had to change that into protected. )
[jira] [Commented] (CASSANDRA-2882) describe_ring should include datacenter/topology information
[ https://issues.apache.org/jira/browse/CASSANDRA-2882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13065496#comment-13065496 ] Nate McCall commented on CASSANDRA-2882: This should be considered related to CASSANDRA-1777. I think both of these are crucial to providing clients that can take advantage of topologies. describe_ring should include datacenter/topology information Key: CASSANDRA-2882 URL: https://issues.apache.org/jira/browse/CASSANDRA-2882 Project: Cassandra Issue Type: Improvement Components: API, Core Reporter: Mark Guzman Priority: Minor describe_ring is great for getting a list of nodes in the cluster, but it doesn't provide any information about the network topology, which prevents its use in a multi-dc setup. It would be nice if we added another list to the TokenRange object containing the DC information. Optimally I could ask any Cassandra node for this information and, on the client side, prefer local nodes but be able to fail over to remote nodes without requiring another lookup.
[jira] [Updated] (CASSANDRA-2129) removetoken after removetoken rf error fails to work
[ https://issues.apache.org/jira/browse/CASSANDRA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-2129: Reviewer: xedin (was: thepaul) removetoken after removetoken rf error fails to work Key: CASSANDRA-2129 URL: https://issues.apache.org/jira/browse/CASSANDRA-2129 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.0 Reporter: Mike Bulman Assignee: Brandon Williams Priority: Minor Fix For: 0.8.2 Attachments: 2129-v2.txt, 2129.txt Original Estimate: 4h Remaining Estimate: 4h 2 node cluster, a keyspace existed with rf=2. Tried removetoken and got:

mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
Exception in thread "main" java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)

Deleted the keyspace, and tried again:

mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794
Exception in thread "main" java.lang.UnsupportedOperationException: This node is already processing a removal. Wait for it to complete.
[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] paul cannon updated CASSANDRA-2496: --- Attachment: 0003-update-gossip-related-comments.patch.txt These small patches build on the others. 0003-update-gossip-related-comments.patch.txt: updates gossip-related comments derp derp. Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt For background, see CASSANDRA-2371
[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] paul cannon updated CASSANDRA-2496: --- Attachment: 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt: use REMOVED_TOKEN instead of STATUS_LEFT (would probably be ok either way, but otherwise, the REMOVED_TOKEN state would not be used). Seems this is more the way it was intended. Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt For background, see CASSANDRA-2371 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] paul cannon updated CASSANDRA-2496: --- Attachment: 0005-drain-self-if-removetoken-d-elsewhere.patch.txt 0005-drain-self-if-removetoken-d-elsewhere.patch.txt : when node X was partitioned and removetoken'd but then it shows up again, it should shut itself down, rather than becoming a zombie Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 0005-drain-self-if-removetoken-d-elsewhere.patch.txt For background, see CASSANDRA-2371 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065531#comment-13065531 ] paul cannon commented on CASSANDRA-2496: I'll see what I can do to test the "infinitely loop retrying the confirmation" and "bootstrapping/moving nodes can take over these dead tokens" situations. Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 0005-drain-self-if-removetoken-d-elsewhere.patch.txt For background, see CASSANDRA-2371 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_c.patch rebased against 4629648899e637e8e03938935f126689cce5ad48 also fixed a bug in my test, the AbstractColumnContainer.DeletionInfo has to be protected, otherwise eclipse gives a compile error better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch currently if a row contains > 1000 columns, the run time becomes considerably slow (my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, is about 16ms). this is all running in memory, no disk read is involved. through debugging we can find most of this time is spent on [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily) [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int) [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int) [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn) ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. this structure is slow for two reasons: it needs to do synchronization, and it needs to maintain the more complex structure of a map.
but if we look at the whole read path, thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. on the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single-threaded, so no synchronization is needed. but these 2 features are indeed needed for ColumnFamily in other cases, particularly write. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper. the provided patch is for demonstration now; will work further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve, descending or ascending, but ascending works). so the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if equal, reconcile. slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about what is the better way to provide this returnCF. this patch compiles fine; no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: it offers about 50% reduction in read time in the 3000-column case. thanks Yang -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
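The append-or-reconcile logic described above (compare the incoming column only against the last element of a sorted array) can be sketched roughly as follows. `FastColumnList`, its nested `Column` type, and the timestamp-based reconcile are simplified, hypothetical stand-ins for the patch's FastColumnFamily and IColumn, not its actual code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of an array-backed column container that exploits the invariant
// that columns arrive in sorted name order: no skip list, no locking.
class FastColumnList {
    static final class Column {
        final String name;
        final long timestamp;
        Column(String name, long timestamp) { this.name = name; this.timestamp = timestamp; }
    }

    private final List<Column> columns = new ArrayList<>();

    void addColumn(Column c) {
        if (columns.isEmpty()) { columns.add(c); return; }
        Column last = columns.get(columns.size() - 1);
        int cmp = last.name.compareTo(c.name);
        if (cmp < 0) {
            columns.add(c);          // common case: strictly ascending append
        } else if (cmp == 0) {
            // reconcile: keep the column with the newer timestamp
            if (c.timestamp > last.timestamp) columns.set(columns.size() - 1, c);
        } else {
            throw new IllegalStateException("columns must arrive in sorted order");
        }
    }

    int size() { return columns.size(); }
}
```

Since each addColumn() touches only the tail of the array, the cost per column is O(1) instead of the skip list's O(log n) plus synchronization overhead.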
[jira] [Commented] (CASSANDRA-2129) removetoken after removetoken rf error fails to work
[ https://issues.apache.org/jira/browse/CASSANDRA-2129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065542#comment-13065542 ] Pavel Yaskevich commented on CASSANDRA-2129: +1 removetoken after removetoken rf error fails to work Key: CASSANDRA-2129 URL: https://issues.apache.org/jira/browse/CASSANDRA-2129 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.0 Reporter: Mike Bulman Assignee: Brandon Williams Priority: Minor Fix For: 0.8.2 Attachments: 2129-v2.txt, 2129.txt Original Estimate: 4h Remaining Estimate: 4h 2 node cluster, a keyspace existed with rf=2. Tried removetoken and got: mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794 Exception in thread "main" java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1) Deleted the keyspace, and tried again: mbulman@ripcord-maverick1:/usr/src/cassandra/tags/cassandra-0.7.0$ bin/nodetool -h localhost removetoken 159559397954378837828954138596956659794 Exception in thread "main" java.lang.UnsupportedOperationException: This node is already processing a removal. Wait for it to complete. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1146900 - in /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra: locator/ service/
Author: brandonwilliams Date: Thu Jul 14 21:24:11 2011 New Revision: 1146900 URL: http://svn.apache.org/viewvc?rev=1146900&view=rev Log: Allow RF to exceed the number of nodes (but disallow writes) Patch by brandonwilliams, reviewed by Pavel Yaskevich for CASSANDRA-2129 Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/OldNetworkTopologyStrategy.java cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/SimpleStrategy.java cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/WriteResponseHandler.java Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java?rev=1146900&r1=1146899&r2=1146900&view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java Thu Jul 14 21:24:11 2011 @@ -87,9 +87,8 @@ public abstract class AbstractReplicatio * we return a List to avoid an extra allocation when sorting by proximity later * @param searchToken the token the natural endpoints are requested for * @return a copy of the natural endpoints for the given token - * @throws IllegalStateException if the number of requested replicas is greater than the number of known endpoints */ -public ArrayList<InetAddress> getNaturalEndpoints(Token searchToken) throws IllegalStateException +public ArrayList<InetAddress> getNaturalEndpoints(Token searchToken) { Token keyToken = TokenMetadata.firstToken(tokenMetadata.sortedTokens(), searchToken); ArrayList<InetAddress> endpoints =
getCachedEndpoints(keyToken); @@ -99,10 +98,6 @@ public abstract class AbstractReplicatio keyToken = TokenMetadata.firstToken(tokenMetadataClone.sortedTokens(), searchToken); endpoints = new ArrayList<InetAddress>(calculateNaturalEndpoints(searchToken, tokenMetadataClone)); cacheEndpoint(keyToken, endpoints); -// calculateNaturalEndpoints should have checked this already, this is a safety -assert getReplicationFactor() <= endpoints.size() : String.format("endpoints %s generated for RF of %s", - Arrays.toString(endpoints.toArray()), - getReplicationFactor()); } return new ArrayList<InetAddress>(endpoints); @@ -115,9 +110,8 @@ public abstract class AbstractReplicatio * * @param searchToken the token the natural endpoints are requested for * @return a copy of the natural endpoints for the given token - * @throws IllegalStateException if the number of requested replicas is greater than the number of known endpoints */ -public abstract List<InetAddress> calculateNaturalEndpoints(Token searchToken, TokenMetadata tokenMetadata) throws IllegalStateException; +public abstract List<InetAddress> calculateNaturalEndpoints(Token searchToken, TokenMetadata tokenMetadata); public IWriteResponseHandler getWriteResponseHandler(Collection<InetAddress> writeEndpoints, Multimap<InetAddress, InetAddress> hintedEndpoints, Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java?rev=1146900&r1=1146899&r2=1146900&view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java Thu Jul 14 21:24:11 2011 @@ -120,9 +120,6 @@ public class NetworkTopologyStrategy ext dcEndpoints.add(endpoint); } -if (dcEndpoints.size() < dcReplicas) -throw new IllegalStateException(String.format("datacenter (%s) has no more endpoints, (%s) replicas still needed", - dcName, dcReplicas - dcEndpoints.size())); if (logger.isDebugEnabled()) logger.debug("{} endpoints in datacenter {} for token {}",
[Cassandra Wiki] Update of JmxInterface by defmikekoh
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The JmxInterface page has been changed by defmikekoh: http://wiki.apache.org/cassandra/JmxInterface?action=diff&rev1=24&rev2=25 - If you start it using the standard startup script, Cassandra will listen for connections on port 8080 to view and tweak variables which it exposes via [[http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/|JMX]]. This may be helpful for debugging and monitoring. + If you start it using the standard startup script, Cassandra will listen for connections on port 8080 (port 7199 starting in 0.8.0-beta1) to view and tweak variables which it exposes via [[http://java.sun.com/javase/technologies/core/mntr-mgmt/javamanagement/|JMX]]. This may be helpful for debugging and monitoring. See also [[JmxGotchas]]. The MemtableThresholds page describes how to use [[http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html|Jconsole]] as a client for this.
[Cassandra Wiki] Update of RunningCassandra by defmikekoh
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The RunningCassandra page has been changed by defmikekoh: http://wiki.apache.org/cassandra/RunningCassandra?action=diff&rev1=15&rev2=16 $ CASSANDRA_INCLUDE=/tmp/new.in.sh bin/cassandra }}} - Among other things, the defaults in `bin/cassandra.in.sh` include a maximum heap size (-Xmx) of 1GB, which you'll almost certainly want to consider tailoring for your environment. The port to access Cassandra's JmxInterface is also configured here through the `com.sun.management.jmxremote.port` property and defaults to 8080. + Among other things, the defaults in `bin/cassandra.in.sh` include a maximum heap size (-Xmx) of 1GB, which you'll almost certainly want to consider tailoring for your environment. The port to access Cassandra's JmxInterface is also configured here through the `com.sun.management.jmxremote.port` property and defaults to 8080 (7199 starting in v0.8.0-beta1). Additionally, the script recognizes a number of command line arguments; invoking the script with the `-h` option prints a brief summary of them.
[Cassandra Wiki] Update of MemtableThresholds by defmikekoh
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The MemtableThresholds page has been changed by defmikekoh: http://wiki.apache.org/cassandra/MemtableThresholds?action=diff&rev1=21&rev2=22 == Using Jconsole To Optimize Thresholds == Cassandra's column-family mbeans have a number of attributes that can prove invaluable in determining optimal thresholds. One way to access this instrumentation is by using Jconsole, a graphical monitoring and management application that ships with your JDK. - Launching Jconsole with no arguments will display the New Connection dialog box. If you are running Jconsole on the same machine that Cassandra is running on, then you can connect using the PID, otherwise you will need to connect remotely. The default startup scripts for Cassandra cause the VM to listen on port 8080 using the JVM option: + Launching Jconsole with no arguments will display the New Connection dialog box. If you are running Jconsole on the same machine that Cassandra is running on, then you can connect using the PID, otherwise you will need to connect remotely. The default startup scripts for Cassandra cause the VM to listen on port 8080 (7199 starting in v0.8.0-beta1) using the JVM option: . -Dcom.sun.management.jmxremote.port=8080
svn commit: r1146923 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java
Author: brandonwilliams Date: Thu Jul 14 23:42:11 2011 New Revision: 1146923 URL: http://svn.apache.org/viewvc?rev=1146923&view=rev Log: Do not allow extra params to nodetool commands to prevent confusion. Patch by Jon Hermes, reviewed by brandonwilliams for CASSANDRA-2740 Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java?rev=1146923&r1=1146922&r2=1146923&view=diff == --- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java (original) +++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java Thu Jul 14 23:42:11 2011 @@ -539,19 +539,19 @@ public class NodeCmd switch (command) { -case RING: nodeCmd.printRing(System.out); break; -case INFO: nodeCmd.printInfo(System.out); break; -case CFSTATS : nodeCmd.printColumnFamilyStats(System.out); break; -case DECOMMISSION: probe.decommission(); break; -case LOADBALANCE : probe.loadBalance(); break; -case CLEARSNAPSHOT : probe.clearSnapshot(); break; -case TPSTATS : nodeCmd.printThreadPoolStats(System.out); break; -case VERSION : nodeCmd.printReleaseVersion(System.out); break; -case COMPACTIONSTATS : nodeCmd.printCompactionStats(System.out); break; -case DISABLEGOSSIP : probe.stopGossiping(); break; -case ENABLEGOSSIP: probe.startGossiping(); break; -case DISABLETHRIFT : probe.stopThriftServer(); break; -case ENABLETHRIFT: probe.startThriftServer(); break; +case RING: complainNonzeroArgs(arguments, command); nodeCmd.printRing(System.out); break; +case INFO: complainNonzeroArgs(arguments, command); nodeCmd.printInfo(System.out); break; +case CFSTATS : complainNonzeroArgs(arguments, command); nodeCmd.printColumnFamilyStats(System.out); break; +case DECOMMISSION: complainNonzeroArgs(arguments, command); probe.decommission(); break; 
+case LOADBALANCE : complainNonzeroArgs(arguments, command); probe.loadBalance(); break; +case CLEARSNAPSHOT : complainNonzeroArgs(arguments, command); probe.clearSnapshot(); break; +case TPSTATS : complainNonzeroArgs(arguments, command); nodeCmd.printThreadPoolStats(System.out); break; +case VERSION : complainNonzeroArgs(arguments, command); nodeCmd.printReleaseVersion(System.out); break; +case COMPACTIONSTATS : complainNonzeroArgs(arguments, command); nodeCmd.printCompactionStats(System.out); break; +case DISABLEGOSSIP : complainNonzeroArgs(arguments, command); probe.stopGossiping(); break; +case ENABLEGOSSIP: complainNonzeroArgs(arguments, command); probe.startGossiping(); break; +case DISABLETHRIFT : complainNonzeroArgs(arguments, command); probe.stopThriftServer(); break; +case ENABLETHRIFT: complainNonzeroArgs(arguments, command); probe.startThriftServer(); break; case DRAIN : try { probe.drain(); } @@ -647,6 +647,15 @@ public class NodeCmd System.exit(3); } +private static void complainNonzeroArgs(String[] args, NodeCommand cmd) +{ +if (args.length > 0) { +System.err.println("Too many arguments for command '" + cmd.toString() + "'."); +printUsage(); +System.exit(1); +} +} + private static void optionalKSandCFs(NodeCommand nc, String[] cmdArgs, NodeProbe probe) throws InterruptedException, IOException { // if there is one additional arg, it's the keyspace; more are columnfamilies
[jira] [Commented] (CASSANDRA-2740) nodetool decommission should throw an error when there are extra params
[ https://issues.apache.org/jira/browse/CASSANDRA-2740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065624#comment-13065624 ] Hudson commented on CASSANDRA-2740: --- Integrated in Cassandra-0.7 #528 (See [https://builds.apache.org/job/Cassandra-0.7/528/]) Do not allow extra params to nodetool commands to prevent confusion. Patch by Jon Hermes, reviewed by brandonwilliams for CASSANDRA-2740 brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146923 Files : * /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/tools/NodeCmd.java nodetool decommission should throw an error when there are extra params --- Key: CASSANDRA-2740 URL: https://issues.apache.org/jira/browse/CASSANDRA-2740 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Jon Hermes Priority: Trivial Fix For: 0.7.8 Attachments: 2740.txt removetoken takes a token parameter, but decommission works against the node where the call is issued. This allows confusion such as 'nodetool -h localhost decommission <ip or token>' actually decommissioning the local node, instead of whatever was passed to it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (CASSANDRA-2894) add paging to get_count
[ https://issues.apache.org/jira/browse/CASSANDRA-2894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay reassigned CASSANDRA-2894: Assignee: Vijay add paging to get_count --- Key: CASSANDRA-2894 URL: https://issues.apache.org/jira/browse/CASSANDRA-2894 Project: Cassandra Issue Type: Improvement Components: API Reporter: Jonathan Ellis Assignee: Vijay Priority: Minor Labels: lhf Fix For: 1.0 It is non-intuitive that get_count materializes the entire slice-to-count on the coordinator node (to perform read repair and CL.ONE consistency). Even experienced users have been known to cause memory problems by requesting large counts. The user cannot page the count himself, because you need a start and stop column to do that, and get_count only returns an integer. So the best fix is for us to do the paging under the hood, in CassandraServer. Add a limit to the slicepredicate they specify, and page through it. We could add a global setting for count_slice_size, and document that counts of more columns than that will have higher latency (because they make multiple calls through StorageProxy for the pages). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
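The proposed under-the-hood paging could look roughly like the sketch below. `PagedCounter` and the `slicer` callback are hypothetical names: `slicer` stands in for a limited slice call through StorageProxy, and appending "\0" to the last seen name is a simplification of "restart the next page just past the last column":

```java
import java.util.List;
import java.util.function.BiFunction;

// Sketch of server-side count paging: fetch slices of at most pageSize
// column names and sum their sizes, instead of materializing the whole
// slice-to-count on the coordinator.
class PagedCounter {
    // slicer: (startColumn, limit) -> sorted column names >= startColumn
    static int countColumns(BiFunction<String, Integer, List<String>> slicer, int pageSize) {
        int count = 0;
        String start = "";   // "" means "from the beginning", as in a slice predicate
        while (true) {
            List<String> page = slicer.apply(start, pageSize);
            count += page.size();
            if (page.size() < pageSize) return count;  // short page: no more columns
            // next page starts strictly after the last column we saw
            start = page.get(page.size() - 1) + "\0";
        }
    }
}
```

Here pageSize plays the role of the suggested count_slice_size setting: larger counts simply cost more round trips through the paging loop rather than more coordinator memory.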
[jira] [Resolved] (CASSANDRA-1876) Allow minor Parallel Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stu Hood resolved CASSANDRA-1876. - Resolution: Fixed I think this one has been sufficiently resolved in trunk. Allow minor Parallel Compaction --- Key: CASSANDRA-1876 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876 Project: Cassandra Issue Type: Improvement Reporter: Germán Kondolf Priority: Minor Attachments: 1876-reformatted.txt, compactionPatch-V2.txt, compactionPatch-V3.txt Hi, According to the dev's list discussion (1) I've patched the CompactionManager to allow parallel compaction. Mainly it splits the sstables to compact in the desired buckets, configured by a new parameter: compaction_parallelism with the current default of 1. Then, it just submits the units of work to a new executor and waits for the finalization. The patch was created in the trunk, so I don't know the exact affected version, I assume that is 0.8. I'll try to apply this patch to 0.6.X also for my current production installation, and then reattach it. (1) http://markmail.org/thread/cldnqfh3s3nufnke -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-2901) Allow taking advantage of multiple cores while compacting a single CF
Allow taking advantage of multiple cores while compacting a single CF - Key: CASSANDRA-2901 URL: https://issues.apache.org/jira/browse/CASSANDRA-2901 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Priority: Minor Moved from CASSANDRA-1876: There are five stages: read, deserialize, merge, serialize, and write. We probably want to continue doing read+deserialize and serialize+write together, or you waste a lot copying to/from buffers. So, what I would suggest is: one thread per input sstable doing read + deserialize (a row at a time). One thread merging corresponding rows from each input sstable. One thread doing serialize + writing the output. This should give us between 2x and 3x speedup (depending how much doing the merge on another thread than write saves us). This will require roughly 2x the memory, to allow the reader threads to work ahead of the merge stage. (I.e. for each input sstable you will have up to one row in a queue waiting to be merged, and the reader thread working on the next.) Seems quite reasonable on that front. Multithreaded compaction should be either on or off. It doesn't make sense to try to do things halfway (by doing the reads with a threadpool whose size you can grow/shrink, for instance): we still have compaction threads tuned to low priority, by default, so the impact on the rest of the system won't be very different. Nor do we expect to have so many input sstables that we lose a lot in context switching between reader threads. (If this is a concern, we already have a tunable to limit the number of sstables merged at a time in a single CF.) IMO it's acceptable to punt completely on rows that are larger than memory, and fall back to the old non-parallel code there. I don't see any sane way to parallelize large-row compactions. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
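The staged pipeline described above can be sketched with bounded queues. This is an assumed structure for illustration only, not Cassandra's compaction code: rows are reduced to integer keys, the merge and "serialize + write" stages run on the calling thread for brevity, and the capacity-1 queues model the ~2x memory bound (one row queued per input sstable plus one being read ahead):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class CompactionPipeline {
    private static final Integer EOF = Integer.MIN_VALUE; // poison pill marking end of an input

    // inputs: one sorted "sstable" (array of row keys) per reader thread.
    // Returns the number of merged rows the writer stage would emit.
    static int run(int[][] inputs) throws InterruptedException {
        List<BlockingQueue<Integer>> queues = new ArrayList<>();
        List<Thread> readers = new ArrayList<>();
        for (int[] sstable : inputs) {
            BlockingQueue<Integer> q = new ArrayBlockingQueue<>(1); // capacity 1 => bounded read-ahead
            queues.add(q);
            Thread reader = new Thread(() -> {
                try {
                    for (int row : sstable) q.put(row);  // "read + deserialize" one row at a time
                    q.put(EOF);
                } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            reader.start();
            readers.add(reader);
        }
        // merge stage: repeatedly pick the smallest head, combining equal keys across inputs
        Integer[] heads = new Integer[queues.size()];
        for (int i = 0; i < heads.length; i++) heads[i] = queues.get(i).take();
        int written = 0;
        while (true) {
            int minIdx = -1;
            for (int i = 0; i < heads.length; i++)
                if (!heads[i].equals(EOF) && (minIdx < 0 || heads[i] < heads[minIdx])) minIdx = i;
            if (minIdx < 0) break;                       // all inputs exhausted
            int key = heads[minIdx];
            for (int i = 0; i < heads.length; i++)       // merge rows sharing this key
                while (heads[i].equals(key)) heads[i] = queues.get(i).take();
            written++;                                   // "serialize + write" one merged row
        }
        for (Thread r : readers) r.join();
        return written;
    }
}
```

The point of the bounded queues is exactly the trade-off the ticket describes: readers may work at most one row ahead of the merge stage, so memory stays proportional to the number of input sstables rather than to row count.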
[jira] [Commented] (CASSANDRA-1876) Allow minor Parallel Compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065688#comment-13065688 ] Jonathan Ellis commented on CASSANDRA-1876: --- created CASSANDRA-2901 to follow up on concurrency-for-single-merge. Allow minor Parallel Compaction --- Key: CASSANDRA-1876 URL: https://issues.apache.org/jira/browse/CASSANDRA-1876 Project: Cassandra Issue Type: Improvement Reporter: Germán Kondolf Priority: Minor Attachments: 1876-reformatted.txt, compactionPatch-V2.txt, compactionPatch-V3.txt Hi, According to the dev's list discussion (1) I've patched the CompactionManager to allow parallel compaction. Mainly it splits the sstables to compact in the desired buckets, configured by a new parameter: compaction_parallelism with the current default of 1. Then, it just submits the units of work to a new executor and waits for the finalization. The patch was created in the trunk, so I don't know the exact affected version, I assume that is 0.8. I'll try to apply this patch to 0.6.X also for my current production installation, and then reattach it. (1) http://markmail.org/thread/cldnqfh3s3nufnke -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira