[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068830#comment-13068830 ] Yang Yang commented on CASSANDRA-2843:
--

Brandon: I used commit 4629648899e637e8e03938935f126689cce5ad48 and applied the 2843_c.patch, and also tried the head, but got the following error with the benchmark pycassa script. How did you succeed with it?

[default@query] create column family NoCache
...     with comparator = AsciiType
...     and default_validation_class = AsciiType
...     and key_validation_class = AsciiType
...     and keys_cached = 0
...     and rows_cached = 0;
Unable to set Compaction Strategy Class of AsciiType

thanks
Yang

better performance on long row read
---

Key: CASSANDRA-2843
URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
Project: Cassandra
Issue Type: New Feature
Reporter: Yang Yang
Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch

Currently if a row contains over 1000 columns, reads become considerably slow: my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, takes about 16ms. This is all running in memory; no disk read is involved.
Through debugging we can find most of this time is spent on:

[Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
[Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
[Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)

ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. This structure is slow for two reasons: it needs to do synchronization, and it needs to maintain the more complex structure of a map. But if we look at the whole read path, Thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. On the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single-threaded, so no synchronization is needed. These two features are indeed needed for ColumnFamily in other cases, particularly writes. So we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(): getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper. The provided patch is for demonstration now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. The main work is to let the FastColumnFamily use an array for internal storage.
At first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve with descending vs. ascending order; ascending works now). So the current logic simply compares the new column against the last column in the array: if the names are not equal, append; if they are equal, reconcile. Slight temporary hacks are made on getTopLevelColumnFamily so we have two flavors of the method, one accepting a returnCF; we could definitely think about a better way to provide this returnCF. This patch compiles fine; no tests are provided yet. But I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case.

thanks
Yang

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
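The append-or-reconcile idea above can be sketched in a few lines. This is a minimal illustration, not the actual FastColumnFamily from the patch: it assumes string column names, a plain array list for storage, and last-write-wins reconciliation.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: because callers add columns in ascending name order, we never need
// a ConcurrentSkipListMap; we only compare against the last element appended.
class FastColumnList {
    // (name, value) pairs; names arrive in ascending order
    private final List<String[]> columns = new ArrayList<>();

    public void addColumn(String name, String value) {
        if (!columns.isEmpty()) {
            String[] last = columns.get(columns.size() - 1);
            int cmp = name.compareTo(last[0]);
            if (cmp == 0) {        // same name: reconcile (keep the latest value here)
                last[1] = value;
                return;
            }
            if (cmp < 0)           // invariant violated: columns must come sorted
                throw new IllegalArgumentException("out-of-order column: " + name);
        }
        columns.add(new String[] { name, value });  // common case: append
    }

    public int size() { return columns.size(); }

    public String valueOf(int i) { return columns.get(i)[1]; }
}
```

The point of the sketch is that both the map overhead and the synchronization disappear: an append to an array list is a constant-time, single-threaded operation.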
[jira] [Updated] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2521:

Attachment: 2521-v5.patch

Attaching v5. It is rebased and now all unit tests are passing. It also adds a few requested comments and only triggers GC for disk space when that could possibly be useful (i.e., when mmap is used on a non-Sun JVM; more precisely, when the mmap cleaner is not available). Will commit this based on the earlier +1 and on Terje's successful testing (thanks a lot for that, btw).

Move away from Phantom References for Compaction/Memtable
-

Key: CASSANDRA-2521
URL: https://issues.apache.org/jira/browse/CASSANDRA-2521
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Chris Goffinet
Assignee: Sylvain Lebresne
Fix For: 1.0
Attachments: 0001-Use-reference-counting-to-decide-when-a-sstable-can-.patch, 0001-Use-reference-counting-to-decide-when-a-sstable-can-v2.patch, 0002-Force-unmapping-files-before-deletion-v2.patch, 2521-v3.txt, 2521-v4.txt, 2521-v5.patch

http://wiki.apache.org/cassandra/MemtableSSTable

Let's move to using reference counting instead of relying on GC to be called in StorageService.
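The reference-counting scheme the ticket moves to can be sketched as follows. This is an illustrative model, not the real SSTableReader/SSTableDeletingTask code: the class name and the "deleted" flag (standing in for actually deleting the sstable files) are assumptions.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: delete the resource when the last reference is released, instead of
// waiting for a GC-driven phantom reference to fire.
class RefCountedResource {
    private final AtomicInteger references = new AtomicInteger(1); // 1 = held by the tracker
    private volatile boolean deleted = false;

    // A reader takes a reference before using the resource; fails if the
    // resource has already been released for deletion.
    public boolean acquire() {
        while (true) {
            int n = references.get();
            if (n <= 0) return false;
            if (references.compareAndSet(n, n + 1)) return true;
        }
    }

    public void release() {
        if (references.decrementAndGet() == 0)
            deleted = true;   // stand-in for deleting the underlying files
    }

    public boolean isDeleted() { return deleted; }
}
```

Compared with phantom references, deletion happens deterministically at the moment the last reader finishes, with no dependence on GC timing.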
[jira] [Created] (CASSANDRA-2929) Don't include tmp files as sstable when create column families
Don't include tmp files as sstable when create column families
--

Key: CASSANDRA-2929
URL: https://issues.apache.org/jira/browse/CASSANDRA-2929
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.7.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
Fix For: 0.7.8, 0.8.2
Attachments: 0001-Don-t-include-tmp-files-as-sstables-when-creating-CF.patch

When we open a column family and populate the SSTableReader, we happen to include -tmp files. This has no chance of actually happening in a real-life situation, but it is what was triggering a race in the unit tests, causing spurious assertion failures in estimateRowsFromIndex.
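The shape of the fix is a filename filter applied while scanning the data directory. The sketch below is illustrative only: the filename patterns and the helper name are assumptions, not Cassandra's real Descriptor parsing.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: when listing live sstables, skip in-progress "-tmp" components so
// a half-written file is never handed to an SSTableReader.
class SSTableScanner {
    public static List<String> liveSSTables(List<String> fileNames) {
        List<String> live = new ArrayList<>();
        for (String name : fileNames)
            if (name.endsWith("-Data.db") && !name.contains("-tmp"))
                live.add(name);
        return live;
    }
}
```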
[jira] [Updated] (CASSANDRA-2929) Don't include tmp files as sstable when create column families
[ https://issues.apache.org/jira/browse/CASSANDRA-2929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-2929:

Attachment: 0001-Don-t-include-tmp-files-as-sstables-when-creating-CF.patch

Patch is against 0.7.

Don't include tmp files as sstable when create column families
--

Key: CASSANDRA-2929
URL: https://issues.apache.org/jira/browse/CASSANDRA-2929
[jira] [Reopened] (CASSANDRA-2825) Auto bootstrapping the 4th node in a 4 node cluster doesn't work, when no token explicitly assigned in config.
[ https://issues.apache.org/jira/browse/CASSANDRA-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne reopened CASSANDRA-2825:
-

Reopening because the patch broke BootStrapperTest (it somehow hangs forever until the junit timeout).

Auto bootstrapping the 4th node in a 4 node cluster doesn't work, when no token explicitly assigned in config.
--

Key: CASSANDRA-2825
URL: https://issues.apache.org/jira/browse/CASSANDRA-2825
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.0, 0.8.1
Reporter: Michael Allen
Assignee: Brandon Williams
Fix For: 0.8.2
Attachments: 2825-v2.txt, 2825.txt

This was done in sequence: A, B, C, and D. Node A with token 0 explicitly set in config, the rest with auto_bootstrap: true and no token explicitly assigned. B and C work as expected. D ends up stealing C's token. From system.log on C:

INFO [GossipStage:1] 2011-06-24 16:40:41,947 Gossiper.java (line 638) Node /10.171.47.226 is now part of the cluster
INFO [GossipStage:1] 2011-06-24 16:40:41,947 Gossiper.java (line 606) InetAddress /10.171.47.226 is now UP
INFO [GossipStage:1] 2011-06-24 16:42:09,432 StorageService.java (line 769) Nodes /10.171.47.226 and /10.171.55.77 have the same token 61078635599166706937511052402724559481. /10.171.47.226 is the new owner
WARN [GossipStage:1] 2011-06-24 16:42:09,432 TokenMetadata.java (line 120) Token 61078635599166706937511052402724559481 changing ownership from /10.171.55.77 to /10.171.47.226
[jira] [Commented] (CASSANDRA-2521) Move away from Phantom References for Compaction/Memtable
[ https://issues.apache.org/jira/browse/CASSANDRA-2521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068887#comment-13068887 ] Hudson commented on CASSANDRA-2521:
---

Integrated in Cassandra #967 (See [https://builds.apache.org/job/Cassandra/967/])

Use reference counting to delete sstables instead of relying on the GC
patch by slebresne; reviewed by jbellis for CASSANDRA-2521

slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149085
Files :
* /cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableUtils.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/AntiEntropyService.java
* /cassandra/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/DataTracker.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/Table.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
* /cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
* /cassandra/trunk/src/java/org/apache/cassandra/streaming/StreamOutSession.java
* /cassandra/trunk/test/unit/org/apache/cassandra/streaming/SerializationsTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/util/BufferedSegmentedFile.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/util/SegmentedFile.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/util/MmappedSegmentedFile.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableDeletingTask.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/StorageServiceMBean.java
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/src/java/org/apache/cassandra/streaming/PendingFile.java
* /cassandra/trunk/src/java/org/apache/cassandra/streaming/StreamInSession.java
* /cassandra/trunk/src/java/org/apache/cassandra/streaming/StreamOut.java
* /cassandra/trunk/src/java/org/apache/cassandra/service/GCInspector.java
* /cassandra/trunk/test/unit/org/apache/cassandra/streaming/StreamingTransferTest.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
* /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableDeletingReference.java

Move away from Phantom References for Compaction/Memtable
-

Key: CASSANDRA-2521
URL: https://issues.apache.org/jira/browse/CASSANDRA-2521
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068896#comment-13068896 ] Sylvain Lebresne commented on CASSANDRA-2843:
-

bq. You mean just as a javadoc comment?

Yes.

better performance on long row read
---

Key: CASSANDRA-2843
URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
svn commit: r1149121 - in /cassandra/branches/cassandra-0.8: ./ conf/ src/java/org/apache/cassandra/concurrent/ src/java/org/apache/cassandra/db/compaction/ src/java/org/apache/cassandra/service/ test
Author: slebresne
Date: Thu Jul 21 11:11:50 2011
New Revision: 1149121

URL: http://svn.apache.org/viewvc?rev=1149121&view=rev
Log:
Properly synchronize merkle tree computation

patch by slebresne; reviewed by jbellis for CASSANDRA-2816

Modified:
    cassandra/branches/cassandra-0.8/CHANGES.txt
    cassandra/branches/cassandra-0.8/conf/cassandra.yaml
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AntiEntropyService.java
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java
    cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1149121&r1=1149120&r2=1149121&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Thu Jul 21 11:11:50 2011
@@ -38,6 +38,7 @@
  * fix re-using index CF sstable names after drop/recreate (CASSANDRA-2872)
  * prepend CF to default index names (CASSANDRA-2903)
  * fix hint replay (CASSANDRA-2928)
+ * Properly synchronize merkle tree computation (CASSANDRA-2816)


 0.8.1

Modified: cassandra/branches/cassandra-0.8/conf/cassandra.yaml
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/conf/cassandra.yaml?rev=1149121&r1=1149120&r2=1149121&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/conf/cassandra.yaml (original)
+++ cassandra/branches/cassandra-0.8/conf/cassandra.yaml Thu Jul 21 11:11:50 2011
@@ -253,13 +253,15 @@ column_index_size_in_kb: 64
 # will be logged specifying the row key.
 in_memory_compaction_limit_in_mb: 64
 
-# Number of compaction threads. This default to the number of processors,
+# Number of compaction threads (NOT including validation compactions
+# for anti-entropy repair). This default to the number of processors,
 # enabling multiple compactions to execute at once. Using more than one
 # thread is highly recommended to preserve read performance in a mixed
 # read/write workload as this avoids sstables from accumulating during long
 # running compactions. The default is usually fine and if you experience
 # problems with compaction running too slowly or too fast, you should look at
 # compaction_throughput_mb_per_sec first.
+#
 # Uncomment to make compaction mono-threaded.
 #concurrent_compactors: 1
 
@@ -267,7 +269,8 @@ in_memory_compaction_limit_in_mb: 64
 # system. The faster you insert data, the faster you need to compact in
 # order to keep the sstable count down, but in general, setting this to
 # 16 to 32 times the rate you are inserting data is more than sufficient.
-# Setting this to 0 disables throttling.
+# Setting this to 0 disables throttling. Note that this account for all types
+# of compaction, including validation compaction.
 compaction_throughput_mb_per_sec: 16
 
 # Track cached row keys during compaction, and re-cache their new

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java?rev=1149121&r1=1149120&r2=1149121&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java Thu Jul 21 11:11:50 2011
@@ -52,9 +52,14 @@ public class DebuggableThreadPoolExecuto
         this(1, Integer.MAX_VALUE, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(), new NamedThreadFactory(threadPoolName, priority));
     }
 
-    public DebuggableThreadPoolExecutor(int corePoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory)
+    public DebuggableThreadPoolExecutor(int corePoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> queue, ThreadFactory factory)
     {
-        super(corePoolSize, corePoolSize, keepAliveTime, unit, workQueue, threadFactory);
+        this(corePoolSize, corePoolSize, keepAliveTime, unit, queue, factory);
+    }
+
+    protected DebuggableThreadPoolExecutor(int corePoolSize, int maxPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory)
+    {
+        super(corePoolSize, maxPoolSize, keepAliveTime, unit, workQueue, threadFactory);
[jira] [Commented] (CASSANDRA-2816) Repair doesn't synchronize merkle tree creation properly
[ https://issues.apache.org/jira/browse/CASSANDRA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068907#comment-13068907 ] Sylvain Lebresne commented on CASSANDRA-2816:
-

Alright, v5 looks good to me. Committed, thanks.

Repair doesn't synchronize merkle tree creation properly
--

Key: CASSANDRA-2816
URL: https://issues.apache.org/jira/browse/CASSANDRA-2816
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Labels: repair
Fix For: 0.8.2
Attachments: 0001-Schedule-merkle-tree-request-one-by-one.patch, 2816-v2.txt, 2816-v4.txt, 2816-v5.txt, 2816_0.8_v3.patch

Being a little slow, I just realized after having opened CASSANDRA-2811 and CASSANDRA-2815 that there is a more general problem with repair. When a repair is started, it will send a number of merkle tree requests to its neighbors as well as to itself, and assume for correctness that the building of those trees will be started on every node at roughly the same time (if not, we end up comparing data snapshots taken at different times and will thus mistakenly repair a lot of useless data). This is bogus for many reasons:
* Because validation compactions run on the same executor as other compactions, the start of the validation on the different nodes is subject to other compactions. 0.8 mitigates this in a way by being multi-threaded (and thus there is less chance of being blocked a long time by a long-running compaction), but the compaction executor being bounded, it's still a problem.
* If you run a nodetool repair without arguments, it will repair every CF. As a consequence it will generate lots of merkle tree requests, and all of those requests will be issued at the same time. Because even in 0.8 the compaction executor is bounded, some of those validations will end up being queued behind the first ones.
Even assuming that the different validations are submitted in the same order on each node (which isn't guaranteed either), there is no guarantee that on all nodes the first validation will take the same time, hence desynchronizing the queued ones. Overall, it is important for the precision of repair that for a given CF and range (which is the unit at which trees are computed), we make sure that all nodes will start the validation at the same time (or, since we can't do magic, as close as possible). One (reasonably simple) proposition to fix this would be to have repair schedule validation compactions across nodes one by one (i.e., one CF/range at a time), waiting for all nodes to return their tree before submitting the next request. Then on each node, we should make sure that the node will start the validation compaction as soon as requested. For that, we probably want to have a specific executor for validation compactions and either:
* fail the whole repair whenever one node is not able to execute the validation compaction right away (because no threads are available right away), or
* simply tell the user that if he starts too many repairs in parallel, he may start seeing some of them repairing more data than they should.
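The one-range-at-a-time scheduling proposed above can be sketched with standard executors. This is an illustrative model, not the AntiEntropyService code: the class and method names are assumptions, and each Callable stands in for one node's merkle tree validation.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

// Sketch: for each CF/range, request a validation from every node and block
// until all trees are back before moving on, so snapshots stay synchronized.
class SequentialRepairScheduler {
    public static <T> int repair(List<List<Callable<T>>> requestsPerRange, ExecutorService pool) {
        int completedRanges = 0;
        try {
            for (List<Callable<T>> oneRange : requestsPerRange) {
                // invokeAll returns only once every task in the batch has finished
                for (Future<T> tree : pool.invokeAll(oneRange))
                    tree.get(); // propagate any validation failure
                completedRanges++;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException("validation failed", e);
        }
        return completedRanges;
    }
}
```

The key property is that no request for range N+1 is issued before every node has answered for range N, which bounds how far apart the per-node snapshots can drift.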
[jira] [Commented] (CASSANDRA-2924) Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1
[ https://issues.apache.org/jira/browse/CASSANDRA-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068927#comment-13068927 ] Rick Shaw commented on CASSANDRA-2924:
--

+1 (lesson learned)

Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1
---

Key: CASSANDRA-2924
URL: https://issues.apache.org/jira/browse/CASSANDRA-2924
Project: Cassandra
Issue Type: Improvement
Components: Drivers
Affects Versions: 0.8.1
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Minor
Labels: JDBC
Fix For: 0.8.2
Attachments: 2924-v2.txt, consolidate-connection-v1.txt

For the JDBC Driver suite, additional cleanup and consolidation of classes {{Connection}} and {{CassandraConnection}} were in order. Those changes drove a few casual additional changes in related classes {{CResultSet}}, {{CassandraStatement}} and {{CassandraPreparedStatement}} in order to continue to communicate properly. The class {{Utils}} was also enhanced to move more static utility methods into this holder class.
[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty
[ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068931#comment-13068931 ] Sylvain Lebresne commented on CASSANDRA-2829:
-

bq. It feels like we need to add a "most recent write at" information as well as the "oldest write/replay position" one. This would not need to be persisted to disk.

Agreed, I think this is the right fix too.

memtable with no post-flush activity can leave commitlog permanently dirty
---

Key: CASSANDRA-2829
URL: https://issues.apache.org/jira/browse/CASSANDRA-2829
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Aaron Morton
Assignee: Jonathan Ellis
Fix For: 0.8.2
Attachments: 0001-2829-unit-test.patch, 0002-2829.patch

Only dirty memtables are flushed, and so only dirty memtables are used to discard obsolete commit log segments. This can result in log segments not being deleted even though the data has been flushed. Was using a 3-node 0.7.6-2 AWS cluster (DataStax AMIs) with pre-0.7 data loaded and a running application working against the cluster. Did a rolling restart and then kicked off a repair; one node filled up the commit log volume with 7GB+ of log data, about 20 hours of log files.

{noformat}
$ sudo ls -lah commitlog/
total 6.9G
drwx------ 2 cassandra cassandra 12K 2011-06-24 20:38 .
drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
-rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
-rw------- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308876643288.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
-rw-r--r-- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308877711517.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
-rw-r--r-- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308879395824.log.header
...
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
-rw-r--r-- 1 cassandra cassandra 36 2011-06-24 20:47 CommitLog-1308946745380.log.header
-rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
-rw-r--r-- 1 cassandra cassandra 44 2011-06-24 20:47 CommitLog-1308947888397.log.header
{noformat}

The user KS has 2 CFs with 60-minute flush times. The system KS had the default settings, which is 24 hours. Will create another ticket to see if these can be reduced or if it's something users should do; in this case it would not have mattered. I grabbed the log headers and used the tool in CASSANDRA-2828, and most of the segments had the system CFs marked as dirty.

{noformat}
$ bin/logtool dirty /tmp/logs/commitlog/
Not connected to a server, Keyspace and Column Family names are not available.
/tmp/logs/commitlog/CommitLog-1308876643288.log.header
Keyspace Unknown: Cf id 0: 444
/tmp/logs/commitlog/CommitLog-1308877711517.log.header
Keyspace Unknown: Cf id 1: 68848763
...
/tmp/logs/commitlog/CommitLog-1308944451460.log.header
Keyspace Unknown: Cf id 1: 61074
/tmp/logs/commitlog/CommitLog-1308945597471.log.header
Keyspace Unknown: Cf id 1000: 43175492 Cf id 1: 108483
/tmp/logs/commitlog/CommitLog-1308946745380.log.header
Keyspace Unknown: Cf id 1000: 239223 Cf id 1: 172211
/tmp/logs/commitlog/CommitLog-1308947888397.log.header
Keyspace Unknown: Cf id 1001: 57595560 Cf id 1: 816960 Cf id 1000: 0
{noformat}

CF 0 is the Status / LocationInfo CF and 1 is the HintedHandoff CF. I don't have it now, but IIRC CFStats showed the LocationInfo CF with dirty ops. I was able to repro a case where flushing the CFs did not mark the log segments as obsolete (attached unit-test patch). Steps are:
1. Write to cf1 and flush.
2. The current log segment is marked as dirty at the CL position when the flush started, CommitLog.discardCompletedSegmentsInternal().
3. Do not write to cf1 again.
4. Roll the log (my test does this manually).
5. Write to cf2 and flush.
6. Only cf2 is flushed because it is the only dirty CF. cfs.maybeSwitchMemtable() is not called for cf1, and so log segment 1 is still marked as dirty from cf1.

Step 5 is not essential; it just matched what I thought was happening. I thought SystemTable.updateToken() was called, which does not flush, and this was the last thing that happened. The expired-memtable thread created by Table uses the same cfs.forceFlush(), which is a no-op if the CF or its secondary indexes are clean. I think the same problem would exist in 0.8.
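The dirty-segment bookkeeping behind the steps above can be modeled in a few lines. This is a toy model under assumed names, not the real CommitLog code: segments are just longs, and flushing a CF marks it clean everywhere, which is the behavior the discussed fix would restore for memtables with no post-flush writes.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model: a segment can only be deleted once no CF is still marked dirty
// in it. The bug is that a CF that never writes again is never re-flushed,
// so it pins its old segment forever.
class CommitLogModel {
    private final Map<Long, Set<String>> dirty = new HashMap<>();

    public void write(long segment, String cf) {
        dirty.computeIfAbsent(segment, s -> new HashSet<>()).add(cf);
    }

    // Flushing a CF makes it clean in every segment; segments left with no
    // dirty CFs become deletable.
    public void flush(String cf) {
        dirty.values().forEach(set -> set.remove(cf));
        dirty.values().removeIf(Set::isEmpty);
    }

    public boolean segmentDeletable(long segment) {
        return !dirty.containsKey(segment);
    }
}
```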
[jira] [Commented] (CASSANDRA-2863) NPE when writing SSTable generated via repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068951#comment-13068951 ] Sylvain Lebresne commented on CASSANDRA-2863:
-

I'm a little bit baffled by that one. Trusting the stack trace, apparently when SSTW.RowIndexer.close() is called, the iwriter field is null. But iwriter is set in prepareIndexing(), which is called the line before index() in SSTW.Builder. Thus if an exception happens in prepareIndexing(), we shouldn't reach the index() method (which is the one triggering the close()). And looking at the uses of iwriter, no other line sets it (so it can't be set back to null after prepareIndexing()). So we can add an {{if (iwriter != null)}} before calling close(), but the truth is I have no clue how it could ever be null at that point. Héctor: are you positive that you are using stock 0.8.1?

NPE when writing SSTable generated via repair
-

Key: CASSANDRA-2863
URL: https://issues.apache.org/jira/browse/CASSANDRA-2863
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.8.1
Reporter: Héctor Izquierdo
Assignee: Sylvain Lebresne
Fix For: 0.8.2

A NPE is generated during repair when closing an sstable generated via SSTable build. It doesn't happen always. The node had been scrubbed and compacted before calling repair.
INFO [CompactionExecutor:2] 2011-07-06 11:11:32,640 SSTableReader.java (line 158) Opening /d2/cassandra/data/sbs/walf-g-730 ERROR [CompactionExecutor:2] 2011-07-06 11:11:34,327 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[CompactionExecutor:2,1,main] java.lang.NullPointerException at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.close(SSTableWriter.java:382) at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.index(SSTableWriter.java:370) at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:315) at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:1103) at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:1094) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
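The defensive check discussed above would look roughly like this. This is a minimal sketch: the class and field names mirror the stack trace (RowIndexer, iwriter), but everything else here is assumed, not the real SSTableWriter code.

```java
import java.io.Closeable;
import java.io.IOException;

// Minimal stand-in for SSTableWriter.RowIndexer: iwriter is normally assigned
// in prepareIndexing(), so to model the reported failure it may still be null
// when close() runs.
class RowIndexer {
    Closeable iwriter;

    void close() throws IOException {
        if (iwriter != null)   // the proposed guard: don't close what was never opened
            iwriter.close();
    }
}
```

With the guard, calling close() on an indexer whose prepareIndexing() never ran becomes a no-op instead of an NPE, though as noted it remains unclear how that state is reached in stock 0.8.1.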
[jira] [Commented] (CASSANDRA-2816) Repair doesn't synchronize merkle tree creation properly
[ https://issues.apache.org/jira/browse/CASSANDRA-2816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068953#comment-13068953 ] Hudson commented on CASSANDRA-2816: --- Integrated in Cassandra-0.8 #231 (See [https://builds.apache.org/job/Cassandra-0.8/231/]) Properly synchronize merkle tree computation patch by slebresne; reviewed by jbellis for CASSANDRA-2816 slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149121 Files : * /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/service/AntiEntropyServiceTestAbstract.java * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AntiEntropyService.java * /cassandra/branches/cassandra-0.8/CHANGES.txt * /cassandra/branches/cassandra-0.8/conf/cassandra.yaml * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/concurrent/DebuggableThreadPoolExecutor.java * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/db/compaction/CompactionManager.java Repair doesn't synchronize merkle tree creation properly Key: CASSANDRA-2816 URL: https://issues.apache.org/jira/browse/CASSANDRA-2816 Project: Cassandra Issue Type: Bug Components: Core Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Labels: repair Fix For: 0.8.2 Attachments: 0001-Schedule-merkle-tree-request-one-by-one.patch, 2816-v2.txt, 2816-v4.txt, 2816-v5.txt, 2816_0.8_v3.patch Being a little slow, I just realized after having opened CASSANDRA-2811 and CASSANDRA-2815 that there is a more general problem with repair. When a repair is started, it sends a number of merkle tree requests to its neighbors as well as to itself, and assumes for correctness that the building of those trees will start on every node at roughly the same time (if not, we end up comparing data snapshots taken at different times and will thus mistakenly repair a lot of useless data).
This is bogus for several reasons: * Because validation compaction runs on the same executor as other compactions, the start of the validation on the different nodes is subject to other compactions. 0.8 mitigates this somewhat by being multi-threaded (so there is less chance of being blocked for a long time by a long-running compaction), but the compaction executor being bounded, it's still a problem. * If you run nodetool repair without arguments, it repairs every CF. As a consequence it generates lots of merkle tree requests, all issued at the same time. Because even in 0.8 the compaction executor is bounded, some of those validations end up queued behind the first ones. Even assuming that the different validations are submitted in the same order on each node (which isn't guaranteed either), there is no guarantee that the first validation will take the same time on all nodes, hence desynchronizing the queued ones. Overall, it is important for the precision of repair that for a given CF and range (the unit at which trees are computed), all nodes start the validation at the same time (or, since we can't do magic, as close as possible). One (reasonably simple) proposal to fix this would be to have repair schedule validation compactions across nodes one by one (i.e., one CF/range at a time), waiting for all nodes to return their tree before submitting the next request. Then on each node, we should make sure the node starts the validation compaction as soon as it is requested. For that, we probably want a specific executor for validation compaction, and then: * either fail the whole repair whenever one node is not able to execute the validation compaction right away (because no thread is available), * or simply tell the user that if he starts too many repairs in parallel, he may see some of them repairing more data than they should.
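The "one CF/range at a time" scheduling proposed above can be sketched roughly as follows. All type and method names here are illustrative, not the real AntiEntropyService API: the point is only the control flow — request trees from every replica for one unit, block until all come back, then move on.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy coordinator: for each (cf, range) unit, request a merkle tree from all
// replicas and block until every tree is back before submitting the next unit.
class RepairCoordinator {
    private final ExecutorService pool = Executors.newCachedThreadPool();

    // Stand-in for sending a TreeRequest to one replica and awaiting its tree.
    Callable<String> requestTree(String replica, String cfRange) {
        return () -> replica + " tree for " + cfRange;
    }

    List<String> repair(List<String> cfRanges, List<String> replicas) throws Exception {
        List<String> trees = new ArrayList<>();
        for (String cfRange : cfRanges) {              // one CF/range at a time
            List<Future<String>> pending = new ArrayList<>();
            for (String replica : replicas)
                pending.add(pool.submit(requestTree(replica, cfRange)));
            for (Future<String> f : pending)           // wait for all nodes' trees
                trees.add(f.get());
        }
        pool.shutdown();
        return trees;
    }
}
```

This keeps the validations for a given unit as close together in time as possible, at the cost of serializing units, which is the trade-off the ticket argues for.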
[jira] [Commented] (CASSANDRA-2863) NPE when writing SSTable generated via repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068956#comment-13068956 ] Héctor Izquierdo commented on CASSANDRA-2863: - I have a patch from 2818 (2818-v4) applied, if that's of any help. The patch only touches messaging classes though.
[jira] [Created] (CASSANDRA-2930) corrupt commitlog
corrupt commitlog - Key: CASSANDRA-2930 URL: https://issues.apache.org/jira/browse/CASSANDRA-2930 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Environment: Linux, amd64. Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Reporter: ivan We get Exception encountered during startup error while Cassandra starts. Error messages: INFO 13:56:28,736 Finished reading /var/lib/cassandra/commitlog/CommitLog-1310637513214.log ERROR 13:56:28,736 Exception encountered during startup. java.io.IOError: java.io.EOFException at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:265) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:281) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:236) at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493) at java.util.concurrent.ConcurrentSkipListMap.init(ConcurrentSkipListMap.java:1443) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:419) at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:139) at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:127) at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:382) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:278) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:158) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:368) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80) Caused by: java.io.EOFException at java.io.DataInputStream.readFully(DataInputStream.java:180) at java.io.DataInputStream.readFully(DataInputStream.java:152) at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394) 
at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:368) at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:87) at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:261) ... 13 more (The same "Exception encountered during startup" stack trace is then printed a second time.) After some debugging I found that in some serialized supercolumns the column count is less than the number of serialized columns. The difference was always 1 in the corrupt commitlogs. This error always appears with supercolumns with more than one column, but there are also properly serialized supercolumns in the commitlog. I have no clue yet why this happens; I suspect it may be a race condition.
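To illustrate why a count field that is one less than the actual number of serialized columns surfaces later as an EOFException, here is a toy count-prefixed format. It is deliberately much simpler than the real SuperColumn serialization (just an int count followed by length-prefixed values), so it only demonstrates the failure mode, not the actual layout.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutputStream;
import java.io.IOException;

// Toy count-prefixed record: an int column count, then length-prefixed values.
// If the count field undercounts by one, the reader stops early and the extra
// column's bytes are misparsed as the start of the next record.
class CountPrefixed {
    static byte[] write(int claimedCount, byte[][] columns) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        out.writeInt(claimedCount);          // the (possibly wrong) count field
        for (byte[] c : columns) {
            out.writeShort(c.length);
            out.write(c);
        }
        return bos.toByteArray();
    }

    static void readColumns(DataInput in) throws IOException {
        int n = in.readInt();                // the reader trusts this count
        for (int i = 0; i < n; i++) {
            byte[] v = new byte[in.readUnsignedShort()];
            in.readFully(v);                 // EOF here once the stream is desynced
        }
    }
}
```

A record whose count says 1 but which actually carries 2 columns leaves the trailing column in the stream; the next read interprets value bytes as a count and a length and runs off the end, matching the readFully → EOFException at the top of the reported trace.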
[jira] [Issue Comment Edited] (CASSANDRA-2863) NPE when writing SSTable generated via repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068956#comment-13068956 ] Héctor Izquierdo edited comment on CASSANDRA-2863 at 7/21/11 12:36 PM: --- I have a patch from #2818 (2818-v4) applied, if that's of any help. The patch only touches messaging classes though. was (Author: hector.izquierdo): I have a patch from 2818 (2818-v4) applied, if that's of any help. The patch only touches messaging classes though.
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068957#comment-13068957 ] Brandon Williams commented on CASSANDRA-2843: - {quote} I used commit 4629648899e637e8e03938935f126689cce5ad48 and applied the 2843_c.patch, and also tried the head, but got the following error with the benchmark pycassa script. how did you succeed with it? [default@query] create column family NoCache ... with comparator = AsciiType ... and default_validation_class = AsciiType ... and key_validation_class = AsciiType ... and keys_cached = 0 ... and rows_cached = 0; Unable to set Compaction Strategy Class of AsciiType {quote} Add {{and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}} to the create statement. better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch currently if a row contains 1000 columns, the read time becomes considerably slow (my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, is about 16ms). this is all running in memory, no disk read is involved.
through debugging we can find most of this time is spent on [Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily) [Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily) [Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int) [Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int) [Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn) ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. this structure is slow for two reasons: it needs to do synchronization, and it needs to maintain the more complex structure of a map. but if we look at the whole read path, thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. on the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single-threaded, so no synchronization is needed. but these 2 features are indeed needed for ColumnFamily in other cases, particularly write. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so that getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper. the provided patch is for demonstration now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage.
at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve between descending and ascending, but ascending works for now). so the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if equal, reconcile. slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF, but we could definitely think about a better way to provide this returnCF. this patch compiles fine; no tests are provided yet, but I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case. thanks Yang
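The append-or-reconcile logic described above can be sketched like this. It is a simplified stand-in for the FastColumnFamily idea, not the actual patch: since callers insert in sorted order, addColumn() only ever compares against the last element of the backing array.

```java
import java.util.ArrayList;

// Simplified array-backed CF: append on a new name, reconcile (keep the
// newest timestamp) on a duplicate of the last name. The Column type here
// stands in for IColumn.
class FastColumnFamily {
    static final class Column {
        final String name; final String value; final long timestamp;
        Column(String name, String value, long timestamp) {
            this.name = name; this.value = value; this.timestamp = timestamp;
        }
    }

    private final ArrayList<Column> columns = new ArrayList<>();

    void addColumn(Column c) {
        if (!columns.isEmpty()) {
            Column last = columns.get(columns.size() - 1);
            int cmp = last.name.compareTo(c.name);
            if (cmp == 0) {                         // same name: reconcile
                if (c.timestamp >= last.timestamp)
                    columns.set(columns.size() - 1, c);
                return;
            }
            // the sorted-insert invariant guarantees cmp < 0 here
        }
        columns.add(c);                             // new name: O(1) append
    }

    int size() { return columns.size(); }
    Column get(int i) { return columns.get(i); }
}
```

Compared with a ConcurrentSkipListMap, each insertion is a single comparison and an array append, with no locking, which is where the claimed ~50% read-time reduction comes from.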
[jira] [Updated] (CASSANDRA-2930) corrupt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ivan updated CASSANDRA-2930: Attachment: CommitLog-1310637513214.log A corrupt serialized row from a corrupt commitlog.
[jira] [Updated] (CASSANDRA-2921) Split BufferedRandomAccessFile (BRAF) into Input and Output classes
[ https://issues.apache.org/jira/browse/CASSANDRA-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-2921: --- Attachment: CASSANDRA-2921-make-Writer-a-stream.patch The patch changes BRAF.Writer: instead of extending AbstractRandomAccessFile it extends AbstractDataOutput (a new class) and introduces mark() and resetAndTruncate(...) methods to satisfy the scrub and CommitLog requirements. Split BufferedRandomAccessFile (BRAF) into Input and Output classes Key: CASSANDRA-2921 URL: https://issues.apache.org/jira/browse/CASSANDRA-2921 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Fix For: 1.0 Attachments: CASSANDRA-2921-make-Writer-a-stream.patch, CASSANDRA-2921-v2.patch, CASSANDRA-2921.patch Split BRAF into Input and Output classes to avoid complexity related to random I/O in write mode that we don't need any more (see CASSANDRA-2879), and to make the implementation cleaner and more reusable.
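A rough model of the mark()/resetAndTruncate() contract the patch introduces: remember a position before a risky write, and on failure discard everything written after it. This sketch is backed by a growable byte array rather than a file, and all names other than mark() and resetAndTruncate() are illustrative.

```java
import java.util.Arrays;

// Growable byte buffer with the two operations the patch describes: mark()
// records the current position, resetAndTruncate() drops everything written
// after it (e.g. a partially written, failed commit log entry).
class TruncatableOutput {
    private byte[] buf = new byte[16];
    private int length = 0;
    private int mark = 0;

    void write(byte[] data) {
        if (length + data.length > buf.length)
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, length + data.length));
        System.arraycopy(data, 0, buf, length, data.length);
        length += data.length;
    }

    void mark()             { mark = length; }   // remember a safe position
    void resetAndTruncate() { length = mark; }   // rewind, dropping bytes after it
    byte[] toBytes()        { return Arrays.copyOf(buf, length); }
}
```

The real writer wraps a file, so truncation also has to move the file pointer, but the contract is the same: nothing after the mark survives.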
[jira] [Updated] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty
[ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Morton updated CASSANDRA-2829: Attachment: 0002-2829-v08.patch 0001-2829-unit-test-v08.patch I got to take another look at this tonight on the 0.8 trunk and ported the unit test to 0.8. The 0002-2829-v08 patch was my second attempt. It changes CFS.forceFlush() to always flush and trusts maybeSwitchMemtable() to only flush non-clean CFs. There are no changes to CommitLog.discardCompletedSegmentsInternal(). The CF will be turned off in any segment that is not the context segment, and will always be turned on in the current/context segment. I think this gives the correct behaviour, i.e. the cf can never have dirty changes in a segment that is not current AND the cf may have changes in a segment that is current. It is a bit sloppy though, as clean CFs will mark segments as dirty, which may delay them being cleaned. I also think there is a theoretical risk of a race condition with access to the segments Deque: the iterator runs in the postFlushExecutor while segments are added on the appropriate commit log executor service. memtable with no post-flush activity can leave commitlog permanently dirty --- Key: CASSANDRA-2829 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829 Project: Cassandra Issue Type: Bug Components: Core Reporter: Aaron Morton Assignee: Jonathan Ellis Fix For: 0.8.2 Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch Only dirty Memtables are flushed, and so only dirty memtables are used to discard obsolete commit log segments. This can result in log segments not being deleted even though the data has been flushed. Was using a 3 node 0.7.6-2 AWS cluster (DataStax AMIs) with pre-0.7 data loaded and a running application working against the cluster.
Did a rolling restart and then kicked off a repair, one node filled up the commit log volume with 7GB+ of log data, there was about 20 hours of log files. {noformat} $ sudo ls -lah commitlog/ total 6.9G drwx-- 2 cassandra cassandra 12K 2011-06-24 20:38 . drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 .. -rw--- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log -rw--- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308876643288.log.header -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log -rw-r--r-- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308877711517.log.header -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log -rw-r--r-- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308879395824.log.header ... -rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log -rw-r--r-- 1 cassandra cassandra 36 2011-06-24 20:47 CommitLog-1308946745380.log.header -rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log -rw-r--r-- 1 cassandra cassandra 44 2011-06-24 20:47 CommitLog-1308947888397.log.header {noformat} The user KS has 2 CF's with 60 minute flush times. System KS had the default settings which is 24 hours. Will create another ticket see if these can be reduced or if it's something users should do, in this case it would not have mattered. I grabbed the log headers and used the tool in CASSANDRA-2828 and most of the segments had the system CF's marked as dirty. {noformat} $ bin/logtool dirty /tmp/logs/commitlog/ Not connected to a server, Keyspace and Column Family names are not available. /tmp/logs/commitlog/CommitLog-1308876643288.log.header Keyspace Unknown: Cf id 0: 444 /tmp/logs/commitlog/CommitLog-1308877711517.log.header Keyspace Unknown: Cf id 1: 68848763 ... 
/tmp/logs/commitlog/CommitLog-1308944451460.log.header Keyspace Unknown: Cf id 1: 61074 /tmp/logs/commitlog/CommitLog-1308945597471.log.header Keyspace Unknown: Cf id 1000: 43175492 Cf id 1: 108483 /tmp/logs/commitlog/CommitLog-1308946745380.log.header Keyspace Unknown: Cf id 1000: 239223 Cf id 1: 172211 /tmp/logs/commitlog/CommitLog-1308947888397.log.header Keyspace Unknown: Cf id 1001: 57595560 Cf id 1: 816960 Cf id 1000: 0 {noformat} CF 0 is the Status / LocationInfo CF and 1 is the HintedHandoff CF. I don't have it now, but IIRC CFStats showed the LocationInfo CF with dirty ops. I was able to repro a case where flushing the CFs did not mark the log segments as obsolete (attached unit-test patch). Steps are: 1. Write to cf1 and flush. 2. Current log segment is marked as dirty at the CL
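The dirty-segment bookkeeping under discussion can be modeled very simply: a segment records which CFs wrote into it, and it can only be reclaimed once each of those CFs is marked clean. If a CF's flush is a no-op (because its memtable is already clean), the clean mark is never applied and the segment stays dirty indefinitely, which is the bug described here. All names in this toy model are illustrative, not the real CommitLogSegment API.

```java
import java.util.HashSet;
import java.util.Set;

// Toy commit log segment: tracks CFs with unflushed writes; deletable only
// once every one of them has been marked clean post-flush.
class Segment {
    final Set<Integer> dirtyCfs = new HashSet<>();

    void write(int cfId)     { dirtyCfs.add(cfId); }     // mutation lands in segment
    void markClean(int cfId) { dirtyCfs.remove(cfId); }  // runs after a real flush
    boolean isSafeToDelete() { return dirtyCfs.isEmpty(); }
}
```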
[jira] [Created] (CASSANDRA-2931) Nodetool ring prints the same token regardless of node queried
Nodetool ring prints the same token regardless of node queried -- Key: CASSANDRA-2931 URL: https://issues.apache.org/jira/browse/CASSANDRA-2931 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.7.6 Reporter: David Allsopp Priority: Trivial I have a 3-node test cluster. Using {{nodetool ring}} for any of the nodes returns the _same_ token at the top of the list (113427455640312821154458202477256070484) - but presumably this should reflect the token _of the node I am querying_ (as specified using -h)? Or if not, what does it mean? {noformat} [dna@dev6 ~]$ nodetool -h dev6 -p 8082 ring Address Status State Load Owns Token 113427455640312821154458202477256070484 10.0.11.8 Up Normal 2.41 GB 33.33% 0 10.0.11.6 Up Normal 3.13 GB 33.33% 56713727820156410577229101238628035242 10.0.11.9 Up Normal 1.65 GB 33.33% 113427455640312821154458202477256070484 [dna@dev6 ~]$ nodetool -h dev8 -p 8082 ring Address Status State Load Owns Token 113427455640312821154458202477256070484 10.0.11.8 Up Normal 2.41 GB 33.33% 0 10.0.11.6 Up Normal 3.13 GB 33.33% 56713727820156410577229101238628035242 10.0.11.9 Up Normal 1.65 GB 33.33% 113427455640312821154458202477256070484 [dna@dev6 ~]$ nodetool -h dev9 -p 8082 ring Address Status State Load Owns Token 113427455640312821154458202477256070484 10.0.11.8 Up Normal 2.41 GB 33.33% 0 10.0.11.6 Up Normal 3.13 GB 33.33% 56713727820156410577229101238628035242 10.0.11.9 Up Normal 1.65 GB 33.33% 113427455640312821154458202477256070484 {noformat}
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068972#comment-13068972 ] Sylvain Lebresne commented on CASSANDRA-2843: - bq. Doesn't it make sense then to change the AL fallback-to-bsearch into an assertion failure? Actually, I just realized that there is one place where we do add a column after a read that is not at the end of the CF. That's for counters, after the read for replication, in the case where we can shrink the context because the node has renewed its NodeId multiple times and we can merge the old ones together. In that case, we end up updating some of the columns of the column family we've just read. Note that this code won't even be executed 99.999% of the time, and even then only a handful of columns are likely to be updated, so using the AL implementation really is the best choice. We could, if we really wanted to, add special code for that specific case (typically, copying the CF we read into a CLSM-backed one before updating it). That would be less efficient, but that probably doesn't matter in this specific case. But more importantly, this exemplifies why I think using an assertion is more dangerous than it needs to be. Imho, the bug we would have had is the kind that could likely have made it into a release.
through debugging we can find most of this time is spent on
[Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
[Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
[Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. this structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure. but if we look at the whole read path, thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. on the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single-threaded, so no synchronization is needed. but these 2 features are indeed needed for ColumnFamily in other cases, particularly write. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper. the provided patch is for demonstration now; will work further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage.
at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve regarding descending vs. ascending order, but ascending works now). so the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if equal, reconcile. slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about what is the better way to provide this returnCF. this patch compiles fine; no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case. thanks Yang
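The append-or-reconcile invariant described above can be sketched in isolation. Everything below (the class, a String-keyed Column, last-write-wins reconcile) is a simplified stand-in for the patch's FastColumnFamily idea, not Cassandra's actual code: because callers deliver columns in sorted order, an array plus a single comparison against the last element replaces the ConcurrentSkipListMap.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of append-with-reconcile over an array-backed store.
class FastColumnList {
    static class Column {
        final String name;
        final String value;
        Column(String name, String value) { this.name = name; this.value = value; }
    }

    private final List<Column> columns = new ArrayList<>();

    // Append if the new name sorts after the last one; reconcile on equal
    // names; anything else violates the sorted-input invariant.
    void addColumn(Column c) {
        if (columns.isEmpty()) { columns.add(c); return; }
        Column last = columns.get(columns.size() - 1);
        int cmp = last.name.compareTo(c.name);
        if (cmp < 0)
            columns.add(c);                     // common case: O(1) append
        else if (cmp == 0)
            columns.set(columns.size() - 1, c); // reconcile (last write wins here)
        else
            throw new IllegalArgumentException("columns must arrive in sorted order");
    }

    int size() { return columns.size(); }
    String valueOf(int i) { return columns.get(i).value; }
}
```

A real implementation would reconcile via the column's own timestamp logic rather than blindly replacing; the point is only that sorted input makes the map unnecessary.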
[jira] [Resolved] (CASSANDRA-2931) Nodetool ring prints the same token regardless of node queried
[ https://issues.apache.org/jira/browse/CASSANDRA-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-2931. - Resolution: Not A Problem This is not what the first token means. The first is always the biggest assigned token. The fact that it is displayed at the top is an artistic rendering supposed to explain that we have a ring. I.e., the first and last printed token are the same, suggesting some kind of continuity.
[jira] [Commented] (CASSANDRA-2825) Auto bootstrapping the 4th node in a 4 node cluster doesn't work, when no token explicitly assigned in config.
[ https://issues.apache.org/jira/browse/CASSANDRA-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068974#comment-13068974 ] Brandon Williams commented on CASSANDRA-2825: - Impressive, I have no idea how this is breaking testTokenRoundtrip():
{noformat}
public void testTokenRoundtrip() throws Exception
{
    StorageService.instance.initServer();
    // fetch a bootstrap token from the local node
    assert BootStrapper.getBootstrapTokenFrom(FBUtilities.getLocalAddress()) != null;
}
{noformat}
The log just shows a bunch of attempts to connect to the seed (127.0.0.2) which hasn't started yet. Auto bootstrapping the 4th node in a 4 node cluster doesn't work, when no token explicitly assigned in config. -- Key: CASSANDRA-2825 URL: https://issues.apache.org/jira/browse/CASSANDRA-2825 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.0, 0.8.1 Reporter: Michael Allen Assignee: Brandon Williams Fix For: 0.8.2 Attachments: 2825-v2.txt, 2825.txt This was done in sequence: A, B, C, and D. Node A with token 0 explicitly set in config, the rest with auto_bootstrap: true and no token explicitly assigned. B and C work as expected. D ends up stealing C's token. from system.log on C:
INFO [GossipStage:1] 2011-06-24 16:40:41,947 Gossiper.java (line 638) Node /10.171.47.226 is now part of the cluster
INFO [GossipStage:1] 2011-06-24 16:40:41,947 Gossiper.java (line 606) InetAddress /10.171.47.226 is now UP
INFO [GossipStage:1] 2011-06-24 16:42:09,432 StorageService.java (line 769) Nodes /10.171.47.226 and /10.171.55.77 have the same token 61078635599166706937511052402724559481. /10.171.47.226 is the new owner
WARN [GossipStage:1] 2011-06-24 16:42:09,432 TokenMetadata.java (line 120) Token 61078635599166706937511052402724559481 changing ownership from /10.171.55.77 to /10.171.47.226
[Cassandra Wiki] Update of NodeTool by DavidAllsopp
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The NodeTool page has been changed by DavidAllsopp: http://wiki.apache.org/cassandra/NodeTool?action=diff&rev1=15&rev2=16
10.176.1.162 Up 511.34 MB 63538518574533451921556363897953848387 |--| }}}
+ The format is a little different for later versions - this is from v0.7.6:
+
+ {{{
+ Address         Status State   Load       Owns    Token
+                                                    113427455640312821154458202477256070484
+ 10.176.0.146    Up     Normal  459.27 MB  33.33%  0
+ 10.176.1.161    Up     Normal  382.53 MB  33.33%  56713727820156410577229101238628035242
+ 10.176.1.162    Up     Normal  511.34 MB  33.33%  113427455640312821154458202477256070484
+ }}}
+
+ The `Owns` column indicates the percentage of the ring (keyspace) handled by that node.
+
+ The largest token is repeated at the top of the list to indicate that we have a ring, i.e. the first and last printed token are the same, suggesting some kind of continuity.
+
== Info == Outputs node information including the token, load info (on disk storage), generation number (times started), uptime in seconds, and heap memory usage.
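The "ring closes on itself" rendering described in the wiki text can be illustrated with a toy printer (RingPrinter and its BigInteger token list are hypothetical, not nodetool's real code): the sorted token list is the same whichever node you query, and the largest token is emitted once as a header before the list wraps around from the smallest.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative sketch of why the same token heads every node's ring output.
class RingPrinter {
    static List<String> render(List<BigInteger> tokens) {
        List<BigInteger> sorted = new ArrayList<>(tokens);
        Collections.sort(sorted);
        List<String> lines = new ArrayList<>();
        // header row ends with the largest token, visually "closing" the ring
        lines.add(sorted.get(sorted.size() - 1).toString());
        for (BigInteger t : sorted)
            lines.add(t.toString());
        return lines;
    }
}
```

Since the output depends only on the cluster-wide sorted ring, it is identical no matter which node `-h` points at, which is exactly the behaviour reported in CASSANDRA-2931.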
[jira] [Commented] (CASSANDRA-2931) Nodetool ring prints the same token regardless of node queried
[ https://issues.apache.org/jira/browse/CASSANDRA-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068980#comment-13068980 ] David Allsopp commented on CASSANDRA-2931: -- Thanks - I have edited http://wiki.apache.org/cassandra/NodeTool to spell this out, as most of the documentation I've seen uses the older format (with the ASCII ring arrows on the right).
svn commit: r1149176 - in /cassandra/branches/cassandra-0.7: CHANGES.txt NEWS.txt build.xml debian/changelog
Author: slebresne Date: Thu Jul 21 13:53:13 2011 New Revision: 1149176 URL: http://svn.apache.org/viewvc?rev=1149176&view=rev Log: Updates for 0.7.8 release (changelog, news, version number) Modified: cassandra/branches/cassandra-0.7/CHANGES.txt cassandra/branches/cassandra-0.7/NEWS.txt cassandra/branches/cassandra-0.7/build.xml cassandra/branches/cassandra-0.7/debian/changelog

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1149176&r1=1149175&r2=1149176&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Thu Jul 21 13:53:13 2011
@@ -5,6 +5,11 @@
  * avoid including inferred types in CF update (CASSANDRA-2809)
  * fix re-using index CF sstable names after drop/recreate (CASSANDRA-2872)
  * fix hint replay (CASSANDRA-2928)
+ * don't accept extra args for 0-arg nodetool commands (CASSANDRA-2740)
+ * allows using cli functions in cli del statement (CASSANDRA-2821)
+ * allows quoted classes in CLI (CASSANDRA-2899)
+ * log unavailableexception details at debug level (CASSANDRA-2856)
+ * expose data_dir though jmx (CASSANDRA-2770)
 0.7.7

Modified: cassandra/branches/cassandra-0.7/NEWS.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/NEWS.txt?rev=1149176&r1=1149175&r2=1149176&view=diff
==
--- cassandra/branches/cassandra-0.7/NEWS.txt (original)
+++ cassandra/branches/cassandra-0.7/NEWS.txt Thu Jul 21 13:53:13 2011
@@ -1,3 +1,12 @@
+0.7.8
+=
+
+Upgrading
+-
+- Nothing specific to 0.7.8, but see 0.7.3 Upgrading if upgrading
+  from earlier than 0.7.1.
+
 0.7.7
 =

Modified: cassandra/branches/cassandra-0.7/build.xml
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/build.xml?rev=1149176&r1=1149175&r2=1149176&view=diff
==
--- cassandra/branches/cassandra-0.7/build.xml (original)
+++ cassandra/branches/cassandra-0.7/build.xml Thu Jul 21 13:53:13 2011
@@ -24,7 +24,7 @@
 <property name="debuglevel" value="source,lines,vars"/>
 <!-- default version and SCM information (we need the default SCM info as people may checkout with git-svn) -->
-<property name="base.version" value="0.7.7"/>
+<property name="base.version" value="0.7.8"/>
 <property name="scm.default.path" value="cassandra/branches/cassandra-0.7/"/>
 <property name="scm.default.connection" value="scm:svn:http://svn.apache.org/repos/asf/${scm.default.path}"/>
 <property name="scm.default.developerConnection" value="scm:svn:https://svn.apache.org/repos/asf/${scm.default.path}"/>

Modified: cassandra/branches/cassandra-0.7/debian/changelog
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/debian/changelog?rev=1149176&r1=1149175&r2=1149176&view=diff
==
--- cassandra/branches/cassandra-0.7/debian/changelog (original)
+++ cassandra/branches/cassandra-0.7/debian/changelog Thu Jul 21 13:53:13 2011
@@ -1,3 +1,9 @@
+cassandra (0.7.8) unstable; urgency=low
+
+  * New stable point release
+
+ -- Sylvain Lebresne <slebre...@apache.org>  Thu, 21 Jul 2011 15:51:51 +0200
+
 cassandra (0.7.7) unstable; urgency=low
   * New stable point release
[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty
[ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069011#comment-13069011 ] Jonathan Ellis commented on CASSANDRA-2829: --- bq. I also think there is a theoretical risk of a race condition with access to the segments Deque. The iterator runs in the postFlushExecutor; discardCompletedSegments actually does the real work in a task on the CL executor. Unless that's not what you're thinking of, I think we're ok here. bq. It changes CFS.forceFlush() to always flush and trusts maybeSwitchMemtable() will only flush non-clean CF's Hmm. Interesting. Part of me thinks it can't be that simple but I don't see a problem with it. :) Sylvain? memtable with no post-flush activity can leave commitlog permanently dirty --- Key: CASSANDRA-2829 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829 Project: Cassandra Issue Type: Bug Components: Core Reporter: Aaron Morton Assignee: Jonathan Ellis Fix For: 0.8.2 Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch Only dirty Memtables are flushed, and so only dirty memtables are used to discard obsolete commit log segments. This can result in log segments not being deleted even though the data has been flushed. Was using a 3-node 0.7.6-2 AWS cluster (DataStax AMIs) with pre-0.7 data loaded and a running application working against the cluster. Did a rolling restart and then kicked off a repair; one node filled up the commit log volume with 7GB+ of log data - there was about 20 hours of log files.
{noformat}
$ sudo ls -lah commitlog/
total 6.9G
drwx------ 2 cassandra cassandra 12K 2011-06-24 20:38 .
drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
-rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
-rw------- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308876643288.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
-rw-r--r-- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308877711517.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
-rw-r--r-- 1 cassandra cassandra 28 2011-06-24 20:47 CommitLog-1308879395824.log.header
...
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
-rw-r--r-- 1 cassandra cassandra 36 2011-06-24 20:47 CommitLog-1308946745380.log.header
-rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
-rw-r--r-- 1 cassandra cassandra 44 2011-06-24 20:47 CommitLog-1308947888397.log.header
{noformat}
The user KS has 2 CF's with 60 minute flush times. The system KS had the default settings, which is 24 hours. Will create another ticket to see if these can be reduced or if it's something users should do; in this case it would not have mattered. I grabbed the log headers and used the tool in CASSANDRA-2828, and most of the segments had the system CF's marked as dirty.
{noformat}
$ bin/logtool dirty /tmp/logs/commitlog/
Not connected to a server, Keyspace and Column Family names are not available.
/tmp/logs/commitlog/CommitLog-1308876643288.log.header
Keyspace Unknown: Cf id 0: 444
/tmp/logs/commitlog/CommitLog-1308877711517.log.header
Keyspace Unknown: Cf id 1: 68848763
...
/tmp/logs/commitlog/CommitLog-1308944451460.log.header
Keyspace Unknown: Cf id 1: 61074
/tmp/logs/commitlog/CommitLog-1308945597471.log.header
Keyspace Unknown: Cf id 1000: 43175492 Cf id 1: 108483
/tmp/logs/commitlog/CommitLog-1308946745380.log.header
Keyspace Unknown: Cf id 1000: 239223 Cf id 1: 172211
/tmp/logs/commitlog/CommitLog-1308947888397.log.header
Keyspace Unknown: Cf id 1001: 57595560 Cf id 1: 816960 Cf id 1000: 0
{noformat}
CF 0 is the Status / LocationInfo CF and 1 is the HintedHandoff CF. I don't have it now, but IIRC CFStats showed the LocationInfo CF with dirty ops. I was able to repro a case where flushing the CF's did not mark the log segments as obsolete (attached unit-test patch). Steps are:
1. Write to cf1 and flush.
2. The current log segment is marked as dirty at the CL position when the flush started, in CommitLog.discardCompletedSegmentsInternal().
3. Do not write to cf1 again.
4. Roll the log (my test does this manually).
5. Write to CF2 and flush.
6. Only CF2 is flushed because it is the only dirty CF. cfs.maybeSwitchMemtable() is not called for cf1, and so log segment 1 is still marked as dirty from cf1.
Step 5 is not essential, just matched what I thought was happening. I thought SystemTable.updateToken() was called
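The six repro steps above can be condensed into a toy model (all names here are illustrative, not Cassandra's): a segment remembers which CFs dirtied it, and only a flush of a still-dirty memtable clears the mark, so a CF that never writes again leaves its old segment dirty forever.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of the bug: flushing a CF whose memtable is clean is a no-op,
// so the segment's dirty mark for that CF is never cleared.
class LogSegmentModel {
    private final Set<String> dirtyCfs = new HashSet<>();

    void write(String cf) { dirtyCfs.add(cf); }

    // flush only runs for dirty memtables; hasMemtableData models
    // "this CF has pending writes in the current memtable"
    void flush(String cf, boolean hasMemtableData) {
        if (hasMemtableData)
            dirtyCfs.remove(cf);
    }

    // a segment can only be deleted once no CF marks it dirty
    boolean deletable() { return dirtyCfs.isEmpty(); }
}
```

In the real system the marking happens per segment at the CL position captured when the flush starts, but the failure mode is the same: step 3 (no further writes to cf1) means no later flush of cf1 ever runs, so `deletable()` stays false.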
[Cassandra Wiki] Update of NodeTool by DavidAllsopp
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The NodeTool page has been changed by DavidAllsopp: http://wiki.apache.org/cassandra/NodeTool?action=diff&rev1=16&rev2=17 == Flush == Flushes memtables (in memory) to SSTables (on disk), which also enables CommitLog segments to be deleted. + == Removetoken== + Removes a dead node from the ring - this command is issued to any other live node (since clearly the dead node cannot respond!). + == Scrub == Cassandra v0.7.1 and v0.7.2 shipped with a bug that caused incorrect row-level bloom filters to be generated when compacting sstables generated with earlier versions. This would manifest in IOExceptions during column name-based queries. v0.7.3 provides nodetool scrub to rebuild sstables with correct bloom filters, with no data lost. (If your cluster was never on 0.7.0 or earlier, you don't have to worry about this.) Note that nodetool scrub will snapshot your data files before rebuilding, just in case.
[Cassandra Wiki] Trivial Update of NodeTool by DavidAllsopp
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The NodeTool page has been changed by DavidAllsopp: http://wiki.apache.org/cassandra/NodeTool?action=diff&rev1=17&rev2=18 Comment: Fixed broken heading == Flush == Flushes memtables (in memory) to SSTables (on disk), which also enables CommitLog segments to be deleted. - == Removetoken== + == Removetoken == Removes a dead node from the ring - this command is issued to any other live node (since clearly the dead node cannot respond!). == Scrub ==
[Cassandra Wiki] Trivial Update of MultinodeCluster by DavidAllsopp
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The MultinodeCluster page has been changed by DavidAllsopp: http://wiki.apache.org/cassandra/MultinodeCluster?action=diff&rev1=8&rev2=9 Comment: Added extra detail about using netstat to verify listen address }}} - Once these changes are made, simply restart cassandra on this node. Use netstat to verify cassandra is listening on the right address. Look for a line like this: + Once these changes are made, simply restart cassandra on this node. Use netstat (e.g. `netstat -ant | grep 7000`) to verify cassandra is listening on the right address. Look for a line like this: {{{tcp4 0 0 192.168.1.1.7000 *.* LISTEN}}}
svn commit: r1149217 - /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java
Author: slebresne Date: Thu Jul 21 15:14:36 2011 New Revision: 1149217 URL: http://svn.apache.org/viewvc?rev=1149217&view=rev Log: Reverting #2825 until BootStrapper unit test is fixed Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java?rev=1149217&r1=1149216&r2=1149217&view=diff
==
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageService.java Thu Jul 21 15:14:36 2011
@@ -1726,8 +1726,6 @@ public class StorageService implements I
         List<DecoratedKey> keys = new ArrayList<DecoratedKey>();
         for (ColumnFamilyStore cfs : ColumnFamilyStore.all())
         {
-            if (cfs.table.name.equals(Table.SYSTEM_TABLE))
-                continue;
             for (DecoratedKey key : cfs.allKeySamples())
             {
                 if (range.contains(key.token))
@@ -1736,19 +1734,9 @@ public class StorageService implements I
         }
         FBUtilities.sortSampledKeys(keys, range);
-        Token token;
-        if (keys.size() < 3)
-        {
-            token = partitioner.midpoint(range.left, range.right);
-            logger_.debug("Used midpoint to assign token " + token);
-        }
-        else
-        {
-            token = keys.get(keys.size() / 2).token;
-            logger_.debug("Used key sample of size " + keys.size() + " to assign token " + token);
-        }
-        if (tokenMetadata_.isMember(tokenMetadata_.getEndpoint(token)))
-            throw new RuntimeException("Chose token " + token + " which is already in use by " + tokenMetadata_.getEndpoint(token) + " -- specify one manually with initial_token");
+        Token token = keys.size() < 3
+                      ? partitioner.midpoint(range.left, range.right)
+                      : keys.get(keys.size() / 2).token;
         // Hack to prevent giving nodes tokens with DELIMITER_STR in them (which is fine in a row key/token)
         if (token instanceof StringToken)
         {
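The token heuristic being reverted above, stripped to its core: with too few sampled keys, split the range at its midpoint; otherwise take the median sampled token. The class below is an illustrative stand-in working on BigInteger instead of Cassandra's Token/Range types.

```java
import java.math.BigInteger;
import java.util.Collections;
import java.util.List;

// Sketch of the bootstrap-token choice (simplified; real code also checks
// for collisions with tokens already owned by cluster members).
class BootstrapTokenChooser {
    static BigInteger choose(BigInteger left, BigInteger right, List<BigInteger> sampled) {
        if (sampled.size() < 3)
            return left.add(right).divide(BigInteger.valueOf(2)); // range midpoint
        Collections.sort(sampled);
        return sampled.get(sampled.size() / 2);                   // median sample
    }
}
```

The median-of-samples branch is what aims to split the data (rather than the token space) evenly; the midpoint branch is the fallback when there are not enough samples to trust.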
[jira] [Commented] (CASSANDRA-2825) Auto bootstrapping the 4th node in a 4 node cluster doesn't work, when no token explicitly assigned in config.
[ https://issues.apache.org/jira/browse/CASSANDRA-2825?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069017#comment-13069017 ] Sylvain Lebresne commented on CASSANDRA-2825: - I've reverted the patch too so we can do a release of 0.8.2 without having to wait on the unit test fix.
[jira] [Created] (CASSANDRA-2932) Implement assume in cqlsh
Implement assume in cqlsh --- Key: CASSANDRA-2932 URL: https://issues.apache.org/jira/browse/CASSANDRA-2932 Project: Cassandra Issue Type: Improvement Reporter: Jeremy Hanna Priority: Minor In the CLI there is a handy way to assume validators. It would be very nice to have the assume command in cqlsh as well.
[jira] [Updated] (CASSANDRA-2932) Implement assume in cqlsh
[ https://issues.apache.org/jira/browse/CASSANDRA-2932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-2932: --- Description: In the CLI there is a handy way to assume CF comparators/validators (CASSANDRA-1693). It would be very nice to have the assume command in cqlsh as well. (was: In the CLI there is a handy way to assume validators. It would be very nice to have the assume command in cqlsh as well.) Implement assume in cqlsh --- Key: CASSANDRA-2932 URL: https://issues.apache.org/jira/browse/CASSANDRA-2932 Project: Cassandra Issue Type: Improvement Reporter: Jeremy Hanna Priority: Minor Labels: lhf In the CLI there is a handy way to assume CF comparators/validators (CASSANDRA-1693). It would be very nice to have the assume command in cqlsh as well.
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069029#comment-13069029 ] Vijay commented on CASSANDRA-957: - Seems like CASSANDRA-2928 fixes the hints issue... so we can ignore 0003 in this ticket. convenience workflow for replacing dead node Key: CASSANDRA-957 URL: https://issues.apache.org/jira/browse/CASSANDRA-957 Project: Cassandra Issue Type: Wish Components: Core, Tools Affects Versions: 0.8.2 Reporter: Jonathan Ellis Assignee: Vijay Fix For: 1.0 Attachments: 0001-Support-Token-Replace.patch, 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 0001-Support-token-replace.patch, 0002-Do-not-include-local-node-when-computing-workMap.patch, 0002-Rework-Hints-to-be-on-token.patch, 0002-Rework-Hints-to-be-on-token.patch, 0003-Make-HintedHandoff-More-reliable.patch, 0003-Make-hints-More-reliable.patch Original Estimate: 24h Remaining Estimate: 24h Replacing a dead node with a new one is a common operation, but nodetool removetoken followed by bootstrap is inefficient (re-replicating data first to the remaining nodes, then to the new one), and manually bootstrapping to a token just less than the old one's, followed by nodetool removetoken, is slightly painful and prone to manual errors. First question: how would you expose this in our tool ecosystem? It needs to be a startup-time option to the new node, so it can't be nodetool, and messing with the config xml definitely takes the convenience out. A one-off -DreplaceToken=XXY argument?
svn commit: r1149235 - in /cassandra/branches/cassandra-0.8: CHANGES.txt NEWS.txt build.xml debian/changelog
Author: slebresne Date: Thu Jul 21 15:47:58 2011 New Revision: 1149235 URL: http://svn.apache.org/viewvc?rev=1149235&view=rev Log: Updates for 0.8.2 release (changelog, news, version number) Modified: cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/NEWS.txt cassandra/branches/cassandra-0.8/build.xml cassandra/branches/cassandra-0.8/debian/changelog

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1149235&r1=1149234&r2=1149235&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Thu Jul 21 15:47:58 2011
@@ -39,6 +39,13 @@
  * prepend CF to default index names (CASSANDRA-2903)
  * fix hint replay (CASSANDRA-2928)
  * Properly synchronize merkle tree computation (CASSANDRA-2816)
+ * escape quotes in sstable2json (CASSANDRA-2780)
+ * allows using cli functions in cli del statement (CASSANDRA-2821)
+ * allows quoted classes in CLI (CASSANDRA-2899)
+ * expose data_dir though jmx (CASSANDRA-2770)
+ * proper support for validation and functions for cli count statement
+   (CASSANDRA-1902)
+ * debian package now depend on libjna-java (CASSANDRA-2803)
 0.8.1

Modified: cassandra/branches/cassandra-0.8/NEWS.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/NEWS.txt?rev=1149235&r1=1149234&r2=1149235&view=diff
==
--- cassandra/branches/cassandra-0.8/NEWS.txt (original)
+++ cassandra/branches/cassandra-0.8/NEWS.txt Thu Jul 21 15:47:58 2011
@@ -11,6 +11,16 @@ Upgrading
    if replicate_on_write was uncorrectly set to false (before or after
    upgrade).
+Tools
+-
+- Add new simplified classes to write sstables (to complement the bulk
+  loading utility).
+
+Other
+-
+- This release fix a regression of 0.8.1 that made hinted handoff being
+  never delivered. Upgrade from 0.8.1 is thus highly encourage.
+
 0.8.1
 =

Modified: cassandra/branches/cassandra-0.8/build.xml
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/build.xml?rev=1149235&r1=1149234&r2=1149235&view=diff
==
--- cassandra/branches/cassandra-0.8/build.xml (original)
+++ cassandra/branches/cassandra-0.8/build.xml Thu Jul 21 15:47:58 2011
@@ -25,7 +25,7 @@
 <property name="debuglevel" value="source,lines,vars"/>
 <!-- default version and SCM information (we need the default SCM info as people may checkout with git-svn) -->
-<property name="base.version" value="0.8.2-dev"/>
+<property name="base.version" value="0.8.2"/>
 <property name="scm.default.path" value="cassandra/branches/cassandra-0.8/"/>
 <property name="scm.default.connection" value="scm:svn:http://svn.apache.org/repos/asf/${scm.default.path}"/>
 <property name="scm.default.developerConnection" value="scm:svn:https://svn.apache.org/repos/asf/${scm.default.path}"/>

Modified: cassandra/branches/cassandra-0.8/debian/changelog
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/debian/changelog?rev=1149235&r1=1149234&r2=1149235&view=diff
==
--- cassandra/branches/cassandra-0.8/debian/changelog (original)
+++ cassandra/branches/cassandra-0.8/debian/changelog Thu Jul 21 15:47:58 2011
@@ -1,3 +1,9 @@
+cassandra (0.8.2) unstable; urgency=low
+
+  * New release
+
+ -- Sylvain Lebresne <slebre...@apache.org>  Thu, 21 Jul 2011 17:45:19 +0200
+
 cassandra (0.8.1) unstable; urgency=low
   * New release
[jira] [Created] (CASSANDRA-2933) nodetool hangs (doesn't return prompt) if you specify a table that doesn't exist or a KS that has no CF's
nodetool hangs (doesn't return prompt) if you specify a table that doesn't exist or a KS that has no CF's - Key: CASSANDRA-2933 URL: https://issues.apache.org/jira/browse/CASSANDRA-2933 Project: Cassandra Issue Type: Bug Reporter: Cathy Daw Priority: Minor Invalid CF
{code}
ERROR 02:18:18,904 Fatal exception in thread Thread[AntiEntropyStage:3,5,main]
java.lang.IllegalArgumentException: Unknown table/cf pair (StressKeyspace.StressStandard)
    at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:147)
    at org.apache.cassandra.service.AntiEntropyService$TreeRequestVerbHandler.doVerb(AntiEntropyService.java:601)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
{code}
Empty KS
{code}
INFO 02:19:21,483 Waiting for repair requests: []
INFO 02:19:21,484 Waiting for repair requests: []
INFO 02:19:21,484 Waiting for repair requests: []
{code}
[jira] [Updated] (CASSANDRA-2933) nodetool hangs (doesn't return prompt) if you specify a table that doesn't exist or a KS that has no CF's
[ https://issues.apache.org/jira/browse/CASSANDRA-2933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2933: -- Component/s: Tools Labels: lhf (was: ) nodetool hangs (doesn't return prompt) if you specify a table that doesn't exist or a KS that has no CF's - Key: CASSANDRA-2933 URL: https://issues.apache.org/jira/browse/CASSANDRA-2933 Project: Cassandra Issue Type: Bug Components: Tools Reporter: Cathy Daw Priority: Minor Labels: lhf Invalid CF {code} ERROR 02:18:18,904 Fatal exception in thread Thread[AntiEntropyStage:3,5,main] java.lang.IllegalArgumentException: Unknown table/cf pair (StressKeyspace.StressStandard) at org.apache.cassandra.db.Table.getColumnFamilyStore(Table.java:147) at org.apache.cassandra.service.AntiEntropyService$TreeRequestVerbHandler.doVerb(AntiEntropyService.java:601) at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) {code} Empty KS {code} INFO 02:19:21,483 Waiting for repair requests: [] INFO 02:19:21,484 Waiting for repair requests: [] INFO 02:19:21,484 Waiting for repair requests: [] {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2829) memtable with no post-flush activity can leave commitlog permanently dirty
[ https://issues.apache.org/jira/browse/CASSANDRA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069070#comment-13069070 ] Sylvain Lebresne commented on CASSANDRA-2829: - I think this kind of works, in that we won't keep commit logs forever, but it still keeps commit logs for much longer than necessary because: # it relies on forceFlush being called, which unless client triggered will only happen after the memtable expires, and quite a bunch of commit logs could pile up during that time. Quite potentially enough to be a problem (if the commit log fills up your hard drive, it doesn't matter much that it would have been deleted in 5 hours). I think we can do much better with not too much effort. # when we do flush the expired memtable, we'll call maybeSwitchMemtable() on potentially clean memtables. This doesn't sound like a good use of resources: we'll grab the write lock, create a latch, create a new memtable, increment the memtable switch number, and push an almost no-op job on the flush executor. I think we should fix the real problem. The problem is that when we discard segments, we always keep the current segment dirty because we don't know if there was some write since we grabbed the context. Let's add that information and fix that. This would make commit logs get deleted much more quickly, even if we don't consider the corner case of a column family that suddenly has no writes anymore, because CFs like the system ones, which have very low update volume, can retain the logs longer than is really needed. As for the fix, because the CL executor is single-threaded, this is fairly easy: let's have an in-memory map of cfId -> lastPositionWritten, and compare that to the context position in discardCompletedSegmentInternal (we could probably even just use a set of cfIds, which would mean: dirty since last getContext). 
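The single-threaded bookkeeping suggested in this comment can be sketched as follows. DirtyTracker and its method names are hypothetical illustrations, not actual Cassandra code; the point is only that, with one executor thread, a plain set of CF ids written since the last getContext() is enough to decide whether the current segment is really dirty.

```java
import java.util.HashSet;
import java.util.Set;

// Toy sketch of "a set of cfIds meaning: dirty since last getContext".
// Safe without synchronization only because the commit log executor that
// calls these methods is single-threaded.
class DirtyTracker {
    private final Set<Integer> dirtySinceLastContext = new HashSet<>();

    // called whenever a mutation for the given CF is appended to the segment
    void recordWrite(int cfId) {
        dirtySinceLastContext.add(cfId);
    }

    // called from getContext(): snapshot the dirty set and start a fresh one
    Set<Integer> snapshotAndClear() {
        Set<Integer> snapshot = new HashSet<>(dirtySinceLastContext);
        dirtySinceLastContext.clear();
        return snapshot;
    }

    // discardCompletedSegmentInternal would consult this to decide whether
    // the current segment must stay marked dirty for this CF
    boolean isDirty(int cfId) {
        return dirtySinceLastContext.contains(cfId);
    }
}
```

With this in place, a CF with no writes since the last context grab no longer forces the current segment to be retained on its behalf.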
memtable with no post-flush activity can leave commitlog permanently dirty --- Key: CASSANDRA-2829 URL: https://issues.apache.org/jira/browse/CASSANDRA-2829 Project: Cassandra Issue Type: Bug Components: Core Reporter: Aaron Morton Assignee: Jonathan Ellis Fix For: 0.8.3 Attachments: 0001-2829-unit-test-v08.patch, 0001-2829-unit-test.patch, 0002-2829-v08.patch, 0002-2829.patch Only dirty Memtables are flushed, and so only dirty memtables are used to discard obsolete commit log segments. This can result in log segments not being deleted even though the data has been flushed. Was using a 3 node 0.7.6-2 AWS cluster (DataStax AMIs) with pre-0.7 data loaded and a running application working against the cluster. Did a rolling restart and then kicked off a repair; one node filled up the commit log volume with 7GB+ of log data, about 20 hours of log files.
{noformat}
$ sudo ls -lah commitlog/
total 6.9G
drwx------ 2 cassandra cassandra  12K 2011-06-24 20:38 .
drwxr-xr-x 3 cassandra cassandra 4.0K 2011-06-25 01:47 ..
-rw------- 1 cassandra cassandra 129M 2011-06-24 01:08 CommitLog-1308876643288.log
-rw------- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308876643288.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 01:36 CommitLog-1308877711517.log
-rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308877711517.log.header
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 02:20 CommitLog-1308879395824.log
-rw-r--r-- 1 cassandra cassandra   28 2011-06-24 20:47 CommitLog-1308879395824.log.header
...
-rw-r--r-- 1 cassandra cassandra 129M 2011-06-24 20:38 CommitLog-1308946745380.log
-rw-r--r-- 1 cassandra cassandra   36 2011-06-24 20:47 CommitLog-1308946745380.log.header
-rw-r--r-- 1 cassandra cassandra 112M 2011-06-24 20:54 CommitLog-1308947888397.log
-rw-r--r-- 1 cassandra cassandra   44 2011-06-24 20:47 CommitLog-1308947888397.log.header
{noformat}
The user KS has 2 CFs with 60-minute flush times. The system KS had the default settings, which is 24 hours. 
Will create another ticket to see if these can be reduced, or if it's something users should do; in this case it would not have mattered. I grabbed the log headers and used the tool in CASSANDRA-2828, and most of the segments had the system CFs marked as dirty.
{noformat}
$ bin/logtool dirty /tmp/logs/commitlog/
Not connected to a server, Keyspace and Column Family names are not available.
/tmp/logs/commitlog/CommitLog-1308876643288.log.header
Keyspace Unknown: Cf id 0: 444
/tmp/logs/commitlog/CommitLog-1308877711517.log.header
Keyspace Unknown: Cf id 1: 68848763
...
/tmp/logs/commitlog/CommitLog-1308944451460.log.header
Keyspace Unknown: Cf id 1: 61074
/tmp/logs/commitlog/CommitLog-1308945597471.log.header
Keyspace Unknown: Cf id 1000:
[jira] [Commented] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069125#comment-13069125 ] Yang Yang commented on CASSANDRA-2843: -- bq. I did some performance testing using Aaron's script here: http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/ and overall in the 95th percentile there was an approximate 10% gain across the board. I looked at Aaron's script; it actually returns 100 columns on each get. Since the column count filtering happens in the memtable iterator *before* the collating iterator and ColumnMap.add(), the advantage of this patch is not fully shown (only 10%). I added a simple test case to the script to return all columns; in the 10,000-column case, the time reduction is about 50%. I'm still running the full test, will upload the data later. better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch currently if a row contains 1000+ columns, the read becomes considerably slow (my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, takes about 16ms; this is all running in memory, no disk read is involved). 
through debugging we can see that most of this time is spent in:
[Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
[Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
[Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. this structure is slow for two reasons: it needs to do synchronization, and it needs to maintain the more complex structure of a map. but if we look at the whole read path, thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. on the synchronization side, since the returned CF is never going to be shared/modified by other threads, we know the access is always single-threaded, so no synchronization is needed. but these 2 features are indeed needed for ColumnFamily in other cases, particularly writes. so we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, which is much cheaper. the provided patch is for demonstration now; will work further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. the main work is to let the FastColumnFamily use an array for internal storage. 
at first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have the invariant that the inserted columns come in sorted order (I still have an issue to resolve between descending and ascending now, but ascending works). so the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if they are equal, reconcile. slight temporary hacks are made on getTopLevelColumnFamily so we have 2 flavors of the method, one accepting a returnCF. but we could definitely think about what is the better way to provide this returnCF. this patch compiles fine; no tests are provided yet. but I tested it in my application, and the performance improvement is dramatic: it offers about 50% reduction in read time in the 3000-column case. thanks Yang -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
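The append-or-reconcile logic described in this message can be sketched roughly as follows. FastColumnFamilySketch and Column here are illustrative stand-ins, not the patch's actual classes, and reconcile is simplified to last-writer-wins on timestamp; the essential point is that sorted arrival turns each insert into an O(1) comparison against the last array element instead of a skip-list insertion.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of a column: a sortable name, a value, and a timestamp.
class Column {
    final String name;
    final String value;
    final long timestamp;
    Column(String name, String value, long timestamp) {
        this.name = name; this.value = value; this.timestamp = timestamp;
    }
}

// Array-backed container relying on the invariant that columns arrive
// in ascending name order.
class FastColumnFamilySketch {
    private final List<Column> columns = new ArrayList<>();

    void addColumn(Column c) {
        if (!columns.isEmpty()) {
            Column last = columns.get(columns.size() - 1);
            int cmp = last.name.compareTo(c.name);
            if (cmp == 0) {
                // same name: reconcile, here by keeping the newer timestamp
                if (c.timestamp > last.timestamp)
                    columns.set(columns.size() - 1, c);
                return;
            }
            assert cmp < 0 : "columns must arrive in ascending order";
        }
        columns.add(c); // strictly greater name: plain append
    }

    int size() { return columns.size(); }
    Column get(int i) { return columns.get(i); }
}
```

No locking and no tree rebalancing happen on this path, which is why a single-threaded read that only ever appends can be so much cheaper than ConcurrentSkipListMap insertion.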
[jira] [Commented] (CASSANDRA-1405) Switch to THsHaServer, redux
[ https://issues.apache.org/jira/browse/CASSANDRA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069137#comment-13069137 ] Brandon Williams commented on CASSANDRA-1405: - I see, that's kind of annoying. I'll change the log4j level to ERROR on commit. One last thing is that the rpc_type should be validated; instead, an invalid type produces:
{noformat}
 INFO 14:02:46,359 Listening for thrift clients...
ERROR 14:02:46,360 Fatal exception in thread Thread[Thread-3,5,main]
java.lang.NullPointerException
    at org.apache.cassandra.thrift.CassandraDaemon$ThriftServer.run(CassandraDaemon.java:192)
{noformat}
Switch to THsHaServer, redux Key: CASSANDRA-1405 URL: https://issues.apache.org/jira/browse/CASSANDRA-1405 Project: Cassandra Issue Type: Improvement Components: API Reporter: Jonathan Ellis Assignee: Vijay Priority: Minor Fix For: 0.8.3 Attachments: 0001-log4j-config-change.patch, 1405-Thrift-Patch-SVN.patch, libthrift-r1026391.jar, trunk-1405.patch Brian's patch to CASSANDRA-876 suggested using a custom TProcessorFactory subclass, overriding getProcessor to reset to a default state when a new client connects. It looks like this would allow dropping CustomTThreadPoolServer as well as allowing non-thread based servers. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1405) Switch to THsHaServer, redux
[ https://issues.apache.org/jira/browse/CASSANDRA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069145#comment-13069145 ] Jonathan Ellis commented on CASSANDRA-1405: --- bq. It is logged by thrift internally so we don't have much control over that Let's submit a Thrift patch to fix that then. It's easy to get a Java-only patch reviewed. In the meantime I'm ok w/ turning org.apache.thrift log4j levels down to debug. Switch to THsHaServer, redux Key: CASSANDRA-1405 URL: https://issues.apache.org/jira/browse/CASSANDRA-1405 Project: Cassandra Issue Type: Improvement Components: API Reporter: Jonathan Ellis Assignee: Vijay Priority: Minor Fix For: 0.8.3 Attachments: 0001-log4j-config-change.patch, 1405-Thrift-Patch-SVN.patch, libthrift-r1026391.jar, trunk-1405.patch Brian's patch to CASSANDRA-876 suggested using a custom TProcessorFactory subclass, overriding getProcessor to reset to a default state when a new client connects. It looks like this would allow dropping CustomTThreadPoolServer as well as allowing non-thread based servers. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1149332 - in /cassandra/branches/cassandra-0.8: src/java/org/apache/cassandra/thrift/CassandraServer.java test/system/test_thrift_server.py
Author: jbellis
Date: Thu Jul 21 19:33:24 2011
New Revision: 1149332

URL: http://svn.apache.org/viewvc?rev=1149332&view=rev
Log: fix test failures w/ index names

Modified:
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/CassandraServer.java
    cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/CassandraServer.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/CassandraServer.java?rev=1149332&r1=1149331&r2=1149332&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/CassandraServer.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/CassandraServer.java Thu Jul 21 19:33:24 2011
@@ -960,6 +960,7 @@ public class CassandraServer implements
         CFMetaData oldCfm = DatabaseDescriptor.getCFMetaData(CFMetaData.getId(cf_def.keyspace, cf_def.name));
         if (oldCfm == null)
             throw new InvalidRequestException("Could not find column family definition to modify.");
+        CFMetaData.addDefaultIndexNames(cf_def);
         ThriftValidation.validateCfDef(cf_def, oldCfm);
         validateSchemaAgreement();

Modified: cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py?rev=1149332&r1=1149331&r2=1149332&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py (original)
+++ cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py Thu Jul 21 19:33:24 2011
@@ -1415,13 +1415,13 @@ class TestMutations(ThriftTester):
         ks1 = client.describe_keyspace('Keyspace1')
         cfid = [x.id for x in ks1.cf_defs if x.name=='BlankCF'][0]
-        modified_cd = ColumnDef('birthdate', 'BytesType', IndexType.KEYS, 'birthdate_index')
+        modified_cd = ColumnDef('birthdate', 'BytesType', IndexType.KEYS, None)
         modified_cf = CfDef('Keyspace1', 'BlankCF', column_metadata=[modified_cd])
         modified_cf.id = cfid
         client.system_update_column_family(modified_cf)

         # Add a second indexed CF ...
-        birthdate_coldef = ColumnDef('birthdate', 'BytesType', IndexType.KEYS, 'birthdate2_index')
+        birthdate_coldef = ColumnDef('birthdate', 'BytesType', IndexType.KEYS, None)
         age_coldef = ColumnDef('age', 'BytesType', IndexType.KEYS, 'age_index')
         cfdef = CfDef('Keyspace1', 'BlankCF2', column_metadata=[birthdate_coldef, age_coldef])
         client.system_add_column_family(cfdef)
@@ -1472,7 +1472,7 @@ class TestMutations(ThriftTester):
         # add an index on 'birthdate'
         ks1 = client.describe_keyspace('Keyspace1')
         cfid = [x.id for x in ks1.cf_defs if x.name=='ToBeIndexed'][0]
-        modified_cd = ColumnDef('birthdate', 'BytesType', IndexType.KEYS, None)
+        modified_cd = ColumnDef('birthdate', 'BytesType', IndexType.KEYS, 'bd_index')
         modified_cf = CfDef('Keyspace1', 'ToBeIndexed', column_metadata=[modified_cd])
         modified_cf.id = cfid
         client.system_update_column_family(modified_cf)
[jira] [Created] (CASSANDRA-2934) log broken incoming connections at DEBUG
log broken incoming connections at DEBUG Key: CASSANDRA-2934 URL: https://issues.apache.org/jira/browse/CASSANDRA-2934 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Trivial Fix For: 0.8.2 Attachments: 2934.txt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2934) log broken incoming connections at DEBUG
[ https://issues.apache.org/jira/browse/CASSANDRA-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2934: -- Attachment: 2934.txt log broken incoming connections at DEBUG Key: CASSANDRA-2934 URL: https://issues.apache.org/jira/browse/CASSANDRA-2934 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Trivial Fix For: 0.8.2 Attachments: 2934.txt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1405) Switch to THsHaServer, redux
[ https://issues.apache.org/jira/browse/CASSANDRA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069168#comment-13069168 ] Brandon Williams commented on CASSANDRA-1405: - More serious benchmarks reveal that sync and hsha are even, and async is 50% slower. Switch to THsHaServer, redux Key: CASSANDRA-1405 URL: https://issues.apache.org/jira/browse/CASSANDRA-1405 Project: Cassandra Issue Type: Improvement Components: API Reporter: Jonathan Ellis Assignee: Vijay Priority: Minor Fix For: 0.8.3 Attachments: 0001-log4j-config-change.patch, 1405-Thrift-Patch-SVN.patch, libthrift-r1026391.jar, trunk-1405.patch Brian's patch to CASSANDRA-876 suggested using a custom TProcessorFactory subclass, overriding getProcessor to reset to a default state when a new client connects. It looks like this would allow dropping CustomTThreadPoolServer as well as allowing non-thread based servers. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1149341 - /cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
Author: brandonwilliams
Date: Thu Jul 21 20:10:54 2011
New Revision: 1149341

URL: http://svn.apache.org/viewvc?rev=1149341&view=rev
Log: Use a UDF-specific context signature.
Patch by Jeremy Hanna, reviewed by brandonwilliams for CASSANDRA-2869

Modified:
    cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java

Modified: cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java?rev=1149341&r1=1149340&r2=1149341&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java (original)
+++ cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java Thu Jul 21 20:10:54 2011
@@ -68,8 +68,6 @@ public class CassandraStorage extends Lo
     public final static String PIG_INITIAL_ADDRESS = "PIG_INITIAL_ADDRESS";
     public final static String PIG_PARTITIONER = "PIG_PARTITIONER";
 
-    private static String UDFCONTEXT_SCHEMA_KEY_PREFIX = "cassandra.schema";
-
     private final static ByteBuffer BOUND = ByteBufferUtil.EMPTY_BYTE_BUFFER;
     private static final Log logger = LogFactory.getLog(CassandraStorage.class);
@@ -78,6 +76,8 @@ public class CassandraStorage extends Lo
     private boolean slice_reverse = false;
     private String keyspace;
     private String column_family;
+    private String loadSignature;
+    private String storeSignature;
 
     private Configuration conf;
     private RecordReader reader;
@@ -112,7 +112,7 @@ public class CassandraStorage extends Lo
             if (!reader.nextKeyValue())
                 return null;
 
-            CfDef cfDef = getCfDef();
+            CfDef cfDef = getCfDef(loadSignature);
             ByteBuffer key = (ByteBuffer)reader.getCurrentKey();
             SortedMap<ByteBuffer,IColumn> cf = (SortedMap<ByteBuffer,IColumn>)reader.getCurrentValue();
             assert key != null && cf != null;
@@ -165,11 +165,11 @@ public class CassandraStorage extends Lo
         return pair;
     }
 
-    private CfDef getCfDef()
+    private CfDef getCfDef(String signature)
     {
         UDFContext context = UDFContext.getUDFContext();
         Properties property = context.getUDFProperties(CassandraStorage.class);
-        return cfdefFromString(property.getProperty(getSchemaContextKey()));
+        return cfdefFromString(property.getProperty(signature));
     }
 
     private List<AbstractType> getDefaultMarshallers(CfDef cfDef) throws IOException
@@ -289,7 +289,7 @@ public class CassandraStorage extends Lo
         }
         ConfigHelper.setInputColumnFamily(conf, keyspace, column_family);
         setConnectionInformation();
-        initSchema();
+        initSchema(loadSignature);
     }
 
     @Override
@@ -298,9 +298,16 @@ public class CassandraStorage extends Lo
         return location;
     }
 
+    @Override
+    public void setUDFContextSignature(String signature)
+    {
+        this.loadSignature = signature;
+    }
+
     /* StoreFunc methods */
     public void setStoreFuncUDFContextSignature(String signature)
     {
+        this.storeSignature = signature;
     }
 
     public String relToAbsPathForStoreLocation(String location, Path curDir) throws IOException
@@ -314,7 +321,7 @@ public class CassandraStorage extends Lo
         setLocationFromUri(location);
         ConfigHelper.setOutputColumnFamily(conf, keyspace, column_family);
         setConnectionInformation();
-        initSchema();
+        initSchema(storeSignature);
     }
 
     public OutputFormat getOutputFormat()
@@ -346,7 +353,7 @@ public class CassandraStorage extends Lo
         ByteBuffer key = objToBB(t.get(0));
         DefaultDataBag pairs = (DefaultDataBag) t.get(1);
         ArrayList<Mutation> mutationList = new ArrayList<Mutation>();
-        CfDef cfDef = getCfDef();
+        CfDef cfDef = getCfDef(storeSignature);
         List<AbstractType> marshallers = getDefaultMarshallers(cfDef);
         Map<ByteBuffer,AbstractType> validators = getValidatorMap(cfDef);
         try
@@ -404,7 +411,6 @@ public class CassandraStorage extends Lo
                     column.timestamp = System.currentTimeMillis() * 1000;
                     mutation.column_or_supercolumn = new ColumnOrSuperColumn();
                     mutation.column_or_supercolumn.column = column;
-                    mutationList.add(mutation);
                 }
             }
             mutationList.add(mutation);
@@ -412,7 +418,7 @@ public class CassandraStorage extends Lo
     }
     catch (ClassCastException e)
     {
-        throw new IOException(e + " Output must be (key, {(column,value)...}) for ColumnFamily or (key,
[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] paul cannon updated CASSANDRA-2496: --- Attachment: 0006-acknowledge-unexpected-repl-fins.patch.txt 0006-acknowledge-unexpected-repl-fins.patch.txt (updated): also log at info when acknowledging the unexpected messages Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 0005-drain-self-if-removetoken-d-elsewhere.patch.txt, 0006-acknowledge-unexpected-repl-fins.patch.txt, 0006-acknowledge-unexpected-repl-fins.patch.txt For background, see CASSANDRA-2371 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (CASSANDRA-2863) NPE when writing SSTable generated via repair
[ https://issues.apache.org/jira/browse/CASSANDRA-2863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2863. --- Resolution: Cannot Reproduce Fix Version/s: (was: 0.8.3) Assignee: (was: Sylvain Lebresne) Doesn't make any sense to me, either. The only place close() is called is from index() [as seen in the stacktrace here] and the only place index() is called is after prepareIndexing, which sets iwriter to non-null: {code} long estimatedRows = indexer.prepareIndexing(); // build the index and filter long rows = indexer.index(); {code} NPE when writing SSTable generated via repair - Key: CASSANDRA-2863 URL: https://issues.apache.org/jira/browse/CASSANDRA-2863 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Reporter: Héctor Izquierdo An NPE is generated during repair when closing an sstable generated via SSTable build. It doesn't happen always. The node had been scrubbed and compacted before calling repair. INFO [CompactionExecutor:2] 2011-07-06 11:11:32,640 SSTableReader.java (line 158) Opening /d2/cassandra/data/sbs/walf-g-730 ERROR [CompactionExecutor:2] 2011-07-06 11:11:34,327 AbstractCassandraDaemon.java (line 113) Fatal exception in thread Thread[CompactionExecutor:2,1,main] java.lang.NullPointerException at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.close(SSTableWriter.java:382) at org.apache.cassandra.io.sstable.SSTableWriter$RowIndexer.index(SSTableWriter.java:370) at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:315) at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:1103) at org.apache.cassandra.db.compaction.CompactionManager$9.call(CompactionManager.java:1094) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Cassandra Wiki] Trivial Update of ArticlesAndPresentations by MatthewDennis
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The ArticlesAndPresentations page has been changed by MatthewDennis:
http://wiki.apache.org/cassandra/ArticlesAndPresentations?action=diff&rev1=123&rev2=124

  * [[http://www.slideshare.net/hjort/persistncia-nas-nuvens-com-no-sql-hjort|Persistência nas Nuvens com NoSQL]], Brazilian Portuguese, June 2011

= Presentations =
+ * [[http://www.slideshare.net/mattdennis/cassandra-data-modeling|Cassandra Data Modeling Workshop]] - Cassandra SF, Matthew F. Dennis, July 2011
  * [[http://www.slideshare.net/jeromatron/cassandrahadoop-integration|Cassandra/Hadoop Integration]] - Jeremy Hanna, January 2011
  * [[http://www.slideshare.net/supertom/using-cassandra-with-your-web-application|Using Cassandra with your Web Application]] - Tom Melendez, Oct 2010
  * [[http://www.slideshare.net/yutuki/cassandrah-baseno-sql|CassandraとHBaseの比較をして入門するNoSQL]] by Shusuke Shiina (Sep 2010, Japanese)
[jira] [Updated] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] paul cannon updated CASSANDRA-2496: --- Attachment: (was: 0006-acknowledge-unexpected-repl-fins.patch.txt) Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 0005-drain-self-if-removetoken-d-elsewhere.patch.txt, 0006-acknowledge-unexpected-repl-fins.patch.txt For background, see CASSANDRA-2371 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2869) CassandraStorage does not function properly when used multiple times in a single pig script due to UDFContext sharing issues
[ https://issues.apache.org/jira/browse/CASSANDRA-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069189#comment-13069189 ] Hudson commented on CASSANDRA-2869: --- Integrated in Cassandra-0.7 #534 (See [https://builds.apache.org/job/Cassandra-0.7/534/]) Use a UDF-specific context signature. Patch by Jeremy Hanna, reviewed by brandonwilliams for CASSANDRA-2869 brandonwilliams : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149341 Files : * /cassandra/branches/cassandra-0.7/contrib/pig/src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java CassandraStorage does not function properly when used multiple times in a single pig script due to UDFContext sharing issues Key: CASSANDRA-2869 URL: https://issues.apache.org/jira/browse/CASSANDRA-2869 Project: Cassandra Issue Type: Bug Components: Contrib Affects Versions: 0.7.2 Reporter: Grant Ingersoll Assignee: Jeremy Hanna Fix For: 0.7.9, 0.8.2 Attachments: 2869-2.txt, 2869.txt CassandraStorage appears to have threading issues along the lines of those described at http://pig.markmail.org/message/oz7oz2x2dwp66eoz due to the sharing of the UDFContext. I believe the fix lies in implementing {code} public void setStoreFuncUDFContextSignature(String signature) { } {code} and then using that signature when getting the UDFContext. From the Pig manual: {quote} setStoreFuncUDFContextSignature(): This method will be called by Pig both in the front end and back end to pass a unique signature to the Storer. The signature can be used to store into the UDFContext any information which the Storer needs to store between various method invocations in the front end and back end. The default implementation in StoreFunc has an empty body. This method will be called before other methods. {quote} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
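The sharing bug described in this ticket can be illustrated with a toy model. UdfContextModel and SHARED_KEY below are hypothetical stand-ins for Pig's UDFContext properties and the old single cassandra.schema key; they are not the real API, only a sketch of why a per-instance signature is needed.

```java
import java.util.Properties;

// Toy model of UDFContext property storage. With one shared key, the second
// CassandraStorage instance in a script overwrites the first one's schema;
// keying by the per-instance signature Pig hands each instance keeps them apart.
class UdfContextModel {
    static final String SHARED_KEY = "cassandra.schema"; // pre-patch behavior

    private final Properties props = new Properties();

    void put(String key, String cfDef) { props.setProperty(key, cfDef); }
    String get(String key) { return props.getProperty(key); }
}
```

Two instances writing through SHARED_KEY lose the first schema, while per-signature keys like a load signature and a store signature preserve both, which is exactly the behavior the committed patch restores.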
[jira] [Updated] (CASSANDRA-1405) Switch to THsHaServer, redux
[ https://issues.apache.org/jira/browse/CASSANDRA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vijay updated CASSANDRA-1405: - Attachment: 0001-including-validation.patch Awesome... Actually if we have a test with more client connections and unlimited threads in the sync we should actually have a better performance :) BTW: attached is a validation, this goes on top of the earlier patch. Jonathan, will submit a ticket and work on thrift patch to make it trace instead of error. Switch to THsHaServer, redux Key: CASSANDRA-1405 URL: https://issues.apache.org/jira/browse/CASSANDRA-1405 Project: Cassandra Issue Type: Improvement Components: API Reporter: Jonathan Ellis Assignee: Vijay Priority: Minor Fix For: 0.8.3 Attachments: 0001-including-validation.patch, 0001-log4j-config-change.patch, 1405-Thrift-Patch-SVN.patch, libthrift-r1026391.jar, trunk-1405.patch Brian's patch to CASSANDRA-876 suggested using a custom TProcessorFactory subclass, overriding getProcessor to reset to a default state when a new client connects. It looks like this would allow dropping CustomTThreadPoolServer as well as allowing non-thread based servers. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2930) corrupt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069197#comment-13069197 ] Jonathan Ellis commented on CASSANDRA-2930: --- Sounds like https://issues.apache.org/jira/browse/CASSANDRA-2675. Are you sure you're actually running 0.8.1? We've had a lot of 0.8.0 installs that people thought were 0.8.1 due to incorrect packages being published. grep Cassandra version /var/log/cassandra/system.log should verify. corrupt commitlog - Key: CASSANDRA-2930 URL: https://issues.apache.org/jira/browse/CASSANDRA-2930 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Environment: Linux, amd64. Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Reporter: ivan Attachments: CommitLog-1310637513214.log We get Exception encountered during startup error while Cassandra starts. Error messages: INFO 13:56:28,736 Finished reading /var/lib/cassandra/commitlog/CommitLog-1310637513214.log ERROR 13:56:28,736 Exception encountered during startup. 
java.io.IOError: java.io.EOFException
	at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:265)
	at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:281)
	at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:236)
	at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493)
	at java.util.concurrent.ConcurrentSkipListMap.init(ConcurrentSkipListMap.java:1443)
	at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:419)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:139)
	at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:127)
	at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:382)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:278)
	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:158)
	at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175)
	at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:368)
	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readFully(DataInputStream.java:180)
	at java.io.DataInputStream.readFully(DataInputStream.java:152)
	at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394)
	at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:368)
	at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:87)
	at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:261)
	... 13 more
Exception encountered during startup.
[jira] [Commented] (CASSANDRA-2924) Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1
[ https://issues.apache.org/jira/browse/CASSANDRA-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069199#comment-13069199 ] Jonathan Ellis commented on CASSANDRA-2924: --- Why is ThriftConnection introduced? Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1 --- Key: CASSANDRA-2924 URL: https://issues.apache.org/jira/browse/CASSANDRA-2924 Project: Cassandra Issue Type: Improvement Components: Drivers Affects Versions: 0.8.1 Reporter: Rick Shaw Assignee: Rick Shaw Priority: Minor Labels: JDBC Fix For: 0.8.3 Attachments: 2924-v2.txt, consolidate-connection-v1.txt For the JDBC Driver suite, additional cleanup and consolidation of classes {{Connection}} and {{CassandraConnection}} were in order. Those changes drove a few casual additional changes in related classes {{CResultSet}}, {{CassandraStatement}} and {{CassandraPreparedStatement}} in order to continue to communicate properly. The class {{Utils}} was also enhanced to move more static utility methods into this holder class. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2496) Gossip should handle 'dead' states
[ https://issues.apache.org/jira/browse/CASSANDRA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069202#comment-13069202 ] paul cannon commented on CASSANDRA-2496: ok, +1 with these patches. Gossip should handle 'dead' states -- Key: CASSANDRA-2496 URL: https://issues.apache.org/jira/browse/CASSANDRA-2496 Project: Cassandra Issue Type: Bug Components: Core Reporter: Brandon Williams Assignee: Brandon Williams Attachments: 0001-Rework-token-removal-process.txt, 0002-add-2115-back.txt, 0003-update-gossip-related-comments.patch.txt, 0004-do-REMOVING_TOKEN-REMOVED_TOKEN.patch.txt, 0005-drain-self-if-removetoken-d-elsewhere.patch.txt, 0006-acknowledge-unexpected-repl-fins.patch.txt For background, see CASSANDRA-2371 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069201#comment-13069201 ] Patricio Echague commented on CASSANDRA-2045: - Tested with the CASSANDRA-2928 patch and it works perfectly.
Test environment:
- 2 nodes on localhost (127.0.0.2 and 127.0.0.3)
Test case:
- start both nodes
- create the schema for testing
- stop node 1
- insert 5 keys into node 2
- verify that HintsColumnFamily has 5 entries on node 2
- start node 1
- verify that node 1 has the new data
- verify that node 2 deleted the delivered hints
Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Assignee: Nicholas Telford Fix For: 1.0 Attachments: 0001-Changed-storage-of-Hints-to-store-a-serialized-RowMu.patch, 0002-Refactored-HintedHandoffManager.sendRow-to-reduce-co.patch, 0003-Fixed-some-coding-style-issues.patch, 0004-Fixed-direct-usage-of-Gossiper.getEndpointStateForEn.patch, 0005-Removed-duplicate-failure-detection-conditionals.-It.patch, 0006-Removed-handling-of-old-style-hints.patch, 2045-v3.txt, 2045-v5.txt, 2045-v6.txt, CASSANDRA-2045-simplify-hinted-handoff-001.diff, CASSANDRA-2045-simplify-hinted-handoff-002.diff, CASSANDRA-2045-v4.diff Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for a long period of time) we should store the data we want to forward to the node locally instead. We wouldn't have to do any lookups, just take the byte[] and send it to the destination. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1405) Switch to THsHaServer, redux
[ https://issues.apache.org/jira/browse/CASSANDRA-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069208#comment-13069208 ] Brandon Williams commented on CASSANDRA-1405: - bq. Actually if we have a test with more client connections and unlimited threads in the sync we should actually have a better performance With 2k conns, hsha starts to show a small edge over sync.
[jira] [Created] (CASSANDRA-2935) CLI ignores quoted default_validation_class in create column family command
CLI ignores quoted default_validation_class in create column family command - Key: CASSANDRA-2935 URL: https://issues.apache.org/jira/browse/CASSANDRA-2935 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.8.1 Environment: Ubuntu 10.10 Reporter: Andy Bauch Priority: Trivial The default_validation_class parameter of CREATE COLUMN FAMILY is ignored when quoted. The key_validation_class and comparator parameters do not exhibit this behavior. Sample output:
[default@vp2] create column family UserPlaybackHistory with comparator='AsciiType' and key_validation_class='AsciiType' and default_validation_class='AsciiType';
18a9f020-b3ce-11e0--9904252df9ff
Waiting for schema agreement...
... schemas agree across the cluster
[default@vp2] describe keyspace;
Keyspace: vp2:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:2]
  Column Families:
    ColumnFamily: UserPlaybackHistory
      Key Validation Class: org.apache.cassandra.db.marshal.AsciiType
      Default column value validator: org.apache.cassandra.db.marshal.BytesType
      Columns sorted by: org.apache.cassandra.db.marshal.AsciiType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 20.0/14400
      Memtable thresholds: 1.0875/232/1440 (millions of ops/MB/minutes)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: false
      Built indexes: []
[default@vp2] drop column family UserPlaybackHistory;
2513e2d0-b3ce-11e0--9904252df9ff
Waiting for schema agreement...
... schemas agree across the cluster
[default@vp2] create column family UserPlaybackHistory with comparator=AsciiType and key_validation_class=AsciiType and default_validation_class=AsciiType;
5d1b4ce0-b3ce-11e0--9904252df9ff
Waiting for schema agreement...
... schemas agree across the cluster
[default@vp2] describe keyspace;
Keyspace: vp2:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:2]
  Column Families:
    ColumnFamily: UserPlaybackHistory
      Key Validation Class: org.apache.cassandra.db.marshal.AsciiType
      Default column value validator: org.apache.cassandra.db.marshal.AsciiType
      Columns sorted by: org.apache.cassandra.db.marshal.AsciiType
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 20.0/14400
      Memtable thresholds: 1.0875/232/1440 (millions of ops/MB/minutes)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: false
      Built indexes: []
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2930) corrupt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069243#comment-13069243 ] ivan commented on CASSANDRA-2930: - Hi Jonathan, we built our Cassandra package from the git://git.apache.org/cassandra.git cassandra-0.8.1 branch. As far as I can see it's the same as the official 0.8.1 Cassandra code. grep Cassandra version /var/log/cassandra/system.log: INFO [main] 2011-07-21 14:42:48,553 StorageService.java (line 378) Cassandra version: 0.8.1 I checked the CASSANDRA-2675 report. Patch 0002-Avoid-modifying-super-column-in-memtable-being-flush-v2.patch is in our code. Patch 0001-Don-t-remove-columns-from-super-columns-in-memtable.patch is not in our code, but as far as I can see it's not in the official package or trunk either. This error happens rarely; I found it 2 to 3 times in a 128MB commitlog. I suspect some race condition, but a RowMutation shouldn't change. (SuperColumn.java:371) So I have no clue yet. Any further test suggestions are welcome. Regards, ivan
[jira] [Commented] (CASSANDRA-2761) JDBC driver does not build
[ https://issues.apache.org/jira/browse/CASSANDRA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069248#comment-13069248 ] Eric Evans commented on CASSANDRA-2761: --- To summarize, it is now possible to build and test w/ ant. This is currently done by pointing to a local (built) working copy of Cassandra (a site config). What's left, and seems reasonable to scope with this issue:
* Create an alternate mechanism for specifying the version of Cassandra to build/test against (in order to run the tests against prior releases). I'm thinking Ivy could be used here to automatically download artifacts when a property is passed (-Dcassandra.release=0.8.0 for example).
* (Re)build Cassandra as needed from the drivers' Ant build, or at the very least, handle the case when a build is needed.
* Fix the {{generate-eclipse-files}} target if possible, or remove it otherwise.
Work should also continue to reduce the cross-section of Cassandra that this driver depends on, but I'll open another issue for that. JDBC driver does not build -- Key: CASSANDRA-2761 URL: https://issues.apache.org/jira/browse/CASSANDRA-2761 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.0 Reporter: Jonathan Ellis Assignee: Rick Shaw Fix For: 1.0 Attachments: jdbc-driver-build-v1.txt, v1-0001-CASSANDRA-2761-cleanup-nits.txt Need a way to build (and run tests for) the Java driver. Also: still some vestigial references to drivers/ in trunk build.xml. Should we remove drivers/ from the 0.8 branch as well? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2921) Split BufferedRandomAccessFile (BRAF) into Input and Output classes
[ https://issues.apache.org/jira/browse/CASSANDRA-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069250#comment-13069250 ] Jonathan Ellis commented on CASSANDRA-2921: --- There's no reason to have Reader and Writer live in the same outer class anymore. Let's split them up. Similarly, no reason to split ARAF out from Reader. Writer should probably just extend OutputStream, and let the caller wrap in DOS if they want, instead of pulling in that Harmony code (if we can get rid of it, great). Split BufferedRandomAccessFile (BRAF) into Input and Output classes Key: CASSANDRA-2921 URL: https://issues.apache.org/jira/browse/CASSANDRA-2921 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Pavel Yaskevich Assignee: Pavel Yaskevich Fix For: 1.0 Attachments: CASSANDRA-2921-make-Writer-a-stream.patch, CASSANDRA-2921-v2.patch, CASSANDRA-2921.patch Split BRAF into Input and Output classes to avoid complexity related to random I/O in write mode that we don't need any more, see CASSANDRA-2879. And make the implementation cleaner and more reusable. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2921) Split BufferedRandomAccessFile (BRAF) into Input and Output classes
[ https://issues.apache.org/jira/browse/CASSANDRA-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069258#comment-13069258 ] Pavel Yaskevich commented on CASSANDRA-2921: Can we go with only the v2 patch for now? I'm a bit concerned about the design of the Writer: it needs to be able to seek back if we want to support resetAndTruncate(), so I don't see a way to avoid using a RAF inside the Writer. I tried to use FileOutputStream and its channel, but it didn't go well.
[jira] [Commented] (CASSANDRA-2921) Split BufferedRandomAccessFile (BRAF) into Input and Output classes
[ https://issues.apache.org/jira/browse/CASSANDRA-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069262#comment-13069262 ] Jonathan Ellis commented on CASSANDRA-2921: --- By "just extend OutputStream" I just meant in the class definition; I agree that you probably need RAF internally.
[jira] [Commented] (CASSANDRA-2921) Split BufferedRandomAccessFile (BRAF) into Input and Output classes
[ https://issues.apache.org/jira/browse/CASSANDRA-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069263#comment-13069263 ] Pavel Yaskevich commented on CASSANDRA-2921: AbstractDataOutput extends OutputStream
[jira] [Commented] (CASSANDRA-2921) Split BufferedRandomAccessFile (BRAF) into Input and Output classes
[ https://issues.apache.org/jira/browse/CASSANDRA-2921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069271#comment-13069271 ] Pavel Yaskevich commented on CASSANDRA-2921: I think we should stay with the Reader/Writer introduced by v2 here, to support the full (expected) BRAF functionality. In a separate issue we can create an in-house implementation of FileOutputStream with support for mark(), resetAndTruncate(...) and truncate(...), and replace BRAF.Writer with it where needed; that should be a better design.
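The Writer shape being discussed in this thread, a subclass of OutputStream in the class definition that keeps a RandomAccessFile internally so it can still seek back for resetAndTruncate(), can be sketched as below. This is an illustrative toy under assumed names (`SequentialWriterSketch`), not the v2 patch's actual code:

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;

// Extends OutputStream so callers can wrap it in a DataOutputStream,
// but holds a RandomAccessFile internally to support seeking back.
class SequentialWriterSketch extends OutputStream {
    private final RandomAccessFile file;
    private long mark = 0;

    SequentialWriterSketch(File path) throws IOException {
        this.file = new RandomAccessFile(path, "rw");
    }

    @Override
    public void write(int b) throws IOException { file.write(b); }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        file.write(b, off, len);
    }

    // Remember the current position so we can rewind to it later.
    long mark() throws IOException {
        mark = file.getFilePointer();
        return mark;
    }

    // Seek back to the mark and drop everything written after it.
    void resetAndTruncate() throws IOException {
        file.seek(mark);
        file.setLength(mark);
    }

    @Override
    public void close() throws IOException { file.close(); }
}

class WriterSketchDemo {
    public static void main(String[] args) throws IOException {
        File tmp = File.createTempFile("writer", ".bin");
        tmp.deleteOnExit();
        try (SequentialWriterSketch out = new SequentialWriterSketch(tmp)) {
            out.write("keep".getBytes());
            out.mark();
            out.write("discard".getBytes());
            out.resetAndTruncate();       // file now contains only "keep"
        }
        System.out.println(tmp.length()); // 4
    }
}
```

A plain FileOutputStream cannot do this because its contract is append/overwrite-forward only, which is why the discussion settles on keeping RAF inside while exposing only the OutputStream surface.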
[jira] [Commented] (CASSANDRA-2034) Make Read Repair unnecessary when Hinted Handoff is enabled
[ https://issues.apache.org/jira/browse/CASSANDRA-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069275#comment-13069275 ] Patricio Echague commented on CASSANDRA-2034: - bq. after RpcTimeout we check the responseHandler write acks and write local hints for any missing targets. CASSANDRA-2914 handles the local storage of hints. Make Read Repair unnecessary when Hinted Handoff is enabled --- Key: CASSANDRA-2034 URL: https://issues.apache.org/jira/browse/CASSANDRA-2034 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Patricio Echague Fix For: 1.0 Original Estimate: 8h Remaining Estimate: 8h Currently, HH is purely an optimization -- if a machine goes down, enabling HH means RR/AES will have less work to do, but you can't disable RR entirely in most situations since HH doesn't kick in until the FailureDetector does. Let's add a scheduled task to the mutate path, such that we return to the client normally after ConsistencyLevel is achieved, but after RpcTimeout we check the responseHandler write acks and write local hints for any missing targets. This would making disabling RR when HH is enabled a much more reasonable option, which has a huge impact on read throughput. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069280#comment-13069280 ] Hudson commented on CASSANDRA-2045: --- Integrated in Cassandra #968 (See [https://builds.apache.org/job/Cassandra/968/]) store hints as serialized mutations instead of pointers to data rows patch by Nick Telford, jbellis, and Patricio Echague for CASSANDRA-2045 jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1149396 Files : * /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java * /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutationVerbHandler.java * /cassandra/trunk/CHANGES.txt * /cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManager.java
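The effect of the committed change ("store hints as serialized mutations instead of pointers to data rows") is that the hint store holds the mutation bytes themselves, so delivery needs no local row lookup. A toy sketch of that idea under hypothetical names (`HintQueueSketch`, not Cassandra's HintedHandOffManager API):

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative only: the old scheme stored a pointer and re-read the row at
// delivery time; the new scheme stores the serialized mutation itself, so
// delivery is just "take byte[] and send", with no local read.
class HintQueueSketch {
    private final Queue<byte[]> hints = new ArrayDeque<>();

    // Write path while the target is down: serialize once, store the bytes.
    void storeHint(String mutation) {
        hints.add(mutation.getBytes(StandardCharsets.UTF_8));
    }

    // Delivery when the target comes back: forward bytes, no row lookup.
    byte[] nextHint() {
        return hints.poll();
    }

    public static void main(String[] args) {
        HintQueueSketch q = new HintQueueSketch();
        q.storeHint("row1:col=v1");
        System.out.println(new String(q.nextHint(), StandardCharsets.UTF_8));
    }
}
```

This matches the test flow reported earlier in the thread: hints accumulate while a node is down and are drained, in order, once it returns.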
[jira] [Created] (CASSANDRA-2937) certain generic type causes compile error in eclipse
certain generic type causes compile error in eclipse Key: CASSANDRA-2937 URL: https://issues.apache.org/jira/browse/CASSANDRA-2937 Project: Cassandra Issue Type: Bug Reporter: Yang Yang Priority: Trivial The code in ColumnFamily and AbstractColumnContainer uses code similar to the following (substitute Blah with AbstractColumnContainer.DeletionInfo):
{code}
import java.util.concurrent.atomic.AtomicReference;

public class TestPrivateAtomicRef {
    protected final AtomicReference<Blah> b = new AtomicReference<Blah>(new Blah());
    // the following raw form would work for eclipse
    //protected final AtomicReference b = new AtomicReference(new Blah());

    private static class Blah {
    }
}

class Child extends TestPrivateAtomicRef {
    public void aaa() {
        Child c = new Child();
        c.b.set(
            b.get() // eclipse shows error here
        );
    }
}
{code}
In Eclipse, the above code generates a compile error, but it works fine with the command-line java tools. Since many people use Eclipse, it's better to make a temporary compromise and make DeletionInfo protected. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2924) Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1
[ https://issues.apache.org/jira/browse/CASSANDRA-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13069284#comment-13069284 ] Rick Shaw commented on CASSANDRA-2924: -- It defines the methods that are to be implemented over and above those required by the {{java.sql.Connection}} interface; these moved from {{o.a.c.cql.jdbc.Connection}}. That class can now be removed. (I did not know how to do that in the patch.) I thought that seemed like the right thing to do (defining an interface), but it is not necessary.
[jira] [Updated] (CASSANDRA-2937) certain generic type causes compile error in eclipse
[ https://issues.apache.org/jira/browse/CASSANDRA-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2937: - Attachment: 0002-avoid-eclipse-compile-error-for-generic-type-on-Atom.patch minor fix to avoid Eclipse compile error
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_d.patch

the DeletionInfo private -> protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 ; the new patch is uploaded here

better performance on long row read
---
Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing

Currently, if a row contains 1000+ columns, reads become considerably slow: my test of a row with 3000 columns (standard, regular), each with 8 bytes in the name and 40 bytes in the value, takes about 16 ms. This all runs in memory; no disk read is involved.

Through debugging we can find that most of this time is spent in:

[Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
[Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
[Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)

ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. This structure is slow for two reasons: it needs to do synchronization, and it needs to maintain the more complex structure of a map.

But if we look at the whole read path, Thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and then convert it to a list at the end. On the synchronization side, since the returned CF is never shared or modified by other threads, we know access is always single-threaded, so no synchronization is needed. These two features are indeed needed for ColumnFamily in other cases, particularly writes. So we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(): getTopLevelColumnFamily no longer always creates the standard ColumnFamily, but takes a provided returnCF, whose cost is much cheaper.

The provided patch is for demonstration for now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed, and a new FastColumnFamily is provided. The main work is to let FastColumnFamily use an array for internal storage. At first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve between descending and ascending order, but ascending works). So the current logic simply compares the new column against the last column in the array: if the names are not equal, append; if they are equal, reconcile.

Slight temporary hacks are made on getTopLevelColumnFamily, so we now have two flavors of the method, one accepting a returnCF; we could definitely think about a better way to provide this returnCF. This patch compiles fine; no tests are provided yet. But I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case.

thanks Yang
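The append-or-reconcile logic described above can be sketched as follows (a hypothetical simplification; the class and field names are illustrative, not the actual FastColumnFamily code). Because columns arrive in ascending name order, addColumn() only ever compares the new column against the last element of the backing array, so there is no map lookup and no synchronization.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of an array-backed, single-threaded column container:
// columns arrive in ascending name order, so addColumn() compares only
// against the last element -- append if the name differs, reconcile if equal.
public class FastColumnSketch {
    public static class Column {
        public final String name;
        public final String value;
        public final long timestamp;
        public Column(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final List<Column> columns = new ArrayList<Column>();

    public void addColumn(Column c) {
        if (columns.isEmpty()) {
            columns.add(c);
            return;
        }
        Column last = columns.get(columns.size() - 1);
        int cmp = last.name.compareTo(c.name);
        if (cmp < 0)
            columns.add(c);                     // common case: strictly ascending, append
        else if (cmp == 0 && c.timestamp > last.timestamp)
            columns.set(columns.size() - 1, c); // reconcile: newer timestamp wins
        else if (cmp > 0)
            throw new IllegalStateException("columns must arrive in ascending name order");
        // cmp == 0 with an older-or-equal timestamp: keep the existing column
    }

    public int size() { return columns.size(); }
    public Column get(int i) { return columns.get(i); }
}
```

A binary-search insert would also work, but as the description notes, the sorted-arrival invariant makes the last-element comparison sufficient.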
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_d.patch the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_d.patch)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_c.patch)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_d.patch the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_d.patch)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: incremental.diff)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Comment: was deleted (was: the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_d.patch)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Comment: was deleted (was: the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here)
[jira] [Resolved] (CASSANDRA-2935) CLI ignores quoted default_validation_class in create column family command
[ https://issues.apache.org/jira/browse/CASSANDRA-2935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams resolved CASSANDRA-2935. - Resolution: Duplicate Dupe of CASSANDRA-2899 CLI ignores quoted default_validation_class in create column family command - Key: CASSANDRA-2935 URL: https://issues.apache.org/jira/browse/CASSANDRA-2935 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.8.1 Environment: Ubuntu 10.10 Reporter: Andy Bauch Priority: Trivial The default_validation_class parameter of CREATE COLUMN FAMILY is ignored when quoted. The key_validation_class and comparator parameters do not exhibit this behavior. Sample output: [default@vp2] create column family UserPlaybackHistory with comparator='AsciiType' and key_validation_class='AsciiType' and default_validation_class='AsciiType'; 18a9f020-b3ce-11e0--9904252df9ff Waiting for schema agreement... ... schemas agree across the cluster [default@vp2] describe keyspace; Keyspace: vp2: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true Options: [replication_factor:2] Column Families: ColumnFamily: UserPlaybackHistory Key Validation Class: org.apache.cassandra.db.marshal.AsciiType Default column value validator: org.apache.cassandra.db.marshal.BytesType Columns sorted by: org.apache.cassandra.db.marshal.AsciiType Row cache size / save period in seconds: 0.0/0 Key cache size / save period in seconds: 20.0/14400 Memtable thresholds: 1.0875/232/1440 (millions of ops/MB/minutes) GC grace seconds: 864000 Compaction min/max thresholds: 4/32 Read repair chance: 1.0 Replicate on write: false Built indexes: [] [default@vp2] drop column family UserPlaybackHistory; 2513e2d0-b3ce-11e0--9904252df9ff Waiting for schema agreement... ...
schemas agree across the cluster [default@vp2] create column family UserPlaybackHistory with comparator=AsciiType and key_validation_class=AsciiType and default_validation_class=AsciiType; 5d1b4ce0-b3ce-11e0--9904252df9ff Waiting for schema agreement... ... schemas agree across the cluster [default@vp2] describe keyspace; Keyspace: vp2: Replication Strategy: org.apache.cassandra.locator.SimpleStrategy Durable Writes: true Options: [replication_factor:2] Column Families: ColumnFamily: UserPlaybackHistory Key Validation Class: org.apache.cassandra.db.marshal.AsciiType Default column value validator: org.apache.cassandra.db.marshal.AsciiType Columns sorted by: org.apache.cassandra.db.marshal.AsciiType Row cache size / save period in seconds: 0.0/0 Key cache size / save period in seconds: 20.0/14400 Memtable thresholds: 1.0875/232/1440 (millions of ops/MB/minutes) GC grace seconds: 864000 Compaction min/max thresholds: 4/32 Read repair chance: 1.0 Replicate on write: false Built indexes: []
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_d.patch the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: fast_cf_081_trunk.diff)
[jira] [Commented] (CASSANDRA-2761) JDBC driver does not build
[ https://issues.apache.org/jira/browse/CASSANDRA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069317#comment-13069317 ] Jonathan Ellis commented on CASSANDRA-2761: --- +1 cleanup patch JDBC driver does not build -- Key: CASSANDRA-2761 URL: https://issues.apache.org/jira/browse/CASSANDRA-2761 Project: Cassandra Issue Type: Bug Components: API Affects Versions: 1.0 Reporter: Jonathan Ellis Assignee: Rick Shaw Fix For: 1.0 Attachments: jdbc-driver-build-v1.txt, v1-0001-CASSANDRA-2761-cleanup-nits.txt Need a way to build (and run tests for) the Java driver. Also: still some vestigial references to drivers/ in trunk build.xml. Should we remove drivers/ from the 0.8 branch as well?
[jira] [Assigned] (CASSANDRA-2930) corrupt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis reassigned CASSANDRA-2930: - Assignee: Sylvain Lebresne corrupt commitlog - Key: CASSANDRA-2930 URL: https://issues.apache.org/jira/browse/CASSANDRA-2930 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8.1 Environment: Linux, amd64. Java(TM) SE Runtime Environment (build 1.6.0_26-b03) Reporter: ivan Assignee: Sylvain Lebresne Fix For: 0.8.3 Attachments: CommitLog-1310637513214.log We get Exception encountered during startup error while Cassandra starts. Error messages: INFO 13:56:28,736 Finished reading /var/lib/cassandra/commitlog/CommitLog-1310637513214.log ERROR 13:56:28,736 Exception encountered during startup. java.io.IOError: java.io.EOFException at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:265) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:281) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:236) at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493) at java.util.concurrent.ConcurrentSkipListMap.init(ConcurrentSkipListMap.java:1443) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:419) at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:139) at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:127) at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:382) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:278) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:158) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:368) at 
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80) Caused by: java.io.EOFException at java.io.DataInputStream.readFully(DataInputStream.java:180) at java.io.DataInputStream.readFully(DataInputStream.java:152) at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394) at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:368) at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:87) at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:261) ... 13 more Exception encountered during startup. java.io.IOError: java.io.EOFException at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:265) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:281) at org.apache.cassandra.io.util.ColumnIterator.next(ColumnSortedMap.java:236) at java.util.concurrent.ConcurrentSkipListMap.buildFromSorted(ConcurrentSkipListMap.java:1493) at java.util.concurrent.ConcurrentSkipListMap.init(ConcurrentSkipListMap.java:1443) at org.apache.cassandra.db.SuperColumnSerializer.deserialize(SuperColumn.java:419) at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:139) at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:127) at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:382) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:278) at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:158) at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:175) at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:368) at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:80) Caused by: java.io.EOFException at java.io.DataInputStream.readFully(DataInputStream.java:180) at 
java.io.DataInputStream.readFully(DataInputStream.java:152) at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:394) at org.apache.cassandra.utils.ByteBufferUtil.readWithShortLength(ByteBufferUtil.java:368) at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:87) at org.apache.cassandra.io.util.ColumnIterator.deserializeNext(ColumnSortedMap.java:261)
[jira] [Updated] (CASSANDRA-2930) corrupt commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-2930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2930: -- Fix Version/s: 0.8.3
[jira] [Updated] (CASSANDRA-2914) Simplify HH to always store hints on the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patricio Echague updated CASSANDRA-2914: Attachment: CASSANDRA-2914-trunk-v2.diff v2 replaces v1: rebased the diff, added an entry in CHANGES.txt, updated javadoc and comments. Simplify HH to always store hints on the coordinator Key: CASSANDRA-2914 URL: https://issues.apache.org/jira/browse/CASSANDRA-2914 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0 Reporter: Jonathan Ellis Assignee: Patricio Echague Fix For: 1.0 Attachments: CASSANDRA-2914-trunk-v1.diff, CASSANDRA-2914-trunk-v2.diff Moved from CASSANDRA-2045: Since we're storing the full mutation post-2045, there's no benefit to be gained from storing the hint on the replica node, only an increase in complexity. Let's switch it to always store hints on the coordinator instead.
[jira] [Commented] (CASSANDRA-2934) log broken incoming connections at DEBUG
[ https://issues.apache.org/jira/browse/CASSANDRA-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069329#comment-13069329 ] Brandon Williams commented on CASSANDRA-2934: - +1 log broken incoming connections at DEBUG Key: CASSANDRA-2934 URL: https://issues.apache.org/jira/browse/CASSANDRA-2934 Project: Cassandra Issue Type: Task Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Trivial Fix For: 0.8.2 Attachments: 2934.txt
svn commit: r1149426 - in /cassandra/trunk: CHANGES.txt src/java/org/apache/cassandra/db/RowMutation.java src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java
Author: jbellis Date: Fri Jul 22 01:13:06 2011 New Revision: 1149426 URL: http://svn.apache.org/viewvc?rev=1149426&view=rev Log: store hints in the coordinator node instead of in the closest replica patch by Patricio Echague; reviewed by jbellis for CASSANDRA-2914 Modified: cassandra/trunk/CHANGES.txt cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java cassandra/trunk/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java Modified: cassandra/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1149426&r1=1149425&r2=1149426&view=diff == --- cassandra/trunk/CHANGES.txt (original) +++ cassandra/trunk/CHANGES.txt Fri Jul 22 01:13:06 2011 @@ -16,6 +16,8 @@ * use reference counting for deleting sstables instead of relying on the GC (CASSANDRA-2521) * store hints as serialized mutations instead of pointers to data rows + * store hints in the coordinator node instead of in the closest + replica (CASSANDRA-2914). 0.8.2 Modified: cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java?rev=1149426&r1=1149425&r2=1149426&view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java Fri Jul 22 01:13:06 2011 @@ -97,6 +97,23 @@ public class RowMutation implements IMut return modifications_.values(); } +/** + * Returns mutation representing a Hints to be sent to <code>address</code> + * as soon as it becomes available.
+ * The format is the following: + * + * HintsColumnFamily: {// cf + * dest ip: { // key + * uuid: { // super-column + * table: table// columns + * key: key + * mutation: mutation + * version: version + * } + * } + * } + * + */ public static RowMutation hintFor(RowMutation mutation, ByteBuffer address) throws IOException { RowMutation rm = new RowMutation(Table.SYSTEM_TABLE, address); Modified: cassandra/trunk/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java?rev=1149426r1=1149425r2=1149426view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java Fri Jul 22 01:13:06 2011 @@ -157,10 +157,7 @@ public abstract class AbstractReplicatio if (map.size() == targets.size() || !StorageProxy.isHintedHandoffEnabled()) return map; -// assign dead endpoints to be hinted to the closest live one, or to the local node -// (since it is trivially the closest) if none are alive. This way, the cost of doing -// a hint is only adding the hint header, rather than doing a full extra write, if any -// destination nodes are alive. +// Assign dead endpoints to be hinted to the local node. // // we do a 2nd pass on targets instead of using temporary storage, // to optimize for the common case (everything was alive). @@ -176,10 +173,8 @@ public abstract class AbstractReplicatio continue; } -InetAddress destination = map.isEmpty() -? localAddress -: snitch.getSortedListByProximity(localAddress, map.keySet()).get(0); -map.put(destination, ep); +// We always store the hint on the coordinator node. +map.put(localAddress, ep); } return map;
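The AbstractReplicationStrategy change above reduces hint routing to a simple rule: every live target is written to directly, and every dead target is hinted on the coordinator itself. A minimal, self-contained sketch of that two-pass shape, under stated assumptions: names here are illustrative, and the real method operates on `InetAddress` values with what appears to be a multimap (so several dead endpoints can share the coordinator key), not on `String` lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class HintRouting
{
    /**
     * Maps "address to write to" -> "endpoints that write covers".
     * Live targets map to themselves; every dead target is hinted on
     * the coordinator (post-CASSANDRA-2914, no snitch proximity sort).
     */
    public static Map<String, List<String>> hintedEndpoints(String coordinator,
                                                            List<String> targets,
                                                            List<String> live)
    {
        Map<String, List<String>> map = new LinkedHashMap<>();
        for (String ep : targets)
            if (live.contains(ep))
                map.computeIfAbsent(ep, k -> new ArrayList<>()).add(ep);

        // 2nd pass over targets instead of temporary storage, mirroring the
        // real code's aim of keeping the common case (everything alive) cheap
        for (String ep : targets)
            if (!live.contains(ep))
                map.computeIfAbsent(coordinator, k -> new ArrayList<>()).add(ep);
        return map;
    }
}
```

With targets `{B, C, D}` of which only `B` is alive and coordinator `A`, the sketch yields `B -> [B]` plus `A -> [C, D]`: one direct write and two locally stored hints, rather than extra writes to the closest live replica.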
svn commit: r1149430 - /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
Author: jbellis
Date: Fri Jul 22 01:48:17 2011
New Revision: 1149430

URL: http://svn.apache.org/viewvc?rev=1149430&view=rev
Log:
log broken incoming connections at DEBUG

patch by jbellis; reviewed by brandonwilliams for CASSANDRA-2934

Modified:
    cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java?rev=1149430&r1=1149429&r2=1149430&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java Fri Jul 22 01:48:17 2011
@@ -75,8 +75,9 @@ public class IncomingTcpConnection exten
         }
         catch (IOException e)
         {
+            logger.debug("Incoming IOException", e);
             close();
-            throw new IOError(e);
+            return;
         }
         if (version > MessagingService.version_)
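The pattern the patch adopts can be sketched in isolation: a peer breaking the connection mid-handshake is a routine, recoverable event, so it is logged at DEBUG and the handler returns instead of propagating an `IOError`. This is a hedged sketch only; `ConnectionHandler` and `HeaderSource` are invented stand-ins, not Cassandra's actual classes, and `java.util.logging` stands in for the project's logger.

```java
import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

public class ConnectionHandler
{
    private static final Logger logger = Logger.getLogger("IncomingTcpConnection");

    /** Tiny stand-in for the socket stream, so the sketch is self-contained. */
    public interface HeaderSource
    {
        int readInt() throws IOException;
    }

    /** Returns true if the header was read, false if the peer broke the connection. */
    public static boolean readHeader(HeaderSource in)
    {
        try
        {
            in.readInt(); // e.g. the protocol version header
            return true;
        }
        catch (IOException e)
        {
            // broken incoming connections are expected noise:
            // log at debug level, clean up, and bail out quietly
            logger.log(Level.FINE, "Incoming IOException", e);
            return false; // instead of: throw new IOError(e);
        }
    }
}
```

The design point is that throwing `IOError` for every flaky client filled logs with stack traces for a condition the server cannot act on, whereas a DEBUG line keeps the information available without the alarm.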
svn commit: r1149431 - /cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
Author: jbellis
Date: Fri Jul 22 01:50:07 2011
New Revision: 1149431

URL: http://svn.apache.org/viewvc?rev=1149431&view=rev
Log:
humor Eclipse

Modified:
    cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java?rev=1149431&r1=1149430&r2=1149431&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/AbstractColumnContainer.java Fri Jul 22 01:50:07 2011
@@ -193,7 +193,7 @@ public abstract class AbstractColumnCont
         return columns.values().iterator();
     }
 
-    private static class DeletionInfo
+    protected static class DeletionInfo
     {
         public final long markedForDeleteAt;
         public final int localDeletionTime;
[jira] [Resolved] (CASSANDRA-2937) certain generic type causes compile error in eclipse
[ https://issues.apache.org/jira/browse/CASSANDRA-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-2937.
---------------------------------------
    Resolution: Fixed

committed, although in general I'm against humoring broken tools

> certain generic type causes compile error in eclipse
> -----------------------------------------------------
>
>                 Key: CASSANDRA-2937
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2937
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Yang Yang
>            Priority: Trivial
>         Attachments: 0002-avoid-eclipse-compile-error-for-generic-type-on-Atom.patch
>
> ColumnFamily and AbstractColumnContainer use code similar to the following (substitute Blah with AbstractColumnContainer.DeletionInfo):
>
> import java.util.concurrent.atomic.AtomicReference;
>
> public class TestPrivateAtomicRef {
>     protected final AtomicReference<Blah> b = new AtomicReference<Blah>(new Blah());
>     // the following raw-typed form would work for eclipse
>     //protected final AtomicReference b = new AtomicReference(new Blah());
>
>     private static class Blah {
>     }
> }
>
> class Child extends TestPrivateAtomicRef {
>     public void aaa() {
>         Child c = new Child();
>         c.b.set(b.get()); // eclipse shows error here
>     }
> }
>
> in eclipse, the above code generates a compile error, but compiles fine with command-line javac. since many people use eclipse, it's better to make a temporary compromise and make DeletionInfo protected
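The committed workaround (r1149431, "humor Eclipse") can be boiled down to a compilable sketch: widening the nested class from private to protected lets Eclipse name the type when checking the `AtomicReference` access from the subclass, and javac accepts either form. The class names below mirror the ticket's shape as a hedged illustration, not Cassandra's actual source.

```java
import java.util.concurrent.atomic.AtomicReference;

public class Container
{
    // protected (not private) nested class: javac accepted either
    // visibility here, Eclipse's compiler only this one
    protected static class DeletionInfo
    {
        public final long markedForDeleteAt;

        public DeletionInfo(long markedForDeleteAt)
        {
            this.markedForDeleteAt = markedForDeleteAt;
        }
    }

    protected final AtomicReference<DeletionInfo> info =
            new AtomicReference<DeletionInfo>(new DeletionInfo(0L));
}

class Child extends Container
{
    public long copyFrom(Container other)
    {
        // the access pattern Eclipse rejected while DeletionInfo was private
        info.set(other.info.get());
        return info.get().markedForDeleteAt;
    }
}
```

The compromise costs nothing at runtime: visibility of a static nested class has no bytecode-level effect on the `AtomicReference` field, so only the source-level access check changes.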
[jira] [Commented] (CASSANDRA-2924) Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1
[ https://issues.apache.org/jira/browse/CASSANDRA-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069350#comment-13069350 ]

Jonathan Ellis commented on CASSANDRA-2924:
-------------------------------------------

[committed w/ above changes]

> Consolidate JDBC driver classes: Connection and CassandraConnection in advance of feature additions for 1.1
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2924
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2924
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Drivers
>    Affects Versions: 0.8.1
>            Reporter: Rick Shaw
>            Assignee: Rick Shaw
>            Priority: Minor
>              Labels: JDBC
>             Fix For: 0.8.3
>         Attachments: 2924-v2.txt, consolidate-connection-v1.txt
>
> For the JDBC driver suite, additional cleanup and consolidation of the classes {{Connection}} and {{CassandraConnection}} were in order. Those changes drove a few incidental changes in the related classes {{CResultSet}}, {{CassandraStatement}} and {{CassandraPreparedStatement}} so they continue to communicate properly. The class {{Utils}} was also enhanced to collect more static utility methods into this holder class.
[jira] [Commented] (CASSANDRA-2761) JDBC driver does not build
[ https://issues.apache.org/jira/browse/CASSANDRA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069353#comment-13069353 ]

Rick Shaw commented on CASSANDRA-2761:
--------------------------------------

+1 for the cleanup patch.
The {{generate-eclipse-files}} target seems to be working for me? How does it fail?

> JDBC driver does not build
> --------------------------
>
>                 Key: CASSANDRA-2761
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2761
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>    Affects Versions: 1.0
>            Reporter: Jonathan Ellis
>            Assignee: Rick Shaw
>             Fix For: 1.0
>         Attachments: jdbc-driver-build-v1.txt, v1-0001-CASSANDRA-2761-cleanup-nits.txt
>
> Need a way to build (and run tests for) the Java driver.
> Also: there are still some vestigial references to drivers/ in the trunk build.xml. Should we remove drivers/ from the 0.8 branch as well?
[jira] [Issue Comment Edited] (CASSANDRA-2761) JDBC driver does not build
[ https://issues.apache.org/jira/browse/CASSANDRA-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069353#comment-13069353 ]

Rick Shaw edited comment on CASSANDRA-2761 at 7/22/11 2:08 AM:
---------------------------------------------------------------

+1 for the cleanup patch
The {{generate-eclipse-files}} seems to be working for me? How does it fail?

was (Author: ardot):
+1 for the cleanup path
The {{generate-eclipse-files}} seems to be working for me? How does it fail?

> JDBC driver does not build
> --------------------------
>
>                 Key: CASSANDRA-2761
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2761
>             Project: Cassandra
>          Issue Type: Bug
>          Components: API
>    Affects Versions: 1.0
>            Reporter: Jonathan Ellis
>            Assignee: Rick Shaw
>             Fix For: 1.0
>         Attachments: jdbc-driver-build-v1.txt, v1-0001-CASSANDRA-2761-cleanup-nits.txt
>
> Need a way to build (and run tests for) the Java driver.
> Also: there are still some vestigial references to drivers/ in the trunk build.xml. Should we remove drivers/ from the 0.8 branch as well?
[jira] [Commented] (CASSANDRA-2934) log broken incoming connections at DEBUG
[ https://issues.apache.org/jira/browse/CASSANDRA-2934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069356#comment-13069356 ]

Hudson commented on CASSANDRA-2934:
-----------------------------------

Integrated in Cassandra-0.8 #234 (See [https://builds.apache.org/job/Cassandra-0.8/234/])
    log broken incoming connections at DEBUG
    patch by jbellis; reviewed by brandonwilliams for CASSANDRA-2934

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149430
Files :
* /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/IncomingTcpConnection.java

> log broken incoming connections at DEBUG
> ----------------------------------------
>
>                 Key: CASSANDRA-2934
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2934
>             Project: Cassandra
>          Issue Type: Task
>          Components: Core
>            Reporter: Jonathan Ellis
>            Assignee: Jonathan Ellis
>            Priority: Trivial
>             Fix For: 0.8.2
>         Attachments: 2934.txt
[jira] [Commented] (CASSANDRA-2914) Simplify HH to always store hints on the coordinator
[ https://issues.apache.org/jira/browse/CASSANDRA-2914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13069358#comment-13069358 ]

Hudson commented on CASSANDRA-2914:
-----------------------------------

Integrated in Cassandra #969 (See [https://builds.apache.org/job/Cassandra/969/])
    store hints in the coordinator node instead of in the closest replica
    patch by Patricio Echague; reviewed by jbellis for CASSANDRA-2914

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1149426
Files :
* /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java
* /cassandra/trunk/CHANGES.txt
* /cassandra/trunk/src/java/org/apache/cassandra/locator/AbstractReplicationStrategy.java

> Simplify HH to always store hints on the coordinator
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2914
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2914
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0
>            Reporter: Jonathan Ellis
>            Assignee: Patricio Echague
>             Fix For: 1.0
>         Attachments: CASSANDRA-2914-trunk-v1.diff, CASSANDRA-2914-trunk-v2.diff
>
> Moved from CASSANDRA-2045:
> Since we're storing the full mutation post-2045, there's no benefit to be gained from storing the hint on the replica node, only an increase in complexity. Let's switch it to always store hints on the coordinator instead.