[jira] [Created] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
---

Key: CASSANDRA-2401
URL: https://issues.apache.org/jira/browse/CASSANDRA-2401
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.7.4
Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse
Reporter: Tey Kar Shiang

In ColumnFamilyStore.java, near line 1680, ColumnFamily data = getColumnFamily(new QueryFilter(dk, path, firstFilter)) returns null, causing a NullPointerException in satisfies(data, clause, primary) which is not caught. The callback times out and returns a Timeout exception to Hector. As I traced it, the data is empty: the column count is 0 in removeDeletedCF(), which returns the null there. (I am new and still trying to understand the logic.) Instead of crashing on null, could we skip the row?

About my test: a stress-test program adds, modifies and deletes data in the keyspace. I have 30 threads simulating concurrent users performing the actions above, plus a periodic query of all rows. I have a Column Family with rows (as files) and columns as indexes (e.g. userID, fileType). There was no issue on the first day of the test, and I stopped for 3 days. When I restarted the test on the 4th day, 1 of the users failed to query the files (timeout exception received). Most of the users can still run the query.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
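The guard the reporter is asking for can be sketched in isolation (all names here are illustrative stand-ins, not the real org.apache.cassandra.db types; the actual fix would go in the scan() loop of ColumnFamilyStore): skip rows whose ColumnFamily comes back null instead of passing them to satisfies().

```java
import java.util.ArrayList;
import java.util.List;

public class NullGuardSketch {
    // Stand-in for the real ColumnFamily class.
    static class ColumnFamily {
        final String key;
        ColumnFamily(String key) { this.key = key; }
    }

    // Hypothetical lookup that, like getColumnFamily(), may return null when
    // every column of the row has been deleted (removeDeletedCF() returns null).
    static ColumnFamily getColumnFamily(String key) {
        return key.startsWith("deleted") ? null : new ColumnFamily(key);
    }

    // Sketch of the scan loop: skip null rows instead of dereferencing them,
    // which is what throws the uncaught NullPointerException.
    static List<ColumnFamily> scan(List<String> keys) {
        List<ColumnFamily> results = new ArrayList<ColumnFamily>();
        for (String key : keys) {
            ColumnFamily data = getColumnFamily(key);
            if (data == null)
                continue; // row fully deleted; nothing to match against
            results.add(data);
        }
        return results;
    }
}
```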
[jira] [Updated] (CASSANDRA-2358) CLI doesn't handle inserting negative integers
[ https://issues.apache.org/jira/browse/CASSANDRA-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2358:
---
Attachment: CASSANDRA-2358-trunk.patch

branch: trunk (latest commit e6c5a28da940a086d0e786f1ad0288c0b0efa27d)

CLI doesn't handle inserting negative integers
--

Key: CASSANDRA-2358
URL: https://issues.apache.org/jira/browse/CASSANDRA-2358
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 0.7.0
Reporter: Tyler Hobbs
Assignee: Pavel Yaskevich
Priority: Trivial
Fix For: 0.7.5, 0.8
Attachments: CASSANDRA-2358-trunk.patch, CASSANDRA-2358.patch
Original Estimate: 0.5h
Remaining Estimate: 0.5h

The CLI raises a syntax error when trying to insert negative integers:
{noformat}
[default@Keyspace1] set StandardInteger['key'][-12] = 'val';
Syntax error at position 28: mismatched character '1' expecting '-'
{noformat}
[jira] [Updated] (CASSANDRA-2358) CLI doesn't handle inserting negative integers
[ https://issues.apache.org/jira/browse/CASSANDRA-2358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pavel Yaskevich updated CASSANDRA-2358:
---
Fix Version/s: 0.8
[jira] [Commented] (CASSANDRA-2397) Improve or remove replicate-on-write setting
[ https://issues.apache.org/jira/browse/CASSANDRA-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012459#comment-13012459 ]

Chris Burroughs commented on CASSANDRA-2397:
---
For what it's worth I *think* lazily replicated counters is what I want.

Improve or remove replicate-on-write setting
--

Key: CASSANDRA-2397
URL: https://issues.apache.org/jira/browse/CASSANDRA-2397
Project: Cassandra
Issue Type: Bug
Reporter: Stu Hood

The replicate-on-write setting breaks assumptions in various places in the codebase about whether data will be replicated in a timely fashion. It's worth discussing whether we should go all the way on replicate-on-write, making it a fully supported feature, or whether we should remove it entirely. On one hand, ROW could be considered just another replication tunable like HH, RR and AES. On the other hand, a lazily replicating store is very rarely what you actually want. Open issues related to ROW are linked, but additionally we'd need to:
* Make the setting have an effect for standard column families
* Change the default for ROW to enabled and properly warn of the effects
[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012475#comment-13012475 ]

Jonathan Ellis commented on CASSANDRA-2401:
---
Are you querying for zero columns?
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012496#comment-13012496 ]

Jonathan Ellis commented on CASSANDRA-1902:
---
Why do we need NormalMappedSegment as well as native? Could we get rid of normal?

Migrate cached pages during compaction
---

Key: CASSANDRA-1902
URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.7.1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
Fix For: 0.7.5, 0.8
Attachments: 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 1902-formatted.txt, 1902-per-column-migration-rebase2.txt, 1902-per-column-migration.txt, CASSANDRA-1902-v3.patch, CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch
Original Estimate: 32h
Time Spent: 56h
Remaining Estimate: 0h

Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a pre-compacted CF during the compaction process. This is now important since CASSANDRA-1470 caches effectively nothing. For example, an active CF being compacted hurts reads since nothing is cached in the new SSTable. The purpose of this ticket, then, is to make sure SOME data is cached from active CFs. This can be done by monitoring which old SSTables are in the page cache and caching active rows in the new SSTable. A simpler yet similar approach is described here: http://insights.oetiker.ch/linux/fadvise/
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012506#comment-13012506 ]

Pavel Yaskevich commented on CASSANDRA-1902:
---
NativeMappedSegment is used to give us better control over the memory-mapped region of the file: skipping the page cache (using madvise(MADV_DONTNEED)), unmapping natively with munmap, and doing page cache migration. NormalMappedSegment is used in cases when we don't have a page size (JNA is not installed or not on Linux). I added those methods because it's not possible to clean the page cache for the region of an MBB; also, using NativeMappedSegment we don't need to duplicate buffers on slice calls.
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012511#comment-13012511 ]

Jonathan Ellis commented on CASSANDRA-1902:
---
bq. NormalMappedSegment is used in cases when we don't have a page size (JNA is not installed or not on Linux)

Makes sense.

bq. using NativeMappedSegment we don't need to duplicate buffers on slice calls

Don't we still need to duplicate, in case we unmap the sstable we read from before we return the data to the requester? Similarly, if we manually munmap, isn't there a race condition where we say "give me the list of sstables" and then, while reading, one gets compacted and unmapped?
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012513#comment-13012513 ]

Pavel Yaskevich commented on CASSANDRA-1902:
---
Pointer.getByteArray gives us the same performance as zero-reads in the current version. I have tried Pointer.getByteBuffer - it's slower than getByteArray, and it needs to be re-ordered to Big Endian.
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012517#comment-13012517 ]

Jonathan Ellis commented on CASSANDRA-1902:
---
Okay, so we *are* adding a copy by using getByteArray. I guess that takes care of the race conditions, but it baffles me that copy + wrap can be as fast as duplicate + set position. (duplicate != copy.) All the order(nativeOrder) call does is set the byte order for things like the get* methods, which all look like this:
{code}
private long getLong(long a) {
    if (unaligned) {
        long x = unsafe.getLong(a);
        return (nativeByteOrder ? x : Bits.swap(x));
    }
    return Bits.getLong(a, bigEndian);
}
{code}
And here is where unaligned gets set:
{code}
static boolean unaligned() {
    if (unalignedKnown)
        return unaligned;
    PrivilegedAction pa = new sun.security.action.GetPropertyAction("os.arch");
    String arch = (String) AccessController.doPrivileged(pa);
    unaligned = arch.equals("i386") || arch.equals("x86")
             || arch.equals("x86_64") || arch.equals("amd64")
             || arch.equals("ppc"); // Mac OS X / PPC: see Radar #3253257
    unalignedKnown = true;
    return unaligned;
}
{code}
... so byte ordering is basically a no-op on any architecture we are likely to run on.
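The byte-order point in the discussion above can be checked in plain Java (a small illustrative snippet, not part of the patch): ByteBuffer.order() only changes how multi-byte get*/put* calls interpret the underlying bytes; the bytes themselves are untouched.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderDemo {
    // Interprets the same 8 raw bytes as a long under the given byte order.
    // Only the interpretation by getLong() changes, never the stored bytes.
    static long readAs(byte[] raw, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(raw).order(order);
        return buf.getLong(0);
    }
}
```

For example, the bytes {0,0,0,0,0,0,0,1} read as 1 in big-endian order but as 1L << 56 in little-endian order, since the final byte becomes the most significant.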
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012520#comment-13012520 ]

Pavel Yaskevich commented on CASSANDRA-1902:
---
The story behind the ordering I mentioned is that Pointer.getByteBuffer always ensures native byte order while MappedByteBuffer always has Big Endian order, so we would need to fix the byte order of the Pointer.getByteBuffer result by setting it to Big Endian. But that is not the case here: in my tests, on a server with 2 GB RAM hosted on Rackspace and on high-memory and medium servers hosted on EC2, I got better performance using Pointer.getByteArray plus wrap compared to Pointer.getByteBuffer (and the getByteArray performance is almost identical to our current version).
[jira] [Commented] (CASSANDRA-2231) Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
[ https://issues.apache.org/jira/browse/CASSANDRA-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012531#comment-13012531 ]

Ed Anuff commented on CASSANDRA-2231:
---
Sylvain, in the JPA implementation we're seeing that we'd like to have a little more flexibility with the trailing end-of-component; specifically, that it be able to have values of -1, 0, 1 rather than just 0, 1. The comparison logic would look like this:
{noformat}
byte b1 = bb1.get();
byte b2 = bb2.get();
if (b1 < 0) {
    if (b2 >= 0) {
        return -1;
    }
}
if (b1 > 0) {
    if (b2 <= 0) {
        return 1;
    }
}
if ((b1 == 0) && (b2 != 0)) {
    return -b2;
}
{noformat}

Add CompositeType comparer to the comparers provided in org.apache.cassandra.db.marshal
--

Key: CASSANDRA-2231
URL: https://issues.apache.org/jira/browse/CASSANDRA-2231
Project: Cassandra
Issue Type: Improvement
Components: Contrib
Affects Versions: 0.7.3
Reporter: Ed Anuff
Assignee: Sylvain Lebresne
Priority: Minor
Fix For: 0.7.5
Attachments: CompositeType-and-DynamicCompositeType.patch, edanuff-CassandraCompositeType-1e253c4.zip

CompositeType is a custom comparer that makes it possible to create comparable composite values out of the basic types that Cassandra currently supports, such as Long, UUID, etc. This is very useful both in the creation of custom inverted indexes using columns in a skinny row, where each column name is a composite value, and when using Cassandra's built-in secondary index support, where it can be used to encode the values in the columns that Cassandra indexes. One scenario for the usage of these is documented here: http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html. Source for the contribution is attached and has previously been maintained on github here: https://github.com/edanuff/CassandraCompositeType
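The end-of-component comparison above can be wrapped as a self-contained, runnable sketch (class and method names are illustrative, not the real CompositeType code), where a result of 0 means "fall through to comparing the next component":

```java
public class EocCompareSketch {
    // Sketch of the proposed end-of-component comparison with values
    // -1, 0, 1 rather than just 0, 1. Negative eoc sorts before equal
    // components, positive eoc sorts after; 0 vs non-zero defers to the
    // other side's sign.
    static int compareEoc(byte b1, byte b2) {
        if (b1 < 0) {
            if (b2 >= 0)
                return -1;
        }
        if (b1 > 0) {
            if (b2 <= 0)
                return 1;
        }
        if ((b1 == 0) && (b2 != 0))
            return -b2;
        return 0; // continue with the next component
    }
}
```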
[jira] [Created] (CASSANDRA-2402) Python dbapi driver for CQL
Python dbapi driver for CQL
---

Key: CASSANDRA-2402
URL: https://issues.apache.org/jira/browse/CASSANDRA-2402
Project: Cassandra
Issue Type: Task
Reporter: Jon Hermes
Assignee: Jon Hermes
Fix For: 0.8

Create a driver that emulates python's dbapi.
[jira] [Commented] (CASSANDRA-2400) Resolve Maven Ant Tasks from local Maven repository if present
[ https://issues.apache.org/jira/browse/CASSANDRA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012536#comment-13012536 ]

Jeremy Hanna commented on CASSANDRA-2400:
---
Tried to clarify a couple of things with Stephen in IRC. This would probably speed up the build a little since it checks the local repo first. It also doesn't change the build process at all - it just checks the cache first. So I would be +1 on this as it is generally useful: it speeds things up for others (and helps for testing).

Resolve Maven Ant Tasks from local Maven repository if present
--

Key: CASSANDRA-2400
URL: https://issues.apache.org/jira/browse/CASSANDRA-2400
Project: Cassandra
Issue Type: Improvement
Components: Packaging
Affects Versions: 0.7.4
Reporter: Stephen Connolly
Priority: Minor
Fix For: 0.7.5
Attachments: MCASSANDRA-tweaks.patch

To aid with testing using newer versions of Maven Ant Tasks, it can be helpful to copy the version from the local repository if present rather than always downloading.
[jira] [Issue Comment Edited] (CASSANDRA-2400) Resolve Maven Ant Tasks from local Maven repository if present
[ https://issues.apache.org/jira/browse/CASSANDRA-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012536#comment-13012536 ]

Jeremy Hanna edited comment on CASSANDRA-2400 at 3/29/11 4:12 PM:
--
Tried to clarify a couple of things with Stephen in IRC. This would probably speed up the build a little since it checks the local repo first. It also doesn't change the build process at all - it just checks the cache first. So I would be +1 on this as it is generally useful in speeding up the build a bit. It also helps for testing.
[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012542#comment-13012542 ]

Tey Kar Shiang commented on CASSANDRA-2401:
---
Hi, nope. It is a query for 4 columns. I checked that only 1 row has this problem (no columns found) out of the 948 records returned; I skipped the row with zero columns. In my stress test, all rows have 4 columns; i.e. the row is the file, and the 4 columns (indexes) are things like its version, modified time, type, etc. I added all the columns when adding each file. The addition should be working, since there was no such exception on day 1, and I started and stopped the stress tests until each user had around 1500 files. The row with 0 columns was only found on the 4th day, after I continued running the test. I will keep studying the Cassandra logic, as I have little understanding of how data is loaded, stored and deleted. Any suggestion / guide on how I should go on with my study is greatly appreciated. Thank you! Btw, for this test I have not yet gone to 2 or 3 nodes. It is only a single-node Cassandra running on my localhost.
[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012547#comment-13012547 ]

Jonathan Ellis commented on CASSANDRA-2401:
---
Is there any data from earlier than 0.7.4?
svn commit: r1086640 - /cassandra/trunk/CHANGES.txt
Author: jbellis
Date: Tue Mar 29 16:47:19 2011
New Revision: 1086640

URL: http://svn.apache.org/viewvc?rev=1086640&view=rev
Log: add 1669 and 924 to CHANGES

Modified: cassandra/trunk/CHANGES.txt

Modified: cassandra/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1086640&r1=1086639&r2=1086640&view=diff
==============================================================================
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Mar 29 16:47:19 2011
@@ -1,4 +1,5 @@
 0.8-dev
+ * remove Avro RPC support (CASSANDRA-926)
  * avoid double RowMutation serialization on write path (CASSANDRA-1800)
  * adds support for columns that act as incr/decr counters (CASSANDRA-1072, 1937, 1944, 1936, 2101, 2093, 2288, 2105)
@@ -11,6 +12,7 @@
  * Fix for Cli to support updating replicate_on_write (CASSANDRA-2236)
  * JDBC driver for CQL (CASSANDRA-2124, 2302)
  * atomic switch of memtables and sstables (CASSANDRA-2284)
+ * add pluggable SeedProvider (CASSANDRA-1669)
 0.7.5
svn commit: r1086696 - /cassandra/branches/cassandra-0.7/CHANGES.txt
Author: jbellis
Date: Tue Mar 29 19:30:55 2011
New Revision: 1086696

URL: http://svn.apache.org/viewvc?rev=1086696&view=rev
Log: update CHANGES

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1086696&r1=1086695&r2=1086696&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Tue Mar 29 19:30:55 2011
@@ -63,8 +63,7 @@
  * fixes for cache save/load (CASSANDRA-2172, -2174)
  * Handle whole-row deletions in CFOutputFormat (CASSANDRA-2014)
  * Make memtable_flush_writers flush in parallel (CASSANDRA-2178)
- * make key cache preheating default to false; enable with
-   -Dcompaction_preheat_key_cache=true (CASSANDRA-2175)
+ * Add compaction_preheat_key_cache option (CASSANDRA-2175)
  * refactor stress.py to have only one copy of the format string used for creating row keys (CASSANDRA-2108)
  * validate index names for \w+ (CASSANDRA-2196)
[jira] [Updated] (CASSANDRA-2281) keep a count of errors
[ https://issues.apache.org/jira/browse/CASSANDRA-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan King updated CASSANDRA-2281:
---
Attachment: patch

update to trunk

keep a count of errors
--

Key: CASSANDRA-2281
URL: https://issues.apache.org/jira/browse/CASSANDRA-2281
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Ryan King
Assignee: Ryan King
Priority: Minor
Fix For: 0.7.5
Attachments: patch, textmate stdin Vrj9Xa.txt

I have a patch that keeps a counter (exposed via JMX) of errors. This is quite useful for operators to keep track of the quality of Cassandra without having to tail and parse logs across a cluster.
[jira] [Commented] (CASSANDRA-2281) keep a count of errors
[ https://issues.apache.org/jira/browse/CASSANDRA-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012661#comment-13012661 ] Jonathan Ellis commented on CASSANDRA-2281: --- bq. Also you could add a log4j socket appender for error level that would collect the actual errors for operators to be notified of. Another option would be to write a log4j FileAppender that also has a JMX bean. Well, you can specify multiple appenders, so all you really need is to stack the JMX one in w/ the existing RollingFileAppender. Looks like JBoss had the same idea: http://community.jboss.org/wiki/JMXNotificationAppender keep a count of errors -- Key: CASSANDRA-2281 URL: https://issues.apache.org/jira/browse/CASSANDRA-2281 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Ryan King Assignee: Ryan King Priority: Minor Fix For: 0.7.5 Attachments: patch, textmate stdin Vrj9Xa.txt I have patch that keeps a counter (exposed via jmx) of errors. This is quite useful for operators to keep track of the quality of cassandra without having to tail and parse logs across a cluster. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
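Since appenders can be stacked, the counting half of the idea reduces to an MBean plus an increment hook. Below is a JDK-only sketch (the `ErrorCounter`/`ErrorCounterMBean` names are hypothetical, not from Ryan's attached patch): a counter registered with the platform MBeanServer, which a log4j appender's `append()` would bump for ERROR-level events.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Standard MBean interface: JMX requires it to be named <class>MBean.
interface ErrorCounterMBean
{
    long getErrorCount();
}

public class ErrorCounter implements ErrorCounterMBean
{
    private final AtomicLong errors = new AtomicLong();

    public long getErrorCount() { return errors.get(); }

    // A logging appender would call this from append() for ERROR-level events.
    public void recordError() { errors.incrementAndGet(); }

    public static ErrorCounter registerMBean(String name) throws Exception
    {
        ErrorCounter counter = new ErrorCounter();
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        server.registerMBean(counter, new ObjectName(name));
        return counter;
    }

    public static void main(String[] args) throws Exception
    {
        ErrorCounter counter = registerMBean("org.apache.cassandra:type=ErrorCount");
        counter.recordError();
        counter.recordError();
        // Read the attribute back through JMX, as an operator's tooling would.
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Long count = (Long) server.getAttribute(new ObjectName("org.apache.cassandra:type=ErrorCount"), "ErrorCount");
        System.out.println("errors = " + count); // errors = 2
    }
}
```

Registering this alongside the existing RollingFileAppender, as Jonathan suggests, would leave log output untouched while giving operators a single polled number per node.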
[jira] [Commented] (CASSANDRA-2311) type validated row keys
[ https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012668#comment-13012668 ] Jonathan Ellis commented on CASSANDRA-2311: --- I'm getting this running nosetests -x test/system/test_thrift_server.py w/ Eric's patch applied: {noformat} java.lang.NullPointerException at org.apache.cassandra.thrift.ThriftValidation.validateKeyType(ThriftValidation.java:68) at org.apache.cassandra.thrift.CassandraServer.internal_batch_mutate(CassandraServer.java:391) at org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:418) {noformat} Also: shouldn't we move validateKeyType into validateKey? type validated row keys --- Key: CASSANDRA-2311 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Eric Evans Assignee: Jon Hermes Labels: cql Fix For: 0.8 Attachments: 2311.txt, v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt The idea here is to allow the assignment of a column-family-wide key type used to perform validation, (similar to how default_validation_class does for column values). This should be as straightforward as extending the column family schema to include the new attribute, and updating {{ThriftValidation.validateKey}} to validate the key ({{AbstractType.validate}}). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2321) disallow querying a counter CF with non-counter operation
[ https://issues.apache.org/jira/browse/CASSANDRA-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2321: -- Summary: disallow to querying a counter CF with non-counter operation (was: Counter column values shows in hex values. Need to show it in string value.) disallow to querying a counter CF with non-counter operation Key: CASSANDRA-2321 URL: https://issues.apache.org/jira/browse/CASSANDRA-2321 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8 Environment: Linux Reporter: Mubarak Seyed Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8 Attachments: 0001-Don-t-allow-normal-query-on-counter-CF.patch CounterColumnType.getString() returns hexString. {code} public String getString(ByteBuffer bytes) { return ByteBufferUtil.bytesToHex(bytes); } {code} and python stress.py reader returns [ColumnOrSuperColumn(column=None, super_column=SuperColumn(name='19', columns=[Column(timestamp=1299984960277, name='56', value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00,', ttl=None), Column(timestamp=1299985019923, name='57', value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00;\x00\x00\x00\x00\x00\x00\x08\xfd', ttl=None))] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1086729 - in /cassandra/trunk: src/java/org/apache/cassandra/config/ src/java/org/apache/cassandra/cql/ src/java/org/apache/cassandra/thrift/ test/system/ test/unit/org/apache/cassandra/t
Author: jbellis Date: Tue Mar 29 20:39:37 2011 New Revision: 1086729 URL: http://svn.apache.org/viewvc?rev=1086729view=rev Log: disallow querying a counter CF with non-counter operation patch by slebresne; reviewed by jbellis for CASSANDRA-2321 Modified: cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java cassandra/trunk/test/system/test_thrift_server.py cassandra/trunk/test/unit/org/apache/cassandra/thrift/ThriftValidationTest.java Modified: cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java?rev=1086729r1=1086728r2=1086729view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java Tue Mar 29 20:39:37 2011 @@ -468,6 +468,11 @@ public final class CFMetaData { return Collections.unmodifiableMap(column_metadata); } + +public AbstractType getComparatorFor(ByteBuffer superColumnName) +{ +return superColumnName == null ? 
comparator : subcolumnComparator; +} public boolean equals(Object obj) { Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java?rev=1086729&r1=1086728&r2=1086729&view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Tue Mar 29 20:39:37 2011 @@ -236,7 +236,7 @@ public class QueryProcessor // FIXME: keys as ascii is not a Real Solution ByteBuffer key = update.getKey().getByteBuffer(AsciiType.instance); validateKey(key); -validateColumnFamily(keyspace, cfname); +validateColumnFamily(keyspace, update.getColumnFamily(), false); validateKeyType(key, keyspace, cfname); AbstractType<?> comparator = update.getComparator(keyspace); @@ -460,7 +460,7 @@ public class QueryProcessor case SELECT: SelectStatement select = (SelectStatement)statement.statement; clientState.hasColumnFamilyAccess(select.getColumnFamily(), Permission.READ); -validateColumnFamily(keyspace, select.getColumnFamily()); +validateColumnFamily(keyspace, select.getColumnFamily(), false); validateSelect(keyspace, select); List<org.apache.cassandra.db.Row> rows = null; Modified: cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java?rev=1086729&r1=1086728&r2=1086729&view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java Tue Mar 29 20:39:37 2011 @@ -241,7 +241,7 @@ public class CassandraServer implements logger.debug("get_slice"); state().hasColumnFamilyAccess(column_parent.column_family, Permission.READ); -return multigetSliceInternal(state().getKeyspace(), Collections.singletonList(key), column_parent, predicate, consistency_level).get(key); +return multigetSliceInternal(state().getKeyspace(), Collections.singletonList(key), column_parent, predicate, consistency_level, false).get(key); } public Map<ByteBuffer, List<ColumnOrSuperColumn>> multiget_slice(List<ByteBuffer> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level) @@ -250,14 +250,15 @@ public class CassandraServer implements logger.debug("multiget_slice"); state().hasColumnFamilyAccess(column_parent.column_family, Permission.READ); -return multigetSliceInternal(state().getKeyspace(), keys, column_parent, predicate, consistency_level); +return multigetSliceInternal(state().getKeyspace(), keys, column_parent, predicate, consistency_level, false); } -private Map<ByteBuffer, List<ColumnOrSuperColumn>> multigetSliceInternal(String keyspace, List<ByteBuffer> keys, ColumnParent column_parent, SlicePredicate predicate, ConsistencyLevel consistency_level) +private Map<ByteBuffer, List<ColumnOrSuperColumn>> multigetSliceInternal(String keyspace, List<ByteBuffer>
[jira] [Commented] (CASSANDRA-2311) type validated row keys
[ https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012684#comment-13012684 ] Jon Hermes commented on CASSANDRA-2311: --- I'm getting different test failures right now (the system tests are in a state of flux), but the NPE definitely means the CFMetaData didn't get loaded correctly or we're asking for it incorrectly. As for moving vKT into vK, they have different args. Assume key K validates correctly. K may validate correctly in CF Foo (because the keyValidator is a no-op BytesType), but fail to validate in CF Bar (because the keyValidator is something restrictive such as UTF8, and K is random bytes). Ideally we'd like to just ask for the CFMD.keyValidator in every vK, but we don't know that the CF they're asking for is valid either. We happen to have a nice validateCF already, and it's a waste to duplicate work. Hence why my original function was named validateKeyInCF... first you validate the key on its own (it's nonzero bytes, etc.). Then you validate the cf (it's a valid cf in the system). Lastly you validate that the key IN that cf is altogether valid. type validated row keys --- Key: CASSANDRA-2311 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Eric Evans Assignee: Jon Hermes Labels: cql Fix For: 0.8 Attachments: 2311.txt, v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt The idea here is to allow the assignment of a column-family-wide key type used to perform validation, (similar to how default_validation_class does for column values). This should be as straightforward as extending the column family schema to include the new attribute, and updating {{ThriftValidation.validateKey}} to validate the key ({{AbstractType.validate}}). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
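Jon's three-step ordering (validate the key alone, then the CF, then the key *in* that CF) can be sketched in isolation. The class below is a stand-in, not the real `ThriftValidation` code: the schema is a plain map from CF name to key-validator name, and the UTF-8 check approximates what `UTF8Type.validate` does via `AbstractType.validate`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class KeyValidation
{
    // Step 1: key is well-formed on its own (non-null, non-zero bytes).
    static void validateKey(ByteBuffer key)
    {
        if (key == null || key.remaining() == 0)
            throw new IllegalArgumentException("Key may not be empty");
    }

    // Step 2: the column family is a valid CF in the system.
    static String validateColumnFamily(Map<String, String> schema, String cf)
    {
        String keyValidatorClass = schema.get(cf);
        if (keyValidatorClass == null)
            throw new IllegalArgumentException("unknown column family " + cf);
        return keyValidatorClass;
    }

    // Step 3: the key is valid IN that CF, per the CF's key validator.
    // BytesType is a no-op; UTF8Type rejects byte sequences that are not valid UTF-8.
    static void validateKeyInCF(ByteBuffer key, String keyValidatorClass)
    {
        validateKey(key);
        if (keyValidatorClass.equals("UTF8Type"))
        {
            try
            {
                StandardCharsets.UTF_8.newDecoder().decode(key.duplicate());
            }
            catch (CharacterCodingException e)
            {
                throw new IllegalArgumentException("key is not valid UTF-8", e);
            }
        }
    }
}
```

This reproduces the Foo/Bar example above: the same random bytes pass in a BytesType CF but are rejected in a UTF8Type CF, which is why the check needs the CF's metadata and cannot live in step 1.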
[jira] [Updated] (CASSANDRA-1263) Push replication factor down to the replication strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Hermes updated CASSANDRA-1263: -- Attachment: 1263-3.txt rebased. NOTE: system tests are currently failing. If the system tests in trunk runs fine and these continue to fail, please re-open. Push replication factor down to the replication strategy Key: CASSANDRA-1263 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263 Project: Cassandra Issue Type: Task Components: Core Reporter: Jeremy Hanna Assignee: Jon Hermes Priority: Minor Fix For: 0.8 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt Original Estimate: 8h Remaining Estimate: 8h Currently the replication factor is in the keyspace metadata. As we've added the datacenter shard strategy, the replication factor becomes more computed by the replication strategy. It seems reasonable to therefore push the replication factor for the keyspace down to the replication strategy so that it can be handled in one place. This adds on the work being done in CASSANDRA-1066 since that ticket will make the replication strategy a member variable of keyspace metadata instead of just a quasi singleton giving the replication strategy state for each keyspace. That makes it able to have the replication factor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2311) type validated row keys
[ https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012709#comment-13012709 ] Eric Evans commented on CASSANDRA-2311: --- bq. Also: shouldn't we move validateKeyType into validateKey? FWIW, this is not used in CQL (because the ByteBuffer is created by {{AT.fromString}}, there is no need to invoke {{validate()}}). So {{validateKeyType}}'s use is limited to those two invocations in {{CassandraServer}}. type validated row keys --- Key: CASSANDRA-2311 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Eric Evans Assignee: Jon Hermes Labels: cql Fix For: 0.8 Attachments: 2311.txt, v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt The idea here is to allow the assignment of a column-family-wide key type used to perform validation, (similar to how default_validation_class does for column values). This should be as straightforward as extending the column family schema to include the new attribute, and updating {{ThriftValidation.validateKey}} to validate the key ({{AbstractType.validate}}). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1086755 - in /cassandra/trunk/src/java/org/apache/cassandra: config/CFMetaData.java cql/QueryProcessor.java thrift/CassandraServer.java thrift/ThriftValidation.java
Author: jbellis Date: Tue Mar 29 21:26:48 2011 New Revision: 1086755 URL: http://svn.apache.org/viewvc?rev=1086755view=rev Log: merge validateKey/validateKeyType, add CF validation to cql, add comparator to cql name validation. fixes test NPE. patch by jbellis Modified: cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java cassandra/trunk/src/java/org/apache/cassandra/thrift/CassandraServer.java cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java Modified: cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java?rev=1086755r1=1086754r2=1086755view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/config/CFMetaData.java Tue Mar 29 21:26:48 2011 @@ -730,6 +730,7 @@ public final class CFMetaData def.memtable_throughput_in_mb = cfm.memtableThroughputInMb; def.memtable_operations_in_millions = cfm.memtableOperationsInMillions; def.merge_shards_chance = cfm.mergeShardsChance; +def.key_validation_class = cfm.keyValidator.getClass().getName(); Listorg.apache.cassandra.db.migration.avro.ColumnDef column_meta = new ArrayListorg.apache.cassandra.db.migration.avro.ColumnDef(cfm.column_metadata.size()); for (ColumnDefinition cd : cfm.column_metadata.values()) { Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java?rev=1086755r1=1086754r2=1086755view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Tue Mar 29 21:26:48 2011 @@ -69,7 +69,7 @@ import com.google.common.collect.Maps; import static 
org.apache.cassandra.thrift.ThriftValidation.validateKey; import static org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily; -import static org.apache.cassandra.thrift.ThriftValidation.validateKeyType; +import static org.apache.cassandra.thrift.ThriftValidation.validateColumnNames; public class QueryProcessor { @@ -86,8 +86,9 @@ public class QueryProcessor assert select.getKeys().size() == 1; ByteBuffer key = select.getKeys().get(0).getByteBuffer(AsciiType.instance); -validateKey(key); - +CFMetaData metadata = validateColumnFamily(keyspace, select.getColumnFamily(), false); +validateKey(metadata, key); + // ...of a list of column names if (!select.isColumnRange()) { @@ -95,7 +96,7 @@ public class QueryProcessor for (Term column : select.getColumnNames()) columnNames.add(column.getByteBuffer(comparator)); -validateColumnNames(keyspace, select.getColumnFamily(), columnNames); +validateColumnNames(metadata, null, columnNames); commands.add(new SliceByNamesReadCommand(keyspace, key, queryPath, columnNames)); } // ...a range (slice) of column names @@ -104,7 +105,7 @@ public class QueryProcessor ByteBuffer start = select.getColumnStart().getByteBuffer(comparator); ByteBuffer finish = select.getColumnFinish().getByteBuffer(comparator); -validateSliceRange(keyspace, select.getColumnFamily(), start, finish, select.isColumnsReversed()); +validateSliceRange(metadata, start, finish, select.isColumnsReversed()); commands.add(new SliceFromReadCommand(keyspace, key, queryPath, @@ -140,10 +141,11 @@ public class QueryProcessor IPartitioner<?> p = StorageService.getPartitioner(); AbstractBounds bounds = new Bounds(p.getToken(startKey), p.getToken(finishKey)); -AbstractType<?> comparator = select.getComparator(keyspace); +CFMetaData metadata = validateColumnFamily(keyspace, select.getColumnFamily(), false); +AbstractType<?> comparator = metadata.getComparatorFor(null); // XXX: Our use of Thrift structs internally makes me Sad. :( SlicePredicate thriftSlicePredicate = slicePredicateFromSelect(select, comparator); -validateSlicePredicate(keyspace, select.getColumnFamily(), thriftSlicePredicate); +validateSlicePredicate(metadata, thriftSlicePredicate); try { @@ -174,10 +176,11 @@ public class QueryProcessor private static List<org.apache.cassandra.db.Row>
buildbot failure in ASF Buildbot on cassandra-trunk
The Buildbot has detected a new failure on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1193 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1086755 Blamelist: jbellis BUILD FAILED: failed compile sincerely, -The Buildbot
svn commit: r1086759 - /cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java
Author: jbellis Date: Tue Mar 29 21:37:14 2011 New Revision: 1086759 URL: http://svn.apache.org/viewvc?rev=1086759view=rev Log: avoid unnecessary type validation in cql patch by jbellis Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java?rev=1086759r1=1086758r2=1086759view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java (original) +++ cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java Tue Mar 29 21:37:14 2011 @@ -63,13 +63,12 @@ import org.apache.cassandra.service.Stor import org.apache.cassandra.thrift.Column; import org.apache.cassandra.thrift.*; import org.apache.cassandra.utils.ByteBufferUtil; +import org.apache.cassandra.utils.FBUtilities; import com.google.common.base.Predicates; import com.google.common.collect.Maps; -import static org.apache.cassandra.thrift.ThriftValidation.validateKey; import static org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily; -import static org.apache.cassandra.thrift.ThriftValidation.validateColumnNames; public class QueryProcessor { @@ -87,7 +86,7 @@ public class QueryProcessor ByteBuffer key = select.getKeys().get(0).getByteBuffer(AsciiType.instance); CFMetaData metadata = validateColumnFamily(keyspace, select.getColumnFamily(), false); -validateKey(metadata, key); +validateKey(key); // ...of a list of column names if (!select.isColumnRange()) @@ -96,7 +95,7 @@ public class QueryProcessor for (Term column : select.getColumnNames()) columnNames.add(column.getByteBuffer(comparator)); -validateColumnNames(metadata, null, columnNames); +validateColumnNames(columnNames); commands.add(new SliceByNamesReadCommand(keyspace, key, queryPath, columnNames)); } // ...a range (slice) of column names @@ -238,7 +237,7 @@ public class QueryProcessor // FIXME: keys as 
ascii is not a Real Solution ByteBuffer key = update.getKey().getByteBuffer(AsciiType.instance); -validateKey(metadata, key); +validateKey(key); AbstractType<?> comparator = update.getComparator(keyspace); RowMutation rm = new RowMutation(keyspace, key); @@ -367,16 +366,45 @@ public class QueryProcessor } } -private static void validateColumnName(CFMetaData metadata, ByteBuffer column) +private static void validateKey(ByteBuffer key) throws InvalidRequestException +{ +if (key == null || key.remaining() == 0) +{ +throw new InvalidRequestException("Key may not be empty"); +} + +// check that key can be handled by FBUtilities.writeShortByteArray +if (key.remaining() > FBUtilities.MAX_UNSIGNED_SHORT) +{ +throw new InvalidRequestException("Key length of " + key.remaining() + " is longer than maximum of " + FBUtilities.MAX_UNSIGNED_SHORT); +} +} + +private static void validateColumnNames(Iterable<ByteBuffer> columns) +throws InvalidRequestException +{ +for (ByteBuffer name : columns) +{ +if (name.remaining() > IColumn.MAX_NAME_LENGTH) +throw new InvalidRequestException(String.format("column name is too long (%s > %s)", name.remaining(), IColumn.MAX_NAME_LENGTH)); +if (name.remaining() == 0) +throw new InvalidRequestException("zero-length column name"); +} +} + +private static void validateColumnName(ByteBuffer column) throws InvalidRequestException { -validateColumnNames(metadata, null, Arrays.asList(column)); +validateColumnNames(Arrays.asList(column)); } private static void validateColumn(CFMetaData metadata, ByteBuffer name, ByteBuffer value) throws InvalidRequestException { -validateColumnName(metadata, name); +validateColumnName(name); AbstractType<?> validator = metadata.getValueValidator(name); try @@ -398,7 +426,7 @@ public class QueryProcessor if (predicate.slice_range != null) validateSliceRange(metadata, predicate.slice_range); else -validateColumnNames(metadata, null, predicate.column_names); +validateColumnNames(predicate.column_names); } private static void validateSliceRange(CFMetaData metadata, SliceRange range) @@ -578,7 +606,7 @@ public
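The new `QueryProcessor.validateKey` in this commit enforces only length bounds: keys are serialized with a two-byte length prefix (`FBUtilities.writeShortByteArray`), hence the unsigned-short cap. A standalone sketch of that check, with the 0xFFFF constant inlined rather than taken from `FBUtilities`:

```java
import java.nio.ByteBuffer;

public class KeyLengthCheck
{
    // writeShortByteArray stores the key length in an unsigned 16-bit field,
    // so no key may exceed 65535 bytes.
    static final int MAX_UNSIGNED_SHORT = 0xFFFF;

    static void validateKey(ByteBuffer key)
    {
        if (key == null || key.remaining() == 0)
            throw new IllegalArgumentException("Key may not be empty");
        if (key.remaining() > MAX_UNSIGNED_SHORT)
            throw new IllegalArgumentException("Key length of " + key.remaining()
                                               + " is longer than maximum of " + MAX_UNSIGNED_SHORT);
    }
}
```

Unlike the Thrift-side `validateKey(metadata, key)`, this variant deliberately skips type validation: CQL terms are produced by `AbstractType.fromString`, so re-validating the bytes would be redundant, as Eric notes above.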
[jira] [Resolved] (CASSANDRA-2311) type validated row keys
[ https://issues.apache.org/jira/browse/CASSANDRA-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-2311. --- Resolution: Fixed Reviewer: urandom Committed Eric's fix and some other Thrift fixes. Merged TV type validation into validateKey; added QP.validateKey that only checks length. type validated row keys --- Key: CASSANDRA-2311 URL: https://issues.apache.org/jira/browse/CASSANDRA-2311 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Eric Evans Assignee: Jon Hermes Labels: cql Fix For: 0.8 Attachments: 2311.txt, v1-0001-CASSANDRA-2311-missed-CFM-conversion.txt The idea here is to allow the assignment of a column-family-wide key type used to perform validation, (similar to how default_validation_class does for column values). This should be as straightforward as extending the column family schema to include the new attribute, and updating {{ThriftValidation.validateKey}} to validate the key ({{AbstractType.validate}}). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
buildbot success in ASF Buildbot on cassandra-trunk
The Buildbot has detected a restored build on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1194 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1086759 Blamelist: jbellis Build succeeded! sincerely, -The Buildbot
[jira] [Commented] (CASSANDRA-1263) Push replication factor down to the replication strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012753#comment-13012753 ] Jeremy Hanna commented on CASSANDRA-1263: - Overall looks like a clean way of pushing down the RF to strategy. A few minor points: In KSMetaData it has the following: {code} StringBuilder sb = new StringBuilder(); sb.append(name) .append("rep factor:") .append("rep strategy:") .append(strategyClass.getSimpleName()) .append("{") .append(StringUtils.join(cfMetaData.values(), ", ")) .append("}"); return sb.toString(); {code} Shouldn't the "rep factor" String be gone along with the variable output? The end of SimpleStrategy - the curly brace should be on its own line. I see an instance of replication_factor in Cli.g - not sure if that matters. Seems that's just for typing generally. Looks like CQL still has some references to the way things were with RF - that could be a separate issue I would think. Push replication factor down to the replication strategy Key: CASSANDRA-1263 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263 Project: Cassandra Issue Type: Task Components: Core Reporter: Jeremy Hanna Assignee: Jon Hermes Priority: Minor Fix For: 0.8 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt Original Estimate: 8h Remaining Estimate: 8h Currently the replication factor is in the keyspace metadata. As we've added the datacenter shard strategy, the replication factor becomes more computed by the replication strategy. It seems reasonable to therefore push the replication factor for the keyspace down to the replication strategy so that it can be handled in one place. This adds on the work being done in CASSANDRA-1066 since that ticket will make the replication strategy a member variable of keyspace metadata instead of just a quasi singleton giving the replication strategy state for each keyspace. That makes it able to have the replication factor.
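Jeremy's point is that with the replication factor moved into the strategy, the leftover "rep factor:" label appends nothing useful. A simplified stand-in (not the actual `KSMetaData.toString()`, and using `String.join` instead of commons-lang `StringUtils.join`) shows what the cleaned-up version would produce once the label is dropped:

```java
import java.util.Arrays;
import java.util.List;

public class KsToString
{
    // Simplified stand-in for KSMetaData.toString() after the patch:
    // the dangling "rep factor:" label is gone; RF now lives in the strategy.
    static String describe(String name, String strategySimpleName, List<String> cfNames)
    {
        return new StringBuilder()
                .append(name)
                .append("rep strategy:")
                .append(strategySimpleName)
                .append("{")
                .append(String.join(", ", cfNames))
                .append("}")
                .toString();
    }
}
```

With `describe("Keyspace1", "SimpleStrategy", Arrays.asList("Standard1", "Super1"))` the result no longer contains a "rep factor" fragment; any per-strategy RF would surface through the strategy's own options instead.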
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1263) Push replication factor down to the replication strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012757#comment-13012757 ] Jeremy Hanna commented on CASSANDRA-1263: - Btw - Jon - how extensively did you test this? I wonder if something like RF + Strategy could be something that could be handled in a distributed test. Push replication factor down to the replication strategy Key: CASSANDRA-1263 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263 Project: Cassandra Issue Type: Task Components: Core Reporter: Jeremy Hanna Assignee: Jon Hermes Priority: Minor Fix For: 0.8 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt Original Estimate: 8h Remaining Estimate: 8h Currently the replication factor is in the keyspace metadata. As we've added the datacenter shard strategy, the replication factor becomes more computed by the replication strategy. It seems reasonable to therefore push the replication factor for the keyspace down to the replication strategy so that it can be handled in one place. This adds on the work being done in CASSANDRA-1066 since that ticket will make the replication strategy a member variable of keyspace metadata instead of just a quasi singleton giving the replication strategy state for each keyspace. That makes it able to have the replication factor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1263) Push replication factor down to the replication strategy
[ https://issues.apache.org/jira/browse/CASSANDRA-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012766#comment-13012766 ] Jeremy Hanna commented on CASSANDRA-1263: - One other small thing that would be easy to fix while you're in there - in CliUserHelp: {code} state.out.println("update keyspace foo with"); state.out.println("placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy';"); state.out.println("and strategy_options=[{replication_factor:4}];"); {code} That second line shouldn't have a ';' at the end of it - it's the middle of the update keyspace statement. Push replication factor down to the replication strategy Key: CASSANDRA-1263 URL: https://issues.apache.org/jira/browse/CASSANDRA-1263 Project: Cassandra Issue Type: Task Components: Core Reporter: Jeremy Hanna Assignee: Jon Hermes Priority: Minor Fix For: 0.8 Attachments: 1263-2.txt, 1263-3.txt, 1263-incomplete.txt, 1263.txt Original Estimate: 8h Remaining Estimate: 8h Currently the replication factor is in the keyspace metadata. As we've added the datacenter shard strategy, the replication factor becomes more computed by the replication strategy. It seems reasonable to therefore push the replication factor for the keyspace down to the replication strategy so that it can be handled in one place. This adds on the work being done in CASSANDRA-1066 since that ticket will make the replication strategy a member variable of keyspace metadata instead of just a quasi singleton giving the replication strategy state for each keyspace. That makes it able to have the replication factor. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1086806 - in /cassandra/trunk: src/java/org/apache/cassandra/cql/ test/system/
Author: eevans Date: Tue Mar 29 23:41:43 2011 New Revision: 1086806 URL: http://svn.apache.org/viewvc?rev=1086806view=rev Log: CQL support for typed keys Patch by eevans Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java cassandra/trunk/src/java/org/apache/cassandra/cql/QueryProcessor.java cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java cassandra/trunk/test/system/test_cql.py Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g?rev=1086806r1=1086805r2=1086806view=diff == --- cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g (original) +++ cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g Tue Mar 29 23:41:43 2011 @@ -270,16 +270,18 @@ createKeyspaceStatement returns [CreateK */ createColumnFamilyStatement returns [CreateColumnFamilyStatement expr] : K_CREATE K_COLUMNFAMILY name=( IDENT | STRING_LITERAL | INTEGER ) { $expr = new CreateColumnFamilyStatement($name.text); } - ( '(' - col1=term v1=createCfamColumnValidator { $expr.addColumn(col1, $v1.validator); } ( ',' - colN=term vN=createCfamColumnValidator { $expr.addColumn(colN, $vN.validator); } )* - ')' )? + ( '(' createCfamColumns[expr] ( ',' createCfamColumns[expr] )* ')' )? ( K_WITH prop1=IDENT '=' arg1=createCfamKeywordArgument { $expr.addProperty($prop1.text, $arg1.arg); } ( K_AND propN=IDENT '=' argN=createCfamKeywordArgument { $expr.addProperty($propN.text, $argN.arg); } )* )? 
      endStmnt
    ;

+createCfamColumns[CreateColumnFamilyStatement expr]
+    : n=term v=createCfamColumnValidator { $expr.addColumn(n, $v.validator); }
+    | K_KEY v=createCfamColumnValidator K_PRIMARY K_KEY { $expr.setKeyType($v.validator); }
+    ;
+
createCfamColumnValidator returns [String validator]
    : comparatorType { $validator = $comparatorType.text; }
    | STRING_LITERAL { $validator = $STRING_LITERAL.text; }
@@ -378,6 +380,7 @@ K_COLUMNFAMILY: C O L U M N F A M I L Y;
K_INDEX:       I N D E X;
K_ON:          O N;
K_DROP:        D R O P;
+K_PRIMARY:     P R I M A R Y;

// Case-insensitive alpha characters
fragment A: ('a'|'A');

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java?rev=1086806&r1=1086805&r2=1086806&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java Tue Mar 29 23:41:43 2011
@@ -87,6 +87,7 @@ public class CreateColumnFamilyStatement
     private final String name;
     private final Map<Term, String> columns = new HashMap<Term, String>();
     private final Map<String, String> properties = new HashMap<String, String>();
+    private String keyValidator;

     public CreateColumnFamilyStatement(String name)
     {
@@ -157,6 +158,11 @@ public class CreateColumnFamilyStatement
         columns.put(term, comparator);
     }

+    public void setKeyType(String validator)
+    {
+        this.keyValidator = validator;
+    }
+
     /** Map a keyword to the corresponding value */
     public void addProperty(String name, String value)
     {
@@ -180,7 +186,7 @@ public class CreateColumnFamilyStatement
         {
             ByteBuffer columnName = col.getKey().getByteBuffer(comparator);
             String validatorClassName = comparators.containsKey(col.getValue()) ? comparators.get(col.getValue()) : col.getValue();
-            AbstractType validator = DatabaseDescriptor.getComparator(validatorClassName);
+            AbstractType<?> validator = DatabaseDescriptor.getComparator(validatorClassName);
             columnDefs.put(columnName, new ColumnDefinition(columnName, validator, null, null));
         }
         catch (ConfigurationException e)
@@ -212,6 +218,7 @@ public class CreateColumnFamilyStatement
         // RPC uses BytesType as the default validator/comparator but BytesType expects hex for string terms, (not convenient).
         AbstractType<?> comparator = DatabaseDescriptor.getComparator(comparators.get(getPropertyString(KW_COMPARATOR, "utf8")));
         String validator = getPropertyString(KW_DEFAULTVALIDATION, "utf8");
+        AbstractType<?> keyType = DatabaseDescriptor.getComparator(comparators.get((keyValidator != null) ? keyValidator : "utf8"));

         newCFMD = new CFMetaData(keyspace,
                                  name,
@@ -234,7 +241,8 @@ public class
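The hunk above defaults the key validator to utf8 when no PRIMARY KEY type was declared. A minimal Python sketch of that fallback (the shorthand map below is abbreviated and the helper name is hypothetical; the `int` → IntegerType mapping is taken from the test in the follow-up commit):

```python
# Hypothetical sketch of the key-validator resolution added in r1086806.
# COMPARATORS mirrors CQL's type-name shorthand; only a few entries shown.
COMPARATORS = {
    "utf8": "org.apache.cassandra.db.marshal.UTF8Type",
    "int": "org.apache.cassandra.db.marshal.IntegerType",
    "ascii": "org.apache.cassandra.db.marshal.AsciiType",
}

def resolve_key_type(key_validator):
    """Fall back to utf8 when no PRIMARY KEY type was declared."""
    return COMPARATORS[key_validator if key_validator is not None else "utf8"]
```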
svn commit: r1086807 - in /cassandra/trunk: src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java test/system/test_cql.py
Author: eevans
Date: Tue Mar 29 23:41:49 2011
New Revision: 1086807
URL: http://svn.apache.org/viewvc?rev=1086807&view=rev
Log: allow exactly one PRIMARY KEY definition

Patch by eevans

Modified:
  cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
  cassandra/trunk/test/system/test_cql.py

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java?rev=1086807&r1=1086806&r2=1086807&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/CreateColumnFamilyStatement.java Tue Mar 29 23:41:49 2011
@@ -21,8 +21,10 @@ package org.apache.cassandra.cql;

 import java.nio.ByteBuffer;
+import java.util.ArrayList;
 import java.util.HashMap;
 import java.util.HashSet;
+import java.util.List;
 import java.util.Map;
 import java.util.Set;

@@ -87,7 +89,7 @@ public class CreateColumnFamilyStatement
     private final String name;
     private final Map<Term, String> columns = new HashMap<Term, String>();
     private final Map<String, String> properties = new HashMap<String, String>();
-    private String keyValidator;
+    private List<String> keyValidator = new ArrayList<String>();

     public CreateColumnFamilyStatement(String name)
     {
@@ -150,6 +152,12 @@ public class CreateColumnFamilyStatement
         if ((memOps != null) && (memOps <= 0))
             throw new InvalidRequestException(String.format("%s must be non-negative and greater than zero", KW_MEMTABLEOPSINMILLIONS));
+
+        // Ensure that exactly one key has been specified.
+        if (keyValidator.size() < 1)
+            throw new InvalidRequestException("You must specify a PRIMARY KEY");
+        else if (keyValidator.size() > 1)
+            throw new InvalidRequestException("You may only specify one PRIMARY KEY");
     }

     /** Map a column name to a validator for its value */
@@ -160,7 +168,12 @@ public class CreateColumnFamilyStatement

     public void setKeyType(String validator)
     {
-        this.keyValidator = validator;
+        keyValidator.add(validator);
+    }
+
+    public String getKeyType()
+    {
+        return keyValidator.get(0);
     }

     /** Map a keyword to the corresponding value */
@@ -218,7 +231,6 @@ public class CreateColumnFamilyStatement
         // RPC uses BytesType as the default validator/comparator but BytesType expects hex for string terms, (not convenient).
         AbstractType<?> comparator = DatabaseDescriptor.getComparator(comparators.get(getPropertyString(KW_COMPARATOR, "utf8")));
         String validator = getPropertyString(KW_DEFAULTVALIDATION, "utf8");
-        AbstractType<?> keyType = DatabaseDescriptor.getComparator(comparators.get((keyValidator != null) ? keyValidator : "utf8"));

         newCFMD = new CFMetaData(keyspace,
                                  name,
@@ -242,7 +254,7 @@ public class CreateColumnFamilyStatement
                    .memOps(getPropertyDouble(KW_MEMTABLEOPSINMILLIONS, CFMetaData.DEFAULT_MEMTABLE_OPERATIONS_IN_MILLIONS))
                    .mergeShardsChance(0.0)
                    .columnMetadata(getColumns(comparator))
-                   .keyValidator(keyType);
+                   .keyValidator(DatabaseDescriptor.getComparator(comparators.get(getKeyType())));
         }
         catch (ConfigurationException e)
         {

Modified: cassandra/trunk/test/system/test_cql.py
URL: http://svn.apache.org/viewvc/cassandra/trunk/test/system/test_cql.py?rev=1086807&r1=1086806&r2=1086807&view=diff
==
--- cassandra/trunk/test/system/test_cql.py (original)
+++ cassandra/trunk/test/system/test_cql.py Tue Mar 29 23:41:49 2011
@@ -365,6 +365,7 @@ class TestCql(ThriftTester):
         conn.execute("""
             CREATE COLUMNFAMILY NewCf1 (
+                KEY int PRIMARY KEY,
                 'username' utf8,
                 'age' int,
                 'birthdate' long,
@@ -383,25 +384,31 @@ class TestCql(ThriftTester):
         assert cfam.comment == "shiny, new, cf"
         assert cfam.default_validation_class == "org.apache.cassandra.db.marshal.AsciiType"
         assert cfam.comparator_type == "org.apache.cassandra.db.marshal.UTF8Type"
+        assert cfam.key_validation_class == "org.apache.cassandra.db.marshal.IntegerType"

-        # No column defs, defaults all-around
-        conn.execute("CREATE COLUMNFAMILY NewCf2")
-        ksdef = thrift_client.describe_keyspace("CreateCFKeyspace")
-        assert len(ksdef.cf_defs) == 2, \
-            "expected 2 column families total, found %d" % len(ksdef.cf_defs)
+
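The rule enforced in r1086807 (setKeyType() appends to a list, and validation rejects zero or multiple declarations) can be sketched in Python; the class and method names mirror the Java, but the sketch itself is illustrative rather than the actual implementation:

```python
# Sketch of the "exactly one PRIMARY KEY" validation from r1086807.
class InvalidRequestException(Exception):
    pass

class CreateColumnFamilyStatement:
    def __init__(self):
        # Each PRIMARY KEY declaration appends its validator here.
        self.key_validators = []

    def set_key_type(self, validator):
        self.key_validators.append(validator)

    def validate(self):
        # Reject zero or multiple PRIMARY KEY declarations.
        if len(self.key_validators) < 1:
            raise InvalidRequestException("You must specify a PRIMARY KEY")
        if len(self.key_validators) > 1:
            raise InvalidRequestException("You may only specify one PRIMARY KEY")

    def get_key_type(self):
        return self.key_validators[0]
```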
buildbot failure in ASF Buildbot on cassandra-trunk
The Buildbot has detected a new failure on builder cassandra-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/cassandra-trunk/builds/1195 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: isis_ubuntu Build Reason: scheduler Build Source Stamp: [branch cassandra/trunk] 1086806 Blamelist: eevans BUILD FAILED: failed compile sincerely, -The Buildbot
[jira] [Commented] (CASSANDRA-2321) disallow to querying a counter CF with non-counter operation
[ https://issues.apache.org/jira/browse/CASSANDRA-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012781#comment-13012781 ] Hudson commented on CASSANDRA-2321: --- Integrated in Cassandra #817 (See [https://hudson.apache.org/hudson/job/Cassandra/817/]) disallow to querying a counter CF with non-counter operation Key: CASSANDRA-2321 URL: https://issues.apache.org/jira/browse/CASSANDRA-2321 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.8 Environment: Linux Reporter: Mubarak Seyed Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.8 Attachments: 0001-Don-t-allow-normal-query-on-counter-CF.patch CounterColumnType.getString() returns hexString. {code} public String getString(ByteBuffer bytes) { return ByteBufferUtil.bytesToHex(bytes); } {code} and python stress.py reader returns [ColumnOrSuperColumn(column=None, super_column=SuperColumn(name='19', columns=[Column(timestamp=1299984960277, name='56', value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x08\x00\x00\x00\x00\x00\x00\x00,', ttl=None), Column(timestamp=1299985019923, name='57', value='\x7f\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00;\x00\x00\x00\x00\x00\x00\x08\xfd', ttl=None))] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
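The quoted getString() explains the opaque values in the stress.py output: a counter column's raw bytes are rendered as a hex string rather than decoded. The same rendering in Python, for illustration only:

```python
# CounterColumnType.getString() returns bytesToHex(bytes); the equivalent
# rendering of raw counter bytes in Python is a plain hex dump.
def counter_get_string(raw: bytes) -> str:
    return raw.hex()
```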
[jira] [Created] (CASSANDRA-2403) Backport AbstractType.compose from trunk
Backport AbstractType.compose from trunk Key: CASSANDRA-2403 URL: https://issues.apache.org/jira/browse/CASSANDRA-2403 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Brandon Williams Priority: Minor Fix For: 0.7.5 It was added in CASSANDRA-2262, but is also useful for 0.7.x. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2387) Make it possible for pig to understand packed data
[ https://issues.apache.org/jira/browse/CASSANDRA-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-2387: Attachment: 2387-v2.txt v2 builds upon this by utilizing AbstractType to handle any type, also handles column names the same way, and only deserializes the cfdef once per row instead of for every column. Still has some string roundtrip casting lameness until CASSANDRA-2403 is resolved. Make it possible for pig to understand packed data -- Key: CASSANDRA-2387 URL: https://issues.apache.org/jira/browse/CASSANDRA-2387 Project: Cassandra Issue Type: Improvement Reporter: Jeremy Hanna Assignee: Jeremy Hanna Labels: contrib, hadoop, pig Attachments: 2387-1.txt, 2387-v2.txt, loadstorecaster-patch.txt Packed values are throwing off pig. This ticket is to make it so pig can interpret packed values. Originally we thought we could just use a loadcaster. However, the only way we know how we can do it now is to get the schema through thrift and essentially perform the function of the loadcaster in the getNext method. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012783#comment-13012783 ] Peter Schuller commented on CASSANDRA-1902: ---

Catching up with ticket history and the latest version of the patch, a few things based on the history+patch themselves (I have not tested or benchmarked anything):

With respect to avoiding waiting on GC: the munmap() is still in finalize(), so we're still waiting on GC, right? Just not on every possible ByteBuffer (instead only on the MappedFileSegment itself).

BufferedSegmentedFile.tryPreserveFilePageCache() is doing a tryPreserveCacheRegion() for every page considered hot. The first thing to be aware of then is that this will translate into a posix_fadvise() syscall for every page, even when all or almost all pages are in fact in memory. This may be acceptable, but keep in mind that use-cases where all or almost all pages are in cache are likely to be the ones that are CPU-bound rather than disk-bound.

The bigger issue with the same thing is that in the case of the large column families that we're trying to optimize for, unless I am missing something, the preservation process is expected to be entirely seek-bound for sparsely hot sstables. In the best case, for mostly-hot sstables, it might not be seek-bound provided that pre-fetching and/or read-ahead and/or linear access detection is working well, but that seems very dependent on system details and the type of load the system is under (probably less likely to work well under high live read I/O loads). In the non-best case (sparsely hot), it should most definitely be entirely seek-bound.

fadvising entire regions at once instead of once per page might improve that, but I still think the better solution is to just not DONTNEED hot data to begin with (subject to potential limitations to avoid too frequent DONTNEEDs).
Note: the original motivation for avoiding frequent DONTNEEDs was performance in relation to the syscall. But in this case we're taking one syscall per page anyway with the WILLNEEDs. In fact, in the case of a very hot sstable (where CPU efficiency is more important than for a cold sstable, where disk I/O matters more), the WILLNEEDs should be more numerous than the DONTNEEDs would have been had they been fragmented according to a hotness map.

Disregarding the CPU efficiency concerns though, the primary concern I'd have is the WILLNEED calls. Again, I haven't tested to make sure I'm not mis-reading it, but this should mean that all compactions of actively used sstables will end, after the streaming I/O, with lots of seek-bound reads to fulfill the WILLNEEDs. This can take a lot of time and be expensive in terms of the amount of disk time being spent (relative to a rate-limited compaction process), and it also violates the otherwise preserved rule that the only seek-bound I/O is live reads; all other I/O is sequential.

Also: if WILLNEED blocks until the data has been read, the impact on live traffic should be limited, but latency would be high under read load. If WILLNEED doesn't block, throughput should have a chance of being reasonable by maintaining some queue depth, but live reads could potentially be severely affected. (I don't know which is true; I should check, but I haven't yet.)

Minor nit: seemingly truncated doc string for SegmentedFile.complete().

Minor suggestion: should isRangeInCache() be renamed to wasRangeInCache() to reflect the fact that it does not represent current status?
It is not an implementation detail because if it did reflect current reality, the caller would be incorrect (the test on a per-column basis would constantly give false positives as being in cache due to (1) the column just having been serialized, which would be easily fixable, but also because (2) previous columns on the same page, which is more difficult to fix than moving a line of code). Migrate cached pages during compaction --- Key: CASSANDRA-1902 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.1 Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 0.7.5, 0.8 Attachments: 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 1902-formatted.txt, 1902-per-column-migration-rebase2.txt, 1902-per-column-migration.txt, CASSANDRA-1902-v3.patch, CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch Original Estimate: 32h Time Spent: 56h Remaining Estimate: 0h Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a pre-compacted CF during the compaction process. This is now important since CASSANDRA-1470 caches effectively
[jira] [Issue Comment Edited] (CASSANDRA-2387) Make it possible for pig to understand packed data
[ https://issues.apache.org/jira/browse/CASSANDRA-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012782#comment-13012782 ] Brandon Williams edited comment on CASSANDRA-2387 at 3/30/11 12:02 AM: --- v2 builds upon this by utilizing AbstractType to handle any type, also handles column names the same way, and only deserializes the cfdef once per row instead of for every column. Still has some string roundtrip casting lameness until CASSANDRA-2403 is resolved. Also handles serialization when storing. was (Author: brandon.williams): v2 builds upon this by utilizing AbstractType to handle any type, also handles column names the same way, and only deserializes the cfdef once per row instead of for every column. Still has some string roundtrip casting lameness until CASSANDRA-2403 is resolved. Make it possible for pig to understand packed data -- Key: CASSANDRA-2387 URL: https://issues.apache.org/jira/browse/CASSANDRA-2387 Project: Cassandra Issue Type: Improvement Reporter: Jeremy Hanna Assignee: Jeremy Hanna Labels: contrib, hadoop, pig Attachments: 2387-1.txt, 2387-v2.txt, loadstorecaster-patch.txt Packed values are throwing off pig. This ticket is to make it so pig can interpret packed values. Originally we thought we could just use a loadcaster. However, the only way we know how we can do it now is to get the schema through thrift and essentially perform the function of the loadcaster in the getNext method. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1086812 - /cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
Author: eevans
Date: Wed Mar 30 00:26:32 2011
New Revision: 1086812
URL: http://svn.apache.org/viewvc?rev=1086812&view=rev
Log: allow but do not require semicolon in batch updates

Patch by eevans

Modified:
  cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g?rev=1086812&r1=1086811&r2=1086812&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g Wed Mar 30 00:26:32 2011
@@ -100,7 +100,7 @@ options {
query returns [CQLStatement stmnt]
    : selectStatement { $stmnt = new CQLStatement(StatementType.SELECT, $selectStatement.expr); }
-    | updateStatement { $stmnt = new CQLStatement(StatementType.UPDATE, $updateStatement.expr); }
+    | updateStatement endStmnt { $stmnt = new CQLStatement(StatementType.UPDATE, $updateStatement.expr); }
    | batchUpdateStatement { $stmnt = new CQLStatement(StatementType.BATCH_UPDATE, $batchUpdateStatement.expr); }
    | useStatement { $stmnt = new CQLStatement(StatementType.USE, $useStatement.keyspace); }
    | truncateStatement { $stmnt = new CQLStatement(StatementType.TRUNCATE, $truncateStatement.cfam); }
@@ -188,7 +188,7 @@ batchUpdateStatement returns [BatchUpdat
        List<UpdateStatement> updates = new ArrayList<UpdateStatement>();
    }
    K_BEGIN K_BATCH ( K_USING K_CONSISTENCY K_LEVEL { cLevel = ConsistencyLevel.valueOf($K_LEVEL.text); } )?
-        u1=updateStatement { updates.add(u1); } ( uN=updateStatement { updates.add(uN); } )*
+        u1=updateStatement ';'? { updates.add(u1); } ( uN=updateStatement ';'? { updates.add(uN); } )*
    K_APPLY K_BATCH EOF
    {
        return new BatchUpdateStatement(updates, cLevel);
@@ -214,7 +214,7 @@ updateStatement returns [UpdateStatement
    K_UPDATE columnFamily=( IDENT | STRING_LITERAL | INTEGER )
        (K_USING K_CONSISTENCY K_LEVEL { cLevel = ConsistencyLevel.valueOf($K_LEVEL.text); })?
        K_SET termPair[columns] (',' termPair[columns])*
-        K_WHERE K_KEY '=' key=term endStmnt
+        K_WHERE K_KEY '=' key=term
    {
        return new UpdateStatement($columnFamily.text, cLevel, columns, key);
    }
@@ -241,7 +241,7 @@ deleteStatement returns [DeleteStatement
    K_FROM columnFamily=( IDENT | STRING_LITERAL | INTEGER ) ( K_USING K_CONSISTENCY K_LEVEL )?
    K_WHERE ( K_KEY '=' key=term { keyList = Collections.singletonList(key); }
            | K_KEY K_IN '(' keys=termList { keyList = $keys.items; } ')'
-            )?
+            )? endStmnt
    {
        return new DeleteStatement(columnsList, $columnFamily.text, cLevel, keyList);
    }
@@ -339,7 +339,7 @@ truncateStatement returns [String cfam]
    ;

endStmnt
-    : (EOF | ';')
+    : ';'? EOF
    ;
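The effect of the new endStmnt rule (`';'? EOF`) can be sketched outside ANTLR. This toy checker (hypothetical, not part of the patch) accepts a single statement with or without a trailing semicolon, and rejects input with statement text after a semicolon:

```python
# Toy recognizer mirroring endStmnt : ';'? EOF — a trailing semicolon is
# allowed but not required, and nothing may follow it before EOF.
def ends_cleanly(stmt: str) -> bool:
    stmt = stmt.rstrip()
    if stmt.endswith(";"):
        stmt = stmt[:-1].rstrip()
    # Exactly one statement, then EOF: no interior semicolons remain.
    return len(stmt) > 0 and ";" not in stmt
```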
[jira] [Commented] (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012793#comment-13012793 ] Pavel Yaskevich commented on CASSANDRA-1902: ---

bq. With respect to avoiding waiting on GC: the munmap() is still in finalize(), so we're still waiting on GC, right? Just not on every possible ByteBuffer (instead only on the MappedFileSegment itself).

This is correct, but on the other hand we can unmap a segment only when we no longer need the reader. I can add an explicit method for it, but my tests show that unmapping in finalize() is pretty useful.

bq. BufferedSegmentedFile.tryPreserveFilePageCache() is doing a tryPreserveCacheRegion() for every page considered hot. The first thing to be aware of then is that this will translate into a posix_fadvise() syscall for every page, even when all or almost all pages are in fact in memory. This may be acceptable, but keep in mind that use-cases where all or almost all pages are in cache are likely to be the ones CPU-bound rather than disk-bound.

The documentation for the posix_fadvise/madvise calls suggests making more frequent small requests instead of big ones - the kernel will in all likelihood ignore advice on a big region.

bq. fadvising entire regions at once instead of once per page might improve that, but I still think the better solution is to just not DONTNEED hot data to begin with (subject to potential limitations to avoid too frequent DONTNEEDs).

We can't stop using DONTNEED while writing the compacted file, because it would suck pages away from sstables which are currently in use. And we do WILLNEEDs only when we have an SSTableReader for the compacted file ready - right before the old sstables are replaced with the new ones - so this is not going to have a big performance impact on reads. Note that WILLNEED is a non-blocking call.

bq. Minor nit: Seemingly truncated doc string for SegmentedFile.complete().

Yes, I will fix that, thanks!

bq.
Minor suggestion: Should isRangeInCache() be renamed to wasRangeInCache() to reflect the fact that it does not represent current status? It is not an implementation detail because if it did reflect current reality, the caller would be incorrect (the test on a per-column basis would constantly give false positives as being in cache due to (1) the column just having been serialized, which would be easily fixable, but also because (2) previous columns on the same page, which is more difficult to fix than moving a line of code). Sounds reasonable to me, I will rename a method. Migrate cached pages during compaction --- Key: CASSANDRA-1902 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.1 Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 0.7.5, 0.8 Attachments: 0001-CASSANDRA-1902-cache-migration-impl-with-config-option.txt, 1902-formatted.txt, 1902-per-column-migration-rebase2.txt, 1902-per-column-migration.txt, CASSANDRA-1902-v3.patch, CASSANDRA-1902-v4.patch, CASSANDRA-1902-v5.patch Original Estimate: 32h Time Spent: 56h Remaining Estimate: 0h Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a pre-compacted CF during the compaction process. This is now important since CASSANDRA-1470 caches effectively nothing. For example an active CF being compacted hurts reads since nothing is cached in the new SSTable. The purpose of this ticket then is to make sure SOME data is cached from active CFs. This can be done my monitoring which Old SSTables are in the page cache and caching active rows in the New SStable. A simpler yet similar approach is described here: http://insights.oetiker.ch/linux/fadvise/ -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
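Peter's point about per-page fadvise overhead can be illustrated with a sketch that coalesces adjacent hot pages into contiguous regions before advising, so one posix_fadvise call covers each run instead of one per page. All names, the 4 KB page size, and the region-based strategy itself are assumptions for illustration; the actual patch advises page by page:

```python
import os

PAGE_SIZE = 4096  # assumed page size for illustration

def coalesce_hot_pages(hot_pages):
    """Merge hot page numbers into (offset, length) byte regions so that
    one advise call covers each contiguous run of pages."""
    regions = []
    for page in sorted(hot_pages):
        start = page * PAGE_SIZE
        if regions and regions[-1][0] + regions[-1][1] == start:
            # Extend the previous region to cover this adjacent page.
            regions[-1] = (regions[-1][0], regions[-1][1] + PAGE_SIZE)
        else:
            regions.append((start, PAGE_SIZE))
    return regions

def advise_willneed(fd, hot_pages):
    # posix_fadvise is Linux-specific in practice; guard for portability.
    if hasattr(os, "posix_fadvise"):
        for offset, length in coalesce_hot_pages(hot_pages):
            os.posix_fadvise(fd, offset, length, os.POSIX_FADV_WILLNEED)
```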
[jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
[ https://issues.apache.org/jira/browse/CASSANDRA-2045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012813#comment-13012813 ] Jonathan Ellis commented on CASSANDRA-2045: ---

You want to use the pointer approach when your ratio of overwrites to row size is sufficiently high -- the biggest win there is when you can turn dozens or hundreds of mutations into replay of just the latest version. Not sure what the best way to estimate that is -- Brandon suggested checking SSTable bloom filters on writes. Which is probably low-overhead enough, especially if we only do it for every 10% of writes, for instance. I kind of like that idea; I think it will be useful in multiple places down the road. ("Sufficiently high" depends on SSD vs magnetic -- time to introduce a postgresql-like random vs sequential penalty setting?)

Simplify HH to decrease read load when nodes come back -- Key: CASSANDRA-2045 URL: https://issues.apache.org/jira/browse/CASSANDRA-2045 Project: Cassandra Issue Type: Improvement Reporter: Chris Goffinet Fix For: 0.8

Currently when HH is enabled, hints are stored, and when a node comes back, we begin sending that node data. We do a lookup on the local node for the row to send. To help reduce read load (if a node is offline for a long period of time) we should store the data we want to forward to the node locally instead. We wouldn't have to do any lookups, just take the byte[] and send it to the destination.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
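The "check bloom filters on only ~10% of writes" idea above can be sketched with a deterministic sampler: rather than paying the bloom-filter lookups on every write, check every Nth write and extrapolate. The class and method names are hypothetical, purely to illustrate the sampling:

```python
# Hypothetical sketch: sample a fixed fraction of writes for bloom-filter
# checks, using a counter so the rate is deterministic rather than random.
class OverwriteSampler:
    def __init__(self, sample_every=10):
        self.sample_every = sample_every
        self.writes = 0

    def should_check_bloom_filters(self):
        """Return True on every Nth write (here, ~10% when N=10)."""
        self.writes += 1
        return self.writes % self.sample_every == 0
```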
[jira] [Commented] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012821#comment-13012821 ] Tey Kar Shiang commented on CASSANDRA-2401: ---

Hi, new finding here: the 0-column data is returned because it is never read from the file. As I step through the code, getPosition(DecoratedKey decoratedKey, Operator op) in org.apache.cassandra.io.sstable.SSTableReader.java returns position -1 at line 448 (bf.isPresent(decoratedKey.key) returns false) - the key is missing. There seems to be a missing record which is indexed, or the indexed column itself was not updated when the record was removed (?). As for the data returned with 0 columns: a container is always created (final ColumnFamily returnCF = ColumnFamily.create(metadata)) and returned from getTopLevelColumns even if no read is performed.

getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query --- Key: CASSANDRA-2401 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.4 Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse Reporter: Tey Kar Shiang

ColumnFamilyStore.java, near line 1680, ColumnFamily data = getColumnFamily(new QueryFilter(dk, path, firstFilter)): the data returned is null, causing a null exception in satisfies(data, clause, primary) which is not caught. The callback times out and returns a Timeout exception to Hector. The data is empty; as I traced, the column count is 0 in removeDeletedCF(), which returns the null there. (I am new and still trying to understand the logic.) Instead of crashing on null, could we skip the data?

About my test: a stress-test program that adds, modifies and deletes data in a keyspace. I have 30 threads simulating concurrent users performing the actions above, plus a periodic query of all rows.
I have Column Family with rows (as File) and columns as index (e.g. userID, fileType). No issue on the first day of test, and stopped for 3 days. I restart the test on 4th day, 1 of the users failed to query the files (timeout exception received). Most of the users are still okay with the query. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Cassandra Wiki] Update of FAQ_JP by MakiWatanabe
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The FAQ_JP page has been changed by MakiWatanabe. The comment on this change is: Translate batch_mutate, mmap. http://wiki.apache.org/cassandra/FAQ_JP?action=diffrev1=76rev2=77 -- Anchor(batch_mutate_atomic) == batch_mutateはアトミックな操作ですか? == - As a special case, mutations against a single key are atomic but not isolated. Reads which occur during such a mutation may see part of the write before they see the whole thing. More generally, batch_mutate operations are not atomic. [[API#batch_mutate|batch_mutate]] allows grouping operations on many keys into a single call in order to save on the cost of network round-trips. If `batch_mutate` fails in the middle of its list of mutations, no rollback occurs and the mutations that have already been applied stay applied. The client should typically retry the `batch_mutate` operation. + 特殊な例として、単一のキーに対するbatch_mutationを考えると、それぞれのmutationはアトミックですが、アイソレーションはされていません。このようなmutationの最中にreadを行うと、部分的に更新された状態が見える可能性があります。 + 一般的に言うと、batch_mutateはアトミックではありません。ネットワークのラウンドトリップコストを削減するため、 + [[API#batch_mutate|batch_mutate]] + は複数のキーに対する操作を単一の呼び出しにまとめることを許しています。`batch_mutate`がmutationの途中で失敗した場合、既に適用されたmutationはそのまま残ります。ロールバックはされません。 + このような場合、一般的にはクライアントアプリケーションは`batch_mutate`をリトライする必要があります。 + Anchor(hadoop_support) @@ -446, +451 @@ == Compactionを実行してもディスク使用量が減らないのはなぜでしょうか? == SSTables that are obsoleted by a compaction are deleted asynchronously when the JVM performs a GC. You can force a GC from jconsole if necessary, but Cassandra will force one itself if it detects that it is low on space. A compaction marker is also added to obsolete sstables so they can be deleted on startup if the server does not perform a GC before being restarted. Read more on this subject [[http://wiki.apache.org/cassandra/MemtableSSTable|here]]. + Anchor(mmap) == topコマンドの出力で,CassandraがJava heapの最大値よりも大きなメモリを使用しているのはなぜでしょうか? 
== - Cassandra uses mmap to do zero-copy reads. That is, we use the operating system's virtual memory system to map the sstable data files into the Cassandra process' address space. This will use virtual memory; i.e. address space, and will be reported by tools like top accordingly, but on 64 bit systems virtual address space is effectively unlimited so you should not worry about that. - - What matters from the perspective of memory use in the sense as it is normally meant, is the amount of data allocated on brk() or mmap'd /dev/zero, which represent real memory used. The key issue is that for a mmap'd file, there is never a need to retain the data resident in physical memory. Thus, whatever you do keep resident in physical memory is essentially just there as a cache, in the same way as normal I/O will cause the kernel page cache to retain data that you read/write. - - The difference between normal I/O and mmap() is that in the mmap() case the memory is actually mapped to the process, thus affecting the virtual size as reported by top. The main argument for using mmap() instead of standard I/O is the fact that reading entails just touching memory - in the case of the memory being resident, you just read it - you don't even take a page fault (so no overhead in entering the kernel and doing a semi-context switch). This is covered in more detail [[http://www.varnish-cache.org/trac/wiki/ArchitectNotes|here]]. 
+ Cassandraはzero-copy readのためにmmapを使用しています。即ち、sstableデータファイルをCassandraプロセスのアドレス空間にマップするためにOSの仮想メモリシステムを使用しているのです。これが仮想メモリが多量に使用されているように見える理由です。実際に使用されているのは仮想アドレス空間ですが、topなどのツールでは仮想メモリの消費としてレポートされます。64ビット環境ではアドレス空間はほぼ無限大ですので、あまり気にする必要はありません。通常の意味でのメモリ使用量の観点から問題となるのはbrk()でアロケートされたデータの量やmmapされた/dev/zeroです。これは使用された実メモリ量を示しています。 + mmapされたファイルについて注意すべき点は、それらを物理メモリに保持する必要がないということです。つまり物理メモリに保持されたmmapファイルは基本的にはキャッシュとみなせます。これはちょうど通常のI/Oによってread/writeしたデータがカーネルのページキャッシュに保持されるのと同じです。 + 通常のI/Oとmmap()の違いは、mmap()がメモリをプロセスにマップするため、topでレポートされる仮想メモリサイズに影響することにあります。 + 通常のI/Oに替えてmmap()を使う主な利点は、メモリにアクセスするだけでreadが完了する点にあります。そのメモリ領域が実際にロードされている(residentである)場合は、正にそれを読むだけで、ページフォルトも不要です。(カーネル空間に入ったり、コンテキストスイッチするオーバーヘッドも必要ありません) + 詳細については次のリンクを参照してください。[[http://www.varnish-cache.org/trac/wiki/ArchitectNotes]] Anchor(jna)
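The mmap behaviour described in the FAQ entry above can be demonstrated with a short Python sketch: the file is mapped into the process address space and read by touching memory, with no explicit read() into a user buffer (the final slice does copy bytes out, but only so the result can be returned; the mapping itself is what inflates virtual size as reported by top):

```python
import mmap
import os
import tempfile

def mmap_read(path, offset, length):
    """Read a byte range from a file via a read-only memory mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Touching mapped[...] reads through the page cache directly;
            # no read() syscall into a user-space buffer is issued.
            return bytes(mapped[offset:offset + length])
```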
[jira] [Issue Comment Edited] (CASSANDRA-2401) getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query
[ https://issues.apache.org/jira/browse/CASSANDRA-2401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13012821#comment-13012821 ] Tey Kar Shiang edited comment on CASSANDRA-2401 at 3/30/11 2:19 AM: Hi, New finding here: For the 0-column data, it is because it is never read from the file. As I step through the line, here it returns -1 position from org.apache.cassandra.io.sstable.SSTableReader.java::getPosition(DecoratedKey decoratedKey, Operator op), line 448 (bf.isPresent(decoratedKey.key) is returning false) - key is missing. There seem to be a missing record which is indexed or indexed column itself not updated when the record is removed (?). As for the data returned with 0-column, simply because a container is always created (final ColumnFamily returnCF = ColumnFamily.create(metadata)) and returned from getTopLevelColumns even if there is no read taken. As for this case, it causes Timeout exception to Hector when null exception thrown without captured. was (Author: karshiang): Hi, New finding here: For the 0-column data, it is because it is never read from the file. As I step through the line, here it returns -1 position from org.apache.cassandra.io.sstable.SSTableReader.java::getPosition(DecoratedKey decoratedKey, Operator op), line 448 (bf.isPresent(decoratedKey.key) is returning false) - key is missing. There seem to be a missing record which is indexed or indexed column itself not updated when the record is removed (?). As for the data return with 0-column, simply because a container is always created (final ColumnFamily returnCF = ColumnFamily.create(metadata)) and returned from getTopLevelColumns even if there is no read taken. 
getColumnFamily() return null, which is not checked in ColumnFamilyStore.java scan() method, causing Timeout Exception in query --- Key: CASSANDRA-2401 URL: https://issues.apache.org/jira/browse/CASSANDRA-2401 Project: Cassandra Issue Type: Bug Affects Versions: 0.7.4 Environment: Hector 0.7.0-28, Cassandra 0.7.4, Windows 7, Eclipse Reporter: Tey Kar Shiang ColumnFamilyStore.java, line near 1680, ColumnFamily data = getColumnFamily(new QueryFilter(dk, path, firstFilter)), the data is returned null, causing NULL exception in satisfies(data, clause, primary) which is not captured. The callback got timeout and return a Timeout exception to Hector. The data is empty, as I traced, I have the the columns Count as 0 in removeDeletedCF(), which return the null there. (I am new and trying to understand the logics around still). Instead of crash to NULL, could we bypass the data? About my test: A stress-test program to add, modify and delete data to keyspace. I have 30 threads simulate concurrent users to perform the actions above, and do a query to all rows periodically. I have Column Family with rows (as File) and columns as index (e.g. userID, fileType). No issue on the first day of test, and stopped for 3 days. I restart the test on 4th day, 1 of the users failed to query the files (timeout exception received). Most of the users are still okay with the query. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira