[Cassandra Wiki] Update of FileFormatDesignDoc by Stu Hood
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The FileFormatDesignDoc page has been changed by StuHood.
The comment on this change is: Clarified the metadata discussion, added horizontal rules.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=15&rev2=16

--
   * Space efficient when un-compressed: remove redundancy
   * Random access to the middle of wide rows
   * Arbitrary nesting
+  * Range tombstones (for range/slice deletes)

  == Influences ==
   * Google Dremel [1] - Arbitrarily nested, field-oriented serialization
   * Hive RCFile [2] - Column-group-oriented storage
+
+ == Current Implementation ==

@@ -48, +51 @@
  Finally, there is a second type of redundancy that the current design does not tackle: the column names at level name2 are frequently repeated, but since rows are stored independently, we don't normalize those values. For narrow rows (like those shown), removing this redundancy will be our largest win.

- == High level ==
+ Metadata
+
+ Metadata is currently implemented such that column parents have metadata that covers their entire range: this means that you cannot delete arbitrary slices, only exact keys or names.
+
+
+
+ == Proposed Implementation ==
  Because we will be storing multiple columns per SSTable, our design will bear the most similarity to RCFile (rather than the column-per-file approach taken in Dremel). But because we allow for nesting via super columns (and hopefully with a more flexible representation in the future), we need to take hints from Dremel's serialization to allow for efficient storage of parent and null information.

  === Vertical chunks ===
- Rather than slicing the span into chunks horizontally, we will use vertical chunks (and horizontal chunks only when necessary for particularly wide rows):
+ Rather than slicing the span into chunks horizontally, we will use vertical chunks (and break particularly wide rows into multiple spans):

  || ''row key'' ||
  || cheese ||

@@ -122, +131 @@
  The parent change flag can be represented compactly using a bitmap, and field lengths can be packed tightly into group-varint encoded arrays [3], as alluded to in the Dremel paper, and mentioned in Jeff Dean's talks.

- === Metadata ===
-
- Cassandra also needs to encode metadata about tuples and ranges of tuples, in order to represent creation and deletion timestamps: range tuples can be encoded in a similar fashion to the value tuples represented here, and the metadata timestamps can be group-varint encoded.
-
  === Field reordering ===
  One weakness of the implementation so far is that it doesn't allow tuples to be reordered within a level. This approach performs well for wide rows with high field cardinality, since adding compression is unlikely to remove data.

@@ -154, +159 @@
  === Summary ===
- The final (simplified) representation of the span is:
+ A (simplified) representation of the span so far (without metadata) is:

  ''(parent-ordered)''
  || ''row key'' || ''parent_change'' ||

@@ -184, +189 @@
  || || 0 ||
  || china || 1 ||

+ == Metadata ==
+
+ Cassandra also needs to encode metadata about tuples and ranges of tuples in order to represent creation and deletion timestamps. For both value tuples and range tuples, a varying number (depending on value and range type) of timestamps will also need to be encoded.
+
+ === Range Metadata ===
+
+ Range tuples can be encoded in a very similar fashion to the value tuples represented above, except that they always come in pairs. It will likely make sense to store them in a separate blob from the value tuples, since they will bear very little similarity to one another (TODO: need to confirm with an anecdote or two).
+
+ || ''name1'' - ''left'' || ''name1'' - ''right'' || ''parent_change'' ||
+ || havarti || muenster || 0 ||
+ || || || 1 ||
+
+ This example shows a range tombstone for values at level name1 between 'havarti' and 'muenster': the chunk for the name1 level stores a pair of range tuples for the 'cheese' parent, and nulls are stored for parents without any range metadata. The end result is that the span stores a tombstone from ('cheese', 'havarti', empty) to ('cheese', 'muenster', null), where empty is the smallest value, and null is the largest value.
+
+ Note that it is not possible for ranges for a parent to overlap: in this case, the ranges would be resolved such that the intersection was given the winning timestamp, and the two remainders would use their original timestamps.
+
+ Effect of ordering
+
+ When a chunk is marked as ''self'' ordered, range metadata should be affected as well: therefore, the number of ranges that need to be represented in a chunk should also factor into the cardinality threshold that toggles a chunk between
[jira] Commented: (CASSANDRA-674) New SSTable Format
[ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977697#action_12977697 ]

Stu Hood commented on CASSANDRA-674:

bq. How will ranges be stored? The parent ordering would mean the sorting of data at that level is lost no?

Added some explanation of how I think ranges should work to the wiki. http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=15&rev2=16

bq. Are chunks broken up by size only?

Technically spans are the largest unit, so they define the boundaries: tried to clarify this part as well. There are a few possible thresholds, including a max number of rows, columns, range tombstones or total bytes in the span. One semi-undefined portion is what happens when a row is larger than can be stuffed in a span. Most likely we'll want to use the range metadata to indicate the portion of the row covered by the span (the approach I took in the original implementation attached here).

bq. Will the metadata be ripe for caching?

I don't think so: the metadata is useless on its own. It only becomes useful when it is attached to data (a column or a range), so there is no reason to cache the metadata independently of the data.

Thanks!

New SSTable Format
--
Key: CASSANDRA-674
URL: https://issues.apache.org/jira/browse/CASSANDRA-674
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Stu Hood
Fix For: 0.8
Attachments: 674-v1.diff, perf-674-v1.txt, perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt

Various tickets exist due to limitations in the SSTable file format, including #16, #47 and #328. Attached is a proposed design/implementation of a new file format for SSTables that addresses a few of these limitations. The implementation has a bunch of issues/fixmes, which I'll describe in the comments.

The file format is described in the javadoc for the o.a.c.io.SSTableWriter class, but briefly:
 * Blocks are opaque (except for their header) so that they can be compressed. The index file contains an entry for the first key in every Block. Blocks contain Slices.
 * Slices are series of columns with the same parents and (deletion) metadata. They can be used to represent ColumnFamilies or SuperColumns (or a slice of columns at any other depth). A single CF can be split across multiple Slices, which can be split across multiple Blocks.
 * Neither Slices nor Blocks have a fixed size or maximum length, but they each have target lengths which can be stretched and broken by very large columns.

The most interesting concepts from this patch are:
 * Block compression is possible (currently using GZIP, which has one bug mentioned in the comments).
 * Compaction involves merging intersecting Slices from input SSTables. Since large rows will be broken down into multiple Slices, only the portions of rows that intersect between tables need to be deserialized/merged/held-in-memory.
 * Indexes for individual rows are gone, since the global index allows random access to the middle of column families that span Blocks, and Slices allow batches of columns to be skipped within a Block.
 * Bloom filters for individual rows are gone, and the global filter contains ColumnKeys instead, meaning that a query for a column that doesn't exist in a row that does will often not need to seek to the row.
 * Metadata (deletion/gc time) and ColumnKeys (key, colname1, colname2...) for columns are defined recursively, so deeply nested Slices are possible.
 * Slices representing a single parent (CF, SC, etc) can have different Metadata, meaning that a tombstone Slice from d-f could sit between Slices containing columns a-c and g-h. This allows for eventually consistent range deletes of columns.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[Cassandra Wiki] Trivial Update of FileFormatDesignDoc by StuHood
The FileFormatDesignDoc page has been changed by StuHood.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=16&rev2=17

--
  == Metadata ==

- Cassandra also needs to encode metadata about tuples and ranges of tuples in order to represent creation and deletion timestamps. For both value tuples and range tuples, a varying number (depending on value and range type) of timestamps will also need to be encoded.
+ Cassandra also needs to encode metadata about tuples and ranges of tuples in order to represent creation and deletion timestamps. For both value tuples and range tuples, a varying number (depending on value and range type) of timestamps will need to be encoded.

  === Range Metadata ===
[Cassandra Wiki] Trivial Update of FileFormatDesignDoc by StuHood
The FileFormatDesignDoc page has been changed by StuHood.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=17&rev2=18

--
  Range tuples can be encoded in a very similar fashion to the value tuples represented above, except that they always come in pairs. It will likely make sense to store them in a separate blob from the value tuples, since they will bear very little similarity to one another (TODO: need to confirm with an anecdote or two).

- || ''name1'' - ''left'' || ''name1'' - ''right'' || ''parent_change'' ||
+ || ''name1'' - ''left'' || ''name1'' - ''right'' || ''parent_change'' || ''null?'' ||
- || havarti || muenster || 0 ||
+ || havarti || muenster || 0 || 0 ||
- || || || 1 ||
+ || || || 1 || 1 ||

  This example shows a range tombstone for values at level name1 between 'havarti' and 'muenster': the chunk for the name1 level stores a pair of range tuples for the 'cheese' parent, and nulls are stored for parents without any range metadata. The end result is that the span stores a tombstone from ('cheese', 'havarti', empty) to ('cheese', 'muenster', null), where empty is the smallest value, and null is the largest value.
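The bounds in the example above can be compared with an ordering in which empty sorts before every real value and null sorts after every real value. The following is a minimal sketch of that idea only: the class names NameComponent and RangeTombstoneSketch are hypothetical, and treating both bounds as inclusive is an assumption that the wiki page does not state.

```java
/** Hypothetical name component: a real value, or one of the two sentinels. */
final class NameComponent implements Comparable<NameComponent> {
    // EMPTY sorts before every real value; NULL sorts after every real value.
    static final NameComponent EMPTY = new NameComponent(0, "");
    static final NameComponent NULL = new NameComponent(2, "");

    private final int rank;     // 0 = EMPTY, 1 = real value, 2 = NULL
    private final String value; // only meaningful when rank == 1

    private NameComponent(int rank, String value) {
        this.rank = rank;
        this.value = value;
    }

    static NameComponent of(String value) {
        return new NameComponent(1, value);
    }

    @Override
    public int compareTo(NameComponent other) {
        if (rank != other.rank) {
            return Integer.compare(rank, other.rank);
        }
        return rank == 1 ? value.compareTo(other.value) : 0;
    }
}

/** Hypothetical range tombstone over tuples such as (row key, name1, name2). */
final class RangeTombstoneSketch {
    private final NameComponent[] left;
    private final NameComponent[] right;

    RangeTombstoneSketch(NameComponent[] left, NameComponent[] right) {
        this.left = left;
        this.right = right;
    }

    /** Lexicographic comparison, component by component. */
    static int compare(NameComponent[] a, NameComponent[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = a[i].compareTo(b[i]);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    /** Assumption: both bounds are inclusive. */
    boolean covers(NameComponent[] key) {
        return compare(left, key) <= 0 && compare(key, right) <= 0;
    }
}
```

With left = ('cheese', 'havarti', EMPTY) and right = ('cheese', 'muenster', NULL), any tuple under 'cheese' whose name1 lies between 'havarti' and 'muenster' is covered, whatever its name2 value.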
[jira] Commented: (CASSANDRA-674) New SSTable Format
[ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1296#action_1296 ]

T Jake Luciani commented on CASSANDRA-674:
--
bq. the metadata is useless on it's own. It only becomes useful when it is attached to data (a column or to a range), so there is no reason to cache the meta- independently of the data.

But above you mention:
{code}
Indexes for individual rows are gone, since the global index allows random access to the middle of column families that span Blocks, and Slices allow batches of columns to be skipped within a Block.
{code}
^ Wouldn't this be useful to cache, in the situation where you only want a small range of columns?

-

More questions:

Roughly how large would the actual chunk be? This is the unit of deserialization, right? Or can Avro deserialize only part of a structure?

So if you are doing a range query on a very wide row, how do you know when to stop processing chunks? Do you keep going until you hit the sentinel value empty?
[jira] Commented: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977783#action_12977783 ]

Gary Dusbabek commented on CASSANDRA-1710:
--
 * returnConnection() possibly closes a connection and then returns the [maybe] closed connection back to the pool. Does this mean it is possible to borrow a closed connection?
 * It looks like the size of the pool can be artificially inflated by creating new Connections outside of the pool and then returning them to the pool.
 * EvictionTask closes Connections that may already be closed. IIRC this will generate a Thrift exception when the transport is double-closed.

Since the pool doesn't know the state of the connection, does it make sense to add isClosed() to the connection API?

Java driver for CQL
---
Key: CASSANDRA-1710
URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
Project: Cassandra
Issue Type: Sub-task
Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
Fix For: 0.8
Attachments: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v1-0002-compile-driver-source.txt
Original Estimate: 0h
Remaining Estimate: 0h

In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing

The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon.
[jira] Commented: (CASSANDRA-674) New SSTable Format
[ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977785#action_12977785 ]

T Jake Luciani commented on CASSANDRA-674:
--
Let me know if this is wrong, but this design opens the Cassandra data model to contain arbitrarily nested data. Given the complexity we already have surrounding the supercolumn concept, do you think this is the right way forward? As much as my inner geek wants to build a tree or graph model, I don't think the C* community or committers want to take it this way.

If we assume we keep the data model as is, how can we simplify the open-endedness of your design to make the approach fit our current data model?
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Evans updated CASSANDRA-1710:
--
Attachment: v2-0002-compile-driver-source.txt
            v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt
[jira] Commented: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977789#action_12977789 ]

Eric Evans commented on CASSANDRA-1710:
---
bq. returnConnection() possibly closes a connection and then returns the maybe closed connection back to the pool. Does this mean it is possible to borrow a closed connection?

Auh, right you are. That block is missing a {{return}}.

{quote}
 * it looks like the size of the pool can be artificially inflated by creating new Connections outside of the pool and then returning them to the pool.
 * EvictionTask closes Connections that may already be closed. IIRC this will generate a Thrift exception when the transport is double-closed.
{quote}

Good catches.

v2 patches attached. Thanks!
[jira] Commented: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977793#action_12977793 ]

Gary Dusbabek commented on CASSANDRA-1710:
--
Changes look good.

One more thing I realized: careless users could close() a Connection before returning it to the pool, which, if the pool is not full, would add a closed connection. Depending on how much you want to guard the API, you could check connection status before returning it to the queue, or emit derived Connections that have their close() methods overridden to be no-ops (or log a WARN), so that only the pool could call the real close().
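The first option discussed in this thread (checking connection status before re-queueing) could look roughly like the sketch below. It is not the code from the attached patches: PooledConnection, FakeConnection and SimplePoolSketch are hypothetical names, and isClosed() is the API addition proposed in the earlier comment, not an existing method of the driver.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical connection type; isClosed() is the addition proposed in the thread. */
interface PooledConnection {
    boolean isClosed();
    void close();
}

/** Trivial stand-in used only to exercise the pool. */
final class FakeConnection implements PooledConnection {
    private boolean closed = false;

    @Override
    public boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

/** Hypothetical bounded pool that never re-queues a closed connection. */
final class SimplePoolSketch {
    private final BlockingQueue<PooledConnection> idle;

    SimplePoolSketch(int capacity) {
        // A bounded queue caps the pool size even if callers hand back
        // connections that were created outside of the pool.
        idle = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns true if the connection was put back into the pool. */
    boolean returnConnection(PooledConnection conn) {
        if (conn.isClosed()) {
            return false; // the caller closed it: do not hand it out again
        }
        if (idle.offer(conn)) {
            return true; // there was room in the pool
        }
        conn.close(); // pool is full: close it and return without queueing
        return false;
    }

    /** Returns an idle connection, or null if none is available. */
    PooledConnection borrow() {
        return idle.poll();
    }

    int idleCount() {
        return idle.size();
    }
}
```

The second option from the comment (handing out wrappers whose close() is a no-op) trades this runtime check for a stricter API, since only the pool would hold a reference to the real close().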
[jira] Created: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
Fit partitioned counter directly into CounterColumn.value
--
Key: CASSANDRA-1936
URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8

The current implementation of CounterColumn keeps both the partitioned counter and the total value of the counter (that is, the sum of the parts of the partitioned counter). This wastes space and requires the code to keep both representations in sync. This ticket proposes to remove the total value from the representation and to calculate it only when returning the value to the client.

NOTE: this breaks the on-disk file format (for counters)
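Computing the total only on read amounts to summing the per-node counts stored in the column value. The sketch below illustrates this under a layout that is purely an assumption made for the example (repeated pairs of an 8-byte node id and an 8-byte count); the actual CounterContext layout is not described in this ticket, and the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: derive a counter's total from its parts at read time. */
final class CounterTotalSketch {
    // Assumed layout, for illustration only: [8-byte node id][8-byte count] repeated.
    static final int ID_LENGTH = 8;
    static final int COUNT_LENGTH = 8;
    static final int STEP = ID_LENGTH + COUNT_LENGTH;

    /** Sums the counts with absolute gets, leaving the buffer's position untouched. */
    static long total(ByteBuffer context) {
        long sum = 0;
        for (int off = context.position(); off < context.limit(); off += STEP) {
            sum += context.getLong(off + ID_LENGTH);
        }
        return sum;
    }

    /** Builds a context from alternating id, count values (test helper). */
    static ByteBuffer of(long... idCountPairs) {
        ByteBuffer bb = ByteBuffer.allocate(idCountPairs.length * 8);
        for (long v : idCountPairs) {
            bb.putLong(v);
        }
        bb.flip();
        return bb;
    }
}
```

For example, a context holding the parts (1, 5), (2, 7) and (3, -2) yields a total of 10, and nothing besides the parts needs to be stored or kept in sync.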
[jira] Created: (CASSANDRA-1937) Keep partitioned counters (contexts) sorted
Keep partitioned counters (contexts) sorted
-
Key: CASSANDRA-1937
URL: https://issues.apache.org/jira/browse/CASSANDRA-1937
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8

In the value of CounterColumns, the code keeps the subparts unsorted, but sorts them 'on the fly' when needed (in diff() and merge()). It will be more efficient to keep the parts always sorted (it will also be simpler, since it removes the need for the ad-hoc in-place quicksort in CounterContext).

NOTE: this breaks the on-disk file format (for counters)
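When both contexts are already sorted by node id, merging them becomes a single linear pass with two cursors, with no sort step. The sketch below shows that pass on flattened arrays of alternating id and count values. The representation, the names, and in particular the rule that the larger count wins for a shared node id are assumptions made for illustration, not the behavior of the actual merge().

```java
import java.util.Arrays;

/** Hypothetical sketch: linear merge of two contexts that are sorted by node id. */
final class SortedContextMergeSketch {
    /**
     * Each context is a flattened array of alternating (id, count) values, sorted by id.
     * Assumption for illustration: when both sides hold the same id, the larger count wins.
     */
    static long[] merge(long[] a, long[] b) {
        long[] out = new long[a.length + b.length];
        int i = 0, j = 0, n = 0;
        while (i < a.length && j < b.length) {
            if (a[i] < b[j]) {            // id only present in a (so far)
                out[n++] = a[i];
                out[n++] = a[i + 1];
                i += 2;
            } else if (a[i] > b[j]) {     // id only present in b (so far)
                out[n++] = b[j];
                out[n++] = b[j + 1];
                j += 2;
            } else {                      // same id on both sides
                out[n++] = a[i];
                out[n++] = Math.max(a[i + 1], b[j + 1]);
                i += 2;
                j += 2;
            }
        }
        while (i < a.length) {            // copy the remaining (id, count) pairs of a
            out[n++] = a[i++];
            out[n++] = a[i++];
        }
        while (j < b.length) {            // copy the remaining (id, count) pairs of b
            out[n++] = b[j++];
            out[n++] = b[j++];
        }
        return Arrays.copyOf(out, n);
    }
}
```

The output is itself sorted by id, so the invariant is preserved across merges and no in-place quicksort is needed.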
[jira] Created: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses
Use UUID as node identifiers in counters instead of IP addresses
-
Key: CASSANDRA-1938
URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8

The use of IP addresses as node identifiers in the partition of a given counter is fragile. Changes of the node's IP addresses can result in data loss. This patch proposes to use UUIDs instead.

NOTE: this breaks the on-disk file format (for counters)
[jira] Updated: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-1936:
Attachment: 0001-Put-partitioned-counter-directly-in-column-value.patch

Patch attached, a few comments:
 * Since this patch stuffs the context in the column value, which is a ByteBuffer, a good part of this patch deals with adapting the functions of CounterContext to take and return ByteBuffers instead of plain byte[].
 * The patch also corrects a few not completely related misuses of ByteBuffer's absolute gets. Namely, there were a few occurrences where arrayOffset was wrongly added to the provided index, like: bb.getLong(bb.position() + bb.arrayOffset()).
 * This patch breaks the on-disk file format. Since #1937 and #1938 will too, I'm fine with waiting until both are ready and committing all 3 patches together (I'm hoping to tackle these patches quickly).
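The arrayOffset misuse mentioned above follows from how java.nio.ByteBuffer defines absolute gets: the index is relative to the start of the buffer, not to the start of its backing array. arrayOffset() only matters when indexing into array() directly. A small demonstration using a sliced heap buffer (the class and method names are hypothetical):

```java
import java.nio.ByteBuffer;

/** Demonstrates that absolute gets must not add arrayOffset() to the index. */
final class AbsoluteGetDemo {
    /** Correct: reads the long at the current position without moving it. */
    static long peekLong(ByteBuffer bb) {
        return bb.getLong(bb.position());
    }

    /** Returns a heap buffer whose backing array starts 8 bytes before its index 0. */
    static ByteBuffer sliceWithOffset() {
        ByteBuffer whole = ByteBuffer.allocate(24);
        whole.putLong(111L).putLong(222L).putLong(333L);
        whole.position(8);
        // The slice covers bytes 8..23 of the array: position 0, limit 16, arrayOffset 8.
        return whole.slice();
    }
}
```

On the slice returned by sliceWithOffset(), peekLong reads 222, while bb.getLong(bb.position() + bb.arrayOffset()) reads index 8 of the slice and silently returns 333. With a larger offset the same expression would throw IndexOutOfBoundsException instead.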
[jira] Commented: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977800#action_12977800 ]

Jonathan Ellis commented on CASSANDRA-1938:
---
Will this also break the on-disk ring persistence we added for CASSANDRA-1518?

Use UUID as node identifiers in counters instead of IP addresses
-
Key: CASSANDRA-1938
URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8
Original Estimate: 56h
Remaining Estimate: 56h
[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction
[ https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977801#action_12977801 ] T Jake Luciani commented on CASSANDRA-1902: --- Well the good news is the mincore() stuff works via JNA! I'm re-considering what todo with this information. It's most efficient to keep around contiguous chunks of pages so the plan might be to find the most densely populated ranges of pages in the sstable then get the range of rows this covered. Then pass to the SSTableWriter this list which will subsequently mark the new data written for these pages as POSIX_FADV_WILLNEED. The ordering of the keys should be close. I think in the average case this will get the most active data cached. It may keep too much data in the page cache though. Any thoughts on this? Migrate cached pages during compaction --- Key: CASSANDRA-1902 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.7.1 Reporter: T Jake Luciani Assignee: T Jake Luciani Fix For: 0.7.1 Original Estimate: 32h Remaining Estimate: 32h Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a pre-compacted CF during the compaction process. First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that uses the posix mincore() function to detect the offsets of pages for this file currently in page cache. Then add getActiveKeys() which uses underlying pagesInPageCache() to get the keys actually in the page cache. use getActiveKeys() to detect which SSTables being compacted are in the os cache and make sure the subsequent pages in the new compacted SSTable are kept in the page cache for these keys. This will minimize the impact of compacting a hot SSTable. A simpler yet similar approach is described here: http://insights.oetiker.ch/linux/fadvise/ -- This message is automatically generated by JIRA. 
- You can reply to this email to add a comment to the issue online.
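The "contiguous chunks of pages" step in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: it takes a sorted array of page indexes, such as the proposed pagesInPageCache() might return, and treats "densely populated" as strictly consecutive pages, which is a simplification. The class and method names are hypothetical and are not part of any patch on the ticket.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; not taken from the CASSANDRA-1902 patch. */
final class CachedPageRangesSketch {
    /**
     * Groups sorted page indexes into runs of consecutive pages.
     * Each run is returned as {firstPage, lastPageInclusive}.
     */
    static List<long[]> contiguousRuns(long[] sortedPages) {
        List<long[]> runs = new ArrayList<>();
        if (sortedPages.length == 0) {
            return runs;
        }
        long start = sortedPages[0];
        long prev = start;
        for (int i = 1; i < sortedPages.length; i++) {
            long page = sortedPages[i];
            if (page != prev + 1) {
                // A gap closes the current run and starts a new one.
                runs.add(new long[] {start, prev});
                start = page;
            }
            prev = page;
        }
        runs.add(new long[] {start, prev});
        return runs;
    }

    public static void main(String[] args) {
        // Assumed input: indexes of pages reported as resident in the page cache.
        long[] cached = {1, 2, 3, 7, 8, 20};
        for (long[] run : contiguousRuns(cached)) {
            System.out.println(run[0] + ".." + run[1]);
        }
    }
}
```

The longest runs would be the natural candidates to map back to row ranges and hand to the writer; how that mapping is done is left open in the comment.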
[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977802#action_12977802 ] Jonathan Ellis commented on CASSANDRA-1936: --- can you break the bytebuffer fixes out into a separate patch so we can apply to 0.7? Fit partitioned counter directly into CounterColumn.value -- Key: CASSANDRA-1936 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 0.8 Attachments: 0001-Put-partitioned-counter-directly-in-column-value.patch The current implementation of CounterColumn keeps both the partitioned counter and the total value of the counter (that is, the sum of the parts of the partitioned counter). This wastes space and requires the code to keep both representations in sync. This ticket proposes to remove the total value from the representation and to calculate it only when returning the value to the client. NOTE: this breaks the on-disk file format (for counters) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
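The idea in the ticket description, storing only the per-node parts and deriving the total on read, can be illustrated with a minimal sketch. It assumes a simple in-memory map keyed by a node identifier string; the real CounterColumn value is a serialized byte context, and the class and method names here are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration; not the CounterColumn implementation. */
final class PartitionedCounterSketch {
    // Only the per-node parts are stored; no separate total is kept in sync.
    private final Map<String, Long> parts = new LinkedHashMap<>();

    /** Applies a delta to the part owned by the given node. */
    void add(String nodeId, long delta) {
        parts.merge(nodeId, delta, Long::sum);
    }

    /** Computes the total only when a client asks for the value. */
    long total() {
        long sum = 0;
        for (long part : parts.values()) {
            sum += part;
        }
        return sum;
    }

    public static void main(String[] args) {
        PartitionedCounterSketch counter = new PartitionedCounterSketch();
        counter.add("node-a", 3);
        counter.add("node-b", 4);
        counter.add("node-a", -1);
        System.out.println(counter.total()); // prints 6
    }
}
```

The trade-off is a small amount of work on each read in exchange for a smaller representation and no duplicated state to keep consistent.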
[jira] Commented: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses
[ https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977806#action_12977806 ] Sylvain Lebresne commented on CASSANDRA-1938: - I'd have to look at CASSANDRA-1518 in more detail, but for this ticket, I intend to keep those node identifiers strictly local; they will not get gossiped. So I expect that no, it won't break on-disk ring persistence. I think we may have to gossip them at some point, however, to deal with ever-increasing contexts, but I'm not yet completely clear on that and I don't think it's an urgent matter. Use UUID as node identifiers in counters instead of IP addresses - Key: CASSANDRA-1938 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 0.8 Original Estimate: 56h Remaining Estimate: 56h The use of IP addresses as node identifiers in the partition of a given counter is fragile. Changes of a node's IP address can result in data loss. This patch proposes to use UUIDs instead. NOTE: this breaks the on-disk file format (for counters) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977807#action_12977807 ] Sylvain Lebresne commented on CASSANDRA-1936: - Oh right, forgot this was in 0.7 too. I'll do that. Fit partitioned counter directly into CounterColumn.value -- Key: CASSANDRA-1936 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 0.8 Attachments: 0001-Put-partitioned-counter-directly-in-column-value.patch The current implementation of CounterColumn keeps both the partitioned counter and the total value of the counter (that is, the sum of the parts of the partitioned counter). This wastes space and requires the code to keep both representations in sync. This ticket proposes to remove the total value from the representation and to calculate it only when returning the value to the client. NOTE: this breaks the on-disk file format (for counters) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977820#action_12977820 ] T Jake Luciani commented on CASSANDRA-1472: --- I can't seem to figure out how to apply this patchset; git apply just throws errors. This is against the cassandra-0.7 branch, correct? I untarred the dir and ran git apply 1472/* Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Fix For: 0.7.1 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (CASSANDRA-1939) Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)
Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index) --- Key: CASSANDRA-1939 URL: https://issues.apache.org/jira/browse/CASSANDRA-1939 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.0 rc 3, 0.7.0 rc 2, 0.7.0 rc 1, 0.7.0, 0.7.1, 0.8 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.0 Attachments: 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch ByteBuffer.arrayOffset() should not be added to the argument of an absolute get. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
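The bug class described in this issue can be shown with a small, self-contained sketch. It is not taken from the attached patch; the class and method names are hypothetical. It relies only on documented java.nio.ByteBuffer behavior: an absolute get(int) takes an index relative to the buffer's own start, while arrayOffset() is meaningful only when indexing the backing array returned by array().

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration of the misuse; not code from the patch. */
final class AbsoluteGetSketch {
    /** Correct: the absolute index is relative to the buffer itself. */
    static byte firstByte(ByteBuffer buf) {
        return buf.get(buf.position());
    }

    /** Wrong: adding arrayOffset() shifts the index for sliced buffers. */
    static byte firstByteWrong(ByteBuffer buf) {
        return buf.get(buf.position() + buf.arrayOffset());
    }

    /** Correct: arrayOffset() is needed only when reading the backing array. */
    static byte firstByteViaArray(ByteBuffer buf) {
        return buf.array()[buf.arrayOffset() + buf.position()];
    }

    public static void main(String[] args) {
        byte[] backing = {10, 20, 30, 40, 50};
        // slice() yields a view over {30, 40, 50} with arrayOffset() == 2.
        ByteBuffer slice = ByteBuffer.wrap(backing, 2, 3).slice();
        System.out.println(firstByte(slice));         // prints 30
        System.out.println(firstByteWrong(slice));    // prints 50
        System.out.println(firstByteViaArray(slice)); // prints 30
    }
}
```

For a buffer that is not a slice, arrayOffset() is 0 and both variants agree, which is why this kind of bug tends to go unnoticed until sliced buffers are used. With a slice whose capacity is smaller than the shifted index, the wrong variant throws IndexOutOfBoundsException instead of returning the wrong byte.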
[jira] Updated: (CASSANDRA-1939) Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)
[ https://issues.apache.org/jira/browse/CASSANDRA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-1939: Attachment: 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index) --- Key: CASSANDRA-1939 URL: https://issues.apache.org/jira/browse/CASSANDRA-1939 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.0 rc 1, 0.7.0 rc 2, 0.7.0 rc 3, 0.7.0, 0.7.1, 0.8 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.0 Attachments: 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch ByteBuffer.arrayOffset() should not be added to the argument of an absolute get. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977825#action_12977825 ] Jonathan Ellis commented on CASSANDRA-1472: --- you want git am, apply is just for a single patch Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Fix For: 0.7.1 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1710: -- Attachment: v3-0002-compile-driver-source.txt v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v1-0002-compile-driver-source.txt, v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v2-0002-compile-driver-source.txt, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977829#action_12977829 ] Eric Evans commented on CASSANDRA-1710: --- Yeah, that falls squarely in Don't Do That territory, but since you cannot re-open a closed connection, it makes sense to refuse to re-add them to the pool (and to log a warning). Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v1-0002-compile-driver-source.txt, v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v2-0002-compile-driver-source.txt, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
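The behavior described in the comment above, refusing to re-add a closed connection and logging a warning, might look roughly like this. It is a sketch under assumptions, not the committed ConnectionPool: the PooledConnection interface, the method names, and the use of java.util.logging (chosen to keep the example dependency-free, whereas the driver source uses slf4j) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.logging.Logger;

/** Hypothetical pool sketch; not the ConnectionPool class from the patch. */
final class ClosedConnectionPoolSketch {
    /** Minimal stand-in for a driver connection (assumed interface). */
    interface PooledConnection {
        boolean isOpen();
    }

    private static final Logger logger =
            Logger.getLogger(ClosedConnectionPoolSketch.class.getName());

    private final Deque<PooledConnection> idle = new ArrayDeque<>();

    /** Returns a connection to the pool; closed connections are refused. */
    synchronized boolean giveBack(PooledConnection conn) {
        if (!conn.isOpen()) {
            // A closed connection cannot be re-opened, so pooling it is pointless.
            logger.warning("refusing to return a closed connection to the pool");
            return false;
        }
        idle.addLast(conn);
        return true;
    }

    /** Hands out an idle connection, or null if none is available. */
    synchronized PooledConnection borrow() {
        return idle.pollFirst();
    }

    synchronized int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        ClosedConnectionPoolSketch pool = new ClosedConnectionPoolSketch();
        PooledConnection open = () -> true;
        PooledConnection closed = () -> false;
        System.out.println(pool.giveBack(open));   // prints true
        System.out.println(pool.giveBack(closed)); // prints false (and logs a warning)
        System.out.println(pool.idleCount());      // prints 1
    }
}
```

Rejecting the connection at return time keeps the invariant that everything in the idle queue is usable, so borrowers do not have to re-check each connection they receive.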
[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977832#action_12977832 ] T Jake Luciani commented on CASSANDRA-1472: --- Now i get this: error: java/org/apache/cassandra/db/CompactionManager.java: does not exist in index error: java/org/apache/cassandra/io/AbstractCompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/CompactionIterator.java: does not exist in index error: java/org/apache/cassandra/io/LazilyCompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/PrecompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java: does not exist in index error: java/org/apache/cassandra/io/sstable/SSTableWriter.java: does not exist in index Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Fix For: 0.7.1 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977832#action_12977832 ] T Jake Luciani edited comment on CASSANDRA-1472 at 1/5/11 11:51 AM: Now i get this: error: java/org/apache/cassandra/db/CompactionManager.java: does not exist in index error: java/org/apache/cassandra/io/AbstractCompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/CompactionIterator.java: does not exist in index error: java/org/apache/cassandra/io/LazilyCompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/PrecompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java: does not exist in index error: java/org/apache/cassandra/io/sstable/SSTableWriter.java: does not exist in index The 0001 patch header looks like this: .../org/apache/cassandra/db/CompactionManager.java | 23 +++- .../apache/cassandra/io/AbstractCompactedRow.java |6 +- .../org/apache/cassandra/io/ColumnObserver.java| 124 .../apache/cassandra/io/CompactionIterator.java| 15 ++- .../apache/cassandra/io/LazilyCompactedRow.java|9 +- .../org/apache/cassandra/io/PrecompactedRow.java | 23 +++- .../io/sstable/SSTableIdentityIterator.java| 13 ++- .../apache/cassandra/io/sstable/SSTableWriter.java | 22 - .../cassandra/io/LazilyCompactedRowTest.java | 19 +++- was (Author: tjake): Now i get this: error: java/org/apache/cassandra/db/CompactionManager.java: does not exist in index error: java/org/apache/cassandra/io/AbstractCompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/CompactionIterator.java: does not exist in index error: java/org/apache/cassandra/io/LazilyCompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/PrecompactedRow.java: does not exist in index error: java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java: does not exist in index error: 
java/org/apache/cassandra/io/sstable/SSTableWriter.java: does not exist in index Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Fix For: 0.7.1 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977841#action_12977841 ] Jonathan Ellis commented on CASSANDRA-1472: --- looks like the patch was generated from some weird non-root directory, you probably need some combination of -p or --directory (as in git-apply) Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Fix For: 0.7.1 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977846#action_12977846 ] Gary Dusbabek commented on CASSANDRA-1710: -- +1 Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v1-0002-compile-driver-source.txt, v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v2-0002-compile-driver-source.txt, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1711) Python driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1711: -- Attachment: v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt Python driver for CQL - Key: CASSANDRA-1711 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt, v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
svn commit: r1055538 - in /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver: Connection.java ConnectionPool.java IConnectionPool.java Utils.java
Author: eevans
Date: Wed Jan 5 17:24:07 2011
New Revision: 1055538

URL: http://svn.apache.org/viewvc?rev=1055538&view=rev
Log:
CASSANDRA-1710 basic connection pooling for java driver

Patch by eevans for CASSANDRA-1710

Added:
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Utils.java
Modified:
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java

Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java?rev=1055538&r1=1055537&r2=1055538&view=diff
==============================================================================
--- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java (original)
+++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java Wed Jan 5 17:24:07 2011
@@ -1,33 +1,8 @@
-/*
- * 
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- * 
- *   http://www.apache.org/licenses/LICENSE-2.0
- * 
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- * 
- */
 package org.apache.cassandra.cql.driver;
 
-import java.io.ByteArrayOutputStream;
-import java.nio.ByteBuffer;
-import java.util.zip.Deflater;
-
 import org.apache.cassandra.thrift.Cassandra;
 import org.apache.cassandra.thrift.Compression;
 import org.apache.cassandra.thrift.CqlResult;
-import org.apache.cassandra.thrift.CqlRow;
 import org.apache.cassandra.thrift.InvalidRequestException;
 import org.apache.cassandra.thrift.TimedOutException;
 import org.apache.cassandra.thrift.UnavailableException;
@@ -37,105 +12,95 @@ import org.apache.thrift.protocol.TProto
 import org.apache.thrift.transport.TFramedTransport;
 import org.apache.thrift.transport.TSocket;
 import org.apache.thrift.transport.TTransport;
+import org.apache.thrift.transport.TTransportException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+/** CQL connection object. */
 public class Connection
 {
-    private static final Logger logger = LoggerFactory.getLogger(Connection.class);
+    public static Compression defaultCompression = Compression.GZIP;
+    public final String hostName;
+    public final int portNo;
 
-    public String hostName;
-    public int port;
+    private static final Logger logger = LoggerFactory.getLogger(Connection.class);
+    protected long timeOfLastFailure = 0;
+    protected int numFailures = 0;
     private Cassandra.Client client;
     private TTransport transport;
-    private Compression defaultCompression = Compression.GZIP;
 
-    public Connection(String keyspaceName, String...hosts) throws InvalidRequestException, TException
+    /**
+     * Create a new <code>Connection</code> instance.
+     *
+     * @param hostName hostname or IP address of the remote host
+     * @param portNo TCP port number
+     * @throws TTransportException if unable to connect
+     */
+    public Connection(String hostName, int portNo) throws TTransportException
     {
-        assert hosts.length > 0;
+        this.hostName = hostName;
+        this.portNo = portNo;
 
-        for (String hostSpec : hosts)
-        {
-            String[] parts = hostSpec.split(":", 2);
-            this.hostName = parts[0];
-            this.port = Integer.parseInt(parts[1]);
-
-            // TODO: This will need to do connection pooling.
-            break;
-        }
-
-        TSocket socket = new TSocket(hostName, port);
+        TSocket socket = new TSocket(hostName, portNo);
         transport = new TFramedTransport(socket);
         TProtocol protocol = new TBinaryProtocol(transport);
         client = new Cassandra.Client(protocol);
         socket.open();
 
-        client.set_keyspace(keyspaceName);
-    }
-
-    private ByteBuffer compressQuery(String queryStr, Compression compression)
-    {
-        byte[] data = queryStr.getBytes();
-        Deflater compressor = new Deflater();
-        compressor.setInput(data);
-        compressor.finish();
-
-        ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
-
svn commit: r1055539 - /cassandra/trunk/build.xml
Author: eevans
Date: Wed Jan 5 17:24:11 2011
New Revision: 1055539

URL: http://svn.apache.org/viewvc?rev=1055539&view=rev
Log:
compile driver source

Patch by eevans for CASSANDRA-1710

Modified:
    cassandra/trunk/build.xml

Modified: cassandra/trunk/build.xml
URL: http://svn.apache.org/viewvc/cassandra/trunk/build.xml?rev=1055539&r1=1055538&r2=1055539&view=diff
==============================================================================
--- cassandra/trunk/build.xml (original)
+++ cassandra/trunk/build.xml Wed Jan 5 17:24:11 2011
@@ -26,6 +26,7 @@
     <property name="basedir" value="."/>
     <property name="build.src" value="${basedir}/src"/>
     <property name="build.src.java" value="${basedir}/src/java"/>
+    <property name="build.src.driver" value="${basedir}/drivers/java/src" />
     <property name="avro.src" value="${basedir}/src/avro"/>
     <property name="build.src.gen-java" value="${basedir}/src/gen-java"/>
     <property name="build.lib" value="${basedir}/lib"/>
@@ -300,7 +301,8 @@
             <src path="${build.src.java}"/>
             <src path="${build.src.gen-java}"/>
             <src path="${interface.thrift.dir}/gen-java"/>
-            <classpath refid="cassandra.classpath"/>
+            <src path="${build.src.driver}" />
+            <classpath refid="cassandra.classpath"/>
         </javac>
 
         <taskdef name="paranamer" classname="com.thoughtworks.paranamer.ant.ParanamerGeneratorTask"
[jira] Commented: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977863#action_12977863 ] Eric Evans commented on CASSANDRA-1710: --- basic pooling committed. Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v1-0002-compile-driver-source.txt, v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v2-0002-compile-driver-source.txt, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1710: -- Assignee: (was: Eric Evans) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v1-0002-compile-driver-source.txt, v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v2-0002-compile-driver-source.txt, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1710: -- Attachment: (was: v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0002-compile-driver-source.txt, v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v2-0002-compile-driver-source.txt, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1710: -- Attachment: (was: v1-0002-compile-driver-source.txt) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1710: -- Attachment: (was: v2-0002-compile-driver-source.txt) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1710: -- Attachment: (was: v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (CASSANDRA-1940) Twisted driver for CQL
Twisted driver for CQL -- Key: CASSANDRA-1940 URL: https://issues.apache.org/jira/browse/CASSANDRA-1940 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: • Query compression • Keyspace assignment on connection • Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (CASSANDRA-1940) Twisted driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-1940: --- Assignee: Brandon Williams Twisted driver for CQL -- Key: CASSANDRA-1940 URL: https://issues.apache.org/jira/browse/CASSANDRA-1940 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Assignee: Brandon Williams Priority: Minor Fix For: 0.8 Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: • Query compression • Keyspace assignment on connection • Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1711) Python driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977910#action_12977910 ] Gary Dusbabek commented on CASSANDRA-1711: -- +1, except needs apache license headers. Python driver for CQL - Key: CASSANDRA-1711 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v1-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt, v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
svn commit: r1055591 - in /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver: Connection.java ConnectionPool.java IConnectionPool.java Utils.java
Author: eevans Date: Wed Jan 5 19:23:51 2011 New Revision: 1055591 URL: http://svn.apache.org/viewvc?rev=1055591view=rev Log: license headers (java driver source) Patch by eevans for CASSANDRA-1710 Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Utils.java Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java?rev=1055591r1=1055590r2=1055591view=diff == --- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java (original) +++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java Wed Jan 5 19:23:51 2011 @@ -1,3 +1,24 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ * + */ + package org.apache.cassandra.cql.driver; import org.apache.cassandra.thrift.Cassandra; Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java?rev=1055591r1=1055590r2=1055591view=diff == --- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java (original) +++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java Wed Jan 5 19:23:51 2011 @@ -1,3 +1,23 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ * + */ package org.apache.cassandra.cql.driver; Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java?rev=1055591r1=1055590r2=1055591view=diff == --- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java (original) +++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java Wed Jan 5 19:23:51 2011 @@ -1,3 +1,24 @@ +/* + * + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * License); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + * + */ + package org.apache.cassandra.cql.driver; public interface IConnectionPool Modified:
svn commit: r1055594 - in /cassandra/trunk: drivers/py/cql/__init__.py drivers/py/cql/connection.py drivers/py/cql/connection_pool.py drivers/py/cql/errors.py test/system/test_cql.py
Author: eevans Date: Wed Jan 5 19:27:50 2011 New Revision: 1055594 URL: http://svn.apache.org/viewvc?rev=1055594view=rev Log: basic connection pooling for python driver Patch by eevans; reviewed by gdusbabek for CASSANDRA-1711 Added: cassandra/trunk/drivers/py/cql/connection.py cassandra/trunk/drivers/py/cql/connection_pool.py cassandra/trunk/drivers/py/cql/errors.py Modified: cassandra/trunk/drivers/py/cql/__init__.py cassandra/trunk/test/system/test_cql.py Modified: cassandra/trunk/drivers/py/cql/__init__.py URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/py/cql/__init__.py?rev=1055594r1=1055593r2=1055594view=diff == --- cassandra/trunk/drivers/py/cql/__init__.py (original) +++ cassandra/trunk/drivers/py/cql/__init__.py Wed Jan 5 19:27:50 2011 @@ -1,79 +1,23 @@ -from os.path import exists, abspath, dirname, join -from thrift.transport import TTransport, TSocket -from thrift.protocol import TBinaryProtocol -from thrift.Thrift import TApplicationException -import zlib +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# License); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an AS IS BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+ + +Cassandra Query Language driver + -try: -from cassandra import Cassandra -from cassandra.ttypes import Compression, InvalidRequestException, \ - CqlResultType -except ImportError: -# Hack to run from a source tree -import sys -sys.path.append(join(abspath(dirname(__file__)), - '..', - '..', - '..', - 'interface', - 'thrift', - 'gen-py')) -from cassandra import Cassandra -from cassandra.ttypes import Compression, InvalidRequestException, \ - CqlResultType - -COMPRESSION_SCHEMES = ['GZIP'] -DEFAULT_COMPRESSION = 'GZIP' - -class Connection(object): -def __init__(self, keyspace, host, port=9160): -socket = TSocket.TSocket(host, port) -self.transport = TTransport.TFramedTransport(socket) -protocol = TBinaryProtocol.TBinaryProtocolAccelerated(self.transport) -self.client = Cassandra.Client(protocol) -socket.open() - -if keyspace: -self.execute('USE %s' % keyspace) - -def execute(self, query, compression=None): -compress = compression is None and DEFAULT_COMPRESSION \ -or compression.upper() - -compressed_query = Connection.compress_query(query, compress) -request_compression = getattr(Compression, compress) - -try: -response = self.client.execute_cql_query(compressed_query, - request_compression) -except InvalidRequestException, ire: -raise CQLException(Bad Request: %s % ire.why) -except TApplicationException, tapp: -raise CQLException(Internal application error) -except Exception, exc: -raise CQLException(exc) - -if response.type == CqlResultType.ROWS: -return response.rows -if response.type == CqlResultType.INT: -return response.num - -return None - -def close(self): -self.transport.close() - -@classmethod -def compress_query(cls, query, compression): -if not compression in COMPRESSION_SCHEMES: -raise InvalidCompressionScheme(compression) - -if compression == 'GZIP': -return zlib.compress(query) - - -class InvalidCompressionScheme(Exception): pass -class CQLException(Exception): pass - -# vi: ai ts=4 tw=0 sw=4 et +from connection import Connection +from 
connection_pool import ConnectionPool Added: cassandra/trunk/drivers/py/cql/connection.py URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/py/cql/connection.py?rev=1055594view=auto == --- cassandra/trunk/drivers/py/cql/connection.py (added) +++ cassandra/trunk/drivers/py/cql/connection.py Wed Jan 5 19:27:50 2011 @@ -0,0 +1,121 @@ + +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements.
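The query compression step that this commit moves out of `__init__.py` can be sketched on its own as follows. This is a minimal illustration, not the committed code: it keeps the `COMPRESSION_SCHEMES`, `DEFAULT_COMPRESSION`, `compress_query` and `InvalidCompressionScheme` names from the removed lines, but it is written for Python 3, and the UTF-8 encoding and the `decompress_query` helper are assumptions added for the example.

```python
import zlib

# Scheme names mirror the list in the removed module-level code.
COMPRESSION_SCHEMES = ["GZIP"]
DEFAULT_COMPRESSION = "GZIP"


class InvalidCompressionScheme(Exception):
    """Raised when the caller names a scheme the driver does not support."""


def compress_query(query, compression=DEFAULT_COMPRESSION):
    """Return the query text compressed with the named scheme."""
    scheme = compression.upper()
    if scheme not in COMPRESSION_SCHEMES:
        raise InvalidCompressionScheme(scheme)
    # As in the removed code, 'GZIP' maps to zlib.compress.
    # Encoding the text as UTF-8 first is an assumption of this sketch.
    return zlib.compress(query.encode("utf-8"))


def decompress_query(payload):
    """Hypothetical inverse, standing in for what a server would do."""
    return zlib.decompress(payload).decode("utf-8")
```

A caller would pass the compressed bytes, together with the matching compression constant, to the query call; the round trip through `decompress_query` is only there to show that the payload is recoverable.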
[jira] Updated: (CASSANDRA-1859) distributed test harness
[ https://issues.apache.org/jira/browse/CASSANDRA-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-1859: -- Reviewer: brandon.williams (was: urandom) Fix Version/s: (was: 0.8) 0.7.1 distributed test harness Key: CASSANDRA-1859 URL: https://issues.apache.org/jira/browse/CASSANDRA-1859 Project: Cassandra Issue Type: Test Components: Tools Reporter: Kelvin Kakugawa Assignee: Kelvin Kakugawa Fix For: 0.7.1 Attachments: 0001-Add-distributed-ultra-long-running-tests-using-Whirr-j.txt, 0002-Pull-whirr-0.3.0-incubating-SNAPSHOT-155-from-Twitter-.txt, 0003-add-a-test-for-one-writes-and-all-reads.txt Distributed Test Harness - deploys a cluster on a cloud provider - runs tests targeted at the cluster - tears down the cluster -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1711) Python driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1711: -- Assignee: (was: Eric Evans) Python driver for CQL - Key: CASSANDRA-1711 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1711) Python driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977921#action_12977921 ] Eric Evans commented on CASSANDRA-1711: --- basic pooling committed (w/ license headers) Python driver for CQL - Key: CASSANDRA-1711 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1711) Python driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans updated CASSANDRA-1711: -- Attachment: (was: v1-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt) Python driver for CQL - Key: CASSANDRA-1711 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (CASSANDRA-1941) Add distributed test doing reads during bootstrap of additional node
Add distributed test doing reads during bootstrap of additional node Key: CASSANDRA-1941 URL: https://issues.apache.org/jira/browse/CASSANDRA-1941 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Priority: Minor Fix For: 0.8 Following introduction of the distributed test framework in CASSANDRA-1859, we should extend that to test reads while bootstrap happens (this is a scenario that has had regressions in the past). See test/distributed/README.txt for intro. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1857) nodetool has invalidaterowcache but no invalidatekeycache
[ https://issues.apache.org/jira/browse/CASSANDRA-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Hermes updated CASSANDRA-1857: -- Attachment: 1857.txt Add invalidateKeyCache, add both to NodeCmd (with all the spiffy optional KS and CFs ala repair/compact/cleanup/flush). nodetool has invalidaterowcache but no invalidatekeycache - Key: CASSANDRA-1857 URL: https://issues.apache.org/jira/browse/CASSANDRA-1857 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Robert Coli Assignee: Jon Hermes Priority: Trivial Attachments: 1857.txt In many cases where you would want to use invalidaterowcache, you would probably also want to invalidatekeycache. Currently, you can invalidaterowcache, but not invalidatekeycache. It seems that users should, generally, be able to do both or neither, but not one or the other. A brief look at the NodeCmd/ColumnFamilyStore code suggests that the stubs/hooks for this feature do not currently exist. =Rob -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1857) nodetool has invalidaterowcache but no invalidatekeycache
[ https://issues.apache.org/jira/browse/CASSANDRA-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jon Hermes updated CASSANDRA-1857: -- Fix Version/s: 0.7.1 nodetool has invalidaterowcache but no invalidatekeycache - Key: CASSANDRA-1857 URL: https://issues.apache.org/jira/browse/CASSANDRA-1857 Project: Cassandra Issue Type: Improvement Components: Tools Reporter: Robert Coli Assignee: Jon Hermes Priority: Trivial Fix For: 0.7.1 Attachments: 1857.txt In many cases where you would want to use invalidaterowcache, you would probably also want to invalidatekeycache. Currently, you can invalidaterowcache, but not invalidatekeycache. It seems that users should, generally, be able to do both or neither, but not one or the other. A brief look at the NodeCmd/ColumnFamilyStore code suggests that the stubs/hooks for this feature do not currently exist. =Rob -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (CASSANDRA-1941) Add distributed test doing reads during bootstrap of additional node
[ https://issues.apache.org/jira/browse/CASSANDRA-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams reassigned CASSANDRA-1941: --- Assignee: Brandon Williams Add distributed test doing reads during bootstrap of additional node Key: CASSANDRA-1941 URL: https://issues.apache.org/jira/browse/CASSANDRA-1941 Project: Cassandra Issue Type: New Feature Components: Core Reporter: Jonathan Ellis Assignee: Brandon Williams Priority: Minor Fix For: 0.8 Following introduction of the distributed test framework in CASSANDRA-1859, we should extend that to test reads while bootstrap happens (this is a scenario that has had regressions in the past). See test/distributed/README.txt for intro. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1859) distributed test harness
[ https://issues.apache.org/jira/browse/CASSANDRA-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stu Hood updated CASSANDRA-1859: Attachment: 0.7-1859.tgz Attaching a rebase for the 0.7 branch. distributed test harness Key: CASSANDRA-1859 URL: https://issues.apache.org/jira/browse/CASSANDRA-1859 Project: Cassandra Issue Type: Test Components: Tools Reporter: Kelvin Kakugawa Assignee: Kelvin Kakugawa Fix For: 0.7.1 Attachments: 0.7-1859.tgz, 0001-Add-distributed-ultra-long-running-tests-using-Whirr-j.txt, 0002-Pull-whirr-0.3.0-incubating-SNAPSHOT-155-from-Twitter-.txt, 0003-add-a-test-for-one-writes-and-all-reads.txt Distributed Test Harness - deploys a cluster on a cloud provider - runs tests targeted at the cluster - tears down the cluster -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Dusbabek updated CASSANDRA-1710: - Attachment: jdbc-ish.diff This may not be useful or productive, but I had to put the code somewhere. This patch (which applies on top) makes the API JDBC-ish, which may be undesirable. However, it does push the pool abstraction down so that client code would think about pools and could treat all Connection objects the same way: get connection, execute query, close. I haven't given too much thought as to how this would work out in other languages, but it is idiomatic for Java. :) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: jdbc-ish.diff, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
svn commit: r1055618 - in /cassandra/branches/cassandra-0.7: ./ test/distributed/ test/distributed/org/ test/distributed/org/apache/ test/distributed/org/apache/cassandra/ test/distributed/org/apache/
Author: brandonwilliams Date: Wed Jan 5 20:16:14 2011 New Revision: 1055618 URL: http://svn.apache.org/viewvc?rev=1055618view=rev Log: Distributed test harness. Patch by Kelvin Kakugawa, Stu Hood, and Ryan King, reviewed by brandonwilliams for CASSANDRA-1859. Added: cassandra/branches/cassandra-0.7/test/distributed/ cassandra/branches/cassandra-0.7/test/distributed/README.txt cassandra/branches/cassandra-0.7/test/distributed/ivy.xml - copied, changed from r1055594, cassandra/branches/cassandra-0.7/ivysettings.xml cassandra/branches/cassandra-0.7/test/distributed/org/ cassandra/branches/cassandra-0.7/test/distributed/org/apache/ cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/ cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/CassandraServiceController.java cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MovementTest.java cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MutationTest.java cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/TestBase.java cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/ cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/BlobUtils.java cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/KeyPair.java cassandra/branches/cassandra-0.7/test/resources/whirr-default.properties Modified: cassandra/branches/cassandra-0.7/CHANGES.txt cassandra/branches/cassandra-0.7/build.xml cassandra/branches/cassandra-0.7/ivysettings.xml Modified: cassandra/branches/cassandra-0.7/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1055618r1=1055617r2=1055618view=diff == --- cassandra/branches/cassandra-0.7/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jan 5 20:16:14 2011 @@ -12,6 +12,7 @@ dev * implement describeOwnership for BOP, COPP (CASSANDRA-1928) * make read repair behave as expected for 
ConsistencyLevel ONE (CASSANDRA-982) + * distributed test harness (CASSANDRA-1859) 0.7.0-rc4 Modified: cassandra/branches/cassandra-0.7/build.xml URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/build.xml?rev=1055618r1=1055617r2=1055618view=diff == --- cassandra/branches/cassandra-0.7/build.xml (original) +++ cassandra/branches/cassandra-0.7/build.xml Wed Jan 5 20:16:14 2011 @@ -40,12 +40,14 @@ property name=interface.avro.dir value=${interface.dir}/avro/ property name=test.dir value=${basedir}/test/ property name=test.resources value=${test.dir}/resources/ +property name=test.lib value=${build.dir}/test/lib/ property name=test.classes value=${build.dir}/test/classes/ property name=test.conf value=${test.dir}/conf/ property name=test.data value=${test.dir}/data/ property name=test.name value=*Test/ property name=test.unit.src value=${test.dir}/unit/ property name=test.long.src value=${test.dir}/long/ +property name=test.distributed.src value=${test.dir}/distributed/ property name=dist.dir value=${build.dir}/dist/ property name=base.version value=0.7.0-rc4/ condition property=version value=${base.version} @@ -105,6 +107,7 @@ fail unless=is.source.artifact message=Not a source artifact, stopping here. 
/ mkdir dir=${build.classes}/ +mkdir dir=${test.lib}/ mkdir dir=${test.classes}/ mkdir dir=${build.src.gen-java}/ /target @@ -165,10 +168,17 @@ /target target name=ivy-retrieve-build depends=ivy-init + ivy:resolve file=${basedir}/ivy.xml/ ivy:retrieve type=jar,source sync=true pattern=${build.dir.lib}/[type]s/[artifact]-[revision].[ext] / /target +target name=ivy-retrieve-test depends=ivy-init + ivy:resolve file=${basedir}/test/distributed/ivy.xml/ + ivy:retrieve type=jar,source sync=true + pattern=${test.lib}/[type]s/[artifact]-[revision].[ext] / +/target + !-- Generate avro code -- @@ -453,28 +463,49 @@ /copy /target + target name=build-distributed-test depends=build-test,ivy-retrieve-test description=Compile distributed test classes (which have additional deps) +javac + debug=true + debuglevel=${debuglevel} + destdir=${test.classes} + classpath + path refid=cassandra.classpath/ + pathelement location=${test.classes}/ + fileset dir=${test.lib} +include name=**/*.jar / + /fileset + /classpath + src path=${test.distributed.src}/ +/javac + /target + macrodef name=testmacro attribute name=suitename / attribute name=inputdir / attribute name=timeout / +
[jira] Issue Comment Edited: (CASSANDRA-1710) Java driver for CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977948#action_12977948 ] Gary Dusbabek edited comment on CASSANDRA-1710 at 1/5/11 3:19 PM: -- This may not be useful or productive, but I had to put the code somewhere. This patch (which applies on top) makes the API JDBC-ish, which may be undesirable. However, it does push the pool abstraction down so that client code wouldn't need to think about pools and could treat all Connection objects the same way: get connection, execute query, close. I haven't given too much thought as to how this would work out in other languages, but it is idiomatic for Java. :) was (Author: gdusbabek): This may not be useful or productive, but I had to put the code somewhere. This patch (which applies on top) makes the API JDBC-ish, which may be undesirable. However, it does push the pool abstraction down so that client code would think about pools and could treat all Connection objects the same way: get connection, execute query, close. I haven't given too much thought as to how this would work out in other languages, but it is idiomatic for Java. :) Java driver for CQL --- Key: CASSANDRA-1710 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710 Project: Cassandra Issue Type: Sub-task Components: API Affects Versions: 0.8 Reporter: Eric Evans Priority: Minor Fix For: 0.8 Attachments: jdbc-ish.diff, v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, v3-0002-compile-driver-source.txt Original Estimate: 0h Remaining Estimate: 0h In-tree CQL drivers should be reasonably consistent with one another (wherever possible/practical), and implement a minimum of: * Query compression * Keyspace assignment on connection * Connection pooling / load-balancing The goal is not to supplant the idiomatic libraries, but to provide a consistent, stable base for them to build upon. -- This message is automatically generated by JIRA.
- You can reply to this email to add a comment to the issue online.
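The "get connection, execute query, close" flow described in the comment above can be sketched roughly as follows. This is not the code in jdbc-ish.diff (which is Java and is not shown in this thread); every class and method name here (`PooledConnection`, `ConnectionPool`, `get_connection`, `FakeRawConnection`) is hypothetical and exists only to illustrate the idea of pushing the pool below the connection interface.

```python
import queue


class PooledConnection:
    """Hypothetical wrapper: to the caller it looks like a plain connection."""

    def __init__(self, raw, pool):
        self._raw = raw
        self._pool = pool
        self._closed = False

    def execute(self, query):
        if self._closed:
            raise RuntimeError("connection is closed")
        return self._raw.execute(query)

    def close(self):
        # Closing hands the underlying connection back to the pool
        # instead of tearing it down; repeated calls are harmless.
        if not self._closed:
            self._closed = True
            self._pool._release(self._raw)


class ConnectionPool:
    """Hypothetical fixed-size pool built on a thread-safe queue."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    def get_connection(self, timeout=None):
        # Blocks until a connection is idle (or raises queue.Empty on timeout).
        return PooledConnection(self._idle.get(timeout=timeout), self)

    def _release(self, raw):
        self._idle.put(raw)


class FakeRawConnection:
    """Stand-in for a real driver connection, for demonstration only."""

    def execute(self, query):
        return "ran: " + query
```

With this shape, client code never returns connections to a pool explicitly: it calls `close()` as it would on an unpooled connection, which is the property the comment argues for.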
[jira] Commented: (CASSANDRA-1937) Keep partitioned counters (contexts) sorted
[ https://issues.apache.org/jira/browse/CASSANDRA-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12977953#action_12977953 ] Kelvin Kakugawa commented on CASSANDRA-1937: Definitely agree. I made the trade-off to keep them in update order, i.e. the order in which the nodes were last updated. However, keeping them in node-id sorted order did cross my mind. Keep partitioned counters (contexts) sorted - Key: CASSANDRA-1937 URL: https://issues.apache.org/jira/browse/CASSANDRA-1937 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Fix For: 0.8 Original Estimate: 4h Remaining Estimate: 4h In the value of CounterColumns, the code keeps the subparts unsorted, but sorts them 'on the fly' when needed (in diff() and merge()). It would be more efficient to keep the parts always sorted (it would also be simpler, since it removes the need for the ad-hoc in-place quicksort in CounterContext). NOTE: this breaks the on-disk file format (for counters) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
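To illustrate why keeping the parts sorted helps merge(), here is a small sketch of a single-pass merge over two contexts that are already sorted by node id. The tuple layout `(node_id, clock, count)` and the tie-breaking rule (keep the entry with the larger clock) are placeholder assumptions for the example; they are not taken from CounterContext, whose actual layout and reconciliation rules are not described in this thread.

```python
def merge_sorted_contexts(left, right):
    """Merge two lists of (node_id, clock, count) tuples sorted by node_id.

    Because both inputs are sorted, one linear pass is enough and no
    sort is needed before or after the merge.
    """
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            merged.append(left[i])
            i += 1
        elif left[i][0] > right[j][0]:
            merged.append(right[j])
            j += 1
        else:
            # Same node id on both sides. Placeholder rule (an assumption):
            # keep whichever entry carries the larger clock.
            merged.append(left[i] if left[i][1] >= right[j][1] else right[j])
            i += 1
            j += 1
    # At most one of these tails is non-empty; both are already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The output is itself sorted by node id, so the invariant is preserved across repeated merges. With unsorted parts, each call would first have to sort both inputs, which is the cost the issue proposes to remove.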
svn commit: r1055623 - /cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar
Author: eevans Date: Wed Jan 5 20:27:58 2011 New Revision: 1055623 URL: http://svn.apache.org/viewvc?rev=1055623view=rev Log: replace high-scale-lib.jar from maven central Modified: cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar Modified: cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar?rev=1055623r1=1055622r2=1055623view=diff == Binary files - no diff available.
svn commit: r1055626 - in /cassandra/trunk: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ lib/ test/distributed/ test/distributed/org/ test/distributed/org/apache/ test/distributed/org/apa
Author: eevans Date: Wed Jan 5 20:36:35 2011 New Revision: 1055626 URL: http://svn.apache.org/viewvc?rev=1055626view=rev Log: merge w/ 0.7 branch Added: cassandra/trunk/test/distributed/ - copied from r1055624, cassandra/branches/cassandra-0.7/test/distributed/ cassandra/trunk/test/distributed/README.txt - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/README.txt cassandra/trunk/test/distributed/ivy.xml - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/ivy.xml cassandra/trunk/test/distributed/org/ - copied from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/ cassandra/trunk/test/distributed/org/apache/ - copied from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/ cassandra/trunk/test/distributed/org/apache/cassandra/ - copied from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/ cassandra/trunk/test/distributed/org/apache/cassandra/CassandraServiceController.java - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/CassandraServiceController.java cassandra/trunk/test/distributed/org/apache/cassandra/MovementTest.java - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MovementTest.java cassandra/trunk/test/distributed/org/apache/cassandra/MutationTest.java - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MutationTest.java cassandra/trunk/test/distributed/org/apache/cassandra/TestBase.java - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/TestBase.java cassandra/trunk/test/distributed/org/apache/cassandra/utils/ - copied from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/ cassandra/trunk/test/distributed/org/apache/cassandra/utils/BlobUtils.java - copied unchanged from 
r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/BlobUtils.java cassandra/trunk/test/distributed/org/apache/cassandra/utils/KeyPair.java - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/KeyPair.java cassandra/trunk/test/resources/whirr-default.properties - copied unchanged from r1055624, cassandra/branches/cassandra-0.7/test/resources/whirr-default.properties Modified: cassandra/trunk/ (props changed) cassandra/trunk/CHANGES.txt cassandra/trunk/build.xml cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/trunk/ivysettings.xml cassandra/trunk/lib/high-scale-lib.jar Propchange: cassandra/trunk/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 20:36:35 2011 @@ -1,5 +1,5 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1055311 -/cassandra/branches/cassandra-0.7:1026516-1055325 +/cassandra/branches/cassandra-0.7:1026516-1055624 /cassandra/branches/cassandra-0.7.0:1053690-1054631 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 /incubator/cassandra/branches/cassandra-0.3:774578-796573 Modified: cassandra/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1055626r1=1055625r2=1055626view=diff == --- cassandra/trunk/CHANGES.txt (original) +++ cassandra/trunk/CHANGES.txt Wed Jan 5 20:36:35 2011 @@ -17,6 +17,7 @@ * implement describeOwnership for BOP, COPP (CASSANDRA-1928) * make read repair behave as expected for 
ConsistencyLevel ONE (CASSANDRA-982) + * distributed test harness (CASSANDRA-1859) 0.7.0-rc4 Modified: cassandra/trunk/build.xml URL: http://svn.apache.org/viewvc/cassandra/trunk/build.xml?rev=1055626r1=1055625r2=1055626view=diff == --- cassandra/trunk/build.xml (original) +++ cassandra/trunk/build.xml Wed Jan 5 20:36:35 2011 @@ -41,12 +41,14 @@ property name=interface.avro.dir value=${interface.dir}/avro/ property name=test.dir value=${basedir}/test/ property name=test.resources value=${test.dir}/resources/ +
[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977959#action_12977959 ]

Kelvin Kakugawa commented on CASSANDRA-1936:

The refactor to store the partitioned counter in the value, instead, is a good direction. I noticed some material changes, though:
- client deltas are in the correct partitioned counter format, but targeted at the local node
- the RowMutation : updateCommutativeTypes was removed

The code works, now, as long as the coordinator node (the local node) is part of the replica set. However, if it's not, then all updates from those non-replica coordinators will be fixed at the highest delta. The reconciliation strategy (on a replica) is sum my node's updates, but take the highest update from all other nodes. (Just ran a distributed test to validate my hypothesis--the same test that's included on 1072.)

In the current code, value and partitioned counter are broken apart, because when the RowMutation is created we don't know which node we're going to write to, yet. So, we can't create the final partitioned counter. We could create a sentinel node that replicas need to look for, but that's dirty. The way I solved it was using value for the client delta and converting it to the partitioned counter (w/ the target node) via RM : updateCommutativeTypes.

I have an alternate proposal for this ticket, the second patch on 1072. I'll post it and we can take the best parts of each patch.
Fit partitioned counter directly into CounterColumn.value
Key: CASSANDRA-1936
URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8
Attachments: 0001-Put-partitioned-counter-directly-in-column-value.patch

The current implementation of CounterColumn keeps both the partitioned counter and the total value of the counter (that is, the sum of the parts of the partitioned counter). This wastes space and requires the code to keep both representations in sync. This ticket proposes to remove the total value from the representation and to calculate it only when returning the value to the client. NOTE: this breaks the on-disk file format (for counters)

-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
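The reconciliation rule described in the comment on this ticket (sum the local node's updates, take the highest update from every other node) can be sketched roughly as follows. This is only an illustration, not Cassandra's code: the class `PartitionedCounterSketch`, the use of a `Map` keyed by string node ids, and the node names "R" and "C" are all assumptions made for the example; the real partitioned counter is a binary structure stored in the column.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model: a partitioned counter as nodeId -> count.
final class PartitionedCounterSketch {

    // Merge an incoming update into the state held by the replica `localNode`.
    static Map<String, Long> reconcile(String localNode,
                                       Map<String, Long> current,
                                       Map<String, Long> update) {
        Map<String, Long> merged = new HashMap<>(current);
        for (Map.Entry<String, Long> e : update.entrySet()) {
            if (e.getKey().equals(localNode)) {
                // Updates tagged with this replica's own id are deltas: add them up.
                merged.merge(e.getKey(), e.getValue(), Long::sum);
            } else {
                // Parts owned by other nodes are treated as totals: keep the highest.
                merged.merge(e.getKey(), e.getValue(), Long::max);
            }
        }
        return merged;
    }

    // The client-visible value is the sum of all parts.
    static long total(Map<String, Long> counter) {
        long sum = 0;
        for (long part : counter.values()) {
            sum += part;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Long> state = new HashMap<>();
        // Two increments of 1 tagged with a non-replica coordinator "C".
        Map<String, Long> fromCoordinator = Collections.singletonMap("C", 1L);
        state = reconcile("R", state, fromCoordinator);
        state = reconcile("R", state, fromCoordinator);
        System.out.println(total(state)); // the second increment is absorbed by max
    }
}
```

Under these assumptions, two increments tagged with a non-replica coordinator collapse into one, which matches the "fixed at the highest delta" behaviour reported in the comment, while increments tagged with the replica's own id accumulate.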
[jira] Commented: (CASSANDRA-1888) Replace lib/high-scale-lib.jar with equivalent from maven central repository
[ https://issues.apache.org/jira/browse/CASSANDRA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977960#action_12977960 ]

Eric Evans commented on CASSANDRA-1888:

I committed 1.1.1 from (http://repo1.maven.org/maven2/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar).

Replace lib/high-scale-lib.jar with equivalent from maven central repository
Key: CASSANDRA-1888
URL: https://issues.apache.org/jira/browse/CASSANDRA-1888
Project: Cassandra
Issue Type: Improvement
Affects Versions: 0.7.0 rc 3
Reporter: Stephen Connolly
Fix For: 0.7.1

As part of my effort to get Cassandra published to Maven Central, there are a number of libraries which Cassandra depends on but which are not available in Maven Central. Perhaps the most interesting of these is the Public Domain high-scale-lib.jar. The author is an XML build tool hater (and that includes ANT), and the artifact itself contains a lot of unusual cruft... .CVS folders, etc. The build process uses a build.java that is effectively a rewrite of Make in Java, with the Makefile embedded in the build.java.

I have rebuilt the artifacts and published them to the Maven Central repository. Because the requirements for publishing to Maven Central include publishing a javadoc.jar and a sources.jar with gpg signatures, etc., it was easier to take the source code and transform it into a Maven project. The project is hosted at github: http://stephenc.github.com/high-scale-lib

I have published the following versions, all signed with the steph...@apache.org PGP key:
1.0.0
1.0.1
1.1.0
1.1.1
1.1.2

These should all be equivalent to the releases by Cliff Click, with the only exception being 1.1.1. For 1.1.1, Cliff's original build script did not run the unit tests correctly: one of the unit tests consistently fails, even on his build process, due to an invalid assumption that element ordering is preserved across serialization for NonBlockingIdentityHashMap. He fixed the test in 1.1.2, so I back-ported the test change. The code, however, remains as is.

In any case, can we change the version of high-scale-lib.jar in the lib directory to the version from Maven Central: http://repo1.maven.org/maven2/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar [The current version used by Cassandra is 1.1.1] Or perhaps even consider upgrading to 1.1.2 [though I can appreciate that this could be considered riskier].

My justification for the change is so that I can be sure that consumers of a Maven Central distribution of Cassandra will have exactly the same dependencies, which have been tested as part of the Cassandra release process, and not just the "Stephen's very damn sure they are the same" dependencies ;-)
[jira] Resolved: (CASSANDRA-1888) Replace lib/high-scale-lib.jar with equivalent from maven central repository
[ https://issues.apache.org/jira/browse/CASSANDRA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Evans resolved CASSANDRA-1888.
Resolution: Fixed
Fix Version/s: 0.7.1
Assignee: Stephen Connolly

Replace lib/high-scale-lib.jar with equivalent from maven central repository
Key: CASSANDRA-1888
URL: https://issues.apache.org/jira/browse/CASSANDRA-1888
Project: Cassandra
Issue Type: Improvement
Affects Versions: 0.7.0 rc 3
Reporter: Stephen Connolly
Assignee: Stephen Connolly
Fix For: 0.7.1
[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977965#action_12977965 ]

Sylvain Lebresne commented on CASSANDRA-1936:

You're right, don't know what I have smoked. Actually I was still thinking in the context of 1546, where it's always a replica that applies the update (since the coordinator simply forwards the update to a replica if it's not one itself). That being said, I do plan to introduce this part of 1546, because it allows the consistency levels to be rehabilitated. So maybe I should do that first, in which case I think the method here would work.

Still, curious to see your patch. But I'll admit that I was actually happy to get rid of the updateCommutativeType logic.

Fit partitioned counter directly into CounterColumn.value
Key: CASSANDRA-1936
URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8
Attachments: 0001-Put-partitioned-counter-directly-in-column-value.patch
[jira] Updated: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa updated CASSANDRA-1936:
Attachment: 1936-ALT-0001-lazily-materialize-value.patch

Alternate strategy to lazily materialize value.

Fit partitioned counter directly into CounterColumn.value
Key: CASSANDRA-1936
URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8
Attachments: 0001-Put-partitioned-counter-directly-in-column-value.patch, 1936-ALT-0001-lazily-materialize-value.patch
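As a rough illustration of what "lazily materialize value" could mean in general, the sketch below keeps only the per-node parts and computes the total on first read, caching it until a part changes. It does not reflect the contents of the attached patch, which is not shown in this thread; `LazyCounterValue`, its fields, and the string node ids are hypothetical names chosen for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: store the parts, derive the total only when asked.
final class LazyCounterValue {
    private final Map<String, Long> parts = new LinkedHashMap<>();
    private Long cachedTotal;      // null until a reader asks for the total
    private int materializations;  // counts how often the sum was recomputed

    // Set one node's part; any previously computed total becomes stale.
    void setPart(String nodeId, long count) {
        parts.put(nodeId, count);
        cachedTotal = null;
    }

    // Compute the total on demand and reuse it until the next write.
    long total() {
        if (cachedTotal == null) {
            long sum = 0;
            for (long part : parts.values()) {
                sum += part;
            }
            cachedTotal = sum;
            materializations++;
        }
        return cachedTotal;
    }

    int materializations() {
        return materializations;
    }

    public static void main(String[] args) {
        LazyCounterValue counter = new LazyCounterValue();
        counter.setPart("a", 3);
        counter.setPart("b", 4);
        System.out.println(counter.total()); // 7
    }
}
```

The trade-off in a design like this is that writes stay cheap and nothing redundant needs to be persisted, at the cost of a summation on the read path.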
[jira] Resolved: (CASSANDRA-1909) normal replication shouldn't happen on counter CFs.
[ https://issues.apache.org/jira/browse/CASSANDRA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin Kakugawa resolved CASSANDRA-1909.
Resolution: Not A Problem

normal replication shouldn't happen on counter CFs.
Key: CASSANDRA-1909
URL: https://issues.apache.org/jira/browse/CASSANDRA-1909
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Gary Dusbabek
Assignee: Kelvin Kakugawa
Fix For: 0.8
svn commit: r1055642 - in /cassandra/branches/cassandra-0.7.0: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/java/org/apache/cassandra/utils/ test/unit/org/apache/
Author: jbellis
Date: Wed Jan 5 21:18:59 2011
New Revision: 1055642

URL: http://svn.apache.org/viewvc?rev=1055642&view=rev
Log:
fix offsets to ByteBuffer.get
patch by slebresne; reviewed by jbellis for CASSANDRA-1939

Modified:
    cassandra/branches/cassandra-0.7.0/CHANGES.txt
    cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java
    cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java
    cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java
    cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java
    cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/marshal/TypeCompareTest.java

Modified: cassandra/branches/cassandra-0.7.0/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/CHANGES.txt?rev=1055642&r1=1055641&r2=1055642&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7.0/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7.0/CHANGES.txt Wed Jan 5 21:18:59 2011
@@ -1,3 +1,7 @@
+0.7.0-final
+ * fix offsets to ByteBuffer.get (CASSANDRA-1939)
+
+
 0.7.0-rc4
  * fix cli crash after backgrounding (CASSANDRA-1875)
  * count timeouts in storageproxy latencies, and include latency

Modified: cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java (original)
+++ cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java Wed Jan 5 21:18:59 2011
@@ -55,7 +55,7 @@ public class DeletedColumn extends Colum
     @Override
     public int getLocalDeletionTime()
     {
-        return value.getInt(value.position()+value.arrayOffset());
+        return value.getInt(value.position());
     }

     @Override

Modified: cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java (original)
+++ cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java Wed Jan 5 21:18:59 2011
@@ -63,7 +63,7 @@ public class LongType extends AbstractTy
         }

-        return String.valueOf(bytes.getLong(bytes.position()+bytes.arrayOffset()));
+        return String.valueOf(bytes.getLong(bytes.position()));
     }

     public ByteBuffer fromString(String source)

Modified: cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java (original)
+++ cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java Wed Jan 5 21:18:59 2011
@@ -56,7 +56,7 @@ public class UUIDGen
     /** creates a type 1 uuid from raw bytes. */
     public static UUID getUUID(ByteBuffer raw)
     {
-        return new UUID(raw.getLong(raw.position() + raw.arrayOffset()), raw.getLong(raw.position() + raw.arrayOffset() + 8));
+        return new UUID(raw.getLong(raw.position()), raw.getLong(raw.position() + 8));
     }

     /** decomposes a uuid into raw bytes. */

Modified: cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java (original)
+++ cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java Wed Jan 5 21:18:59 2011
@@ -124,7 +124,7 @@ public class NameSortTest extends Cleanu
         assert subColumns.size() == 4;
         for (IColumn subColumn : subColumns)
         {
-            long k = subColumn.name().getLong(subColumn.name().position() + subColumn.name().arrayOffset());
+            long k = subColumn.name().getLong(subColumn.name().position());
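Why dropping `arrayOffset()` is the right fix can be seen with a small standalone example. Absolute `ByteBuffer` getters such as `getLong(int)` take an index relative to the buffer itself, not to its backing array, so adding `arrayOffset()` double-counts the offset whenever the buffer is a slice. The class and method names below (`ByteBufferOffsetDemo`, `readLongAtPosition`, `slicedExample`) are made up for the demonstration; only the `java.nio.ByteBuffer` calls are real.

```java
import java.nio.ByteBuffer;

final class ByteBufferOffsetDemo {

    // Reads a long at the buffer's current position without moving the position.
    static long readLongAtPosition(ByteBuffer bytes) {
        // The index passed to getLong(int) is relative to the buffer's own index 0,
        // so position() alone is the correct index.
        return bytes.getLong(bytes.position());
    }

    // Builds a 16-byte slice over a 24-byte array, so that arrayOffset() is 8.
    static ByteBuffer slicedExample() {
        byte[] backing = new byte[24];
        ByteBuffer whole = ByteBuffer.wrap(backing);
        whole.putLong(0, 111L);
        whole.putLong(8, 222L);
        whole.putLong(16, 333L);
        whole.position(8);
        return whole.slice(); // starts at backing[8]; position() == 0
    }

    public static void main(String[] args) {
        ByteBuffer slice = slicedExample();
        System.out.println(slice.arrayOffset());                                   // 8
        System.out.println(readLongAtPosition(slice));                             // 222
        System.out.println(slice.getLong(slice.position() + slice.arrayOffset())); // 333
    }
}
```

With the old expression the read lands eight bytes too far and silently returns the neighbouring value (333 instead of 222); on a slice with fewer bytes remaining it would throw `IndexOutOfBoundsException` instead. `arrayOffset()` is only needed when indexing into `array()` directly.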
[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value
[ https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977974#action_12977974 ]

Kelvin Kakugawa commented on CASSANDRA-1936:

The partitionedCounter of my strategy could be refactored into value, like your strategy. Yeah, unfortunately, the two-step RM creation (via updateCommutativeTypes) is still present.

Fit partitioned counter directly into CounterColumn.value
Key: CASSANDRA-1936
URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Fix For: 0.8
Attachments: 0001-Put-partitioned-counter-directly-in-column-value.patch, 1936-ALT-0001-lazily-materialize-value.patch
svn commit: r1055658 - in /cassandra/branches/cassandra-0.7: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/j
Author: jbellis Date: Wed Jan 5 21:57:03 2011 New Revision: 1055658 URL: http://svn.apache.org/viewvc?rev=1055658view=rev Log: merge from 0.7.0 Modified: cassandra/branches/cassandra-0.7/ (props changed) cassandra/branches/cassandra-0.7/CHANGES.txt cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/DeletedColumn.java cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/marshal/LongType.java cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/UUIDGen.java cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/NameSortTest.java cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/marshal/TypeCompareTest.java Propchange: cassandra/branches/cassandra-0.7/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 21:57:03 2011 @@ -1,6 +1,6 @@ /cassandra/branches/cassandra-0.6:922689-1055311 /cassandra/branches/cassandra-0.7:1026516,1035666,1050269 -/cassandra/branches/cassandra-0.7.0:1053690-1054631 +/cassandra/branches/cassandra-0.7.0:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 /cassandra/trunk:1026516-1026734,1028929 /incubator/cassandra/branches/cassandra-0.3:774578-796573 Modified: cassandra/branches/cassandra-0.7/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1055658r1=1055657r2=1055658view=diff == --- 
cassandra/branches/cassandra-0.7/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jan 5 21:57:03 2011 @@ -1,4 +1,4 @@ -dev +0.7.1-dev * buffer network stack to avoid inefficient small TCP messages while avoiding the nagle/delayed ack problem (CASSANDRA-1896) * check log4j configuration for changes every 10s (CASSANDRA-1525, 1907) @@ -15,6 +15,10 @@ dev * distributed test harness (CASSANDRA-1859) +0.7.0-dev + * fix offsets to ByteBuffer.get (CASSANDRA-1939) + + 0.7.0-rc4 * fix cli crash after backgrounding (CASSANDRA-1875) * count timeouts in storageproxy latencies, and include latency Propchange: cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 21:57:03 2011 @@ -1,6 +1,6 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1055311 /cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516,1035666,1050269 -/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1054631 +/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689 /cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1026734,1028929 /incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573 Propchange: cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 21:57:03 2011 @@ -1,6 +1,6 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1055311 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516,1035666,1050269 -/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1054631 +/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689 /cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1026734,1028929
svn commit: r1055668 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java
Author: jbellis
Date: Wed Jan 5 22:30:05 2011
New Revision: 1055668

URL: http://svn.apache.org/viewvc?rev=1055668&view=rev
Log:
revert r1053443

Modified:
    cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java

Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java?rev=1055668&r1=1055667&r2=1055668&view=diff
==============================================================================
--- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java (original)
+++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java Wed Jan 5 22:30:05 2011
@@ -90,7 +90,7 @@ public class ReadResponseResolver implem
         List<ColumnFamily> versions = new ArrayList<ColumnFamily>();
         List<InetAddress> endpoints = new ArrayList<InetAddress>();

-        // validate digests against each other; throw immediately on mismatch.
+        // case 1: validate digests against each other; throw immediately on mismatch.
         // also, collects data results into versions/endpoints lists.
         //
         // results are cleared as we process them, to avoid unnecessary duplication of work
@@ -100,13 +100,20 @@ public class ReadResponseResolver implem
         {
             ReadResponse result = entry.getValue();
             Message message = entry.getKey();
-            ByteBuffer resultDigest = result.isDigestQuery() ? result.digest() : ColumnFamily.digest(result.row().cf);
-            if (digest == null)
-                digest = resultDigest;
-            else if (!digest.equals(resultDigest))
-                throw new DigestMismatchException(key, digest, resultDigest);
-
-            if (!result.isDigestQuery())
+            if (result.isDigestQuery())
+            {
+                if (digest == null)
+                {
+                    digest = result.digest();
+                }
+                else
+                {
+                    ByteBuffer digest2 = result.digest();
+                    if (!digest.equals(digest2))
+                        throw new DigestMismatchException(key, digest, digest2);
+                }
+            }
+            else
             {
                 versions.add(result.row().cf);
                 endpoints.add(message.getFrom());
@@ -115,8 +122,23 @@ public class ReadResponseResolver implem
             results.remove(message);
         }

-        if (logger_.isDebugEnabled())
-            logger_.debug("digests verified");
+        // If there was a digest query compare it with all the data digests
+        // If there is a mismatch then throw an exception so that read repair can happen.
+        //
+        // It's important to note that we do not compare the digests of multiple data responses --
+        // if we are in that situation we know there was a previous mismatch and now we're doing a repair,
+        // so our job is now case 2: figure out what the most recent version is and update everyone to that version.
+        if (digest != null)
+        {
+            for (ColumnFamily cf : versions)
+            {
+                ByteBuffer digest2 = ColumnFamily.digest(cf);
+                if (!digest.equals(digest2))
+                    throw new DigestMismatchException(key, digest, digest2);
+            }
+            if (logger_.isDebugEnabled())
+                logger_.debug("digests verified");
+        }

         ColumnFamily resolved;
         if (versions.size() > 1)
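The control flow restored by this revert can be summarised in a standalone sketch: digest responses are checked against each other, data responses are only collected, and the data is hashed and compared only if at least one digest response was seen. Everything below is a simplified stand-in written for illustration: `Response`, `DigestMismatch`, `digestOf` and the use of MD5 over a string are assumptions, not the real `ReadResponse`, `DigestMismatchException` or `ColumnFamily.digest`.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DigestCheckSketch {

    // Hypothetical stand-in for a read response: either a digest or full data.
    static final class Response {
        final ByteBuffer digest; // non-null for digest responses
        final String data;       // non-null for data responses

        private Response(ByteBuffer digest, String data) {
            this.digest = digest;
            this.data = data;
        }

        static Response ofDigest(ByteBuffer digest) { return new Response(digest, null); }
        static Response ofData(String data) { return new Response(null, data); }
        boolean isDigestQuery() { return digest != null; }
    }

    static final class DigestMismatch extends RuntimeException {
        DigestMismatch(String message) { super(message); }
    }

    // Assumed digest function for the sketch: MD5 over the UTF-8 bytes.
    static ByteBuffer digestOf(String data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return ByteBuffer.wrap(md.digest(data.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the collected data versions, or throws if a digest disagrees.
    static List<String> verify(List<Response> responses) {
        ByteBuffer digest = null;
        List<String> versions = new ArrayList<>();
        // Case 1: digest responses must agree with each other.
        for (Response r : responses) {
            if (r.isDigestQuery()) {
                if (digest == null) {
                    digest = r.digest;
                } else if (!digest.equals(r.digest)) {
                    throw new DigestMismatch("digest responses differ");
                }
            } else {
                versions.add(r.data);
            }
        }
        // Only when a digest was seen is each data version hashed and compared.
        // Several data responses are never compared with one another here.
        if (digest != null) {
            for (String version : versions) {
                if (!digest.equals(digestOf(version))) {
                    throw new DigestMismatch("data does not match digest");
                }
            }
        }
        return versions;
    }

    public static void main(String[] args) {
        List<Response> repairRound = Arrays.asList(Response.ofData("a"), Response.ofData("b"));
        System.out.println(verify(repairRound).size()); // 2: differing data alone does not throw
    }
}
```

The point of the last case is the one made in the restored comment: a set of data-only responses means a mismatch was already detected, so the resolver moves on to picking the most recent version rather than throwing again.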
svn commit: r1055669 - in /cassandra/trunk: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/java/org/apache/ca
Author: jbellis Date: Wed Jan 5 22:35:09 2011 New Revision: 1055669 URL: http://svn.apache.org/viewvc?rev=1055669view=rev Log: merge from 0.7 Modified: cassandra/trunk/ (props changed) cassandra/trunk/CHANGES.txt cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java (props changed) cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java (props changed) cassandra/trunk/src/java/org/apache/cassandra/db/DeletedColumn.java cassandra/trunk/src/java/org/apache/cassandra/db/marshal/LongType.java cassandra/trunk/src/java/org/apache/cassandra/service/ReadResponseResolver.java cassandra/trunk/src/java/org/apache/cassandra/utils/UUIDGen.java cassandra/trunk/test/unit/org/apache/cassandra/db/NameSortTest.java cassandra/trunk/test/unit/org/apache/cassandra/db/marshal/TypeCompareTest.java Propchange: cassandra/trunk/ -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 22:35:09 2011 @@ -1,6 +1,6 @@ /cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1055311 -/cassandra/branches/cassandra-0.7:1026516-1055624 -/cassandra/branches/cassandra-0.7.0:1053690-1054631 +/cassandra/branches/cassandra-0.7:1026516-1055668 +/cassandra/branches/cassandra-0.7.0:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689 /incubator/cassandra/branches/cassandra-0.3:774578-796573 /incubator/cassandra/branches/cassandra-0.4:810145-834239,834349-834350 Modified: cassandra/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1055669r1=1055668r2=1055669view=diff == --- cassandra/trunk/CHANGES.txt (original) +++ cassandra/trunk/CHANGES.txt Wed Jan 5 
22:35:09 2011 @@ -3,7 +3,7 @@ * adds support for columns that act as incr/decr counters (CASSANDRA-1072) -0.7-dev +0.7.1-dev * buffer network stack to avoid inefficient small TCP messages while avoiding the nagle/delayed ack problem (CASSANDRA-1896) * check log4j configuration for changes every 10s (CASSANDRA-1525, 1907) @@ -20,6 +20,10 @@ * distributed test harness (CASSANDRA-1859) +0.7.0-dev + * fix offsets to ByteBuffer.get (CASSANDRA-1939) + + 0.7.0-rc4 * fix cli crash after backgrounding (CASSANDRA-1875) * count timeouts in storageproxy latencies, and include latency Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 22:35:09 2011 @@ -1,6 +1,6 @@ /cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1055311 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1055624 -/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1054631 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1055668 +/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654 /cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689 /incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573 /incubator/cassandra/branches/cassandra-0.4/interface/gen-java/org/apache/cassandra/service/Cassandra.java:810145-834239,834349-834350 Propchange: cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java -- --- svn:mergeinfo (original) +++ svn:mergeinfo Wed Jan 5 22:35:09 2011 @@ -1,6 +1,6 @@ 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1055311 -/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1055624 -/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1054631 +/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1055668
[jira] Commented: (CASSANDRA-1705) CQL writes (aka UPDATE)
[ https://issues.apache.org/jira/browse/CASSANDRA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978007#action_12978007 ]

Eric Evans commented on CASSANDRA-1705:
---

I'm still not sure I understand how this is an improvement; perhaps it's just a matter of taste. I can't see how it hurts anything either, though, so it's committed. Thanks Pavel!

CQL writes (aka UPDATE)
---
Key: CASSANDRA-1705
URL: https://issues.apache.org/jira/browse/CASSANDRA-1705
Project: Cassandra
Issue Type: Sub-task
Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
Fix For: 0.8
Attachments: CASSANDRA-1705.patch
Original Estimate: 0h
Remaining Estimate: 0h

CQL specification and implementation for data manipulation. This corresponds to the following RPC methods:
* insert()
* batch_mutate() (writes, not deletes)

The initial check-in to trunk/ uses a syntax that looks like:
{code:SQL}
UPDATE CF [USING CONSISTENCY.LVL] WITH ROW(key, COLUMN(name, value)[, COLUMN(...)])[ AND ROW(...)];
{code}

Where:
* CF is the column family name.
* Rows are parenthesized expressions with comma-separated arguments for a key and one or more columns.
* Columns are parenthesized expressions with comma-separated arguments for the name and value (timestamp is inaccessible).

What is still undone:
* Complete test coverage

And of course, all of this is still very much open to further discussion.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1705) CQL writes (aka UPDATE)
[ https://issues.apache.org/jira/browse/CASSANDRA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Evans updated CASSANDRA-1705:
--
Assignee: (was: Pavel Yaskevich)

CQL writes (aka UPDATE)
---
Key: CASSANDRA-1705
URL: https://issues.apache.org/jira/browse/CASSANDRA-1705
Project: Cassandra
Issue Type: Sub-task
Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
Fix For: 0.8
Attachments: CASSANDRA-1705.patch
Original Estimate: 0h
Remaining Estimate: 0h

CQL specification and implementation for data manipulation. This corresponds to the following RPC methods:
* insert()
* batch_mutate() (writes, not deletes)

The initial check-in to trunk/ uses a syntax that looks like:
{code:SQL}
UPDATE CF [USING CONSISTENCY.LVL] WITH ROW(key, COLUMN(name, value)[, COLUMN(...)])[ AND ROW(...)];
{code}

Where:
* CF is the column family name.
* Rows are parenthesized expressions with comma-separated arguments for a key and one or more columns.
* Columns are parenthesized expressions with comma-separated arguments for the name and value (timestamp is inaccessible).

What is still undone:
* Complete test coverage

And of course, all of this is still very much open to further discussion.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
svn commit: r1055677 - /cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
Author: eevans
Date: Wed Jan 5 22:55:24 2011
New Revision: 1055677
URL: http://svn.apache.org/viewvc?rev=1055677view=rev
Log: move term-pair parse and map update to separate rule

Patch by Pavel Yaskevich (w/ minor changes); reviewed by eevans for CASSANDRA-1705

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g?rev=1055677r1=1055676r2=1055677view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g Wed Jan 5 22:55:24 2011
@@ -127,7 +127,7 @@ updateStatement returns [UpdateStatement
       }
       K_UPDATE columnFamily=IDENT
           (K_USING K_CONSISTENCY '.' K_LEVEL { cLevel = ConsistencyLevel.valueOf($K_LEVEL.text); })?
-          K_SET c1=term '=' v1=term { columns.put(c1, v1); } (',' cN=term '=' vN=term { columns.put(cN, vN); })*
+          K_SET termPair[columns] (',' termPair[columns])*
           K_WHERE K_KEY '=' key=term endStmnt
       {
           return new UpdateStatement($columnFamily.text, cLevel, columns, key);
@@ -172,6 +172,11 @@ termList returns [List<Term> items]
       t1=term { $items.add(t1); } (',' tN=term { $items.add(tN); })*
     ;
 
+// term = term
+termPair[Map<Term, Term> columns]
+    : key=term '=' value=term { columns.put(key, value); }
+    ;
+
 // Note: ranges are inclusive so >= and >, and < and <= all have the same semantics.
 relation returns [Relation rel]
     : { Term entity = new Term(KEY, STRING_LITERAL); }
[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future
[ https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978029#action_12978029 ] Ryan King commented on CASSANDRA-1935: -- It seems like we should probably abort in this case, but that might be a bit draconian. Refuse to open SSTables from the future --- Key: CASSANDRA-1935 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Priority: Minor Fix For: 0.8 If somebody has rolled back to a previous version of Cassandra that is unable to read an SSTable written by a future version correctly (indicated by a version change), failing fast is safer than accidentally performing a compaction that rewrites incorrect data and leaves you in an odd state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[Cassandra Wiki] Update of Operations by BrandonWilliams
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification.

The Operations page has been changed by BrandonWilliams.
The comment on this change is: Update python to start with token 0.
http://wiki.apache.org/cassandra/Operations?action=diffrev1=72rev2=73

--
  Here's a python program which can be used to calculate new tokens for the nodes. There's more info on the subject at Ben Black's presentation at Cassandra Summit 2010. http://www.riptano.com/blog/slides-and-videos-cassandra-summit-2010

  def tokens(nodes):
-     for i in range(1, nodes + 1):
+     for x in xrange(nodes):
-         print (i * (2 ** 127 - 1) / nodes)
+         print 2 ** 127 / nodes * x

  There's also `nodetool loadbalance`: essentially a convenience over decommission + bootstrap, only instead of telling the target node where to move on the ring it will choose its location based on the same heuristic as Token selection on bootstrap. You should not use this as it doesn't rebalance the entire ring.
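The updated wiki snippet targets Python 2 (`xrange`, the `print` statement, integer `/`). A minimal Python 3 sketch of the same calculation follows, assuming the same 2 ** 127 token space as the snippet. The name `initial_tokens` and the choice to return a list instead of printing are illustrative assumptions, not part of the wiki page.

```python
def initial_tokens(nodes):
    """Return evenly spaced tokens, starting at 0, for `nodes` nodes."""
    if nodes < 1:
        raise ValueError("nodes must be a positive integer")
    # Floor division keeps the result an exact integer; Python 3's `/`
    # would produce a float and lose precision at this magnitude.
    step = 2 ** 127 // nodes
    return [step * i for i in range(nodes)]


if __name__ == "__main__":
    for token in initial_tokens(4):
        print(token)
```

As in the wiki change, the first node gets token 0 and the remaining tokens are spaced one step apart.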
[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future
[ https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978063#action_12978063 ] Jonathan Ellis commented on CASSANDRA-1935: --- Agreed that we should abort startup. (Isn't that what fail fast means?) Refuse to open SSTables from the future --- Key: CASSANDRA-1935 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Priority: Minor Fix For: 0.8 If somebody has rolled back to a previous version of Cassandra that is unable to read an SSTable written by a future version correctly (indicated by a version change), failing fast is safer than accidentally performing a compaction that rewrites incorrect data and leaves you in an odd state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
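The fail-fast check discussed in this thread could look roughly like the sketch below. It is a hypothetical illustration in Python, not Cassandra's code: the version strings, the `CURRENT_VERSION` constant, and the `check_readable` helper are all invented for the example, and versions are assumed to sort in release order as plain strings.

```python
CURRENT_VERSION = "e"  # hypothetical: newest format this build can read


class FutureSSTableError(RuntimeError):
    """Raised when an SSTable was written by a newer format version."""


def check_readable(sstable_versions, current=CURRENT_VERSION):
    """Abort before any SSTable is opened if one is from the future.

    `sstable_versions` maps an SSTable name to its format version.
    Assumption: versions are strings whose sort order is release order.
    """
    from_future = sorted(
        name for name, version in sstable_versions.items() if version > current
    )
    if from_future:
        # Fail fast: refusing to start is safer than compacting data
        # that this version may not read correctly.
        raise FutureSSTableError(
            "refusing to open SSTables newer than version %r: %s"
            % (current, ", ".join(from_future))
        )
```

Running the check over every SSTable before opening any of them matches the "abort startup" reading of failing fast: nothing is compacted if even one file is too new.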
[jira] Created: (CASSANDRA-1942) upgrade to high-scale-lib (home of NBHM and NBHS) 1.1.2
upgrade to high-scale-lib (home of NBHM and NBHS) 1.1.2 --- Key: CASSANDRA-1942 URL: https://issues.apache.org/jira/browse/CASSANDRA-1942 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Jonathan Ellis Priority: Minor Fix For: 0.8 Stephen Connolly gives a summary of changes in CASSANDRA-1888. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1888) Replace lib/high-scale-lib.jar with equivalent from maven central repository
[ https://issues.apache.org/jira/browse/CASSANDRA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978066#action_12978066 ] Jonathan Ellis commented on CASSANDRA-1888: --- created CASSANDRA-1942 to upgrade to 1.1.2 in 0.8 Replace lib/high-scale-lib.jar with equivalent from maven central repository Key: CASSANDRA-1888 URL: https://issues.apache.org/jira/browse/CASSANDRA-1888 Project: Cassandra Issue Type: Improvement Affects Versions: 0.7.0 rc 3 Reporter: Stephen Connolly Assignee: Stephen Connolly Fix For: 0.7.1 As part of my effort to get Cassandra published to Maven Central, there are a number of libraries which Cassandra depends on but which are not available in Maven Central. Perhaps the most interesting of these is the Public Domain high-scale-lib.jar The author is an XML build tool hater (and that includes ANT), and the artifact itself contains a lot of unusual cruft... .CVS folders, etc. The build process uses a build.java, that effectively is a rewrite of Make in java with the Makefile embedded in the build.java. I have rebuilt the artifacts and published them to the Maven Central repository. As part of the requirements for publishing to Maven Central are to publish a javadoc.jar and a sources.jar with gpg signatures, etc. It was easier to take the source code and transform it into a Maven project. The project is hosted at github: http://stephenc.github.com/high-scale-lib I have published the following versions, all signed with by steph...@apache.org PGP key 1.0.0 1.0.1 1.1.0 1.1.1 1.1.2 These should all be equivalent to the releases by Cliff Click, with the only exception being 1.1.1. For 1.1.1 Cliff's original build script did not run the Unit tests correctly, one of the unit tests consistently fails even on his build process due to an invalid assumption that element ordering is preserved across serialization for NonBlockingIdentityHashMap. He fixed the test in 1.1.2, so I back-ported the test change. 
The code, however, remains as is. In any case, can we change the version of high-scale-lib.jar in the lib directory to the version from Maven Central? http://repo1.maven.org/maven2/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar [The current version used by Cassandra is 1.1.1.] Or perhaps even consider upgrading to 1.1.2 [though I can appreciate that this could be considered riskier]. My justification for the change is so that I can be sure that consumers of a Maven Central distribution of Cassandra will have exactly the same dependencies, which have been tested as part of the Cassandra release process, and not just the Stephen's very damn sure they are the same dependencies ;-) -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1939) Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)
[ https://issues.apache.org/jira/browse/CASSANDRA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-1939: -- Reviewer: jbellis Affects Version/s: (was: 0.7.0) (was: 0.7.0 rc 3) (was: 0.7.0 rc 2) (was: 0.7.0 rc 1) (was: 0.7.1) (was: 0.8) 0.7 beta 3 committed Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index) --- Key: CASSANDRA-1939 URL: https://issues.apache.org/jira/browse/CASSANDRA-1939 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7 beta 3 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 0.7.0 Attachments: 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch ByteBuffer.arrayOffset() should not be added to the argument of an absolute get. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
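The class of bug fixed here can be illustrated with a small model. The sketch below is Python, not Java, and `SlicedBuffer` is a hypothetical stand-in that mimics only the relevant part of a sliced `java.nio.ByteBuffer`: an absolute `get(i)` already counts from the start of the buffer, while the array offset only matters when indexing the backing array directly.

```python
class SlicedBuffer:
    """Hypothetical model of a buffer that views part of a backing array."""

    def __init__(self, backing, array_offset, length):
        self.backing = backing            # shared backing array
        self.array_offset = array_offset  # where this view starts in it
        self.length = length

    def get(self, index):
        """Absolute get: `index` counts from the start of the view."""
        if not 0 <= index < self.length:
            raise IndexError(index)
        return self.backing[self.array_offset + index]


backing = bytearray(b"headerPAYLOAD")
view = SlicedBuffer(backing, array_offset=6, length=7)

# Correct: ask the view for its own first byte.
correct = view.get(0)
# The misuse: the offset is applied twice, so a different byte comes back.
wrong = view.get(0 + view.array_offset)
# The offset belongs only in direct access to the backing array.
via_array = view.backing[view.array_offset + 0]
```

In this model the misuse silently returns the wrong byte when the shifted index is still in range, and raises when it is not, which is why such bugs can go unnoticed on buffers whose offset is zero.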
svn commit: r1055703 - /cassandra/trunk/doc/cql/CQL.textile
Author: eevans
Date: Thu Jan 6 01:33:24 2011
New Revision: 1055703
URL: http://svn.apache.org/viewvc?rev=1055703view=rev
Log: code markup inside a link upsets mylyn

Patch by eevans

Modified: cassandra/trunk/doc/cql/CQL.textile

Modified: cassandra/trunk/doc/cql/CQL.textile
URL: http://svn.apache.org/viewvc/cassandra/trunk/doc/cql/CQL.textile?rev=1055703r1=1055702r2=1055703view=diff
==
--- cassandra/trunk/doc/cql/CQL.textile (original)
+++ cassandra/trunk/doc/cql/CQL.textile Thu Jan 6 01:33:24 2011
@@ -127,7 +127,7 @@ h3. Specifying Columns
 
 bc. DELETE [COLUMNS] ...
 
-Following the @DELETE@ keyword is an optional comma-delimited list of column name terms. When no column names are specified, the remove applies to the entire row(s) matched by the @WHERE@ clause:#deleterows
+Following the @DELETE@ keyword is an optional comma-delimited list of column name terms. When no column names are specified, the remove applies to the entire row(s) matched by the WHERE clause:#deleterows
 
 h3. Column Family
[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future
[ https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978076#action_12978076 ] Ryan King commented on CASSANDRA-1935: -- What about scenarios outside startup, like streaming? Refuse to open SSTables from the future --- Key: CASSANDRA-1935 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Priority: Minor Fix For: 0.8 If somebody has rolled back to a previous version of Cassandra that is unable to read an SSTable written by a future version correctly (indicated by a version change), failing fast is safer than accidentally performing a compaction that rewrites incorrect data and leaves you in an odd state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future
[ https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978095#action_12978095 ] Jonathan Ellis commented on CASSANDRA-1935: --- Streaming mostly doesn't work across different versions anyway, so I would be in favor of gossiping the Cassandra version and requiring matching versions to stream. Refuse to open SSTables from the future --- Key: CASSANDRA-1935 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Priority: Minor Fix For: 0.8 If somebody has rolled back to a previous version of Cassandra that is unable to read an SSTable written by a future version correctly (indicated by a version change), failing fast is safer than accidentally performing a compaction that rewrites incorrect data and leaves you in an odd state. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Williams updated CASSANDRA-1943: Fix Version/s: (was: 0.7.0) 0.7.1 Addition of internode buffering broke Streaming --- Key: CASSANDRA-1943 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Priority: Critical Fix For: 0.7.1 Adding internode buffering broke StreamingTransferTest in the 0.7.0 branch. Bisected to r1055313 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (CASSANDRA-1943) Addition of internode buffering broke Streaming
Addition of internode buffering broke Streaming --- Key: CASSANDRA-1943 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Priority: Critical Fix For: 0.7.0 Adding internode buffering broke StreamingTransferTest in the 0.7.0 branch. Bisected to r1055313 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stu Hood updated CASSANDRA-1943: Description: Adding internode buffering broke StreamingTransferTest in the 0.7 branch. Bisected to r1055313 (was: Adding internode buffering broke StreamingTransferTest in the 0.7.0 branch. Bisected to r1055313) Addition of internode buffering broke Streaming --- Key: CASSANDRA-1943 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Priority: Critical Fix For: 0.7.1 Adding internode buffering broke StreamingTransferTest in the 0.7 branch. Bisected to r1055313 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stu Hood updated CASSANDRA-1943: Attachment: 0001-Don-t-begin-buffering-a-connection-until-we-ve-determi.txt Streaming connections don't use the InputStream implementation to read their data: they bypass all buffering and use the SocketChannel directly. By buffering immediately after opening the connection, we were buffering in the beginning of the streamed file. Addition of internode buffering broke Streaming --- Key: CASSANDRA-1943 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Priority: Critical Fix For: 0.7.1 Attachments: 0001-Don-t-begin-buffering-a-connection-until-we-ve-determi.txt Adding internode buffering broke StreamingTransferTest in the 0.7 branch. Bisected to r1055313 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
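The failure mode described in this comment can be reproduced with Python's standard `io` module. This is only an analogy: `io.BytesIO` stands in for the socket, and the 3-byte `HDR` header and the payload are made-up values. A buffered reader fills its buffer from the underlying stream, so a consumer that later reads the underlying stream directly finds the start of the file already taken. The "late buffering" variant assumes, based on the attachment title, that the fix defers buffering until the connection type is known.

```python
import io

HEADER = b"HDR"
FILE_BYTES = b"streamed-file-contents"


def read_with_early_buffering():
    """Buggy order: wrap the connection in a buffer before reading anything."""
    raw = io.BytesIO(HEADER + FILE_BYTES)
    buffered = io.BufferedReader(raw)
    header = buffered.read(len(HEADER))
    # The streaming path bypasses the buffer and reads the raw stream,
    # but the buffer has already pulled the file bytes out of it.
    return header, raw.read()


def read_with_late_buffering():
    """Fixed order: read the header unbuffered, then decide whether to buffer."""
    raw = io.BytesIO(HEADER + FILE_BYTES)
    header = raw.read(len(HEADER))
    # A streaming connection keeps using the raw stream; only a
    # non-streaming connection would be wrapped in a buffer from here on.
    return header, raw.read()
```

Both functions read the same header, but only the second leaves the file bytes available to the code that reads the underlying stream directly.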
[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming
[ https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stu Hood updated CASSANDRA-1943: Fix Version/s: 0.8 Also affects trunk. Addition of internode buffering broke Streaming --- Key: CASSANDRA-1943 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943 Project: Cassandra Issue Type: Bug Reporter: Stu Hood Priority: Critical Fix For: 0.7.1, 0.8 Attachments: 0001-Don-t-begin-buffering-a-connection-until-we-ve-determi.txt Adding internode buffering broke StreamingTransferTest in the 0.7 branch. Bisected to r1055313 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (CASSANDRA-1848) Separate thrift and avro classes from cassandra's jar
[ https://issues.apache.org/jira/browse/CASSANDRA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Evans reassigned CASSANDRA-1848: - Assignee: Eric Evans Separate thrift and avro classes from cassandra's jar - Key: CASSANDRA-1848 URL: https://issues.apache.org/jira/browse/CASSANDRA-1848 Project: Cassandra Issue Type: Improvement Components: Packaging Affects Versions: 0.7.0 rc 2 Reporter: Tristan Tarrant Assignee: Eric Evans Priority: Trivial Fix For: 0.8 Attachments: CASSANDRA-1848.patch, CASSANDRA-1848_with_hadoop.patch Original Estimate: 0h Remaining Estimate: 0h Most client applications written in Java include the full apache-cassandra-x.y.z.jar in their classpath. I propose to separate the avro and thrift classes into separate jars. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (CASSANDRA-1472) Add bitmap secondary indexes
[ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stu Hood updated CASSANDRA-1472: Attachment: 0.7-1472-v5.tgz Attaching a version of 1472-v5 rebased for the 0.7 branch. Add bitmap secondary indexes Key: CASSANDRA-1472 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Assignee: Stu Hood Fix For: 0.7.1 Attachments: 0.7-1472-v5.tgz, 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (CASSANDRA-674) New SSTable Format
[ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978155#action_12978155 ]

Stu Hood commented on CASSANDRA-674:
---

> Indexes for individual rows are gone, since the global index allows random access...
> ^ This wouldn't be useful to cache? in the situation you only want a small range of columns?

That information is outdated: it's from the original implementation. But yes... we will want to keep the index in app memory or page cache.

> Roughly how large would the actual chunk be? This is the unit of deserialization right?

The span is the unit of deserialization (made up of at most 1 chunk per level), and its size would be 100% configurable. The main question is how frequently to index the spans in the sstable index: does each span get an index entry? or only the first span of a row (this is our approach in the current implementation).

> So if you are doing a range query on a very wide row how do you know when to stop processing chunks?

By looking at the global index: if all spans get entries in the index, you know the last interesting span.

> Let me know if this is wrong, but this design opens the cassandra data model to contain arbitrarily nested data. Given the complexity we already have surrounding the supercolumn concept do you think this is the right way forward?

The super column concept is only confusing _because_ we call them supercolumns rather than just calling them compound column names. People use them, and the consensus I've heard is that they are useful.

> If we assume we keep the datamodel as is how can we simplify the open ended-ness of your design to make the approach fit our current data model.

The only difference is what you call the structures, and whether you put arbitrary limits on the nesting: I'm open to suggestions.
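The range-query answer above can be sketched with a sorted index. This is a hypothetical illustration, not the proposed file format: it assumes every span gets an index entry holding its first column name, and `spans_for_range` is an invented helper.

```python
from bisect import bisect_right


def spans_for_range(index, start, end):
    """Return positions of spans that may hold columns in [start, end].

    `index` is the sorted list of each span's first column name
    (assumption: one index entry per span).
    """
    # The span containing `start` is the last one beginning at or before it.
    first = max(bisect_right(index, start) - 1, 0)
    # Spans beginning after `end` cannot contain anything in the range,
    # so the scan can stop without touching their chunks.
    stop = bisect_right(index, end)
    return list(range(first, stop))
```

If only the first span of each row were indexed instead, the reader would have to inspect spans one by one to find where the range ends, which is the trade-off the comment raises.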
New SSTable Format -- Key: CASSANDRA-674 URL: https://issues.apache.org/jira/browse/CASSANDRA-674 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Fix For: 0.8 Attachments: 674-v1.diff, perf-674-v1.txt, perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt Various tickets exist due to limitations in the SSTable file format, including #16, #47 and #328. Attached is a proposed design/implementation of a new file format for SSTables that addresses a few of these limitations. The implementation has a bunch of issues/fixmes, which I'll describe in the comments. The file format is described in the javadoc for the o.a.c.io.SSTableWriter class, but briefly: * Blocks are opaque (except for their header) so that they can be compressed. The index file contains an entry for the first key in every Block. Blocks contain Slices. * Slices are series of columns with the same parents and (deletion) metadata. They can be used to represent ColumnFamilies or SuperColumns (or a slice of columns at any other depth). A single CF can be split across multiple Slices, which can be split across multiple blocks. * Neither Slices nor Blocks have a fixed size or maximum length, but they each have target lengths which can be stretched and broken by very large columns. The most interesting concepts from this patch are: * Block compression is possible (currently using GZIP, which has one bug mentioned in the comments), * Compaction involves merging intersecting Slices from input SSTables. Since large rows will be broken down into multiple slices, only the portions of rows that intersect between tables need to be deserialized/merged/held-in-memory, * Indexes for individual rows are gone, since the global index allows random access to the middle of column families that span Blocks, and Slices allow batches of columns to be skipped within a Block. 
* Bloom filters for individual rows are gone, and the global filter contains ColumnKeys instead, meaning that a query for a column that doesn't exist in a row that does will often not need to seek to the row. * Metadata (deletion/gc time) and ColumnKeys (key, colname1, colname2...) for columns are defined recursively, so deeply nested slices are possible, * Slices representing a single parent (CF, SC, etc) can have different Metadata, meaning that a tombstone Slice from d-f could sit between Slices containing columns a-c and g-h. This allows for eventually consistent range deletes of columns. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Issue Comment Edited: (CASSANDRA-674) New SSTable Format
[ https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12978155#action_12978155 ]

Stu Hood edited comment on CASSANDRA-674 at 1/6/11 1:17 AM:
---

> Indexes for individual rows are gone, since the global index allows random access...
> ^ This wouldn't be useful to cache? in the situation you only want a small range of columns?

That information is outdated: it's from the original implementation. But yes... we will want to keep the index in app memory or page cache.

> Roughly how large would the actual chunk be? This is the unit of deserialization right?

The span is the unit of deserialization (made up of at most 1 chunk per level), and its size would be 100% configurable. The main question is how frequently to index the spans in the sstable index: does each span get an index entry? or only the first span of a row (this is our approach in the current implementation).

EDIT: Sorry... the span is symbolic: you would deserialize the first chunk of the span (containing the keys) to decide whether to skip the rest of the chunks in the span.

> So if you are doing a range query on a very wide row how do you know when to stop processing chunks?

By looking at the global index: if all spans get entries in the index, you know the last interesting span.

> Let me know if this is wrong, but this design opens the cassandra data model to contain arbitrarily nested data. Given the complexity we already have surrounding the supercolumn concept do you think this is the right way forward?

The super column concept is only confusing _because_ we call them supercolumns rather than just calling them compound column names. People use them, and the consensus I've heard is that they are useful.

> If we assume we keep the datamodel as is how can we simplify the open ended-ness of your design to make the approach fit our current data model.

The only difference is what you call the structures, and whether you put arbitrary limits on the nesting: I'm open to suggestions.

was (Author: stuhood):

> Indexes for individual rows are gone, since the global index allows random access...
> ^ This wouldn't be useful to cache? in the situation you only want a small range of columns?

That information is outdated: it's from the original implementation. But yes... we will want to keep the index in app memory or page cache.

> Roughly how large would the actual chunk be? This is the unit of deserialization right?

The span is the unit of deserialization (made up of at most 1 chunk per level), and its size would be 100% configurable. The main question is how frequently to index the spans in the sstable index: does each span get an index entry? or only the first span of a row (this is our approach in the current implementation).

> So if you are doing a range query on a very wide row how do you know when to stop processing chunks?

By looking at the global index: if all spans get entries in the index, you know the last interesting span.

> Let me know if this is wrong, but this design opens the cassandra data model to contain arbitrarily nested data. Given the complexity we already have surrounding the supercolumn concept do you think this is the right way forward?

The super column concept is only confusing _because_ we call them supercolumns rather than just calling them compound column names. People use them, and the consensus I've heard is that they are useful.

> If we assume we keep the datamodel as is how can we simplify the open ended-ness of your design to make the approach fit our current data model.

The only difference is what you call the structures, and whether you put arbitrary limits on the nesting: I'm open to suggestions.
New SSTable Format -- Key: CASSANDRA-674 URL: https://issues.apache.org/jira/browse/CASSANDRA-674 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Stu Hood Fix For: 0.8 Attachments: 674-v1.diff, perf-674-v1.txt, perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt Various tickets exist due to limitations in the SSTable file format, including #16, #47 and #328. Attached is a proposed design/implementation of a new file format for SSTables that addresses a few of these limitations. The implementation has a bunch of issues/fixmes, which I'll describe in the comments. The file format is described in the javadoc for the o.a.c.io.SSTableWriter class, but briefly: * Blocks are opaque (except for their header) so that they can be compressed. The index file contains an entry for the first key in every Block. Blocks contain Slices. * Slices are series of columns with the same parents and (deletion) metadata. They can be used to represent ColumnFamilies or