[Cassandra Wiki] Update of FileFormatDesignDoc by Stu Hood

2011-01-05 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The FileFormatDesignDoc page has been changed by StuHood.
The comment on this change is: Clarified the metadata discussion, added 
horizontal rules.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=15&rev2=16

--

   * Space efficient when un-compressed: remove redundancy
   * Random access to the middle of wide rows
   * Arbitrary nesting
+  * Range tombstones (for range/slice deletes)
  
  == Influences ==
  
   * Google Dremel [1] - Arbitrarily nested, field-oriented serialization
   * Hive RCFile [2] - Column-group-oriented storage
+ 
+ 
  
  == Current Implementation ==
  
@@ -48, +51 @@

  
  Finally, there is a second type of redundancy that the current design does 
not tackle: the column names at level name2 are frequently repeated, but 
since rows are stored independently, we don't normalize those values. For 
narrow rows (like those shown), removing this redundancy will be our largest 
win.
  
- == High level ==
+  Metadata 
+ 
+ Metadata is currently implemented such that column parents have metadata that 
covers their entire range: this means that you cannot delete arbitrary slices, 
only exact keys or names.
+ 
+ 
+ 
+ == Proposed Implementation ==
  
  Because we will be storing multiple columns per SSTable, our design will bear 
the most similarity to RCFile (rather than the column-per-file approach taken 
in Dremel). But because we allow for nesting via super columns (and hopefully 
with a more flexible representation in the future), we need to take hints from 
Dremel's serialization to allow for efficient storage of parent and null 
information.
  
  === Vertical chunks ===
  
- Rather than slicing the span into chunks horizontally, we will use vertical 
chunks (and horizontal chunks only when necessary for particularly wide rows):
+ Rather than slicing the span into chunks horizontally, we will use vertical 
chunks (and break particularly wide rows into multiple spans):
  
  || ''row key'' ||
  || cheese  ||
@@ -122, +131 @@

  
  The parent change flag can be represented compactly using a bitmap, and field 
lengths can be packed tightly into group-varint encoded arrays [3], as alluded 
to in the Dremel paper, and mentioned in Jeff Dean's talks.
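
As a rough illustration of the group-varint packing mentioned above, the following is a minimal sketch and not the encoding used by any Cassandra patch. The class name GroupVarint, the little-endian byte order, and the choice to put the first value's length in the two lowest tag bits are all assumptions made for the example. One tag byte carries four 2-bit lengths, followed by the four values using 1 to 4 bytes each.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper (not part of Cassandra): group-varint for one group of four ints.
final class GroupVarint {
    // Number of bytes (1-4) needed to hold v when treated as an unsigned 32-bit value.
    static int byteLength(int v) {
        if ((v & 0xFFFFFF00) == 0) return 1;
        if ((v & 0xFFFF0000) == 0) return 2;
        if ((v & 0xFF000000) == 0) return 3;
        return 4;
    }

    // Encodes exactly four values: one tag byte, then each value in little-endian order.
    static byte[] encodeGroup(int[] four) {
        int tag = 0;
        for (int i = 0; i < 4; i++) {
            tag |= (byteLength(four[i]) - 1) << (2 * i); // 2 bits of length per value
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(tag);
        for (int i = 0; i < 4; i++) {
            int n = byteLength(four[i]);
            for (int b = 0; b < n; b++) {
                out.write((four[i] >>> (8 * b)) & 0xFF);
            }
        }
        return out.toByteArray();
    }

    // Decodes one group produced by encodeGroup.
    static int[] decodeGroup(byte[] buf) {
        int tag = buf[0] & 0xFF;
        int pos = 1;
        int[] values = new int[4];
        for (int i = 0; i < 4; i++) {
            int n = ((tag >>> (2 * i)) & 3) + 1;
            int v = 0;
            for (int b = 0; b < n; b++) {
                v |= (buf[pos++] & 0xFF) << (8 * b);
            }
            values[i] = v;
        }
        return values;
    }
}
```

Because the lengths are read from a single tag byte, a decoder avoids the per-byte continuation test of ordinary varints, which is the property that makes this layout attractive for arrays of field lengths.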
  
- === Metadata ===
- 
- Cassandra also needs to encode metadata about tuples and ranges of tuples, in 
order to represent creation and deletion timestamps: range tuples can be 
encoded in a similar fashion to the value tuples represented here, and the 
metadata timestamps can be group-varint encoded.
- 
  === Field reordering ===
  
  One weakness of the implementation so far is that it doesn't allow tuples to 
be reordered within a level. This approach performs well for wide rows with 
high field cardinality, since adding compression is unlikely to remove data.
@@ -154, +159 @@

  
  === Summary ===
  
- The final (simplified) representation of the span is:
+ A (simplified) representation of the span so far (without metadata) is:
  
  ''(parent-ordered)''
  || ''row key'' || ''parent_change'' ||
@@ -184, +189 @@

  || || 0 ||
  || china || 1 ||
  
+ == Metadata ==
+ 
+ Cassandra also needs to encode metadata about tuples and ranges of tuples in 
order to represent creation and deletion timestamps. For both value tuples and 
range tuples, a varying number (depending on value and range type) of 
timestamps will also need to be encoded.
+ 
+ === Range Metadata ===
+ 
+ Range tuples can be encoded in a very similar fashion to the value tuples 
represented above, except that they always come in pairs. It will likely make 
sense to store them in a separate blob from the value tuples, since they will 
bear very little similarity to one another (TODO: need to confirm with an 
anecdote or two).
+ 
+ || ''name1'' - ''left'' || ''name1'' - ''right''  || ''parent_change'' ||
+ || havarti || muenster || 0 ||
+ || || || 1 ||
+ 
+ This example shows a range tombstone for values at level name1 between 
'havarti' and 'muenster': the chunk for the name1 level stores a pair of 
range tuples for the 'cheese' parent, and nulls are stored for parents without 
any range metadata. The end result is that the span stores a tombstone from 
('cheese', 'havarti', empty) to ('cheese', 'muenster', null), where empty 
is the smallest value and null is the largest value.
+ 
+ Note that ranges for a parent cannot overlap: if two overlapping ranges are 
written, they are resolved such that the intersection is given the 
winning timestamp and the two remainders keep their original timestamps.
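
The resolution rule for overlapping ranges can be sketched as follows. This is only an illustration under stated assumptions: the names RangeResolver and Range are hypothetical, bounds are treated as half-open [left, right) strings, and the "winning" timestamp is assumed to be the larger of the two.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: split two overlapping ranges into non-overlapping pieces.
final class RangeResolver {
    static final class Range {
        final String left;     // inclusive lower bound (assumption)
        final String right;    // exclusive upper bound (assumption)
        final long timestamp;

        Range(String left, String right, long timestamp) {
            this.left = left;
            this.right = right;
            this.timestamp = timestamp;
        }
    }

    static List<Range> resolve(Range a, Range b) {
        // Order the inputs so that a starts first.
        if (a.left.compareTo(b.left) > 0) {
            Range tmp = a;
            a = b;
            b = tmp;
        }
        List<Range> out = new ArrayList<>();
        if (a.right.compareTo(b.left) <= 0) {
            // Disjoint: nothing to resolve.
            out.add(a);
            out.add(b);
            return out;
        }
        long winner = Math.max(a.timestamp, b.timestamp);
        // Left remainder keeps the timestamp of the range that started first.
        if (a.left.compareTo(b.left) < 0) {
            out.add(new Range(a.left, b.left, a.timestamp));
        }
        // The intersection gets the winning timestamp.
        int cmp = a.right.compareTo(b.right);
        String intersectionEnd = cmp <= 0 ? a.right : b.right;
        out.add(new Range(b.left, intersectionEnd, winner));
        // Right remainder keeps the timestamp of whichever range extends further.
        if (cmp < 0) {
            out.add(new Range(a.right, b.right, b.timestamp));
        } else if (cmp > 0) {
            out.add(new Range(b.right, a.right, a.timestamp));
        }
        return out;
    }
}
```

For example, resolving [havarti, muenster) at timestamp 10 against [gouda, jack) at timestamp 20 yields three pieces: [gouda, havarti) at 20, [havarti, jack) at 20, and [jack, muenster) at 10.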
+ 
+  Effect of ordering 
+ 
+ When a chunk is marked as ''self'' ordered, range metadata should be affected 
as well: therefore, the number of ranges that need to be represented in a chunk 
should also factor into the cardinality threshold that toggles a chunk between 

[jira] Commented: (CASSANDRA-674) New SSTable Format

2011-01-05 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977697#action_12977697
 ] 

Stu Hood commented on CASSANDRA-674:


 How will ranges be stored? The parent ordering would mean the sorting of data 
 at that level is lost, no?
Added some explanation of how I think ranges should work to the wiki: 
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=15&rev2=16

 Are chunks broken up by size only?
Technically spans are the largest unit, so they define the boundaries: I tried 
to clarify this part as well. There are a few possible thresholds, including a 
max number of rows, columns, range tombstones or total bytes in the span.

One semi-undefined portion is what happens when a row is larger than can be 
stuffed into a span. Most likely we'll want to use the range metadata to indicate 
the portion of the row covered by the span (the approach I took in the original 
implementation attached here).

 Will the metadata be ripe for caching?
I don't think so: the metadata is useless on its own. It only becomes useful 
when it is attached to data (a column or a range), so there is no reason to 
cache the metadata independently of the data.

Thanks!

 New SSTable Format
 --

 Key: CASSANDRA-674
 URL: https://issues.apache.org/jira/browse/CASSANDRA-674
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
 Fix For: 0.8

 Attachments: 674-v1.diff, perf-674-v1.txt, 
 perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt


 Various tickets exist due to limitations in the SSTable file format, 
 including #16, #47 and #328. Attached is a proposed design/implementation of 
 a new file format for SSTables that addresses a few of these limitations. The 
 implementation has a bunch of issues/fixmes, which I'll describe in the 
 comments.
 The file format is described in the javadoc for the o.a.c.io.SSTableWriter 
 class, but briefly:
  * Blocks are opaque (except for their header) so that they can be 
 compressed. The index file contains an entry for the first key in every 
 Block. Blocks contain Slices.
  * Slices are series of columns with the same parents and (deletion) 
 metadata. They can be used to represent ColumnFamilies or SuperColumns (or a 
 slice of columns at any other depth). A single CF can be split across 
 multiple Slices, which can be split across multiple blocks.
  * Neither Slices nor Blocks have a fixed size or maximum length, but they 
 each have target lengths which can be stretched and broken by very large 
 columns.
 The most interesting concepts from this patch are:
  * Block compression is possible (currently using GZIP, which has one bug 
 mentioned in the comments),
  * Compaction involves merging intersecting Slices from input SSTables. Since 
 large rows will be broken down into multiple slices, only the portions of 
 rows that intersect between tables need to be 
 deserialized/merged/held-in-memory,
  * Indexes for individual rows are gone, since the global index allows random 
 access to the middle of column families that span Blocks, and Slices allow 
 batches of columns to be skipped within a Block.
  * Bloom filters for individual rows are gone, and the global filter contains 
 ColumnKeys instead, meaning that a query for a column that doesn't exist in a 
 row that does will often not need to seek to the row.
  * Metadata (deletion/gc time) and ColumnKeys (key, colname1, colname2...) 
 for columns are defined recursively, so deeply nested slices are possible,
  * Slices representing a single parent (CF, SC, etc) can have different 
 Metadata, meaning that a tombstone Slice from d-f could sit between Slices 
 containing columns a-c and g-h. This allows for eventually consistent range 
 deletes of columns.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[Cassandra Wiki] Trivial Update of FileFormatDesignDoc by StuHood

2011-01-05 Thread Apache Wiki

The FileFormatDesignDoc page has been changed by StuHood.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=16&rev2=17

--

  
  == Metadata ==
  
- Cassandra also needs to encode metadata about tuples and ranges of tuples in 
order to represent creation and deletion timestamps. For both value tuples and 
range tuples, a varying number (depending on value and range type) of 
timestamps will also need to be encoded.
+ Cassandra also needs to encode metadata about tuples and ranges of tuples in 
order to represent creation and deletion timestamps. For both value tuples and 
range tuples, a varying number (depending on value and range type) of 
timestamps will need to be encoded.
  
  === Range Metadata ===
  


[Cassandra Wiki] Trivial Update of FileFormatDesignDoc by StuHood

2011-01-05 Thread Apache Wiki

The FileFormatDesignDoc page has been changed by StuHood.
http://wiki.apache.org/cassandra/FileFormatDesignDoc?action=diff&rev1=17&rev2=18

--

  
  Range tuples can be encoded in a very similar fashion to the value tuples 
represented above, except that they always come in pairs. It will likely make 
sense to store them in a separate blob from the value tuples, since they will 
bear very little similarity to one another (TODO: need to confirm with an 
anecdote or two).
  
- || ''name1'' - ''left'' || ''name1'' - ''right''  || ''parent_change'' ||
+ || ''name1'' - ''left'' || ''name1'' - ''right''  || ''parent_change'' || 
''null?'' ||
- || havarti || muenster || 0 ||
+ || havarti || muenster || 0 || 0 ||
- || || || 1 ||
+ || || || 1 || 1 ||
  
  This example shows a range tombstone for values at level name1 between 
'havarti' and 'muenster': the chunk for the name1 level stores a pair of 
range tuples for the 'cheese' parent, and nulls are stored for parents without 
any range metadata. The end result is that the span stores a tombstone from 
('cheese', 'havarti', empty) to ('cheese', 'muenster', null), where empty 
is the smallest value and null is the largest value.
  


[jira] Commented: (CASSANDRA-674) New SSTable Format

2011-01-05 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1296#action_1296
 ] 

T Jake Luciani commented on CASSANDRA-674:
--

bq. the metadata is useless on its own. It only becomes useful when it is 
attached to data (a column or to a range), so there is no reason to cache the 
metadata independently of the data.

But above you mention:
{code}
Indexes for individual rows are gone, since the global index allows random 
access to the middle of column families that span Blocks, and Slices allow 
batches of columns to be skipped within a Block.
{code}

^ Wouldn't this be useful to cache, in the situation where you only want a 
small range of columns?

- More questions 
Roughly how large would the actual chunk be? This is the unit of 
deserialization, right? Or can Avro deserialize only part of a structure?

So if you are doing a range query on a very wide row, how do you know when to 
stop processing chunks? Do you keep going until you hit the sentinel value 
empty?








[jira] Commented: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977783#action_12977783
 ] 

Gary Dusbabek commented on CASSANDRA-1710:
--

* returnConnection() possibly closes a connection and then returns the [maybe] 
closed connection back to the pool.  Does this mean it is possible to borrow a 
closed connection?
* it looks like the size of the pool can be artificially inflated by creating 
new Connections outside of the pool and then returning them to the pool.
* EvictionTask closes Connections that may already be closed.  IIRC this will 
generate a Thrift exception when the transport is double-closed.

Since the pool doesn't know the state of the connection, does it make sense to 
add isClosed() to the connection API?

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Commented: (CASSANDRA-674) New SSTable Format

2011-01-05 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977785#action_12977785
 ] 

T Jake Luciani commented on CASSANDRA-674:
--



Let me know if this is wrong, but this design opens the Cassandra data model to 
contain arbitrarily nested data.

Given the complexity we already have surrounding the supercolumn concept, do you 
think this is the right way forward?  
As much as my inner geek wants to build a tree or graph model, I don't think the 
C* community or committers want to take it this way.

If we assume we keep the data model as is, how can we simplify the 
open-endedness of your design to make the approach fit our current data model?




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Attachment: v2-0002-compile-driver-source.txt

v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Commented: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977789#action_12977789
 ] 

Eric Evans commented on CASSANDRA-1710:
---

bq. returnConnection() possibly closes a connection and then returns the maybe 
closed connection back to the pool. Does this mean it is possible to borrow a 
closed connection?

Auh, right you are.  That block is missing a {{return}}.

{quote}
•  it looks like the size of the pool can be artificially inflated by 
creating new Connections outside of the pool and then returning them to the 
pool.
•  EvictionTask closes Connections that may already be closed. IIRC 
this will generate a Thrift exception when the transport is double-closed.
{quote}

Good catches.  v2 patches attached.

Thanks!




[jira] Commented: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977793#action_12977793
 ] 

Gary Dusbabek commented on CASSANDRA-1710:
--

Changes look good.  

One more thing I realized: careless users could close() a Connection before 
returning it to the pool, which, if not full, would then hold a closed connection.

Depending on how much you want to guard the API, you could check the connection 
status before returning it to the queue, or emit derived Connections that have 
their close() methods overridden to be no-ops (or to log a WARN), so that only 
the pool can call the real close().
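
Both guards suggested above can be sketched together. This is a hedged illustration only, not the code in the attached patches: GuardedPool, Connection, RawConnection, PooledConnection and giveBack() are hypothetical names, and the real driver's Connection API may look quite different.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of a pool that never holds a closed connection.
final class GuardedPool {
    interface Connection {
        void close();
        boolean isClosed();
    }

    // Stand-in for a real transport-backed connection.
    static final class RawConnection implements Connection {
        private boolean closed;
        public void close() { closed = true; }
        public boolean isClosed() { return closed; }
    }

    // Handed to callers: close() is a no-op, so only the pool closes the delegate.
    static final class PooledConnection implements Connection {
        final Connection delegate;
        PooledConnection(Connection delegate) { this.delegate = delegate; }
        public void close() { /* no-op; a real driver might log a WARN here */ }
        public boolean isClosed() { return delegate.isClosed(); }
    }

    private final BlockingQueue<PooledConnection> idle;

    GuardedPool(int capacity) {
        idle = new ArrayBlockingQueue<>(capacity);
    }

    PooledConnection borrow() {
        PooledConnection c = idle.poll();
        return c != null ? c : new PooledConnection(new RawConnection());
    }

    // Returns true only if the connection was actually put back into the pool.
    boolean giveBack(PooledConnection c) {
        if (c.isClosed()) {
            return false;             // never pool a closed connection
        }
        if (!idle.offer(c)) {
            c.delegate.close();       // pool is full: really close it, and do not pool it
            return false;
        }
        return true;
    }

    int idleCount() {
        return idle.size();
    }
}
```

The early returns in giveBack() also cover the first review point: a connection that the pool closes because it is full is never handed back to the queue.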




[jira] Created: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Sylvain Lebresne (JIRA)
Fit partitioned counter directly into CounterColumn.value 
--

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8


The current implementation of CounterColumn keeps both the partitioned
counter and the total value of the counter (that is, the sum of the parts of
the partitioned counter).
This wastes space and requires the code to keep both representations in
sync. This ticket proposes to remove the total value from the representation
and to calculate it only when returning the value to the client.
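
Computing the total on demand amounts to summing the parts. The sketch below assumes a hypothetical layout in which each part is a 4-byte node id followed by an 8-byte count; the actual CounterContext layout may differ, and the class name CounterTotal is invented for the example.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: derive a counter's total from its partitioned parts.
final class CounterTotal {
    static final int ID_LENGTH = 4;      // assumed size of a node id
    static final int COUNT_LENGTH = 8;   // assumed size of a count
    static final int STEP = ID_LENGTH + COUNT_LENGTH;

    // Sums the counts of all parts between position() and limit(),
    // without modifying the buffer's position.
    static long total(ByteBuffer context) {
        long sum = 0;
        for (int off = context.position(); off + STEP <= context.limit(); off += STEP) {
            sum += context.getLong(off + ID_LENGTH);
        }
        return sum;
    }
}
```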

NOTE: this breaks the on-disk file format (for counters)




[jira] Created: (CASSANDRA-1937) Keep partitioned counters (contexts) sorted

2011-01-05 Thread Sylvain Lebresne (JIRA)
Keep partitioned counters (contexts) sorted
-

 Key: CASSANDRA-1937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1937
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8


In the value of CounterColumns, the code keeps the subparts unsorted, but sorts
them 'on the fly' when needed (in diff() and merge()). It would be more
efficient to keep the parts always sorted (it would also be simpler, since it
removes the need for the ad-hoc in-place quicksort in CounterContext).
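
To illustrate why sorted parts help, here is a minimal sketch of a merge that needs no sorting step. It assumes each part is an {id, count} pair, that both inputs are sorted by id, and that the larger count wins when an id appears on both sides; that reconciliation rule and the name SortedContextMerge are assumptions for the example, not taken from CounterContext.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: linear-time merge of two contexts already sorted by node id.
final class SortedContextMerge {
    // Each element is {id, count}; inputs must be sorted by id in ascending order.
    static long[][] merge(long[][] a, long[][] b) {
        List<long[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i][0] < b[j][0]) {
                out.add(a[i++]);
            } else if (a[i][0] > b[j][0]) {
                out.add(b[j++]);
            } else {
                // Same node id on both sides: keep the larger count (assumption).
                out.add(a[i][1] >= b[j][1] ? a[i] : b[j]);
                i++;
                j++;
            }
        }
        while (i < a.length) out.add(a[i++]);
        while (j < b.length) out.add(b[j++]);
        return out.toArray(new long[0][]);
    }
}
```

With unsorted parts the same operation needs a sort (or a quadratic scan) on every call, which is the cost the ticket proposes to pay once at write time instead.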

NOTE: this breaks the on-disk file format (for counters)




[jira] Created: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-01-05 Thread Sylvain Lebresne (JIRA)
Use UUID as node identifiers in counters instead of IP addresses 
-

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8


The use of IP addresses as node identifiers in the partition of a given
counter is fragile: a change of a node's IP address can result in data
loss. This patch proposes to use UUIDs instead.

NOTE: this breaks the on-disk file format (for counters)




[jira] Updated: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1936:


Attachment: 0001-Put-partitioned-counter-directly-in-column-value.patch

Patch attached, a few comments:
* since this patch stuffs the context into the column values, which are
  ByteBuffers, a good part of this patch deals with adapting the functions
  of CounterContext to take and return ByteBuffers instead of plain byte[].
* the patch also corrects a few not completely related misuses of ByteBuffer's
  absolute gets. Namely, there were a few occurrences where arrayOffset
  was wrongly added to the provided index, like: bb.getLong(bb.position() +
  bb.arrayOffset()).
* this patch breaks the on-disk file format. Since #1937 and #1938 will do so
  too, I'm fine with waiting until both are ready and committing all 3 patches
  together (I'm hoping to tackle these patches quickly).
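
The arrayOffset misuse mentioned in the second point can be reproduced with a sliced buffer. This is a small standalone sketch (the class AbsoluteGetDemo is hypothetical): absolute gets on a ByteBuffer take an index relative to the start of the buffer, so arrayOffset() only belongs in an index when reading the backing array directly.

```java
import java.nio.ByteBuffer;

// Hypothetical demo of correct absolute gets on a buffer with a non-zero arrayOffset.
final class AbsoluteGetDemo {
    // Correct: the index is relative to the buffer, not to its backing array.
    static long readAtPosition(ByteBuffer bb) {
        return bb.getLong(bb.position());
    }

    // Builds a slice over the last 16 bytes of a 24-byte array,
    // so that arrayOffset() == 8 and position() == 0.
    static ByteBuffer slicedExample() {
        ByteBuffer whole = ByteBuffer.wrap(new byte[24]);
        whole.putLong(0, 111L);
        whole.putLong(8, 222L);
        whole.putLong(16, 333L);
        whole.position(8);
        return whole.slice();
    }
}
```

On the slice, readAtPosition() returns 222, while bb.getLong(bb.position() + bb.arrayOffset()) silently reads the next long (333), and on a smaller buffer it would throw IndexOutOfBoundsException.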





[jira] Commented: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977800#action_12977800
 ] 

Jonathan Ellis commented on CASSANDRA-1938:
---

Will this also break the on-disk ring persistence we added for CASSANDRA-1518?

 Use UUID as node identifiers in counters instead of IP addresses 
 -

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

   Original Estimate: 56h
  Remaining Estimate: 56h

 The use of IP addresses as node identifiers in the partition of a given
 counter is fragile. Changes of the node's IP addresses can result in data
 loss. This patch proposes to use UUIDs instead.
 NOTE: this breaks the on-disk file format (for counters)




[jira] Commented: (CASSANDRA-1902) Migrate cached pages during compaction

2011-01-05 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977801#action_12977801
 ] 

T Jake Luciani commented on CASSANDRA-1902:
---

Well the good news is the mincore() stuff works via JNA!

I'm reconsidering what to do with this information.  It's most efficient to 
keep around contiguous chunks of pages, so the plan might be to find the most 
densely populated ranges of pages in the sstable and then get the range of rows 
they cover.  This list is then passed to the SSTableWriter, which will subsequently 
mark the new data written for these pages as POSIX_FADV_WILLNEED.  The ordering 
of the keys should be close.  

I think in the average case this will get the most active data cached.  It may 
keep too much data in the page cache though.

Any thoughts on this?
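
One possible reading of "densely populated ranges of pages" is sketched below: group the sorted cached page numbers into runs, tolerate a small gap inside a run, and keep only runs with enough cached pages. The class name, the maxGap and minPages parameters, and the use of page numbers rather than byte offsets are all assumptions made for illustration; the mincore() call itself is not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: find dense runs of cached pages in a sorted page list. */
public class CachedPageRanges {
    /** A run of cached pages, inclusive on both ends. */
    public static final class Range {
        public final long firstPage;
        public final long lastPage;

        Range(long firstPage, long lastPage) {
            this.firstPage = firstPage;
            this.lastPage = lastPage;
        }
    }

    /**
     * Groups sorted page numbers into runs. A run is broken when more than
     * maxGap pages are missing between two cached pages; runs containing fewer
     * than minPages cached pages are dropped.
     */
    public static List<Range> denseRanges(long[] sortedPages, long maxGap, int minPages) {
        List<Range> result = new ArrayList<>();
        if (sortedPages.length == 0) {
            return result;
        }
        long start = sortedPages[0];
        long prev = start;
        int count = 1;
        for (int i = 1; i < sortedPages.length; i++) {
            long page = sortedPages[i];
            if (page - prev - 1 > maxGap) {
                // Gap too large: close the current run and start a new one.
                if (count >= minPages) {
                    result.add(new Range(start, prev));
                }
                start = page;
                count = 0;
            }
            prev = page;
            count++;
        }
        if (count >= minPages) {
            result.add(new Range(start, prev));
        }
        return result;
    }

    public static void main(String[] args) {
        long[] cached = {0, 1, 2, 3, 10, 20, 21, 22};
        for (Range r : denseRanges(cached, 0, 3)) {
            System.out.println(r.firstPage + ".." + r.lastPage);
        }
    }
}
```

Raising maxGap trades precision for fewer, longer ranges, which bears on the concern above about keeping too much data in the page cache.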



 Migrate cached pages during compaction 
 ---

 Key: CASSANDRA-1902
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1902
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.1
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 0.7.1

   Original Estimate: 32h
  Remaining Estimate: 32h

 Post CASSANDRA-1470 there is an opportunity to migrate cached pages from a 
 pre-compacted CF during the compaction process.  
 First, add a method to MmappedSegmentFile: long[] pagesInPageCache() that 
 uses the posix mincore() function to detect the offsets of pages for this 
 file currently in page cache.
 Then add getActiveKeys() which uses underlying pagesInPageCache() to get the 
 keys actually in the page cache.
 Use getActiveKeys() to detect which SSTables being compacted are in the OS 
 cache and make sure the subsequent pages in the new compacted SSTable are 
 kept in the page cache for these keys. This will minimize the impact of 
 compacting a hot SSTable.
 A simpler yet similar approach is described here: 
 http://insights.oetiker.ch/linux/fadvise/
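
The mapping from cached pages to keys could look roughly like the sketch below. It assumes the caller already has the row start offsets from the SSTable index and a sorted array of cached page numbers (for example, derived from a pagesInPageCache() result); the ActiveKeys class, the 4096-byte page size, and the rule "a key is active if the page holding its row start is cached" are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: pick the keys whose rows start in a cached page. */
public class ActiveKeys {
    static final long PAGE_SIZE = 4096; // assumed page size in bytes

    /**
     * keys[i] is the key of the row starting at byte offset rowOffsets[i];
     * cachedPages must be sorted ascending so it can be binary-searched.
     */
    public static List<String> activeKeys(String[] keys, long[] rowOffsets, long[] cachedPages) {
        List<String> active = new ArrayList<>();
        for (int i = 0; i < keys.length; i++) {
            long page = rowOffsets[i] / PAGE_SIZE;
            if (Arrays.binarySearch(cachedPages, page) >= 0) {
                active.add(keys[i]);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c"};
        long[] offsets = {0, 5000, 9000}; // pages 0, 1 and 2
        long[] cached = {1};
        System.out.println(activeKeys(keys, offsets, cached)); // [b]
    }
}
```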




[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977802#action_12977802
 ] 

Jonathan Ellis commented on CASSANDRA-1936:
---

Can you break the ByteBuffer fixes out into a separate patch so we can apply 
them to 0.7?

 Fit partitioned counter directly into CounterColumn.value 
 --

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 
 0001-Put-partitioned-counter-directly-in-column-value.patch


 The current implementation of CounterColumn keeps both the partitioned
 counter and the total value of the counter (that is, the sum of the parts of
 the partitioned counter).
 This wastes space and requires the code to keep both representations in
 sync. This ticket proposes to remove the total value from the representation
 and to only calculate it when returning the value to the client.
 NOTE: this breaks the on-disk file format (for counters)
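
A minimal sketch of the idea, assuming a made-up layout in which the column value holds only (node id, count) pairs and the total is derived on read. The 4-byte node id, the field widths, and the CounterValueSketch name are assumptions for illustration and do not reflect the actual on-disk format in the patch.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: a counter value that stores only its parts. */
public class CounterValueSketch {
    private static final int ID_LENGTH = 4;    // assumed fixed-width node id
    private static final int COUNT_LENGTH = 8; // one long per part
    private static final int PART_LENGTH = ID_LENGTH + COUNT_LENGTH;

    /** Serializes (nodeId, count) pairs back to back; no total is stored. */
    public static ByteBuffer encode(int[] nodeIds, long[] counts) {
        ByteBuffer value = ByteBuffer.allocate(nodeIds.length * PART_LENGTH);
        for (int i = 0; i < nodeIds.length; i++) {
            value.putInt(nodeIds[i]);
            value.putLong(counts[i]);
        }
        value.flip();
        return value;
    }

    /** Computes the total on demand by summing the count of every part. */
    public static long total(ByteBuffer value) {
        long sum = 0;
        for (int offset = value.position(); offset < value.limit(); offset += PART_LENGTH) {
            // Absolute get: reads at the given index and leaves position untouched.
            sum += value.getLong(offset + ID_LENGTH);
        }
        return sum;
    }

    public static void main(String[] args) {
        ByteBuffer value = encode(new int[] {1, 2, 3}, new long[] {10, -4, 7});
        System.out.println(total(value)); // 13
    }
}
```

With this shape there is only one representation to maintain, at the cost of a linear scan over the parts each time the value is returned to a client.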




[jira] Commented: (CASSANDRA-1938) Use UUID as node identifiers in counters instead of IP addresses

2011-01-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977806#action_12977806
 ] 

Sylvain Lebresne commented on CASSANDRA-1938:
-

I'd have to look at CASSANDRA-1518 in more detail, but for this ticket, I 
intend to keep those node identifiers strictly local; they will not get 
gossiped. So I expect that no, it won't break on-disk ring persistence. I think 
we may have to gossip them at some point, however, to deal with ever-increasing 
contexts, but I'm not yet completely clear on that and I don't think it's an 
urgent matter.

 Use UUID as node identifiers in counters instead of IP addresses 
 -

 Key: CASSANDRA-1938
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1938
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

   Original Estimate: 56h
  Remaining Estimate: 56h

 The use of IP addresses as node identifiers in the partition of a given
 counter is fragile. Changes of the node's IP addresses can result in data
 loss. This patch proposes to use UUIDs instead.
 NOTE: this breaks the on-disk file format (for counters)




[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977807#action_12977807
 ] 

Sylvain Lebresne commented on CASSANDRA-1936:
-

Oh right, forgot this was in 0.7 too. I'll do that.

 Fit partitioned counter directly into CounterColumn.value 
 --

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 
 0001-Put-partitioned-counter-directly-in-column-value.patch


 The current implementation of CounterColumn keeps both the partitioned
 counter and the total value of the counter (that is, the sum of the parts of
 the partitioned counter).
 This wastes space and requires the code to keep both representations in
 sync. This ticket proposes to remove the total value from the representation
 and to only calculate it when returning the value to the client.
 NOTE: this breaks the on-disk file format (for counters)




[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes

2011-01-05 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977820#action_12977820
 ] 

T Jake Luciani commented on CASSANDRA-1472:
---

I can't seem to figure out how to apply this patchset; git apply just throws 
errors. This is against the cassandra-0.7 branch, correct?

I untarred the dir and ran git apply 1472/*

 Add bitmap secondary indexes
 

 Key: CASSANDRA-1472
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 0.7.1

 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, 
 v4-bench-c32.txt


 Bitmap indexes are a very efficient structure for dealing with immutable 
 data. We can take advantage of the fact that SSTables are immutable by 
 attaching them directly to SSTables as a new component (supported by 
 CASSANDRA-1471).
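
For readers unfamiliar with the structure, the sketch below shows the basic mechanics of a bitmap index using java.util.BitSet: one bitmap per (column, value) pair, with a bit set for every row position that holds that value, so that multi-column predicates become bitwise ANDs. The BitmapIndexSketch class and its in-memory map are assumptions for illustration and say nothing about how the attached patches lay the index out as an SSTable component.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one bitmap of row positions per (column, value) pair. */
public class BitmapIndexSketch {
    private final Map<String, BitSet> bitmaps = new HashMap<>();

    private static String key(String column, String value) {
        return column + "=" + value;
    }

    /** Records that the row at rowPosition holds value in column. */
    public void add(int rowPosition, String column, String value) {
        bitmaps.computeIfAbsent(key(column, value), k -> new BitSet()).set(rowPosition);
    }

    /** Row positions matching column = value; returns a copy so the index stays intact. */
    public BitSet rowsWith(String column, String value) {
        BitSet bits = bitmaps.get(key(column, value));
        return bits == null ? new BitSet() : (BitSet) bits.clone();
    }

    public static void main(String[] args) {
        BitmapIndexSketch index = new BitmapIndexSketch();
        index.add(0, "state", "TX");
        index.add(1, "state", "CA");
        index.add(2, "state", "TX");
        index.add(0, "age", "30");
        index.add(1, "age", "30");
        index.add(2, "age", "40");

        // state = TX AND age = 30 is a bitwise AND of two bitmaps.
        BitSet result = index.rowsWith("state", "TX");
        result.and(index.rowsWith("age", "30"));
        System.out.println(result); // {0}
    }
}
```

Since the bitmaps are never updated in place once built, the structure fits immutable data well, which is the property the description relies on.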




[jira] Created: (CASSANDRA-1939) Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)

2011-01-05 Thread Sylvain Lebresne (JIRA)
Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)
---

 Key: CASSANDRA-1939
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1939
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 3, 0.7.0 rc 2, 0.7.0 rc 1, 0.7.0, 0.7.1, 0.8
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.0
 Attachments: 
0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch

ByteBuffer.arrayOffset() should not be added to the argument of an absolute 
get. 
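
The pitfall can be reproduced with a few lines of standard java.nio code. In the sketch below (class and method names are invented for the example), a slice over the tail of an array has a non-zero arrayOffset(); that offset is only meaningful when indexing the backing array directly, because an absolute get(int) is already relative to the start of the buffer.

```java
import java.nio.ByteBuffer;

/** Sketch: arrayOffset() applies to the backing array, not to absolute gets. */
public class AbsoluteGetSketch {
    /** Builds a buffer over the tail of the array so arrayOffset() is non-zero. */
    public static ByteBuffer sliceFrom(byte[] backing, int start) {
        ByteBuffer whole = ByteBuffer.wrap(backing);
        whole.position(start);
        return whole.slice();
    }

    public static void main(String[] args) {
        byte[] backing = {10, 20, 30, 40, 50};
        ByteBuffer slice = sliceFrom(backing, 2);

        System.out.println(slice.arrayOffset());                // 2
        System.out.println(slice.get(0));                       // 30: correct absolute get
        System.out.println(backing[slice.arrayOffset() + 0]);   // 30: correct array access
        System.out.println(slice.get(slice.arrayOffset() + 0)); // 50: wrong element
    }
}
```

The bug is silent whenever arrayOffset() happens to be zero, which is likely why it can go unnoticed until sliced buffers show up.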




[jira] Updated: (CASSANDRA-1939) Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)

2011-01-05 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-1939:


Attachment: 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch

 Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the 
 index)
 ---

 Key: CASSANDRA-1939
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1939
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.0 rc 1, 0.7.0 rc 2, 0.7.0 rc 3, 0.7.0, 0.7.1, 0.8
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.0

 Attachments: 
 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch


 ByteBuffer.arrayOffset() should not be added to the argument of an absolute 
 get. 




[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977825#action_12977825
 ] 

Jonathan Ellis commented on CASSANDRA-1472:
---

you want git am, apply is just for a single patch

 Add bitmap secondary indexes
 

 Key: CASSANDRA-1472
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 0.7.1

 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, 
 v4-bench-c32.txt


 Bitmap indexes are a very efficient structure for dealing with immutable 
 data. We can take advantage of the fact that SSTables are immutable by 
 attaching them directly to SSTables as a new component (supported by 
 CASSANDRA-1471).




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Attachment: v3-0002-compile-driver-source.txt

v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.
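
As an illustration of the "query compression" item, the sketch below compresses a query string with java.util.zip.Deflater and restores it with Inflater. The QueryCompressionSketch class, the UTF-8 encoding, and the 1 KB chunk size are assumptions made for the example; how the real drivers frame the compressed bytes on the wire is not covered here.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch: deflate a query string and inflate it again. */
public class QueryCompressionSketch {
    /** Compresses the UTF-8 bytes of the query. */
    public static ByteBuffer compress(String query) {
        Deflater deflater = new Deflater();
        deflater.setInput(query.getBytes(StandardCharsets.UTF_8));
        deflater.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return ByteBuffer.wrap(out.toByteArray());
    }

    /** Restores the original query; the caller's buffer position is not moved. */
    public static String decompress(ByteBuffer compressed) throws DataFormatException {
        byte[] input = new byte[compressed.remaining()];
        compressed.duplicate().get(input);

        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!inflater.finished()) {
            int n = inflater.inflate(chunk);
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                break; // truncated or unsupported input: stop instead of spinning
            }
            out.write(chunk, 0, n);
        }
        inflater.end();
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws DataFormatException {
        String query = "SELECT a, b FROM Standard1 WHERE KEY = 'k'"; // illustrative query text
        ByteBuffer compressed = compress(query);
        System.out.println(decompress(compressed).equals(query)); // true
    }
}
```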




[jira] Commented: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977829#action_12977829
 ] 

Eric Evans commented on CASSANDRA-1710:
---

Yeah, that falls squarely in Don't Do That territory, but since you cannot 
re-open a closed connection, it makes sense to refuse to re-add them to the 
pool (and to log a warning).
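
The behaviour described above might look something like this sketch. The PoolSketch and Conn classes are stand-ins invented for the example (they are not the committed ConnectionPool or Connection classes), and java.util.logging is used only to keep the example self-contained.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.logging.Logger;

/** Hypothetical sketch: a pool that refuses to take back closed connections. */
public class PoolSketch {
    /** Minimal stand-in for a driver connection. */
    public static final class Conn {
        private boolean open = true;

        public boolean isOpen() {
            return open;
        }

        public void close() {
            open = false;
        }
    }

    private static final Logger logger = Logger.getLogger(PoolSketch.class.getName());
    private final Deque<Conn> idle = new ArrayDeque<>();

    /** Hands out an idle connection, or opens a new one if none is available. */
    public synchronized Conn borrow() {
        Conn conn = idle.poll();
        return conn != null ? conn : new Conn();
    }

    /** Returns a connection to the pool; closed connections are refused with a warning. */
    public synchronized boolean release(Conn conn) {
        if (!conn.isOpen()) {
            logger.warning("refusing to return a closed connection to the pool");
            return false;
        }
        idle.push(conn);
        return true;
    }

    public synchronized int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        PoolSketch pool = new PoolSketch();
        Conn conn = pool.borrow();
        conn.close();
        System.out.println(pool.release(conn)); // false, and a warning is logged
    }
}
```

Rejecting at release time keeps every connection in the idle queue usable, so borrow() never has to validate what it hands out.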

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes

2011-01-05 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977832#action_12977832
 ] 

T Jake Luciani commented on CASSANDRA-1472:
---

Now I get this:

error: java/org/apache/cassandra/db/CompactionManager.java: does not exist in index
error: java/org/apache/cassandra/io/AbstractCompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/CompactionIterator.java: does not exist in index
error: java/org/apache/cassandra/io/LazilyCompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/PrecompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java: does not exist in index
error: java/org/apache/cassandra/io/sstable/SSTableWriter.java: does not exist in index

 Add bitmap secondary indexes
 

 Key: CASSANDRA-1472
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 0.7.1

 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, 
 v4-bench-c32.txt


 Bitmap indexes are a very efficient structure for dealing with immutable 
 data. We can take advantage of the fact that SSTables are immutable by 
 attaching them directly to SSTables as a new component (supported by 
 CASSANDRA-1471).




[jira] Issue Comment Edited: (CASSANDRA-1472) Add bitmap secondary indexes

2011-01-05 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977832#action_12977832
 ] 

T Jake Luciani edited comment on CASSANDRA-1472 at 1/5/11 11:51 AM:


Now I get this:

error: java/org/apache/cassandra/db/CompactionManager.java: does not exist in index
error: java/org/apache/cassandra/io/AbstractCompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/CompactionIterator.java: does not exist in index
error: java/org/apache/cassandra/io/LazilyCompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/PrecompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java: does not exist in index
error: java/org/apache/cassandra/io/sstable/SSTableWriter.java: does not exist in index


The 0001 patch header looks like this:

 .../org/apache/cassandra/db/CompactionManager.java |   23 +++-
 .../apache/cassandra/io/AbstractCompactedRow.java  |    6 +-
 .../org/apache/cassandra/io/ColumnObserver.java    |  124 
 .../apache/cassandra/io/CompactionIterator.java    |   15 ++-
 .../apache/cassandra/io/LazilyCompactedRow.java    |    9 +-
 .../org/apache/cassandra/io/PrecompactedRow.java   |   23 +++-
 .../io/sstable/SSTableIdentityIterator.java        |   13 ++-
 .../apache/cassandra/io/sstable/SSTableWriter.java |   22 -
 .../cassandra/io/LazilyCompactedRowTest.java       |   19 +++-

  was (Author: tjake):
Now I get this:

error: java/org/apache/cassandra/db/CompactionManager.java: does not exist in index
error: java/org/apache/cassandra/io/AbstractCompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/CompactionIterator.java: does not exist in index
error: java/org/apache/cassandra/io/LazilyCompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/PrecompactedRow.java: does not exist in index
error: java/org/apache/cassandra/io/sstable/SSTableIdentityIterator.java: does not exist in index
error: java/org/apache/cassandra/io/sstable/SSTableWriter.java: does not exist in index
  
 Add bitmap secondary indexes
 

 Key: CASSANDRA-1472
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 0.7.1

 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, 
 v4-bench-c32.txt


 Bitmap indexes are a very efficient structure for dealing with immutable 
 data. We can take advantage of the fact that SSTables are immutable by 
 attaching them directly to SSTables as a new component (supported by 
 CASSANDRA-1471).




[jira] Commented: (CASSANDRA-1472) Add bitmap secondary indexes

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977841#action_12977841
 ] 

Jonathan Ellis commented on CASSANDRA-1472:
---

looks like the patch was generated from some weird non-root directory, you 
probably need some combination of -p or --directory (as in git-apply)

 Add bitmap secondary indexes
 

 Key: CASSANDRA-1472
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 0.7.1

 Attachments: 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, 
 v4-bench-c32.txt


 Bitmap indexes are a very efficient structure for dealing with immutable 
 data. We can take advantage of the fact that SSTables are immutable by 
 attaching them directly to SSTables as a new component (supported by 
 CASSANDRA-1471).




[jira] Commented: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977846#action_12977846
 ] 

Gary Dusbabek commented on CASSANDRA-1710:
--

+1

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1711) Python driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1711:
--

Attachment: 
v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt

 Python driver for CQL
 -

 Key: CASSANDRA-1711
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt, 
 v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




svn commit: r1055538 - in /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver: Connection.java ConnectionPool.java IConnectionPool.java Utils.java

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 17:24:07 2011
New Revision: 1055538

URL: http://svn.apache.org/viewvc?rev=1055538&view=rev
Log:
CASSANDRA-1710 basic connection pooling for java driver

Patch by eevans for CASSANDRA-1710

Added:

cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java

cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Utils.java
Modified:

cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java

Modified: 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java?rev=1055538&r1=1055537&r2=1055538&view=diff
==
--- 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
 (original)
+++ 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
 Wed Jan  5 17:24:07 2011
@@ -1,33 +1,8 @@
-/*
- * 
- * Licensed to the Apache Software Foundation (ASF) under one
- * or more contributor license agreements.  See the NOTICE file
- * distributed with this work for additional information
- * regarding copyright ownership.  The ASF licenses this file
- * to you under the Apache License, Version 2.0 (the
- * "License"); you may not use this file except in compliance
- * with the License.  You may obtain a copy of the License at
- * 
- *   http://www.apache.org/licenses/LICENSE-2.0
- * 
- * Unless required by applicable law or agreed to in writing,
- * software distributed under the License is distributed on an
- * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- * KIND, either express or implied.  See the License for the
- * specific language governing permissions and limitations
- * under the License.
- * 
- */
 package org.apache.cassandra.cql.driver;
 
-import java.io.ByteArrayOutputStream;
-import java.nio.ByteBuffer;
-import java.util.zip.Deflater;
-
 import org.apache.cassandra.thrift.Cassandra;
 import org.apache.cassandra.thrift.Compression;
 import org.apache.cassandra.thrift.CqlResult;
-import org.apache.cassandra.thrift.CqlRow;
 import org.apache.cassandra.thrift.InvalidRequestException;
 import org.apache.cassandra.thrift.TimedOutException;
 import org.apache.cassandra.thrift.UnavailableException;
@@ -37,105 +12,95 @@ import org.apache.thrift.protocol.TProto
 import org.apache.thrift.transport.TFramedTransport;
 import org.apache.thrift.transport.TSocket;
 import org.apache.thrift.transport.TTransport;
+import org.apache.thrift.transport.TTransportException;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
 
+/** CQL connection object. */
 public class Connection
 {
-private static final Logger logger = LoggerFactory.getLogger(Connection.class);
+public static Compression defaultCompression = Compression.GZIP;
+public final String hostName;
+public final int portNo;
 
-public String hostName;
-public int port;
+private static final Logger logger = LoggerFactory.getLogger(Connection.class);
+protected long timeOfLastFailure = 0;
+protected int numFailures = 0;
 private Cassandra.Client client;
 private TTransport transport;
-private Compression defaultCompression = Compression.GZIP;
 
-public Connection(String keyspaceName, String...hosts) throws InvalidRequestException, TException
+/**
+ * Create a new <code>Connection</code> instance.
+ * 
+ * @param hostName hostname or IP address of the remote host
+ * @param portNo TCP port number
+ * @throws TTransportException if unable to connect
+ */
+public Connection(String hostName, int portNo) throws TTransportException
 {
-assert hosts.length > 0;
+this.hostName = hostName;
+this.portNo = portNo;
 
-for (String hostSpec : hosts)
-{
-String[] parts = hostSpec.split(":", 2);
-this.hostName = parts[0];
-this.port = Integer.parseInt(parts[1]);
-
-// TODO: This will need to do connection pooling.
-break;
-}
-
-TSocket socket = new TSocket(hostName, port);
+TSocket socket = new TSocket(hostName, portNo);
 transport = new TFramedTransport(socket);
 TProtocol protocol = new TBinaryProtocol(transport);
 client = new Cassandra.Client(protocol);
 socket.open();
 
-client.set_keyspace(keyspaceName);
-}
-
-private ByteBuffer compressQuery(String queryStr, Compression compression)
-{
-byte[] data = queryStr.getBytes();
-Deflater compressor = new Deflater();
-compressor.setInput(data);
-compressor.finish();
-
-ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
-  

svn commit: r1055539 - /cassandra/trunk/build.xml

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 17:24:11 2011
New Revision: 1055539

URL: http://svn.apache.org/viewvc?rev=1055539&view=rev
Log:
compile driver source

Patch by eevans for CASSANDRA-1710

Modified:
cassandra/trunk/build.xml

Modified: cassandra/trunk/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/build.xml?rev=1055539&r1=1055538&r2=1055539&view=diff
==
--- cassandra/trunk/build.xml (original)
+++ cassandra/trunk/build.xml Wed Jan  5 17:24:11 2011
@@ -26,6 +26,7 @@
 <property name="basedir" value="."/>
 <property name="build.src" value="${basedir}/src"/>
 <property name="build.src.java" value="${basedir}/src/java"/>
+<property name="build.src.driver" value="${basedir}/drivers/java/src" />
 <property name="avro.src" value="${basedir}/src/avro"/>
 <property name="build.src.gen-java" value="${basedir}/src/gen-java"/>
 <property name="build.lib" value="${basedir}/lib"/>
@@ -300,7 +301,8 @@
 <src path="${build.src.java}"/>
 <src path="${build.src.gen-java}"/>
 <src path="${interface.thrift.dir}/gen-java"/>
-<classpath refid="cassandra.classpath"/>
+<src path="${build.src.driver}" />
+<classpath refid="cassandra.classpath"/>
 </javac>
 
 <taskdef name="paranamer" classname="com.thoughtworks.paranamer.ant.ParanamerGeneratorTask">




[jira] Commented: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977863#action_12977863
 ] 

Eric Evans commented on CASSANDRA-1710:
---

basic pooling committed.

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Assignee: (was: Eric Evans)

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Attachment: (was: 
v1-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt)

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: v1-0002-compile-driver-source.txt, 
 v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v2-0002-compile-driver-source.txt, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Attachment: (was: v1-0002-compile-driver-source.txt)

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Attachment: (was: v2-0002-compile-driver-source.txt)

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1710:
--

Attachment: (was: 
v2-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt)

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Created: (CASSANDRA-1940) Twisted driver for CQL

2011-01-05 Thread Eric Evans (JIRA)
Twisted driver for CQL
--

 Key: CASSANDRA-1940
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1940
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8


In-tree CQL drivers should be reasonably consistent with one another (wherever 
possible/practical), and implement a minimum of:

•  Query compression
•  Keyspace assignment on connection
•  Connection pooling / load-balancing

The goal is not to supplant the idiomatic libraries, but to provide a 
consistent, stable base for them to build upon.
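
As a rough illustration of the first two items, here is a minimal sketch that mirrors the compress_query logic of the in-tree Python driver (zlib under the 'GZIP' scheme) and builds the USE statement a driver would issue right after connecting. It is written for Python 3; the use_statement helper and the UTF-8 encoding step are assumptions made for this example and are not taken from any driver.

```python
import zlib

COMPRESSION_SCHEMES = ['GZIP']
DEFAULT_COMPRESSION = 'GZIP'


class InvalidCompressionScheme(Exception):
    """Raised when a caller asks for a scheme the driver does not know."""


def compress_query(query, compression=None):
    """Return (scheme, compressed bytes) for a CQL query string."""
    scheme = DEFAULT_COMPRESSION if compression is None else compression.upper()
    if scheme not in COMPRESSION_SCHEMES:
        raise InvalidCompressionScheme(scheme)
    # zlib works on bytes; encoding as UTF-8 is an assumption of this sketch.
    return scheme, zlib.compress(query.encode('utf-8'))


def use_statement(keyspace):
    """Hypothetical helper: the statement sent to assign a keyspace on connect."""
    return 'USE %s' % keyspace


if __name__ == '__main__':
    scheme, payload = compress_query(use_statement('Keyspace1'))
    print(scheme, len(payload), zlib.decompress(payload))
```

A driver would hand the compressed payload and the scheme to the server-side query call; that step is left out here because it depends on the generated Thrift bindings.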




[jira] Assigned: (CASSANDRA-1940) Twisted driver for CQL

2011-01-05 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-1940:
---

Assignee: Brandon Williams

 Twisted driver for CQL
 --

 Key: CASSANDRA-1940
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1940
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
   •  Query compression
   •  Keyspace assignment on connection
   •  Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Commented: (CASSANDRA-1711) Python driver for CQL

2011-01-05 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977910#action_12977910
 ] 

Gary Dusbabek commented on CASSANDRA-1711:
--

+1, except needs apache license headers.

 Python driver for CQL
 -

 Key: CASSANDRA-1711
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Assignee: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v1-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt, 
 v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




svn commit: r1055591 - in /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver: Connection.java ConnectionPool.java IConnectionPool.java Utils.java

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 19:23:51 2011
New Revision: 1055591

URL: http://svn.apache.org/viewvc?rev=1055591&view=rev
Log:
license headers (java driver source)

Patch by eevans for CASSANDRA-1710

Modified:

cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java

cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java

cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Utils.java

Modified: 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java?rev=1055591&r1=1055590&r2=1055591&view=diff
==
--- 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
 (original)
+++ 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/Connection.java
 Wed Jan  5 19:23:51 2011
@@ -1,3 +1,24 @@
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 package org.apache.cassandra.cql.driver;
 
 import org.apache.cassandra.thrift.Cassandra;

Modified: 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java?rev=1055591&r1=1055590&r2=1055591&view=diff
==
--- 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java
 (original)
+++ 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/ConnectionPool.java
 Wed Jan  5 19:23:51 2011
@@ -1,3 +1,23 @@
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
 
 package org.apache.cassandra.cql.driver;
 

Modified: 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java?rev=1055591&r1=1055590&r2=1055591&view=diff
==
--- 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java
 (original)
+++ 
cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/driver/IConnectionPool.java
 Wed Jan  5 19:23:51 2011
@@ -1,3 +1,24 @@
+/*
+ * 
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * License); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * AS IS BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ * 
+ */
+
 package org.apache.cassandra.cql.driver;
 
 public interface IConnectionPool

Modified: 

svn commit: r1055594 - in /cassandra/trunk: drivers/py/cql/__init__.py drivers/py/cql/connection.py drivers/py/cql/connection_pool.py drivers/py/cql/errors.py test/system/test_cql.py

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 19:27:50 2011
New Revision: 1055594

URL: http://svn.apache.org/viewvc?rev=1055594&view=rev
Log:
basic connection pooling for python driver

Patch by eevans; reviewed by gdusbabek for CASSANDRA-1711

Added:
cassandra/trunk/drivers/py/cql/connection.py
cassandra/trunk/drivers/py/cql/connection_pool.py
cassandra/trunk/drivers/py/cql/errors.py
Modified:
cassandra/trunk/drivers/py/cql/__init__.py
cassandra/trunk/test/system/test_cql.py

Modified: cassandra/trunk/drivers/py/cql/__init__.py
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/py/cql/__init__.py?rev=1055594&r1=1055593&r2=1055594&view=diff
==
--- cassandra/trunk/drivers/py/cql/__init__.py (original)
+++ cassandra/trunk/drivers/py/cql/__init__.py Wed Jan  5 19:27:50 2011
@@ -1,79 +1,23 @@
 
-from os.path import exists, abspath, dirname, join
-from thrift.transport import TTransport, TSocket
-from thrift.protocol import TBinaryProtocol
-from thrift.Thrift import TApplicationException
-import zlib
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""
+Cassandra Query Language driver
+"""
 
-try:
-    from cassandra import Cassandra
-    from cassandra.ttypes import Compression, InvalidRequestException, \
-                                 CqlResultType
-except ImportError:
-    # Hack to run from a source tree
-    import sys
-    sys.path.append(join(abspath(dirname(__file__)),
-                         '..',
-                         '..',
-                         '..',
-                         'interface',
-                         'thrift',
-                         'gen-py'))
-    from cassandra import Cassandra
-    from cassandra.ttypes import Compression, InvalidRequestException, \
-                                 CqlResultType
-
-COMPRESSION_SCHEMES = ['GZIP']
-DEFAULT_COMPRESSION = 'GZIP'
-
-class Connection(object):
-    def __init__(self, keyspace, host, port=9160):
-        socket = TSocket.TSocket(host, port)
-        self.transport = TTransport.TFramedTransport(socket)
-        protocol = TBinaryProtocol.TBinaryProtocolAccelerated(self.transport)
-        self.client = Cassandra.Client(protocol)
-        socket.open()
-
-        if keyspace:
-            self.execute('USE %s' % keyspace)
-
-    def execute(self, query, compression=None):
-        compress = compression is None and DEFAULT_COMPRESSION \
-            or compression.upper()
-
-        compressed_query = Connection.compress_query(query, compress)
-        request_compression = getattr(Compression, compress)
-
-        try:
-            response = self.client.execute_cql_query(compressed_query,
-                                                     request_compression)
-        except InvalidRequestException, ire:
-            raise CQLException("Bad Request: %s" % ire.why)
-        except TApplicationException, tapp:
-            raise CQLException("Internal application error")
-        except Exception, exc:
-            raise CQLException(exc)
-
-        if response.type == CqlResultType.ROWS:
-            return response.rows
-        if response.type == CqlResultType.INT:
-            return response.num
-
-        return None
-
-    def close(self):
-        self.transport.close()
-
-    @classmethod
-    def compress_query(cls, query, compression):
-        if not compression in COMPRESSION_SCHEMES:
-            raise InvalidCompressionScheme(compression)
-
-        if compression == 'GZIP':
-            return zlib.compress(query)
-
-
-class InvalidCompressionScheme(Exception): pass
-class CQLException(Exception): pass
-
-# vi: ai ts=4 tw=0 sw=4 et
+from connection import Connection
+from connection_pool import ConnectionPool

Added: cassandra/trunk/drivers/py/cql/connection.py
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/drivers/py/cql/connection.py?rev=1055594&view=auto
==
--- cassandra/trunk/drivers/py/cql/connection.py (added)
+++ cassandra/trunk/drivers/py/cql/connection.py Wed Jan  5 19:27:50 2011
@@ -0,0 +1,121 @@
+
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  

[jira] Updated: (CASSANDRA-1859) distributed test harness

2011-01-05 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1859:
--

 Reviewer: brandon.williams  (was: urandom)
Fix Version/s: (was: 0.8)
   0.7.1

 distributed test harness
 

 Key: CASSANDRA-1859
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1859
 Project: Cassandra
  Issue Type: Test
  Components: Tools
Reporter: Kelvin Kakugawa
Assignee: Kelvin Kakugawa
 Fix For: 0.7.1

 Attachments: 
 0001-Add-distributed-ultra-long-running-tests-using-Whirr-j.txt, 
 0002-Pull-whirr-0.3.0-incubating-SNAPSHOT-155-from-Twitter-.txt, 
 0003-add-a-test-for-one-writes-and-all-reads.txt


 Distributed Test Harness
 - deploys a cluster on a cloud provider
 - runs tests targeted at the cluster
 - tears down the cluster
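
The three steps above can be sketched as a single function that always tears the cluster down, even when a test blows up. This is only an illustration in Python: the actual harness attached to this issue is Java code built on Whirr, and the names here (RecordingController, run_against_cluster) are hypothetical stand-ins with no cloud provider behind them.

```python
class RecordingController:
    """Hypothetical stand-in for a cloud controller; it only records calls."""

    def __init__(self):
        self.events = []

    def deploy(self, node_count):
        self.events.append('deploy')
        return ['node%d' % i for i in range(node_count)]

    def teardown(self, cluster):
        self.events.append('teardown')


def run_against_cluster(controller, node_count, tests):
    """Deploy a cluster, run each test against it, then tear it down."""
    cluster = controller.deploy(node_count)
    results = {}
    try:
        for test in tests:
            try:
                test(cluster)
                results[test.__name__] = 'ok'
            except AssertionError as exc:
                # A failed assertion is a test result, not a harness error.
                results[test.__name__] = 'failed: %s' % exc
    finally:
        # Teardown runs even if a test raises something unexpected.
        controller.teardown(cluster)
    return results
```

Putting teardown in a finally block is the design point: a cluster left running on a cloud provider after a crashed test run costs money.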




[jira] Updated: (CASSANDRA-1711) Python driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1711:
--

Assignee: (was: Eric Evans)

 Python driver for CQL
 -

 Key: CASSANDRA-1711
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Commented: (CASSANDRA-1711) Python driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977921#action_12977921
 ] 

Eric Evans commented on CASSANDRA-1711:
---

basic pooling committed (w/ license headers)

 Python driver for CQL
 -

 Key: CASSANDRA-1711
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Updated: (CASSANDRA-1711) Python driver for CQL

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1711:
--

Attachment: (was: 
v1-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt)

 Python driver for CQL
 -

 Key: CASSANDRA-1711
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1711
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: 
 v2-0001-CASSANDRA-1711-basic-connection-pooling-for-python-dri.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




[jira] Created: (CASSANDRA-1941) Add distributed test doing reads during bootstrap of additional node

2011-01-05 Thread Jonathan Ellis (JIRA)
Add distributed test doing reads during bootstrap of additional node


 Key: CASSANDRA-1941
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1941
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 0.8


Following introduction of the distributed test framework in CASSANDRA-1859, we 
should extend that to test reads while bootstrap happens (this is a scenario 
that has had regressions in the past).

See test/distributed/README.txt for intro.
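
One possible shape for such a test, sketched in Python rather than in the Java of the actual harness: a background thread keeps issuing reads while the bootstrap runs, and any read that raises is collected for the test to assert on. The read_during helper and its parameters are hypothetical; the operation and read_once callables stand in for "bootstrap a node" and "read a known key".

```python
import threading


def read_during(operation, read_once, interval=0.001):
    """Run operation() while a background thread repeatedly calls read_once().

    Returns (number of reads attempted, list of exceptions raised by reads).
    """
    stop = threading.Event()
    failures = []
    reads = [0]

    def reader():
        while True:
            try:
                read_once()
            except Exception as exc:
                failures.append(exc)
            reads[0] += 1
            # Check the flag after reading so at least one read always happens.
            if stop.is_set():
                break
            stop.wait(interval)

    thread = threading.Thread(target=reader)
    thread.start()
    try:
        operation()
    finally:
        stop.set()
        thread.join()
    return reads[0], failures
```

A test would then assert that the failure list is empty once the bootstrap has finished.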




[jira] Updated: (CASSANDRA-1857) nodetool has invalidaterowcache but no invalidatekeycache

2011-01-05 Thread Jon Hermes (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Hermes updated CASSANDRA-1857:
--

Attachment: 1857.txt

Adds invalidateKeyCache and exposes both cache invalidations in NodeCmd, with 
the same optional KS and CF arguments as repair/compact/cleanup/flush.

 nodetool has invalidaterowcache but no invalidatekeycache
 -

 Key: CASSANDRA-1857
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1857
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robert Coli
Assignee: Jon Hermes
Priority: Trivial
 Attachments: 1857.txt


 In many cases where you would want to use invalidaterowcache, you would 
 probably also want to invalidatekeycache. Currently, you can 
 invalidaterowcache, but not invalidatekeycache. It seems that users 
 should, generally, be able to do both or neither, but not one or the other. A 
 brief look at the NodeCmd/ColumnFamilyStore code suggests that the 
 stubs/hooks for this feature do not currently exist.
 =Rob




[jira] Updated: (CASSANDRA-1857) nodetool has invalidaterowcache but no invalidatekeycache

2011-01-05 Thread Jon Hermes (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jon Hermes updated CASSANDRA-1857:
--

Fix Version/s: 0.7.1

 nodetool has invalidaterowcache but no invalidatekeycache
 -

 Key: CASSANDRA-1857
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1857
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Robert Coli
Assignee: Jon Hermes
Priority: Trivial
 Fix For: 0.7.1

 Attachments: 1857.txt


 In many cases where you would want to use invalidaterowcache, you would 
 probably also want to invalidatekeycache. Currently, you can 
 invalidaterowcache, but not invalidatekeycache. It seems that users 
 should, generally, be able to do both or neither, but not one or the other. A 
 brief look at the NodeCmd/ColumnFamilyStore code suggests that the 
 stubs/hooks for this feature do not currently exist.
 =Rob




[jira] Assigned: (CASSANDRA-1941) Add distributed test doing reads during bootstrap of additional node

2011-01-05 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reassigned CASSANDRA-1941:
---

Assignee: Brandon Williams

 Add distributed test doing reads during bootstrap of additional node
 

 Key: CASSANDRA-1941
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1941
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8


 Following introduction of the distributed test framework in CASSANDRA-1859, 
 we should extend that to test reads while bootstrap happens (this is a 
 scenario that has had regressions in the past).
 See test/distributed/README.txt for intro.




[jira] Updated: (CASSANDRA-1859) distributed test harness

2011-01-05 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1859:


Attachment: 0.7-1859.tgz

Attaching a rebase for the 0.7 branch.

 distributed test harness
 

 Key: CASSANDRA-1859
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1859
 Project: Cassandra
  Issue Type: Test
  Components: Tools
Reporter: Kelvin Kakugawa
Assignee: Kelvin Kakugawa
 Fix For: 0.7.1

 Attachments: 0.7-1859.tgz, 
 0001-Add-distributed-ultra-long-running-tests-using-Whirr-j.txt, 
 0002-Pull-whirr-0.3.0-incubating-SNAPSHOT-155-from-Twitter-.txt, 
 0003-add-a-test-for-one-writes-and-all-reads.txt


 Distributed Test Harness
 - deploys a cluster on a cloud provider
 - runs tests targeted at the cluster
 - tears down the cluster




[jira] Updated: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Gary Dusbabek (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Dusbabek updated CASSANDRA-1710:
-

Attachment: jdbc-ish.diff

This may not be useful or productive, but I had to put the code somewhere.  
This patch (applies on top) makes the API JDBC-ish (which may be 
undesirable).  However, it does push the pool abstraction down so that client 
code wouldn't need to think about pools and could treat all Connection objects 
the same way: get connection, execute query, close.

I haven't given too much thought as to how this would work out in other 
languages, but it is idiomatic for java. :)
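
For readers thinking about other languages, here is a minimal Python sketch of the "get connection, execute query, close" pattern with the pool hidden behind the connection object. It is not the attached patch: ConnectionPool, PooledConnection and FakeConnection are hypothetical names, and FakeConnection stands in for a real Thrift-backed connection.

```python
import queue


class FakeConnection:
    """Hypothetical stand-in for a real driver connection."""

    def execute(self, query):
        return 'executed: ' + query


class PooledConnection:
    """Looks like a plain connection; close() hands it back to the pool."""

    def __init__(self, pool, raw):
        self._pool = pool
        self._raw = raw
        self._closed = False

    def execute(self, query):
        if self._closed:
            raise RuntimeError('connection is closed')
        return self._raw.execute(query)

    def close(self):
        # Closing twice is harmless; the raw connection is returned only once.
        if not self._closed:
            self._closed = True
            self._pool._release(self._raw)


class ConnectionPool:
    def __init__(self, factory, size=2):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    def get_connection(self, timeout=None):
        # Blocks until a connection is idle (or raises queue.Empty on timeout).
        return PooledConnection(self, self._idle.get(timeout=timeout))

    def _release(self, raw):
        self._idle.put(raw)

    def idle_count(self):
        return self._idle.qsize()
```

Client code only ever sees execute() and close(), so it reads the same whether or not a pool sits underneath.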

 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: jdbc-ish.diff, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.




svn commit: r1055618 - in /cassandra/branches/cassandra-0.7: ./ test/distributed/ test/distributed/org/ test/distributed/org/apache/ test/distributed/org/apache/cassandra/ test/distributed/org/apache/

2011-01-05 Thread brandonwilliams
Author: brandonwilliams
Date: Wed Jan  5 20:16:14 2011
New Revision: 1055618

URL: http://svn.apache.org/viewvc?rev=1055618&view=rev
Log:
Distributed test harness.  Patch by Kelvin Kakugawa, Stu Hood, and Ryan
King, reviewed by brandonwilliams for CASSANDRA-1859.

Added:
cassandra/branches/cassandra-0.7/test/distributed/
cassandra/branches/cassandra-0.7/test/distributed/README.txt
cassandra/branches/cassandra-0.7/test/distributed/ivy.xml
  - copied, changed from r1055594, 
cassandra/branches/cassandra-0.7/ivysettings.xml
cassandra/branches/cassandra-0.7/test/distributed/org/
cassandra/branches/cassandra-0.7/test/distributed/org/apache/
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/CassandraServiceController.java

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MovementTest.java

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MutationTest.java

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/TestBase.java

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/BlobUtils.java

cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/KeyPair.java
cassandra/branches/cassandra-0.7/test/resources/whirr-default.properties
Modified:
cassandra/branches/cassandra-0.7/CHANGES.txt
cassandra/branches/cassandra-0.7/build.xml
cassandra/branches/cassandra-0.7/ivysettings.xml

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1055618&r1=1055617&r2=1055618&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jan  5 20:16:14 2011
@@ -12,6 +12,7 @@ dev
  * implement describeOwnership for BOP, COPP (CASSANDRA-1928)
  * make read repair behave as expected for ConsistencyLevel > ONE
(CASSANDRA-982)
+ * distributed test harness (CASSANDRA-1859)
 
 
 0.7.0-rc4

Modified: cassandra/branches/cassandra-0.7/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/build.xml?rev=1055618&r1=1055617&r2=1055618&view=diff
==
--- cassandra/branches/cassandra-0.7/build.xml (original)
+++ cassandra/branches/cassandra-0.7/build.xml Wed Jan  5 20:16:14 2011
@@ -40,12 +40,14 @@
     <property name="interface.avro.dir" value="${interface.dir}/avro"/>
     <property name="test.dir" value="${basedir}/test"/>
     <property name="test.resources" value="${test.dir}/resources"/>
+    <property name="test.lib" value="${build.dir}/test/lib"/>
     <property name="test.classes" value="${build.dir}/test/classes"/>
     <property name="test.conf" value="${test.dir}/conf"/>
     <property name="test.data" value="${test.dir}/data"/>
     <property name="test.name" value="*Test"/>
     <property name="test.unit.src" value="${test.dir}/unit"/>
     <property name="test.long.src" value="${test.dir}/long"/>
+    <property name="test.distributed.src" value="${test.dir}/distributed"/>
     <property name="dist.dir" value="${build.dir}/dist"/>
     <property name="base.version" value="0.7.0-rc4"/>
     <condition property="version" value="${base.version}">
@@ -105,6 +107,7 @@
         <fail unless="is.source.artifact"
             message="Not a source artifact, stopping here." />
         <mkdir dir="${build.classes}"/>
+        <mkdir dir="${test.lib}"/>
         <mkdir dir="${test.classes}"/>
         <mkdir dir="${build.src.gen-java}"/>
     </target>
@@ -165,10 +168,17 @@
     </target>
 
     <target name="ivy-retrieve-build" depends="ivy-init">
+      <ivy:resolve file="${basedir}/ivy.xml"/>
       <ivy:retrieve type="jar,source" sync="true"
              pattern="${build.dir.lib}/[type]s/[artifact]-[revision].[ext]" />
     </target>
 
+    <target name="ivy-retrieve-test" depends="ivy-init">
+      <ivy:resolve file="${basedir}/test/distributed/ivy.xml"/>
+      <ivy:retrieve type="jar,source" sync="true"
+             pattern="${test.lib}/[type]s/[artifact]-[revision].[ext]" />
+    </target>
+
     <!--
        Generate avro code
     -->
@@ -453,28 +463,49 @@
     </copy>
   </target>
 
+  <target name="build-distributed-test" depends="build-test,ivy-retrieve-test" description="Compile distributed test classes (which have additional deps)">
+    <javac
+         debug="true"
+         debuglevel="${debuglevel}"
+         destdir="${test.classes}">
+      <classpath>
+        <path refid="cassandra.classpath"/>
+        <pathelement location="${test.classes}"/>
+        <fileset dir="${test.lib}">
+          <include name="**/*.jar" />
+        </fileset>
+      </classpath>
+      <src path="${test.distributed.src}"/>
+    </javac>
+  </target>
+
   <macrodef name="testmacro">
     <attribute name="suitename" />
     <attribute name="inputdir" />
     <attribute name="timeout" />
+

[jira] Issue Comment Edited: (CASSANDRA-1710) Java driver for CQL

2011-01-05 Thread Gary Dusbabek (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977948#action_12977948
 ] 

Gary Dusbabek edited comment on CASSANDRA-1710 at 1/5/11 3:19 PM:
--

This may not be useful or productive, but I had to put the code somewhere.  
This patch (which applies on top) makes the API JDBC-ish (which may be 
undesirable).  However, it does push the pool abstraction down so that client 
code wouldn't need to think about pools and could treat all Connection objects 
the same way: get connection, execute query, close.

I haven't given too much thought as to how this would work out in other 
languages, but it is idiomatic for Java. :)
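
To make the "get connection, execute query, close" flow concrete, here is a
minimal sketch of a pool hidden below the Connection object. It is not taken
from jdbc-ish.diff: every name in it (PooledConnectionSketch, RawConnection,
getConnection, createdCount) is hypothetical, and the "query" only echoes its
input rather than talking to a node.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch; none of these names come from the attached patch.
final class PooledConnectionSketch {
    // Stand-in for a real socket to a node; it only echoes the query.
    static final class RawConnection {
        String execute(String query) {
            return "executed: " + query;
        }
    }

    // The pool lives here, out of sight of client code.
    private static final Deque<RawConnection> IDLE = new ArrayDeque<RawConnection>();
    private static int created = 0;

    // What the client sees: execute and close, with no mention of pools.
    static final class Connection {
        private RawConnection raw;

        private Connection(RawConnection raw) {
            this.raw = raw;
        }

        String execute(String query) {
            if (raw == null) {
                throw new IllegalStateException("connection is closed");
            }
            return raw.execute(query);
        }

        // Closing hands the underlying connection back to the pool.
        void close() {
            if (raw != null) {
                synchronized (IDLE) {
                    IDLE.push(raw);
                }
                raw = null;
            }
        }
    }

    // Reuses an idle underlying connection if one exists, else opens a new one.
    static Connection getConnection() {
        synchronized (IDLE) {
            RawConnection raw = IDLE.poll();
            if (raw == null) {
                raw = new RawConnection();
                created++;
            }
            return new Connection(raw);
        }
    }

    // Number of underlying connections opened so far.
    static int createdCount() {
        synchronized (IDLE) {
            return created;
        }
    }

    public static void main(String[] args) {
        Connection first = getConnection();
        System.out.println(first.execute("some query text"));
        first.close();

        Connection second = getConnection(); // reuses the pooled connection
        second.close();
        System.out.println(createdCount()); // prints 1
    }
}
```

The client never sees the pool; whether the real patch returns connections on
close() in this way is an assumption of the sketch.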

  was (Author: gdusbabek):
This may not be useful or productive, but I had to put the code somewhere.  
This patch (applies on top) and makes the API JDBC-ish, which may be 
undesirable).  However, it does push the pool abstraction down so that client 
code would think about pools and could treat all Connection objects the same 
way: get connection, execute query, close.

I haven't given too much thought as to how this would work out in other 
languages, but it is idiomatic for java. :)
  
 Java driver for CQL
 ---

 Key: CASSANDRA-1710
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1710
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: jdbc-ish.diff, 
 v3-0001-CASSANDRA-1710-basic-connection-pooling-for-java-drive.txt, 
 v3-0002-compile-driver-source.txt

   Original Estimate: 0h
  Remaining Estimate: 0h

 In-tree CQL drivers should be reasonably consistent with one another 
 (wherever possible/practical), and implement a minimum of:
 * Query compression
 * Keyspace assignment on connection
 * Connection pooling / load-balancing
 The goal is not to supplant the idiomatic libraries, but to provide a 
 consistent, stable base for them to build upon.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1937) Keep partitioned counters (contexts) sorted

2011-01-05 Thread Kelvin Kakugawa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977953#action_12977953
 ] 

Kelvin Kakugawa commented on CASSANDRA-1937:


Definitely agree.  I made the trade-off to keep them in update order, i.e. the 
order in which each node was last updated.  However, keeping them sorted by 
node id did cross my mind.

 Keep partitioned counters (contexts) sorted
 -

 Key: CASSANDRA-1937
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1937
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

   Original Estimate: 4h
  Remaining Estimate: 4h

 In the value of CounterColumns, the code keeps the subparts unsorted, but sorts
 them 'on the fly' when needed (in diff() and merge()). It would be more
 efficient to keep the parts always sorted (it would also be simpler, since it
 removes the need for the ad-hoc in-place quicksort in CounterContext).
 NOTE: this breaks the on-disk file format (for counters)
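
 As an illustration of why sorted parts help, the sketch below merges two
 contexts in one linear pass once both are sorted by node id, with no sort
 step. It is not the CounterContext code: the {nodeId, count} array layout,
 the class name, and the use of Math.max for equal ids are all assumptions
 made for the example (the real rule for combining equal ids may differ).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; the real CounterContext uses a packed byte layout.
final class SortedContextMerge {
    // Each entry is {nodeId, count}. Both inputs must already be sorted by
    // nodeId and contain no duplicate ids.
    static long[][] merge(long[][] left, long[][] right) {
        List<long[]> out = new ArrayList<long[]>();
        int i = 0;
        int j = 0;
        while (i < left.length && j < right.length) {
            long l = left[i][0];
            long r = right[j][0];
            if (l < r) {
                out.add(left[i++]);
            } else if (l > r) {
                out.add(right[j++]);
            } else {
                // Same node on both sides: placeholder rule, keep the larger count.
                out.add(new long[] { l, Math.max(left[i][1], right[j][1]) });
                i++;
                j++;
            }
        }
        // At most one of these loops does any work.
        while (i < left.length) {
            out.add(left[i++]);
        }
        while (j < right.length) {
            out.add(right[j++]);
        }
        return out.toArray(new long[out.size()][]);
    }

    public static void main(String[] args) {
        long[][] a = { { 1, 4 }, { 3, 2 } };
        long[][] b = { { 2, 7 }, { 3, 5 } };
        System.out.println(Arrays.deepToString(merge(a, b))); // [[1, 4], [2, 7], [3, 5]]
    }
}
```

 The output is itself sorted, so the invariant is preserved across merges.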




svn commit: r1055623 - /cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 20:27:58 2011
New Revision: 1055623

URL: http://svn.apache.org/viewvc?rev=1055623&view=rev
Log:
replace high-scale-lib.jar from maven central

Modified:
cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar

Modified: cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/lib/high-scale-lib.jar?rev=1055623&r1=1055622&r2=1055623&view=diff
==
Binary files - no diff available.




svn commit: r1055626 - in /cassandra/trunk: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ lib/ test/distributed/ test/distributed/org/ test/distributed/org/apache/ test/distributed/org/apa

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 20:36:35 2011
New Revision: 1055626

URL: http://svn.apache.org/viewvc?rev=1055626&view=rev
Log:
merge w/ 0.7 branch

Added:
cassandra/trunk/test/distributed/
  - copied from r1055624, cassandra/branches/cassandra-0.7/test/distributed/
cassandra/trunk/test/distributed/README.txt
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/README.txt
cassandra/trunk/test/distributed/ivy.xml
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/ivy.xml
cassandra/trunk/test/distributed/org/
  - copied from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/
cassandra/trunk/test/distributed/org/apache/
  - copied from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/
cassandra/trunk/test/distributed/org/apache/cassandra/
  - copied from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/

cassandra/trunk/test/distributed/org/apache/cassandra/CassandraServiceController.java
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/CassandraServiceController.java
cassandra/trunk/test/distributed/org/apache/cassandra/MovementTest.java
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MovementTest.java
cassandra/trunk/test/distributed/org/apache/cassandra/MutationTest.java
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/MutationTest.java
cassandra/trunk/test/distributed/org/apache/cassandra/TestBase.java
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/TestBase.java
cassandra/trunk/test/distributed/org/apache/cassandra/utils/
  - copied from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/
cassandra/trunk/test/distributed/org/apache/cassandra/utils/BlobUtils.java
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/BlobUtils.java
cassandra/trunk/test/distributed/org/apache/cassandra/utils/KeyPair.java
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/distributed/org/apache/cassandra/utils/KeyPair.java
cassandra/trunk/test/resources/whirr-default.properties
  - copied unchanged from r1055624, 
cassandra/branches/cassandra-0.7/test/resources/whirr-default.properties
Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/build.xml

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/ivysettings.xml
cassandra/trunk/lib/high-scale-lib.jar

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 20:36:35 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1055311
-/cassandra/branches/cassandra-0.7:1026516-1055325
+/cassandra/branches/cassandra-0.7:1026516-1055624
 /cassandra/branches/cassandra-0.7.0:1053690-1054631
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1055626&r1=1055625&r2=1055626&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Wed Jan  5 20:36:35 2011
@@ -17,6 +17,7 @@
  * implement describeOwnership for BOP, COPP (CASSANDRA-1928)
  * make read repair behave as expected for ConsistencyLevel > ONE
(CASSANDRA-982)
+ * distributed test harness (CASSANDRA-1859)
 
 
 0.7.0-rc4

Modified: cassandra/trunk/build.xml
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/build.xml?rev=1055626&r1=1055625&r2=1055626&view=diff
==
--- cassandra/trunk/build.xml (original)
+++ cassandra/trunk/build.xml Wed Jan  5 20:36:35 2011
@@ -41,12 +41,14 @@
     <property name="interface.avro.dir" value="${interface.dir}/avro"/>
     <property name="test.dir" value="${basedir}/test"/>
     <property name="test.resources" value="${test.dir}/resources"/>
+

[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Kelvin Kakugawa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977959#action_12977959
 ] 

Kelvin Kakugawa commented on CASSANDRA-1936:


The refactor to store the partitioned counter in the value, instead, is a good 
direction.

I noticed some material changes, though:
- client deltas are in the correct partitioned counter format, but targeted at 
the local node
- the RowMutation : updateCommutativeTypes was removed

The code works, now, as long as the coordinator node (the local node) is part 
of the replica set.  However, if it's not, then all updates from those 
non-replica coordinators will be fixed at the highest delta.  The 
reconciliation strategy (on a replica) is sum my node's updates, but take the 
highest update from all other nodes.  (Just ran a distributed test to validate 
my hypothesis--the same test that's included on 1072.)

In the current code, value and partitioned counter are broken apart, because 
when the RowMutation is created we don't know which node we're going to write 
to, yet.  So, we can't create the final partitioned counter.  We could create a 
sentinel node that replicas need to look for, but that's dirty.  The way I 
solved it was using value for the client delta and converting it to the 
partitioned counter (w/ the target node) via RM : updateCommutativeTypes.

I have an alternate proposal for this ticket, the second patch on 1072.  I'll 
post it and we can take the best parts of each patch.
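
The effect described above can be reproduced with a small sketch of the
stated reconciliation rule (sum my own node's updates, take the highest
update from every other node). This is not the CounterContext code: the map
representation, string node ids, and all names below are assumptions made
for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the rule: sum local updates, take max for other nodes.
final class CounterReconcileSketch {
    // A one-entry context: `amount` attributed to `nodeId`.
    static Map<String, Long> delta(String nodeId, long amount) {
        Map<String, Long> context = new HashMap<String, Long>();
        context.put(nodeId, amount);
        return context;
    }

    // Merges `incoming` into `stored` as seen from the replica `localId`.
    static Map<String, Long> reconcile(String localId,
                                       Map<String, Long> stored,
                                       Map<String, Long> incoming) {
        Map<String, Long> merged = new HashMap<String, Long>(stored);
        for (Map.Entry<String, Long> entry : incoming.entrySet()) {
            String nodeId = entry.getKey();
            Long current = merged.get(nodeId);
            if (current == null) {
                merged.put(nodeId, entry.getValue());
            } else if (nodeId.equals(localId)) {
                // My own updates are deltas, so they add up.
                merged.put(nodeId, current + entry.getValue());
            } else {
                // Another node's part is treated as a running total: keep the highest.
                merged.put(nodeId, Math.max(current, entry.getValue()));
            }
        }
        return merged;
    }

    // The value returned to a client: the sum of all parts.
    static long total(Map<String, Long> context) {
        long sum = 0;
        for (long part : context.values()) {
            sum += part;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Deltas +3 and +5 tagged with replica A, applied on A: they add up.
        System.out.println(total(reconcile("A", delta("A", 3), delta("A", 5)))); // 8

        // The same deltas tagged with non-replica coordinator C, applied on A:
        // only the highest survives, so an increment is lost.
        System.out.println(total(reconcile("A", delta("C", 3), delta("C", 5)))); // 5
    }
}
```

In the second case the count is stuck at the highest single delta, which is
the behaviour reported for non-replica coordinators.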

 Fit partitioned counter directly into CounterColumn.value 
 --

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 
 0001-Put-partitioned-counter-directly-in-column-value.patch


 The current implementation of CounterColumn keeps both the partitioned
 counter and the total value of the counter (that is, the sum of the parts of
 the partitioned counter).
 This wastes space and requires the code to keep both representations in
 sync. This ticket proposes to remove the total value from the representation
 and to only calculate it when returning the value to the client.
 NOTE: this breaks the on-disk file format (for counters)




[jira] Commented: (CASSANDRA-1888) Replace lib/high-scale-lib.jar with equivalent from maven central repository

2011-01-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977960#action_12977960
 ] 

Eric Evans commented on CASSANDRA-1888:
---

I committed 1.1.1 from 
(http://repo1.maven.org/maven2/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar).

 Replace lib/high-scale-lib.jar with equivalent from maven central repository
 

 Key: CASSANDRA-1888
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1888
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.7.0 rc 3
Reporter: Stephen Connolly
 Fix For: 0.7.1


 As part of my effort to get Cassandra published to Maven Central, there are a 
 number of libraries which Cassandra depends on but which are not available in 
 Maven Central.
 Perhaps the most interesting of these is the Public Domain high-scale-lib.jar
 The author is an XML build tool hater (and that includes ANT), and the 
 artifact itself contains a lot of unusual cruft... .CVS folders, etc. The 
 build process uses a build.java, that effectively is a rewrite of Make in 
 java with the Makefile embedded in the build.java.
 I have rebuilt the artifacts and published them to the Maven Central 
 repository. Because the requirements for publishing to Maven Central include 
 publishing a javadoc.jar and a sources.jar with gpg signatures, etc., it was 
 easier to take the source code and transform it into a Maven project.  The 
 project is hosted at github: http://stephenc.github.com/high-scale-lib
 I have published the following versions, all signed with the 
 steph...@apache.org PGP key
 1.0.0
 1.0.1
 1.1.0
 1.1.1
 1.1.2
 These should all be equivalent to the releases by Cliff Click, with the only 
 exception being 1.1.1.
 For 1.1.1, Cliff's original build script did not run the unit tests correctly: 
 one of the unit tests consistently fails even on his build process due to an 
 invalid assumption that element ordering is preserved across serialization 
 for NonBlockingIdentityHashMap. He fixed the test in 1.1.2, so I back-ported 
 the test change. The code, however, remains as is.
 In any case, can we change the version of high-scale-lib.jar in the lib 
 directory to the version from maven central
   
 http://repo1.maven.org/maven2/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar
 [The current version used by Cassandra is 1.1.1]
 Or perhaps even consider upgrading to 1.1.2 [though I can appreciate that 
 this could be considered riskier]
 My justification for the change is so that I can be sure that consumers of a 
 Maven Central distribution of Cassandra will have exactly the same 
 dependencies, which have been tested as part of the Cassandra release 
 process, and not just the "Stephen's very damn sure they are the same" 
 dependencies ;-) 




[jira] Resolved: (CASSANDRA-1888) Replace lib/high-scale-lib.jar with equivalent from maven central repository

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans resolved CASSANDRA-1888.
---

   Resolution: Fixed
Fix Version/s: 0.7.1
 Assignee: Stephen Connolly

 Replace lib/high-scale-lib.jar with equivalent from maven central repository
 

 Key: CASSANDRA-1888
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1888
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.7.0 rc 3
Reporter: Stephen Connolly
Assignee: Stephen Connolly
 Fix For: 0.7.1





[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977965#action_12977965
 ] 

Sylvain Lebresne commented on CASSANDRA-1936:
-

You're right, I don't know what I was smoking. Actually I was still thinking in 
the context of 1546, where it's always a replica that applies the update (since 
the coordinator simply forwards the update to a replica if it's not one).

That being said, I do plan to introduce this part of 1546, because it allows us 
to rehabilitate the consistency levels. So maybe I should do that first, in 
which case I think the method here would work.

Still, I'm curious to see your patch. But I'll admit that I was actually happy 
to get rid of the updateCommutativeTypes logic.

 Fit partitioned counter directly into CounterColumn.value 
 --

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 
 0001-Put-partitioned-counter-directly-in-column-value.patch





[jira] Updated: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Kelvin Kakugawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kelvin Kakugawa updated CASSANDRA-1936:
---

Attachment: 1936-ALT-0001-lazily-materialize-value.patch

Alternate strategy to lazily materialize value.

 Fit partitioned counter directly into CounterColumn.value 
 --

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 
 0001-Put-partitioned-counter-directly-in-column-value.patch, 
 1936-ALT-0001-lazily-materialize-value.patch





[jira] Resolved: (CASSANDRA-1909) normal replication shouldn't happen on counter CFs.

2011-01-05 Thread Kelvin Kakugawa (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kelvin Kakugawa resolved CASSANDRA-1909.


Resolution: Not A Problem

 normal replication shouldn't happen on counter CFs.
 ---

 Key: CASSANDRA-1909
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1909
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Gary Dusbabek
Assignee: Kelvin Kakugawa
 Fix For: 0.8







svn commit: r1055642 - in /cassandra/branches/cassandra-0.7.0: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/java/org/apache/cassandra/utils/ test/unit/org/apache/

2011-01-05 Thread jbellis
Author: jbellis
Date: Wed Jan  5 21:18:59 2011
New Revision: 1055642

URL: http://svn.apache.org/viewvc?rev=1055642&view=rev
Log:
fix offsets to ByteBuffer.get
patch by slebresne; reviewed by jbellis for CASSANDRA-1939

Modified:
cassandra/branches/cassandra-0.7.0/CHANGES.txt

cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java

cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java

cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java

cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java

cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/marshal/TypeCompareTest.java

Modified: cassandra/branches/cassandra-0.7.0/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/CHANGES.txt?rev=1055642&r1=1055641&r2=1055642&view=diff
==
--- cassandra/branches/cassandra-0.7.0/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7.0/CHANGES.txt Wed Jan  5 21:18:59 2011
@@ -1,3 +1,7 @@
+0.7.0-final
+ * fix offsets to ByteBuffer.get (CASSANDRA-1939)
+
+
 0.7.0-rc4
  * fix cli crash after backgrounding (CASSANDRA-1875)
  * count timeouts in storageproxy latencies, and include latency 

Modified: 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==
--- 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java
 (original)
+++ 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/DeletedColumn.java
 Wed Jan  5 21:18:59 2011
@@ -55,7 +55,7 @@ public class DeletedColumn extends Colum
 @Override
 public int getLocalDeletionTime()
 {
-   return value.getInt(value.position()+value.arrayOffset());
+   return value.getInt(value.position());
 }
 
 @Override

Modified: 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==
--- 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java
 (original)
+++ 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/db/marshal/LongType.java
 Wed Jan  5 21:18:59 2011
@@ -63,7 +63,7 @@ public class LongType extends AbstractTy
 }
 
 
-        return String.valueOf(bytes.getLong(bytes.position()+bytes.arrayOffset()));
+return String.valueOf(bytes.getLong(bytes.position()));
 }
 
 public ByteBuffer fromString(String source)

Modified: 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==
--- 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java
 (original)
+++ 
cassandra/branches/cassandra-0.7.0/src/java/org/apache/cassandra/utils/UUIDGen.java
 Wed Jan  5 21:18:59 2011
@@ -56,7 +56,7 @@ public class UUIDGen
 /** creates a type 1 uuid from raw bytes. */
 public static UUID getUUID(ByteBuffer raw)
 {
-        return new UUID(raw.getLong(raw.position() + raw.arrayOffset()), raw.getLong(raw.position() + raw.arrayOffset() + 8));
+        return new UUID(raw.getLong(raw.position()), raw.getLong(raw.position() + 8));
 }
 
 /** decomposes a uuid into raw bytes. */

Modified: 
cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java?rev=1055642&r1=1055641&r2=1055642&view=diff
==
--- 
cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java
 (original)
+++ 
cassandra/branches/cassandra-0.7.0/test/unit/org/apache/cassandra/db/NameSortTest.java
 Wed Jan  5 21:18:59 2011
@@ -124,7 +124,7 @@ public class NameSortTest extends Cleanu
 assert subColumns.size() == 4;
 for (IColumn subColumn : subColumns)
 {
-            long k = subColumn.name().getLong(subColumn.name().position() + subColumn.name().arrayOffset());
+            long k = subColumn.name().getLong(subColumn.name().position());
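
The change in this commit follows from how absolute ByteBuffer reads are
indexed: getInt(int) and getLong(int) take an index relative to the start of
the buffer itself, while arrayOffset() only matters when indexing the array
returned by array() directly. The sketch below (class and method names are
made up for the demo) builds a sliced buffer with a non-zero arrayOffset()
and shows that adding the offset reads the wrong bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; only the java.nio.ByteBuffer calls are real API.
final class ByteBufferOffsetDemo {
    // Reads the long at the buffer's current position without moving it.
    // Absolute gets already index from the buffer's own start, so
    // arrayOffset() must not be added.
    static long readLongAtPosition(ByteBuffer buf) {
        return buf.getLong(buf.position());
    }

    // Builds a buffer that is a view into the middle of a larger array,
    // so arrayOffset() is 8 rather than 0.
    static ByteBuffer slicedBuffer() {
        ByteBuffer backing = ByteBuffer.allocate(24);
        backing.putLong(0, 111L);
        backing.putLong(8, 222L);
        backing.putLong(16, 333L);
        backing.position(8);
        return backing.slice(); // position 0, limit 16, arrayOffset 8
    }

    public static void main(String[] args) {
        ByteBuffer slice = slicedBuffer();
        System.out.println(slice.arrayOffset());       // 8
        System.out.println(readLongAtPosition(slice)); // 222, the intended value

        // The pre-fix expression skips 8 bytes too far and reads the next long.
        System.out.println(slice.getLong(slice.position() + slice.arrayOffset())); // 333
    }
}
```

With a larger offset the pre-fix expression would run past the limit and
throw IndexOutOfBoundsException instead of returning a wrong value.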
   

[jira] Commented: (CASSANDRA-1936) Fit partitioned counter directly into CounterColumn.value

2011-01-05 Thread Kelvin Kakugawa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977974#action_12977974
 ] 

Kelvin Kakugawa commented on CASSANDRA-1936:


The partitionedCounter of my strategy could be refactored into value, like your 
strategy.

Yeah, unfortunately, the two-step RM creation (via updateCommutativeTypes) is 
still present.

 Fit partitioned counter directly into CounterColumn.value 
 --

 Key: CASSANDRA-1936
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1936
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8

 Attachments: 
 0001-Put-partitioned-counter-directly-in-column-value.patch, 
 1936-ALT-0001-lazily-materialize-value.patch





svn commit: r1055658 - in /cassandra/branches/cassandra-0.7: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/j

2011-01-05 Thread jbellis
Author: jbellis
Date: Wed Jan  5 21:57:03 2011
New Revision: 1055658

URL: http://svn.apache.org/viewvc?rev=1055658&view=rev
Log:
merge from 0.7.0

Modified:
cassandra/branches/cassandra-0.7/   (props changed)
cassandra/branches/cassandra-0.7/CHANGES.txt

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/DeletedColumn.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/db/marshal/LongType.java

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/utils/UUIDGen.java

cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/NameSortTest.java

cassandra/branches/cassandra-0.7/test/unit/org/apache/cassandra/db/marshal/TypeCompareTest.java

Propchange: cassandra/branches/cassandra-0.7/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 21:57:03 2011
@@ -1,6 +1,6 @@
 /cassandra/branches/cassandra-0.6:922689-1055311
 /cassandra/branches/cassandra-0.7:1026516,1035666,1050269
-/cassandra/branches/cassandra-0.7.0:1053690-1054631
+/cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /cassandra/trunk:1026516-1026734,1028929
 /incubator/cassandra/branches/cassandra-0.3:774578-796573

Modified: cassandra/branches/cassandra-0.7/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/CHANGES.txt?rev=1055658&r1=1055657&r2=1055658&view=diff
==
--- cassandra/branches/cassandra-0.7/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.7/CHANGES.txt Wed Jan  5 21:57:03 2011
@@ -1,4 +1,4 @@
-dev
+0.7.1-dev
  * buffer network stack to avoid inefficient small TCP messages while avoiding
the nagle/delayed ack problem (CASSANDRA-1896)
  * check log4j configuration for changes every 10s (CASSANDRA-1525, 1907)
@@ -15,6 +15,10 @@ dev
  * distributed test harness (CASSANDRA-1859)
 
 
+0.7.0-dev
+ * fix offsets to ByteBuffer.get (CASSANDRA-1939)
+
+
 0.7.0-rc4
  * fix cli crash after backgrounding (CASSANDRA-1875)
  * count timeouts in storageproxy latencies, and include latency 

Propchange: 
cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 21:57:03 2011
@@ -1,6 +1,6 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1055311
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516,1035666,1050269
-/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1054631
+/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1026734,1028929
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573

Propchange: 
cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 21:57:03 2011
@@ -1,6 +1,6 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1055311
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516,1035666,1050269
-/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1054631
+/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1051699-1053689
 
/cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1026734,1028929
 

svn commit: r1055668 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java

2011-01-05 Thread jbellis
Author: jbellis
Date: Wed Jan  5 22:30:05 2011
New Revision: 1055668

URL: http://svn.apache.org/viewvc?rev=1055668&view=rev
Log:
revert r1053443

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java

Modified: 
cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java?rev=1055668&r1=1055667&r2=1055668&view=diff
==
--- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java (original)
+++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/service/ReadResponseResolver.java Wed Jan  5 22:30:05 2011
@@ -90,7 +90,7 @@ public class ReadResponseResolver implem
 List<ColumnFamily> versions = new ArrayList<ColumnFamily>();
 List<InetAddress> endpoints = new ArrayList<InetAddress>();
 
-// validate digests against each other; throw immediately on mismatch.
+// case 1: validate digests against each other; throw immediately on mismatch.
 // also, collects data results into versions/endpoints lists.
 //
 // results are cleared as we process them, to avoid unnecessary duplication of work
@@ -100,13 +100,20 @@ public class ReadResponseResolver implem
 {
 ReadResponse result = entry.getValue();
 Message message = entry.getKey();
-ByteBuffer resultDigest = result.isDigestQuery() ? result.digest() : ColumnFamily.digest(result.row().cf);
-if (digest == null)
-digest = resultDigest;
-else if (!digest.equals(resultDigest))
-throw new DigestMismatchException(key, digest, resultDigest);
-
-if (!result.isDigestQuery())
+if (result.isDigestQuery())
+{
+if (digest == null)
+{
+digest = result.digest();
+}
+else
+{
+ByteBuffer digest2 = result.digest();
+if (!digest.equals(digest2))
+throw new DigestMismatchException(key, digest, digest2);
+}
+}
+else
 {
 versions.add(result.row().cf);
 endpoints.add(message.getFrom());
@@ -115,8 +122,23 @@ public class ReadResponseResolver implem
 results.remove(message);
 }
 
-if (logger_.isDebugEnabled())
-logger_.debug("digests verified");
+   // If there was a digest query compare it with all the data digests
+   // If there is a mismatch then throw an exception so that read repair can happen.
+//
+// It's important to note that we do not compare the digests of multiple data responses --
+// if we are in that situation we know there was a previous mismatch and now we're doing a repair,
+// so our job is now case 2: figure out what the most recent version is and update everyone to that version.
+if (digest != null)
+{
+for (ColumnFamily cf : versions)
+{
+ByteBuffer digest2 = ColumnFamily.digest(cf);
+if (!digest.equals(digest2))
+throw new DigestMismatchException(key, digest, digest2);
+}
+if (logger_.isDebugEnabled())
+logger_.debug("digests verified");
+}
 
 ColumnFamily resolved;
 if (versions.size() > 1)
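
The two-case check described in the comments of the diff above can be sketched outside of Cassandra as follows. This is only an illustration under stated assumptions: `check_digests`, `digest_of` and `DigestMismatch` are hypothetical names, responses are modelled as `(is_digest, payload)` tuples, and MD5 stands in for whatever `ColumnFamily.digest` computes.

```python
import hashlib


class DigestMismatch(Exception):
    """Raised when replicas disagree, signalling that read repair is needed."""


def digest_of(data: bytes) -> bytes:
    # Stand-in for ColumnFamily.digest(); the use of MD5 is an assumption.
    return hashlib.md5(data).digest()


def check_digests(responses):
    """Validate a list of (is_digest, payload) responses.

    Returns the full data payloads when every digest agrees.
    """
    digest = None
    versions = []
    # Case 1: digest-only responses must agree with each other.
    for is_digest, payload in responses:
        if is_digest:
            if digest is None:
                digest = payload
            elif digest != payload:
                raise DigestMismatch()
        else:
            versions.append(payload)
    # Data responses are compared only against a digest response, never
    # against each other: if all responses carry data, a repair is already
    # in progress and the caller resolves the versions instead (case 2).
    if digest is not None:
        for data in versions:
            if digest != digest_of(data):
                raise DigestMismatch()
    return versions
```

With only data responses the function returns every version untouched, which mirrors the "case 2" path where the resolver goes on to pick the most recent version.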




svn commit: r1055669 - in /cassandra/trunk: ./ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/marshal/ src/java/org/apache/ca

2011-01-05 Thread jbellis
Author: jbellis
Date: Wed Jan  5 22:35:09 2011
New Revision: 1055669

URL: http://svn.apache.org/viewvc?rev=1055669&view=rev
Log:
merge from 0.7

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/db/DeletedColumn.java
cassandra/trunk/src/java/org/apache/cassandra/db/marshal/LongType.java

cassandra/trunk/src/java/org/apache/cassandra/service/ReadResponseResolver.java
cassandra/trunk/src/java/org/apache/cassandra/utils/UUIDGen.java
cassandra/trunk/test/unit/org/apache/cassandra/db/NameSortTest.java

cassandra/trunk/test/unit/org/apache/cassandra/db/marshal/TypeCompareTest.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 22:35:09 2011
@@ -1,6 +1,6 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1055311
-/cassandra/branches/cassandra-0.7:1026516-1055624
-/cassandra/branches/cassandra-0.7.0:1053690-1054631
+/cassandra/branches/cassandra-0.7:1026516-1055668
+/cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689
 /incubator/cassandra/branches/cassandra-0.3:774578-796573
 /incubator/cassandra/branches/cassandra-0.4:810145-834239,834349-834350

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1055669&r1=1055668&r2=1055669&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Wed Jan  5 22:35:09 2011
@@ -3,7 +3,7 @@
  * adds support for columns that act as incr/decr counters (CASSANDRA-1072)
 
 
-0.7-dev
+0.7.1-dev
  * buffer network stack to avoid inefficient small TCP messages while avoiding
the nagle/delayed ack problem (CASSANDRA-1896)
  * check log4j configuration for changes every 10s (CASSANDRA-1525, 1907)
@@ -20,6 +20,10 @@
  * distributed test harness (CASSANDRA-1859)
 
 
+0.7.0-dev
+ * fix offsets to ByteBuffer.get (CASSANDRA-1939)
+
+
 0.7.0-rc4
  * fix cli crash after backgrounding (CASSANDRA-1875)
  * count timeouts in storageproxy latencies, and include latency 

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 22:35:09 2011
@@ -1,6 +1,6 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1055311
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1055624
-/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1054631
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1055668
+/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689
 
/incubator/cassandra/branches/cassandra-0.3/interface/gen-java/org/apache/cassandra/service/Cassandra.java:774578-796573
 
/incubator/cassandra/branches/cassandra-0.4/interface/gen-java/org/apache/cassandra/service/Cassandra.java:810145-834239,834349-834350

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Wed Jan  5 22:35:09 2011
@@ -1,6 +1,6 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1055311
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1055624
-/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1053690-1054631
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1055668

[jira] Commented: (CASSANDRA-1705) CQL writes (aka UPDATE)

2011-01-05 Thread Eric Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978007#action_12978007
 ] 

Eric Evans commented on CASSANDRA-1705:
---

I'm still not sure I understand how this is an improvement, perhaps it's just a 
matter of taste.  I can't see how it hurts anything either though so it's 
committed.  Thanks Pavel!

 CQL writes (aka UPDATE)
 ---

 Key: CASSANDRA-1705
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1705
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: CASSANDRA-1705.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for data manipulation.
 This corresponds to the following RPC methods:
 * insert()
 * batch_mutate() (writes, not deletes)
 The initial check-in to trunk/ uses a syntax that looks like:
 {code:SQL}
 UPDATE CF [USING CONSISTENCY.LVL] WITH ROW(key, COLUMN(name, 
 value)[, COLUMN(...)])[ AND ROW(...)];
 {code}
 Where:
 * CF is the column family name.
 * Rows are parenthesized expressions with comma-separated arguments for a 
 key and one or more columns.
 * Columns are parenthesized expressions with comma-separated arguments for 
 the name and value (timestamp is inaccessible).
 What is still undone:
 * Complete test coverage
 And of course, all of this is still very much open to further discussion.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1705) CQL writes (aka UPDATE)

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans updated CASSANDRA-1705:
--

Assignee: (was: Pavel Yaskevich)

 CQL writes (aka UPDATE)
 ---

 Key: CASSANDRA-1705
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1705
 Project: Cassandra
  Issue Type: Sub-task
  Components: API
Affects Versions: 0.8
Reporter: Eric Evans
Priority: Minor
 Fix For: 0.8

 Attachments: CASSANDRA-1705.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 CQL specification and implementation for data manipulation.
 This corresponds to the following RPC methods:
 * insert()
 * batch_mutate() (writes, not deletes)
 The initial check-in to trunk/ uses a syntax that looks like:
 {code:SQL}
 UPDATE CF [USING CONSISTENCY.LVL] WITH ROW(key, COLUMN(name, 
 value)[, COLUMN(...)])[ AND ROW(...)];
 {code}
 Where:
 * CF is the column family name.
 * Rows are parenthesized expressions with comma-separated arguments for a 
 key and one or more columns.
 * Columns are parenthesized expressions with comma-separated arguments for 
 the name and value (timestamp is inaccessible).
 What is still undone:
 * Complete test coverage
 And of course, all of this is still very much open to further discussion.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



svn commit: r1055677 - /cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

2011-01-05 Thread eevans
Author: eevans
Date: Wed Jan  5 22:55:24 2011
New Revision: 1055677

URL: http://svn.apache.org/viewvc?rev=1055677&view=rev
Log:
move term-pair parse and map update to separate rule

Patch by Pavel Yaskevich (w/ minor changes); reviewed by eevans for 
CASSANDRA-1705

Modified:
cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g

Modified: cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g?rev=1055677&r1=1055676&r2=1055677&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cql/Cql.g Wed Jan  5 22:55:24 2011
@@ -127,7 +127,7 @@ updateStatement returns [UpdateStatement
   }
   K_UPDATE columnFamily=IDENT
   (K_USING K_CONSISTENCY '.' K_LEVEL { cLevel = ConsistencyLevel.valueOf($K_LEVEL.text); })?
-  K_SET c1=term '=' v1=term { columns.put(c1, v1); } (',' cN=term '=' vN=term { columns.put(cN, vN); })*
+  K_SET termPair[columns] (',' termPair[columns])*
   K_WHERE K_KEY '=' key=term endStmnt
   {
   return new UpdateStatement($columnFamily.text, cLevel, columns, key);
@@ -172,6 +172,11 @@ termList returns [List<Term> items]
   t1=term { $items.add(t1); } (',' tN=term { $items.add(tN); })*
 ;
 
+// term = term
+termPair[Map<Term, Term> columns]
+:   key=term '=' value=term { columns.put(key, value); }
+;
+
 // Note: ranges are inclusive so >= and >, and < and <= all have the same semantics.
 relation returns [Relation rel]
 : { Term entity = new Term("KEY", STRING_LITERAL); }




[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future

2011-01-05 Thread Ryan King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978029#action_12978029
 ] 

Ryan King commented on CASSANDRA-1935:
--

It seems like we should probably abort in this case, but that might be a bit 
draconian.

 Refuse to open SSTables from the future
 ---

 Key: CASSANDRA-1935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Priority: Minor
 Fix For: 0.8


 If somebody has rolled back to a previous version of Cassandra that is unable 
 to read an SSTable written by a future version correctly (indicated by a 
 version change), failing fast is safer than accidentally performing a 
 compaction that rewrites incorrect data and leaves you in an odd state.
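
A minimal sketch of the fail-fast check proposed in this ticket, for illustration only. The function and exception names are hypothetical, and the sketch assumes that SSTable format versions are short strings whose lexicographic order matches release order; the ticket itself does not specify how versions are compared.

```python
class FutureSSTableError(Exception):
    """Raised when an SSTable was written by a newer format version."""


def ensure_not_from_future(sstable_version: str, supported_version: str) -> None:
    # Assumption: version strings compare lexicographically in release order.
    if sstable_version > supported_version:
        raise FutureSSTableError(
            "SSTable version %r is newer than the supported version %r"
            % (sstable_version, supported_version)
        )
```

Calling this before an SSTable is opened (and so before any compaction can touch it) is what turns a silent rewrite of misread data into an immediate, visible failure.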

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[Cassandra Wiki] Update of Operations by BrandonWilliams

2011-01-05 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The Operations page has been changed by BrandonWilliams.
The comment on this change is: Update python to start with token 0.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=72&rev2=73

--

  Here's a python program which can be used to calculate new tokens for the 
nodes. There's more info on the subject at Ben Black's presentation at 
Cassandra Summit 2010. 
http://www.riptano.com/blog/slides-and-videos-cassandra-summit-2010
  
def tokens(nodes):
-   for i in range(1, nodes + 1): 
+   for x in xrange(nodes): 
-   print (i * (2 ** 127 - 1) / nodes)
+   print 2 ** 127 / nodes * x
  
  There's also `nodetool loadbalance`: essentially a convenience over 
decommission + bootstrap, only instead of telling the target node where to move 
on the ring it will choose its location based on the same heuristic as Token 
selection on bootstrap. You should not use this as it doesn't rebalance the 
entire ring.
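
For readers on Python 3, the updated token program from the wiki diff above could be written as below. This is a sketch rather than the wiki's text: it returns a list instead of printing inside the function, and it uses `//` so the division stays an exact integer division, as it was for Python 2 integers.

```python
def tokens(nodes):
    """Return evenly spaced initial tokens for `nodes` nodes, starting at 0."""
    # Integer division keeps the arithmetic exact on Python 3.
    return [2 ** 127 // nodes * x for x in range(nodes)]


if __name__ == "__main__":
    for token in tokens(4):
        print(token)
```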
  


[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978063#action_12978063
 ] 

Jonathan Ellis commented on CASSANDRA-1935:
---

Agreed that we should abort startup.  (Isn't that what fail fast means?)

 Refuse to open SSTables from the future
 ---

 Key: CASSANDRA-1935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Priority: Minor
 Fix For: 0.8


 If somebody has rolled back to a previous version of Cassandra that is unable 
 to read an SSTable written by a future version correctly (indicated by a 
 version change), failing fast is safer than accidentally performing a 
 compaction that rewrites incorrect data and leaves you in an odd state.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (CASSANDRA-1942) upgrade to high-scale-lib (home of NBHM and NBHS) 1.1.2

2011-01-05 Thread Jonathan Ellis (JIRA)
upgrade to high-scale-lib (home of NBHM and NBHS) 1.1.2
---

 Key: CASSANDRA-1942
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1942
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.8


Stephen Connolly gives a summary of changes in CASSANDRA-1888.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1888) Replace lib/high-scale-lib.jar with equivalent from maven central repository

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978066#action_12978066
 ] 

Jonathan Ellis commented on CASSANDRA-1888:
---

created CASSANDRA-1942 to upgrade to 1.1.2 in 0.8

 Replace lib/high-scale-lib.jar with equivalent from maven central repository
 

 Key: CASSANDRA-1888
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1888
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.7.0 rc 3
Reporter: Stephen Connolly
Assignee: Stephen Connolly
 Fix For: 0.7.1


 As part of my effort to get Cassandra published to Maven Central, there are a 
 number of libraries which Cassandra depends on but which are not available in 
 Maven Central.
 Perhaps the most interesting of these is the Public Domain high-scale-lib.jar
 The author is an XML build tool hater (and that includes ANT), and the 
 artifact itself contains a lot of unusual cruft... .CVS folders, etc. The 
 build process uses a build.java, that effectively is a rewrite of Make in 
 java with the Makefile embedded in the build.java.
 I have rebuilt the artifacts and published them to the Maven Central 
 repository. Because the requirements for publishing to Maven Central include 
 a javadoc.jar and a sources.jar with gpg signatures, etc., it was 
 easier to take the source code and transform it into a Maven project.  The 
 project is hosted at github: http://stephenc.github.com/high-scale-lib
 I have published the following versions, all signed with the 
 steph...@apache.org PGP key
 1.0.0
 1.0.1
 1.1.0
 1.1.1
 1.1.2
 These should all be equivalent to the releases by Cliff Click, with the only 
 exception being 1.1.1.
 For 1.1.1, Cliff's original build script did not run the unit tests correctly: 
 one of the unit tests consistently fails even on his build process due to an 
 invalid assumption that element ordering is preserved across serialization 
 for NonBlockingIdentityHashMap. He fixed the test in 1.1.2, so I back-ported 
 the test change. The code, however, remains as is.
 In any case, can we change the version of high-scale-lib.jar in the lib 
 directory to the version from maven central
   
 http://repo1.maven.org/maven2/com/github/stephenc/high-scale-lib/high-scale-lib/1.1.1/high-scale-lib-1.1.1.jar
 [The current version used by Cassandra is 1.1.1]
 Or perhaps even consider upgrading to 1.1.2 [though I can appreciate that 
 this could be considered riskier]
 My justification for the change is so that I can be sure that consumers of a 
 Maven Central distribution of Cassandra will have exactly the same 
 dependencies, which have been tested as part of the Cassandra release 
 process, and not just the "Stephen's very damn sure they are the same" 
 dependencies ;-) 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1939) Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the index)

2011-01-05 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-1939:
--

 Reviewer: jbellis
Affects Version/s: (was: 0.7.0)
   (was: 0.7.0 rc 3)
   (was: 0.7.0 rc 2)
   (was: 0.7.0 rc 1)
   (was: 0.7.1)
   (was: 0.8)
   0.7 beta 3

committed

 Misuses of ByteBuffer absolute get (wrongfully adding arrayOffset to the 
 index)
 ---

 Key: CASSANDRA-1939
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1939
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7 beta 3
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.0

 Attachments: 
 0001-Remove-addition-of-arrayOffset-in-ByteBuffer-absolut.patch


 ByteBuffer.arrayOffset() should not be added to the argument of an absolute 
 get. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



svn commit: r1055703 - /cassandra/trunk/doc/cql/CQL.textile

2011-01-05 Thread eevans
Author: eevans
Date: Thu Jan  6 01:33:24 2011
New Revision: 1055703

URL: http://svn.apache.org/viewvc?rev=1055703&view=rev
Log:
code markup inside a link upsets mylyn

Patch by eevans

Modified:
cassandra/trunk/doc/cql/CQL.textile

Modified: cassandra/trunk/doc/cql/CQL.textile
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/doc/cql/CQL.textile?rev=1055703&r1=1055702&r2=1055703&view=diff
==
--- cassandra/trunk/doc/cql/CQL.textile (original)
+++ cassandra/trunk/doc/cql/CQL.textile Thu Jan  6 01:33:24 2011
@@ -127,7 +127,7 @@ h3. Specifying Columns
 bc. 
 DELETE [COLUMNS] ...
 
-Following the @DELETE@ keyword is an optional comma-delimited list of column name terms. When no column names are specified, the remove applies to the entire row(s) matched by the "@WHERE@ clause":#deleterows
+Following the @DELETE@ keyword is an optional comma-delimited list of column name terms. When no column names are specified, the remove applies to the entire row(s) matched by the "WHERE clause":#deleterows
 
 h3. Column Family
 




[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future

2011-01-05 Thread Ryan King (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978076#action_12978076
 ] 

Ryan King commented on CASSANDRA-1935:
--

What about scenarios outside startup, like streaming?

 Refuse to open SSTables from the future
 ---

 Key: CASSANDRA-1935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Priority: Minor
 Fix For: 0.8


 If somebody has rolled back to a previous version of Cassandra that is unable 
 to read an SSTable written by a future version correctly (indicated by a 
 version change), failing fast is safer than accidentally performing a 
 compaction that rewrites incorrect data and leaves you in an odd state.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-1935) Refuse to open SSTables from the future

2011-01-05 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978095#action_12978095
 ] 

Jonathan Ellis commented on CASSANDRA-1935:
---

Streaming mostly doesn't work across different versions anyway, so I would be 
in favor of gossiping the Cassandra version and requiring matching versions to 
stream.

 Refuse to open SSTables from the future
 ---

 Key: CASSANDRA-1935
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1935
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Priority: Minor
 Fix For: 0.8


 If somebody has rolled back to a previous version of Cassandra that is unable 
 to read an SSTable written by a future version correctly (indicated by a 
 version change), failing fast is safer than accidentally performing a 
 compaction that rewrites incorrect data and leaves you in an odd state.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming

2011-01-05 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-1943:


Fix Version/s: (was: 0.7.0)
   0.7.1

 Addition of internode buffering broke Streaming
 ---

 Key: CASSANDRA-1943
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood
Priority: Critical
 Fix For: 0.7.1


 Adding internode buffering broke StreamingTransferTest in the 0.7.0 branch. 
 Bisected to r1055313

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (CASSANDRA-1943) Addition of internode buffering broke Streaming

2011-01-05 Thread Stu Hood (JIRA)
Addition of internode buffering broke Streaming
---

 Key: CASSANDRA-1943
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood
Priority: Critical
 Fix For: 0.7.0


Adding internode buffering broke StreamingTransferTest in the 0.7.0 branch. 
Bisected to r1055313

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming

2011-01-05 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1943:


Description: Adding internode buffering broke StreamingTransferTest in the 
0.7 branch. Bisected to r1055313  (was: Adding internode buffering broke 
StreamingTransferTest in the 0.7.0 branch. Bisected to r1055313)

 Addition of internode buffering broke Streaming
 ---

 Key: CASSANDRA-1943
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood
Priority: Critical
 Fix For: 0.7.1


 Adding internode buffering broke StreamingTransferTest in the 0.7 branch. 
 Bisected to r1055313

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming

2011-01-05 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1943:


Attachment: 0001-Don-t-begin-buffering-a-connection-until-we-ve-determi.txt

Streaming connections don't use the InputStream implementation to read their 
data: they bypass all buffering and use the SocketChannel directly. By 
buffering immediately after opening the connection, we were pulling the 
beginning of the streamed file into the buffer, where the streaming code never 
saw it.
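
The failure mode described in this comment can be reproduced in miniature with Python's `io` module. This is an analogy, not the Cassandra code: `io.BytesIO` stands in for the socket, a hypothetical 4-byte header stands in for the connection preamble, and the function names are invented for the illustration.

```python
import io

HEADER_SIZE = 4  # hypothetical size of the connection header


def read_header_buffered(raw):
    """Read the header through a buffer, then read the rest from the raw stream."""
    buffered = io.BufferedReader(raw)
    header = buffered.read(HEADER_SIZE)
    # The buffer has already read ahead past the header, so the raw
    # stream no longer holds the start of the streamed file.
    rest = raw.read()
    buffered.detach()  # leave the raw stream open for the caller
    return header, rest


def read_header_unbuffered(raw):
    """Read the header directly, leaving the file contents on the raw stream."""
    header = raw.read(HEADER_SIZE)
    rest = raw.read()
    return header, rest
```

In the buffered variant the bytes after the header end up inside the buffer, so a consumer that reads from the raw stream directly finds nothing left. Deferring the buffering until the connection type is known, as the attached patch's name suggests, avoids that.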

 Addition of internode buffering broke Streaming
 ---

 Key: CASSANDRA-1943
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood
Priority: Critical
 Fix For: 0.7.1

 Attachments: 
 0001-Don-t-begin-buffering-a-connection-until-we-ve-determi.txt


 Adding internode buffering broke StreamingTransferTest in the 0.7 branch. 
 Bisected to r1055313

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1943) Addition of internode buffering broke Streaming

2011-01-05 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1943:


Fix Version/s: 0.8

Also affects trunk.

 Addition of internode buffering broke Streaming
 ---

 Key: CASSANDRA-1943
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1943
 Project: Cassandra
  Issue Type: Bug
Reporter: Stu Hood
Priority: Critical
 Fix For: 0.7.1, 0.8

 Attachments: 
 0001-Don-t-begin-buffering-a-connection-until-we-ve-determi.txt


 Adding internode buffering broke StreamingTransferTest in the 0.7 branch. 
 Bisected to r1055313

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Assigned: (CASSANDRA-1848) Separate thrift and avro classes from cassandra's jar

2011-01-05 Thread Eric Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Evans reassigned CASSANDRA-1848:
-

Assignee: Eric Evans

 Separate thrift and avro classes from cassandra's jar
 -

 Key: CASSANDRA-1848
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1848
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging
Affects Versions: 0.7.0 rc 2
Reporter: Tristan Tarrant
Assignee: Eric Evans
Priority: Trivial
 Fix For: 0.8

 Attachments: CASSANDRA-1848.patch, CASSANDRA-1848_with_hadoop.patch

   Original Estimate: 0h
  Remaining Estimate: 0h

 Most client applications written in Java include the full 
 apache-cassandra-x.y.z.jar in their classpath. I propose to separate the avro 
 and thrift classes into separate jars.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (CASSANDRA-1472) Add bitmap secondary indexes

2011-01-05 Thread Stu Hood (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stu Hood updated CASSANDRA-1472:


Attachment: 0.7-1472-v5.tgz

Attaching a version of 1472-v5 rebased for the 0.7 branch.

 Add bitmap secondary indexes
 

 Key: CASSANDRA-1472
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Stu Hood
Assignee: Stu Hood
 Fix For: 0.7.1

 Attachments: 0.7-1472-v5.tgz, 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, 
 anatomy.png, v4-bench-c32.txt


 Bitmap indexes are a very efficient structure for dealing with immutable 
 data. We can take advantage of the fact that SSTables are immutable by 
 attaching them directly to SSTables as a new component (supported by 
 CASSANDRA-1471).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (CASSANDRA-674) New SSTable Format

2011-01-05 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978155#action_12978155
 ] 

Stu Hood commented on CASSANDRA-674:


 Indexes for individual rows are gone, since the global index allows random 
 access...
 ^ This wouldn't be useful to cache? in the situation you only want a small 
 range of columns?
That information is outdated: it's from the original implementation. But yes... 
we will want to keep the index in app memory or page cache.

> Roughly how large would the actual chunk be? This is the unit of 
> deserialization right?
The span is the unit of deserialization (made up of at most 1 chunk per level), 
and its size would be 100% configurable. The main question is how frequently to 
index the spans in the sstable index: does each span get an index entry, or 
only the first span of a row (our approach in the current implementation)?

> So if you are doing a range query on a very wide row how do you know when to 
> stop processing chunks?
By looking at the global index: if all spans get entries in the index, you know 
the last interesting span.
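
One way to picture that stopping rule, as a minimal sketch assuming every span gets an index entry keyed by the first column name it contains. The class `SpanIndexSketch`, its methods, and the use of string names and file offsets are all hypothetical, not taken from the patch.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch: a global index with one entry per span, keyed by the
// first column name stored in that span and mapping to a file offset.
class SpanIndexSketch {
    private final NavigableMap<String, Long> index = new TreeMap<>();

    void addSpan(String firstName, long fileOffset) {
        index.put(firstName, fileOffset);
    }

    // Spans that may hold names in [start, end]. The scan begins at the last
    // span starting at or before `start` and ends at the last span starting
    // at or before `end`, so the reader knows where to stop without reading on.
    NavigableMap<String, Long> spansFor(String start, String end) {
        if (index.isEmpty() || start.compareTo(end) > 0) {
            return new TreeMap<>();
        }
        String from = index.floorKey(start);
        if (from == null) {
            from = index.firstKey();   // range begins before the first span
        }
        if (from.compareTo(end) > 0) {
            return new TreeMap<>();    // range ends before the first span
        }
        return index.subMap(from, true, end, true);
    }
}
```

The last key of the returned map identifies the last interesting span; if only the first span of each row were indexed, the reader would instead have to inspect spans one by one.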

> Let me know if this is wrong, but this design opens the cassandra data model 
> to contain arbitrarily nested data.
> Given the complexity we already have surrounding the supercolumn concept do 
> you think this is the right way forward? 
The super column concept is only confusing _because_ we call them 
supercolumns rather than just calling them compound column names. People 
use them, and the consensus I've heard is that they are useful.

> If we assume we keep the datamodel as is how can we simplify the open 
> ended-ness of your design to make the approach fit our current data model.
The only difference is what you call the structures, and whether you put 
arbitrary limits on the nesting: I'm open to suggestions.

 New SSTable Format
 ------------------

         Key: CASSANDRA-674
         URL: https://issues.apache.org/jira/browse/CASSANDRA-674
     Project: Cassandra
  Issue Type: Improvement
  Components: Core
    Reporter: Stu Hood
     Fix For: 0.8

 Attachments: 674-v1.diff, perf-674-v1.txt,
              perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt


 Various tickets exist due to limitations in the SSTable file format, 
 including #16, #47 and #328. Attached is a proposed design/implementation of 
 a new file format for SSTables that addresses a few of these limitations. The 
 implementation has a bunch of issues/fixmes, which I'll describe in the 
 comments.
 The file format is described in the javadoc for the o.a.c.io.SSTableWriter 
 class, but briefly:
  * Blocks are opaque (except for their header) so that they can be 
 compressed. The index file contains an entry for the first key in every 
 Block. Blocks contain Slices.
  * Slices are series of columns with the same parents and (deletion) 
 metadata. They can be used to represent ColumnFamilies or SuperColumns (or a 
 slice of columns at any other depth). A single CF can be split across 
 multiple Slices, which can be split across multiple blocks.
  * Neither Slices nor Blocks have a fixed size or maximum length, but they 
 each have target lengths which can be stretched and broken by very large 
 columns.
 The most interesting concepts from this patch are:
  * Block compression is possible (currently using GZIP, which has one bug 
 mentioned in the comments),
  * Compaction involves merging intersecting Slices from input SSTables. Since 
 large rows will be broken down into multiple slices, only the portions of 
 rows that intersect between tables need to be 
 deserialized/merged/held-in-memory,
  * Indexes for individual rows are gone, since the global index allows random 
 access to the middle of column families that span Blocks, and Slices allow 
 batches of columns to be skipped within a Block.
  * Bloom filters for individual rows are gone, and the global filter contains 
 ColumnKeys instead, meaning that a query for a column that doesn't exist in a 
 row that does will often not need to seek to the row.
  * Metadata (deletion/gc time) and ColumnKeys (key, colname1, colname2...) 
 for columns are defined recursively, so deeply nested slices are possible,
  * Slices representing a single parent (CF, SC, etc) can have different 
 Metadata, meaning that a tombstone Slice from d-f could sit between Slices 
 containing columns a-c and g-h. This allows for eventually consistent range 
 deletes of columns.
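
The last bullet above (a tombstone Slice from d-f sitting between Slices a-c and g-h) can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's implementation: the class `SliceMetadataSketch`, the `NOT_DELETED` sentinel, and the rule that a tombstone shadows writes with an equal or older timestamp are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: Slices under one parent, each carrying its own
// deletion metadata, so a deleted range can sit between live ranges.
class SliceMetadataSketch {
    static final long NOT_DELETED = Long.MIN_VALUE;

    static final class Slice {
        final String first;     // first column name covered (inclusive)
        final String last;      // last column name covered (inclusive)
        final long deletedAt;   // tombstone timestamp, or NOT_DELETED

        Slice(String first, String last, long deletedAt) {
            this.first = first;
            this.last = last;
            this.deletedAt = deletedAt;
        }

        boolean covers(String name) {
            return first.compareTo(name) <= 0 && name.compareTo(last) <= 0;
        }
    }

    private final List<Slice> slices = new ArrayList<>();

    void add(String first, String last, long deletedAt) {
        slices.add(new Slice(first, last, deletedAt));
    }

    // Assumption: a write is shadowed when a covering slice carries a
    // tombstone at least as new as the write.
    boolean isShadowed(String name, long writeTime) {
        for (Slice slice : slices) {
            if (slice.covers(name)
                    && slice.deletedAt != NOT_DELETED
                    && slice.deletedAt >= writeTime) {
                return true;
            }
        }
        return false;
    }
}
```

Because the tombstone is attached to a range rather than to an exact name, a later write to a name inside d-f with a newer timestamp is still visible, which is what makes the range delete eventually consistent.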




[jira] Issue Comment Edited: (CASSANDRA-674) New SSTable Format

2011-01-05 Thread Stu Hood (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12978155#action_12978155
 ] 

Stu Hood edited comment on CASSANDRA-674 at 1/6/11 1:17 AM:


> Indexes for individual rows are gone, since the global index allows random 
> access...
> ^ This wouldn't be useful to cache? in the situation you only want a small 
> range of columns?
That information is outdated: it's from the original implementation. But yes... 
we will want to keep the index in app memory or page cache.

> Roughly how large would the actual chunk be? This is the unit of 
> deserialization right?
The span is the unit of deserialization (made up of at most 1 chunk per level), 
and its size would be 100% configurable. The main question is how frequently to 
index the spans in the sstable index: does each span get an index entry, or 
only the first span of a row (our approach in the current implementation)?

EDIT: Sorry... the span is symbolic: you would deserialize the first chunk of 
the span (containing the keys) to decide whether to skip the rest of the chunks 
in the span.
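
A minimal sketch of that skip decision, assuming a span whose first chunk holds the keys and whose remaining chunk holds values at matching positions. The class `SpanSkipSketch`, the two-chunk layout, and the positional alignment are hypothetical simplifications for illustration.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: read the key chunk of a span first, and decode the
// remaining chunk only if the wanted key is actually present.
class SpanSkipSketch {
    static final class Span {
        final List<String> keyChunk;               // already deserialized
        final Supplier<List<String>> valueChunk;   // deserialized on demand

        Span(List<String> keyChunk, Supplier<List<String>> valueChunk) {
            this.keyChunk = keyChunk;
            this.valueChunk = valueChunk;
        }
    }

    // Returns the value stored for `key`, or null if the span lacks the key.
    static String lookup(Span span, String key) {
        int position = span.keyChunk.indexOf(key);
        if (position < 0) {
            return null;                           // skip the rest of the span
        }
        return span.valueChunk.get().get(position);
    }
}
```

The supplier stands in for whatever I/O and decompression the later chunks would need; a miss in the key chunk never invokes it.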

> So if you are doing a range query on a very wide row how do you know when to 
> stop processing chunks?
By looking at the global index: if all spans get entries in the index, you know 
the last interesting span.

> Let me know if this is wrong, but this design opens the cassandra data model 
> to contain arbitrarily nested data.
> Given the complexity we already have surrounding the supercolumn concept do 
> you think this is the right way forward? 
The super column concept is only confusing _because_ we call them 
supercolumns rather than just calling them compound column names. People 
use them, and the consensus I've heard is that they are useful.

> If we assume we keep the datamodel as is how can we simplify the open 
> ended-ness of your design to make the approach fit our current data model.
The only difference is what you call the structures, and whether you put 
arbitrary limits on the nesting: I'm open to suggestions.

 New SSTable Format
 ------------------

         Key: CASSANDRA-674
         URL: https://issues.apache.org/jira/browse/CASSANDRA-674
     Project: Cassandra
  Issue Type: Improvement
  Components: Core
    Reporter: Stu Hood
     Fix For: 0.8

 Attachments: 674-v1.diff, perf-674-v1.txt,
              perf-trunk-2f3d2c0e4845faf62e33c191d152cb1b3fa62806.txt


 Various tickets exist due to limitations in the SSTable file format, 
 including #16, #47 and #328. Attached is a proposed design/implementation of 
 a new file format for SSTables that addresses a few of these limitations. The 
 implementation has a bunch of issues/fixmes, which I'll describe in the 
 comments.
 The file format is described in the javadoc for the o.a.c.io.SSTableWriter 
 class, but briefly:
  * Blocks are opaque (except for their header) so that they can be 
 compressed. The index file contains an entry for the first key in every 
 Block. Blocks contain Slices.
  * Slices are series of columns with the same parents and (deletion) 
 metadata. They can be used to represent ColumnFamilies or