[jira] [Created] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)
Enormous counter 
-

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen


I have a two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [datacenter1:2]
  Column Families:
ColumnFamily: testCounter (Super)
APP status information.
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: 
org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Built indexes: []

Then, I use a test program based on hector to add to a counter column 
(testCounter[sc][column]) 1000 times. In the middle of the adding process, I 
intentionally shut down the node 172.17.19.152. In addition, the test program 
is smart enough to switch the consistency level from Quorum to One, so that the 
subsequent add operations do not fail. 

After all the add operations are done, I start cassandra on 172.17.19.152 and 
use cassandra-cli to check whether the counter is correct on both nodes. I get 
a result of 1001, which seems reasonable because hector will retry once. 
However, I then shut down 172.17.19.151, and after 172.17.19.152 becomes aware 
that 172.17.19.151 is down, I start cassandra on 172.17.19.151 again. When I 
check the counter this time, I get a result of 481387, which is wildly wrong.

I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and 
earlier. 
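
For illustration, here is a minimal sketch of the add-and-retry pattern 
described above; CounterClient and its add() method are hypothetical stand-ins, 
not the actual hector API:

{code}
// Hypothetical client interface used only to illustrate why a retried,
// non-idempotent counter add can over-count.
enum ConsistencyLevel { QUORUM, ONE }

class TimedOutException extends Exception {}

interface CounterClient
{
    void add(String key, String superColumn, String column, long delta,
             ConsistencyLevel cl) throws TimedOutException;
}

class CounterAddLoop
{
    static void run(CounterClient client)
    {
        ConsistencyLevel cl = ConsistencyLevel.QUORUM;
        for (int i = 0; i < 1000; i++)
        {
            try
            {
                client.add("key", "sc", "column", 1, cl);
            }
            catch (TimedOutException e)
            {
                // The timed-out add may already have been applied on the live
                // replica, so retrying it once can count the same +1 twice.
                cl = ConsistencyLevel.ONE;   // downgrade and retry once
                try
                {
                    client.add("key", "sc", "column", 1, cl);
                }
                catch (TimedOutException ignored) {}
            }
        }
    }
}
{code}

A single retried, non-idempotent add like this explains an off-by-one result 
such as 1001, but nothing in the client-side pattern accounts for a value like 
481387.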

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081545#comment-13081545
 ] 

Boris Yen commented on CASSANDRA-3006:
--

I forgot to mention that the counter is out of sync between the two nodes: one 
shows 481387 and the other shows 20706.

 Enormous counter 
 -

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen

 I have two-node cluster with the following keyspace and column family 
 settings.
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
 Keyspace: test:
   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
   Durable Writes: true
 Options: [datacenter1:2]
   Column Families:
 ColumnFamily: testCounter (Super)
 APP status information.
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.CounterColumnType
   Columns sorted by: 
 org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: true
   Built indexes: []
 Then, I use a test program based on hector to add a counter column 
 (testCounter[sc][column]) 1000 times. In the middle the adding process, I 
 intentional shut down the node 172.17.19.152. In addition to that, the test 
 program is smart enough to switch the consistency level from Quorum to One, 
 so that the following adding actions would not fail. 
 After all the adding actions are done, I start the cassandra on 
 172.17.19.152, and I use cassandra-cli to check if the counter is correct on 
 both nodes, and I got a result 1001 which should be reasonable because hector 
 will retry once. However, when I shut down 172.17.19.151 and after 
 172.17.19.152 is aware of 172.17.19.151 is down, I try to start the cassandra 
 on 172.17.19.151 again. Then, I check the counter again, this time I got a 
 result 481387 which is so wrong.
 I use 0.8.3 the reproduce this bug, but I think this also happens on 0.8.2 or 
 before also. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boris Yen updated CASSANDRA-3006:
-

Description: 
I have a two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [datacenter1:2]
  Column Families:
ColumnFamily: testCounter (Super)
APP status information.
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: 
org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Built indexes: []

Then, I use a test program based on hector to add to a counter column 
(testCounter[sc][column]) 1000 times. In the middle of the adding process, I 
intentionally shut down the node 172.17.19.152. In addition, the test program 
is smart enough to switch the consistency level from Quorum to One, so that the 
subsequent add operations do not fail. 

After all the add operations are done, I start cassandra on 172.17.19.152 and 
use cassandra-cli to check whether the counter is correct on both nodes. I get 
a result of 1001, which seems reasonable because hector will retry once. 
However, I then shut down 172.17.19.151, and after 172.17.19.152 becomes aware 
that 172.17.19.151 is down, I start cassandra on 172.17.19.151 again. When I 
check the counter this time, I get a result of 481387, which is wildly wrong.

I used 0.8.3 to reproduce this bug, but I think it also happens on 0.8.2 and 
earlier. 

  was:
I have two-node cluster with the following keyspace and column family settings.

Cluster Information:
   Snitch: org.apache.cassandra.locator.SimpleSnitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions: 
63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]

Keyspace: test:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
Options: [datacenter1:2]
  Column Families:
ColumnFamily: testCounter (Super)
APP status information.
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.CounterColumnType
  Columns sorted by: 
org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Built indexes: []

Then, I use a test program based on hector to add a counter column 
(testCounter[sc][column]) 1000 times. In the middle the adding process, I 
intentional shut down the node 172.17.19.152. In addition to that, the test 
program is smart enough to switch the consistency level from Quorum to One, so 
that the following adding actions would not fail. 

After all the adding actions are done, I start the cassandra on 172.17.19.152, 
and I use cassandra-cli to check if the counter is correct on both nodes, and I 
got a result 1001 which should be reasonable because hector will retry once. 
However, when I shut down 172.17.19.151 and after 172.17.19.152 is aware of 
172.17.19.151 is down, I try to start the cassandra on 172.17.19.151 again. 
Then, I check the counter again, this time I got a result 481387 which is so 
wrong.

I use 0.8.3 the reproduce this bug, but I think this also happens on 0.8.2 or 
before also. 


 Enormous counter 
 -

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen

 I have two-node cluster with the following keyspace and column family 
 settings.
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   63fda700-c243-11e0--2d03dcafebdf: 

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2843:


Attachment: 2843_h.patch

bq. the IColumnMap name when it does not implement Map interface, and some 
things it has in common with Map (iteration) it changes semantics of (iterating 
values instead of keys). not sure what to use instead though, since we already 
have an IColumnContainer. Maybe ISortedColumns?

Yeah, I'm not sure I have a better name either, maybe ISortedColumnHolder, but 
I'm not sure it's better than ISortedColumns, so the attached rebased patch 
simply renames ColumnMap -> SortedColumns.

bq. TSCM and ALCM extending instead of wrapping CSLM/AL, respectively

The idea was to save one object creation. I admit this is probably not a huge 
deal, but it felt like extending instead of wrapping was no big deal in this 
case either, so it seemed worth optimizing. I still stand by that choice, but I 
have no good argument against the criticism that it is possibly premature.

bq. unrelated reformatting

If we're talking about the ones in SuperColumn.java, sorry, I mistakenly forced 
re-indentation on the file, which rewrote the tabs to spaces. The new patch 
keeps the old formatting.  I'd also mention that there are a few places where 
I've rewritten cf.getSortedColumns().iterator() to cf.iterator(), which is 
arguably a bit gratuitous for this patch, but I figured it avoids creating a new 
Collection in the case of CSLM and there aren't many occurrences.


 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Fix For: 1.0

 Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, 
 fix.diff, microBenchmark.patch, patch_timing, std_timing


 currently if a row contains > 1000 columns, the run time becomes considerably 
 slow (my test of a row with 3000 columns (standard, regular), each with 8 bytes 
 in name and 40 bytes in value, is about 16ms).
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map 
 data structure in the interim and finally convert it to a list. on the 
 synchronization side, since the returned CF is never going to be 
 shared/modified by other threads, we know the access is always single-threaded, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily but takes a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration for now; I will work on it further 
 once we agree on the general direction. 
 CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() have an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve, 
 descending or ascending, but ascending works for now). so the current logic is 
 simply to compare the new column against the last column in the array: if the 
 names are not equal, append; if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide 
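
A minimal, self-contained sketch of the append-or-reconcile logic described 
above, assuming columns always arrive in ascending name order; Column and 
ArrayBackedColumns are illustrative stand-ins, not the attached FastColumnFamily:

{code}
import java.util.ArrayList;
import java.util.List;

final class Column
{
    final String name;
    final byte[] value;
    final long timestamp;

    Column(String name, byte[] value, long timestamp)
    {
        this.name = name; this.value = value; this.timestamp = timestamp;
    }

    // Keep the column with the higher timestamp, as in read reconciliation.
    Column reconcile(Column other)
    {
        return other.timestamp > this.timestamp ? other : this;
    }
}

final class ArrayBackedColumns
{
    private final List<Column> columns = new ArrayList<Column>();

    // Caller guarantees ascending name order, so no skip list or binary search
    // is needed: either append at the end or reconcile with the last column.
    void addColumn(Column c)
    {
        int last = columns.size() - 1;
        if (last >= 0 && columns.get(last).name.equals(c.name))
            columns.set(last, columns.get(last).reconcile(c));
        else
            columns.add(c);
    }
}
{code}

Compared to a ConcurrentSkipListMap, the append path does no synchronization 
and no comparisons beyond the last element, which is the saving the description 
above is after.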

[jira] [Created] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Viliam Holub (JIRA)
NullPointerException in MessagingService.java:420
-

 Key: CASSANDRA-3007
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.3
 Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
05:15:26 UTC 2010 x86_64 GNU/Linux
java version 1.6.0_18
OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Viliam Holub
Priority: Minor


I'm getting a large quantity of exceptions during streaming. It is always at 
MessagingService.java:420. The streaming appears to be blocked.

 INFO 10:11:14,734 Streaming to /10.235.77.27
ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
java.lang.NullPointerException
at 
org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
at 
org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
at 
org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
at 
org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
at 
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-1717:
---

Attachment: CASSANDRA-1717-v2.patch

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Let's leave that to the ticket for CRC optimization, which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  The checksum is now computed over the original (uncompressed) data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

As Todd noted, HADOOP-6148 contains a bunch of discussions on the efficiency of 
java CRC32. In particular, it seems they have been able to close to double the 
speed of the CRC32, with a solution that seems fairly simple to me. It would be 
ok to use java native CRC32 and leave the improvement to another ticket, but 
quite frankly if it is that simple and since the hadoop guys have done all the 
hard work for us, I say we start with the efficient version directly.

  As decided previously this will be a matter of the separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)
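
A rough sketch of the chunk layout being discussed, assuming the CRC32 of the 
uncompressed data is stored alongside each compressed chunk and verified after 
decompression; compress()/decompress() are placeholders, and this is not the 
code from the attached patch:

{code}
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.zip.CRC32;

final class ChecksummedChunk
{
    // Write one chunk: [compressed length][compressed data][CRC32 of the
    // UNCOMPRESSED data]. CRC32 is 32 bits; whether to persist it as a long
    // or an int is the question debated later in this thread.
    static byte[] writeChunk(byte[] uncompressed) throws IOException
    {
        CRC32 checksum = new CRC32();
        checksum.update(uncompressed, 0, uncompressed.length);

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        byte[] compressed = compress(uncompressed);
        out.writeInt(compressed.length);
        out.write(compressed);
        out.writeLong(checksum.getValue());
        out.flush();
        return bytes.toByteArray();
    }

    // Read one chunk back, decompress it, and verify the stored checksum
    // against the decompressed data.
    static byte[] readChunk(DataInputStream in) throws IOException
    {
        byte[] compressed = new byte[in.readInt()];
        in.readFully(compressed);
        long stored = in.readLong();

        byte[] uncompressed = decompress(compressed);
        CRC32 checksum = new CRC32();
        checksum.update(uncompressed, 0, uncompressed.length);
        if (checksum.getValue() != stored)
            throw new IOException("chunk failed checksum verification");
        return uncompressed;
    }

    // Placeholders for whatever block compressor is in use (e.g. Snappy).
    static byte[] compress(byte[] data)   { return data; }
    static byte[] decompress(byte[] data) { return data; }
}
{code}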

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569
 ] 

Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:25 AM:
-

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  Checksum is moved to the original data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems fairly simple to 
me. It would be ok to use java native CRC32 and leave the improvement to 
another ticket, but quite frankly if it is that simple and since the hadoop 
guys have done all the hard work for us, I say we start with the efficient 
version directly.

  As decided previously this will be a matter of the separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

  was (Author: xedin):
bq. CSW.flushData() forgot to reset the checksum (this is caught by the 
unit tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  Checksum is moved to the original data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

As Todd noted, HADOOP-6148 contains a bunch of discussions on the efficiency of 
java CRC32. In particular, it seems they have been able to close to double the 
speed of the CRC32, with a solution that seems fairly simple to me. It would be 
ok to use java native CRC32 and leave the improvement to another ticket, but 
quite 

[jira] [Issue Comment Edited] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081569#comment-13081569
 ] 

Pavel Yaskevich edited comment on CASSANDRA-1717 at 8/9/11 11:29 AM:
-

bq. CSW.flushData() forgot to reset the checksum (this is caught by the unit 
tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  It checksums the original (non-compressed) data and stores the checksum at the 
end of the compressed chunk; the reader verifies the checksum after decompression.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems fairly simple to 
me. It would be ok to use java native CRC32 and leave the improvement to 
another ticket, but quite frankly if it is that simple and since the hadoop 
guys have done all the hard work for us, I say we start with the efficient 
version directly.

  As decided previously this will be a matter of the separate ticket.

Rebased with latest trunk (last commit 1e36fb1e44bff96005dd75a25648ff25eea6a95f)

  was (Author: xedin):
bq. CSW.flushData() forgot to reset the checksum (this is caught by the 
unit tests btw).

  Not a problem since it was due to Sylvain's bad merge.

bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

  Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide.

bq. Here we checksum the compressed data. The other approach would be to 
checksum the uncompressed data. The advantage of checksumming compressed data 
is the speed (less data to checksum), but checksumming the uncompressed data 
would be a little bit safer. In particular, it would prevent us from messing up 
in the decompression (and we don't have to trust the compression algorithm, not 
that I don't trust Snappy, but...). This is a clearly a trade-off that we have 
to make, but I admit that my personal preference would lean towards safety (in 
particular, I know that checksumming the uncompressed data give a bit more 
safety, I don't know what is our exact gain quantitatively with checksumming 
compressed data). On the other side, checksumming the uncompressed data would 
likely mean that a good part of the bitrot would result in a decompression 
error rather than a checksum error, which is maybe less convenient from the 
implementation point of view. So I don't know, I guess I'm thinking aloud to 
have other's opinions more than anything else.

  Checksum is moved to the original data.
 
bq. Let's add some unit tests. At least it's relatively easy to write a few 
blocks, switch one bit in the resulting file, and checking this is caught at 
read time (or better, do that multiple time changing a different bit each time).

  Test was added to CompressedRandomAccessReaderTest.

bq. As Todd noted, HADOOP-6148 contains a bunch of discussions on the 
efficiency of java CRC32. In particular, it seems they have been able to close 
to double the speed of the CRC32, with a solution that seems 

[jira] [Commented] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081601#comment-13081601
 ] 

Jonathan Ellis commented on CASSANDRA-3007:
---

What kind of streaming are you attempting?  

 NullPointerException in MessagingService.java:420
 -

 Key: CASSANDRA-3007
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.3
 Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
 05:15:26 UTC 2010 x86_64 GNU/Linux
 java version 1.6.0_18
 OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
 OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Viliam Holub
Priority: Minor
  Labels: nullpointerexception, streaming

 I'm getting large quantity of exceptions during streaming. It is always in 
 MessagingService.java:420. The streaming appears to be blocked.
  INFO 10:11:14,734 Streaming to /10.235.77.27
 ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
 java.lang.NullPointerException
 at 
 org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
 at 
 org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
 at 
 org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
 at 
 org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3007:
--

Attachment: 3007.txt

Never mind, not relevant.  Looks like you upgraded from 0.7 without updating 
your configuration file?

Fix for missing encryption_options attached.
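
The missing section is the encryption_options block of cassandra.yaml, which a 
configuration carried over from 0.7 will not have. A stock 0.8 cassandra.yaml 
contains roughly the following defaults (illustrative; check the file shipped 
with your version):

{code}
# encryption_options was added in 0.8; a cassandra.yaml carried over from 0.7
# will not have it. Values below are the usual shipped defaults (illustrative).
encryption_options:
    internode_encryption: none
    keystore: conf/.keystore
    keystore_password: cassandra
    truststore: conf/.truststore
    truststore_password: cassandra
{code}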

 NullPointerException in MessagingService.java:420
 -

 Key: CASSANDRA-3007
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.3
 Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
 05:15:26 UTC 2010 x86_64 GNU/Linux
 java version 1.6.0_18
 OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
 OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Viliam Holub
Priority: Minor
  Labels: nullpointerexception, streaming
 Fix For: 0.8.4

 Attachments: 3007.txt


 I'm getting large quantity of exceptions during streaming. It is always in 
 MessagingService.java:420. The streaming appears to be blocked.
  INFO 10:11:14,734 Streaming to /10.235.77.27
 ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
 java.lang.NullPointerException
 at 
 org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
 at 
 org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
 at 
 org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
 at 
 org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-3006:
-

Assignee: Sylvain Lebresne

 Enormous counter 
 -

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen
Assignee: Sylvain Lebresne

 I have two-node cluster with the following keyspace and column family 
 settings.
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
 Keyspace: test:
   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
   Durable Writes: true
 Options: [datacenter1:2]
   Column Families:
 ColumnFamily: testCounter (Super)
 APP status information.
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.CounterColumnType
   Columns sorted by: 
 org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: true
   Built indexes: []
 Then, I use a test program based on hector to add a counter column 
 (testCounter[sc][column]) 1000 times. In the middle the adding process, I 
 intentional shut down the node 172.17.19.152. In addition to that, the test 
 program is smart enough to switch the consistency level from Quorum to One, 
 so that the following adding actions would not fail. 
 After all the adding actions are done, I start the cassandra on 
 172.17.19.152, and I use cassandra-cli to check if the counter is correct on 
 both nodes, and I got a result 1001 which should be reasonable because hector 
 will retry once. However, when I shut down 172.17.19.151 and after 
 172.17.19.152 is aware of 172.17.19.151 is down, I try to start the cassandra 
 on 172.17.19.151 again. Then, I check the counter again, this time I got a 
 result 481387 which is so wrong.
 I use 0.8.3 to reproduce this bug, but I think this also happens on 0.8.2 or 
 before also. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081603#comment-13081603
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

{quote}
bq. We should convert the CRC32 to an int (and only write that) as it is an int 
internally (getValue() returns a long only because CRC32 implements the 
interface Checksum that require that).

Lets leave that to the ticket for CRC optimization which will allow us to 
modify that system-wide
{quote}
Let's not:
* this is completely orthogonal to switching to a drop-in, faster, CRC 
implementation.
* it is unclear we want to make that system-wide. Imho, it is not worth 
breaking commit log compatibility for that, but it is stupid to commit new code 
that perpetuates the mistake only to change it later.
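
To make the "4 bytes of 0's" point concrete: CRC32 is a 32-bit checksum, so the 
long returned by java.util.zip.CRC32.getValue() always has its upper 32 bits 
zero and can be narrowed to an int without losing information. A tiny 
illustration, not code from the patch:

{code}
import java.util.zip.CRC32;

public class CrcWidth
{
    public static void main(String[] args)
    {
        CRC32 crc = new CRC32();
        crc.update("some chunk of data".getBytes());

        long asLong = crc.getValue();  // upper 32 bits are always zero
        int asInt = (int) asLong;      // same 32 significant bits

        assert (asLong >>> 32) == 0;
        System.out.printf("as long: %016x, as int: %08x%n", asLong, asInt);
    }
}
{code}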

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081605#comment-13081605
 ] 

Jonathan Ellis commented on CASSANDRA-1717:
---

Saving 4 bytes out of 64K doesn't seem like enough benefit to make life harder 
for ourselves if we want to use a long checksum later.

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081609#comment-13081609
 ] 

Pavel Yaskevich commented on CASSANDRA-1717:


+1 with Jonathan; it is also better if we satisfy the interface instead of 
relying on internal implementation details, which could also be helpful if we 
decide to change the checksum algorithm.

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081629#comment-13081629
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

What are the chances we'll switch from CRC32 any time soon? And even if we do, 
why would that help us to save 4 bytes of 0's right now? We will still have to 
bump the file format version and keep the code compatible with the old CRC32 
format if we do so. It's not like the only difference between checksum 
algorithms is the size of the checksum.

So yes, 4 bytes out of 64K is not a lot of data, but knowingly writing 4 bytes 
of 0's every 64K, every time, for the vague, remote chance that it may save us 
1 or 2 lines of code someday (again, that even remains to be proven) feels 
ridiculous to me. But if I'm the only one who feels that way, fine, it's not a 
big deal.

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081637#comment-13081637
 ] 

Pavel Yaskevich commented on CASSANDRA-1717:


I still think that such a change is a matter for a separate ticket, as we will 
want to change the CRC handling globally: we can make our own Checksum class 
that returns an int value, apply the performance improvements mentioned in 
HADOOP-6148 to it, and use it system-wide.

Is there anything else that keeps this from being committed?

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1155374 - /cassandra/branches/cassandra-0.8/debian/control

2011-08-09 Thread eevans
Author: eevans
Date: Tue Aug  9 14:05:55 2011
New Revision: 1155374

URL: http://svn.apache.org/viewvc?rev=1155374&view=rev
Log:
build requires subversion (line 235 of build.xml)

Patch by Sven Wilhelm; reviewed by eevans

Modified:
cassandra/branches/cassandra-0.8/debian/control

Modified: cassandra/branches/cassandra-0.8/debian/control
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/debian/control?rev=1155374&r1=1155373&r2=1155374&view=diff
==
--- cassandra/branches/cassandra-0.8/debian/control (original)
+++ cassandra/branches/cassandra-0.8/debian/control Tue Aug  9 14:05:55 2011
@@ -2,7 +2,7 @@ Source: cassandra
 Section: misc
 Priority: extra
 Maintainer: Eric Evans eev...@apache.org
-Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7)
+Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7), subversion
 Homepage: http://cassandra.apache.org
 Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk
 Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk




[jira] [Created] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Luis Eduardo Villares Matta (JIRA)
Error getting range slices
--

 Key: CASSANDRA-3008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.2
 Environment: Ubuntu, using the 08x repository
Reporter: Luis Eduardo Villares Matta
Priority: Critical


I can't get a range slice on one of my column families.

ERROR 14:16:26,672 Internal error processing get_range_slices
java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
at 
org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:66)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
at 
org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:86)
at 
org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
at 
org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
at 
org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
at 
org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
at 
org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
at 
org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
at 
org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
at 
org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
at 
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
at 
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
at 
org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
at 
org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
at 
org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
at 
org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
at 
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
at 
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
at 
org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
at 
org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
at 
org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
... 24 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081660#comment-13081660
 ] 

Sylvain Lebresne commented on CASSANDRA-3006:
-

I haven't had luck reproducing this so far. I've tried to stick to the 
description above but did not use hector (not saying it is hector's fault, 
though; maybe it is the way it does retries that I don't emulate well). If you 
are able to share a minimal hector script with which you can reproduce this 
easily, that would be very helpful.

 Enormous counter 
 -

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen
Assignee: Sylvain Lebresne

 I have two-node cluster with the following keyspace and column family 
 settings.
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
 Keyspace: test:
   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
   Durable Writes: true
 Options: [datacenter1:2]
   Column Families:
 ColumnFamily: testCounter (Super)
 APP status information.
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.CounterColumnType
   Columns sorted by: 
 org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: true
   Built indexes: []
 Then, I use a test program based on hector to add a counter column 
 (testCounter[sc][column]) 1000 times. In the middle the adding process, I 
 intentional shut down the node 172.17.19.152. In addition to that, the test 
 program is smart enough to switch the consistency level from Quorum to One, 
 so that the following adding actions would not fail. 
 After all the adding actions are done, I start the cassandra on 
 172.17.19.152, and I use cassandra-cli to check if the counter is correct on 
 both nodes, and I got a result 1001 which should be reasonable because hector 
 will retry once. However, when I shut down 172.17.19.151 and after 
 172.17.19.152 is aware of 172.17.19.151 is down, I try to start the cassandra 
 on 172.17.19.151 again. Then, I check the counter again, this time I got a 
 result 481387 which is so wrong.
 I use 0.8.3 to reproduce this bug, but I think this also happens on 0.8.2 or 
 before also. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-08-09 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081665#comment-13081665
 ] 

T Jake Luciani commented on CASSANDRA-2474:
---

I don't (yet) know how to add hint types to hive, but once a transposed hint 
operator is added we should be able to hook it into the hive driver.  

 CQL support for compound columns
 

 Key: CASSANDRA-2474
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0


 For the most part, this boils down to supporting the specification of 
 compound column names (the CQL syntax is colon-delimited terms), and then 
 teaching the decoders (drivers) to create structures from the results.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081669#comment-13081669
 ] 

Jonathan Ellis commented on CASSANDRA-2474:
---

Isn't changing query semantics kind of the opposite of what hints are supposed 
to be for?

 CQL support for compound columns
 

 Key: CASSANDRA-2474
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
 Project: Cassandra
  Issue Type: Sub-task
  Components: API, Core
Reporter: Eric Evans
  Labels: cql
 Fix For: 1.0


 For the most part, this boils down to supporting the specification of 
 compound column names (the CQL syntax is colon-delimited terms), and then 
 teaching the decoders (drivers) to create structures from the results.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081673#comment-13081673
 ] 

Jonathan Ellis commented on CASSANDRA-3008:
---

did you try nodetool scrub?

 Error getting range slices
 --

 Key: CASSANDRA-3008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.2
 Environment: Ubuntu, using the 08x repository
Reporter: Luis Eduardo Villares Matta
Priority: Critical

 I can't get a range slice on one of my column families.
 ERROR 14:16:26,672 Internal error processing get_range_slices
 java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:66)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:86)
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 at 
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
 at 
 org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
 at 
 org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
 at 
 org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
 at 
 org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
 ... 24 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2749) fine-grained control over data directories

2011-08-09 Thread Chris Burroughs (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081679#comment-13081679
 ] 

Chris Burroughs commented on CASSANDRA-2749:


It would also be cool (but this is obviously speculative) to have the ability 
to keep Index files on an SSD, and the larger data files on rotating disks.

 fine-grained control over data directories
 --

 Key: CASSANDRA-2749
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2749
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 1.0


 Currently Cassandra supports multiple data directories but no way to control 
 what sstables are placed where. Particularly for systems with mixed SSDs and 
 rotational disks, it would be nice to pin frequently accessed columnfamilies 
 to the SSDs.
 Postgresql does this with tablespaces 
 (http://www.postgresql.org/docs/9.0/static/manage-ag-tablespaces.html) but we 
 should probably avoid using that name because of confusing similarity to 
 keyspaces.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3007) NullPointerException in MessagingService.java:420

2011-08-09 Thread Viliam Holub (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081680#comment-13081680
 ] 

Viliam Holub commented on CASSANDRA-3007:
-

It's the removetoken command.

Yes, I updated the node and forgot to specify encryption_options - thanks!

 NullPointerException in MessagingService.java:420
 -

 Key: CASSANDRA-3007
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3007
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.3
 Environment: Linux w0 2.6.35-24-virtual #42-Ubuntu SMP Thu Dec 2 
 05:15:26 UTC 2010 x86_64 GNU/Linux
 java version 1.6.0_18
 OpenJDK Runtime Environment (IcedTea6 1.8.7) (6b18-1.8.7-2~squeeze1)
 OpenJDK 64-Bit Server VM (build 14.0-b16, mixed mode)
Reporter: Viliam Holub
Assignee: Jonathan Ellis
Priority: Minor
  Labels: nullpointerexception, streaming
 Fix For: 0.8.4

 Attachments: 3007.txt


 I'm getting a large quantity of exceptions during streaming. It is always in 
 MessagingService.java:420. The streaming appears to be blocked.
  INFO 10:11:14,734 Streaming to /10.235.77.27
 ERROR 10:11:14,734 Fatal exception in thread Thread[StreamStage:2,5,main]
 java.lang.NullPointerException
 at 
 org.apache.cassandra.net.MessagingService.stream(MessagingService.java:420)
 at 
 org.apache.cassandra.streaming.StreamOutSession.begin(StreamOutSession.java:176)
 at 
 org.apache.cassandra.streaming.StreamOut.transferRangesForRequest(StreamOut.java:148)
 at 
 org.apache.cassandra.streaming.StreamRequestVerbHandler.doVerb(StreamRequestVerbHandler.java:54)
 at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:636)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081685#comment-13081685
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

As previously said, I disagree both with using 8 bytes when we need 4 and with 
the idea that using 4 is a matter for another ticket, but since this is probably me being too 
anal as usual, +1 on the rest of the patch, modulo a small optional nitpick: 
the toLong() function is a bit hard to read imho. It's hard to see where the 
parentheses are, and whether it does the right thing. It seems ok though, I just 
think a simple for loop on the bytes would be more readable. We also 
historically keep ByteBufferUtil for ByteBuffer manipulations and use 
FBUtilities for byte[] manipulation.


 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081689#comment-13081689
 ] 

Pavel Yaskevich commented on CASSANDRA-1717:


Ok, I will move toLong(byte[] bytes) to FBUtilities and commit, thanks!

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081690#comment-13081690
 ] 

Jonathan Ellis commented on CASSANDRA-1717:
---

You're right, if we change the checksum implementation we need to bump the sstable 
revision anyway.  +1 on casting to int here.  (But as you said above, -1 on changing 
this in CommitLog.)

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717.patch, 
 checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Luis Eduardo Villares Matta (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081701#comment-13081701
 ] 

Luis Eduardo Villares Matta commented on CASSANDRA-3008:


No I did not; it seems to have fixed my issues.
Thank you very much. 
(I am inclined to close this issue, but I do not know if I should. Also I am 
testing everything in the next few hours.)

 Error getting range slices
 --

 Key: CASSANDRA-3008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.2
 Environment: Ubuntu, using the 08x repository
Reporter: Luis Eduardo Villares Matta
Priority: Critical

 I can't get a range slice on one of my column families.
 ERROR 14:16:26,672 Internal error processing get_range_slices
 java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:66)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:86)
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 at 
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
 at 
 org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
 at 
 org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
 at 
 org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
 at 
 org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
 ... 24 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-1717:
---

Attachment: CASSANDRA-1717-v3.patch

v3 which removes BBU.toLong and adds FBU.byteArrayToInt + uses int instead of 
long for checksum

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, 
 CASSANDRA-1717.patch, checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3008) Error getting range slices

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081712#comment-13081712
 ] 

Jonathan Ellis commented on CASSANDRA-3008:
---

Check (scrub) your other nodes -- data corruption can happen (usually from bad 
memory) but if there's a pattern of all the nodes being affected at the same 
time there could be a Cassandra bug.

 Error getting range slices
 --

 Key: CASSANDRA-3008
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3008
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.2
 Environment: Ubuntu, using the 08x repository
Reporter: Luis Eduardo Villares Matta
Priority: Critical

 I can't get a range slice on one of my column families.
 ERROR 14:16:26,672 Internal error processing get_range_slices
 java.io.IOError: java.io.EOFException: EOF after 26948 bytes out of 1681403191
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:66)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.createReader(SSTableSliceIterator.java:91)
 at 
 org.apache.cassandra.db.columniterator.SSTableSliceIterator.init(SSTableSliceIterator.java:86)
 at 
 org.apache.cassandra.db.filter.SliceQueryFilter.getSSTableColumnIterator(SliceQueryFilter.java:71)
 at 
 org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:87)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:184)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:144)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:136)
 at 
 org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:39)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
 at 
 org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
 at 
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:69)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135)
 at org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
 at 
 org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1392)
 at 
 org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:684)
 at 
 org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:617)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:3202)
 at 
 org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
 at 
 org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: java.io.EOFException: EOF after 26948 bytes out of 1681403191
 at 
 org.apache.cassandra.io.util.FileUtils.skipBytesFully(FileUtils.java:229)
 at 
 org.apache.cassandra.io.sstable.IndexHelper.skipBloomFilter(IndexHelper.java:50)
 at 
 org.apache.cassandra.db.columniterator.SimpleSliceReader.init(SimpleSliceReader.java:57)
 ... 24 more

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081718#comment-13081718
 ] 

Sylvain Lebresne commented on CASSANDRA-1717:
-

lgtm, +1

 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, 
 CASSANDRA-1717.patch, checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)
404 on apt-get install from http://www.apache.org/dist/cassandra/debian
---

 Key: CASSANDRA-3009
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
Affects Versions: 0.8.3
 Environment: ubuntu maverick 64-bit
Reporter: Chris Lohfink
Priority: Minor


First bug report on here so sorry if I am doing something incorrectly.  I 
followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
receiving a 404 error during the install.  Looks like the 
{code}
clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
[sudo] password for clohfink: 
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libcommons-pool-java authbind libmcrypt4 libtomcat6-java libcommons-dbcp-java 
tomcat6-common
Use 'apt-get autoremove' to remove them.
The following NEW packages will be installed:
  cassandra
0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
Need to get 8,415kB of archives.
After this operation, 9,540kB of additional disk space will be used.
Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
0.8.0
  404  Not Found
Failed to fetch 
http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
  404  Not Found
E: Unable to fetch some archives, maybe run apt-get update or try with 
--fix-missing?
{code}
for debugging info:
{code}
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
N: Can't select versions from package 'cassandra' as it purely virtual
N: No packages found
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository deb 
http://www.apache.org/dist/cassandra/debian unstable main
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
...
Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en   
   
Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
Translation-en_US
...
Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
Fetched 6,989B in 1s (5,974B/s)
Reading package lists... Done
clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
Package: cassandra
Version: 0.8.0
Architecture: all
Maintainer: Eric Evans eev...@apache.org
Installed-Size: 9316
Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
libcommons-daemon-java (>= 1.0), adduser
Recommends: libjna-java
Homepage: http://cassandra.apache.org
Priority: extra
Section: misc
Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
Size: 8415180
SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
Description: distributed storage system for structured data
 Cassandra is a distributed (peer-to-peer) system for the management
 and storage of structured data.
{code}

included fabric script, if have fabric installed can run
{code}
fab -H localhost install_cassandra
{code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (CASSANDRA-1974) PFEPS-like snitch that uses gossip instead of a property file

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-1974:
-

Assignee: (was: Brandon Williams)

I think the biggest win is when you can automatically determine rack/dc from 
the environment somehow (e.g.: ec2snitch).  Otherwise the advantage of editing 
a file, vs edit + rsync, is small.  Small enough that it's probably not worth 
the education headache.

 PFEPS-like snitch that uses gossip instead of a property file
 -

 Key: CASSANDRA-1974
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1974
 Project: Cassandra
  Issue Type: New Feature
Reporter: Brandon Williams
Priority: Minor

 Now that we have an ec2 snitch that propagates its rack/dc info via gossip 
 from CASSANDRA-1654, it doesn't make a lot of sense to use PFEPS where you 
 have to rsync the property file across all the machines when you add a node.  
 Instead, we could have a snitch where you specify its rack/dc in a property 
 file, and propagate this via gossip like the ec2 snitch.  In order to not 
 break PFEPS, this should probably be a new snitch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink updated CASSANDRA-3009:
-

Attachment: fabfile.py

 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
 ---

 Key: CASSANDRA-3009
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
Affects Versions: 0.8.3
 Environment: ubuntu maverick 64-bit
Reporter: Chris Lohfink
Priority: Minor
 Attachments: fabfile.py


 First bug report on here so sorry if I am doing something incorrectly.  I 
 followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
 receiving a 404 error during the install.  Looks like the 
 {code}
 clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
 [sudo] password for clohfink: 
 Reading package lists... Done
 Building dependency tree   
 Reading state information... Done
 The following packages were automatically installed and are no longer 
 required:
   libcommons-pool-java authbind libmcrypt4 libtomcat6-java 
 libcommons-dbcp-java tomcat6-common
 Use 'apt-get autoremove' to remove them.
 The following NEW packages will be installed:
   cassandra
 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
 Need to get 8,415kB of archives.
 After this operation, 9,540kB of additional disk space will be used.
 Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
 0.8.0
   404  Not Found
 Failed to fetch 
 http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
   404  Not Found
 E: Unable to fetch some archives, maybe run apt-get update or try with 
 --fix-missing?
 {code}
 for debugging info:
 {code}
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
 N: Can't select versions from package 'cassandra' as it purely virtual
 N: No packages found
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository deb 
 http://www.apache.org/dist/cassandra/debian unstable main
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
 ...
 Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en 
  
 Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
 Translation-en_US
 ...
 Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
 Fetched 6,989B in 1s (5,974B/s)
 Reading package lists... Done
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
 Package: cassandra
 Version: 0.8.0
 Architecture: all
 Maintainer: Eric Evans eev...@apache.org
 Installed-Size: 9316
 Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
 libcommons-daemon-java (>= 1.0), adduser
 Recommends: libjna-java
 Homepage: http://cassandra.apache.org
 Priority: extra
 Section: misc
 Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
 Size: 8415180
 SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
 SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
 MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
 Description: distributed storage system for structured data
  Cassandra is a distributed (peer-to-peer) system for the management
  and storage of structured data.
 {code}
 included fabric script, if have fabric installed can run
 {code}
 fab -H localhost install_cassandra
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2892) Don't replicate_on_write with RF=1

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2892:


Attachment: 2892.patch

That's a super easy one and it removes some nasty boolean flag from 
SP.sendToHintedEndpoints so let's do it.

 Don't replicate_on_write with RF=1
 

 Key: CASSANDRA-2892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2892.patch


 For counters with RF=1, we still do a read to replicate, even though there is 
 nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2892) Don't replicate_on_write with RF=1

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081728#comment-13081728
 ] 

Jonathan Ellis commented on CASSANDRA-2892:
---

can you spell out what's going on with this part?

{code}
-if (cm.shouldReplicateOnWrite())
+hintedEndpoints.removeAll(FBUtilities.getLocalAddress());
+
+if (cm.shouldReplicateOnWrite() && !hintedEndpoints.isEmpty())
{code}

 Don't replicate_on_write with RF=1
 

 Key: CASSANDRA-2892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2892.patch


 For counters with RF=1, we still do a read to replicate, even though there is 
 nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[Cassandra Wiki] Trivial Update of DebianPackaging by SylvainLebresne

2011-08-09 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The DebianPackaging page has been changed by SylvainLebresne:
http://wiki.apache.org/cassandra/DebianPackaging?action=diffrev1=22rev2=23

  To install on Debian or Debian derivatives, use the following sources:
  
  {{{
- deb http://www.apache.org/dist/cassandra/debian unstable main
+ deb http://www.apache.org/dist/cassandra/debian 08x main
- deb-src http://www.apache.org/dist/cassandra/debian unstable main
+ deb-src http://www.apache.org/dist/cassandra/debian 08x main
  }}}
  
- ''Note: the unstable suite points to the most current branch of development 
(for historical reasons).  Production systems should use a version-specific 
suite/codename, (for example, `06x` for the 0.6.x series, `07x` for the 0.7.x 
series, etc).''
+ You will want to replace `08x` by the series you want to use: `06x` for the 
0.6.x series, 07x for the 0.7.x series, etc... It does mean that you will not 
get major version update unless you change the series, but that is ''a 
feature''.
+ 
  
  If you run ''apt-get update'' now, you will see an error similar to this:
  {{{


[jira] [Commented] (CASSANDRA-2843) better performance on long row read

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081729#comment-13081729
 ] 

Jonathan Ellis commented on CASSANDRA-2843:
---

+1

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Fix For: 1.0

 Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, 
 fix.diff, microBenchmark.patch, patch_timing, std_timing


 currently if a row contains > 1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 3000 columns (standard, regular), each with 8 bytes in name and 
 40 bytes in value, is about 16ms).
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map 
 data structure in the interim and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but takes a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() have an invariant that the 
 inserted columns come in sorted order (I still need to resolve whether 
 descending or ascending, but ascending works for now). so the current logic is 
 simply to compare the new column against the last column in the array: if the 
 names are not equal, append; if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-3009.
-

Resolution: Not A Problem

Sorry, this is because I don't update the 'unstable' series anymore. You should 
use 08x instead (or 07x if you feel inclined to).

It felt too easy to cause harm with an 'unstable' series that would silently 
do major version upgrades, so we've switched to numbered series instead. I've 
updated the wiki accordingly. Sorry for the inconvenience.

 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
 ---

 Key: CASSANDRA-3009
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
Affects Versions: 0.8.3
 Environment: ubuntu maverick 64-bit
Reporter: Chris Lohfink
Priority: Minor
 Attachments: fabfile.py


 First bug report on here so sorry if I am doing something incorrectly.  I 
 followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
 receiving a 404 error during the install.  Looks like the 
 {code}
 clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
 [sudo] password for clohfink: 
 Reading package lists... Done
 Building dependency tree   
 Reading state information... Done
 The following packages were automatically installed and are no longer 
 required:
   libcommons-pool-java authbind libmcrypt4 libtomcat6-java 
 libcommons-dbcp-java tomcat6-common
 Use 'apt-get autoremove' to remove them.
 The following NEW packages will be installed:
   cassandra
 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
 Need to get 8,415kB of archives.
 After this operation, 9,540kB of additional disk space will be used.
 Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
 0.8.0
   404  Not Found
 Failed to fetch 
 http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
   404  Not Found
 E: Unable to fetch some archives, maybe run apt-get update or try with 
 --fix-missing?
 {code}
 for debugging info:
 {code}
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
 N: Can't select versions from package 'cassandra' as it purely virtual
 N: No packages found
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository deb 
 http://www.apache.org/dist/cassandra/debian unstable main
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
 ...
 Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en 
  
 Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
 Translation-en_US
 ...
 Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
 Fetched 6,989B in 1s (5,974B/s)
 Reading package lists... Done
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
 Package: cassandra
 Version: 0.8.0
 Architecture: all
 Maintainer: Eric Evans eev...@apache.org
 Installed-Size: 9316
 Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
 libcommons-daemon-java (>= 1.0), adduser
 Recommends: libjna-java
 Homepage: http://cassandra.apache.org
 Priority: extra
 Section: misc
 Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
 Size: 8415180
 SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
 SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
 MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
 Description: distributed storage system for structured data
  Cassandra is a distributed (peer-to-peer) system for the management
  and storage of structured data.
 {code}
 included fabric script, if have fabric installed can run
 {code}
 fab -H localhost install_cassandra
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Closed] (CASSANDRA-3009) 404 on apt-get install from http://www.apache.org/dist/cassandra/debian

2011-08-09 Thread Chris Lohfink (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Lohfink closed CASSANDRA-3009.



wiki was updated with distribution changes

 404 on apt-get install from http://www.apache.org/dist/cassandra/debian
 ---

 Key: CASSANDRA-3009
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3009
 Project: Cassandra
  Issue Type: Bug
  Components: Documentation & website
Affects Versions: 0.8.3
 Environment: ubuntu maverick 64-bit
Reporter: Chris Lohfink
Priority: Minor
 Attachments: fabfile.py


 First bug report on here so sorry if I am doing something incorrectly.  I 
 followed the wiki (http://wiki.apache.org/cassandra/DebianPackaging) but I am 
 receiving a 404 error during the install.  Looks like the 
 {code}
 clohfink@roc-lvm-dev:~dev$ sudo apt-get install cassandra
 [sudo] password for clohfink: 
 Reading package lists... Done
 Building dependency tree   
 Reading state information... Done
 The following packages were automatically installed and are no longer 
 required:
   libcommons-pool-java authbind libmcrypt4 libtomcat6-java 
 libcommons-dbcp-java tomcat6-common
 Use 'apt-get autoremove' to remove them.
 The following NEW packages will be installed:
   cassandra
 0 upgraded, 1 newly installed, 0 to remove and 66 not upgraded.
 Need to get 8,415kB of archives.
 After this operation, 9,540kB of additional disk space will be used.
 Err http://www.apache.org/dist/cassandra/debian/ unstable/main cassandra all 
 0.8.0
   404  Not Found
 Failed to fetch 
 http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_0.8.0_all.deb
   404  Not Found
 E: Unable to fetch some archives, maybe run apt-get update or try with 
 --fix-missing?
 {code}
 for debugging info:
 {code}
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
 N: Can't select versions from package 'cassandra' as it purely virtual
 N: No packages found
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo add-apt-repository deb 
 http://www.apache.org/dist/cassandra/debian unstable main
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-get update
 ...
 Ign http://www.apache.org/dist/cassandra/debian/ unstable/main Translation-en 
  
 Ign http://www.apache.org/dist/cassandra/debian/ unstable/main 
 Translation-en_US
 ...
 Hit http://us.archive.ubuntu.com maverick-proposed/universe amd64 Packages
 Fetched 6,989B in 1s (5,974B/s)
 Reading package lists... Done
 clohfink@roc-lvm-dev:~dev/fabrictests$ sudo apt-cache show cassandra
 Package: cassandra
 Version: 0.8.0
 Architecture: all
 Maintainer: Eric Evans eev...@apache.org
 Installed-Size: 9316
 Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
 libcommons-daemon-java (>= 1.0), adduser
 Recommends: libjna-java
 Homepage: http://cassandra.apache.org
 Priority: extra
 Section: misc
 Filename: pool/main/c/cassandra/cassandra_0.8.0_all.deb
 Size: 8415180
 SHA256: 7eaaeb9d3ef5af6abff834fe93f1a84349dff98776eaee83f8dabb267ffe4833
 SHA1: 9cca3ffbcbab9e6ba2385f734691c97afeaa8be6
 MD5sum: 01e0435495f7ff40e1b4e4be5857a1ea
 Description: distributed storage system for structured data
  Cassandra is a distributed (peer-to-peer) system for the management
  and storage of structured data.
 {code}
 included fabric script, if have fabric installed can run
 {code}
 fab -H localhost install_cassandra
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-2919) CQL system test for counters is failing

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne resolved CASSANDRA-2919.
-

Resolution: Cannot Reproduce

Ok, I cannot reproduce either anymore. Probably got fixed, or I screwed up the 
first time. Sorry for that.

 CQL system test for counters is failing
 ---

 Key: CASSANDRA-2919
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2919
 Project: Cassandra
  Issue Type: Bug
  Components: Tests
 Environment: ubuntu 11.04 64 bit
Reporter: Sylvain Lebresne
Assignee: Tyler Hobbs
Priority: Minor
  Labels: cql, test

 On my machine (and on current 0.8 branch) the CQL system test for counters is 
 failing. While reading the counter value, junk bytes are apparently returned 
 instead of the value (on the following excerpt it looks like an empty value, 
 but on the terminal it does show a random character):
 {noformat}
 ==
 FAIL: update statement should be able to work with counter columns
 --
 Traceback (most recent call last):
   File /usr/lib/pymodules/python2.7/nose/case.py, line 186, in runTest
 self.test(*self.arg)
   File /home/pcmanus/Git/cassandra/test/system/test_cql.py, line 1130, in 
 test_counter_column_support
 unrecognized value '%s' % r[1]
 AssertionError: unrecognized value ''
 --
 {noformat}
 I've checked, the server correctly fetches the right column and returns what it 
 should. So this seems to be on the python driver side.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-1717) Cassandra cannot detect corrupt-but-readable column data

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081773#comment-13081773
 ] 

Hudson commented on CASSANDRA-1717:
---

Integrated in Cassandra #1010 (See 
[https://builds.apache.org/job/Cassandra/1010/])
Add block level checksum for compressed data
patch by Pavel Yaskevich; reviewed by Sylvain Lebresne for CASSANDRA-1717

xedin : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1155420
Files : 
* /cassandra/trunk/test/unit/org/apache/cassandra/Util.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/io/compress/CompressedRandomAccessReaderTest.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CorruptedBlockException.java
* /cassandra/trunk/CHANGES.txt
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressedRandomAccessReader.java
* 
/cassandra/trunk/test/unit/org/apache/cassandra/io/util/BufferedRandomAccessFileTest.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressionMetadata.java
* /cassandra/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java
* 
/cassandra/trunk/src/java/org/apache/cassandra/io/compress/CompressedSequentialWriter.java


 Cassandra cannot detect corrupt-but-readable column data
 

 Key: CASSANDRA-1717
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1717
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0

 Attachments: CASSANDRA-1717-v2.patch, CASSANDRA-1717-v3.patch, 
 CASSANDRA-1717.patch, checksums.txt


 Most corruptions of on-disk data due to bitrot render the column (or row) 
 unreadable, so the data can be replaced by read repair or anti-entropy.  But 
 if the corruption keeps column data readable we do not detect it, and if it 
 corrupts to a higher timestamp value can even resist being overwritten by 
 newer values.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2990:


Attachment: 2990.patch

 We should refuse query for counters at CL.ANY
 -

 Key: CASSANDRA-2990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2990.patch


 We currently do not reject writes for counters at CL.ANY, even though this is 
 not supported (and rightly so).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2892) Don't replicate_on_write with RF=1

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2892:
--

Attachment: 2892-v1.5.txt

v1.5 attached.  I thought I could improve it more, but couldn't. :)

Ended up just extracting counterWriteTask() to remove the 
executeOnMutationStage flag.

 Don't replicate_on_write with RF=1
 

 Key: CASSANDRA-2892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2892-v1.5.txt, 2892.patch


 For counters with RF=1, we still do a read to replicate, even though there is 
 nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2892) Don't replicate_on_write with RF=1

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081802#comment-13081802
 ] 

Sylvain Lebresne commented on CASSANDRA-2892:
-

v1.5 lgtm

 Don't replicate_on_write with RF=1
 

 Key: CASSANDRA-2892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2892-v1.5.txt, 2892.patch


 For counters with RF=1, we still do a read to replicate, even though there is 
 nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1155460 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/service/StorageProxy.java

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 18:37:20 2011
New Revision: 1155460

URL: http://svn.apache.org/viewvc?rev=1155460view=rev
Log:
avoid doing read for no-op replicate-on-write at CL=1
patch by slebresne and jbellis for CASSANDRA-2892

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155460r1=1155459r2=1155460view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug  9 18:37:20 2011
@@ -1,6 +1,7 @@
 0.8.4
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
+ * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
 
 
 0.8.3

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java?rev=1155460r1=1155459r2=1155460view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java
 Tue Aug  9 18:37:20 2011
@@ -96,7 +96,7 @@ public class StorageProxy implements Sto
  public void apply(IMutation mutation, Multimap<InetAddress, 
InetAddress> hintedEndpoints, IWriteResponseHandler responseHandler, String 
localDataCenter, ConsistencyLevel consistency_level) throws IOException
 {
 assert mutation instanceof RowMutation;
-sendToHintedEndpoints((RowMutation) mutation, hintedEndpoints, 
responseHandler, localDataCenter, true, consistency_level);
+sendToHintedEndpoints((RowMutation) mutation, hintedEndpoints, 
responseHandler, localDataCenter, consistency_level);
 }
 };
 
@@ -110,7 +110,11 @@ public class StorageProxy implements Sto
 {
  public void apply(IMutation mutation, Multimap<InetAddress, 
InetAddress> hintedEndpoints, IWriteResponseHandler responseHandler, String 
localDataCenter, ConsistencyLevel consistency_level) throws IOException
 {
-applyCounterMutation(mutation, hintedEndpoints, 
responseHandler, localDataCenter, consistency_level, false);
+if (logger.isDebugEnabled())
+logger.debug("insert writing local & replicate " + 
mutation.toString(true));
+
+Runnable runnable = counterWriteTask(mutation, 
hintedEndpoints, responseHandler, localDataCenter, consistency_level);
+runnable.run();
 }
 };
 
@@ -118,7 +122,11 @@ public class StorageProxy implements Sto
 {
  public void apply(IMutation mutation, Multimap<InetAddress, 
InetAddress> hintedEndpoints, IWriteResponseHandler responseHandler, String 
localDataCenter, ConsistencyLevel consistency_level) throws IOException
 {
-applyCounterMutation(mutation, hintedEndpoints, 
responseHandler, localDataCenter, consistency_level, true);
+if (logger.isDebugEnabled())
+logger.debug("insert writing local & replicate " + 
mutation.toString(true));
+
+Runnable runnable = counterWriteTask(mutation, 
hintedEndpoints, responseHandler, localDataCenter, consistency_level);
+StageManager.getStage(Stage.MUTATION).execute(runnable);
 }
 };
 }
@@ -218,7 +226,7 @@ public class StorageProxy implements Sto
 return 
ss.getTokenMetadata().getWriteEndpoints(StorageService.getPartitioner().getToken(key),
 table, naturalEndpoints);
 }
 
-private static void sendToHintedEndpoints(final RowMutation rm, 
Multimap<InetAddress, InetAddress> hintedEndpoints, IWriteResponseHandler 
responseHandler, String localDataCenter, boolean insertLocalMessages, 
ConsistencyLevel consistency_level)
+private static void sendToHintedEndpoints(final RowMutation rm, 
Multimap<InetAddress, InetAddress> hintedEndpoints, IWriteResponseHandler 
responseHandler, String localDataCenter, ConsistencyLevel consistency_level)
 throws IOException
 {
 // Multimap that holds onto all the messages and addresses meant for a 
specific datacenter
@@ -237,8 +245,7 @@ public class StorageProxy implements Sto
 // unhinted writes
 if (destination.equals(FBUtilities.getLocalAddress()))
 {
-if (insertLocalMessages)
-insertLocal(rm, responseHandler);
+insertLocal(rm, 

[jira] [Resolved] (CASSANDRA-2892) Don't replicate_on_write with RF=1

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2892.
---

Resolution: Fixed
  Reviewer: jbellis

committed

 Don't replicate_on_write with RF=1
 

 Key: CASSANDRA-2892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2892-v1.5.txt, 2892.patch


 For counters with RF=1, we still do a read to replicate, even though there is 
 nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1155466 - in /cassandra/trunk: ./ contrib/ debian/ interface/thrift/gen-java/org/apache/cassandra/thrift/ redhat/ src/java/org/apache/cassandra/cli/ src/java/org/apache/cassandra/service/

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 18:40:54 2011
New Revision: 1155466

URL: http://svn.apache.org/viewvc?rev=1155466view=rev
Log:
merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)
cassandra/trunk/debian/control

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/redhat/cassandra
cassandra/trunk/src/java/org/apache/cassandra/cli/Cli.g
cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java
cassandra/trunk/src/java/org/apache/cassandra/cli/CliCompleter.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java
cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java
cassandra/trunk/src/resources/org/apache/cassandra/cli/CliHelp.yaml
cassandra/trunk/test/unit/org/apache/cassandra/cli/CliTest.java

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 18:40:54 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1151306
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1154424
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1155466&r1=1155465&r2=1155466&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Aug  9 18:40:54 2011
@@ -33,6 +33,8 @@
 
 0.8.4
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
+ * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
+ * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
 
 
 0.8.3

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 18:40:54 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
 /cassandra/branches/cassandra-0.7/contrib:1026516-1151306
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1154424
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Modified: cassandra/trunk/debian/control
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/debian/control?rev=1155466&r1=1155465&r2=1155466&view=diff
==
--- cassandra/trunk/debian/control (original)
+++ cassandra/trunk/debian/control Tue Aug  9 18:40:54 2011
@@ -2,7 +2,7 @@ Source: cassandra
 Section: misc
 Priority: extra
 Maintainer: Eric Evans eev...@apache.org
-Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 
1.7), ant-optional (>= 1.7)
+Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 
1.7), ant-optional (>= 1.7), subversion
 Homepage: http://cassandra.apache.org
 Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk
 Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 18:40:54 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1151306
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654

[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081831#comment-13081831
 ] 

Jonathan Ellis commented on CASSANDRA-2990:
---

A few days ago, you said, "A counter mutation only lives long enough so that it is 
applied to the first replica. Once this is done, a *row* mutation is generated 
for the other replica. That second mutation can be hinted. But that is a row 
mutation, so there should be no special casing at all for that."

Why can't we hint the first replica?

 We should refuse query for counters at CL.ANY
 -

 Key: CASSANDRA-2990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2990.patch


 We currently do not reject writes for counters at CL.ANY, even though this is 
 not supported (and rightly so).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




buildbot failure in ASF Buildbot on cassandra-trunk

2011-08-09 Thread buildbot
The Buildbot has detected a new failure on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1503

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1155466
Blamelist: jbellis

BUILD FAILED: failed compile

sincerely,
 -The Buildbot



[jira] [Commented] (CASSANDRA-2868) Native Memory Leak

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081834#comment-13081834
 ] 

Brandon Williams commented on CASSANDRA-2868:
-

bq. Wouldn't it be worth indicating how many collections have been done 
since the last log message if it's > 1, since it can (be > 1).

The only reason I added count tracking was to prevent it from firing when there 
were no GCs (the api is flakey.)  I've never actually been able to get > 1 to 
happen, but we can add it to the logging.

bq. IMO the duration-based thresholds are hard to reason about here, where 
we're dealing w/ summaries and not individual GC results.

We are dealing with individual GCs at least 99% of the time in practice.  The 
worst case is 1 GC inflates the gctime enough that we errantly log when it's 
not needed, but I imagine to trigger that you would have to be in a gc pressure 
situation already.

bq. I think I'd rather have something like the dropped messages logger, where 
every N seconds we log the summary we get from the mbean.

That seems like it could a lot of noise since GC is constantly happening.

bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be 
removed. 

I think the logic there is still sound (Did we just do a CMS? Is the heap 
still > 80% full?) and it seems to work as well as it always has.
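
For reference, here is a minimal sketch (not the actual 2868 patch; the class and method names are made up) of the count-tracking approach described above: poll the GC MXBeans on a timer and only log when the collection count has actually advanced since the previous check.

{code}
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;

// Sketch of delta-based GC logging: only report when at least one collection
// has actually happened since the previous check.
public class GcWatcher
{
    private final Map<String, Long> lastCounts = new HashMap<String, Long>();
    private final Map<String, Long> lastTimes = new HashMap<String, Long>();

    public void check()
    {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
        {
            long count = gc.getCollectionCount();   // may be -1 if unavailable ("the api is flakey")
            long time = gc.getCollectionTime();
            Long prevCount = lastCounts.put(gc.getName(), count);
            Long prevTime = lastTimes.put(gc.getName(), time);
            long recentCollections = prevCount == null ? count : count - prevCount;
            long recentMillis = prevTime == null ? time : time - prevTime;

            if (recentCollections > 0)
                System.out.printf("%s: %d collection(s), %d ms since last check%n",
                                  gc.getName(), recentCollections, recentMillis);
        }
    }

    public static void main(String[] args) throws InterruptedException
    {
        GcWatcher watcher = new GcWatcher();
        while (true)
        {
            watcher.check();
            Thread.sleep(5000);   // poll interval, analogous to the periodic logger discussed above
        }
    }
}
{code}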



 Native Memory Leak
 --

 Key: CASSANDRA-2868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Daniel Doubleday
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8.4

 Attachments: 2868-v1.txt, 2868-v2.txt, 48hour_RES.png, 
 low-load-36-hours-initial-results.png


 We have memory issues with long running servers. These have been confirmed by 
 several users in the user list. That's why I report.
 The memory consumption of the cassandra java process increases steadily until 
 it's killed by the os because of oom (with no swap)
 Our server is started with -Xmx3000M and running for around 23 days.
 pmap -x shows
 Total SST: 1961616 (mem mapped data and index files)
 Anon  RSS: 6499640
 Total RSS: 8478376
 This shows that > 3G are 'overallocated'.
 We will use BRAF on one of our less important nodes to check whether it is 
 related to mmap and report back.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-2868) Native Memory Leak

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081834#comment-13081834
 ] 

Brandon Williams edited comment on CASSANDRA-2868 at 8/9/11 6:43 PM:
-

bq. Wouldn't it be worth indicating how many collections have been done 
since the last log message if it's > 1, since it can (be > 1).

The only reason I added count tracking was to prevent it from firing when there 
were no GCs (the api is flakey.)  I've never actually been able to get > 1 to 
happen, but we can add it to the logging.

bq. IMO the duration-based thresholds are hard to reason about here, where 
we're dealing w/ summaries and not individual GC results.

We are dealing with individual GCs at least 99% of the time in practice.  The 
worst case is 1 GC inflates the gctime enough that we errantly log when it's 
not needed, but I imagine to trigger that you would have to be in a gc pressure 
situation already.

bq. I think I'd rather have something like the dropped messages logger, where 
every N seconds we log the summary we get from the mbean.

That seems like it could be a lot of noise since GC is constantly happening.

bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be 
removed. 

I think the logic there is still sound (Did we just do a CMS? Is the heap 
still > 80% full?) and it seems to work as well as it always has.



  was (Author: brandon.williams):
bq. Wouldn't it be worth indicating how many collections have been done 
since the last log message if it's > 1, since it can (be > 1).

The only reason I added count tracking was to prevent it from firing when there 
were no GCs (the api is flakey.)  I've never actually been able to get > 1 to 
happen, but we can add it to the logging.

bq. IMO the duration-based thresholds are hard to reason about here, where 
we're dealing w/ summaries and not individual GC results.

We are dealing with individual GCs at least 99% of the time in practice.  The 
worst case is 1 GC inflates the gctime enough that we errantly log when it's 
not needed, but I imagine to trigger that you would have to be in a gc pressure 
situation already.

bq. I think I'd rather have something like the dropped messages logger, where 
every N seconds we log the summary we get from the mbean.

That seems like it could a lot of noise since GC is constantly happening.

bq. The flushLargestMemtables/reduceCacheSizes stuff should probably be 
removed. 

I think the logic there is still sound (Did we just do a CMS? Is the heap 
still > 80% full?) and it seems to work as well as it always has.


  
 Native Memory Leak
 --

 Key: CASSANDRA-2868
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2868
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Daniel Doubleday
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.8.4

 Attachments: 2868-v1.txt, 2868-v2.txt, 48hour_RES.png, 
 low-load-36-hours-initial-results.png


 We have memory issues with long running servers. These have been confirmed by 
 several users in the user list. That's why I report.
 The memory consumption of the cassandra java process increases steadily until 
 it's killed by the os because of oom (with no swap)
 Our server is started with -Xmx3000M and running for around 23 days.
 pmap -x shows
 Total SST: 1961616 (mem mapped data and index files)
 Anon  RSS: 6499640
 Total RSS: 8478376
 This shows that > 3G are 'overallocated'.
 We will use BRAF on one of our less important nodes to check whether it is 
 related to mmap and report back.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081854#comment-13081854
 ] 

Sylvain Lebresne commented on CASSANDRA-2990:
-

bq. Why can't we hint the first replica?

Well, actually I think we could. Or at least if we cannot, I forgot why. We 
would need to be sure we never replay a hint twice though, which I'm not sure 
is guaranteed right now. Also, we can only do this if what we store as a 
hint is the serialized mutation (in this case, the serialized CounterMutation): 
we can't apply the CounterMutation on a non-replica (partly because that would 
potentially increase the counter context too much, partly because counter 
removes suck, which would probably be a problem at some point).

So it should be doable, but it's a bit of work.
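
To illustrate the constraint, a hypothetical sketch only (FakeCounterMutation and its wire format are invented here, not Cassandra's classes): the hint has to carry the mutation's serialized bytes so it can later be forwarded to a real replica, and is never applied locally on the coordinator.

{code}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch: the hint stores the mutation's serialized bytes so it can
// later be forwarded to a real replica; it is never applied on the coordinator.
public class CounterHintSketch
{
    // stand-in for the real CounterMutation; the wire format here is invented
    static final class FakeCounterMutation
    {
        final String key;
        final long delta;
        FakeCounterMutation(String key, long delta) { this.key = key; this.delta = delta; }

        byte[] serialize() throws IOException
        {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bos);
            out.writeUTF(key);
            out.writeLong(delta);
            out.flush();
            return bos.toByteArray();
        }

        static FakeCounterMutation deserialize(byte[] bytes) throws IOException
        {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            return new FakeCounterMutation(in.readUTF(), in.readLong());
        }
    }

    public static void main(String[] args) throws IOException
    {
        FakeCounterMutation cm = new FakeCounterMutation("counter1", 2);

        // store the bytes as the hint; do NOT apply cm locally on a non-replica
        byte[] hintPayload = cm.serialize();

        // later, once the replica is reachable again, deserialize and forward exactly once
        FakeCounterMutation replay = FakeCounterMutation.deserialize(hintPayload);
        System.out.println("forward +" + replay.delta + " for " + replay.key + " to the replica");
    }
}
{code}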

 We should refuse query for counters at CL.ANY
 -

 Key: CASSANDRA-2990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2990.patch


 We currently do not reject writes for counters at CL.ANY, even though this is 
 not supported (and rightly so).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2892) Don't replicate_on_write with RF=1

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081858#comment-13081858
 ] 

Hudson commented on CASSANDRA-2892:
---

Integrated in Cassandra-0.8 #264 (See 
[https://builds.apache.org/job/Cassandra-0.8/264/])
avoid doing read for no-op replicate-on-write at CL=1
patch by slebresne and jbellis for CASSANDRA-2892

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155460
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/StorageProxy.java


 Don't replicate_on_write with RF=1
 

 Key: CASSANDRA-2892
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2892
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2892-v1.5.txt, 2892.patch


 For counters with RF=1, we still do a read to replicate, even though there is 
 nothing to replicate it to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081862#comment-13081862
 ] 

Jonathan Ellis commented on CASSANDRA-2990:
---

Okay, +1 on making the validation match what is actually currently supported 
(no ANY for counters), although I'd change "not supported" to "not yet 
supported".

We can deal w/ adding ANY support if and when someone actually needs it.

 We should refuse query for counters at CL.ANY
 -

 Key: CASSANDRA-2990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2990.patch


 We currently do not reject writes for counters at CL.ANY, even though this is 
 not supported (and rightly so).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (CASSANDRA-2518) invalid column name length 0

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2518.
---

Resolution: Duplicate

probably CASSANDRA-2675, fixed in 0.7.7

 invalid column name length 0
 

 Key: CASSANDRA-2518
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2518
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.3
 Environment: three nodes, 
 JVM:
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms6G -Xmx6G -Xmn2400M 
 -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC 
 -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 
 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 
 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true
Reporter: lichenglin

 one of the three nodes running cassandra 0.7.3 reported this error after start up:
 ERROR [CompactionExecutor:1] 2011-04-16 22:18:39,281 PrecompactedRow.java 
 (line 82) Skipping row DecoratedKey(3813860378406449638560060231106122758, 
 79616e79776275636b65743030303030303030312f6f626a303030303030323534) in 
 /opt/cassandra/data/Keyspace/cf-f-4715-Data.db
 org.apache.cassandra.db.ColumnSerializer$CorruptColumnException: invalid 
 column name length 0
 at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:68)
 at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
 at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
 at 
 org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
 at 
 org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:139)
 at 
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
 at 
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
 at 
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at 
 com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
 at 
 org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
 at 
 org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
 at 
 org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:449)
 at 
 org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:124)
 at 
 org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:94)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
 and few minutes later,
 ERROR [CompactionExecutor:1] 2011-04-16 22:20:20,073 
 AbstractCassandraDaemon.java (line 114) Fatal exception in thread 
 Thread[CompactionExecutor:1,1,main]
 java.lang.OutOfMemoryError: Java heap space
 at 
 org.apache.cassandra.io.util.BufferedRandomAccessFile.readBytes(BufferedRandomAccessFile.java:267)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:310)
 at 
 org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:267)
 at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
 at 
 org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:129)
 at 
 org.apache.cassandra.io.sstable.SSTableIdentityIterator.getColumnFamilyWithColumns(SSTableIdentityIterator.java:176)
 at 
 org.apache.cassandra.io.PrecompactedRow.<init>(PrecompactedRow.java:78)
 at 
 org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:139)
 at 
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:108)
 at 
 org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:43)
 at 
 org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
 at 
 com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
 at 
 

[Cassandra Wiki] Trivial Update of Committers by JonathanEllis

2011-08-09 Thread Apache Wiki
Dear Wiki user,

You have subscribed to a wiki page or wiki category on Cassandra Wiki for 
change notification.

The Committers page has been changed by JonathanEllis:
http://wiki.apache.org/cassandra/Committers?action=diff&rev1=15&rev2=16

Comment:
update release manager

  ||Avinash Lakshman||Jan 2009||Facebook||Co-author of Facebook Cassandra||
  ||Prashant Malik||Jan 2009||Facebook||Co-author of Facebook Cassandra||
  ||Jonathan Ellis||Mar 2009||Datastax||Project chair||
- ||Eric Evans||Jun 2009||Rackspace||PMC member, Release manager, Debian 
packager||
+ ||Eric Evans||Jun 2009||Rackspace||PMC member, Debian packager||
  ||Jun Rao||Jun 2009||!LinkedIn||PMC member||
  ||Chris Goffinet||Sept 2009||Twitter||PMC member||
  ||Johan Oskarsson||Nov 2009||Twitter||Also a 
[[http://hadoop.apache.org/|Hadoop]] committer||
@@ -12, +12 @@

  ||Jaakko Laine||Dec 2009||?|| ||
  ||Brandon Williams||Jun 2010||Datastax||PMC member||
  ||Jake Luciani||Jan 2011||Datastax||Also a 
[[http://thrift.apache.org/|Thrift]] committer||
- ||Sylvain Lebresne||Mar 2011||Datastax||PMC member||
+ ||Sylvain Lebresne||Mar 2011||Datastax||PMC member, Release manager||
  ||Pavel Yaskevich||Aug 2011||Datastax|| ||
  


[jira] [Commented] (CASSANDRA-2993) Issues with parameters being escaped correctly in Python CQL

2011-08-09 Thread Blake Visin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081879#comment-13081879
 ] 

Blake Visin commented on CASSANDRA-2993:


Works for me too.  Thanks Tyler!

 Issues with parameters being escaped correctly in Python CQL
 

 Key: CASSANDRA-2993
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2993
 Project: Cassandra
  Issue Type: Bug
 Environment: Python CQL
Reporter: Blake Visin
Assignee: Tyler Hobbs
  Labels: CQL, parameter, python
 Attachments: 2993-cql-grammar.txt, 2993-pycql.txt, 
 2993-system-test.txt


 When using parameterised queries in Python CQL, strings are not being escaped 
 correctly.
 Query and Parameters:
 {code}
 'UPDATE sites SET :col = :val WHERE KEY = :site_id'
 {'col': 'feed_stats:1312493736688033024',
  'site_id': '899d15e8-bd4a-11e0-bc8c-001fe14cba06',
  'val': 
 (dp0\nS'1'\np1\n(lp2\nI1\naI2\naI3\naI4\nasS'0'\np3\n(lp4\nI1\naI2\naI3\naI4\nasS'3'\np5\n(lp6\nI1\naI2\naI3\naI4\nasS'2'\np7\n(lp8\nI1\naI2\naI3\naI4\nas.}
 {code}
 Query trying to be executed after processing parameters
 {code} 
 UPDATE sites SET 'feed_stats:1312493736688033024' = 
 '(dp0\nS''1''\np1\n(lp2\nI1\naI2\naI3\naI4\nasS''0''\np3\n(lp4\nI1\naI2\naI3\naI4\nasS''3''\np5\n(lp6\nI1\naI2\naI3\naI4\nasS''2''\np7\n(lp8\nI1\naI2\naI3\naI4\nas.'
  WHERE KEY = '899d15e8-bd4a-11e0-bc8c-001fe14cba06'
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2993) Issues with parameters being escaped correctly in Python CQL

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2993:
--

Reviewer: xedin

 Issues with parameters being escaped correctly in Python CQL
 

 Key: CASSANDRA-2993
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2993
 Project: Cassandra
  Issue Type: Bug
 Environment: Python CQL
Reporter: Blake Visin
Assignee: Tyler Hobbs
  Labels: CQL, parameter, python
 Attachments: 2993-cql-grammar.txt, 2993-pycql.txt, 
 2993-system-test.txt


 When using parameterised queries in Python CQL, strings are not being escaped 
 correctly.
 Query and Parameters:
 {code}
 'UPDATE sites SET :col = :val WHERE KEY = :site_id'
 {'col': 'feed_stats:1312493736688033024',
  'site_id': '899d15e8-bd4a-11e0-bc8c-001fe14cba06',
  'val': 
 (dp0\nS'1'\np1\n(lp2\nI1\naI2\naI3\naI4\nasS'0'\np3\n(lp4\nI1\naI2\naI3\naI4\nasS'3'\np5\n(lp6\nI1\naI2\naI3\naI4\nasS'2'\np7\n(lp8\nI1\naI2\naI3\naI4\nas.}
 {code}
 Query trying to be executed after processing parameters
 {code} 
 UPDATE sites SET 'feed_stats:1312493736688033024' = 
 '(dp0\nS''1''\np1\n(lp2\nI1\naI2\naI3\naI4\nasS''0''\np3\n(lp4\nI1\naI2\naI3\naI4\nasS''3''\np5\n(lp6\nI1\naI2\naI3\naI4\nasS''2''\np7\n(lp8\nI1\naI2\naI3\naI4\nas.'
  WHERE KEY = '899d15e8-bd4a-11e0-bc8c-001fe14cba06'
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk

2011-08-09 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated CASSANDRA-2325:
---

Attachment: cassandra-2325.patch.2.txt

 invalidateKeyCache / invalidateRowCache should remove saved cache files from 
 disk
 -

 Key: CASSANDRA-2325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2325
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.7.8, 0.8.2
Reporter: Matthew F. Dennis
Assignee: Edward Capriolo
Priority: Minor
 Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt


 the invalidate[Key|Row]Cache calls don't remove the saved caches from disk.
 It seems logical that if you are clearing the caches you don't expect them to 
 be reinstantiated with the old values the next time C* starts.
 This is not a huge issue since next time the caches are saved the old values 
 will be removed.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1155544 - /cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 20:18:47 2011
New Revision: 1155544

URL: http://svn.apache.org/viewvc?rev=1155544&view=rev
Log:
r/m merged reference to obsolete memtable_flush_after_mins

Modified:
cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java

Modified: cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java?rev=1155544&r1=1155543&r2=1155544&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/cli/CliClient.java Tue Aug  9 
20:18:47 2011
@@ -1671,7 +1671,6 @@ public class CliClient
 normaliseType(cfDef.key_validation_class, 
"org.apache.cassandra.db.marshal"));
 writeAttr(sb, false, "memtable_operations", 
cfDef.memtable_operations_in_millions);
 writeAttr(sb, false, "memtable_throughput", 
cfDef.memtable_throughput_in_mb);
-writeAttr(sb, false, "memtable_flush_after", 
cfDef.memtable_flush_after_mins);
 writeAttr(sb, false, "rows_cached", cfDef.row_cache_size);
 writeAttr(sb, false, "row_cache_save_period", 
cfDef.row_cache_save_period_in_seconds);
 writeAttr(sb, false, "keys_cached", cfDef.key_cache_size);




buildbot success in ASF Buildbot on cassandra-trunk

2011-08-09 Thread buildbot
The Buildbot has detected a restored build on builder cassandra-trunk while 
building ASF Buildbot.
Full details are available at:
 http://ci.apache.org/builders/cassandra-trunk/builds/1504

Buildbot URL: http://ci.apache.org/

Buildslave for this Build: isis_ubuntu

Build Reason: scheduler
Build Source Stamp: [branch cassandra/trunk] 1155544
Blamelist: jbellis

Build succeeded!

sincerely,
 -The Buildbot



svn commit: r1155548 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/cql/UpdateStatement.java src/java/org/apache/cassandra/thrift/ThriftValidation.java test/system/t

2011-08-09 Thread slebresne
Author: slebresne
Date: Tue Aug  9 20:24:17 2011
New Revision: 1155548

URL: http://svn.apache.org/viewvc?rev=1155548&view=rev
Log:
Refuse counter write at CL.ANY
patch by slebresne; reviewed by jbellis for CASSANDRA-2990

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
cassandra/branches/cassandra-0.8/test/system/test_cql.py
cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug  9 20:24:17 2011
@@ -2,6 +2,7 @@
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
+ * refuse counter write for CL.ANY (CASSANDRA-2990)
 
 
 0.8.3

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java
 Tue Aug  9 20:24:17 2011
@@ -39,6 +39,7 @@ import static org.apache.cassandra.cql.Q
 
 import static org.apache.cassandra.cql.Operation.OperationType;
 import static 
org.apache.cassandra.thrift.ThriftValidation.validateColumnFamily;
+import static 
org.apache.cassandra.thrift.ThriftValidation.validateCommutativeForWrite;
 
 /**
  * An <code>UPDATE</code> statement parsed from a CQL query statement.
@@ -142,6 +143,8 @@ public class UpdateStatement extends Abs
 }
 
 CFMetaData metadata = validateColumnFamily(keyspace, columnFamily, 
hasCommutativeOperation);
+if (hasCommutativeOperation)
+validateCommutativeForWrite(metadata, cLevel);
 
 QueryProcessor.validateKeyAlias(metadata, keyName);
 

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
 Tue Aug  9 20:24:17 2011
@@ -627,7 +627,11 @@ public class ThriftValidation
 
 public static void validateCommutativeForWrite(CFMetaData metadata, 
ConsistencyLevel consistency) throws InvalidRequestException
 {
-if (!metadata.getReplicateOnWrite() && consistency != 
ConsistencyLevel.ONE)
+if (consistency == ConsistencyLevel.ANY)
+{
+throw new InvalidRequestException("Consistency level ANY is not 
yet supported for counter columnfamily " + metadata.cfName);
+}
+else if (!metadata.getReplicateOnWrite() && consistency != 
ConsistencyLevel.ONE)
 {
 throw new InvalidRequestException("cannot achieve CL > CL.ONE 
without replicate_on_write on columnfamily " + metadata.cfName);
 }

Modified: cassandra/branches/cassandra-0.8/test/system/test_cql.py
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/system/test_cql.py?rev=1155548&r1=1155547&r2=1155548&view=diff
==
--- cassandra/branches/cassandra-0.8/test/system/test_cql.py (original)
+++ cassandra/branches/cassandra-0.8/test/system/test_cql.py Tue Aug  9 
20:24:17 2011
@@ -1260,6 +1260,11 @@ class TestCql(ThriftTester):
   cursor.execute,
   "UPDATE CounterCF SET count_me = count_not_me + 2 WHERE 
key = 'counter1'")
 
+# counters can't do ANY
+assert_raises(cql.ProgrammingError,
+  cursor.execute,
+  "UPDATE CounterCF USING CONSISTENCY ANY SET count_me = 
count_me + 2 WHERE key = 'counter1'")
+
 def test_key_alias_support(self):
 "should be possible to use alias instead of KEY keyword"
 cursor = init()

Modified: cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py
URL: 

svn commit: r1155549 - in /cassandra/trunk: ./ contrib/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/cql/ src/java/org/apache/cassandra/thrift/ test/system/

2011-08-09 Thread slebresne
Author: slebresne
Date: Tue Aug  9 20:26:07 2011
New Revision: 1155549

URL: http://svn.apache.org/viewvc?rev=1155549&view=rev
Log:
commit from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)
cassandra/trunk/src/java/org/apache/cassandra/cql/UpdateStatement.java
cassandra/trunk/src/java/org/apache/cassandra/thrift/ThriftValidation.java
cassandra/trunk/test/system/test_cql.py
cassandra/trunk/test/system/test_thrift_server.py

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1151306
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1155460,1155548
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1155549&r1=1155548&r2=1155549&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Tue Aug  9 20:26:07 2011
@@ -35,6 +35,7 @@
  * include files-to-be-streamed in StreamInSession.getSources (CASSANDRA-2972)
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
+ * refuse counter write for CL.ANY (CASSANDRA-2990)
 
 
 0.8.3

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
 /cassandra/branches/cassandra-0.7/contrib:1026516-1151306
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1155460,1155548
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1151306
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
-/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1155460
+/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125019-1155460,1155548
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369
 
/cassandra/branches/cassandra-0.8.1/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1101014-1125018
 
/cassandra/tags/cassandra-0.7.0-rc3/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1051699-1053689

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Tue Aug  9 20:26:07 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 

[jira] [Commented] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081901#comment-13081901
 ] 

Brandon Williams commented on CASSANDRA-3004:
-

+1

 Once a message has been dropped, cassandra logs total messages dropped and 
 tpstats every 5s forever
 ---

 Key: CASSANDRA-3004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3004
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.3
Reporter: Brandon Williams
Assignee: Jonathan Ellis
Priority: Minor
  Labels: lhf
 Fix For: 0.8.4

 Attachments: 3004.txt




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




svn commit: r1155558 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/net/MessagingService.java

2011-08-09 Thread jbellis
Author: jbellis
Date: Tue Aug  9 20:47:35 2011
New Revision: 1155558

URL: http://svn.apache.org/viewvc?rev=1155558&view=rev
Log:
switch back to only logging recent dropped messages
patch by jbellis; reviewed by brandonwilliams for CASSANDRA-3004

Modified:
cassandra/branches/cassandra-0.8/CHANGES.txt

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java

Modified: cassandra/branches/cassandra-0.8/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1155558&r1=1155557&r2=1155558&view=diff
==
--- cassandra/branches/cassandra-0.8/CHANGES.txt (original)
+++ cassandra/branches/cassandra-0.8/CHANGES.txt Tue Aug  9 20:47:35 2011
@@ -3,6 +3,7 @@
  * use JAVA env var in cassandra-env.sh (CASSANDRA-2785, 2992)
  * avoid doing read for no-op replicate-on-write at CL=1 (CASSANDRA-2892)
  * refuse counter write for CL.ANY (CASSANDRA-2990)
+ * switch back to only logging recent dropped messages (CASSANDRA-3004)
 
 
 0.8.3

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java?rev=1155558&r1=1155557&r2=1155558&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java
 Tue Aug  9 20:47:35 2011
@@ -100,18 +100,11 @@ public final class MessagingService impl
 private final Map<StorageService.Verb, AtomicInteger> droppedMessages = 
new EnumMap<StorageService.Verb, AtomicInteger>(StorageService.Verb.class);
 // dropped count when last requested for the Recent api.  high concurrency 
isn't necessary here.
 private final Map<StorageService.Verb, Integer> lastDropped = 
Collections.synchronizedMap(new EnumMap<StorageService.Verb, 
Integer>(StorageService.Verb.class));
+private final Map<StorageService.Verb, Integer> lastDroppedInternal = new 
EnumMap<StorageService.Verb, Integer>(StorageService.Verb.class);
 
 private final List<ILatencySubscriber> subscribers = new 
ArrayList<ILatencySubscriber>();
 private static final long DEFAULT_CALLBACK_TIMEOUT = (long) (1.1 * 
DatabaseDescriptor.getRpcTimeout());
 
-{
-for (StorageService.Verb verb : DROPPABLE_VERBS)
-{
-droppedMessages.put(verb, new AtomicInteger());
-lastDropped.put(verb, 0);
-}
-}
-
 private static class MSHandle
 {
 public static final MessagingService instance = new MessagingService();
@@ -123,6 +116,13 @@ public final class MessagingService impl
 
 private MessagingService()
 {
+for (StorageService.Verb verb : DROPPABLE_VERBS)
+{
+droppedMessages.put(verb, new AtomicInteger());
+lastDropped.put(verb, 0);
+lastDroppedInternal.put(verb, 0);
+}
+
 listenGate = new SimpleCondition();
 verbHandlers_ = new EnumMap<StorageService.Verb, 
IVerbHandler>(StorageService.Verb.class);
 streamExecutor_ = new DebuggableThreadPoolExecutor("Streaming", 
DatabaseDescriptor.getCompactionThreadPriority());
@@ -584,11 +584,13 @@ public final class MessagingService impl
 for (Map.Entry<StorageService.Verb, AtomicInteger> entry : 
droppedMessages.entrySet())
 {
 AtomicInteger dropped = entry.getValue();
-if (dropped.get() > 0)
+StorageService.Verb verb = entry.getKey();
+int recent = dropped.get() - lastDroppedInternal.get(verb);
+if (recent > 0)
 {
 logTpstats = true;
-logger_.info("{} {} messages dropped in server lifetime",
- dropped, entry.getKey());
+logger_.info("{} {} messages dropped in server lifetime", 
recent, verb);
+lastDroppedInternal.put(verb, dropped.get());
 }
 }
 




[jira] [Updated] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-3004:
--

Affects Version/s: (was: 0.8.3)
   0.8.2
   Issue Type: Improvement  (was: Bug)

 Once a message has been dropped, cassandra logs total messages dropped and 
 tpstats every 5s forever
 ---

 Key: CASSANDRA-3004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3004
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.2
Reporter: Brandon Williams
Assignee: Jonathan Ellis
Priority: Minor
  Labels: lhf
 Fix For: 0.8.4

 Attachments: 3004.txt




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2325:
--

Affects Version/s: (was: 0.7.8)
   (was: 0.8.2)
   0.6
Fix Version/s: 0.8.4

 invalidateKeyCache / invalidateRowCache should remove saved cache files from 
 disk
 -

 Key: CASSANDRA-2325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2325
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.6
Reporter: Matthew F. Dennis
Assignee: Edward Capriolo
Priority: Minor
 Fix For: 0.8.4

 Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt


 the invalidate[Key|Row]Cache calls don't remove the saved caches from disk.
 It seems logical that if you are clearing the caches you don't expect them to 
 be reinstantiated with the old values the next time C* starts.
 This is not a huge issue since next time the caches are saved the old values 
 will be removed.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2325) invalidateKeyCache / invalidateRowCache should remove saved cache files from disk

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081920#comment-13081920
 ] 

Jonathan Ellis commented on CASSANDRA-2325:
---

Shouldn't we check that the file exists first?  Otherwise we log spurious 
errors.

 invalidateKeyCache / invalidateRowCache should remove saved cache files from 
 disk
 -

 Key: CASSANDRA-2325
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2325
 Project: Cassandra
  Issue Type: Improvement
Affects Versions: 0.6
Reporter: Matthew F. Dennis
Assignee: Edward Capriolo
Priority: Minor
 Fix For: 0.8.4

 Attachments: cassandra-2325-1.patch.txt, cassandra-2325.patch.2.txt


 the invalidate[Key|Row]Cache calls don't remove the saved caches from disk.
 It seems logical that if you are clearing the caches you don't expect them to 
 be reinstantiated with the old values the next time C* starts.
 This is not a huge issue since next time the caches are saved the old values 
 will be removed.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (CASSANDRA-3005) OutboundTcpConnection's sending queue goes unboundedly without any backpressure logic

2011-08-09 Thread Melvin Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Melvin Wang reassigned CASSANDRA-3005:
--

Assignee: Melvin Wang

 OutboundTcpConnection's sending queue goes unboundedly without any 
 backpressure logic
 -

 Key: CASSANDRA-3005
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3005
 Project: Cassandra
  Issue Type: Improvement
Reporter: Melvin Wang
Assignee: Melvin Wang

 OutboundTcpConnection's sending queue unconditionally queues up requests 
 and processes them in sequence. Thinking about tagging each incoming message 
 with a timestamp and dropping it before actually sending if the message has stayed 
 in the queue for too long, where "too long" is defined by the message's own timeout 
 value.
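
 A rough sketch of that idea, with invented names (this is not the real OutboundTcpConnection code): stamp each message at enqueue time and drop it at dequeue time if it has already outlived its timeout.

{code}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Rough sketch, invented names: stamp messages on enqueue, drop stale ones on dequeue.
public class TimedOutboundQueue
{
    static final class Entry
    {
        final String message;                         // stand-in for the real Message object
        final long enqueuedAt = System.currentTimeMillis();
        Entry(String message) { this.message = message; }
    }

    private final LinkedBlockingQueue<Entry> queue = new LinkedBlockingQueue<Entry>();
    private final long timeoutMillis;

    TimedOutboundQueue(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void enqueue(String message) { queue.offer(new Entry(message)); }

    // pull one sendable message; returns null if nothing fresh was found in time
    String pollSendable(long waitMillis) throws InterruptedException
    {
        Entry e;
        while ((e = queue.poll(waitMillis, TimeUnit.MILLISECONDS)) != null)
        {
            long age = System.currentTimeMillis() - e.enqueuedAt;
            if (age <= timeoutMillis)
                return e.message;                     // still fresh: send it
            System.out.println("dropping stale message: " + e.message + " (age " + age + " ms)");
        }
        return null;
    }

    public static void main(String[] args) throws InterruptedException
    {
        TimedOutboundQueue q = new TimedOutboundQueue(100);    // 100 ms stands in for the rpc timeout
        q.enqueue("mutation-1");
        Thread.sleep(150);                                     // simulate a backed-up connection
        q.enqueue("mutation-2");
        System.out.println("sending: " + q.pollSendable(10));  // mutation-1 is dropped, mutation-2 is sent
        System.out.println("sending: " + q.pollSendable(10));  // queue empty -> null
    }
}
{code}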

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081930#comment-13081930
 ] 

Pavel Yaskevich commented on CASSANDRA-2988:


First of all I would like to point you to 
http://wiki.apache.org/cassandra/CodeStyle; please modify your code according 
to the conventions listed there.

According to c2988-modified-buffer.patch:

 - please encapsulate your modifications, because if you compare how it was and 
how it is in your patch it's hard to understand and just looks like a mess; I 
would like to suggest moving those modifications to a separate inner class 
(IndexReader maybe?) and replacing only the RandomAccessReader initialization in the 
SSTableReader.load(...) method...
 - let's add a test comparing getEstimatedRowSize().count(); and 
SSTable.estimateRowsFromIndex(input); just to be sure it works correctly.

Also I don't quite understand the logic behind while (buffer.remaining() > 10) { 
in SSTableReader.loadByteBuffer; let's avoid any hardcoding or at least comment 
why you did that.

I'm going to take a closer look at the patch for parallel index file loading after 
we are done with the index reader patch (c2988-modified-buffer.patch).
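
For what it's worth, here is a self-contained sketch of the chunked-buffer approach under discussion (the entry layout and all names are simplified assumptions, not the real index format or the attached patch): fill a fixed-size buffer, consume whole entries from it, and carry any partial entry over into the next fill.

{code}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch of chunked index scanning: fill a fixed-size buffer, consume whole
// (key, offset) entries, and carry a partial entry over to the next fill instead
// of checking EOF on every small read.  Assumes any single entry fits in a chunk.
public class ChunkedIndexScan
{
    public static void main(String[] args) throws IOException
    {
        byte[] index = buildFakeIndex();   // stands in for an -Index.db file
        int chunkSize = 32;                // deliberately tiny to force boundary handling

        ByteBuffer chunk = ByteBuffer.allocate(chunkSize);
        byte[] carry = new byte[0];
        int filePos = 0;
        int rows = 0;

        while (filePos < index.length || carry.length > 0)
        {
            chunk.clear();
            chunk.put(carry);                                          // partial entry from the last chunk
            int toRead = Math.min(chunk.remaining(), index.length - filePos);
            chunk.put(index, filePos, toRead);
            filePos += toRead;
            chunk.flip();

            while (true)
            {
                chunk.mark();
                if (!tryConsumeEntry(chunk))                           // entry crosses the chunk boundary
                {
                    chunk.reset();
                    break;
                }
                rows++;
            }

            carry = new byte[chunk.remaining()];
            chunk.get(carry);
            if (toRead == 0 && carry.length > 0)
                throw new IOException("truncated index entry at end of file");
        }
        System.out.println("rows estimated from index: " + rows);
    }

    // consume one (key, offset) entry; returns false if it is not completely buffered
    static boolean tryConsumeEntry(ByteBuffer buf)
    {
        if (buf.remaining() < 2)
            return false;
        int keyLength = buf.getShort() & 0xFFFF;       // writeUTF-style length prefix
        if (buf.remaining() < keyLength + 8)
            return false;
        buf.position(buf.position() + keyLength);      // skip the key bytes
        buf.getLong();                                 // data file offset
        return true;
    }

    static byte[] buildFakeIndex() throws IOException
    {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bos);
        for (int i = 0; i < 10; i++)
        {
            out.writeUTF("row-" + i);
            out.writeLong(i * 1000L);
        }
        out.flush();
        return bos.toByteArray();
    }
}
{code}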

 Improve SSTableReader.load() when loading index files
 -

 Key: CASSANDRA-2988
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Melvin Wang
Assignee: Melvin Wang
Priority: Minor
 Fix For: 1.0

 Attachments: c2988-modified-buffer.patch, 
 c2988-parallel-load-sstables.patch


 * when we create BufferredRandomAccessFile, we pass skipCache=true. This 
 hurts the read performance because we always process the index files 
 sequentially. Simple fix would be set it to false.
 * multiple index files of a single column family can be loaded in parallel. 
 This buys a lot when you have multiple super large index files.
 * we may also change how we buffer. By using BufferredRandomAccessFile, for 
 every read, we need a bunch of checks like
   - do we need to rebuffer?
   - isEOF()?
   - assertions
   These can be simplified to some extent.  We can blindly buffer the index 
 file in chunks and process the buffer until a key lies across the boundary of a 
 chunk. Then we rebuffer and start from the beginning of the partially read 
 key. Conceptually, this is the same as what BRAF does but w/o the overhead in the 
 read**() methods in BRAF.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081941#comment-13081941
 ] 

Brandon Williams commented on CASSANDRA-2950:
-

Currently, truncate does:
* force a flush
* record the time
* delete any sstables older than the time

This isn't quite enough if the machine crashes shortly afterward, however, 
since there can be mutations present in the commitlog that were previously 
truncated and are now resurrected by CL replay.

One thing we could do is record the truncate time for the CF in the system ks 
and then ignore mutations older than that, however this would require time 
synchronization between the client and the server to be accurate.
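
A minimal sketch of that proposal (names assumed, this is not Cassandra's replay code): persist a per-CF truncation timestamp and have commitlog replay skip older mutations; as noted, it hinges on clock agreement for the mutation timestamps.

{code}
import java.util.HashMap;
import java.util.Map;

// Sketch only: persist a per-columnfamily "truncated at" timestamp, then have
// commitlog replay skip any mutation whose write timestamp is older.
public class TruncateReplayFilter
{
    private final Map<String, Long> truncatedAt = new HashMap<String, Long>();

    // called by truncate, after the flush completes
    void recordTruncate(String columnFamily, long truncateTimeMillis)
    {
        truncatedAt.put(columnFamily, truncateTimeMillis);
    }

    // called for each mutation encountered during commitlog replay
    boolean shouldReplay(String columnFamily, long mutationTimestampMillis)
    {
        Long cutoff = truncatedAt.get(columnFamily);
        return cutoff == null || mutationTimestampMillis > cutoff;
    }

    public static void main(String[] args)
    {
        TruncateReplayFilter filter = new TruncateReplayFilter();
        filter.recordTruncate("Standard1", 1312924800000L);                       // truncate at time T

        System.out.println(filter.shouldReplay("Standard1", 1312924700000L));     // false: written before truncate
        System.out.println(filter.shouldReplay("Standard1", 1312924813259L));     // true: written after truncate
        System.out.println(filter.shouldReplay("LocationInfo", 1312924388140L));  // true: CF never truncated
    }
}
{code}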


 Data from truncated CF reappears after server restart
 -

 Key: CASSANDRA-2950
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
 Project: Cassandra
  Issue Type: Bug
Reporter: Cathy Daw
Assignee: Brandon Williams

 * Configure 3 node cluster
 * Ensure the java stress tool creates Keyspace1 with RF=3
 {code}
 // Run Stress Tool to generate 10 keys, 1 column
 stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 50 keys in CLI
 use Keyspace1; 
 list Standard1; 
 // TRUNCATE CF in CLI
 use Keyspace1;
 truncate counter1;
 list counter1;
 // Run stress tool and verify creation of 1 key with 10 columns
 stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 1 key in CLI
 use Keyspace1; 
 list Standard1; 
 // Restart all three nodes
 // You will see 51 keys in CLI
 use Keyspace1; 
 list Standard1; 
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3004) Once a message has been dropped, cassandra logs total messages dropped and tpstats every 5s forever

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081948#comment-13081948
 ] 

Hudson commented on CASSANDRA-3004:
---

Integrated in Cassandra-0.8 #265 (See 
[https://builds.apache.org/job/Cassandra-0.8/265/])
switch back to only logging recent dropped messages
patch by jbellis; reviewed by brandonwilliams for CASSANDRA-3004

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155558
Files : 
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/net/MessagingService.java


 Once a message has been dropped, cassandra logs total messages dropped and 
 tpstats every 5s forever
 ---

 Key: CASSANDRA-3004
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3004
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.8.2
Reporter: Brandon Williams
Assignee: Jonathan Ellis
Priority: Minor
  Labels: lhf
 Fix For: 0.8.4

 Attachments: 3004.txt




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2990) We should refuse query for counters at CL.ANY

2011-08-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081947#comment-13081947
 ] 

Hudson commented on CASSANDRA-2990:
---

Integrated in Cassandra-0.8 #265 (See 
[https://builds.apache.org/job/Cassandra-0.8/265/])
Refuse counter write at CL.ANY
patch by slebresne; reviewed by jbellis for CASSANDRA-2990

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1155548
Files : 
* /cassandra/branches/cassandra-0.8/test/system/test_cql.py
* /cassandra/branches/cassandra-0.8/CHANGES.txt
* /cassandra/branches/cassandra-0.8/test/system/test_thrift_server.py
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/cql/UpdateStatement.java


 We should refuse query for counters at CL.ANY
 -

 Key: CASSANDRA-2990
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2990
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Trivial
  Labels: counters
 Fix For: 0.8.4

 Attachments: 2990.patch


 We currently do not reject writes for counters at CL.ANY, even though this is 
 not supported (and rightly so).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread T Jake Luciani (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

T Jake Luciani updated CASSANDRA-2982:
--

Attachment: 2982-v1.txt

refactored api; should cover new index types. Should we consider removing the 
IndexType enum and just using the classname?
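
For readers following along, an illustrative sketch of the manager/base-class shape described in the issue (class and method names here are assumptions, not the API in 2982-v1.txt): one manager per column family dispatching to pluggable index implementations, registered by class rather than by an IndexType enum value.

{code}
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; names and signatures are assumptions, not the committed API.
abstract class SecondaryIndexSketch
{
    // index one column of one row; how/where it is stored is up to the implementation
    abstract void index(String rowKey, String columnName, String value);
}

// A CF-backed "KEYS" style index: the only kind the pre-refactor code assumed.
class KeysIndexSketch extends SecondaryIndexSketch
{
    @Override
    void index(String rowKey, String columnName, String value)
    {
        System.out.println("index CF write: " + value + " -> " + rowKey);
    }
}

// One manager per column family, dispatching to a pluggable implementation per column.
class SecondaryIndexManagerSketch
{
    private final Map<String, SecondaryIndexSketch> indexesByColumn = new HashMap<String, SecondaryIndexSketch>();

    // registration is by implementation instance/class, not an IndexType enum value
    void addIndex(String columnName, SecondaryIndexSketch index) { indexesByColumn.put(columnName, index); }

    void applyIndexUpdates(String rowKey, String columnName, String value)
    {
        SecondaryIndexSketch index = indexesByColumn.get(columnName);
        if (index != null)
            index.index(rowKey, columnName, value);
    }

    public static void main(String[] args)
    {
        SecondaryIndexManagerSketch manager = new SecondaryIndexManagerSketch();
        manager.addIndex("birthdate", new KeysIndexSketch());
        manager.applyIndexUpdates("user1", "birthdate", "1976-03-04");
        manager.applyIndexUpdates("user1", "name", "boris");   // no index registered: no-op
    }
}
{code}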

 Refactor secondary index api
 

 Key: CASSANDRA-2982
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 1.0

 Attachments: 2982-v1.txt


 Secondary indexes currently make some bad assumptions about the underlying 
 indexes.
 1. That they are always stored in other column families.
 2. That there is a unique index per column
 In the case of CASSANDRA-2915 neither of these are true.  The new api should 
 abstract the search concepts and allow any search api to plug in.
 Once the code is refactored and basically pluggable we can remove the 
 IndexType enum and use class names similar to how we handle partitioners and 
 comparators.
 Basic api is to add a SecondaryIndexManager that handles different index 
 types per CF and a SecondaryIndex base class that handles a particular type 
 implementation.
 This requires major changes to ColumnFamilyStore and Table.IndexBuilder

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13081966#comment-13081966
 ] 

Jonathan Ellis commented on CASSANDRA-2950:
---

but we record the CL context at time of flush in the sstable it makes, and on 
replay we ignore any mutations from before that position.

Checked, and we do wait for flush to complete in truncate.

 Data from truncated CF reappears after server restart
 -

 Key: CASSANDRA-2950
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
 Project: Cassandra
  Issue Type: Bug
Reporter: Cathy Daw
Assignee: Brandon Williams

 * Configure 3 node cluster
 * Ensure the java stress tool creates Keyspace1 with RF=3
 {code}
 // Run Stress Tool to generate 10 keys, 1 column
 stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 50 keys in CLI
 use Keyspace1; 
 list Standard1; 
 // TRUNCATE CF in CLI
 use Keyspace1;
 truncate counter1;
 list counter1;
 // Run stress tool and verify creation of 1 key with 10 columns
 stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 1 key in CLI
 use Keyspace1; 
 list Standard1; 
 // Restart all three nodes
 // You will see 51 keys in CLI
 use Keyspace1; 
 list Standard1; 
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081968#comment-13081968
 ] 

Jonathan Ellis commented on CASSANDRA-2982:
---

I don't think full index pluggability is a goal here.  So I don't see the point 
of that.

 Refactor secondary index api
 

 Key: CASSANDRA-2982
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 1.0

 Attachments: 2982-v1.txt


 Secondary indexes currently make some bad assumptions about the underlying 
 indexes.
 1. That they are always stored in other column families.
 2. That there is a unique index per column
 In the case of CASSANDRA-2915 neither of these are true.  The new api should 
 abstract the search concepts and allow any search api to plug in.
 Once the code is refactored and basically pluggable we can remove the 
 IndexType enum and use class names similar to how we handle partitioners and 
 comparators.
 Basic api is to add a SecondaryIndexManager that handles different index 
 types per CF and a SecondaryIndex base class that handles a particular type 
 implementation.
 This requires major changes to ColumnFamilyStore and Table.IndexBuilder

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081969#comment-13081969
 ] 

Brandon Williams commented on CASSANDRA-2950:
-

bq. but we record the commit log context at flush time in the sstable it creates, 
and on replay we ignore any mutations from before that position.

I think there's something wrong with that, then:

{noformat}
 INFO 21:25:15,274 Replaying 
/var/lib/cassandra/commitlog/CommitLog-1312924388053.log
DEBUG 21:25:15,290 Replaying 
/var/lib/cassandra/commitlog/CommitLog-1312924388053.log starting at 0
DEBUG 21:25:15,291 Reading mutation at 0
DEBUG 21:25:15,295 replaying mutation for system.4c: {ColumnFamily(LocationInfo 
[47656e65726174696f6e:false:4@131292438814,])}
DEBUG 21:25:15,321 Reading mutation at 89
DEBUG 21:25:15,322 replaying mutation for system.426f6f747374726170: 
{ColumnFamily(LocationInfo [42:false:1@1312924388203,])}
DEBUG 21:25:15,322 Reading mutation at 174
DEBUG 21:25:15,322 replaying mutation for system.4c: {ColumnFamily(LocationInfo 
[546f6b656e:false:16@1312924388204,])}
DEBUG 21:25:15,322 Reading mutation at 270
DEBUG 21:25:15,324 replaying mutation for Keyspace1.3030: 
{ColumnFamily(Standard1 
[C0:false:34@1312924813259,C1:false:34@1312924813260,C2:false:34@1312924813260,C3:false:34@1312924813260,C4:false:34@1312924813260,])}
{noformat}

The last entry there is the first of many errant mutations.

 Data from truncated CF reappears after server restart
 -

 Key: CASSANDRA-2950
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
 Project: Cassandra
  Issue Type: Bug
Reporter: Cathy Daw
Assignee: Brandon Williams

 * Configure 3 node cluster
 * Ensure the java stress tool creates Keyspace1 with RF=3
 {code}
 // Run Stress Tool to generate 10 keys, 1 column
 stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 50 keys in CLI
 use Keyspace1; 
 list Standard1; 
 // TRUNCATE CF in CLI
 use Keyspace1;
 truncate counter1;
 list counter1;
 // Run stress tool and verify creation of 1 key with 10 columns
 stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 1 key in CLI
 use Keyspace1; 
 list Standard1; 
 // Restart all three nodes
 // You will see 51 keys in CLI
 use Keyspace1; 
 list Standard1; 
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (CASSANDRA-2950) Data from truncated CF reappears after server restart

2011-08-09 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis reassigned CASSANDRA-2950:
-

Assignee: Jonathan Ellis  (was: Brandon Williams)

 Data from truncated CF reappears after server restart
 -

 Key: CASSANDRA-2950
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2950
 Project: Cassandra
  Issue Type: Bug
Reporter: Cathy Daw
Assignee: Jonathan Ellis

 * Configure 3 node cluster
 * Ensure the java stress tool creates Keyspace1 with RF=3
 {code}
 // Run Stress Tool to generate 10 keys, 1 column
 stress --operation=INSERT -t 2 --num-keys=50 --columns=20 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 50 keys in CLI
 use Keyspace1; 
 list Standard1; 
 // TRUNCATE CF in CLI
 use Keyspace1;
 truncate counter1;
 list counter1;
 // Run stress tool and verify creation of 1 key with 10 columns
 stress --operation=INSERT -t 2 --num-keys=1 --columns=10 
 --consistency-level=QUORUM --average-size-values --replication-factor=3 
 --create-index=KEYS --nodes=cathy1,cathy2
 // Verify 1 key in CLI
 use Keyspace1; 
 list Standard1; 
 // Restart all three nodes
 // You will see 51 keys in CLI
 use Keyspace1; 
 list Standard1; 
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081974#comment-13081974
 ] 

Jonathan Ellis commented on CASSANDRA-3010:
---

I.e., do we do \d CF (postgresql) or describe CF (mysql) or desc CF 
(oracle)?

 Java CQL command-line shell
 ---

 Key: CASSANDRA-3010
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0


 We need a real CQL shell that:
 - does not require installing additional environments
 - includes show keyspaces and other introspection tools
 - does not break existing cli scripts
 I.e., it needs to be java, but it should be a new tool instead of replacing 
 the existing cli.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081973#comment-13081973
 ] 

Jonathan Ellis commented on CASSANDRA-3010:
---

We should also pick a SQL command line to imitate for the introspection stuff. 
Might as well get that degree of familiarity too, since there is no reason not to.

 Java CQL command-line shell
 ---

 Key: CASSANDRA-3010
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0


 We need a real CQL shell that:
 - does not require installing additional environments
 - includes show keyspaces and other introspection tools
 - does not break existing cli scripts
 I.e., it needs to be java, but it should be a new tool instead of replacing 
 the existing cli.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081977#comment-13081977
 ] 

Jeremy Hanna commented on CASSANDRA-3010:
-

If I had to choose one, it would be nice to be more descriptive (describe 
versus \d).  However, it would be really nice to have a basic concept of 
synonyms.  For example mysql's cli supports both describe and desc.  Building 
that type of functionality in from the start shouldn't be too onerous.

 Java CQL command-line shell
 ---

 Key: CASSANDRA-3010
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0


 We need a real CQL shell that:
 - does not require installing additional environments
 - includes show keyspaces and other introspection tools
 - does not break existing cli scripts
 I.e., it needs to be java, but it should be a new tool instead of replacing 
 the existing cli.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081977#comment-13081977
 ] 

Jeremy Hanna edited comment on CASSANDRA-3010 at 8/9/11 10:36 PM:
--

If I had to choose one, it would be nice to be more descriptive (describe 
versus \d).  However, it would be really nice to have a basic concept of 
synonyms.  For example mysql's cli supports both describe and desc.  Building 
that type of functionality in from the start would hopefully not be too onerous.

  was (Author: jeromatron):
If I had to choose one, it would be nice to be more descriptive (describe 
versus \d).  However, it would be really nice to have a basic concept of 
synonyms.  For example mysql's cli supports both describe and desc.  Building 
that type of functionality in from the start shouldn't be too onerous.
  
 Java CQL command-line shell
 ---

 Key: CASSANDRA-3010
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0


 We need a real CQL shell that:
 - does not require installing additional environments
 - includes show keyspaces and other introspection tools
 - does not break existing cli scripts
 I.e., it needs to be java, but it should be a new tool instead of replacing 
 the existing cli.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3010) Java CQL command-line shell

2011-08-09 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13081982#comment-13081982
 ] 

Pavel Yaskevich commented on CASSANDRA-3010:


I don't think we should choose any single one, because we can support all of 
those notations using synonyms in the ANTLR grammar. It would be hard to include 
all of the possible synonyms from the beginning, but the grammar will be designed 
in a way that lets us easily add new synonyms as we go.
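
As an illustration of the synonym idea only (plain Java, independent of how the 
ANTLR grammar will actually express it; names are hypothetical):

{code}
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: map every accepted spelling to one canonical command keyword.
final class CommandSynonyms
{
    private static final Map<String, String> SYNONYMS = new HashMap<String, String>();
    static
    {
        SYNONYMS.put("describe", "DESCRIBE");
        SYNONYMS.put("desc",     "DESCRIBE");
        SYNONYMS.put("\\d",      "DESCRIBE");
    }

    static String canonical(String keyword)
    {
        String c = SYNONYMS.get(keyword.toLowerCase());
        return c != null ? c : keyword.toUpperCase();
    }
}
{code}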

 Java CQL command-line shell
 ---

 Key: CASSANDRA-3010
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3010
 Project: Cassandra
  Issue Type: New Feature
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.0


 We need a real CQL shell that:
 - does not require installing additional environments
 - includes show keyspaces and other introspection tools
 - does not break existing cli scripts
 I.e., it needs to be java, but it should be a new tool instead of replacing 
 the existing cli.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-08-09 Thread Melvin Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082008#comment-13082008
 ] 

Melvin Wang commented on CASSANDRA-2988:


bq. First of all I would like to point you to 
http://wiki.apache.org/cassandra/CodeStyle; please modify your code according to 
the conventions listed there.
Sure. This boils down to where to put the curly braces.

bq. please encapsulate your modifications, because if you compare how it was and 
how it is in your patch it's hard to understand and just looks like a mess. I 
would like to suggest moving those modifications to a separate inner class 
(IndexReader maybe?) and replacing only the RandomAccessReader initialization in 
the SSTableReader.load(...) method...
This patch changes most of the load() method; I am not clear how we could change 
only the initialization of RandomAccessReader.

bq. Also I don't quite understand the logic behind while (buffer.remaining() > 10) 
{ in SSTableReader.loadByteBuffer; let's avoid any hardcoding or at least comment 
why you did that.
Sorry for the lack of comments; I will add them. However, this is not really 
hardcoding: a Short is 2 bytes and a Long is 8 bytes, so the sum is 10 bytes. It 
is just a quick check for whether we have reached the end.
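
To make that concrete, a sketch of what the commented version could look like 
(illustrative names only, not the patch itself):

{code}
final class IndexEntrySizes
{
    // An index entry is at least a short key-length prefix (2 bytes) plus a long
    // data-file position (8 bytes), so 10 is the minimum size of one more entry.
    static final int KEY_LENGTH_SIZE = 2;
    static final int POSITION_SIZE   = 8;
    static final int MIN_INDEX_ENTRY = KEY_LENGTH_SIZE + POSITION_SIZE;
}

// e.g. while (buffer.remaining() > IndexEntrySizes.MIN_INDEX_ENTRY) { /* read next entry */ }
{code}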

bq. I'm going to take a closer look at the patch for parallel index file loading 
after we are done with the index reader patch (c2988-modified-buffer.patch).
FYI, these two patches are completely independent of each other.

 Improve SSTableReader.load() when loading index files
 -

 Key: CASSANDRA-2988
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Melvin Wang
Assignee: Melvin Wang
Priority: Minor
 Fix For: 1.0

 Attachments: c2988-modified-buffer.patch, 
 c2988-parallel-load-sstables.patch


 * when we create BufferredRandomAccessFile, we pass skipCache=true. This 
 hurts the read performance because we always process the index files 
 sequentially. Simple fix would be set it to false.
 * multiple index files of a single column family can be loaded in parallel. 
 This buys a lot when you have multiple super large index files.
 * we may also change how we buffer. By using BufferredRandomAccessFile, for 
 every read, we need bunch of checking like
   - do we need to rebuffer?
   - isEOF()?
   - assertions
   These can be simplified to some extent.  We can blindly buffer the index 
 file by chunks and process the buffer until a key lies across boundary of a 
 chunk. Then we rebuffer and start from the beginning of the partially read 
 key. Conceptually, this is same as what BRAF does but w/o the overhead in the 
 read**() methods in BRAF.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

2011-08-09 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2777:


Attachment: 2777-v2.txt

v2 rebased.

 Pig storage handler should implement LoadMetadata
 -

 Key: CASSANDRA-2777
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
 Project: Cassandra
  Issue Type: Improvement
  Components: Contrib
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.7.9

 Attachments: 2777-v2.txt, 2777.txt


 The reason for this is many builtin functions like SUM won't work on longs 
 (you can workaround using LongSum, but that's lame) because the query planner 
 doesn't know about the types beforehand, even though we are casting to native 
 longs.
 There is some impact to this, though.  With LoadMetadata implemented, 
 existing scripts that specify schema will need to remove it (since LM is 
 doing it for them) and they will need to conform to LM's terminology (key, 
 columns, name, value) within the script.  This is trivial to change, however, 
 and the increased functionality is worth the switch.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2988) Improve SSTableReader.load() when loading index files

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082016#comment-13082016
 ] 

Jonathan Ellis commented on CASSANDRA-2988:
---

bq. Short consists of 2 bytes and Long consists of 8 bytes, the sum is 10 bytes

IMO that's more obvious if you leave it as 2 + 8, or use the DBConstants 
class.

 Improve SSTableReader.load() when loading index files
 -

 Key: CASSANDRA-2988
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2988
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Melvin Wang
Assignee: Melvin Wang
Priority: Minor
 Fix For: 1.0

 Attachments: c2988-modified-buffer.patch, 
 c2988-parallel-load-sstables.patch


 * when we create BufferredRandomAccessFile, we pass skipCache=true. This 
 hurts the read performance because we always process the index files 
 sequentially. Simple fix would be set it to false.
 * multiple index files of a single column family can be loaded in parallel. 
 This buys a lot when you have multiple super large index files.
 * we may also change how we buffer. By using BufferredRandomAccessFile, for 
 every read, we need bunch of checking like
   - do we need to rebuffer?
   - isEOF()?
   - assertions
   These can be simplified to some extent.  We can blindly buffer the index 
 file by chunks and process the buffer until a key lies across boundary of a 
 chunk. Then we rebuffer and start from the beginning of the partially read 
 key. Conceptually, this is same as what BRAF does but w/o the overhead in the 
 read**() methods in BRAF.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-08-09 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams updated CASSANDRA-2810:


Attachment: 2810-v2.txt

It looks like the final problem here is that IntegerType always returns a 
BigInteger, which pig does not like.  This is unfortunate since IntegerType 
can't be easily subclassed and overridden to return ints.

v2 instead adds a setTupleValue method that is always used for adding values to 
tuples; it houses all the special-casing currently needed and provides a spot for 
more in the future, rather than proliferating custom type converters, since I'm 
sure IntegerType won't be the only offender here.
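
Roughly the shape of the idea (a hedged sketch under the assumption described 
above, not the attached 2810-v2.txt):

{code}
import java.math.BigInteger;

import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;

final class TupleValues
{
    // Funnel every value through one helper so type special-casing lives in a single place.
    static void setTupleValue(Tuple tuple, int index, Object value) throws ExecException
    {
        if (value instanceof BigInteger)
            tuple.set(index, ((BigInteger) value).intValue()); // pig has no tuple type for BigInteger
        else
            tuple.set(index, value);
    }
}
{code}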

 RuntimeException in Pig when using dump command on column name
 

 Key: CASSANDRA-2810
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.1
 Environment: Ubuntu 10.10, 32 bits
 java version 1.6.0_24
 Brisk beta-2 installed from Debian packages
Reporter: Silvère Lestang
Assignee: Brandon Williams
 Attachments: 2810-v2.txt, 2810.txt


 This bug was previously report on [Brisk bug 
 tracker|https://datastax.jira.com/browse/BRISK-232].
 In cassandra-cli:
 {code}
 [default@unknown] create keyspace Test
 with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
 and strategy_options = [{replication_factor:1}];
 [default@unknown] use Test;
 Authenticated to keyspace: Test
 [default@Test] create column family test;
 [default@Test] set test[ascii('row1')][long(1)]=integer(35);
 set test[ascii('row1')][long(2)]=integer(36);
 set test[ascii('row1')][long(3)]=integer(38);
 set test[ascii('row2')][long(1)]=integer(45);
 set test[ascii('row2')][long(2)]=integer(42);
 set test[ascii('row2')][long(3)]=integer(33);
 [default@Test] list test;
 Using default limit of 100
 ---
 RowKey: 726f7731
 = (column=0001, value=35, timestamp=1308744931122000)
 = (column=0002, value=36, timestamp=1308744931124000)
 = (column=0003, value=38, timestamp=1308744931125000)
 ---
 RowKey: 726f7732
 = (column=0001, value=45, timestamp=1308744931127000)
 = (column=0002, value=42, timestamp=1308744931128000)
 = (column=0003, value=33, timestamp=1308744932722000)
 2 Rows Returned.
 [default@Test] describe keyspace;
 Keyspace: Test:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
   Durable Writes: true
 Options: [replication_factor:1]
   Column Families:
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 {code}
 In Pig command line:
 {code}
 grunt test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
 (rowkey:chararray, columns: bag {T: (name:long, value:int)});
 grunt value_test = foreach test generate rowkey, columns.name, columns.value;
 grunt dump value_test;
 {code}
 In /var/log/cassandra/system.log, I have severals time this exception:
 {code}
 INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 
 TaskInProgress.java (line 551) Error from 
 attempt_201106210955_0051_m_00_3: java.lang.RuntimeException: Unexpected 
 data type -1 found in stream.
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
   at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
   at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
   at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
   at 
 org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
   at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
   at 
 org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
   at 
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
   at 
 

[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082039#comment-13082039
 ] 

Jonathan Ellis commented on CASSANDRA-2982:
---

Want to give a high-level overview of the changes here?

 Refactor secondary index api
 

 Key: CASSANDRA-2982
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 1.0

 Attachments: 2982-v1.txt


 Secondary indexes currently make some bad assumptions about the underlying 
 indexes.
 1. That they are always stored in other column families.
 2. That there is a unique index per column
 In the case of CASSANDRA-2915 neither of these are true.  The new api should 
 abstract the search concepts and allow any search api to plug in.
 Once the code is refactored and basically pluggable we can remove the 
 IndexType enum and use class names similar to how we handle partitioners and 
 comparators.
 Basic api is to add a SecondaryIndexManager that handles different index 
 types per CF and a SecondaryIndex base class that handles a particular type 
 implementation.
 This requires major changes to ColumnFamilyStore and Table.IndexBuilder

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082091#comment-13082091
 ] 

Boris Yen commented on CASSANDRA-3006:
--

Here is the test program I am using now; the hector version is 0.8.0-2.
Hope this will be helpful.


import java.util.Arrays;

import me.prettyprint.cassandra.model.AllOneConsistencyLevelPolicy;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.cassandra.service.ThriftCluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HCounterColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;


public class CounterTest {
    private Logger logger = LoggerFactory.getLogger(CounterTest.class);
    private static final Integer COUNTER_NUM = 1000;
    private static final StringSerializer ss = StringSerializer.get();
    private static final String HOST = "172.17.19.151:9160";
    private ThriftCluster cluster;

    /**
     * @param args
     */
    public static void main(String[] args) {
        CounterTest tc = new CounterTest();

        try {
            tc.testAlarmCounter();
        } catch (InterruptedException e) {

        }
    }

    public CounterTest() {
        CassandraHostConfigurator chc = new CassandraHostConfigurator(HOST);
        chc.setMaxActive(100);
        chc.setMaxIdle(10);
        chc.setCassandraThriftSocketTimeout(6);

        cluster = new ThriftCluster("Test Cluster", chc);
    }

    public void testAlarmCounter() throws InterruptedException {
        int successCounter = 0;
        int cl = 0;

        for (int i = 0; i < COUNTER_NUM; i++) {
            try {
                logger.info("count: " + i);

                Mutator<String> mutator =
                        HFactory.createMutator(getKeyspace(cl), StringSerializer.get());

                HCounterColumn<String> column =
                        HFactory.createCounterColumn("testSC", 1L);
                mutator.addCounter("sc", "testCounter",
                        HFactory.createCounterSuperColumn("testC", Arrays.asList(column), ss, ss));
                mutator.execute();

                successCounter++;
            } catch (Exception e) {
                logger.info("Error! Change consistency level to 1.", e);
                cl = 1;
            }

            Thread.sleep(50);
        }

        logger.info("\nsuccess counter: " + successCounter);
    }

    private Keyspace getKeyspace(int cl) {
        if (cl == 1)
            return HFactory.createKeyspace("test", cluster,
                    new AllOneConsistencyLevelPolicy());
        else
            return HFactory.createKeyspace("test", cluster); // default consistency level is Quorum
    }
}

 Enormous counter 
 -

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen
Assignee: Sylvain Lebresne

 I have two-node cluster with the following keyspace and column family 
 settings.
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
 Keyspace: test:
   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
   Durable Writes: true
 Options: [datacenter1:2]
   Column Families:
 ColumnFamily: testCounter (Super)
 APP status information.
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.CounterColumnType
   Columns sorted by: 
 org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
 

[jira] [Commented] (CASSANDRA-2991) Add a 'load new sstables' JMX/nodetool command

2011-08-09 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082097#comment-13082097
 ] 

Jonathan Ellis commented on CASSANDRA-2991:
---

What about the restore snapshot scenario?

 Add a 'load new sstables' JMX/nodetool command
 --

 Key: CASSANDRA-2991
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2991
 Project: Cassandra
  Issue Type: New Feature
Reporter: Brandon Williams
Priority: Minor
 Fix For: 0.8.4


 Sometimes people have to create a new cluster to get around a problem and 
 need to copy sstables around.  It would be convenient to be able to trigger 
 this from nodetool or JMX instead of doing a restart of the node.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-1608) Redesigned Compaction

2011-08-09 Thread Benjamin Coverston (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Coverston updated CASSANDRA-1608:
--

Attachment: 1608-v13.txt

1608 without some of the cruft

 Redesigned Compaction
 -

 Key: CASSANDRA-1608
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1608
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Chris Goffinet
Assignee: Benjamin Coverston
 Attachments: 1608-v11.txt, 1608-v13.txt, 1608-v2.txt


 After seeing the I/O issues in CASSANDRA-1470, I've been doing some more 
 thinking on this subject that I wanted to lay out.
 I propose we redo the concept of how compaction works in Cassandra. At the 
 moment, compaction is kicked off based on a write access pattern, not read 
 access pattern. In most cases, you want the opposite. You want to be able to 
 track how well each SSTable is performing in the system. If we were to keep 
 statistics in-memory of each SSTable, prioritize them based on most accessed, 
 and bloom filter hit/miss ratios, we could intelligently group sstables that 
 are being read most often and schedule them for compaction. We could also 
 schedule lower priority maintenance on SSTable's not often accessed.
 I also propose we limit the size of each SSTable to a fix sized, that gives 
 us the ability to  better utilize our bloom filters in a predictable manner. 
 At the moment after a certain size, the bloom filters become less reliable. 
 This would also allow us to group data most accessed. Currently the size of 
 an SSTable can grow to a point where large portions of the data might not 
 actually be accessed as often.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-3006) Enormous counter

2011-08-09 Thread Boris Yen (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082122#comment-13082122
 ] 

Boris Yen commented on CASSANDRA-3006:
--

In order to make it easier to reproduce this issue, here is how I recreate it 
step by step.

1. Clean out anything inside /var/lib/cassandra on node 172.17.19.151.

2. Start cassandra on node 172.17.19.151.

3. Clean out anything inside /var/lib/cassandra on node 172.17.19.152.

4. Modify the cassandra.yaml of 172.17.19.152 and add 172.17.19.151 as a seed.

5. Start cassandra on node 172.17.19.152. I could see the two nodes had formed a 
cluster; I also double-checked that using nodetool.

6. On node 172.17.19.151, I use cassandra-cli to connect to 172.17.19.151/9160 
and execute these commands -

create keyspace test
with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
and strategy_options = [{datacenter1:2}];

create column family testCounter
with column_type = Super
and default_validation_class = CounterColumnType
and replicate_on_write = true
and comparator = BytesType
and subcomparator = BytesType
and comment = 'APP status information.';

7. Use the test program to add to the counter 1000 times. Between each add, the 
program pauses 50 milliseconds.

8. In the middle of the adding process, shut down the cassandra on node 
172.17.19.152 (let's say I shut down node 172.17.19.152 when the count is 200). 
Because the test program changes the consistency level to One when it encounters 
an exception (a timeout exception, to be exact), the following adds still succeed.

9. Wait for the overall adding process to complete. I saw "success counter: 999" 
due to the one exception.

10. Use cassandra-cli to connect to 172.17.19.151 and 172.17.19.152 and check the 
counter value; it is 1001 on both nodes. It shows 1001 because hector retries 
when it encounters the timeout exception.

11. Shut down the cassandra on 172.17.19.151 and wait a few seconds; I saw 
"InetAddress /172.17.19.151 is now dead" on node 172.17.19.152.

12. After seeing "InetAddress /172.17.19.151 is now dead", restart the cassandra 
on node 172.17.19.151.

13. Check the counter again with cassandra-cli on both nodes. This time the 
counter is no longer 1001; it is some other weird number.

Hope someone else can recreate it with these steps.

 Enormous counter 
 -

 Key: CASSANDRA-3006
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3006
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.3
 Environment: ubuntu 10.04
Reporter: Boris Yen
Assignee: Sylvain Lebresne

 I have two-node cluster with the following keyspace and column family 
 settings.
 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions: 
   63fda700-c243-11e0--2d03dcafebdf: [172.17.19.151, 172.17.19.152]
 Keyspace: test:
   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
   Durable Writes: true
 Options: [datacenter1:2]
   Column Families:
 ColumnFamily: testCounter (Super)
 APP status information.
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.CounterColumnType
   Columns sorted by: 
 org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: true
   Built indexes: []
 Then, I use a test program based on hector to add a counter column 
 (testCounter[sc][column]) 1000 times. In the middle the adding process, I 
 intentional shut down the node 172.17.19.152. In addition to that, the test 
 program is smart enough to switch the consistency level from Quorum to One, 
 so that the following adding actions would not fail. 
 After all the adding actions are done, I start the cassandra on 
 172.17.19.152, and I use cassandra-cli to check if the counter is correct on 
 both nodes, and I got a result 1001 which should be reasonable because hector 
 will retry once. However, when I shut down 172.17.19.151 and after 
 172.17.19.152 is aware of 172.17.19.151 is down, I try to start the cassandra 
 on 172.17.19.151 again. Then, I check the counter again, this time I got a 
 result 481387 which is so wrong.
 I use 0.8.3 to reproduce this bug, but I think this also happens on 0.8.2 or 
 before also. 

--
This message is automatically 

[jira] [Commented] (CASSANDRA-2982) Refactor secondary index api

2011-08-09 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082123#comment-13082123
 ] 

T Jake Luciani commented on CASSANDRA-2982:
---

Sure. I've abstracted the index management for a column family into 
SecondaryIndexManager. For a particular column, an index type can be specified 
that is implemented by a SecondaryIndex subclass. 

Index building and updating works the same as before but is now encapsulated by 
this API. The search API is abstracted by a custom SecondaryIndexSearcher 
subclass, which handles searching an IndexClause for columns of a specific index 
type. 

This does not support searching across index types, so all queries must use index 
expressions of the same index type; otherwise you get an exception. 

The one thing I might change is not exposing the cfs indexmanager variable, and 
instead exposing all the index manager calls as part of the CFS API, which would 
delegate to the index manager. 
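
A rough sketch of the shapes described above, for readers who have not opened the 
patch (illustrative names and signatures only; the attached 2982-v1.txt is 
authoritative):

{code}
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// One index implementation per indexed column / index type.
abstract class SecondaryIndexSketch
{
    abstract void index(ByteBuffer rowKey);  // build or update entries for a row
    abstract void delete(ByteBuffer rowKey); // remove entries for a row
}

// Per-CF manager: owns the indexes and hands queries to the right searcher.
final class SecondaryIndexManagerSketch
{
    private final Map<ByteBuffer, SecondaryIndexSketch> indexesByColumn =
            new HashMap<ByteBuffer, SecondaryIndexSketch>();

    void addIndex(ByteBuffer column, SecondaryIndexSketch index)
    {
        indexesByColumn.put(column, index);
    }

    SecondaryIndexSketch indexFor(ByteBuffer column)
    {
        return indexesByColumn.get(column);
    }
}
{code}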

 Refactor secondary index api
 

 Key: CASSANDRA-2982
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2982
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core
Reporter: T Jake Luciani
Assignee: T Jake Luciani
 Fix For: 1.0

 Attachments: 2982-v1.txt


 Secondary indexes currently make some bad assumptions about the underlying 
 indexes.
 1. That they are always stored in other column families.
 2. That there is a unique index per column
 In the case of CASSANDRA-2915 neither of these are true.  The new api should 
 abstract the search concepts and allow any search api to plug in.
 Once the code is refactored and basically pluggable we can remove the 
 IndexType enum and use class names similar to how we handle partitioners and 
 comparators.
 Basic api is to add a SecondaryIndexManager that handles different index 
 types per CF and a SecondaryIndex base class that handles a particular type 
 implementation.
 This requires major changes to ColumnFamilyStore and Table.IndexBuilder

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (CASSANDRA-2170) Load spikes

2011-08-09 Thread Jason Harvey (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13082138#comment-13082138
 ] 

Jason Harvey commented on CASSANDRA-2170:
-

Re-opening per request of driftx.

We are still seeing this problem, ever since our upgrade from 0.6.7.

It is 100% consistent on 0.7.1, 0.7.2, 0.7.3, 0.7.4, 0.8.0, and 0.8.1. I've tried 
the Sun JRE and OpenJDK, with JNA and without, on Ubuntu 8.04/10.04/10.10/11.04 
as well as RHEL 5.1. It *only* happens on coordinator nodes.

For the 0.8 ring, I created a brand new ring and added data from our app one CF 
at a time. As soon as I added a busy CF, the problem popped up again. The load on 
the boxes in the new ring is under 1 all the time, except when the load spike 
occurs.

 Load spikes
 ---

 Key: CASSANDRA-2170
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.6.11
Reporter: Jonathan Ellis

 as reported on CASSANDRA-2058, some users are still seeing load spikes on 
 0.6.11, even with fairly low-volume read workloads.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Reopened] (CASSANDRA-2170) Load spikes

2011-08-09 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams reopened CASSANDRA-2170:
-

  Assignee: Brandon Williams

 Load spikes
 ---

 Key: CASSANDRA-2170
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2170
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.6.11
Reporter: Jonathan Ellis
Assignee: Brandon Williams

 as reported on CASSANDRA-2058, some users are still seeing load spikes on 
 0.6.11, even with fairly low-volume read workloads.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira