[jira] [Created] (CASSANDRA-2808) add java vendor/version to cassandra startup logging

2011-06-22 Thread Jackson Chung (JIRA)
add java vendor/version to cassandra startup logging


 Key: CASSANDRA-2808
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2808
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor


Currently, determining exactly which Java is being used by the CassandraDaemon 
JVM can be difficult. Some may have used an rpm/deb Java package; others may 
have used a tarball and set JAVA_HOME, PATH, etc.

It can be done, but it may take some iteration to get the true answer if one's 
setup is complicated (for example, the startup user is root/cassandra and has 
different environment settings from the login user).

It would be very helpful to have this information simply logged in the log 
file, right at the beginning. This helps identify the Java type/version 
quickly without much operational overhead, and is easily done in a one-liner:

logger.info("Java vendor/version: {}/{}", System.getProperty("java.vm.name"), 
System.getProperty("java.version"));

With OpenJDK, you will see something similar to: 
 INFO [main] 2011-06-22 07:08:16,610 AbstractCassandraDaemon.java (line 95) 
Java vendor/version: OpenJDK 64-Bit Server VM/1.6.0_20

In Java(TM), you will get something like:

 INFO [main] 2011-06-22 00:15:34,936 AbstractCassandraDaemon.java (line 96) 
Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24

This little addition will go a long way.
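
As a minimal sketch of the proposed one-liner, the following standalone class builds the same message from the standard system properties. The class name JvmInfo and the use of java.util.logging (in place of the SLF4J-style logger in the snippet above) are assumptions made so the example runs without extra dependencies; it is not the attached patch.

```java
import java.util.logging.Logger;

public class JvmInfo {
    private static final Logger logger = Logger.getLogger(JvmInfo.class.getName());

    // Builds the "vm name/version" string from standard JVM system properties.
    public static String describeJvm() {
        return System.getProperty("java.vm.name") + "/" + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        // Logged once at startup, before anything else.
        logger.info("Java vendor/version: " + describeJvm());
    }
}
```

Because the properties are read inside the running JVM, the output reflects the Java that actually started the process, regardless of what JAVA_HOME or PATH look like for the login user.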

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (CASSANDRA-2808) add java vendor/version to cassandra startup logging

2011-06-22 Thread Jackson Chung (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackson Chung updated CASSANDRA-2808:
-

Attachment: 2808.patch

patch based on trunk





[jira] [Commented] (CASSANDRA-2792) Bootstrapping node stalls. Bootstrapper thinks it is still streaming some sstables. The source nodes do not. Caused by IllegalStateException on source nodes.

2011-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053104#comment-13053104
 ] 

Hudson commented on CASSANDRA-2792:
---

Integrated in Cassandra-0.7 #507 (See 
[https://builds.apache.org/job/Cassandra-0.7/507/])
Improve thread safety in StreamOutSession
patch by slebresne; reviewed by jbellis for CASSANDRA-2792

slebresne : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1138148
Files : 
* /cassandra/branches/cassandra-0.7/CHANGES.txt
* /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOutSession.java


 Bootstrapping node stalls. Bootstrapper thinks it is still streaming some 
 sstables. The source nodes do not. Caused by IllegalStateException on source 
 nodes.
 -

 Key: CASSANDRA-2792
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2792
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6
 Environment: Ubuntu 
Reporter: Dominic Williams
Assignee: Sylvain Lebresne
 Fix For: 0.7.7

 Attachments: 0001-Make-StreamOutSession-threadSafe.patch

   Original Estimate: 4h
  Remaining Estimate: 4h

 I am bootstrapping a node into a 4 node cluster with RF3 (1 node is currently 
 down due to sstable issues, but the cluster is running without issues). 
 There are two keyspaces FightMyMonster and FMM_Studio. The first keyspace 
 successfully streams and the whole operation is probably at 99% when it 
 stalls on some sstables in the much smaller FMM_Studio keyspace.
 Netstats on the bootstrapping node reports it is still streaming:
 Mode: Bootstrapping
 Not sending any streams.
 Streaming from: /192.168.1.4
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/PartsData-f-101-Data.db 
 sections=1 progress=0/76453 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/PartsData-f-103-Data.db 
 sections=1 progress=0/90475 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/PartsData-f-102-Data.db 
 sections=1 progress=0/4304182 - 0%
 Streaming from: /192.168.1.3
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/PartsData-f-158-Data.db 
 sections=2 progress=0/146990 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/AuthorClasses-f-81-Data.db 
 sections=1 progress=0/3992 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/Studio-f-70-Data.db 
 sections=1 progress=0/1776 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/PartsData-f-159-Data.db 
 sections=2 progress=0/136829 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/PartsData-f-157-Data.db 
 sections=2 progress=0/5779597 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/AuthorClasses-f-82-Data.db 
 sections=1 progress=0/161 - 0%
FMM_Studio: /var/opt/cassandra/data/FMM_Studio/Studio-f-71-Data.db 
 sections=1 progress=0/135 - 0%
 Pool Name                    Active   Pending      Completed
 Commands                        n/a         0            334
 Responses                       n/a         0         421957
 However, running netstats on the source nodes reports they are not streaming:
 Mode: Normal
  Nothing streaming to /192.168.1.9
 Not receiving any streams.
 Pool Name                    Active   Pending      Completed
 Commands                        n/a         0        1949476
 Responses   n/a 11778768
 Examination of the logs on the source nodes show an IllegalStateException 
 that has likely interrupted/broken the streaming process.
 17 22:27:05,924 StreamOut.java (line 126) Beginning transfer to /192.168.1.9
  INFO [StreamStage:1] 2011-06-17 22:27:05,925 StreamOut.java (line 100) 
 Flushing memtables for FMM_Studio...
  INFO [StreamStage:1] 2011-06-17 22:27:06,004 StreamOut.java (line 173) 
 Stream context metadata 
 [/var/opt/cassandra/data/FMM_Studio/Classes-f-107-Data.db sections=1 
 progress=0/1585378 - 0%, 
 /var/opt/cassandra/data/FMM_Studio/PartsData-f-100-Data.db sections=1 
 progress=0/76453 - 0%, 
 /var/opt/cassandra/data/FMM_Studio/PartsData-f-98-Data.db sections=1 
 progress=0/4309514 - 0%, 
 /var/opt/cassandra/data/FMM_Studio/PartsData-f-99-Data.db sections=1 
 progress=0/90475 - 0%], 11 sstables.
  INFO [StreamStage:1] 2011-06-17 22:27:06,005 StreamOutSession.java (line 
 174) Streaming to /192.168.1.9
  INFO [StreamStage:1] 2011-06-17 22:27:06,006 StreamOut.java (line 126) 
 Beginning transfer to /192.168.1.9
  INFO [StreamStage:1] 2011-06-17 22:27:06,007 StreamOut.java (line 100) 
 Flushing memtables for FightMyMonster...
  INFO [StreamStage:1] 2011-06-17 22:27:06,007 ColumnFamilyStore.java (line 
 1065) Enqueuing flush of Memtable-MonsterMarket_1@1054909557(338 bytes, 24 
 operations)
  INFO 

[jira] [Commented] (CASSANDRA-2164) debian build dep on ant-optional is missing

2011-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053105#comment-13053105
 ] 

Hudson commented on CASSANDRA-2164:
---

Integrated in Cassandra-0.7 #507 (See 
[https://builds.apache.org/job/Cassandra-0.7/507/])
add ant-optional debian Build-Depends
patch by Peter Schuller; reviewed by Paul Cannon for CASSANDRA-2164

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1138102
Files : 
* /cassandra/branches/cassandra-0.7/debian/control


 debian build dep on ant-optional is missing
 ---

 Key: CASSANDRA-2164
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2164
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.7.0
Reporter: Peter Schuller
Assignee: Peter Schuller
Priority: Minor
 Fix For: 0.7.7, 0.8.2

 Attachments: 2164.txt


 Without the ant-optional package installed in Debian, builds fail (on lenny) 
 with:
 Could not create type regexpmapper due to No supported regular expression 
 matcher found: java.lang.ClassNotFoundException: 
 org.apache.tools.ant.util.regexp.Jdk14RegexpMatcher
 The attached patch makes it build. Tested on lenny and squeeze.





[jira] [Commented] (CASSANDRA-2804) expose dropped messages, exceptions over JMX

2011-06-22 Thread Wojciech Meler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053130#comment-13053130
 ] 

Wojciech Meler commented on CASSANDRA-2804:
---

You read my mind :) - I can't wait to see a dropped-messages graph in Zabbix :)
It would also be nice to have request time stats, as we don't know how long 
requests live in queues, so it is hard to say how long clients wait to perform 
an operation.

 expose dropped messages, exceptions over JMX
 

 Key: CASSANDRA-2804
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2804
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.8.2








[jira] [Issue Comment Edited] (CASSANDRA-2807) ColumnFamilyInputFormat configuration should support multiple initial addresses

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053143#comment-13053143
 ] 

Mck SembWever edited comment on CASSANDRA-2807 at 6/22/11 9:36 AM:
---

there was a brief mention of this in CASSANDRA-2388 

  see comment 
https://issues.apache.org/jira/browse/CASSANDRA-2388?focusedCommentId=13046450&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046450

  was (Author: michaelsembwever):
there was a brief mention of this in 
https://issues.apache.org/jira/browse/CASSANDRA-2388?focusedCommentId=13046450&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046450
  
 ColumnFamilyInputFormat configuration should support multiple initial 
 addresses
 ---

 Key: CASSANDRA-2807
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2807
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.8.0
Reporter: Greg Katz

 The {{ColumnFamilyInputFormat}} class only allows a single initial node to be 
 specified through the cassandra.thrift.address configuration property. The 
 configuration should support a list of nodes in order to account for the 
 possibility that the initial node becomes unavailable.
 By contrast, the {{RingCache}} class used by the {{ColumnFamilyRecordWriter}} 
 reads the exact same {{cassandra.thrift.address}} property but splits its 
 value on commas to allow multiple initial nodes to be specified.
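
 A minimal sketch of the comma-splitting behaviour described above, assuming the property value is available as a plain string. The class and method names (InitialAddresses, parse) and the host names are hypothetical; this is not the code from RingCache or ColumnFamilyInputFormat, and the whitespace trimming is an addition of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class InitialAddresses {
    // Splits a comma-separated property value into individual host names,
    // trimming whitespace and skipping empty entries.
    public static List<String> parse(String propertyValue) {
        List<String> hosts = new ArrayList<String>();
        if (propertyValue == null) {
            return hosts;
        }
        for (String part : propertyValue.split(",")) {
            String host = part.trim();
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Example value for cassandra.thrift.address; the host names are made up.
        System.out.println(parse("node1.example.com, node2.example.com"));
    }
}
```

 With a list in hand, a caller could try each address in turn and only fail once every initial node has been found unavailable.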





[jira] [Commented] (CASSANDRA-2807) ColumnFamilyInputFormat configuration should support multiple initial addresses

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053143#comment-13053143
 ] 

Mck SembWever commented on CASSANDRA-2807:
--

there was a brief mention of this in 
https://issues.apache.org/jira/browse/CASSANDRA-2388?focusedCommentId=13046450&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13046450





[jira] [Commented] (CASSANDRA-2804) expose dropped messages, exceptions over JMX

2011-06-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053148#comment-13053148
 ] 

Sylvain Lebresne commented on CASSANDRA-2804:
-

bq. It would also be nice to have request time stats as we don't know how long 
requests live in queues so it is hard to say how long clients wait to perform 
an operation.

That is the goal of the values (RecentReadLatencyMicros, 
TotalReadLatencyMicros, ...) exposed for each column family in the 
org.apache.cassandra.db:ColumnFamilies.keyspaceName.columnFamilyName mbean. 
It doesn't include any queuing, it is recorded directly in the thrift thread 
accepting the request (the only thing it doesn't take into account is the small 
validation we do of the request, but that's a handful of cpu cycles so it 
doesn't matter). 





[jira] [Commented] (CASSANDRA-2798) Repair Fails 0.8

2011-06-22 Thread David Arena (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053185#comment-13053185
 ] 

David Arena commented on CASSANDRA-2798:


Soo after a restart and a compact.. im looking at this.. ( Still doesnt seem 
absolutely correct, but yes you are correct about compaction problems.. )

10.0.1.150 Up Normal 2.61 GB 33.33% 0 
10.0.1.152 Up Normal 2.61 GB 33.33% 56713727820156410577229101238628035242 
10.0.1.154 Up Normal 3.16 GB 33.33% 113427455640312821154458202477256070485

Node1 & Node2 is now back to normal..
but Node3 did not return to 2.61GB...
Ive tried, compact, flush, cleanup.. etc etc.. It wont get smaller.. :(

I still dont understand why a repair on node3 balloons the data on node1 & 
node2 in 0.8.. This should happen as far as i believe.. 
Its my understanding that node3 should, copy the data from its replicas on 
other nodes ( hence why we see 2x the data size... ) and then a compact to 
aggregate it down to a proper replica for the cluster..

Node1 & Node2 really shouldnt be changing at all ???


 Repair Fails 0.8
 

 Key: CASSANDRA-2798
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2798
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
Reporter: David Arena
Assignee: Sylvain Lebresne

 I am seeing a fatal problem in the new 0.8
 Im running a 3 node cluster with a replication_factor of 3..
 On Node 3.. If i 
 # kill -9 cassandra-pid
 # rm -rf All data & logs
 # start cassandra
 # nodetool -h node-3-ip repair
 The whole cluster become duplicated..
 * e.g Before 
 node 1 - 2.65GB
 node 2 - 2.65GB
 node 3 - 2.65GB
 * e.g After
 node 1 - 5.3GB
 node 2 - 5.3GB
 node 3 - 7.95GB
 - nodetool repair, never ends (96 hours +), however there is no streams 
 running, nor any cpu or disk activity..
 - Manually killing the repair and restarting does not help.. Restarting the 
 server/cassandra does not help..
 - nodetool flush,compact,cleanup all complete, but do not help...
 This is not occurring in 0.7.6.. I have come to the conclusion this is a Major 
 0.8 issue
 Running: CentOS 5.6, JDK 1.6.0_26





[jira] [Issue Comment Edited] (CASSANDRA-2798) Repair Fails 0.8

2011-06-22 Thread David Arena (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053185#comment-13053185
 ] 

David Arena edited comment on CASSANDRA-2798 at 6/22/11 11:18 AM:
--

Soo after a restart and a compact.. im looking at this.. ( Still doesnt seem 
absolutely correct, but yes you are correct about compaction problems.. )

10.0.1.150 Up Normal 2.61 GB 33.33% 0 
10.0.1.152 Up Normal 2.61 GB 33.33% 56713727820156410577229101238628035242 
10.0.1.154 Up Normal 3.16 GB 33.33% 113427455640312821154458202477256070485

Node1 & Node2 is now back to normal..
but Node3 did not return to 2.61GB...
Ive tried, compact, flush, cleanup.. etc etc.. It wont get smaller.. :(

I still dont understand why a repair on node3 balloons the data on node1 & 
node2 in 0.8.. This should not happen as far as i believe.. 
Its my understanding that node3 should, copy the data from its replicas on 
other nodes ( hence why we see 2x the data size... ) and then a compact to 
aggregate it down to a proper replica for the cluster..

Node1 & Node2 really shouldnt be changing at all ???


  was (Author: arenstar):
Soo after a restart and a compact.. im looking at this.. ( Still doesnt 
seem absolutely correct, but yes you are correct about compaction problems.. )

10.0.1.150 Up Normal 2.61 GB 33.33% 0 
10.0.1.152 Up Normal 2.61 GB 33.33% 56713727820156410577229101238628035242 
10.0.1.154 Up Normal 3.16 GB 33.33% 113427455640312821154458202477256070485

Node1 & Node2 is now back to normal..
but Node3 did not return to 2.61GB...
Ive tried, compact, flush, cleanup.. etc etc.. It wont get smaller.. :(

I still dont understand why a repair on node3 balloons the data on node1 & 
node2 in 0.8.. This should happen as far as i believe.. 
Its my understanding that node3 should, copy the data from its replicas on 
other nodes ( hence why we see 2x the data size... ) and then a compact to 
aggregate it down to a proper replica for the cluster..

Node1 & Node2 really shouldnt be changing at all ???

  




[jira] [Commented] (CASSANDRA-2804) expose dropped messages, exceptions over JMX

2011-06-22 Thread Wojciech Meler (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053193#comment-13053193
 ] 

Wojciech Meler commented on CASSANDRA-2804:
---

I have a TotalReadLatencyMicros graph in my Zabbix, but unfortunately the time a 
request waits in the queue is not included :(.
It would be OK if the thread pool were unlimited (so the queue length would 
always be 0), but now I have to check queue lengths first to trust 
TotalReadLatencyMicros...
Is there another way to see stats on how long a client waits for data from a 
given CF? Because I probably misunderstood the meaning of 
TotalReadLatencyMicros :/ 






[jira] [Commented] (CASSANDRA-2798) Repair Fails 0.8

2011-06-22 Thread David Arena (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053195#comment-13053195
 ] 

David Arena commented on CASSANDRA-2798:


I retried the whole process..
Node3 returned this time to 2.62...

Are you able to test this in 0.8.1 ???








[jira] [Updated] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-22 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2653:


Attachment: 0001-Handle-data-get-returning-null-in-secondary-indexes.patch

The problem Tyler reported is actually not limited to 0-column queries. The 
problem is that when we query the rows for data, we use whatever filter the 
user provided (there are a number of optimizations for the case where we have 
more than 1 clause, but that doesn't really matter for our problem). The thing 
is, there is no guarantee that the filter, whatever it is, will include the 
column of the primary clause (having a column count of 0 is just one case where 
we're sure it won't include it). Thus the assertion that something will be 
returned is bogus.

Attaching a patch (against 0.8) to fix. Note that this means we have no way to 
assert the sanity of the index during a read, unless we force the querying of 
the primary index clause, but this would have a performance impact (and a 
non-negligible one in cases where it would force us to do a new query just for 
that).
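
The failure mode described above can be illustrated with a toy model. Everything here (the Map-based row, the fetch helper, the class name IndexScanSketch, the "name" column) is hypothetical and only mimics the shape of the problem; it is not the code from ColumnFamilyStore.scan or from the attached patch. The indexed column name "birthdate" is what the hex string 626972746864617465 in the reported expression decodes to.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class IndexScanSketch {
    // Returns only the columns selected by the user's filter, or null when none match.
    public static Map<String, String> fetch(Map<String, String> row, Set<String> filter) {
        Map<String, String> result = new TreeMap<String, String>();
        for (String name : filter) {
            if (row.containsKey(name)) {
                result.put(name, row.get(name));
            }
        }
        return result.isEmpty() ? null : result;
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<String, String>();
        row.put("birthdate", "1");   // column matched by the primary index clause
        row.put("name", "alice");

        // The user's filter does not have to include the indexed column, so a
        // null result here says nothing about whether the index is sane.
        Map<String, String> data = fetch(row, Collections.<String>emptySet());
        if (data == null) {
            System.out.println("filter selected no columns; handling null instead of asserting");
        }
    }
}
```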


 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.8.1

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 
 v1-0001-CASSANDRA-2653-reproduce-regression.txt


 As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,
 {noformat}
 ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
 java.lang.AssertionError: No data found for 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0] in DecoratedKey(81509516161424251288255223397843705139, 
 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
 columnName='null') (original filter 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0]) from expression 'cf.626972746864617465 EQ 1'
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
   at 
 org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}





[jira] [Commented] (CASSANDRA-2804) expose dropped messages, exceptions over JMX

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053220#comment-13053220
 ] 

Jonathan Ellis commented on CASSANDRA-2804:
---

avg time in queue = (avg read latency * avg pending tasks in queue / reader 
threads)
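
To make the estimate above concrete, here is a small worked example. The input numbers are invented for illustration and the class name QueueTimeEstimate is hypothetical; only the formula comes from the comment.

```java
public class QueueTimeEstimate {
    // avg time in queue = avg read latency * avg pending tasks in queue / reader threads
    public static double avgTimeInQueueMicros(double avgReadLatencyMicros,
                                              double avgPendingTasks,
                                              int readerThreads) {
        return avgReadLatencyMicros * avgPendingTasks / readerThreads;
    }

    public static void main(String[] args) {
        // Illustrative values: 2000 us per read, 16 pending tasks, 8 reader threads.
        System.out.println(avgTimeInQueueMicros(2000.0, 16.0, 8)); // prints 4000.0
    }
}
```

With these made-up values a request would spend roughly twice the read latency waiting before a reader thread picks it up.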





[jira] [Created] (CASSANDRA-2809) In the Cli, update column family cf with comparator; create Column metadata

2011-06-22 Thread Silvère Lestang (JIRA)
In the Cli, update column family cf with comparator; create Column metadata
-

 Key: CASSANDRA-2809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2809
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
Affects Versions: 0.8.1
 Environment: Ubuntu 10.10, 32bit
java version 1.6.0_24
installed from Debian packages of Brisk-beta2
Reporter: Silvère Lestang


Using cassandra-cli, I can't update the comparator of a column family to the 
type I want, and when I did it with BytesType, Column metadata appeared for each 
of my existing columns.
Steps to reproduce:
{code}
[default@unknown] create keyspace Test
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = [{replication_factor:1}];

[default@unknown] use Test;
Authenticated to keyspace: Test

[default@Test] create column family test;

[default@Test] describe keyspace;
...
ColumnFamily: test
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: false
  Built indexes: []
...

[default@Test] update column family test with comparator = 'LongType';
comparators do not match.
{code}
why?? the CF is empty
{code}
[default@Test] update column family test with comparator = 'BytesType';
f8e4dcb0-9cca-11e0--d0583497e7ff
Waiting for schema agreement...
... schemas agree across the cluster

[default@Test] describe keyspace;
...
ColumnFamily: test
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: false
  Built indexes: []
...

[default@Test] set test[ascii('row1')][long(1)]=integer(35);
set test[ascii('row1')][long(2)]=integer(36);
set test[ascii('row1')][long(3)]=integer(38);
set test[ascii('row2')][long(1)]=integer(45);
set test[ascii('row2')][long(2)]=integer(42);
set test[ascii('row2')][long(3)]=integer(33);

[default@Test] list test;
Using default limit of 100
---
RowKey: 726f7731
=> (column=0001, value=35, timestamp=1308744931122000)
=> (column=0002, value=36, timestamp=1308744931124000)
=> (column=0003, value=38, timestamp=1308744931125000)
---
RowKey: 726f7732
=> (column=0001, value=45, timestamp=1308744931127000)
=> (column=0002, value=42, timestamp=1308744931128000)
=> (column=0003, value=33, timestamp=1308744932722000)

2 Rows Returned.

[default@Test] update column family test with comparator = 'LongType';
comparators do not match.
{code}
Same question as before: my columns contain only longs, so why can't I?

{code}
[default@Test] update column family test with comparator = 'BytesType';

[default@Test] describe keyspace;  
Keyspace: Test:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
Options: [replication_factor:1]
  Column Families:
ColumnFamily: test
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: false
  Built indexes: []
  Column Metadata:
Column Name:  (0001)
  Validation Class: org.apache.cassandra.db.marshal.IntegerType
Column Name:  (0003)
  Validation Class: org.apache.cassandra.db.marshal.IntegerType
Column Name:  (0002)
  Validation Class: org.apache.cassandra.db.marshal.IntegerType
{code}
Column Metadata appears from nowhere. I don't think that is expected.



[jira] [Commented] (CASSANDRA-2804) expose dropped messages, exceptions over JMX

2011-06-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053230#comment-13053230
 ] 

Sylvain Lebresne commented on CASSANDRA-2804:
-

Actually I apologize, I gave you the wrong number. The TotalReadLatencyMicros 
that corresponds to the full request is the one in 
org.apache.cassandra.db:StorageProxy. That one does include the time waited in 
the different queues.

 expose dropped messages, exceptions over JMX
 

 Key: CASSANDRA-2804
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2804
 Project: Cassandra
  Issue Type: Improvement
  Components: Tools
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.8.2








[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053233#comment-13053233
 ] 

Mck SembWever commented on CASSANDRA-2388:
--

Problem with the suggested approach is that sortByProximity(..) *only* works 
when address is the local address. See assert statement 
DynamicEndpointSnitch:134

I could hack this and rewrite the line to

{noformat}
IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
snitch = snitch instanceof DynamicEndpointSnitch ? 
((DynamicEndpointSnitch)snitch).subsnitch : snitch;
snitch.sortByProximity(address, addresses);{noformat}
But this of course means that we always bypass DynamicEndpointSnitch's scores.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Eldon Stegall
Assignee: Mck SembWever
  Labels: hadoop, inputformat
 Fix For: 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388.patch, CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.





[jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053233#comment-13053233
 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/22/11 1:05 PM:
---

Problem with the suggested approach is that sortByProximity(..) *only* works 
when address is the local address. See assert statement 
DynamicEndpointSnitch:134

I could hack this and rewrite the line to
{noformat}IEndpointSnitch snitch = DatabaseDescriptor.getEndpointSnitch();
snitch = snitch instanceof DynamicEndpointSnitch ? 
((DynamicEndpointSnitch)snitch).subsnitch : snitch;
snitch.sortByProximity(address, addresses);{noformat}
But this of course means that we always bypass DynamicEndpointSnitch's scores.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Eldon Stegall
Assignee: Mck SembWever
  Labels: hadoop, inputformat
 Fix For: 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388.patch, CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.





[jira] [Updated] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-22 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2653:


Attachment: 0001-Handle-null-returns-in-data-index-query-v0.7.patch

This also affects 0.7 actually so attaching a patch for 0.7.

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.8.1

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 
 v1-0001-CASSANDRA-2653-reproduce-regression.txt


 As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,
 {noformat}
 ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
 java.lang.AssertionError: No data found for 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0] in DecoratedKey(81509516161424251288255223397843705139, 
 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
 columnName='null') (original filter 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0]) from expression 'cf.626972746864617465 EQ 1'
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
   at 
 org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}





[jira] [Updated] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-22 Thread Sylvain Lebresne (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sylvain Lebresne updated CASSANDRA-2653:


Affects Version/s: 0.7.6
Fix Version/s: 0.7.7

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.1

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 
 v1-0001-CASSANDRA-2653-reproduce-regression.txt







[jira] [Updated] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

2011-06-22 Thread Mck SembWever (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mck SembWever updated CASSANDRA-2388:
-

Attachment: CASSANDRA-2388.patch

Up-to-date patch.
It follows T Jake's points (1), (2), and (4),
and bypasses DynamicEndpointSnitch when sorting by proximity.

 ColumnFamilyRecordReader fails for a given split because a host is down, even 
 if records could reasonably be read from other replica.
 -

 Key: CASSANDRA-2388
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Eldon Stegall
Assignee: Mck SembWever
  Labels: hadoop, inputformat
 Fix For: 0.8.2

 Attachments: 0002_On_TException_try_next_split.patch, 
 CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch


 ColumnFamilyRecordReader only tries the first location for a given split. We 
 should try multiple locations for a given split.





[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053261#comment-13053261
 ] 

Jonathan Ellis commented on CASSANDRA-2653:
---

Is there a way we can keep a sanity check here?  CASSANDRA-2401 was not so long 
ago.

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.1

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 
 v1-0001-CASSANDRA-2653-reproduce-regression.txt







[jira] [Commented] (CASSANDRA-2807) ColumnFamilyInputFormat configuration should support multiple initial addresses

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053264#comment-13053264
 ] 

Jonathan Ellis commented on CASSANDRA-2807:
---

But this is not actually addressed by the 2388 patch, right?  I don't mind 
addressing this separately.

 ColumnFamilyInputFormat configuration should support multiple initial 
 addresses
 ---

 Key: CASSANDRA-2807
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2807
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.8.0
Reporter: Greg Katz

 The {{ColumnFamilyInputFormat}} class only allows a single initial node to be 
 specified through the cassandra.thrift.address configuration property. The 
 configuration should support a list of nodes in order to account for the 
 possibility that the initial node becomes unavailable.
 By contrast, the {{RingCache}} class used by the {{ColumnFamilyRecordWriter}} 
 reads the exact same {{cassandra.thrift.address}} property but splits its 
 value on commas to allow multiple initial nodes to be specified.
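
A minimal sketch of the comma-splitting behaviour described above, assuming the property value is a plain string such as "host1,host2". The class and method names (InitialAddresses, parse) are hypothetical and are not taken from RingCache or ColumnFamilyInputFormat.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Cassandra.
final class InitialAddresses
{
    /** Splits a comma-separated property value into trimmed, non-empty host names. */
    static List<String> parse(String propertyValue)
    {
        List<String> addresses = new ArrayList<>();
        if (propertyValue == null)
            return addresses;
        for (String part : propertyValue.split(","))
        {
            String trimmed = part.trim();
            if (!trimmed.isEmpty())
                addresses.add(trimmed);
        }
        return addresses;
    }
}
```

With a list like this, a client could try each address in turn and only fail once every initial node has been tried.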





[jira] [Updated] (CASSANDRA-2807) ColumnFamilyInputFormat configuration should support multiple initial addresses

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2807:
--

Priority: Minor  (was: Major)

 ColumnFamilyInputFormat configuration should support multiple initial 
 addresses
 ---

 Key: CASSANDRA-2807
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2807
 Project: Cassandra
  Issue Type: Improvement
  Components: Hadoop
Affects Versions: 0.8.0
Reporter: Greg Katz
Priority: Minor

 The {{ColumnFamilyInputFormat}} class only allows a single initial node to be 
 specified through the cassandra.thrift.address configuration property. The 
 configuration should support a list of nodes in order to account for the 
 possibility that the initial node becomes unavailable.
 By contrast, the {{RingCache}} class used by the {{ColumnFamilyRecordWriter}} 
 reads the exact same {{cassandra.thrift.address}} property but splits its 
 value on commas to allow multiple initial nodes to be specified.





[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053268#comment-13053268
 ] 

Sylvain Lebresne commented on CASSANDRA-2653:
-

As I said earlier, I think the only way to keep one would be to force the 
querying of the primary index clause column name. In some cases, when we 
already do a NameQuery, either as part of the first data query or because we 
need a query for the extraFilter, this won't be a big deal. If it's a slice 
query and the primary index clause name is part of the result, we're good too. 
But otherwise, we'll have to do a specific query to validate the assert. Maybe 
the cases where we'll have to do an extra query are rare enough that we think 
it's worth it. But then there is the other problem.

The other problem is that this assertion is not thread safe, because the query 
to the index and the query to the data are not atomic. 

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.1

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 
 v1-0001-CASSANDRA-2653-reproduce-regression.txt







[jira] [Created] (CASSANDRA-2810) RuntimeException in Pig when using dump command on column name

2011-06-22 Thread Silvère Lestang (JIRA)
RuntimeException in Pig when using dump command on column name


 Key: CASSANDRA-2810
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2810
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 0.8.1
 Environment: Ubuntu 10.10, 32 bits
java version 1.6.0_24
Brisk beta-2 installed from Debian packages
Reporter: Silvère Lestang


This bug was previously reported on the [Brisk bug 
tracker|https://datastax.jira.com/browse/BRISK-232].

In cassandra-cli:
{code}
[default@unknown] create keyspace Test
with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
and strategy_options = [{replication_factor:1}];

[default@unknown] use Test;
Authenticated to keyspace: Test

[default@Test] create column family test;

[default@Test] set test[ascii('row1')][long(1)]=integer(35);
set test[ascii('row1')][long(2)]=integer(36);
set test[ascii('row1')][long(3)]=integer(38);
set test[ascii('row2')][long(1)]=integer(45);
set test[ascii('row2')][long(2)]=integer(42);
set test[ascii('row2')][long(3)]=integer(33);

[default@Test] list test;
Using default limit of 100
---
RowKey: 726f7731
= (column=0001, value=35, timestamp=1308744931122000)
= (column=0002, value=36, timestamp=1308744931124000)
= (column=0003, value=38, timestamp=1308744931125000)
---
RowKey: 726f7732
= (column=0001, value=45, timestamp=1308744931127000)
= (column=0002, value=42, timestamp=1308744931128000)
= (column=0003, value=33, timestamp=1308744932722000)

2 Rows Returned.

[default@Test] describe keyspace;
Keyspace: Test:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
Options: [replication_factor:1]
  Column Families:
ColumnFamily: test
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds: 0.0/0
  Key cache size / save period in seconds: 20.0/14400
  Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
  GC grace seconds: 864000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: false
  Built indexes: []
{code}
In Pig command line:
{code}
grunt> test = LOAD 'cassandra://Test/test' USING CassandraStorage() AS 
(rowkey:chararray, columns: bag {T: (name:long, value:int)});

grunt> value_test = foreach test generate rowkey, columns.name, columns.value;

grunt> dump value_test;
{code}
In /var/log/cassandra/system.log, I see this exception several times:
{code}
INFO [IPC Server handler 3 on 8012] 2011-06-22 15:03:28,533 TaskInProgress.java 
(line 551) Error from attempt_201106210955_0051_m_00_3: 
java.lang.RuntimeException: Unexpected data type -1 found in stream.
at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:478)
at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
at org.apache.pig.data.BinInterSedes.writeBag(BinInterSedes.java:522)
at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:361)
at org.apache.pig.data.BinInterSedes.writeTuple(BinInterSedes.java:541)
at org.apache.pig.data.BinInterSedes.writeDatum(BinInterSedes.java:357)
at 
org.apache.pig.impl.io.InterRecordWriter.write(InterRecordWriter.java:73)
at org.apache.pig.impl.io.InterStorage.putNext(InterStorage.java:87)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
at 
org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
at 
org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.collect(PigMapOnly.java:48)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:239)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:232)
at 
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:53)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:763)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
at java.security.AccessController.doPrivileged(Native Method)
at 

[jira] [Commented] (CASSANDRA-2686) Distributed per row locks

2011-06-22 Thread Luís Ferreira (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053296#comment-13053296
 ] 

Luís Ferreira commented on CASSANDRA-2686:
--

I've taken your advice and used Cages for the Read and Write locks. From this I 
constructed a Transaction system on top of Cassandra. As soon as I have some 
performance test results I'll put them here, as well as the code, if anyone is 
interested. 

It basically implements a write-ahead log, taking advantage of the atomicity of 
per-row updates and of idempotent updates. It also has a preprocessing 
mechanism for transactions that do not know a priori which columns they will use 
(when using indexes, for example). 
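
To illustrate why idempotent updates make a write-ahead log safe to replay, here is a minimal sketch. It is not the transaction system described above; it assumes a simplified last-write-wins-by-timestamp model, and every name in it (IdempotentReplay, Write, apply, replay, valueOf) is hypothetical.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the implementation discussed in this comment.
final class IdempotentReplay
{
    /** One logged column write. */
    static final class Write
    {
        final String column;
        final String value;
        final long timestamp;

        Write(String column, String value, long timestamp)
        {
            this.column = column;
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Last write wins by timestamp, so applying the same write twice changes nothing. */
    static void apply(Map<String, Write> row, Write w)
    {
        Write current = row.get(w.column);
        if (current == null || w.timestamp >= current.timestamp)
            row.put(w.column, w);
    }

    /** Replays every logged write; calling this again after a crash mid-replay is harmless. */
    static void replay(Map<String, Write> row, List<Write> log)
    {
        for (Write w : log)
            apply(row, w);
    }

    /** Returns the current value of a column, or null if the column is absent. */
    static String valueOf(Map<String, Write> row, String column)
    {
        Write w = row.get(column);
        return w == null ? null : w.value;
    }
}
```

Because replaying the log any number of times leaves the row in the same state, a recovering client does not need to know how far a previous attempt got.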

 Distributed per row locks
 -

 Key: CASSANDRA-2686
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2686
 Project: Cassandra
  Issue Type: Wish
  Components: Core
 Environment: any
Reporter: Luís Ferreira
  Labels: api-addition, features

 Instead of using a centralized locking strategy like Cages with ZooKeeper, I 
 would like to have it in a decentralized way, even if it carries some 
 limitations. 





[jira] [Created] (CASSANDRA-2811) Repair doesn't stagger flushes

2011-06-22 Thread Sylvain Lebresne (JIRA)
Repair doesn't stagger flushes
--

 Key: CASSANDRA-2811
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2811
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8.2


When you do a nodetool repair (with no options), the following things occur:
* For each keyspace, a call to SS.forceTableRepair is issued
* In each of those calls: for each token range the node is responsible for, a 
repair session is created and started
* Each of these sessions will request one merkle tree per column family (from each 
node for which it makes sense, which includes the node the repair is started on)

All those merkle tree requests are made at basically the same time. And now 
that compaction is multi-threaded, this means that usually more than one 
validation compaction will be started at the same time. The problem is that a 
validation compaction starts with a flush. Given that by default the 
flush_queue_size is 4 and the number of compaction threads is the number of 
processors, and given that on any recent machine the number of cores will be >= 
4, this will easily end up blocking writes for some period of 
time.

It turns out there is also a more subtle problem for repair itself. If two 
validation compactions for the same column family (but different ranges) are 
started within a very short time interval, the first validation will block on the 
flush, but the second one may not block at all if the memtable is clean when it 
requests its own flush. In that case the second validation will be executed 
on data older than it should be.

I think the simplest fix is to make sure we only ever do one validation 
compaction at a time. It's probably a better use of resources anyway. 
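
One way to picture "only one validation compaction at a time" is a single-threaded executor that queues the requests. The sketch below is an illustration under that assumption, not the actual patch; SerializedValidation and maxConcurrentValidations are hypothetical names, and the task body is a placeholder for the real flush and merkle tree work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; none of these names exist in Cassandra.
final class SerializedValidation
{
    /** Submits fake validations and returns the highest number seen running at once. */
    static int maxConcurrentValidations(int requests)
    {
        AtomicInteger running = new AtomicInteger();
        AtomicInteger maxRunning = new AtomicInteger();
        // One worker thread: queued validations run strictly one after another.
        ExecutorService validationExecutor = Executors.newSingleThreadExecutor();
        try
        {
            List<Future<?>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++)
            {
                pending.add(validationExecutor.submit(() -> {
                    maxRunning.accumulateAndGet(running.incrementAndGet(), Math::max);
                    // A real task would flush the memtable and build the merkle tree here.
                    running.decrementAndGet();
                }));
            }
            for (Future<?> f : pending)
                f.get(); // wait for each validation to finish
        }
        catch (InterruptedException | ExecutionException e)
        {
            throw new RuntimeException(e);
        }
        finally
        {
            validationExecutor.shutdown();
        }
        return maxRunning.get();
    }
}
```

Since requests are submitted all at once but executed one by one, each validation's flush would complete before the next validation starts.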





[jira] [Commented] (CASSANDRA-2811) Repair doesn't stagger flushes

2011-06-22 Thread Sylvain Lebresne (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053302#comment-13053302
 ] 

Sylvain Lebresne commented on CASSANDRA-2811:
-

The remaining question is whether we prefer adding a dedicated single-threaded 
executor for validation compaction (could make sense) or simply introducing a 
validationCompactionLock.
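
For comparison, the lock-based option could look roughly like the sketch below. Only the name validationCompactionLock comes from the comment above; the ValidationGate class and its methods are hypothetical, and a real validation would do the flush and tree building inside the guarded section.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of the lock-based alternative; not the actual patch.
final class ValidationGate
{
    private final ReentrantLock validationCompactionLock = new ReentrantLock();

    /** Runs the validation while holding the lock, so at most one runs at a time. */
    void runExclusively(Runnable validation)
    {
        validationCompactionLock.lock();
        try
        {
            validation.run();
        }
        finally
        {
            // Always release, even if the validation throws.
            validationCompactionLock.unlock();
        }
    }

    /** True while some thread is inside runExclusively. */
    boolean isBusy()
    {
        return validationCompactionLock.isLocked();
    }
}
```

The trade-off: with a lock, waiting validations each occupy a compaction thread while blocked, whereas a dedicated executor keeps them in a queue.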

 Repair doesn't stagger flushes
 --

 Key: CASSANDRA-2811
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2811
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
 Fix For: 0.8.2







[jira] [Commented] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-06-22 Thread Brian Palmer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053365#comment-13053365
 ] 

Brian Palmer commented on CASSANDRA-2500:
-

Jon,

jbellis asked me to take a quick look at this, but I haven't had much luck so 
far getting this to work with DBI 0.4.3, ruby 1.8.7, cassandra 0.8.0 on OS X. 
The short list:

 * ruby syntax error in DBI::DBD::Cass::Database#execute_prepared (missing 
comma)
 * Cass::Driver calls super with VERSION which is 0.0.0, but DBI is expecting 
the DBI API version (0.4.0), so an exception is thrown on init
 * initializing a new DBI connection fails due to 
DBI::DBD::Cass::Database#active? not checking if @tconn is nil before calling 
@tconn.current_server
 * DBI::DBD::Cass::Statement#execute references db, which isn't defined; it 
probably means to use the @db instance variable
 * After working around the above errors, I get an exception when running a 
select query like `dbh.execute("select * from users;")` : NoMethodError: 
undefined method `size' for #<CassandraThrift::CqlRow:0x102badf58>

I'm not entirely sure if any of these last four are issues specific to my 
environment, but it seems unlikely that they all are. The require statements in 
Cass.rb definitely need cleaning up as well, but it sounds like that's still an 
open question. Also, I'd suggest changing all those define_method calls in 
Cass.rb to normal method definitions using def; the current style is very 
unidiomatic Ruby.

 Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter
 ---

 Key: CASSANDRA-2500
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2500
 Project: Cassandra
  Issue Type: Task
Reporter: Jon Hermes
Assignee: Jon Hermes
 Attachments: 2500.txt, genthriftrb.txt, rbcql-0.0.0.tgz


 Create a ruby driver for CQL.
 Lacking something standard (such as py-dbapi), going with something common 
 instead -- RoR ActiveRecord Connection Adapter 
 (http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/AbstractAdapter.html).





[jira] [Commented] (CASSANDRA-2653) index scan errors out when zero columns are requested

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053390#comment-13053390
 ] 

Jonathan Ellis commented on CASSANDRA-2653:
---

bq. I think the only way to keep one would be to force the querying of the 
primary index clause column name... but this will have a performance impact

I think we should take the impact.  (The common query that we want to be fast 
is name-based and this won't affect that.)

 index scan errors out when zero columns are requested
 -

 Key: CASSANDRA-2653
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2653
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.7.6, 0.8.0 beta 2
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7, 0.8.1

 Attachments: 
 0001-Handle-data-get-returning-null-in-secondary-indexes.patch, 
 0001-Handle-null-returns-in-data-index-query-v0.7.patch, 
 0001-Reset-SSTII-in-EchoedRow-constructor.patch, 
 v1-0001-CASSANDRA-2653-reproduce-regression.txt


 As reported by Tyler Hobbs as an addendum to CASSANDRA-2401,
 {noformat}
 ERROR 16:13:38,864 Fatal exception in thread Thread[ReadStage:16,5,main]
 java.lang.AssertionError: No data found for 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0] in DecoratedKey(81509516161424251288255223397843705139, 
 6b657931):QueryPath(columnFamilyName='cf', superColumnName='null', 
 columnName='null') (original filter 
 SliceQueryFilter(start=java.nio.HeapByteBuffer[pos=10 lim=10 cap=30], 
 finish=java.nio.HeapByteBuffer[pos=17 lim=17 cap=30], reversed=false, 
 count=0]) from expression 'cf.626972746864617465 EQ 1'
   at 
 org.apache.cassandra.db.ColumnFamilyStore.scan(ColumnFamilyStore.java:1517)
   at 
 org.apache.cassandra.service.IndexScanVerbHandler.doVerb(IndexScanVerbHandler.java:42)
   at 
 org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:72)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
   at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
   at java.lang.Thread.run(Thread.java:662)
 {noformat}





[jira] [Updated] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2500:
--

 Reviewer: codekitchen  (was: xedin)
  Component/s: API
Fix Version/s: 0.8.2
 Assignee: Pavel Yaskevich  (was: Jon Hermes)
   Labels: cql  (was: )

 Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter
 ---

 Key: CASSANDRA-2500
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2500
 Project: Cassandra
  Issue Type: Task
  Components: API
Reporter: Jon Hermes
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.2

 Attachments: 2500.txt, genthriftrb.txt, rbcql-0.0.0.tgz


 Create a ruby driver for CQL.
 Lacking something standard (such as py-dbapi), going with something common 
 instead -- RoR ActiveRecord Connection Adapter 
 (http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/AbstractAdapter.html).





[jira] [Resolved] (CASSANDRA-2808) add java vendor/versoin to cassandra startup logging

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2808.
---

   Resolution: Fixed
Fix Version/s: 0.8.2
 Reviewer: jbellis
 Assignee: Jackson Chung

committed

 add java vendor/versoin to cassandra startup logging
 

 Key: CASSANDRA-2808
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2808
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Assignee: Jackson Chung
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2808.patch


 Currently, determining exactly which Java is being used by the CassandraDaemon 
 JVM can be difficult. Some may have used an rpm/deb Java package; others may 
 have used a tarball and set JAVA_HOME, PATH, etc.
 It can be done, but it may take some iteration to get the true answer if 
 one's setup is complicated (the user is root/cassandra and the env settings 
 differ between the Cassandra startup user and the login user).
 It would be very helpful to have this information simply logged in the log 
 file, right at the beginning. This helps identify the Java type/version 
 quickly without much operational overhead, and it is easily done in a 1-liner:
 logger.info("Java vendor/version: {}/{}", System.getProperty("java.vm.name"), 
 System.getProperty("java.version") );
 With OpenJDK, you will get something similar to: 
  INFO [main] 2011-06-22 07:08:16,610 AbstractCassandraDaemon.java (line 95) 
 Java vendor/version: OpenJDK 64-Bit Server VM/1.6.0_20
 With Java(TM), you will get something like:
  INFO [main] 2011-06-22 00:15:34,936 AbstractCassandraDaemon.java (line 96) 
 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 This little addition will go a long way.
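
The one-liner in the description can be tried outside Cassandra. The following is a minimal sketch using only the JDK; the class name JvmInfo and the use of System.out in place of the daemon's logger are assumptions made for illustration and are not part of the attached patch.

```java
// Minimal sketch: JvmInfo is a hypothetical class name. Cassandra itself
// writes this through its logger rather than System.out.
class JvmInfo {
    // Builds the same "vendor/version" text the proposed log line would carry.
    static String describe() {
        return "Java vendor/version: "
                + System.getProperty("java.vm.name") + "/"
                + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

The exact text printed depends on the JVM that runs it, which is the point of logging it at startup.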





svn commit: r1138584 - /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java

2011-06-22 Thread jbellis
Author: jbellis
Date: Wed Jun 22 19:06:17 2011
New Revision: 1138584

URL: http://svn.apache.org/viewvc?rev=1138584&view=rev
Log:
log JVM version
patch by Jackson Chung; reviewed by jbellis for CASSANDRA-2808

Modified:

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java

Modified: 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java?rev=1138584&r1=1138583&r2=1138584&view=diff
==
--- 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
 (original)
+++ 
cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/service/AbstractCassandraDaemon.java
 Wed Jun 22 19:06:17 2011
@@ -93,6 +93,7 @@ public abstract class AbstractCassandraD
  */
 protected void setup() throws IOException
 {
+   logger.info("JVM vendor/version: {}/{}", 
System.getProperty("java.vm.name"), System.getProperty("java.version") );
 logger.info("Heap size: {}/{}", Runtime.getRuntime().totalMemory(), 
Runtime.getRuntime().maxMemory());
CLibrary.tryMlockall();
 




[jira] [Created] (CASSANDRA-2812) Allow changing comparator between compatible collations

2011-06-22 Thread Jonathan Ellis (JIRA)
Allow changing comparator between compatible collations
---

 Key: CASSANDRA-2812
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2812
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Fix For: 0.8.2


Normally we should not allow changing comparators, but anything that sorts in 
lexical byte order (Bytes, Ascii, UTF8, LexicalUUID) is compatible.
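
To illustrate what "sorts in lexical byte order" means here: for text restricted to ASCII, comparing the encoded bytes as unsigned values gives the same order as comparing the strings themselves. The sketch below is only an illustration under that assumption and is not Cassandra code; ByteOrderDemo, compareUnsigned and sameOrder are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Cassandra.
class ByteOrderDemo {
    // Lexical comparison of two byte arrays, treating each byte as unsigned.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length; // a shorter prefix sorts first
    }

    // True when byte order and String order agree for two ASCII strings.
    static boolean sameOrder(String x, String y) {
        int byBytes = compareUnsigned(x.getBytes(StandardCharsets.US_ASCII),
                                      y.getBytes(StandardCharsets.US_ASCII));
        return Integer.signum(byBytes) == Integer.signum(x.compareTo(y));
    }

    public static void main(String[] args) {
        System.out.println(sameOrder("apple", "apricot"));
        System.out.println(sameOrder("row1", "row10"));
    }
}
```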





svn commit: r1138585 - /cassandra/branches/cassandra-0.8/debian/control

2011-06-22 Thread jbellis
Author: jbellis
Date: Wed Jun 22 19:07:52 2011
New Revision: 1138585

URL: http://svn.apache.org/viewvc?rev=1138585&view=rev
Log:
move jna from recommends to depends in debian
patch by Paul Cannon for CASSANDRA-2803

Modified:
cassandra/branches/cassandra-0.8/debian/control

Modified: cassandra/branches/cassandra-0.8/debian/control
URL: 
http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/debian/control?rev=1138585&r1=1138584&r2=1138585&view=diff
==
--- cassandra/branches/cassandra-0.8/debian/control (original)
+++ cassandra/branches/cassandra-0.8/debian/control Wed Jun 22 19:07:52 2011
@@ -10,8 +10,7 @@ Standards-Version: 3.8.3
 
 Package: cassandra
 Architecture: all
-Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
libcommons-daemon-java (>= 1.0), adduser
-Recommends: libjna-java
+Depends: openjdk-6-jre-headless (>= 6b11) | java6-runtime, jsvc (>= 1.0), 
libcommons-daemon-java (>= 1.0), adduser, libjna-java
 Description: distributed storage system for structured data
  Cassandra is a distributed (peer-to-peer) system for the management
  and storage of structured data.




[jira] [Resolved] (CASSANDRA-2812) Allow changing comparator between compatible collations

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2812.
---

   Resolution: Won't Fix
Fix Version/s: (was: 0.8.2)
 Assignee: (was: Jonathan Ellis)

Having written the patch, I think it's more confusing than it's worth to 
explain that sometimes you can change the comparator type but usually not.

 Allow changing comparator between compatible collations
 ---

 Key: CASSANDRA-2812
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2812
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Priority: Minor
 Attachments: 2812.txt


 Normally we should not allow changing comparators, but anything that sorts in 
 lexical byte order (Bytes, Ascii, UTF8, LexicalUUID) is compatible.





[jira] [Updated] (CASSANDRA-2812) Allow changing comparator between compatible collations

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2812:
--

Attachment: 2812.txt

 Allow changing comparator between compatible collations
 ---

 Key: CASSANDRA-2812
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2812
 Project: Cassandra
  Issue Type: Bug
Reporter: Jonathan Ellis
Assignee: Jonathan Ellis
Priority: Minor
 Attachments: 2812.txt


 Normally we should not allow changing comparators, but anything that sorts in 
 lexical byte order (Bytes, Ascii, UTF8, LexicalUUID) is compatible.





[jira] [Updated] (CASSANDRA-2809) In the Cli, update column family cf with comparator; create Column metadata

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2809:
--

 Priority: Minor  (was: Major)
Affects Version/s: (was: 0.8.1)
Fix Version/s: 0.8.2
 Assignee: Pavel Yaskevich

Changing comparators is not allowed, since the point of a comparator is that 
data within a row will be sorted on disk by the comparator's ordering.  
Changing the comparator without rewriting the data would corrupt the sstable.

Not sure where that column_metadata comes from though.  Looks like a bug.
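
The point about on-disk ordering can be illustrated with a small sketch. It assumes, for illustration only, that a long column name is stored as 8 big-endian bytes; under that assumption a byte-wise comparator and a numeric comparator disagree as soon as negative values are involved, so data sorted under one ordering is not sorted under the other. ComparatorMismatchDemo and its methods are hypothetical names, not Cassandra classes.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of Cassandra.
class ComparatorMismatchDemo {
    // Encode a long as 8 big-endian bytes (an assumption made for this sketch).
    static byte[] encode(long v) {
        return ByteBuffer.allocate(8).putLong(v).array();
    }

    // Byte-wise lexical comparison, treating each byte as unsigned.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) return cmp;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        long x = -1L, y = 1L;
        // Numeric order puts -1 before 1.
        System.out.println("numeric: " + Integer.signum(Long.compare(x, y)));
        // Byte order puts -1 (all 0xff bytes) after 1.
        System.out.println("bytes:   " + Integer.signum(compareBytes(encode(x), encode(y))));
    }
}
```

For non-negative values the two orderings happen to agree, which is why a mismatch may go unnoticed until a value with the high bit set is written.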

 In the Cli, update column family cf with comparator; create Column metadata
 -

 Key: CASSANDRA-2809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2809
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Ubuntu 10.10, 32bit
 java version 1.6.0_24
 installed from Debian packages of Brisk-beta2
Reporter: Silvère Lestang
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2809-validate.txt


 Using cassandra-cli, I can't update the comparator of a column family to 
 the type I want, and when I did it with BytesType, column metadata appeared 
 for each of my existing columns.
 Steps to reproduce:
 {code}
 [default@unknown] create keyspace Test
 with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
 and strategy_options = [{replication_factor:1}];
 [default@unknown] use Test;
 Authenticated to keyspace: Test
 [default@Test] create column family test;
 [default@Test] describe keyspace;
 ...
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 ...
 [default@Test] update column family test with comparator = 'LongType';
 comparators do not match.
 {code}
 why?? the CF is empty
 {code}
 [default@Test] update column family test with comparator = 'BytesType';
 f8e4dcb0-9cca-11e0--d0583497e7ff
 Waiting for schema agreement...
 ... schemas agree across the cluster
 [default@Test] describe keyspace;
 ...
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 ...
 [default@Test] set test[ascii('row1')][long(1)]=integer(35);
 set test[ascii('row1')][long(2)]=integer(36);
 set test[ascii('row1')][long(3)]=integer(38);
 set test[ascii('row2')][long(1)]=integer(45);
 set test[ascii('row2')][long(2)]=integer(42);
 set test[ascii('row2')][long(3)]=integer(33);
 [default@Test] list test;
 Using default limit of 100
 ---
 RowKey: 726f7731
 = (column=0001, value=35, timestamp=1308744931122000)
 = (column=0002, value=36, timestamp=1308744931124000)
 = (column=0003, value=38, timestamp=1308744931125000)
 ---
 RowKey: 726f7732
 = (column=0001, value=45, timestamp=1308744931127000)
 = (column=0002, value=42, timestamp=1308744931128000)
 = (column=0003, value=33, timestamp=1308744932722000)
 2 Rows Returned.
 [default@Test] update column family test with comparator = 'LongType';
 comparators do not match.
 {code}
 Same question as before: my columns contain only longs, so why can't I?
 {code}
 [default@Test] update column family test with comparator = 'BytesType';
 [default@Test] describe keyspace;  
 Keyspace: Test:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
 Options: [replication_factor:1]
   Column Families:
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row 

[jira] [Updated] (CASSANDRA-2809) In the Cli, update column family cf with comparator; create Column metadata

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-2809:
--

Attachment: 2809-validate.txt

Noticed that update_cf doesn't validate the given CFMetaData.  Patch attached 
to add this.  (Does not address the column_metadata problem.)

 In the Cli, update column family cf with comparator; create Column metadata
 -

 Key: CASSANDRA-2809
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2809
 Project: Cassandra
  Issue Type: Bug
  Components: Tools
 Environment: Ubuntu 10.10, 32bit
 java version 1.6.0_24
 installed from Debian packages of Brisk-beta2
Reporter: Silvère Lestang
Assignee: Pavel Yaskevich
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2809-validate.txt


 Using cassandra-cli, I can't update the comparator of a column family to 
 the type I want, and when I did it with BytesType, column metadata appeared 
 for each of my existing columns.
 Steps to reproduce:
 {code}
 [default@unknown] create keyspace Test
 with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
 and strategy_options = [{replication_factor:1}];
 [default@unknown] use Test;
 Authenticated to keyspace: Test
 [default@Test] create column family test;
 [default@Test] describe keyspace;
 ...
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 ...
 [default@Test] update column family test with comparator = 'LongType';
 comparators do not match.
 {code}
 why?? the CF is empty
 {code}
 [default@Test] update column family test with comparator = 'BytesType';
 f8e4dcb0-9cca-11e0--d0583497e7ff
 Waiting for schema agreement...
 ... schemas agree across the cluster
 [default@Test] describe keyspace;
 ...
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0
   Replicate on write: false
   Built indexes: []
 ...
 [default@Test] set test[ascii('row1')][long(1)]=integer(35);
 set test[ascii('row1')][long(2)]=integer(36);
 set test[ascii('row1')][long(3)]=integer(38);
 set test[ascii('row2')][long(1)]=integer(45);
 set test[ascii('row2')][long(2)]=integer(42);
 set test[ascii('row2')][long(3)]=integer(33);
 [default@Test] list test;
 Using default limit of 100
 ---
 RowKey: 726f7731
 = (column=0001, value=35, timestamp=1308744931122000)
 = (column=0002, value=36, timestamp=1308744931124000)
 = (column=0003, value=38, timestamp=1308744931125000)
 ---
 RowKey: 726f7732
 = (column=0001, value=45, timestamp=1308744931127000)
 = (column=0002, value=42, timestamp=1308744931128000)
 = (column=0003, value=33, timestamp=1308744932722000)
 2 Rows Returned.
 [default@Test] update column family test with comparator = 'LongType';
 comparators do not match.
 {code}
 Same question as before: my columns contain only longs, so why can't I?
 {code}
 [default@Test] update column family test with comparator = 'BytesType';
 [default@Test] describe keyspace;  
 Keyspace: Test:
   Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
 Options: [replication_factor:1]
   Column Families:
 ColumnFamily: test
   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
   Default column value validator: 
 org.apache.cassandra.db.marshal.BytesType
   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
   Row cache size / save period in seconds: 0.0/0
   Key cache size / save period in seconds: 20.0/14400
   Memtable thresholds: 0.571875/122/1440 (millions of ops/MB/minutes)
   GC grace seconds: 864000
   Compaction min/max thresholds: 4/32
   Read repair chance: 1.0

[jira] [Created] (CASSANDRA-2813) more info on logging when SSTable cannot create the builder due to version mismatch

2011-06-22 Thread Jackson Chung (JIRA)
more info on logging when SSTable cannot create the builder due to version 
mismatch
---

 Key: CASSANDRA-2813
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2813
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor
 Attachments: 2813.patch

When running into the following:

2011-06-21 22:44:43,308 INFO [org.apache.cassandra.streaming.StreamOutSession] 
- Streaming to /10.128.64.163
2011-06-21 22:44:51,993 ERROR 
[org.apache.cassandra.service.AbstractCassandraDaemon] - Fatal exception in 
thread Thread[Thread-17651,5,main]
java.lang.RuntimeException: Cannot recover SSTable with version a (current 
version f).
at 
org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:237)
at 
org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:938)
at 
org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:107)
at 
org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:112)
at 
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
at 
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91)

There is no indication of which SSTable is at fault. To recover from this, one 
would need to run nodetool scrub.

This may, however, take some time depending on the SSTables' sizes, and it is 
possible that only one keyspace or CF needs to be rebuilt by scrub.

It'd be nice to print more details of the SSTable here in case the end user 
prefers to scrub just the keyspace/CF in question.
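
As a rough sketch of the kind of message being asked for (this is not the attached patch), the exception text could carry the file that failed. SSTableVersionError, its parameters, and the path used in main are hypothetical and chosen only for illustration.

```java
// Hypothetical helper; the real change would live where the exception is thrown.
class SSTableVersionError {
    // Builds an exception whose message names the offending SSTable file.
    static RuntimeException mismatch(String sstablePath, String found, String current) {
        return new RuntimeException(String.format(
                "Cannot recover SSTable %s with version %s (current version %s).",
                sstablePath, found, current));
    }

    public static void main(String[] args) {
        // Made-up path, for illustration only.
        RuntimeException e = mismatch("/data/Keyspace1/Standard1-a-12-Data.db", "a", "f");
        System.out.println(e.getMessage());
    }
}
```

With the file named in the message, an operator could tell which keyspace and column family to scrub instead of scrubbing everything.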






[jira] [Updated] (CASSANDRA-2813) more info on logging when SSTable cannot create the builder due to version mismatch

2011-06-22 Thread Jackson Chung (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jackson Chung updated CASSANDRA-2813:
-

Attachment: 2813.patch

based on trunk

 more info on logging when SSTable cannot create the builder due to version 
 mismatch
 ---

 Key: CASSANDRA-2813
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2813
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Priority: Minor
 Attachments: 2813.patch


 When running into the following:
 2011-06-21 22:44:43,308 INFO 
 [org.apache.cassandra.streaming.StreamOutSession] - Streaming to 
 /10.128.64.163
 2011-06-21 22:44:51,993 ERROR 
 [org.apache.cassandra.service.AbstractCassandraDaemon] - Fatal exception in 
 thread Thread[Thread-17651,5,main]
 java.lang.RuntimeException: Cannot recover SSTable with version a (current 
 version f).
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:237)
 at 
 org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:938)
 at 
 org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:107)
 at 
 org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:112)
 at 
 org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
 at 
 org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91)
 There is no indication of which SSTable is at fault. To recover from this, 
 one would need to run nodetool scrub.
 This may, however, take some time depending on the SSTables' sizes, and it is 
 possible that only one keyspace or CF needs to be rebuilt by scrub.
 It'd be nice to print more details of the SSTable here in case the end user 
 prefers to scrub just the keyspace/CF in question.





[jira] [Commented] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever commented on CASSANDRA-1125:
--

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
limits in CFIF.

And I can use an {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}.

But the two approaches can't be combined. So I guess ConfigHelper could have 
methods setInputKeyRange(..) and setInputIndexClause(..) which are mutually 
exclusive to call.
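
A sketch of what mutually exclusive setters could look like. This is only an illustration of the idea: java.util.Properties stands in for the Hadoop job configuration, and the class name, property keys and string arguments are all hypothetical rather than the real ConfigHelper API.

```java
import java.util.Properties;

// Hypothetical stand-in for ConfigHelper; Properties replaces the Hadoop Configuration.
class InputFilterConfig {
    static final String KEY_RANGE = "example.input.keyrange";       // made-up key
    static final String INDEX_CLAUSE = "example.input.indexclause"; // made-up key

    // Restrict the input to a start/end row key range.
    static void setInputKeyRange(Properties conf, String startKey, String endKey) {
        if (conf.containsKey(INDEX_CLAUSE))
            throw new IllegalStateException("an index clause is already set");
        conf.setProperty(KEY_RANGE, startKey + ":" + endKey);
    }

    // Restrict the input with an index clause instead.
    static void setInputIndexClause(Properties conf, String clause) {
        if (conf.containsKey(KEY_RANGE))
            throw new IllegalStateException("a key range is already set");
        conf.setProperty(INDEX_CLAUSE, clause);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setInputKeyRange(conf, "a", "m");
        try {
            setInputIndexClause(conf, "birthdate EQ 1");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at configuration time keeps the conflict out of the record reader, where it would be harder to diagnose.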



 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.





[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/22/11 9:08 PM:
---

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}} But both 
approaches can't be combined. So i guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?



  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}} But both 
approaches can't be combined. So i guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little earlier about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.





[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/22/11 9:08 PM:
---

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}} But both 
approaches can't be combined. So i guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little earlier about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?



  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
limits in CFIF.

And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}}

But both approaches can't be combined. So i guess ConfigHelper could have 
methods setInputKeyRange(..) and setInputIndexClause(..) which are mutually 
exclusive to call.


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.





[jira] [Issue Comment Edited] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053467#comment-13053467
 ] 

Mck SembWever edited comment on CASSANDRA-1125 at 6/22/11 9:10 PM:
---

I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}} But both 
approaches can't be combined. So i guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible? (it 
needs to pass through the batch's range)



  was (Author: michaelsembwever):
I can use a {{KeyRange}} and {{Range.intersectionWith(..)}} for start/end 
rowKey limits in CFIF.

-And i can use a {{IndexClause}} (which also permits a start_key) and then 
{{get_indexed_slices(..)}} in CFRR's {{RowIterator.maybeInit()}} But both 
approaches can't be combined. So i guess ConfigHelper could have methods 
setInputKeyRange(..) and setInputIndexClause(..) which are mutually exclusive 
to call.- Spoke a little early here about using {{get_indexed_slices}}. I can't 
see how IndexClause can specify a start/end rowKey - is this possible?


  
 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Priority: Minor
 Fix For: 1.0


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.





[jira] [Issue Comment Edited] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-06-22 Thread Brian Palmer (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053365#comment-13053365
 ] 

Brian Palmer edited comment on CASSANDRA-2500 at 6/22/11 9:50 PM:
--

Jon,

jbellis asked me to take a quick look at this, but I haven't had much luck so 
far getting this to work with DBI 0.4.3, ruby 1.8.7, cassandra 0.8.0 on OS X. 
The short list:

 * ruby syntax error in DBI::DBD::Cass::Database#execute_prepared (missing 
comma)
 * Cass::Driver calls super with VERSION, which is 0.0.0, but DBI is expecting 
the DBI API version (0.4.0), so an exception is thrown on init
 * initializing a new DBI connection fails because 
DBI::DBD::Cass::Database#active? does not check whether @tconn is nil before 
calling @tconn.current_server
 * DBI::DBD::Cass::Statement#execute references db, which isn't defined; it 
probably means to use the @db instance variable
 * After working around the above errors, I get an exception when running a 
select query like `dbh.execute("select * from users;").fetch` : NoMethodError: 
undefined method `size' for #<CassandraThrift::CqlRow:0x102badf58>

I'm not entirely sure if any of these last four are issues specific to my 
environment, but it seems unlikely that they all are. The require statements in 
Cass.rb definitely need cleaning up as well, but it sounds like that's still an 
open question. Also, I'd suggest changing all those define_method calls in 
Cass.rb to normal method definitions using def; the current form is very 
unidiomatic Ruby.

  was (Author: codekitchen):
Jon,

jbellis asked me to take a quick look at this, but I haven't had much luck so 
far getting this to work with DBI 0.4.3, ruby 1.8.7, cassandra 0.8.0 on OS X. 
The short list:

 * ruby syntax error in DBI::DBD::Cass::Database#execute_prepared (missing 
comma)
 * Cass::Driver calls super with VERSION which is 0.0.0, but DBI is expecting 
the DBI API version (0.4.0), so an exception is thrown on init
 * initializing a new DBI connection fails due to 
DBI::DBD::Cass::Database#active? not checking if @tconn is nil before calling 
@tconn.current_server
 * DBI::DBD::Cass::Statement#execute references db, which isn't defined; it 
probably means to use the @db instance variable
 * After working around the above errors, I get an exception when running a 
select query like `dbh.execute("select * from users;")` : NoMethodError: 
undefined method `size' for #<CassandraThrift::CqlRow:0x102badf58>

I'm not entirely sure if any of these last four are issues specific to my 
environment, but it seems unlikely that they all are. The require statements in 
Cass.rb definitely need cleaning up as well, but it sounds like that's still an 
open question. Also, I'd suggest changing all those define_method calls in 
Cass.rb to normal method definitions using def; the current form is very 
unidiomatic Ruby.
  
 Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter
 ---

 Key: CASSANDRA-2500
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2500
 Project: Cassandra
  Issue Type: Task
  Components: API
Reporter: Jon Hermes
Assignee: Pavel Yaskevich
  Labels: cql
 Fix For: 0.8.2

 Attachments: 2500.txt, genthriftrb.txt, rbcql-0.0.0.tgz


 Create a ruby driver for CQL.
 Lacking something standard (such as py-dbapi), going with something common 
 instead -- RoR ActiveRecord Connection Adapter 
 (http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/AbstractAdapter.html).





[jira] [Commented] (CASSANDRA-2500) Ruby dbi client (for CQL) that conforms to AR:ConnectionAdapter

2011-06-22 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053494#comment-13053494
 ] 

Pavel Yaskevich commented on CASSANDRA-2500:


Another good change would be to rename Cass to Cassandra. Thanks for your 
review, Brian! I will take this over from Jon from now on.





[jira] [Assigned] (CASSANDRA-1125) Filter out ColumnFamily rows that aren't part of the query

2011-06-22 Thread Mck SembWever (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mck SembWever reassigned CASSANDRA-1125:


Assignee: Mck SembWever

 Filter out ColumnFamily rows that aren't part of the query
 --

 Key: CASSANDRA-1125
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1125
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Jeremy Hanna
Assignee: Mck SembWever
Priority: Minor
 Fix For: 1.0


 Currently, when running a MapReduce job against data in a Cassandra data 
 store, it reads through all the data for a particular ColumnFamily.  This 
 could be optimized to only read through those rows that have to do with the 
 query.
 It's a small change but wanted to put it in Jira so that it didn't fall 
 through the cracks.





[jira] [Commented] (CASSANDRA-2317) Column family deletion time is not always reseted after gc_grace

2011-06-22 Thread Jeffrey Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053510#comment-13053510
 ] 

Jeffrey Wang commented on CASSANDRA-2317:
-

Does anyone know whether this is fixed in 0.8? We are thinking of upgrading 
soon, but I don't want to try to apply the 0.7 patch to 0.8...

 Column family deletion time is not always reseted after gc_grace
 

 Key: CASSANDRA-2317
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2317
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.6
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.7.7

 Attachments: 
 0001-Add-AbstractColumnContainer-to-factor-common-parts-o.patch, 
 0002-Add-unit-test.patch, 
 0003-Reset-CF-and-SC-deletion-time-after-compaction.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Follow up of CASSANDRA-2305.
 Reproducible (thanks to Jeffrey Wang) by: 
 Create a CF with gc_grace_seconds = 0 and no row cache.
 Insert row X, col A with timestamp 0.
 Insert row X, col B with timestamp 2.
 Remove row X with timestamp 1 (expect col A to disappear, col B to stay).
 Wait 1 second.
 Force flush and compaction.
 Insert row X, col A with timestamp 0.
 Read row X, col A (see nothing).
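
The reproduction steps above can be walked through with a small model. This is only a sketch under stated assumptions: the class `RowTombstoneSketch` and all of its methods are hypothetical and are not Cassandra's code. It models a row as a map of column timestamps plus a row-level deletion time, and shows why a stale deletion time keeps shadowing a later insert of col A with timestamp 0.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of one row; not the real ColumnFamily class.
final class RowTombstoneSketch {
    static final long NOT_DELETED = Long.MIN_VALUE;

    // column name -> write timestamp
    private final Map<String, Long> columns = new HashMap<>();
    // row-level tombstone: columns written at or before this time are shadowed
    private long markedForDeleteAt = NOT_DELETED;

    void insert(String column, long timestamp) {
        Long existing = columns.get(column);
        if (existing == null || timestamp > existing) {
            columns.put(column, timestamp);
        }
    }

    void removeRow(long timestamp) {
        markedForDeleteAt = Math.max(markedForDeleteAt, timestamp);
    }

    // Models "wait past gc_grace, then flush and compact": shadowed columns are
    // purged. The flag selects whether the row deletion time is also reset.
    void compactAfterGcGrace(boolean resetDeletionTime) {
        final long tombstone = markedForDeleteAt;
        columns.values().removeIf(ts -> ts <= tombstone);
        if (resetDeletionTime) {
            markedForDeleteAt = NOT_DELETED;
        }
    }

    boolean isVisible(String column) {
        Long ts = columns.get(column);
        return ts != null && ts > markedForDeleteAt;
    }

    // Replays the steps from the issue and reports whether col A can be read.
    static boolean reproduce(boolean resetDeletionTime) {
        RowTombstoneSketch row = new RowTombstoneSketch();
        row.insert("A", 0);
        row.insert("B", 2);
        row.removeRow(1);
        row.compactAfterGcGrace(resetDeletionTime);
        row.insert("A", 0);
        return row.isVisible("A");
    }

    public static void main(String[] args) {
        System.out.println("deletion time kept:  col A visible = " + reproduce(false));
        System.out.println("deletion time reset: col A visible = " + reproduce(true));
    }
}
```

In this model the re-inserted col A is hidden only when the row deletion time survives compaction, which matches the "see nothing" symptom in the last step.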





[jira] [Commented] (CASSANDRA-2317) Column family deletion time is not always reseted after gc_grace

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053534#comment-13053534
 ] 

Jonathan Ellis commented on CASSANDRA-2317:
---

Unresolved means not fixed anywhere yet.





[jira] [Created] (CASSANDRA-2814) Don't create data/commitlog/saved_caches directories in rpm package

2011-06-22 Thread Nick Bailey (JIRA)
Don't create data/commitlog/saved_caches directories in rpm package
---

 Key: CASSANDRA-2814
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2814
 Project: Cassandra
  Issue Type: Bug
  Components: Packaging
Affects Versions: 0.8.1
Reporter: Nick Bailey
Priority: Minor
 Fix For: 0.8.2


There is no need to create these directories since cassandra will create them 
if they don't exist. If you install the package and these directories already 
exist as symlinks, the package will replace them.





svn commit: r1138692 - /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java

2011-06-22 Thread jbellis
Author: jbellis
Date: Wed Jun 22 23:39:50 2011
New Revision: 1138692

URL: http://svn.apache.org/viewvc?rev=1138692&view=rev
Log:
add path for sstable version failure message
patch by Jackson Chung; reviewed by jbellis for CASSANDRA-2813

Modified:

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java

Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java?rev=1138692&r1=1138691&r2=1138692&view=diff
==
--- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java (original)
+++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java Wed Jun 22 23:39:50 2011
@@ -237,8 +237,8 @@ public class SSTableWriter extends SSTab
         if (!desc.isLatestVersion)
             // TODO: streaming between different versions will fail: need support for
             // recovering other versions to provide a stable streaming api
-            throw new RuntimeException(String.format("Cannot recover SSTable with version %s (current version %s).",
-                                                     desc.version, Descriptor.CURRENT_VERSION));
+            throw new RuntimeException(String.format("Cannot recover SSTable %s due to version mismatch. (current version is %s).", desc.toString()
+                                                     , Descriptor.CURRENT_VERSION));
 
         return new Builder(desc, type);
     }
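
To illustrate what the new message looks like, here is a minimal standalone sketch. It reuses only the format string from the commit; the class `VersionMismatchMessage`, the method `describe`, and the sample path are hypothetical, and the real code passes `desc.toString()` and `Descriptor.CURRENT_VERSION` instead of plain strings.

```java
// Hypothetical helper that reproduces only the message formatting from the patch.
final class VersionMismatchMessage {
    static String describe(String sstableDescription, String currentVersion) {
        return String.format(
            "Cannot recover SSTable %s due to version mismatch. (current version is %s).",
            sstableDescription, currentVersion);
    }

    public static void main(String[] args) {
        // The path below is made up for illustration only.
        String message = describe("/var/lib/cassandra/data/Keyspace1/Standard1-a-12-Data.db", "f");
        System.out.println(message);
    }
}
```

With the SSTable identified in the message, an operator can tell which keyspace and column family the failing file belongs to.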




[jira] [Resolved] (CASSANDRA-2813) more info on logging when SSTable cannot create the builder due to version mismatch

2011-06-22 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-2813.
---

   Resolution: Fixed
Fix Version/s: 0.8.2
 Reviewer: jbellis
 Assignee: Jackson Chung

committed, thanks!

 more info on logging when SSTable cannot create the builder due to version 
 mismatch
 ---

 Key: CASSANDRA-2813
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2813
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Assignee: Jackson Chung
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2813.patch


 When you run into the following:
 2011-06-21 22:44:43,308 INFO 
 [org.apache.cassandra.streaming.StreamOutSession] - Streaming to 
 /10.128.64.163
 2011-06-21 22:44:51,993 ERROR 
 [org.apache.cassandra.service.AbstractCassandraDaemon] - Fatal exception in 
 thread Thread[Thread-17651,5,main]
 java.lang.RuntimeException: Cannot recover SSTable with version a (current 
 version f).
 at 
 org.apache.cassandra.io.sstable.SSTableWriter.createBuilder(SSTableWriter.java:237)
 at 
 org.apache.cassandra.db.CompactionManager.submitSSTableBuild(CompactionManager.java:938)
 at 
 org.apache.cassandra.streaming.StreamInSession.finished(StreamInSession.java:107)
 at 
 org.apache.cassandra.streaming.IncomingStreamReader.readFile(IncomingStreamReader.java:112)
 at 
 org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:61)
 at 
 org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:91)
 There is no indication of which SSTable is at fault. To recover from this, 
 one would need to run nodetool scrub.
 This may, however, take some time, depending on the SSTables' sizes, and it is 
 possible that only one keyspace or CF needs to be rebuilt by scrub.
 It'd be nice to print more details of the SSTable here in case the end user 
 prefers to scrub only the keyspace/CF in question.





[jira] [Commented] (CASSANDRA-2677) Optimize streaming to be single-pass

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053545#comment-13053545
 ] 

Jonathan Ellis commented on CASSANDRA-2677:
---

bq. one to write the Data component to disk from the socket

(IncomingTcpConnection.stream)

bq. another to build the [row] index and bloom filter from it

(StreamInSession.finished / CompactionManager.instance.submitSSTableBuild -- 
this is NOT talking about the buildSecondaryIndexes pass for column indexes, 
which we can't optimize away... yet)

 Optimize streaming to be single-pass
 

 Key: CASSANDRA-2677
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2677
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 0.8.2


 Streaming currently is a two-pass operation: one to write the Data component 
 to disk from the socket, then another to build the index and bloom filter 
 from it.  This means we do about 2x the i/o we would if we created the index 
 and BF during the original write.
 For node movement this was not considered to be a Big Deal because the stream 
 target is not a member of the ring, so we can be inefficient without hurting 
 live queries.  But optimizing node movement to not require un/rebootstrap 
 (CASSANDRA-1427) and bulk load (CASSANDRA-1278) mean we can stream to live 
 nodes too.
 The main obstacle here is we don't know how many keys will be in the new 
 sstable ahead of time, which we need to size the bloom filter correctly. We 
 can solve this by including that information (or a close approximation) in 
 the stream setup -- the source node can calculate that without hitting disk 
 from the in-memory index summary.
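
As a rough illustration of why the key count matters, the sketch below estimates the number of keys from a sampled summary and sizes a bloom filter with the textbook formulas m = -n ln p / (ln 2)^2 and k = (m / n) ln 2. Everything here is an assumption for illustration: the class `BloomSizingSketch`, its method names, the sampling interval of 128, and the 1% false-positive target are hypothetical, and this is not how Cassandra's own BloomFilter chooses its parameters.

```java
// Hypothetical sizing helper; uses the standard bloom filter formulas only.
final class BloomSizingSketch {
    // Assumed: the in-memory summary keeps one sample per this many index entries.
    static final int ASSUMED_INDEX_INTERVAL = 128;

    // Approximate key count the source could send in the stream setup.
    static long estimateKeys(long summaryEntries, int indexInterval) {
        return summaryEntries * indexInterval;
    }

    // Number of bits m for n expected keys and false-positive rate p.
    static long optimalBits(long expectedKeys, double falsePositiveRate) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2));
    }

    // Number of hash functions k for m bits and n expected keys.
    static int optimalHashCount(long bits, long expectedKeys) {
        return Math.max(1, (int) Math.round((double) bits / expectedKeys * Math.log(2)));
    }

    public static void main(String[] args) {
        long keys = estimateKeys(10_000, ASSUMED_INDEX_INTERVAL);
        long bits = optimalBits(keys, 0.01);
        int hashes = optimalHashCount(bits, keys);
        System.out.println("estimated keys=" + keys + " bits=" + bits + " hashes=" + hashes);
    }
}
```

Because the bit count grows linearly with the expected key count, an estimate that is close is enough; a badly low estimate would raise the false-positive rate of the filter built during the single pass.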





[jira] [Commented] (CASSANDRA-2813) more info on logging when SSTable cannot create the builder due to version mismatch

2011-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053583#comment-13053583
 ] 

Hudson commented on CASSANDRA-2813:
---

Integrated in Cassandra-0.8 #185 (See 
[https://builds.apache.org/job/Cassandra-0.8/185/])
add path for sstable version failure message
patch by Jackson Chung; reviewed by jbellis for CASSANDRA-2813

jbellis : 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1138692
Files : 
* 
/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java






[jira] [Commented] (CASSANDRA-2808) add java vendor/versoin to cassandra startup logging

2011-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053582#comment-13053582
 ] 

Hudson commented on CASSANDRA-2808:
---

Integrated in Cassandra-0.8 #185 (See 
[https://builds.apache.org/job/Cassandra-0.8/185/])


 add java vendor/versoin to cassandra startup logging
 

 Key: CASSANDRA-2808
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2808
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jackson Chung
Assignee: Jackson Chung
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2808.patch


 Currently, determining exactly which Java is being used by the CassandraDaemon 
 JVM can be difficult. Some may have used an rpm/deb Java package; others may 
 have used a tarball and set JAVA_HOME, PATH, etc.
 It can be done, but may take some iteration to get the true answer if 
 one's setup is complicated (the user is root/cassandra and the env settings 
 differ between the cassandra startup user and the login user).
 It would be very helpful to have this information simply logged in the log 
 file, right at the beginning. This helps identify the Java type/version 
 quickly without much operational overhead, and is easily done in a one-liner:
 logger.info("Java vendor/version: {}/{}", System.getProperty("java.vm.name"), 
 System.getProperty("java.version"));
 In OpenJDK, you will see something similar to: 
  INFO [main] 2011-06-22 07:08:16,610 AbstractCassandraDaemon.java (line 95) 
 Java vendor/version: OpenJDK 64-Bit Server VM/1.6.0_20
 In Java(TM), you will get something like:
  INFO [main] 2011-06-22 00:15:34,936 AbstractCassandraDaemon.java (line 96) 
 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_24
 This little addition will go a long way.
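
For anyone who wants to check the values on their own machine before the patch lands, here is a minimal standalone sketch. It reads the same two system properties as the proposed one-liner but prints with `System.out` instead of the Cassandra logger; the class name `JavaVersionBanner` and the method `banner` are hypothetical.

```java
// Hypothetical standalone check; the real patch logs through logger.info instead.
final class JavaVersionBanner {
    static String banner() {
        return String.format("Java vendor/version: %s/%s",
                             System.getProperty("java.vm.name"),
                             System.getProperty("java.version"));
    }

    public static void main(String[] args) {
        System.out.println(banner());
    }
}
```

Running this with the same java binary and environment that starts Cassandra should show the values the daemon would log.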





[jira] [Commented] (CASSANDRA-2803) Cassandra deb should depend on libjna-java

2011-06-22 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053584#comment-13053584
 ] 

Hudson commented on CASSANDRA-2803:
---

Integrated in Cassandra-0.8 #185 (See 
[https://builds.apache.org/job/Cassandra-0.8/185/])


 Cassandra deb should depend on libjna-java
 --

 Key: CASSANDRA-2803
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2803
 Project: Cassandra
  Issue Type: Improvement
  Components: Packaging
Reporter: paul cannon
Priority: Minor
 Fix For: 0.8.2

 Attachments: 2803.txt


 Cassandra debs (0.7, 0.8, trunk) currently include a Recommends: for 
 libjna-java, the package that includes the JNA jar. The original reason for 
 the Recommends: instead of Depends: was that it's technically possible to run 
 without JNA.
 However, since (a) I know of no reason not to use JNA, and (b) the Cassandra 
 RPMs already require JNA, let us change this Recommends: to Depends: for all 
 future debs.
 I don't believe this affects the licensing issues which stopped us from 
 bundling JNA with cassandra directly.





svn commit: r1138710 - /cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOut.java

2011-06-22 Thread jbellis
Author: jbellis
Date: Thu Jun 23 02:14:00 2011
New Revision: 1138710

URL: http://svn.apache.org/viewvc?rev=1138710&view=rev
Log:
improve streaming comments

Modified:

cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOut.java

Modified: cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOut.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOut.java?rev=1138710&r1=1138709&r2=1138710&view=diff
==
--- cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOut.java (original)
+++ cassandra/branches/cassandra-0.7/src/java/org/apache/cassandra/streaming/StreamOut.java Thu Jun 23 02:14:00 2011
@@ -42,18 +42,31 @@ import org.apache.cassandra.utils.Pair;
 /**
  * This class handles streaming data from one node to another.
  *
- * The source node is in charge of the streaming session.  It begins the stream by sending
+ * The source node [the Out side] is always in charge of the streaming session.  Streams may
+ * be initiated either directly by the source via the methods in this class,
+ * or on demand from the target (via StreamRequest).
+ *
+ * Files to stream are grouped into sessions, which can have callbacks associated
+ * with them so that (for instance) we can mark a new node a full member of the
+ * cluster after all the data it needs has been streamed.
+ *
+ * The source begins a session by sending
  * a Message with the stream bit flag in the Header turned on.  Part of that Message
  * will include a StreamHeader that includes the files that will be streamed as part
  * of that session, as well as the first file-to-be-streamed. (Combining session list
  * and first file like this is inconvenient, but not as inconvenient as the old
  * three-part send-file-list, wait-for-ack, start-first-file dance.)
  *
- * After each file, the target will send a StreamReply indicating success
+ * This is done over a separate TCP connection to avoid blocking ordinary intra-node
+ * traffic during the stream.  So there is no Handler for the main stream of data --
+ * when a connection sets the Stream bit, IncomingTcpConnection knows what to expect
+ * without any further Messages.
+ *
+ * After each file, the target node [the In side] will send a StreamReply indicating success
  * (FILE_FINISHED) or failure (FILE_RETRY).
  *
- * When all files have been successfully transferred and integrated the source will send
- * SESSION_FINISHED and the session is complete.
+ * When all files have been successfully transferred and integrated the target will
+ * send an additional SESSION_FINISHED reply and the session is complete.
  *
  * For Stream requests (for bootstrap), one subtlety is that we always have to
  * create at least one stream reply, even if the list of files is empty, otherwise the




[jira] [Commented] (CASSANDRA-2677) Optimize streaming to be single-pass

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053604#comment-13053604
 ] 

Jonathan Ellis commented on CASSANDRA-2677:
---

The javadoc for the StreamOut class has a good overview of the streaming [file 
transfer] process.
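
The reply flow described in that javadoc (FILE_FINISHED or FILE_RETRY after each file, then SESSION_FINISHED from the target) can be sketched from the source's point of view as below. This is only an illustration under assumptions: the class `StreamSessionSketch`, the `Reply` enum, and the method names are hypothetical, and only the three status names come from the javadoc.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical source-side bookkeeping for one streaming session.
final class StreamSessionSketch {
    enum Reply { FILE_FINISHED, FILE_RETRY, SESSION_FINISHED }

    private final Deque<String> pending;
    private boolean complete = false;

    StreamSessionSketch(String... files) {
        pending = new ArrayDeque<>(Arrays.asList(files));
    }

    // File currently being streamed, or null when none are left.
    String current() {
        return pending.peekFirst();
    }

    boolean isComplete() {
        return complete;
    }

    // Applies one reply from the target to the session state.
    void onReply(Reply reply) {
        switch (reply) {
            case FILE_FINISHED:
                pending.pollFirst(); // move on to the next file, if any
                break;
            case FILE_RETRY:
                // keep the same file at the head so it is sent again
                break;
            case SESSION_FINISHED:
                if (!pending.isEmpty()) {
                    throw new IllegalStateException("session finished with files still pending");
                }
                complete = true;
                break;
        }
    }

    public static void main(String[] args) {
        StreamSessionSketch session = new StreamSessionSketch("first-Data.db", "second-Data.db");
        session.onReply(Reply.FILE_RETRY);
        session.onReply(Reply.FILE_FINISHED);
        session.onReply(Reply.FILE_FINISHED);
        session.onReply(Reply.SESSION_FINISHED);
        System.out.println("session complete: " + session.isComplete());
    }
}
```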





[jira] [Commented] (CASSANDRA-2677) Optimize streaming to be single-pass

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053605#comment-13053605
 ] 

Jonathan Ellis commented on CASSANDRA-2677:
---

bq. the source node can calculate that without hitting disk from the in-memory 
index summary

(referring to SSTableReader.indexSummary)





svn commit: r1138717 - in /cassandra/branches/cassandra-0.8: ./ contrib/ debian/ interface/thrift/gen-java/org/apache/cassandra/thrift/ src/java/org/apache/cassandra/config/ src/java/org/apache/cassan

2011-06-22 Thread jbellis
Author: jbellis
Date: Thu Jun 23 02:47:15 2011
New Revision: 1138717

URL: http://svn.apache.org/viewvc?rev=1138717&view=rev
Log:
merge from 0.7

Modified:
cassandra/branches/cassandra-0.8/   (props changed)
cassandra/branches/cassandra-0.8/contrib/   (props changed)
cassandra/branches/cassandra-0.8/debian/control

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/config/DatabaseDescriptor.java

cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/streaming/StreamOut.java

Propchange: cassandra/branches/cassandra-0.8/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:47:15 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7:1026516-1133874,1135638,1138148
+/cassandra/branches/cassandra-0.7:1026516-1138710
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
 /cassandra/branches/cassandra-0.8:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0:1125021-1130369

Propchange: cassandra/branches/cassandra-0.8/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:47:15 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
-/cassandra/branches/cassandra-0.7/contrib:1026516-1133874,1135638,1138148
+/cassandra/branches/cassandra-0.7/contrib:1026516-1138710
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
 /cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125041
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369

Modified: cassandra/branches/cassandra-0.8/debian/control
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/debian/control?rev=1138717&r1=1138716&r2=1138717&view=diff
==
--- cassandra/branches/cassandra-0.8/debian/control (original)
+++ cassandra/branches/cassandra-0.8/debian/control Thu Jun 23 02:47:15 2011
@@ -2,7 +2,7 @@ Source: cassandra
 Section: misc
 Priority: extra
 Maintainer: Eric Evans eev...@apache.org
-Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7)
+Build-Depends: debhelper (>= 5), openjdk-6-jdk (>= 6b11) | java6-sdk, ant (>= 1.7), ant-optional (>= 1.7)
 Homepage: http://cassandra.apache.org
 Vcs-Svn: https://svn.apache.org/repos/asf/cassandra/trunk
 Vcs-Browser: http://svn.apache.org/viewvc/cassandra/trunk

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:47:15 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1133874,1135638,1138148
+/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1138710
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654
 
/cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1090934-1125013,1125041
 
/cassandra/branches/cassandra-0.8.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1125021-1130369

Propchange: 
cassandra/branches/cassandra-0.8/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:47:15 2011
@@ -1,5 +1,5 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
-/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java:1026516-1133874,1135638,1138148

svn commit: r1138718 - in /cassandra/trunk: ./ contrib/ debian/ interface/thrift/gen-java/org/apache/cassandra/thrift/

2011-06-22 Thread jbellis
Author: jbellis
Date: Thu Jun 23 02:48:35 2011
New Revision: 1138718

URL: http://svn.apache.org/viewvc?rev=1138718&view=rev
Log:
merge from 0.8

Modified:
cassandra/trunk/   (props changed)
cassandra/trunk/CHANGES.txt
cassandra/trunk/contrib/   (props changed)
cassandra/trunk/debian/changelog

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Column.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/InvalidRequestException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/NotFoundException.java
   (props changed)

cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/SuperColumn.java
   (props changed)

Propchange: cassandra/trunk/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:48:35 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6:922689-1052356,1052358-1053452,1053454,1053456-1131291
 /cassandra/branches/cassandra-0.7:1026516-1133874,1135638,1138148
 /cassandra/branches/cassandra-0.7.0:1053690-1055654
-/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1137774,1137982,1137984,1138149
+/cassandra/branches/cassandra-0.8:1090934-1125013,1125019-1137982,1137984,1138149
 /cassandra/branches/cassandra-0.8.0:1125021-1130369
 /cassandra/branches/cassandra-0.8.1:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3:1051699-1053689

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1138718&r1=1138717&r2=1138718&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Thu Jun 23 02:48:35 2011
@@ -81,12 +81,20 @@
  * make stress.jar executable (CASSANDRA-2744)
  * add daemon mode to java stress (CASSANDRA-2267)
  * expose the DC and rack of a node through JMX and nodetool ring 
(CASSANDRA-2531)
+<<<<<<< .working
  * fix cache mbean getSize (CASSANDRA-2781)
  * Add Date, Float, Double, and Boolean types (CASSANDRA-2530)
  * fix repair hanging if a neighbor has nothing to send (CASSANDRA-2797)
  * add jamm agent to cassandra.bat (CASSANDRA-2787)
  * Fix wrong purge of deleted cf during compaction (CASSANDRA-2786)
  * purge tombstone even if row is in only one sstable (CASSANDRA-2801)
+=======
+ * fix cache mbean getSize (CASSANDRA-2781)
+ * Add Date, Float, Double, and Boolean types (CASSANDRA-2530)
+ * Add startup flag to renew counter node id (CASSANDRA-2788)
+ * add jamm agent to cassandra.bat (CASSANDRA-2787)
+ * fix repair hanging if a neighbor has nothing to send (CASSANDRA-2797)
+>>>>>>> .merge-right.r1137981
 
 
 0.8.0-final

Propchange: cassandra/trunk/contrib/
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:48:35 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/contrib:922689-1052356,1052358-1053452,1053454,1053456-1068009
 /cassandra/branches/cassandra-0.7/contrib:1026516-1133874,1135638,1138148
 /cassandra/branches/cassandra-0.7.0/contrib:1053690-1055654
-/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1137774,1137982,1137984,1138149
+/cassandra/branches/cassandra-0.8/contrib:1090934-1125013,1125019-1137982,1137984,1138149
 /cassandra/branches/cassandra-0.8.0/contrib:1125021-1130369
 /cassandra/branches/cassandra-0.8.1/contrib:1101014-1125018
 /cassandra/tags/cassandra-0.7.0-rc3/contrib:1051699-1053689

Modified: cassandra/trunk/debian/changelog
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/debian/changelog?rev=1138718&r1=1138717&r2=1138718&view=diff
==
--- cassandra/trunk/debian/changelog (original)
+++ cassandra/trunk/debian/changelog Thu Jun 23 02:48:35 2011
@@ -2,7 +2,7 @@ cassandra (0.8.1) unstable; urgency=low
 
   * New release 
 
- -- Sylvain Lebresne slebre...@apache.org  Thu, 16 Jun 2011 09:37:27 +0200
+ -- Sylvain Lebresne slebre...@apache.org  Thu, 21 Jun 2011 09:37:27 +0200
 
 cassandra (0.8.0) unstable; urgency=low
 

Propchange: 
cassandra/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java
--
--- svn:mergeinfo (original)
+++ svn:mergeinfo Thu Jun 23 02:48:35 2011
@@ -1,7 +1,7 @@
 
/cassandra/branches/cassandra-0.6/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:922689-1052356,1052358-1053452,1053454,1053456-1131291
 
/cassandra/branches/cassandra-0.7/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1026516-1133874,1135638,1138148
 
/cassandra/branches/cassandra-0.7.0/interface/thrift/gen-java/org/apache/cassandra/thrift/Cassandra.java:1053690-1055654

[jira] [Commented] (CASSANDRA-2589) row deletes do not remove columns

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053662#comment-13053662
 ] 

Jonathan Ellis commented on CASSANDRA-2589:
---

Makes sense to me.  Can you add a comment with that explanation?

 row deletes do not remove columns
 -

 Key: CASSANDRA-2589
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2589
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Aaron Morton
Assignee: Aaron Morton
Priority: Minor
 Fix For: 0.8.2

 Attachments: 
 0001-remove-deleted-columns-before-flushing-memtable-v07.patch, 
 0001-remove-deleted-columns-before-flushing-memtable-v08.patch


 When a row delete is issued, CF.delete() sets the localDeletionTime and 
 markedForDeleteAt values but does not remove columns that have a lower 
 timestamp. As a result:
 # Memory which could be freed is held on to (probably not too bad, as it's 
 already counted).
 # The deleted columns are serialised to disk, along with the CF info to say 
 they are no longer valid.
 # NamesQueryFilter and SliceQueryFilter have to do more work, as they filter 
 out the irrelevant columns using QueryFilter.isRelevant().
 # Columns written with a lower timestamp after the deletion are also added 
 to the CF without checking markedForDeleteAt.
 This can cause RR to fail; I will create another ticket for that and link it. 
 This ticket is for a fix to remove the columns.
 Two options I could think of:
 # Check for deletion when serialising to SSTable and ignore columns if they 
 have a lower timestamp. Otherwise leave as is, so dead columns stay in memory.
 # Ensure at all times that, if the CF is deleted, all columns it contains 
 have a higher timestamp.
 ## I *think* this would include all column types (DeletedColumn as well), as 
 the CF deletion has the same effect. But not sure.
 ## Deleting (potentially) all columns in delete() will take time. Could track 
 the highest timestamp in the CF so the normal case of deleting all cols does 
 not need to iterate.
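
 The second option above could be sketched roughly as follows. This is only 
 an illustration, not the attached patch: RowSketch and all of its members are 
 hypothetical names, the map of column name to timestamp stands in for real 
 columns, and treating a column whose timestamp equals the deletion timestamp 
 as deleted is an assumption.

```java
import java.util.TreeMap;

/** Hypothetical stand-in for a column family; this is NOT Cassandra's ColumnFamily class. */
final class RowSketch {
    // column name -> write timestamp (a real column would also carry a value)
    private final TreeMap<String, Long> columnTimestamps = new TreeMap<>();
    private long markedForDeleteAt = Long.MIN_VALUE;
    // upper bound on the timestamps of the columns currently held
    private long maxTimestamp = Long.MIN_VALUE;

    /** Stores the column unless the row deletion or a newer copy shadows it. */
    boolean addColumn(String name, long timestamp) {
        if (timestamp <= markedForDeleteAt) {
            return false; // written "before" the row delete: never enters the map
        }
        Long existing = columnTimestamps.get(name);
        if (existing != null && existing >= timestamp) {
            return false; // an equal or newer copy is already present
        }
        columnTimestamps.put(name, timestamp);
        maxTimestamp = Math.max(maxTimestamp, timestamp);
        return true;
    }

    /** Row delete: records the deletion time and drops every shadowed column. */
    void delete(long timestamp) {
        markedForDeleteAt = Math.max(markedForDeleteAt, timestamp);
        if (maxTimestamp <= markedForDeleteAt) {
            columnTimestamps.clear(); // common case: everything is older, no iteration
            return;
        }
        columnTimestamps.values().removeIf(ts -> ts <= markedForDeleteAt);
    }

    int liveColumnCount() {
        return columnTimestamps.size();
    }
}
```

 With this shape, the map only ever holds columns newer than the row 
 deletion, so a flush or a query filter has nothing extra to skip.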
  





svn commit: r1138740 - in /cassandra/trunk: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/compaction/ src/java/org/apache/cassandra/io/sstable/ src/java/org/apache/cassandra/io

2011-06-22 Thread jbellis
Author: jbellis
Date: Thu Jun 23 05:49:35 2011
New Revision: 1138740

URL: http://svn.apache.org/viewvc?rev=1138740&view=rev
Log:
clean up tmpfiles after failed compaction
patch by Aaron Morton; reviewed by slebresne and Stu Hood for CASSANDRA-2468

Modified:
cassandra/trunk/CHANGES.txt
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java

cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java

cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionTask.java
cassandra/trunk/src/java/org/apache/cassandra/io/sstable/Descriptor.java
cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTable.java

cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableDeletingReference.java
cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableReader.java
cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
cassandra/trunk/src/java/org/apache/cassandra/io/util/FileUtils.java
cassandra/trunk/test/unit/org/apache/cassandra/io/sstable/SSTableTest.java

Modified: cassandra/trunk/CHANGES.txt
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1138740&r1=1138739&r2=1138740&view=diff
==
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Thu Jun 23 05:49:35 2011
@@ -7,6 +7,7 @@
(CASSANDRA-2062)
  * Fixed the ability to set compaction strategy in cli using create column 
family command (CASSANDRA-2778)
  * Add startup flag to renew counter node id (CASSANDRA-2788)
+ * clean up tmp files after failed compaction (CASSANDRA-2468)
 
 
 0.8.2

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java?rev=1138740&r1=1138739&r2=1138740&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/ColumnFamilyStore.java Thu Jun 23 05:49:35 2011
@@ -454,7 +454,14 @@ public class ColumnFamilyStore implement
 
             if (components.contains(Component.COMPACTED_MARKER) || desc.temporary)
             {
-                SSTable.delete(desc, components);
+                try
+                {
+                    SSTable.delete(desc, components);
+                }
+                catch (IOException e)
+                {
+                    throw new IOError(e);
+                }
                 continue;
             }
 

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java
URL: 
http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java?rev=1138740&r1=1138739&r2=1138740&view=diff
==
--- cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/Memtable.java Thu Jun 23 05:49:35 2011
@@ -224,14 +224,22 @@ public class Memtable
                                       + keySize // keys in data file
                                       + currentThroughput.get()) // data
                                      * 1.2); // bloom filter and row index overhead
+        SSTableReader ssTable;
+        // errors when creating the writer that may leave empty temp files.
         SSTableWriter writer = cfs.createFlushWriter(columnFamilies.size(), estimatedSize, context);
+        try
+        {
+            // (we can't clear out the map as-we-go to free up memory,
+            //  since the memtable is being used for queries in the "pending flush" category)
+            for (Map.Entry<DecoratedKey, ColumnFamily> entry : columnFamilies.entrySet())
+                writer.append(entry.getKey(), entry.getValue());
 
-        // (we can't clear out the map as-we-go to free up memory,
-        //  since the memtable is being used for queries in the "pending flush" category)
-        for (Map.Entry<DecoratedKey, ColumnFamily> entry : columnFamilies.entrySet())
-            writer.append(entry.getKey(), entry.getValue());
-
-        SSTableReader ssTable = writer.closeAndOpenReader();
+            ssTable = writer.closeAndOpenReader();
+        }
+        finally
+        {
+            writer.cleanupIfNecessary();
+        }
         logger.info(String.format("Completed flushing %s (%d bytes)",
                                   ssTable.getFilename(), new File(ssTable.getFilename()).length()));
         return ssTable;
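
The try/finally shape in the hunk above can be illustrated in isolation. The sketch below is not Cassandra code: TempFileWriterSketch, closeAndPublish, and flush are hypothetical names, and it assumes that "cleanup" simply means deleting a leftover .tmp file with java.nio.file. Only the method name cleanupIfNecessary is borrowed from the diff.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;

/** Hypothetical writer that stages data in a ".tmp" sibling before publishing it. */
final class TempFileWriterSketch {
    private final Path tmp;
    private final Path target;

    TempFileWriterSketch(Path target) throws IOException {
        this.target = target;
        this.tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.createFile(tmp); // the temp file exists from here on
    }

    void append(String line) throws IOException {
        if (line == null) {
            throw new IOException("cannot append a null line"); // simulated write failure
        }
        Files.write(tmp, (line + "\n").getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
    }

    /** Success path: the temp file is renamed to its final name. */
    Path closeAndPublish() throws IOException {
        return Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
    }

    /** No-op after a successful publish; removes the leftover temp file otherwise. */
    void cleanupIfNecessary() throws IOException {
        Files.deleteIfExists(tmp);
    }

    /** Mirrors the flush in the diff: the writer is created first, then guarded by finally. */
    static Path flush(Path target, List<String> lines) throws IOException {
        TempFileWriterSketch writer = new TempFileWriterSketch(target);
        try {
            for (String line : lines) {
                writer.append(line);
            }
            return writer.closeAndPublish();
        } finally {
            writer.cleanupIfNecessary();
        }
    }
}
```

One caveat of this shape: if the cleanup call itself throws inside finally, that exception replaces the one that caused the failure.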

Modified: 
cassandra/trunk/src/java/org/apache/cassandra/db/compaction/CompactionManager.java
URL: 

[jira] [Commented] (CASSANDRA-2777) Pig storage handler should implement LoadMetadata

2011-06-22 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053666#comment-13053666
 ] 

Jonathan Ellis commented on CASSANDRA-2777:
---

Is that +1 otherwise, Jeremy?

 Pig storage handler should implement LoadMetadata
 -

 Key: CASSANDRA-2777
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2777
 Project: Cassandra
  Issue Type: Improvement
  Components: Contrib
Reporter: Brandon Williams
Assignee: Brandon Williams
Priority: Minor
 Fix For: 0.7.7

 Attachments: 2777.txt


 The reason for this is that many builtin functions like SUM won't work on 
 longs (you can work around this using LongSum, but that's lame) because the 
 query planner doesn't know about the types beforehand, even though we are 
 casting to native longs.
 There is some impact to this, though.  With LoadMetadata implemented, 
 existing scripts that specify a schema will need to remove it (since LM is 
 doing it for them), and they will need to conform to LM's terminology (key, 
 columns, name, value) within the script.  This is trivial to change, however, 
 and the increased functionality is worth the switch.
