[jira] [Created] (CASSANDRA-4556) upgrade from SizeTiered to Leveled failed because no enough free space

2012-08-20 Thread Cheng Zhang (JIRA)
Cheng Zhang created CASSANDRA-4556:
--

 Summary: upgrade from SizeTiered to Leveled failed because no 
enough free space
 Key: CASSANDRA-4556
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4556
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.0.10
 Environment: Cassandra 1.0.10, Ubuntu 64bit server.
Reporter: Cheng Zhang


I use Cassandra 1.0.10 with two data directories, initially with 
SizeTieredCompactionStrategy. After some time, the total free space became 
smaller than the biggest data file. At that point I wanted to change the 
compaction strategy to Leveled to save space, but the change failed because 
there was not enough space to compact the biggest data file.
I then changed some code so that, if the biggest data file cannot be compacted, 
the second biggest data file is chosen instead, and so on in the same manner. 
The biggest data file is compacted once there is enough space; as compaction 
proceeds, enough space becomes available.
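
The fallback described above can be sketched as follows. This is only an illustration, not the reporter's actual code change: the helper name pick_compaction_candidate and the rule that compacting a file needs free space at least equal to the file's size are assumptions made for the example.

```python
def pick_compaction_candidate(file_sizes, free_space):
    """Return the size of the largest data file that fits in free_space.

    Hypothetical helper: assumes compacting a file needs at least as much
    free space as the file's own size. Returns None if nothing fits.
    """
    for size in sorted(file_sizes, reverse=True):
        if size <= free_space:
            return size
    return None


# The biggest file (900) does not fit into 500 units of free space,
# so the second biggest (400) is chosen instead.
candidate = pick_compaction_candidate([900, 400, 100], free_space=500)
```

Each successful compaction may free space, so repeating the selection would eventually let the biggest file qualify as well.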

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (CASSANDRA-4557) Cassandra's startup is too slow when the data is more than 1T.

2012-08-20 Thread Cheng Zhang (JIRA)
Cheng Zhang created CASSANDRA-4557:
--

 Summary: Cassandra's startup is too slow when the data is more 
than 1T.
 Key: CASSANDRA-4557
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4557
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.0.10
 Environment: Cassandra 1.0.10
Reporter: Cheng Zhang
Priority: Minor


My Cassandra cluster has more than 1T of data on each server. Every time I need 
to restart the cluster, it takes a long time to read the index from the index 
files.





[jira] [Updated] (CASSANDRA-4558) Configurable transport in CF RecordReader / RecordWriter

2012-08-20 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Piotr Kołaczkowski updated CASSANDRA-4558:
--

Attachment: configurable_transport.patch

Patch that allows setting the transport factory class in the Hadoop job 
configuration. It adds new properties cassandra.input.transport.factory.class 
and cassandra.output.transport.factory.class. TFramedTransportFactory is used 
by default if the properties are not set, so the old behaviour is preserved.
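
The default-if-unset behaviour can be sketched like this. The two property names and the default factory name come from the comment above; everything else (a plain dict standing in for the Hadoop job configuration, the helper name, the example class name) is hypothetical, and the sketch is in Python rather than the Java the patch itself uses.

```python
# Property names are taken from the patch description; the helper and the
# plain-dict "job configuration" are hypothetical stand-ins.
INPUT_FACTORY_PROP = "cassandra.input.transport.factory.class"
OUTPUT_FACTORY_PROP = "cassandra.output.transport.factory.class"
DEFAULT_FACTORY = "TFramedTransportFactory"


def transport_factory_name(job_conf, prop):
    """Return the configured factory class name, or the default if unset."""
    return job_conf.get(prop) or DEFAULT_FACTORY


# Only the input side is overridden; the output side keeps the old behaviour.
conf = {INPUT_FACTORY_PROP: "com.example.MySslTransportFactory"}
input_factory = transport_factory_name(conf, INPUT_FACTORY_PROP)
output_factory = transport_factory_name(conf, OUTPUT_FACTORY_PROP)
```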

 Configurable transport in CF RecordReader / RecordWriter
 

 Key: CASSANDRA-4558
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4558
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Piotr Kołaczkowski
 Attachments: configurable_transport.patch


 Currently RecordReaders and RecordWriters use hardcoded TFramedTransport. In 
 order to use other transports, e.g. SSL transport, allow for setting custom 
 transport class.





[jira] [Comment Edited] (CASSANDRA-4558) Configurable transport in CF RecordReader / RecordWriter

2012-08-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437753#comment-13437753
 ] 

Piotr Kołaczkowski edited comment on CASSANDRA-4558 at 8/20/12 8:11 PM:


I attach a patch that allows setting the transport factory class in the Hadoop 
job configuration. It adds new properties cassandra.input.transport.factory.class 
and cassandra.output.transport.factory.class. TFramedTransportFactory is used 
by default if the properties are not set, so the old behaviour is preserved. The 
modified code has been tested using the PIG demo in DSE.

  was (Author: pkolaczk):
Patch enabling to set transport factory class in hadoop job configuration. 
Added new properties cassandra.input.transport.factory.class and 
cassandra.output.transport.factory.class. TFramedTransportFactory is used by 
default if properties are not set, so old behaviour is preserved.
  
 Configurable transport in CF RecordReader / RecordWriter
 

 Key: CASSANDRA-4558
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4558
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Piotr Kołaczkowski
 Attachments: configurable_transport.patch


 Currently RecordReaders and RecordWriters use hardcoded TFramedTransport. In 
 order to use other transports, e.g. SSL transport, allow for setting custom 
 transport class.





[jira] [Comment Edited] (CASSANDRA-4558) Configurable transport in CF RecordReader / RecordWriter

2012-08-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437753#comment-13437753
 ] 

Piotr Kołaczkowski edited comment on CASSANDRA-4558 at 8/20/12 8:12 PM:


I attach a patch that allows setting the transport factory class in the Hadoop 
job configuration. It adds new properties cassandra.input.transport.factory.class 
and cassandra.output.transport.factory.class. TFramedTransportFactory is used 
by default if the properties are not set, so the old behaviour is preserved. The 
modified code has been tested using the PIG demo in DSE.

Patch generated against 1.1 branch (1.1.4), intended for 1.1.

  was (Author: pkolaczk):
I attach a patch enabling to set transport factory class in hadoop job 
configuration. Added new properties cassandra.input.transport.factory.class 
and cassandra.output.transport.factory.class. TFramedTransportFactory is used 
by default if properties are not set, so old behaviour is preserved. The 
modified code has been tested using PIG demo in DSE.
  
 Configurable transport in CF RecordReader / RecordWriter
 

 Key: CASSANDRA-4558
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4558
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Piotr Kołaczkowski
 Attachments: configurable_transport.patch


 Currently RecordReaders and RecordWriters use hardcoded TFramedTransport. In 
 order to use other transports, e.g. SSL transport, allow for setting custom 
 transport class.





[jira] [Commented] (CASSANDRA-4557) Cassandra's startup is too slow when the data is more than 1T.

2012-08-20 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437799#comment-13437799
 ] 

Radim Kolar commented on CASSANDRA-4557:


How much time does it need?

 Cassandra's startup is too slow when the data is more than 1T.
 --

 Key: CASSANDRA-4557
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4557
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.0.10
 Environment: Cassandra 1.0.10
Reporter: Cheng Zhang
Priority: Minor

 My Cassandra cluster has more than 1T of data on each server. Every time I 
 need to restart the cluster, it takes a long time to read the index from the 
 index files.





[jira] [Comment Edited] (CASSANDRA-1123) Allow tracing query details

2012-08-20 Thread David Alves (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437816#comment-13437816
 ] 

David Alves edited comment on CASSANDRA-1123 at 8/20/12 11:28 PM:
--

Attached is a patch that implements almost everything mentioned:

- Tracing is implemented asynchronously with a new Stage that has a threadpool 
with a single thread that refuses to execute when the queue is full (as 
suggested, a warning is logged; experiments show that under huge load a 
traceProbability of 0.001 still works without rejecting events). This 
threadpool is also the only one that does not propagate trace events.

- Allows enabling and disabling tracing from the cli. Enabling tracing takes 
two parameters: traceProbability (the probability that any single request from 
that client gets traced) and maxTraceNumber, which sets a maximum number of 
traces to do (-1 sets this to Integer.MAX_VALUE, which is also the default).

- Adds the possibility to enable tracing in stress (using -tr probability 
[optionally maxNumTraces]).

- TraceEvents can be built using a fluent builder (TraceEventBuilder) that is 
also able to deserialize events (both from thrift and from IColumns).

- All requests in CassandraServer start a tracing session when tracing is 
enabled, and all parameters are stored along with the request details. This is 
done using a ThriftType that is able to serialize and deserialize thrift 
objects into Cassandra.

- User Request/Reply, Stage Start/Finish and Message Request/Reply are traced 
along with specific custom requests (such as apply_mutation and 
get_column_family).

- The cli contains two other commands to explore traces:

show tracing summary [request_name] - this displays a summary for a request 
type:

{code}
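Summary for sessions of request: batch_mutate
Total Sessions: 500
Total Events: 3190
============================================================================
|               |   Avg. | StdDev. |   Max. |   Min. |    99% |       Unit |
============================================================================
| Latency       |  22.48 |   28.68 | 293.90 |   0.56 | 150.38 |       msec |
|--------------------------------------------------------------------------|
| Batch Rows    |   1.00 |    0.00 |   1.00 |   1.00 |   1.00 | amount/req |
|--------------------------------------------------------------------------|
| Mutations     |   5.00 |    0.00 |   5.00 |   5.00 |   5.00 | amount/req |
|--------------------------------------------------------------------------|
| Deletions     |   0.00 |    0.00 |   0.00 |   0.00 |   0.00 | amount/req |
|--------------------------------------------------------------------------|
| Columns       |   5.00 |    0.00 |   5.00 |   5.00 |   5.00 | amount/req |
|--------------------------------------------------------------------------|
| Counters      |   0.00 |    0.00 |   0.00 |   0.00 |   0.00 | amount/req |
|--------------------------------------------------------------------------|
| Super Col.    |   0.00 |    0.00 |   0.00 |   0.00 |   0.00 | amount/req |
|--------------------------------------------------------------------------|
| Written Bytes | 170.00 |    0.00 | 170.00 | 170.00 | 170.00 | amount/req |
|--------------------------------------------------------------------------|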

Quickest Request sessionId: 135b0c60-eac0-11e1--fe8ebeead9ff
Slowest  Request sessionId: f09c3e10-eabf-11e1--fe8ebeead9ff
{code}

explain trace session [sessionId] - displays the complete set of events of a 
tracing session along with a to-scale timeline for its execution.

{code}
Session Summary: f09c3e10-eabf-11e1--fe8ebeead9ff
Request: batch_mutate
Coordinator: /127.0.0.1
Interacting nodes: 2 {[/127.0.0.1, /127.0.0.2]}
Duration: 293901000 nano seconds (293.901 msecs)
Consistency Level: ONE
Request Timeline:

||--|
Se;/Msg 
/Se


  |-||   

  Msg,apply_mutation;St:[Mutation];/St:[Mutation]   


Caption: 
Se  - Session start (start of request: batch_mutate)
/Se - Session end (end of 
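
The interplay of traceProbability and maxTraceNumber described in the comment above can be sketched as follows. This is a hypothetical Python illustration, not code from the attached patch: the class name, method name and the use of an unbounded counter for -1 are assumptions.

```python
import random


class TraceSampler:
    """Decide per request whether to trace (hypothetical sketch)."""

    def __init__(self, trace_probability, max_trace_number=-1, rng=None):
        self.trace_probability = trace_probability
        # -1 means "no practical limit", mirroring the default described above.
        self.remaining = float("inf") if max_trace_number < 0 else max_trace_number
        self.rng = rng or random.Random()

    def should_trace(self):
        # Stop tracing once the maximum number of traces has been used up.
        if self.remaining <= 0:
            return False
        # Otherwise trace each request with the configured probability.
        if self.rng.random() < self.trace_probability:
            self.remaining -= 1
            return True
        return False


# With probability 1.0 and a cap of 2, only the first two requests are traced.
sampler = TraceSampler(trace_probability=1.0, max_trace_number=2)
decisions = [sampler.should_trace() for _ in range(4)]
```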

[jira] [Commented] (CASSANDRA-4557) Cassandra's startup is too slow when the data is more than 1T.

2012-08-20 Thread Cheng Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437837#comment-13437837
 ] 

Cheng Zhang commented on CASSANDRA-4557:


I ran a test: with 700G of data, startup takes nearly half an hour.

 Cassandra's startup is too slow when the data is more than 1T.
 --

 Key: CASSANDRA-4557
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4557
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.0.10
 Environment: Cassandra 1.0.10
Reporter: Cheng Zhang
Priority: Minor

 My Cassandra cluster has more than 1T of data on each server. Every time I 
 need to restart the cluster, it takes a long time to read the index from the 
 index files.





[jira] [Comment Edited] (CASSANDRA-4558) Configurable transport in CF RecordReader / RecordWriter

2012-08-20 Thread JIRA

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437753#comment-13437753
 ] 

Piotr Kołaczkowski edited comment on CASSANDRA-4558 at 8/21/12 12:20 AM:
-

I attach a patch that allows setting the transport factory class in the Hadoop 
job configuration. It adds new properties cassandra.input.transport.factory.class 
and cassandra.output.transport.factory.class. TFramedTransportFactory is used 
by default if the properties are not set, so the old behaviour is preserved. The 
modified code has been tested using the PIG demo in DSE.

Patch generated against 1.1 branch (1.1.4).

  was (Author: pkolaczk):
I attach a patch enabling to set transport factory class in hadoop job 
configuration. Added new properties cassandra.input.transport.factory.class 
and cassandra.output.transport.factory.class. TFramedTransportFactory is used 
by default if properties are not set, so old behaviour is preserved. The 
modified code has been tested using PIG demo in DSE.

Patch generated against 1.1 branch (1.1.4), intended for 1.1.
  
 Configurable transport in CF RecordReader / RecordWriter
 

 Key: CASSANDRA-4558
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4558
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Piotr Kołaczkowski
 Fix For: 1.1.5

 Attachments: configurable_transport.patch


 Currently RecordReaders and RecordWriters use hardcoded TFramedTransport. In 
 order to use other transports, e.g. SSL transport, allow for setting custom 
 transport class.





[jira] [Resolved] (CASSANDRA-4557) Cassandra's startup is too slow when the data is more than 1T.

2012-08-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-4557.
---

Resolution: Fixed

fixed by CASSANDRA-3762

 Cassandra's startup is too slow when the data is more than 1T.
 --

 Key: CASSANDRA-4557
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4557
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.0.10
 Environment: Cassandra 1.0.10
Reporter: Cheng Zhang
Priority: Minor

 My Cassandra cluster has more than 1T of data on each server. Every time I 
 need to restart the cluster, it takes a long time to read the index from the 
 index files.





[jira] [Resolved] (CASSANDRA-4556) upgrade from SizeTiered to Leveled failed because no enough free space

2012-08-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis resolved CASSANDRA-4556.
---

Resolution: Later

As you point out, there are a number of reasonable workarounds.  Doesn't look 
like it's worth writing a ton of special case code for switching from strategy 
X to strategy Y, but if you want to submit a patch I'll be happy to review it.

 upgrade from SizeTiered to Leveled failed because no enough free space
 --

 Key: CASSANDRA-4556
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4556
 Project: Cassandra
  Issue Type: Improvement
  Components: API
Affects Versions: 1.0.10
 Environment: Cassandra 1.0.10, Ubuntu 64bit server.
Reporter: Cheng Zhang

 I use Cassandra 1.0.10 with two data directories, initially with 
 SizeTieredCompactionStrategy. After some time, the total free space became 
 smaller than the biggest data file. At that point I wanted to change the 
 compaction strategy to Leveled to save space, but the change failed because 
 there was not enough space to compact the biggest data file.
 I then changed some code so that, if the biggest data file cannot be 
 compacted, the second biggest data file is chosen instead, and so on in the 
 same manner. The biggest data file is compacted once there is enough space; 
 as compaction proceeds, enough space becomes available.





[cassandra-dbapi2] push by pcan...@gmail.com - tag 1.1.0-beta2 on 2012-08-19 23:59 GMT

2012-08-20 Thread cassandra-dbapi2 . apache-extras . org

Revision: 1d630ecc5023
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 16:58:57 2012
Log:  tag 1.1.0-beta2

http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=1d630ecc5023

Modified:
 /setup.py

===
--- /setup.py   Sun Aug 19 14:13:25 2012
+++ /setup.py   Sun Aug 19 16:58:57 2012
@@ -20,7 +20,7 @@

 setup(
     name="cql",
-    version="1.1.0-beta1",
+    version="1.1.0-beta2",
     description="Cassandra Query Language driver",
     long_description=open(abspath(join(dirname(__file__), 'README'))).read(),
     maintainer='Cassandra DBAPI-2 Driver Team',


[jira] [Updated] (CASSANDRA-4558) Configurable transport in CF RecordReader / RecordWriter

2012-08-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4558:
--

Reviewer: brandon.williams
Assignee: Piotr Kołaczkowski

 Configurable transport in CF RecordReader / RecordWriter
 

 Key: CASSANDRA-4558
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4558
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Piotr Kołaczkowski
Assignee: Piotr Kołaczkowski
 Fix For: 1.1.5

 Attachments: configurable_transport.patch


 Currently RecordReaders and RecordWriters use hardcoded TFramedTransport. In 
 order to use other transports, e.g. SSL transport, allow for setting custom 
 transport class.





[cassandra-dbapi2] push by pcan...@gmail.com - update test_cql:test_create_column_family test on 2012-08-19 23:58 GMT

2012-08-20 Thread cassandra-dbapi2 . apache-extras . org

Revision: c084cb76ad87
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 16:58:25 2012
Log:  update test_cql:test_create_column_family test

http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=c084cb76ad87

Modified:
 /test/test_cql.py

===
--- /test/test_cql.py   Mon Jul  9 21:42:25 2012
+++ /test/test_cql.py   Sun Aug 19 16:58:25 2012
@@ -40,6 +40,7 @@

 import cql
 from cql.cassandra import Cassandra
+from cql.cqltypes import AsciiType, UUIDType

 def get_thrift_client(host=TEST_HOST, port=TEST_PORT):
 socket = TSocket.TSocket(host, port)
@@ -685,8 +686,10 @@
 SELECT KEY, '%s' FROM StandardTimeUUID WHERE KEY = 'uuidtest'
  % str(timeuuid))
 self.assertEqual(len(cursor.name_info), 2)
-self.assertEqual(cursor.name_info[0], ('KEY', 'AsciiType'))
-self.assertEqual(cursor.name_info[1], (timeuuid.bytes, 'UUIDType'))
+self.assertEqual(cursor.name_info[0][0], 'KEY')
+self.assertIsSubclass(cursor.name_info[0][1], AsciiType)
+self.assertEqual(cursor.name_info[1][0], timeuuid.bytes)
+self.assertIsSubclass(cursor.name_info[1][1], UUIDType)

 def test_time_uuid(self):
 store and retrieve time-based (type 1) uuids
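
The updated assertions above check the type object stored in cursor.name_info with a subclass test instead of comparing against a type-name string. The assertIsSubclass helper itself is not shown in this diff, so the following is only a sketch of what such a check might look like; AsciiType and ParameterizedAscii here are hypothetical stand-in classes, not the real cql.cqltypes ones.

```python
class AsciiType(object):
    """Hypothetical stand-in for cql.cqltypes.AsciiType."""


class ParameterizedAscii(AsciiType):
    """Hypothetical subclass, e.g. a type object built at runtime."""


def check_is_subclass(cls, parent):
    """Raise AssertionError unless cls is a class derived from parent."""
    if not (isinstance(cls, type) and issubclass(cls, parent)):
        raise AssertionError("%r is not a subclass of %r" % (cls, parent))


# name_info entries pair the raw column name with a type class, so the
# check still passes when the driver hands back a derived type object.
name_info = [("KEY", ParameterizedAscii)]
check_is_subclass(name_info[0][1], AsciiType)
```

A plain string such as 'AsciiType' would fail this check, which is the behavioural difference the test change reflects.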


[cassandra-dbapi2] 2 new revisions pushed by pcan...@gmail.com on 2012-08-19 23:52 GMT

2012-08-20 Thread cassandra-dbapi2 . apache-extras . org

2 new revisions:

Revision: 602d5138da97
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 15:13:16 2012
Log:  fix generation of unknown casstype objects...
http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=602d5138da97

Revision: 1ac7beb124c7
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 16:50:44 2012
Log:  better support for more complex casstype strings...
http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=1ac7beb124c7

==
Revision: 602d5138da97
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 15:13:16 2012
Log:  fix generation of unknown casstype objects

since class names have to be bytes, not unicode

http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=602d5138da97

Modified:
 /cql/cqltypes.py

===
--- /cql/cqltypes.py    Sun Aug 19 14:09:48 2012
+++ /cql/cqltypes.py    Sun Aug 19 15:13:16 2012
@@ -136,7 +136,7 @@
         if cls.num_subtypes != 'UNKNOWN' and len(subtypes) != cls.num_subtypes:
             raise ValueError("%s types require %d subtypes (%d given)"
                              % (cls.typename, cls.num_subtypes, len(subtypes)))
-        newname = cls.cass_parameterized_type_with(subtypes)
+        newname = cls.cass_parameterized_type_with(subtypes).encode('utf8')
         return type(newname, (cls,), {'subtypes': subtypes, 'cassname': cls.cassname})

     @classmethod
@@ -157,7 +157,7 @@
     num_subtypes = 'UNKNOWN'

 def mkUnrecognizedType(casstypename):
-    return CassandraTypeType(casstypename,
+    return CassandraTypeType(casstypename.encode('utf8'),
                              (_UnrecognizedType,),
                              {'typename': '%s' % casstypename})

==
Revision: 1ac7beb124c7
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 16:50:44 2012
Log:  better support for more complex casstype strings

such as CompositeType and ColumnToCollectionType. it looks like yes,
this driver does need to be able to parse arbitrarily complex type
strings, so add parsing.

http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=1ac7beb124c7

Modified:
 /cql/cqltypes.py

===
--- /cql/cqltypes.py    Sun Aug 19 15:13:16 2012
+++ /cql/cqltypes.py    Sun Aug 19 16:50:44 2012
@@ -22,6 +22,7 @@
 from decimal import Decimal
 import time
 import calendar
+import re

 try:
     from cStringIO import StringIO
@@ -50,23 +51,50 @@
         _cqltypes[cls.typename] = cls
         return cls

-def lookup_casstype(casstype):
-    args = ()
+casstype_scanner = re.Scanner((
+    (r'[()]', lambda s,t: t),
+    (r'[a-zA-Z0-9_.:]+', lambda s,t: t),
+    (r'[\s,]', None),
+))
+
+def lookup_casstype_simple(casstype):
     shortname = trim_if_startswith(casstype, apache_cassandra_type_prefix)
-    if '(' in shortname:
-        # do we need to support arbitrary nesting? if so, this is where
-        # we need to tokenize and parse
-        assert shortname.endswith(')'), shortname
-        shortname, args = shortname[:-1].split('(', 1)
-        args = [lookup_casstype(s.strip()) for s in args.split(',')]
     try:
         typeclass = _casstypes[shortname]
     except KeyError:
         typeclass = mkUnrecognizedType(casstype)
-    if args:
-        typeclass = typeclass.apply_parameters(*args)
     return typeclass

+def parse_casstype_args(typestring):
+    tokens, remainder = casstype_scanner.scan(typestring)
+    if remainder:
+        raise ValueError("weird characters %r at end" % remainder)
+    args = [[]]
+    for tok in tokens:
+        if tok == '(':
+            args.append([])
+        elif tok == ')':
+            arglist = args.pop()
+            ctype = args[-1].pop()
+            paramized = ctype.apply_parameters(*arglist)
+            args[-1].append(paramized)
+        else:
+            if ':' in tok:
+                # ignore those column name hex encoding bit; we have the
+                # proper column name from elsewhere
+                tok = tok.rsplit(':', 1)[-1]
+            ctype = lookup_casstype_simple(tok)
+            args[-1].append(ctype)
+    assert len(args) == 1, args
+    assert len(args[0]) == 1, args[0]
+    return args[0][0]
+
+def lookup_casstype(casstype):
+    try:
+        return parse_casstype_args(casstype)
+    except (ValueError, AssertionError, IndexError), e:
+        raise ValueError("Don't know how to parse type string %r: %s" % (casstype, e))
+
 def lookup_cqltype(cqltype):
     args = ()
     if cqltype.startswith("'") and cqltype.endswith("'"):
@@ -439,10 +467,18 @@
         buf.write(valbytes)
         return buf.getvalue()

+class CompositeType(_ParameterizedType):
+    typename = 'org.apache.cassandra.db.marshal.CompositeType'
+    num_subtypes = 
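
The stack-based parsing introduced in the cqltypes.py diff above can be tried out in isolation. The sketch below reuses the same three re.Scanner patterns but, as a simplifying assumption, builds nested (name, parameters) tuples instead of calling the driver's apply_parameters; the function name parse_type_string is hypothetical, and it is written to run on current Python rather than the Python 2 the driver targeted.

```python
import re

# Same token patterns as in the diff: parentheses, type names (which may
# carry a "hex:" prefix), and separators that are skipped.
_scanner = re.Scanner([
    (r"[()]", lambda s, t: t),
    (r"[a-zA-Z0-9_.:]+", lambda s, t: t),
    (r"[\s,]", None),
])


def parse_type_string(typestring):
    """Parse 'A(B, C(D))' into nested (name, params) tuples."""
    tokens, remainder = _scanner.scan(typestring)
    if remainder:
        raise ValueError("unparseable characters %r at end" % remainder)
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])          # start collecting parameters
        elif tok == ")":
            if len(stack) < 2 or not stack[-2]:
                raise ValueError("unexpected ')' in %r" % typestring)
            params = stack.pop()
            name, _ = stack[-1].pop()  # the type the parameters belong to
            stack[-1].append((name, tuple(params)))
        else:
            # Drop any "hex:" column-name prefix, keeping only the type name.
            stack[-1].append((tok.rsplit(":", 1)[-1], ()))
    if len(stack) != 1 or len(stack[0]) != 1:
        raise ValueError("unbalanced or empty type string %r" % typestring)
    return stack[0][0]


parsed = parse_type_string("CompositeType(UTF8Type, ListType(Int32Type))")
```

Using a scanner plus an explicit stack handles arbitrary nesting, which the earlier single split on '(' could not.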

[cassandra-dbapi2] push by pcan...@gmail.com - tag 1.1.0-beta1 on 2012-08-19 21:32 GMT

2012-08-20 Thread cassandra-dbapi2 . apache-extras . org

Revision: 7254df9ec6df
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 14:13:25 2012
Log:  tag 1.1.0-beta1

http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=7254df9ec6df

Modified:
 /setup.py

===
--- /setup.py   Mon Mar 26 11:24:24 2012
+++ /setup.py   Sun Aug 19 14:13:25 2012
@@ -20,7 +20,7 @@

 setup(
     name="cql",
-    version="1.0.10",
+    version="1.1.0-beta1",
     description="Cassandra Query Language driver",
     long_description=open(abspath(join(dirname(__file__), 'README'))).read(),
     maintainer='Cassandra DBAPI-2 Driver Team',


[cassandra-dbapi2] push by pcan...@gmail.com - share type objects in cursor.name_info attr on 2012-08-19 21:09 GMT

2012-08-20 Thread cassandra-dbapi2 . apache-extras . org

Revision: e434b96cc6cf
Author:   paul cannon p...@datastax.com
Date: Sun Aug 19 14:09:48 2012
Log:  share type objects in cursor.name_info attr

http://code.google.com/a/apache-extras.org/p/cassandra-dbapi2/source/detail?r=e434b96cc6cf

Modified:
 /cql/cqltypes.py
 /cql/cursor.py
 /cql/decoders.py

===
--- /cql/cqltypes.py    Sun Aug 19 13:16:40 2012
+++ /cql/cqltypes.py    Sun Aug 19 14:09:48 2012
@@ -149,6 +149,10 @@
     def cass_parameterized_type(cls, full=False):
         return cls.cass_parameterized_type_with(cls.subtypes, full=full)

+# it's initially named with a _ to avoid registering it as a real type, but
+# client programs may want to use the name still for isinstance(), etc
+CassandraType = _CassandraType
+
 class _UnrecognizedType(_CassandraType):
     num_subtypes = 'UNKNOWN'

===
--- /cql/cursor.py  Sun Aug 19 13:16:40 2012
+++ /cql/cursor.py  Sun Aug 19 14:09:48 2012
@@ -42,6 +42,7 @@
     _ddl_re = re.compile("\s*(CREATE|ALTER|DROP)\s+",
                          re.IGNORECASE | re.MULTILINE)
     supports_prepared_queries = False
+    supports_column_types = True
     supports_name_info = True

     def __init__(self, parent_connection):
@@ -110,6 +111,7 @@
         self.rowcount = 0
         self.description = None
         self.name_info = None
+        self.column_types = None

     def execute(self, cql_query, params={}, decoder=None):
         if isinstance(cql_query, unicode):
@@ -153,10 +155,7 @@
             self.rs_idx = 0
             self.rowcount = len(self.result)
             if self.result:
-                self.description, self.name_info, self.column_types = \
-                        self.decoder.decode_metadata_and_types(self.result[0])
-            else:
-                self.description = None
+                self.get_metadata_info(self.result[0])
         elif response.type == CqlResultType.INT:
             self.result = [(response.num,)]
             self.rs_idx = 0
@@ -176,6 +175,10 @@
         # 'Return values are not defined.'
         return True

+    def get_metadata_info(self, row):
+        self.description, self.name_info, self.column_types = \
+                self.decoder.decode_metadata_and_types(row)
+
     def executemany(self, operation_list, argslist):
         self.__checksock()
         opssize = len(operation_list)
@@ -201,8 +204,7 @@
         else:
             if self.cql_major_version < 3:
                 # (don't bother redecoding descriptions or names otherwise)
-                self.description, self.name_info, self.column_types = \
-                        self.decoder.decode_metadata_and_types(row)
+                self.get_metadata_info(row)
         return self.decoder.decode_row(row, self.column_types)

     def fetchmany(self, size=None):
===
--- /cql/decoders.py    Sun Aug 19 13:16:40 2012
+++ /cql/decoders.py    Sun Aug 19 14:09:48 2012
@@ -56,7 +56,7 @@
                 name = self.name_decode_error(e, namebytes, comparator)
             column_types.append(valdtype)
             description.append((name, validator, None, None, None, None, True))
-            name_info.append((namebytes, comparator))
+            name_info.append((namebytes, comptype))

         return description, name_info, column_types



[jira] [Commented] (CASSANDRA-1123) Allow tracing query details

2012-08-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-1123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437903#comment-13437903
 ] 

Jonathan Ellis commented on CASSANDRA-1123:
---

bq. Finally I'm having issues using the index from thrift. Is there an 
incompatibility (i.e., using an IndexExpression with a CQL3 table?).

Yes, see 
https://issues.apache.org/jira/browse/CASSANDRA-4377?focusedCommentId=13436817&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13436817

 Allow tracing query details
 ---

 Key: CASSANDRA-1123
 URL: https://issues.apache.org/jira/browse/CASSANDRA-1123
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: David Alves
 Fix For: 1.2.0

 Attachments: 1123-3.patch.gz, 1123.patch, 1123.patch


 In the spirit of CASSANDRA-511, it would be useful to enable tracing on 
 queries to see where latency is coming from: how long did row cache lookup 
 take?  key search in the index?  merging the data from the sstables?  etc.
 The main difference vs setting debug logging is that debug logging is too big 
 of a hammer; by turning on the flood of logging for everyone, you actually 
 distort the information you're looking for.  This would be something you 
 could set per-query (or more likely per connection).
 We don't need to be as sophisticated as the techniques discussed in the 
 following papers but they are interesting reading:
 http://research.google.com/pubs/pub36356.html
 http://www.usenix.org/events/osdi04/tech/full_papers/barham/barham_html/
 http://www.usenix.org/event/nsdi07/tech/fonseca.html





Git Push Summary

2012-08-20 Thread eevans
Updated Tags:  refs/tags/1.1.4-tentative [deleted] 94e46ff95


Git Push Summary

2012-08-20 Thread eevans
Updated Tags:  refs/tags/cassandra-1.1.4 [created] 5cb344c90


svn commit: r1375101 - in /cassandra/site: publish/download/index.html publish/index.html src/settings.py

2012-08-20 Thread eevans
Author: eevans
Date: Mon Aug 20 16:50:58 2012
New Revision: 1375101

URL: http://svn.apache.org/viewvc?rev=1375101&view=rev
Log:
updated versioning for 1.1.4 release

Modified:
cassandra/site/publish/download/index.html
cassandra/site/publish/index.html
cassandra/site/src/settings.py

Modified: cassandra/site/publish/download/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/download/index.html?rev=1375101&r1=1375100&r2=1375101&view=diff
==
--- cassandra/site/publish/download/index.html (original)
+++ cassandra/site/publish/download/index.html Mon Aug 20 16:50:58 2012
@@ -49,8 +49,8 @@
   Cassandra releases include the core server, the <a 
href="http://wiki.apache.org/cassandra/NodeTool">nodetool</a> administration 
command-line interface, and a development shell (<a 
href="http://cassandra.apache.org/doc/cql/CQL.html"><tt>cqlsh</tt></a> and the 
old <tt>cassandra-cli</tt>).
 
   <p>
-  The latest stable release of Apache Cassandra is 1.1.3
-  (released on 2012-08-05).  <i>If you're just
+  The latest stable release of Apache Cassandra is 1.1.4
+  (released on 2012-08-20).  <i>If you're just
   starting out, download this one.</i>
   </p>
 
@@ -59,13 +59,13 @@
   <ul>
 <li>
 <a class="filename" 
-   
href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.3/apache-cassandra-1.1.3-bin.tar.gz"
+   
href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.4/apache-cassandra-1.1.4-bin.tar.gz"
onclick="javascript: 
pageTracker._trackPageview('/clicks/binary_download');">
-  apache-cassandra-1.1.3-bin.tar.gz
+  apache-cassandra-1.1.4-bin.tar.gz
 </a>
-[<a 
href="http://www.apache.org/dist/cassandra/1.1.3/apache-cassandra-1.1.3-bin.tar.gz.asc">PGP</a>]
-[<a 
href="http://www.apache.org/dist/cassandra/1.1.3/apache-cassandra-1.1.3-bin.tar.gz.md5">MD5</a>]
-[<a 
href="http://www.apache.org/dist/cassandra/1.1.3/apache-cassandra-1.1.3-bin.tar.gz.sha1">SHA1</a>]
+[<a 
href="http://www.apache.org/dist/cassandra/1.1.4/apache-cassandra-1.1.4-bin.tar.gz.asc">PGP</a>]
+[<a 
href="http://www.apache.org/dist/cassandra/1.1.4/apache-cassandra-1.1.4-bin.tar.gz.md5">MD5</a>]
+[<a 
href="http://www.apache.org/dist/cassandra/1.1.4/apache-cassandra-1.1.4-bin.tar.gz.sha1">SHA1</a>]
 </li>
 <li>
 <a href="http://wiki.apache.org/cassandra/DebianPackaging">Debian 
installation instructions</a>
@@ -153,13 +153,13 @@
   <ul>
 <li>
 <a class="filename" 
-   
href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.3/apache-cassandra-1.1.3-src.tar.gz"
+   
href="http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.1.4/apache-cassandra-1.1.4-src.tar.gz"
onclick="javascript: 
pageTracker._trackPageview('/clicks/source_download');">
-  apache-cassandra-1.1.3-src.tar.gz
+  apache-cassandra-1.1.4-src.tar.gz
 </a>
-[<a 
href="http://www.apache.org/dist/cassandra/1.1.3/apache-cassandra-1.1.3-src.tar.gz.asc">PGP</a>]
-[<a 
href="http://www.apache.org/dist/cassandra/1.1.3/apache-cassandra-1.1.3-src.tar.gz.md5">MD5</a>]
-[<a 
href="http://www.apache.org/dist/cassandra/1.1.3/apache-cassandra-1.1.3-src.tar.gz.sha1">SHA1</a>]
+[<a 
href="http://www.apache.org/dist/cassandra/1.1.4/apache-cassandra-1.1.4-src.tar.gz.asc">PGP</a>]
+[<a 
href="http://www.apache.org/dist/cassandra/1.1.4/apache-cassandra-1.1.4-src.tar.gz.md5">MD5</a>]
+[<a 
href="http://www.apache.org/dist/cassandra/1.1.4/apache-cassandra-1.1.4-src.tar.gz.sha1">SHA1</a>]
 </li>
   
 <li>

Modified: cassandra/site/publish/index.html
URL: 
http://svn.apache.org/viewvc/cassandra/site/publish/index.html?rev=1375101&r1=1375100&r2=1375101&view=diff
==
--- cassandra/site/publish/index.html (original)
+++ cassandra/site/publish/index.html Mon Aug 20 16:50:58 2012
@@ -75,8 +75,8 @@
   <h2>Download</h2>
   <div class="inner rc">
 <p>
-The latest release is <b>1.1.3</b>
-<span class="relnotes">(<a 
href="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-1.1.3">Changes</a>)</span>
+The latest release is <b>1.1.4</b>
+<span class="relnotes">(<a 
href="http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=blob_plain;f=CHANGES.txt;hb=refs/tags/cassandra-1.1.4">Changes</a>)</span>
 </p>
 
 <p><a class="filename" href="/download/">Download options</a></p>

Modified: cassandra/site/src/settings.py
URL: 
http://svn.apache.org/viewvc/cassandra/site/src/settings.py?rev=1375101&r1=1375100&r2=1375101&view=diff
==
--- cassandra/site/src/settings.py (original)
+++ cassandra/site/src/settings.py Mon Aug 20 16:50:58 2012
@@ -98,8 +98,8 @@ class CassandraDef(object):
 veryoldstable_version = '0.8.10'
 veryoldstable_release_date = '2012-02-13'
 veryoldstable_exists = True
-stable_version = '1.1.3'
-stable_release_date = '2012-08-05'
+stable_version = '1.1.4'
+stable_release_date = 

[jira] [Updated] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-08-20 Thread Pavel Yaskevich (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pavel Yaskevich updated CASSANDRA-3772:
---

Attachment: CASSANDRA-3772-v2.patch

I have removed the ThreadLocal declaration from M3P, which was the bottleneck 
(and cleaned up whitespace errors). After re-running the tests with that 
modification, M3P beats RP, 903 to 847.

 Evaluate Murmur3-based partitioner
 --

 Key: CASSANDRA-3772
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3772
 Project: Cassandra
  Issue Type: New Feature
  Components: Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
 Fix For: 1.3

 Attachments: 0001-CASSANDRA-3772.patch, 
 0001-CASSANDRA-3772-Test.patch, CASSANDRA-3772-v2.patch, 
 hashed_partitioner_3.diff, hashed_partitioner.diff, MumPartitionerTest.docx, 
 try_murmur3_2.diff, try_murmur3.diff


 MD5 is a relatively heavyweight hash to use when we don't need cryptographic 
 qualities, just a good output distribution.  Let's see how much overhead we 
 can save by using Murmur3 instead.
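
 For illustration only, a sketch of the two kinds of hash side by side. It is 
 not the partitioner code from the attached patches: HashSketch and its method 
 names are hypothetical, and the 32-bit (x86_32) MurmurHash3 variant is used 
 here only because it is short, which is an assumption of this sketch rather 
 than a statement about which variant the patches use.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper comparing an MD5-based token with a Murmur3 hash. */
final class HashSketch {
    /** MD5 digest of the key, read as a non-negative 128-bit integer. */
    static BigInteger md5Token(byte[] key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md.digest(key));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 must be present on every Java platform
        }
    }

    /** 32-bit MurmurHash3 (x86_32 variant), little-endian block reads. */
    static int murmur3_32(byte[] data, int seed) {
        final int c1 = 0xcc9e2d51;
        final int c2 = 0x1b873593;
        int h = seed;
        int nblocks = data.length / 4;
        // Body: mix one 4-byte block at a time.
        for (int i = 0; i < nblocks; i++) {
            int k = (data[4 * i] & 0xff)
                  | ((data[4 * i + 1] & 0xff) << 8)
                  | ((data[4 * i + 2] & 0xff) << 16)
                  | ((data[4 * i + 3] & 0xff) << 24);
            k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
            h ^= k; h = Integer.rotateLeft(h, 13); h = h * 5 + 0xe6546b64;
        }
        // Tail: the remaining 1 to 3 bytes, if any.
        int tail = nblocks * 4;
        int k = 0;
        switch (data.length & 3) {
            case 3: k ^= (data[tail + 2] & 0xff) << 16; // fall through
            case 2: k ^= (data[tail + 1] & 0xff) << 8;  // fall through
            case 1: k ^= (data[tail] & 0xff);
                    k *= c1; k = Integer.rotateLeft(k, 15); k *= c2;
                    h ^= k;
        }
        // Finalization: force all bits of the state to avalanche.
        h ^= data.length;
        h ^= h >>> 16; h *= 0x85ebca6b;
        h ^= h >>> 13; h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }
}
```

 The Murmur3 path is a handful of integer multiplies, rotates and xors per 
 block with no provider lookup and no allocation, which is where the hoped-for 
 saving would come from; the sketch itself makes no performance claim.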





[jira] [Updated] (CASSANDRA-4479) JMX attribute setters not consistent with cassandra.yaml

2012-08-20 Thread Chris Merrill (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Merrill updated CASSANDRA-4479:
-

Attachment: trunk-4479.txt

 JMX attribute setters not consistent with cassandra.yaml
 

 Key: CASSANDRA-4479
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4479
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.2
Reporter: Eric Dong
 Attachments: trunk-4479.txt


 If a setting is configurable both via cassandra.yaml and JMX, the two should 
 be consistent. If that doesn't hold, then the JMX setter can't be trusted. 
 Here I present the example of phi_convict_threshold.
 I'm trying to set phi_convict_threshold via JMX, which sets 
 FailureDetector.phiConvictThreshold_, but this doesn't update 
 Config.phi_convict_threshold, which gets its value from cassandra.yaml when 
 starting up.
 Some places, such as FailureDetector.interpret(InetAddress), use 
 FailureDetector.phiConvictThreshold_; others, such as AntiEntropyService 
 (line 813 in cassandra-1.1.2), use Config.phi_convict_threshold:
 {code}
 // We want a higher confidence in the failure detection than 
 // usual because failing a repair wrongly has a high cost.
 if (phi < 2 * DatabaseDescriptor.getPhiConvictThreshold())
     return;
 {code}
 where DatabaseDescriptor.getPhiConvictThreshold() returns 
 Config.phi_convict_threshold.
 So, it looks like there are cases where a value is stored in multiple places, 
 and setting the value via JMX doesn't set all of them. I'd say there should 
 only be a single place where a configuration parameter is stored, and that 
 single field:
 * should read in the value from cassandra.yaml, optionally falling back to a 
 sane default
 * should be the field that the JMX attribute reads and sets, and
 * any place that needs the current global setting should get it from that 
 field. However, there could be cases where you read in a global value at the 
 start of a task and keep that value locally until the end of the task.
 Also, anything settable via JMX should be volatile or set via a synchronized 
 setter, or else according to the Java memory model other threads may be stuck 
 with the old setting.
 So, I'm requesting the following:
 * Setting up guidelines for how to expose a configuration parameter both via 
 cassandra.yaml and JMX, based on what I've mentioned above
 * Going through the list of configuration parameters and fixing any that 
 don't match those guidelines
 I'd also recommend logging any changes to configuration parameters.
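
 A minimal sketch of the guideline requested above, assuming a single holder 
 class per setting. SettingsSketch and its methods are hypothetical and are not 
 Cassandra's Config, DatabaseDescriptor or FailureDetector; the default of 8 is 
 likewise only an assumption of this sketch. A JMX MBean implementation would 
 simply delegate its getter and setter to this class.

```java
import java.util.logging.Logger;

/** Hypothetical single home for one setting that is both in YAML and on JMX. */
final class SettingsSketch {
    private static final Logger LOG = Logger.getLogger(SettingsSketch.class.getName());

    private static final int DEFAULT_PHI_CONVICT_THRESHOLD = 8; // assumed default

    // volatile: a write made by the JMX thread becomes visible to all reader threads.
    private static volatile int phiConvictThreshold = DEFAULT_PHI_CONVICT_THRESHOLD;

    /** Called once at startup with the parsed YAML value, or null if the key is absent. */
    static void loadFromYaml(Integer yamlValue) {
        phiConvictThreshold = (yamlValue != null) ? yamlValue : DEFAULT_PHI_CONVICT_THRESHOLD;
    }

    /** The only getter: the JMX attribute and all internal callers read through it. */
    static int getPhiConvictThreshold() {
        return phiConvictThreshold;
    }

    /** The only setter: the JMX attribute writes through it, and the change is logged. */
    static void setPhiConvictThreshold(int newValue) {
        int oldValue = phiConvictThreshold;
        phiConvictThreshold = newValue;
        LOG.info("phi_convict_threshold changed from " + oldValue + " to " + newValue);
    }

    /** Example consumer, mirroring the repair check quoted above. */
    static boolean convictDuringRepair(double phi) {
        return phi >= 2 * getPhiConvictThreshold();
    }
}
```

 With one field behind one getter, a value set over JMX cannot diverge from 
 the value other code paths see, and a task that wants a stable value can read 
 it once into a local variable at its start.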





[jira] [Updated] (CASSANDRA-4479) JMX attribute setters not consistent with cassandra.yaml

2012-08-20 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-4479:
--

 Reviewer: brandon.williams
 Priority: Trivial  (was: Major)
Affects Version/s: (was: 1.1.2)
Fix Version/s: 1.2.0

 JMX attribute setters not consistent with cassandra.yaml
 

 Key: CASSANDRA-4479
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4479
 Project: Cassandra
  Issue Type: Bug
Reporter: Eric Dong
Priority: Trivial
 Fix For: 1.2.0

 Attachments: trunk-4479.txt






[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-08-20 Thread Radim Kolar (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438044#comment-13438044
 ] 

Radim Kolar commented on CASSANDRA-3772:


Is MD5 implemented using a native call in Java?





[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-08-20 Thread Pavel Yaskevich (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438081#comment-13438081
 ] 

Pavel Yaskevich commented on CASSANDRA-3772:


The Java Cryptography Architecture doesn't disclose that, but from the tests it 
looks like it is.





[jira] [Commented] (CASSANDRA-3730) If some streaming sessions fail on decommission, decommission hangs

2012-08-20 Thread Vitalii Tymchyshyn (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438100#comment-13438100
 ] 

Vitalii Tymchyshyn commented on CASSANDRA-3730:
---

As for me, it can behave exactly as if the decommissioning node had been restarted. 

 If some streaming sessions fail on decommission, decommission hangs
 ---

 Key: CASSANDRA-3730
 URL: https://issues.apache.org/jira/browse/CASSANDRA-3730
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 1.1.0
 Environment: FreeBSD
Reporter: Vitalii Tymchyshyn
  Labels: streaming

 Currently Cassandra does not handle StreamOutSession failures, e.g.:
 // Instead of just not calling the callback on failure, we could have
 // allow to register a specific callback for failures, but we leave
 // that to a future ticket (likely CASSANDRA-3112)
 if (callback != null && success)
     callback.run();
 This means that if, during decommission, a node that receives decommission data 
 fails or (my case) the node that tries to decommission becomes overloaded, 
 the streaming session fails and decommission doesn't know anything about it. 
 This makes it hard to decommission overloaded nodes because I need to restart 
 the node to restart decommission.
 I also see the following errors, because file-streaming tasks try to get a 
 streaming session that has already been closed by gossip:
 ERROR [Streaming to /10.112.0.216:1] 2012-01-11 15:57:28,882 
 AbstractCassandraDaemon.java (line 138) Fatal exception in thread 
 Thread[Streaming to /10.112.0.216:1,5,main]
 java.lang.NullPointerException
 at 
 org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:97)
 at 
 org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at 
 java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
 at 
 java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
 at java.lang.Thread.run(Thread.java:679)
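
 One possible shape for the missing failure path, as a sketch only. 
 StreamSessionSketch and its callbacks are hypothetical names and do not 
 reflect the real StreamOutSession API; the idea is just that the caller 
 (here, decommission) registers a second callback and is told when the session 
 fails instead of waiting forever.

```java
/** Hypothetical session that reports both outcomes to its owner. */
final class StreamSessionSketch {
    private final Runnable onSuccess; // may be null
    private final Runnable onFailure; // may be null
    private boolean closed;

    StreamSessionSketch(Runnable onSuccess, Runnable onFailure) {
        this.onSuccess = onSuccess;
        this.onFailure = onFailure;
    }

    /** Close the session once; later calls (e.g. from gossip) are ignored. */
    synchronized void close(boolean success) {
        if (closed) return;
        closed = true;
        Runnable callback = success ? onSuccess : onFailure;
        if (callback != null) callback.run();
    }

    synchronized boolean isClosed() {
        return closed;
    }
}
```

 In the failure callback the decommissioning node could retry the stream or 
 abort the operation with an error, either of which avoids restarting the node 
 just to restart decommission.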





[jira] [Commented] (CASSANDRA-4474) Respect five-minute flush moratorium after initial CL replay

2012-08-20 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438132#comment-13438132
 ] 

Jonathan Ellis commented on CASSANDRA-4474:
---

Right, that's what I meant.  Your solution sounds perfect.

 Respect five-minute flush moratorium after initial CL replay
 

 Key: CASSANDRA-4474
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4474
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter: Jonathan Ellis
Assignee: Yuki Morishita
Priority: Minor
  Labels: commitlog, compaction
 Fix For: 1.1.5


 As noted in CASSANDRA-1967, the post-replay flush can kick off compactions 
 before the five-minute grace period (introduced in CASSANDRA-3181 to avoid I/O 
 contention while the server is warming up) has elapsed.





[jira] [Commented] (CASSANDRA-3772) Evaluate Murmur3-based partitioner

2012-08-20 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13438143#comment-13438143
 ] 

Vijay commented on CASSANDRA-3772:
--

+1





[1/3] git commit: Merge branch 'cassandra-1.1' into trunk

2012-08-20 Thread brandonwilliams
Updated Branches:
  refs/heads/cassandra-1.1 f5619bbfd -> 7db46ef80
  refs/heads/trunk 9d2b26dac -> 18b49021b


Merge branch 'cassandra-1.1' into trunk

Conflicts:
src/java/org/apache/cassandra/hadoop/ColumnFamilyOutputFormat.java


Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/18b49021
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/18b49021
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/18b49021

Branch: refs/heads/trunk
Commit: 18b49021b4dcbe7d5e5122d4a954804c11b4a17f
Parents: 9d2b26d 7db46ef
Author: Brandon Williams brandonwilli...@apache.org
Authored: Mon Aug 20 15:32:35 2012 -0500
Committer: Brandon Williams brandonwilli...@apache.org
Committed: Mon Aug 20 15:32:35 2012 -0500

--
 .../cassandra/hadoop/ColumnFamilyInputFormat.java  |2 +-
 .../cassandra/hadoop/ColumnFamilyOutputFormat.java |   14 +++-
 .../cassandra/hadoop/ColumnFamilyRecordReader.java |6 +-
 .../org/apache/cassandra/hadoop/ConfigHelper.java  |   55 ---
 .../apache/cassandra/thrift/ITransportFactory.java |   36 ++
 .../cassandra/thrift/TFramedTransportFactory.java  |   37 ++
 6 files changed, 133 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/cassandra/blob/18b49021/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java
--

http://git-wip-us.apache.org/repos/asf/cassandra/blob/18b49021/src/java/org/apache/cassandra/hadoop/ColumnFamilyOutputFormat.java
--
diff --cc src/java/org/apache/cassandra/hadoop/ColumnFamilyOutputFormat.java
index f663df6,e01ada5..a1b2171
--- a/src/java/org/apache/cassandra/hadoop/ColumnFamilyOutputFormat.java
+++ b/src/java/org/apache/cassandra/hadoop/ColumnFamilyOutputFormat.java
@@@ -24,6 -27,10 +24,9 @@@ import java.util.HashMap
  import java.util.List;
  import java.util.Map;
  
 -import org.apache.thrift.transport.TTransport;
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
+ 
  import org.apache.cassandra.auth.IAuthenticator;
  import org.apache.cassandra.thrift.*;
  import org.apache.hadoop.conf.Configuration;

http://git-wip-us.apache.org/repos/asf/cassandra/blob/18b49021/src/java/org/apache/cassandra/hadoop/ColumnFamilyRecordReader.java
--



[jira] [Resolved] (CASSANDRA-4558) Configurable transport in CF RecordReader / RecordWriter

2012-08-20 Thread Brandon Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brandon Williams resolved CASSANDRA-4558.
-

Resolution: Fixed

Committed.

 Configurable transport in CF RecordReader / RecordWriter
 

 Key: CASSANDRA-4558
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4558
 Project: Cassandra
  Issue Type: New Feature
  Components: Hadoop
Reporter: Piotr Kołaczkowski
Assignee: Piotr Kołaczkowski
 Fix For: 1.1.5

 Attachments: configurable_transport.patch


 Currently RecordReaders and RecordWriters use a hardcoded TFramedTransport. To 
 make other transports usable, e.g. an SSL transport, allow setting a custom 
 transport class.
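
 A rough sketch of the "custom transport class" idea, assuming the class name 
 is read from the job configuration and instantiated by reflection. The 
 interface, the property key and the String standing in for a real transport 
 are all hypothetical; they are not the signatures from the attached patch.

```java
import java.util.Properties;

/** Hypothetical factory interface; a String stands in for a real transport. */
interface TransportFactorySketch {
    String open(String host, int port);
}

/** Default used when nothing is configured, standing in for the framed transport. */
final class FramedFactorySketch implements TransportFactorySketch {
    public String open(String host, int port) {
        return "framed://" + host + ":" + port;
    }
}

final class TransportConfigSketch {
    // Hypothetical configuration key.
    static final String FACTORY_CLASS_KEY = "example.transport.factory.class";

    /** Instantiate the configured factory, falling back to the framed default. */
    static TransportFactorySketch load(Properties conf) {
        String className = conf.getProperty(FACTORY_CLASS_KEY, FramedFactorySketch.class.getName());
        try {
            return Class.forName(className)
                        .asSubclass(TransportFactorySketch.class)
                        .getDeclaredConstructor()
                        .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create transport factory " + className, e);
        }
    }
}
```

 Readers and writers would then call load(conf).open(host, port) instead of 
 constructing the framed transport directly, so an SSL factory can be dropped 
 in without touching them.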





[jira] [Updated] (CASSANDRA-4533) Multithreaded cache saving can skip caches

2012-08-20 Thread Yuki Morishita (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuki Morishita updated CASSANDRA-4533:
--

Attachment: 4533-1.1.txt

Attaching a patch against the 1.1 branch.
Caches are global since 1.1, so I used CacheType as the key for the 
flushInProgress concurrent set.

 Multithreaded cache saving can skip caches
 --

 Key: CASSANDRA-4533
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4533
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Affects Versions: 0.8.0
Reporter: Zhu Han
Assignee: Yuki Morishita
Priority: Trivial
 Fix For: 1.1.5

 Attachments: 4533-1.1.txt


 Cassandra flushes the key and row caches to disk periodically. It also uses an 
 atomic flag, flushInProgress, to enforce a single cache writer at any time.
 However, cache saving tasks can be submitted to CompactionManager 
 concurrently, as long as the number of worker threads in CompactionManager is 
 larger than 1. 
 Because of that atomic flag, only one cache is written out to disk. The other 
 writers are cancelled while the flag is true.
 I observed this in Cassandra 1.0. If nothing has changed, the problem 
 should remain in Cassandra 1.1 as well.
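
 The difference between one global flag and one in-progress marker per cache 
 can be sketched as follows. CacheKind and CacheSaverSketch are hypothetical 
 stand-ins and this is not the code of the attached patch; it only shows that a 
 concurrent set keyed by cache type blocks a second save of the same cache 
 without blocking a save of a different one.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the cache type enum. */
enum CacheKind { KEY_CACHE, ROW_CACHE }

final class CacheSaverSketch {
    // One entry per cache that is currently being written.
    private static final Set<CacheKind> flushInProgress = ConcurrentHashMap.newKeySet();

    /**
     * Runs the writer unless a save of the same cache is already running.
     * Returns true if this call performed the save.
     */
    static boolean save(CacheKind kind, Runnable writer) {
        if (!flushInProgress.add(kind)) return false; // add() is atomic: only one caller wins per kind
        try {
            writer.run();
            return true;
        } finally {
            flushInProgress.remove(kind);
        }
    }
}
```

 With a single boolean flag, the second of two concurrent saves is skipped 
 even when it targets a different cache; with the set, only a duplicate save 
 of the same cache is skipped.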





[jira] [Created] (CASSANDRA-4559) implement token relocation

2012-08-20 Thread Eric Evans (JIRA)
Eric Evans created CASSANDRA-4559:
-

 Summary: implement token relocation
 Key: CASSANDRA-4559
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4559
 Project: Cassandra
  Issue Type: Sub-task
  Components: Core, Tools
Reporter: Eric Evans
Assignee: Eric Evans


Whatever the specifics of a _shuffle_ (see CASSANDRA-4443), it will be 
necessary to relocate a range from one node to another.



h3. Patches
||Compare||Raw diff||Description||
|[010_refactor_range_move|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move]|[010_refactor_range_move.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/010_refactor_range_move...p/4443/010_refactor_range_move.diff]|No Description|
|[020_calculate_pending|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending]|[020_calculate_pending.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/020_calculate_pending...p/4443/020_calculate_pending.diff]|No Description|
|[030_relocate_token|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token]|[030_relocate_token.patch|https://github.com/acunu/cassandra/compare/top-bases/p/4443/030_relocate_token...p/4443/030_relocate_token.diff]|No Description|



_Note: These are branches managed with TopGit. If you are applying the patch 
output manually, you will either need to filter the TopGit metadata files (i.e. 
{{wget -O - url | filterdiff -x*.topdeps -x*.topmsg | patch -p1}}), or remove 
them afterward ({{rm .topmsg .topdeps}})._





[jira] [Created] (CASSANDRA-4560) Add tracing support for CQL3 bind variables

2012-08-20 Thread Jonathan Ellis (JIRA)
Jonathan Ellis created CASSANDRA-4560:
-

 Summary: Add tracing support for CQL3 bind variables
 Key: CASSANDRA-4560
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4560
 Project: Cassandra
  Issue Type: Sub-task
Reporter: Jonathan Ellis
Priority: Minor
 Fix For: 1.2.1



