[jira] [Updated] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-22 Thread Robert Stupp (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Stupp updated CASSANDRA-7563:

Attachment: 7563v7.txt

Attached v7 of the patch with the fix for that. Also added a unit test for that 
using {{USE}}.

After that fix:
{code}
cqlsh> use foo;
cqlsh:foo> create type mytype (a int);
cqlsh:foo> create function bar (a mytype) RETURNS mytype LANGUAGE java AS $$return a;$$;
code=2200 [Invalid query] message=Non-frozen User-Defined types are not supported, please use frozen
cqlsh:foo> create function bar (a frozen<mytype>) RETURNS frozen<mytype> LANGUAGE java AS $$return a;$$;
{code}
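For illustration, the shape of the rejection can be sketched as below; this is a hypothetical stand-in, not the patch's actual validation code (the class name and the boolean flags are illustrative substitutes for Cassandra's real type introspection):

```java
// Illustrative sketch only: reject a non-frozen user-defined type in a UDF
// signature, mirroring the InvalidRequest error shown in the transcript.
class UdfTypeCheck {
    // isUdt / isFrozen stand in for real AbstractType introspection.
    static void validate(String typeName, boolean isUdt, boolean isFrozen) {
        if (isUdt && !isFrozen)
            throw new IllegalArgumentException(
                "Non-frozen User-Defined types are not supported, please use frozen (type: " + typeName + ")");
    }
}
```

Frozen UDTs and non-UDT types pass through unchanged; only a non-frozen UDT is rejected.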


 UserType, TupleType and collections in UDFs
 ---

 Key: CASSANDRA-7563
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7563
 Project: Cassandra
  Issue Type: Bug
Reporter: Robert Stupp
Assignee: Robert Stupp
 Fix For: 3.0

 Attachments: 7563-7740.txt, 7563.txt, 7563v2.txt, 7563v3.txt, 
 7563v4.txt, 7563v5.txt, 7563v6.txt, 7563v7.txt


 * is the Java Driver required as a dependency?
 * is it possible to extract parts of the Java Driver for UDT/TT/collection support?
 * CQL {{DROP TYPE}} must check UDFs
 * must check keyspace access permissions (if those exist)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-7563) UserType, TupleType and collections in UDFs

2014-11-22 Thread Robert Stupp (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221922#comment-14221922
 ] 

Robert Stupp edited comment on CASSANDRA-7563 at 11/22/14 11:44 AM:


Attached v7 of the patch with the fix for that. Also added a unit test for that 
using {{USE}}.

After that fix:
{code}
cqlsh> use foo;
cqlsh:foo> create type mytype (a int);
cqlsh:foo> create function bar (a mytype) RETURNS mytype LANGUAGE java AS $$return a;$$;
code=2200 [Invalid query] message=Non-frozen User-Defined types are not supported, please use frozen
cqlsh:foo> create function bar (a frozen<mytype>) RETURNS frozen<mytype> LANGUAGE java AS $$return a;$$;
cqlsh:foo>
{code}









[jira] [Comment Edited] (CASSANDRA-8260) Replacing a node can leave the old node in system.peers on the replacement

2014-11-22 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221835#comment-14221835
 ] 

Jason Brown edited comment on CASSANDRA-8260 at 11/22/14 1:50 PM:
--

+1 on the patch, with one small nit: rename the second parameter on the 
overloaded quarantineEndpoint() from quarantineExpiration to quarantineStart 
(or something similar). The reason is that the timestamp indicates when 
the endpoint is put into quarantine, not when it should expire.

This is a reasonable fix to resolve this timing issue, but I'll add some 
thoughts to CASSANDRA-8304 about cleaning up the peers.
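A hedged sketch of the naming being proposed; the class, map, and constant below are illustrative stand-ins, not Cassandra's actual Gossiper code:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: the timestamp handed to the overload marks when
// quarantine *starts*, and expiry is derived from it, which is why
// "quarantineStart" is the clearer parameter name.
class GossiperSketch {
    static final long QUARANTINE_DELAY_MS = 60_000; // RING_DELAY * 2 in Cassandra

    private final Map<String, Long> justRemovedEndpoints = new HashMap<>();

    void quarantineEndpoint(String endpoint) {
        quarantineEndpoint(endpoint, System.currentTimeMillis());
    }

    // Named quarantineStart, not quarantineExpiration: it records when the
    // endpoint was put into quarantine.
    void quarantineEndpoint(String endpoint, long quarantineStart) {
        justRemovedEndpoints.put(endpoint, quarantineStart);
    }

    boolean isQuarantined(String endpoint, long now) {
        Long start = justRemovedEndpoints.get(endpoint);
        return start != null && now < start + QUARANTINE_DELAY_MS;
    }
}
```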





 Replacing a node can leave the old node in system.peers on the replacement
 --

 Key: CASSANDRA-8260
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8260
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Brandon Williams
Assignee: Brandon Williams
 Fix For: 2.0.12

 Attachments: 8260.txt


 Here's what happens:
 Nodes: X, Y, Z. Z replaces Y which is dead.
 0. Replacement finishes
 1. Z removes Y, quarantines and evicts (that is, removes the state)
 2. X sees the replacement, quarantines, but keeps state
 3. 60s elapses
 4. quarantine on Z expires
 5. X sends syn to Z, repopulates Y endpoint state and persists to 
 system.peers, but Z sees the conflict and does not update tMD for Y. 
 6. FatClient timer on Z starts counting.
 7. quarantine on X expires, fat client has been idle, evicts and 
 re-quarantines
 8. 30s elapses
 9. Fat client timeout occurs on Z, evicts and re-quarantines
 10. 30s elapses
 11. quarantine on X expires, so it never gets repopulated with Y since Z 
 already removed it
 It's important to note here that there is a small but relevant gap between 
 steps 1 and 2, which then correlates to steps 4 and 5, and step 5 is where 
 the problem occurs. This also explains why it looks related to RING_DELAY, 
 since the quarantine is RING_DELAY * 2, but Y never quarantines and the fat 
 client timeout is RING_DELAY, effectively making the discrepancy near equal 
 to RING_DELAY in the end.





[jira] [Commented] (CASSANDRA-8280) Cassandra crashing on inserting data over 64K into indexed strings

2014-11-22 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14221987#comment-14221987
 ] 

Sam Tunnicliffe commented on CASSANDRA-8280:


Unfortunately, that check is required: {{CompositeType.Builder.build()}} uses 
{{ByteBufferUtil.writeWithShortLength}}, so we never reach 
{{ThriftValidation.validateKey(cfm, key)}}.

 Cassandra crashing on inserting data over 64K into indexed strings
 --

 Key: CASSANDRA-8280
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8280
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: Debian 7, Cassandra 2.1.1, java 1.7.0_60
Reporter: Cristian Marinescu
Assignee: Sam Tunnicliffe
Priority: Critical
 Fix For: 2.1.3

 Attachments: 8280-2.0-v2.txt, 8280-2.0-v3.txt, 8280-2.0.txt, 
 8280-2.1-v2.txt, 8280-2.1.txt


 An attempt to insert 65536 bytes into a field that is a primary index throws 
 (correctly?) the cassandra.InvalidRequest exception. However, inserting the 
 same data *in an indexed field that is not a primary index* works just fine. 
 Cassandra will then crash on the next commit and never recover, so I rated it 
 as Critical as it can be used for DoS attacks.
 Reproduce: see the snippet below:
 {code}
 import uuid
 from cassandra import ConsistencyLevel
 from cassandra import InvalidRequest
 from cassandra.cluster import Cluster
 from cassandra.auth import PlainTextAuthProvider
 from cassandra.policies import ConstantReconnectionPolicy
 from cassandra.cqltypes import UUID

 # DROP KEYSPACE IF EXISTS cs;
 # CREATE KEYSPACE cs WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};
 # USE cs;
 # CREATE TABLE test3 (name text, value uuid, sentinel text, PRIMARY KEY (name));
 # CREATE INDEX test3_sentinels ON test3(sentinel);

 class CassandraDemo(object):

     def __init__(self):
         ips = ["127.0.0.1"]
         ap = PlainTextAuthProvider(username="cs", password="cs")
         reconnection_policy = ConstantReconnectionPolicy(20.0, max_attempts=100)
         cluster = Cluster(ips, auth_provider=ap, protocol_version=3,
                           reconnection_policy=reconnection_policy)
         self.session = cluster.connect("cs")

     def exec_query(self, query, args):
         prepared_statement = self.session.prepare(query)
         prepared_statement.consistency_level = ConsistencyLevel.LOCAL_QUORUM
         self.session.execute(prepared_statement, args)

     def bug(self):
         k1 = UUID(str(uuid.uuid4()))
         long_string = "X" * 65536
         query = "INSERT INTO test3 (name, value, sentinel) VALUES (?, ?, ?)"
         args = ("foo", k1, long_string)

         self.exec_query(query, args)
         self.session.execute("DROP KEYSPACE IF EXISTS cs_test", timeout=30)
         self.session.execute("CREATE KEYSPACE cs_test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}")

 c = CassandraDemo()
 # first run
 c.bug()
 # second run, Cassandra crashes with java.lang.AssertionError
 c.bug()
 {code}
 And here is the cassandra log:
 {code}
 ERROR [MemtableFlushWriter:3] 2014-11-06 16:44:49,263 CassandraDaemon.java:153 - Exception in thread Thread[MemtableFlushWriter:3,5,main]
 java.lang.AssertionError: 65536
 	at org.apache.cassandra.utils.ByteBufferUtil.writeWithShortLength(ByteBufferUtil.java:290) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.db.ColumnIndex$Builder.maybeWriteRowHeader(ColumnIndex.java:214) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.db.ColumnIndex$Builder.add(ColumnIndex.java:201) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.db.ColumnIndex$Builder.build(ColumnIndex.java:142) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:233) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:218) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:354) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:312) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[apache-cassandra-2.1.1.jar:2.1.1]
 	at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) ~[guava-16.0.jar:na]
 {code}

[jira] [Commented] (CASSANDRA-8280) Cassandra crashing on inserting data over 64K into indexed strings

2014-11-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222084#comment-14222084
 ] 

Aleksey Yeschenko commented on CASSANDRA-8280:
--

Let me clarify. I meant we should do {{keyBuilder.copy().add(val)}} 
unconditionally, but only call the final {{build()}} (which uses 
{{ByteBufferUtil.writeWithShortLength()}}) if {{getLength()}} validates.

And if we put our check there, then we no longer need the 
{{ThriftValidation.validateKey()}} call at all. Nor the subsequent call to 
{{ThriftValidation.validateKey()}} in {{ModificationStatement#getMutations()}}.
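A rough sketch of that suggestion, with a stand-in builder (this is not Cassandra's CompositeType.Builder; the 0xFFFF limit matches the 16-bit length prefix that writeWithShortLength enforces):

```java
import java.nio.ByteBuffer;

// Illustrative sketch: accumulate components unconditionally, but validate
// the total length before the final build(), since each component is later
// serialized with a 16-bit (unsigned short) length prefix.
class KeyBuilderSketch {
    static final int MAX_UNSIGNED_SHORT = 0xFFFF;

    private int length = 0;
    private int components = 0;

    KeyBuilderSketch add(ByteBuffer value) {
        length += value.remaining();
        components++;
        return this;
    }

    int getLength() {
        return length;
    }

    ByteBuffer build() {
        if (getLength() > MAX_UNSIGNED_SHORT)
            throw new IllegalArgumentException(
                "Key length " + getLength() + " exceeds " + MAX_UNSIGNED_SHORT);
        // Real code would serialize each component with writeWithShortLength;
        // here we only size the buffer (2-byte length + 1 end byte per component).
        return ByteBuffer.allocate(length + 3 * components);
    }
}
```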


[jira] [Commented] (CASSANDRA-8280) Cassandra crashing on inserting data over 64K into indexed strings

2014-11-22 Thread Sam Tunnicliffe (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222138#comment-14222138
 ] 

Sam Tunnicliffe commented on CASSANDRA-8280:


Well, {{ThriftValidation.validateKey()}} does more than simply check the length, so 
I think we'd still need it. Also, we do potentially less work the way it is now, 
because we only iterate over the builder components (in {{getLength()}}) once, 
rather than once per value.


[jira] [Commented] (CASSANDRA-8280) Cassandra crashing on inserting data over 64K into indexed strings

2014-11-22 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14222191#comment-14222191
 ] 

Aleksey Yeschenko commented on CASSANDRA-8280:
--

Fair enough re: {{TV.validateKey()}} doing more than that (although the second 
call, in {{MS#getMutations()}}, is still redundant; that's not an issue 
introduced by this patch, though).

In that case you'd need logic more complex than {{baseSize + val.remaining()}} 
to account for the extra 3 bytes, conditionally, for partition keys. I'd rather we 
did some extra iteration (which in this case happens rarely and doesn't cost 
much) than take on the extra complexity.

Anyway, sorry, this is me nit-picking for lack of things to do on a 
Saturday. Feel free to ignore the minor things.


[jira] [Commented] (CASSANDRA-4175) Reduce memory, disk space, and cpu usage with a column name/id map

2014-11-22 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1449#comment-1449
 ] 

Edward Capriolo commented on CASSANDRA-4175:


There was once a https://twitter.com/roflscaletips suggestion that said 
something to the effect of "make mongo faster by using small column names". The 
same advice applies here. If you name a column wombat_walnut_crackerjacks 
instead of w, it is going to take up more space on disk. This is because 
Cassandra stores the column name and value for each column on disk, because it 
is a row store, apparently.

A simple way to solve this would be to have the CQL language store some 
meta-data about alternate column names.

{quote}
Create table abc ( wombat_walnut_crackerjacks int (shortname w) );
{quote}

Then the query engine could allow either name to be used in a select clause.

{quote}
SELECT w from abc;
{quote}

{quote}
SELECT wombat_walnut_crackerjacks from abc;
{quote}

An even easier way is to name the column w. That way you avoid systems where a 
column needs two names, or systems that keep an internal mapping from column 
name to shorter column name. But what is the fun of just telling people to use 
short names when a complex solution can be engineered :)

 Reduce memory, disk space, and cpu usage with a column name/id map
 --

 Key: CASSANDRA-4175
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4175
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jonathan Ellis
Assignee: Jason Brown
  Labels: performance
 Fix For: 3.0


 We spend a lot of memory on column names, both transiently (during reads) and 
 more permanently (in the row cache).  Compression mitigates this on disk but 
 not on the heap.
 The overhead is significant for typical small column values, e.g., ints.
 Even though we intern once we get to the memtable, this affects writes too 
 via very high allocation rates in the young generation, hence more GC 
 activity.
 Now that CQL3 provides us some guarantees that column names must be defined 
 before they are inserted, we could create a map of (say) 32-bit int column 
 ids to names, and use that internally right up until we return a resultset to 
 the client.
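A minimal sketch of the interning idea described above; nothing here is Cassandra code, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: intern each defined column name once, hand out a
// 32-bit id, and use the id internally until the resultset boundary.
class ColumnIdMap {
    private final Map<String, Integer> nameToId = new HashMap<>();
    private final List<String> idToName = new ArrayList<>();

    // Called when the column is defined (CQL3 guarantees definition-before-insert).
    int idFor(String name) {
        Integer id = nameToId.get(name);
        if (id != null)
            return id;
        int newId = idToName.size();
        nameToId.put(name, newId);
        idToName.add(name);
        return newId;
    }

    // Called only when returning a resultset, to restore the client-visible name.
    String nameFor(int id) {
        return idToName.get(id);
    }
}
```

Rows then carry 4-byte ids instead of full name bytes, both on the heap and in the row cache.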





[jira] [Commented] (CASSANDRA-8231) Wrong size of cached prepared statements

2014-11-22 Thread Rajanarayanan Thottuvaikkatumana (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1457#comment-1457
 ] 

Rajanarayanan Thottuvaikkatumana commented on CASSANDRA-8231:
-

I see the following in {{git log}}: {{f02d194 Add missing jamm jar from 
CASSANDRA-8231}}, and I think there is a problem with the jar file checked in. 
While running some of the tests, I get the following error message:
{code}
[junit] Error opening zip file or JAR manifest missing : /Users/RajT/cassandra-source/cassandra-trunk/lib/jamm-0.3.0.jar
{code}

Please have a look at it. Thanks

 Wrong size of cached prepared statements
 

 Key: CASSANDRA-8231
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8231
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Jaroslav Kamenik
Assignee: Benjamin Lerer
 Fix For: 2.1.3

 Attachments: 8231-notes.txt, CASSANDRA-8231-V2-trunk.txt, 
 CASSANDRA-8231-V2.txt, CASSANDRA-8231.txt, Unsafes.java


 Cassandra counts the memory footprint of prepared statements for caching 
 purposes. It seems that there is a problem with some statements, e.g. 
 SelectStatement. Even simple selects are counted as 100KB objects; updates, 
 deletes etc. come to a few hundred or thousand bytes. The result is that the 
 cache (QueryProcessor.preparedStatements) holds just a fraction of the statements.
 I dug a little into the code, and it seems the problem is in jamm, in class 
 MemoryMeter. If an instance contains a reference to a class, jamm counts the 
 size of the whole class too. SelectStatement references an EnumSet through 
 ResultSet.Metadata, and EnumSet holds a reference to the Enum class...





[jira] [Commented] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1495#comment-1495
 ] 

Vijay commented on CASSANDRA-7438:
--

Alright the first version of pure Java version of LRUCache pushed, 
* Basically a port from the C version. (Most of the test cases pass and they 
are the same for both versions)
* As ariel mentioned before we can use disruptor for the ring buffer but this 
doesn't use it yet.
* Expiry in the queue thread is not implemented yet.
* Algorithm to start the rehash needs to be more configurable and based on the 
capacity will be pushing that soon.
* Overhead in JVM heap is just the segments array.

https://github.com/Vijay2win/lruc/tree/master/src/main/java/com/lruc/unsafe 

 Serializing Row cache alternative (Fully off heap)
 --

 Key: CASSANDRA-7438
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7438
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
 Environment: Linux
Reporter: Vijay
Assignee: Vijay
  Labels: performance
 Fix For: 3.0

 Attachments: 0001-CASSANDRA-7438.patch


 Currently SerializingCache is partially off heap; keys are still stored in the 
 JVM heap as ByteBuffers:
 * There are higher GC costs for a reasonably big cache.
 * Some users have used the row cache efficiently in production for better 
 results, but this requires careful tuning.
 * Overhead in memory for the cache entries is relatively high.
 So the proposal for this ticket is to move the LRU cache logic completely off 
 heap and use JNI to interact with the cache. We might want to ensure that the 
 new implementation matches the existing APIs (ICache), and the implementation 
 needs to have safe memory access, low overhead in memory, and fewer memcpys 
 (as much as possible).
 We might also want to make this cache configurable.





[jira] [Comment Edited] (CASSANDRA-7438) Serializing Row cache alternative (Fully off heap)

2014-11-22 Thread Vijay (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1495#comment-1495
 ] 

Vijay edited comment on CASSANDRA-7438 at 11/23/14 3:23 AM:


Alright, the first version of the pure-Java LRUCache is pushed:
* Basically a port from the C version. (Most of the test cases pass, and they 
are the same for both versions.)
* As Ariel mentioned before, we could use the disruptor for the ring buffer; 
the current version doesn't use it yet.
* Proactive expiry in the queue thread is not implemented yet.
* The algorithm to start the rehash needs to be more configurable and based on 
the capacity; will be pushing that soon.
* Overhead in the JVM heap is just the segments array, hence the cache should 
be able to grow as much as the system can support.

https://github.com/Vijay2win/lruc/tree/master/src/main/java/com/lruc/unsafe 
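For readers unfamiliar with the access pattern being ported, here is a plain on-heap stand-in; this is not the lruc code, only a sketch of the LRU eviction behavior the off-heap version has to reproduce:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: an access-ordered map that evicts the least-recently
// used entry once capacity is exceeded. The real implementation keeps entries
// in native memory behind an ICache-style API; only the LRU semantics match.
class LruSketch<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    LruSketch(int capacity) {
        super(16, 0.75f, true); // accessOrder = true gives LRU iteration order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```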




