[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2017-08-23 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139537#comment-16139537
 ] 

Drew Kutcharian commented on CASSANDRA-10783:
-

Any chance of this getting back-ported to 3.0.x?

> Allow literal value as parameter of UDF & UDA
> -
>
> Key: CASSANDRA-10783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10783
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: Sylvain Lebresne
>Priority: Minor
>  Labels: CQL3, UDF, client-impacting, doc-impacting
> Fix For: 3.8
>
>
> I have defined the following UDF
> {code:sql}
> CREATE OR REPLACE FUNCTION maxOf(current int, testValue int) 
> RETURNS NULL ON NULL INPUT 
> RETURNS int 
> LANGUAGE java 
> AS 'return Math.max(current, testValue);';
> CREATE TABLE maxValue(id int PRIMARY KEY, val int);
> INSERT INTO maxValue(id, val) VALUES(1, 100);
> SELECT maxOf(val, 101) FROM maxValue WHERE id=1;
> {code}
> I got the following error message:
> {code}
> SyntaxException:  message="line 1:19 no viable alternative at input '101' (SELECT maxOf(val1, 
> [101]...)">
> {code}
>  It would be nice to allow literal values as parameters of UDFs and UDAs too.
>  I was thinking about a use-case for a UDA groupBy() function where the end 
> user can *inject* a literal value at runtime to select which aggregation they 
> want to display, something similar to GROUP BY ... HAVING.
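For context, a minimal sketch of the two call shapes (the bind-marker form is a workaround sketch assuming a prepared statement; direct literal arguments are what this ticket added in 3.8):

{code:sql}
-- After the fix (3.8+), a literal can be passed directly:
SELECT maxOf(val, 101) FROM maxValue WHERE id = 1;

-- Before the fix, the constant had to go through a bind marker
-- in a prepared statement:
SELECT maxOf(val, ?) FROM maxValue WHERE id = 1;
{code}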



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Comment Edited] (CASSANDRA-8877) Ability to read the TTL and WRITE TIME of an element in a collection

2015-03-03 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345474#comment-14345474
 ] 

Drew Kutcharian edited comment on CASSANDRA-8877 at 3/3/15 6:46 PM:


[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['first_name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['first_name']), WRITETIME(fields['first_name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, METADATA(fields) from user
{code}

and the response would be
{code}
fields = { 'first_name': 'john', 'last_name': 'doe' }

METADATA(fields) = { 'first_name': {'ttl': ttl seconds, 'writetime': 
timestamp }, 'last_name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}

or alternatively (without adding a new function):
{code}
SELECT fields, TTL(fields), WRITETIME(fields) from user
{code}

and the response would be
{code}
fields = { 'first_name': 'john', 'last_name': 'doe' }

TTL(fields) = { 'first_name': ttl seconds, 'last_name': ttl seconds }

WRITETIME(fields) = { 'first_name': writetime millis, 'last_name': 
writetime millis }
{code}
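For contrast, the TTL and WRITETIME selectors already work on scalar (non-collection) columns; the proposals above extend them to collection elements. A sketch (the email column here is hypothetical):

{code}
-- Works today for a scalar column:
SELECT TTL(email), WRITETIME(email) FROM user WHERE id = 1;
{code}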



was (Author: drew_kutchar):
[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['name']), WRITETIME(fields['name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, METADATA(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
METADATA(fields): { 'name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}

or alternatively (without adding a new function):
{code}
SELECT fields, TTL(fields), WRITETIME(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
TTL(fields): { 'name': ttl seconds }
WRITETIME(fields): { 'name': writetime millis }
{code}


 Ability to read the TTL and WRITE TIME of an element in a collection
 

 Key: CASSANDRA-8877
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8877
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian
Assignee: Benjamin Lerer
Priority: Minor
 Fix For: 3.0


 Currently it's possible to set the TTL and WRITE TIME of an element in a 
 collection using CQL, but there is no way to read them back. 





[jira] [Comment Edited] (CASSANDRA-8877) Ability to read the TTL and WRITE TIME of an element in a collection

2015-03-03 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345474#comment-14345474
 ] 

Drew Kutcharian edited comment on CASSANDRA-8877 at 3/3/15 6:40 PM:


[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['name']), WRITETIME(fields['name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, METADATA(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
METADATA(fields): { 'name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}



was (Author: drew_kutchar):
[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['name']), WRITETIME(fields['name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, metadata(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
metadata(fields): { 'name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}


 Ability to read the TTL and WRITE TIME of an element in a collection
 

 Key: CASSANDRA-8877
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8877
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian
Assignee: Benjamin Lerer
Priority: Minor
 Fix For: 3.0


 Currently it's possible to set the TTL and WRITE TIME of an element in a 
 collection using CQL, but there is no way to read them back. 





[jira] [Comment Edited] (CASSANDRA-8877) Ability to read the TTL and WRITE TIME of an element in a collection

2015-03-03 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345474#comment-14345474
 ] 

Drew Kutcharian edited comment on CASSANDRA-8877 at 3/3/15 6:42 PM:


[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['name']), WRITETIME(fields['name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, METADATA(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
METADATA(fields): { 'name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}

or alternatively (without adding a new function):
{code}
SELECT fields, TTL(fields), WRITETIME(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
TTL(fields): { 'name': ttl seconds }
WRITETIME(fields): { 'name': writetime millis }
{code}



was (Author: drew_kutchar):
[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['name']), WRITETIME(fields['name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, METADATA(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
METADATA(fields): { 'name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}


 Ability to read the TTL and WRITE TIME of an element in a collection
 

 Key: CASSANDRA-8877
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8877
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian
Assignee: Benjamin Lerer
Priority: Minor
 Fix For: 3.0


 Currently it's possible to set the TTL and WRITE TIME of an element in a 
 collection using CQL, but there is no way to read them back. 





[jira] [Commented] (CASSANDRA-8861) HyperLogLog Collection Type

2015-03-03 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345487#comment-14345487
 ] 

Drew Kutcharian commented on CASSANDRA-8861:


Thanks [~iamaleksey]

 HyperLogLog Collection Type
 ---

 Key: CASSANDRA-8861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8861
 Project: Cassandra
  Issue Type: Wish
Reporter: Drew Kutcharian
Assignee: Aleksey Yeschenko
 Fix For: 3.1


 Considering that HyperLogLog and its variants have become pretty popular in 
 the analytics space and Cassandra has read-before-write collections (Lists), 
 I think it would not be too painful to add support for a HyperLogLog 
 collection type. They would act similarly to CQL 3 Sets, meaning you would 
 be able to set the value and add an element, but you won't be able to remove 
 an element. Also, when getting the value of a HyperLogLog collection column, 
 you'd get the cardinality.
 There are a couple of attributes of HyperLogLog which fit Cassandra 
 pretty well:
 - Adding an element is idempotent (adding an existing element doesn't change 
 the HLL)
 - An HLL can be thought of as a CRDT, since we can safely merge two HLLs, 
 for example during read repair. But if that's too much work, I guess we can 
 even live with LWW since these counts are estimates after all.
 There is already a proof of concept at:
 http://vilkeliskis.com/blog/2013/12/28/hacking_cassandra.html





[jira] [Commented] (CASSANDRA-8877) Ability to read the TTL and WRITE TIME of an element in a collection

2015-03-03 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14345474#comment-14345474
 ] 

Drew Kutcharian commented on CASSANDRA-8877:


[~slebresne] you are correct that this relates to CASSANDRA-7396. The ideal 
situation would be:

1. Be able to select the value of an element in a collection individually, 
i.e. 
{code}
SELECT fields['name'] from user
{code}

2. Be able to select the value, TTL and writetime of an element in a 
collection individually
{code}
SELECT TTL(fields['name']), WRITETIME(fields['name']) from user
{code}

3. Be able to select the values of ALL the elements in a collection (this is 
the current functionality when selecting a collection column)
{code}
SELECT fields from user
{code}

Optionally:
4. Be able to select the value, TTL and writetime of ALL the elements in a 
collection. This is where I haven't come up with a good syntax but maybe 
something like this:
{code}
SELECT fields, metadata(fields) from user
{code}

and the response would be
{code}
fields: { 'name': 'john' }
metadata(fields): { 'name': {'ttl': ttl seconds, 'writetime': timestamp } }
{code}


 Ability to read the TTL and WRITE TIME of an element in a collection
 

 Key: CASSANDRA-8877
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8877
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian
Assignee: Benjamin Lerer
Priority: Minor
 Fix For: 3.0


 Currently it's possible to set the TTL and WRITE TIME of an element in a 
 collection using CQL, but there is no way to read them back. 





[jira] [Created] (CASSANDRA-8877) Ability to read the TTL and WRITE TIME of an element in a collection

2015-02-27 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-8877:
--

 Summary: Ability to read the TTL and WRITE TIME of an element in a 
collection
 Key: CASSANDRA-8877
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8877
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian


Currently it's possible to set the TTL and WRITE TIME of an element in a 
collection using CQL, but there is no way to read them back. 





[jira] [Created] (CASSANDRA-8861) HyperLogLog Collection Type

2015-02-24 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-8861:
--

 Summary: HyperLogLog Collection Type
 Key: CASSANDRA-8861
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8861
 Project: Cassandra
  Issue Type: Wish
Reporter: Drew Kutcharian


Considering that HyperLogLog and its variants have become pretty popular in 
the analytics space and Cassandra has read-before-write collections (Lists), I 
think it would not be too painful to add support for a HyperLogLog collection 
type. They would act similarly to CQL 3 Sets, meaning you would be able to set 
the value and add an element, but you won't be able to remove an element. 
Also, when getting the value of a HyperLogLog collection column, you'd get the 
cardinality.

There are a couple of attributes of HyperLogLog which fit Cassandra 
pretty well:
- Adding an element is idempotent (adding an existing element doesn't change 
the HLL)
- An HLL can be thought of as a CRDT, since we can safely merge two HLLs, for 
example during read repair. But if that's too much work, I guess we can even 
live with LWW since these counts are estimates after all.

There is already a proof of concept at:
http://vilkeliskis.com/blog/2013/12/28/hacking_cassandra.html
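The proposed semantics could look roughly like this in CQL (entirely hypothetical syntax; the hll type and cardinality-on-read behavior are the wish, not an existing feature):

{code}
CREATE TABLE page_views (
  page text PRIMARY KEY,
  visitors hll<text>   -- hypothetical HyperLogLog collection type
);

-- Adding is idempotent: re-adding 'user1' leaves the sketch unchanged.
UPDATE page_views SET visitors = visitors + {'user1'} WHERE page = '/home';

-- Reading would return the estimated cardinality, not the elements.
SELECT visitors FROM page_views WHERE page = '/home';
{code}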






[jira] [Commented] (CASSANDRA-8249) cassandra.yaml: rpc_address overwrites listen_address

2014-11-04 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196419#comment-14196419
 ] 

Drew Kutcharian commented on CASSANDRA-8249:


[~mshuler] and [~brandon.williams] You guys are right. This was a 
misconfiguration on our part that manifested itself after an upgrade, hence I 
thought it could've been a bug. This issue can be closed.

 cassandra.yaml: rpc_address overwrites listen_address
 -

 Key: CASSANDRA-8249
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8249
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.1.1, Ubuntu 14.04
Reporter: Drew Kutcharian

 To reproduce, set listen_address to the node's non-local address, say 
 192.168.0.10, and set rpc_address to localhost. Start C* and it will bind 
 the native protocol to localhost.





[jira] [Created] (CASSANDRA-8249) cassandra.yaml: rpc_address overwrites listen_address

2014-11-03 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-8249:
--

 Summary: cassandra.yaml: rpc_address overwrites listen_address
 Key: CASSANDRA-8249
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8249
 Project: Cassandra
  Issue Type: Bug
 Environment: Cassandra 2.1.1, Ubuntu 14.04
Reporter: Drew Kutcharian


To reproduce, set listen_address to the node's non-local address, say 
192.168.0.10, and set rpc_address to localhost. Start C* and it will bind 
the native protocol to localhost.
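The two settings in question, as they would appear in cassandra.yaml (a sketch of the reported configuration; rpc_address is what the client-facing native protocol binds to, independently of listen_address):

{code}
listen_address: 192.168.0.10   # address used for inter-node communication
rpc_address: localhost         # address the client (native protocol) server binds to
{code}

As the follow-up comment notes, this turned out to be the expected behavior rather than a bug.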





[jira] [Commented] (CASSANDRA-7850) Composite Aware Partitioner

2014-08-30 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116478#comment-14116478
 ] 

Drew Kutcharian commented on CASSANDRA-7850:


I agree that transparent sharding doesn't make sense and that's the whole point 
of this JIRA. I want to be able to do explicit sharding but keep the shards on 
the same node so I can do efficient multi_get and multi_get_slice.

 Composite Aware Partitioner
 ---

 Key: CASSANDRA-7850
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7850
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian

 Since C* supports composites for partition keys, I think it'd be useful to 
 have the ability to only use first (or first few) components of the key to 
 calculate the token hash.
 A naive use case would be multi-tenancy:
 Say we have accounts and accounts have users. So we would have the following 
 tables:
 {code}
 CREATE TABLE account (
   id timeuuid PRIMARY KEY,
   company text
 );
 {code}
 {code}
 CREATE TABLE user (
   id  timeuuid PRIMARY KEY, 
   accountId timeuuid,
   email text,
   password text
 );
 {code}
 {code}
 // Get users by account
 CREATE TABLE user_account_index (
   accountId  timeuuid,
   userId timeuuid,
   PRIMARY KEY(accountId, userId)
 );
 {code}
 Say we want to get all the users that belong to an account. We would first 
 have to get the results from user_account_index and then use a multi-get 
 (WHERE IN) to get the records from user table. Now this multi-get part could 
 potentially query a lot of different nodes in the cluster. It’d be great if 
 there was a way to limit storage of users of an account to a single node so 
 that way multi-get would only need to query a single node.
 With this improvement we would be able to define the user table like so:
 {code}
 CREATE TABLE user (
   id  timeuuid, 
   accountId timeuuid,
   email text,
   password text,
   PRIMARY KEY(((accountId),id))  //extra parentheses
 );
 {code}
 I'm not too sure about the notation, it could be something like PRIMARY 
 KEY(((accountId),id)) where the (accountId) means use this part to 
 calculate the hash and ((accountId),id) is the actual partition key.
 The main complication I see with this is that we would have to use the table 
 definition when calculating hashes so we know what components of the 
 partition keys need to be used for hash calculation.





[jira] [Created] (CASSANDRA-7850) Composite Aware Partitioner

2014-08-29 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-7850:
--

 Summary: Composite Aware Partitioner
 Key: CASSANDRA-7850
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7850
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian


Since C* supports composites for partition keys, I think it'd be useful to have 
the ability to only use first (or first few) components of the key to calculate 
the token hash.

A naive use case would be multi-tenancy:

Say we have accounts and accounts have users. So we would have the following 
tables:

{code}
CREATE TABLE account (
  id timeuuid PRIMARY KEY,
  company text
);
{code}

{code}
CREATE TABLE user (
  id  timeuuid PRIMARY KEY, 
  accountId timeuuid,
  email text,
  password text
);
{code}

{code}
// Get users by account
CREATE TABLE user_account_index (
  accountId  timeuuid,
  userId timeuuid,
  PRIMARY KEY(accountId, userId)
);
{code}

Say we want to get all the users that belong to an account. We would first have 
to get the results from user_account_index and then use a multi-get (WHERE IN) 
to get the records from the user table. Now this multi-get part could potentially 
query a lot of different nodes in the cluster. It’d be great if there was a way 
to limit storage of users of an account to a single node so that way multi-get 
would only need to query a single node.

With this improvement we would be able to define the user table like so:
{code}
CREATE TABLE user (
  id  timeuuid, 
  accountId timeuuid,
  email text,
  password text,
  PRIMARY KEY(((accountId),id))  //extra parentheses
);
{code}

I'm not too sure about the notation, it could be something like PRIMARY 
KEY(((accountId),id)) where the (accountId) means use this part to calculate 
the hash and ((accountId),id) is the actual partition key.

The main complication I see with this is that we would have to use the table 
definition when calculating hashes so we know what components of the partition 
keys need to be used for hash calculation.
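For comparison, a conventional alternative today (real CQL, no partitioner change) is to make the grouping column the partition key, which co-locates an account's users at the cost of one potentially wide partition:

{code}
CREATE TABLE user_by_account (
  accountId timeuuid,
  id timeuuid,
  email text,
  password text,
  PRIMARY KEY (accountId, id)   -- accountId alone decides the node
);
{code}

The proposal above differs in that it would keep bucketed keys in separate partitions while still hashing only on the first component.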






[jira] [Reopened] (CASSANDRA-7850) Composite Aware Partitioner

2014-08-29 Thread Drew Kutcharian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Drew Kutcharian reopened CASSANDRA-7850:



Hi [~jbellis], I think you misunderstood this JIRA or more likely I didn't 
explain it properly.

In the link that you provided:

bq. Generally, Cassandra will store columns having the same block_id but a 
different breed on different nodes, and columns having the same block_id and 
breed on the same node.

The point of this JIRA is to be able to store columns having the _same_ 
block_id but different breeds on the same node. (Think wide row sharding)

 Composite Aware Partitioner
 ---

 Key: CASSANDRA-7850
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7850
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian

 Since C* supports composites for partition keys, I think it'd be useful to 
 have the ability to only use first (or first few) components of the key to 
 calculate the token hash.
 A naive use case would be multi-tenancy:
 Say we have accounts and accounts have users. So we would have the following 
 tables:
 {code}
 CREATE TABLE account (
   id timeuuid PRIMARY KEY,
   company text
 );
 {code}
 {code}
 CREATE TABLE user (
   id  timeuuid PRIMARY KEY, 
   accountId timeuuid,
   email text,
   password text
 );
 {code}
 {code}
 // Get users by account
 CREATE TABLE user_account_index (
   accountId  timeuuid,
   userId timeuuid,
   PRIMARY KEY(accountId, userId)
 );
 {code}
 Say we want to get all the users that belong to an account. We would first 
 have to get the results from user_account_index and then use a multi-get 
 (WHERE IN) to get the records from user table. Now this multi-get part could 
 potentially query a lot of different nodes in the cluster. It’d be great if 
 there was a way to limit storage of users of an account to a single node so 
 that way multi-get would only need to query a single node.
 With this improvement we would be able to define the user table like so:
 {code}
 CREATE TABLE user (
   id  timeuuid, 
   accountId timeuuid,
   email text,
   password text,
   PRIMARY KEY(((accountId),id))  //extra parentheses
 );
 {code}
 I'm not too sure about the notation, it could be something like PRIMARY 
 KEY(((accountId),id)) where the (accountId) means use this part to 
 calculate the hash and ((accountId),id) is the actual partition key.
 The main complication I see with this is that we would have to use the table 
 definition when calculating hashes so we know what components of the 
 partition keys need to be used for hash calculation.





[jira] [Commented] (CASSANDRA-7850) Composite Aware Partitioner

2014-08-29 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116170#comment-14116170
 ] 

Drew Kutcharian commented on CASSANDRA-7850:


Yes, but then I might end up with very wide rows.

Basically what I want is {{PRIMARY KEY ((block_id, breed_bucket), breed)}} 
where records with same block_id and breed_bucket get stored on the same node, 
but in different _thrift_ rows so I don't have very wide rows (millions of 
_thrift_ columns per _thrift_ row). 

 Composite Aware Partitioner
 ---

 Key: CASSANDRA-7850
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7850
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian

 Since C* supports composites for partition keys, I think it'd be useful to 
 have the ability to only use first (or first few) components of the key to 
 calculate the token hash.
 A naive use case would be multi-tenancy:
 Say we have accounts and accounts have users. So we would have the following 
 tables:
 {code}
 CREATE TABLE account (
   id timeuuid PRIMARY KEY,
   company text
 );
 {code}
 {code}
 CREATE TABLE user (
   id  timeuuid PRIMARY KEY, 
   accountId timeuuid,
   email text,
   password text
 );
 {code}
 {code}
 // Get users by account
 CREATE TABLE user_account_index (
   accountId  timeuuid,
   userId timeuuid,
   PRIMARY KEY(accountId, userId)
 );
 {code}
 Say we want to get all the users that belong to an account. We would first 
 have to get the results from user_account_index and then use a multi-get 
 (WHERE IN) to get the records from user table. Now this multi-get part could 
 potentially query a lot of different nodes in the cluster. It’d be great if 
 there was a way to limit storage of users of an account to a single node so 
 that way multi-get would only need to query a single node.
 With this improvement we would be able to define the user table like so:
 {code}
 CREATE TABLE user (
   id  timeuuid, 
   accountId timeuuid,
   email text,
   password text,
   PRIMARY KEY(((accountId),id))  //extra parentheses
 );
 {code}
 I'm not too sure about the notation, it could be something like PRIMARY 
 KEY(((accountId),id)) where the (accountId) means use this part to 
 calculate the hash and ((accountId),id) is the actual partition key.
 The main complication I see with this is that we would have to use the table 
 definition when calculating hashes so we know what components of the 
 partition keys need to be used for hash calculation.





[jira] [Comment Edited] (CASSANDRA-7850) Composite Aware Partitioner

2014-08-29 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14116170#comment-14116170
 ] 

Drew Kutcharian edited comment on CASSANDRA-7850 at 8/30/14 2:00 AM:
-

Yes, but then I might end up with very wide _thrift_ rows.

Basically what I want is {{PRIMARY KEY ((block_id, breed_bucket), breed)}} 
where records with same block_id get stored on the same node *regardless* of 
the value of breed_bucket. But I don't want to use {{PRIMARY KEY (block_id, 
breed_bucket, breed)}} since in that case all the records for a block_id would 
end up in a single _thrift_ row.

So, ideally the layout would be:
block_id - decides the node
(block_id, breed_bucket) - decides the _thrift_ row. Old school row key
breed - prefix of _thrift_ columns. Old school column name prefix



was (Author: drew_kutchar):
Yes, but then I might end up with very wide rows.

Basically what I want is {{PRIMARY KEY ((block_id, breed_bucket), breed)}} 
where records with same block_id and breed_bucket get stored on the same node, 
but in different _thrift_ rows so I don't have very wide rows (millions of 
_thrift_ columns per _thrift_ row). 

 Composite Aware Partitioner
 ---

 Key: CASSANDRA-7850
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7850
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian

 Since C* supports composites for partition keys, I think it'd be useful to 
 have the ability to only use first (or first few) components of the key to 
 calculate the token hash.
 A naive use case would be multi-tenancy:
 Say we have accounts and accounts have users. So we would have the following 
 tables:
 {code}
 CREATE TABLE account (
   id timeuuid PRIMARY KEY,
   company text
 );
 {code}
 {code}
 CREATE TABLE user (
   id  timeuuid PRIMARY KEY, 
   accountId timeuuid,
   email text,
   password text
 );
 {code}
 {code}
 // Get users by account
 CREATE TABLE user_account_index (
   accountId  timeuuid,
   userId timeuuid,
   PRIMARY KEY(accountId, userId)
 );
 {code}
 Say we want to get all the users that belong to an account. We would first 
 have to get the results from user_account_index and then use a multi-get 
 (WHERE IN) to get the records from user table. Now this multi-get part could 
 potentially query a lot of different nodes in the cluster. It’d be great if 
 there was a way to limit storage of users of an account to a single node so 
 that way multi-get would only need to query a single node.
 With this improvement we would be able to define the user table like so:
 {code}
 CREATE TABLE user (
   id  timeuuid, 
   accountId timeuuid,
   email text,
   password text,
   PRIMARY KEY(((accountId),id))  //extra parentheses
 );
 {code}
 I'm not too sure about the notation, it could be something like PRIMARY 
 KEY(((accountId),id)) where the (accountId) means use this part to 
 calculate the hash and ((accountId),id) is the actual partition key.
 The main complication I see with this is that we would have to use the table 
 definition when calculating hashes so we know what components of the 
 partition keys need to be used for hash calculation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements

2014-06-19 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037546#comment-14037546
 ] 

Drew Kutcharian commented on CASSANDRA-7304:


I agree with [~slebresne] here. Having it on the query is much clearer and 
easier to read. Also, I don't see a problem with having multiple statements 
that do the same thing, since:

bq. UPDATE table SET column = 3 WHERE key = 2;
This means use the default behavior

bq. UPDATE table USING IGNORE_NULLS true SET column = 3 WHERE key = 2;
This explicitly sets USING IGNORE_NULLS to true.

bq. UPDATE table USING IGNORE_NULLS false SET column = 3 WHERE key = 2;
This explicitly sets USING IGNORE_NULLS to false. Say if the default ever 
changes and you just don't want to be at the mercy of the default.

So I wouldn't say these 3 statements have the same meaning.

 Ability to distinguish between NULL and UNSET values in Prepared Statements
 ---

 Key: CASSANDRA-7304
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7304
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian

 Currently Cassandra inserts tombstones when a value of a column is bound to 
 NULL in a prepared statement. At higher insert rates managing all these 
 tombstones becomes an unnecessary overhead. This limits the usefulness of the 
 prepared statements since developers have to either create multiple prepared 
 statements (each with a different combination of column names, which at times 
 is just unfeasible because of the sheer number of possible combinations) or 
 fall back to using regular (non-prepared) statements.
 This JIRA is here to explore the possibility of either:
 A. Have a flag on prepared statements that once set, tells Cassandra to 
 ignore null columns
 or
 B. Have an UNSET value which makes Cassandra skip the null columns and not 
 tombstone them
 Basically, in the context of a prepared statement, a null value means delete, 
 but we don’t have anything that means ignore (besides creating a new 
 prepared statement without the ignored column).
 Please refer to the original conversation on DataStax Java Driver mailing 
 list for more background:
 https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion
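Option B can be modeled in a few lines: a dedicated UNSET sentinel produces no write at all (and therefore no tombstone), while null keeps its delete meaning. A toy sketch of the intended semantics; the `UNSET` and `apply_update` names are invented for illustration and are not driver or server code:

```python
UNSET = object()  # sentinel: "leave this column untouched"

def apply_update(row, changes):
    """Apply bound values to a row under option-B semantics.

    None still means delete (a tombstone would be written);
    UNSET-bound columns are skipped and produce no write at all.
    """
    out = dict(row)
    for col, val in changes.items():
        if val is UNSET:
            continue  # never written -> no tombstone
        out[col] = val  # includes None, i.e. an explicit delete
    return out

row = {"id": 1, "email": "a@b.c", "password": "pw"}
updated = apply_update(row, {"email": UNSET, "password": None})
print(updated)  # {'id': 1, 'email': 'a@b.c', 'password': None}
```

For comparison, the protocol-level feature that eventually shipped exposes essentially this sentinel to clients (e.g. `UNSET_VALUE` in the DataStax Python driver), letting one prepared statement cover all column combinations.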



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements

2014-06-19 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14037705#comment-14037705
 ] 

Drew Kutcharian commented on CASSANDRA-7304:


+1 with {{IGNORE NULLS}} it's much more consistent with {{ALLOW FILTERING}}

 Ability to distinguish between NULL and UNSET values in Prepared Statements
 ---

 Key: CASSANDRA-7304
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7304
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian




--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-7304) Ability to distinguish between NULL and UNSET values in Prepared Statements

2014-05-26 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-7304:
--

 Summary: Ability to distinguish between NULL and UNSET values in 
Prepared Statements
 Key: CASSANDRA-7304
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7304
 Project: Cassandra
  Issue Type: Improvement
Reporter: Drew Kutcharian


Currently Cassandra inserts tombstones when a value of a column is bound to 
NULL in a prepared statement. At higher insert rates managing all these 
tombstones becomes an unnecessary overhead. This limits the usefulness of the 
prepared statements since developers have to either create multiple prepared 
statements (each with a different combination of column names, which at times 
is just unfeasible because of the sheer number of possible combinations) or 
fall back to using regular (non-prepared) statements.

This JIRA is here to explore the possibility of either:
A. Have a flag on prepared statements that once set, tells Cassandra to ignore 
null columns

or

B. Have an UNSET value which makes Cassandra skip the null columns and not 
tombstone them

Basically, in the context of a prepared statement, a null value means delete, 
but we don’t have anything that means ignore (besides creating a new prepared 
statement without the ignored column).

Please refer to the original conversation on DataStax Java Driver mailing list 
for more background:
https://groups.google.com/a/lists.datastax.com/d/topic/java-driver-user/cHE3OOSIXBU/discussion



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (CASSANDRA-6672) Support for Microsecond Resolution Time UUIDs in CQL3

2014-02-06 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-6672:
--

 Summary: Support for Microsecond Resolution Time UUIDs in CQL3
 Key: CASSANDRA-6672
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6672
 Project: Cassandra
  Issue Type: Wish
  Components: Core
Reporter: Drew Kutcharian


Currently CQL3 supports timeuuid-based functions (now, unixtimestampof, 
dateof, ...) that deal with millisecond-resolution time UUIDs. I think it would 
be a good idea to have microsecond-resolution versions of those functions.
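For reference, a version-1 UUID already carries 100-nanosecond resolution, so a microsecond-level equivalent of unixtimestampof can be derived client-side. A sketch under stated assumptions: the `micros_of` helper is hypothetical, not part of CQL, and the epoch constant is the standard RFC 4122 Gregorian/Unix offset:

```python
import uuid

# 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
GREGORIAN_UNIX_OFFSET = 0x01B21DD213814000

def micros_of(tuuid):
    """Return the Unix timestamp of a version-1 (time-based) UUID, in microseconds."""
    if tuuid.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # uuid.UUID.time is the 60-bit count of 100-ns intervals since 1582-10-15.
    return (tuuid.time - GREGORIAN_UNIX_OFFSET) // 10

print(micros_of(uuid.uuid1()) > 0)  # True
```

A server-side version would just be this arithmetic applied to the timeuuid's embedded 60-bit timestamp instead of truncating it to milliseconds.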



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Created] (CASSANDRA-5252) Starting Cassandra throws EOF while reading saved cache

2013-02-13 Thread Drew Kutcharian (JIRA)
Drew Kutcharian created CASSANDRA-5252:
--

 Summary: Starting Cassandra throws EOF while reading saved cache
 Key: CASSANDRA-5252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5252
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Drew Kutcharian
Assignee: Dave Brosius
Priority: Minor
 Fix For: 1.2.1
 Attachments: data.zip

Currently seeing nodes throw an EOF while reading a saved cache on the system 
schema when starting cassandra

 WARN 14:25:54,896 error reading saved cache 
/ssd/saved_caches/system-schema_columns-KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 
org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:278)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:393)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:365)
at org.apache.cassandra.db.Table.initCf(Table.java:334)
at org.apache.cassandra.db.Table.&lt;init&gt;(Table.java:272)
at org.apache.cassandra.db.Table.open(Table.java:102)
at org.apache.cassandra.db.Table.open(Table.java:80)
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:320)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:203)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:395)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:438)


To reproduce: delete all data files, start a cluster, and leave the cluster up 
long enough to build a cache. Run nodetool drain and kill the Cassandra process. 
Start the Cassandra process in the foreground and note the EOF thrown (see above 
for the stack trace).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5252) Starting Cassandra throws EOF while reading saved cache

2013-02-13 Thread Drew Kutcharian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Drew Kutcharian updated CASSANDRA-5252:
---

Attachment: (was: 4916.txt)

 Starting Cassandra throws EOF while reading saved cache
 ---

 Key: CASSANDRA-5252
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5252
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Drew Kutcharian
Assignee: Dave Brosius
Priority: Minor
 Fix For: 1.2.1

 Attachments: data.zip



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5252) Starting Cassandra throws EOF while reading saved cache

2013-02-13 Thread Drew Kutcharian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Drew Kutcharian updated CASSANDRA-5252:
---

Fix Version/s: (was: 1.2.1)
  Description: 
I just saw this exception happen on Cassandra 1.2.1. I thought this was fixed 
by CASSANDRA-4916. Was this part of the 1.2.1 release?

I'm on Mac OS X 10.8.2, Oracle JDK 1.7.0_11, using snappy-java 1.0.5-M3 from 
Maven (not sure if that's the cause).
I'm attaching my data and log directory as data.zip.

{code}
 WARN [main] 2013-02-12 17:50:11,714 AutoSavingCache.java (line 160) error 
reading saved cache /Users/services/cassandra/data/saved_caches/system-schema
_columnfamilies-KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 
org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:277)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:364)
at org.apache.cassandra.db.Table.initCf(Table.java:337)
at org.apache.cassandra.db.Table.&lt;init&gt;(Table.java:280)
at org.apache.cassandra.db.Table.open(Table.java:110)
at org.apache.cassandra.db.Table.open(Table.java:88)
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:421)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:177)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:370)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:413)
 INFO [SSTableBatchOpen:1] 2013-02-12 17:50:11,722 SSTableReader.java (line 
164) Opening /Users/services/cassandra/data/data/system/schema_columns/syste
m-schema_columns-ib-6 (193 bytes)
 INFO [SSTableBatchOpen:2] 2013-02-12 17:50:11,722 SSTableReader.java (line 
164) Opening /Users/services/cassandra/data/data/system/schema_columns/syste
m-schema_columns-ib-5 (3840 bytes)
 INFO [main] 2013-02-12 17:50:11,725 AutoSavingCache.java (line 139) reading 
saved cache /Users/services/cassandra/data/saved_caches/system-schema_colum
ns-KeyCache-b.db
 WARN [main] 2013-02-12 17:50:11,725 AutoSavingCache.java (line 160) error 
reading saved cache /Users/services/cassandra/data/saved_caches/system-schema
_columns-KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 
org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:277)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:364)
at org.apache.cassandra.db.Table.initCf(Table.java:337)
at org.apache.cassandra.db.Table.&lt;init&gt;(Table.java:280)
at org.apache.cassandra.db.Table.open(Table.java:110)
at org.apache.cassandra.db.Table.open(Table.java:88)
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:421)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:177)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:370)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:413)
 INFO [SSTableBatchOpen:1] 2013-02-12 17:50:11,736 SSTableReader.java (line 
164) Opening /Users/services/cassandra/data/data/system/local/system-local-i
b-14 (458 bytes)
 INFO [main] 2013-02-12 17:50:11,738 AutoSavingCache.java (line 139) reading 
saved cache /Users/services/cassandra/data/saved_caches/system-local-KeyCac
he-b.db
 WARN [main] 2013-02-12 17:50:11,739 AutoSavingCache.java (line 160) error 
reading saved cache /Users/services/cassandra/data/saved_caches/system-local-
KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 

[jira] [Updated] (CASSANDRA-5252) Starting Cassandra throws EOF while reading saved cache

2013-02-13 Thread Drew Kutcharian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Drew Kutcharian updated CASSANDRA-5252:
---

Description: 
I just saw this exception happen on Cassandra 1.2.1. I thought this was fixed 
by CASSANDRA-4916. Wasn't CASSANDRA-4916 part of the 1.2.1 release?

I'm on Mac OS X 10.8.2, Oracle JDK 1.7.0_11, using snappy-java 1.0.5-M3 from 
Maven (not sure if that's the cause).
I'm attaching my data and log directory as data.zip.


{code}
 WARN [main] 2013-02-12 17:50:11,714 AutoSavingCache.java (line 160) error 
reading saved cache /Users/services/cassandra/data/saved_caches/system-schema
_columnfamilies-KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 
org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:277)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:364)
at org.apache.cassandra.db.Table.initCf(Table.java:337)
at org.apache.cassandra.db.Table.&lt;init&gt;(Table.java:280)
at org.apache.cassandra.db.Table.open(Table.java:110)
at org.apache.cassandra.db.Table.open(Table.java:88)
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:421)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:177)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:370)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:413)
 INFO [SSTableBatchOpen:1] 2013-02-12 17:50:11,722 SSTableReader.java (line 
164) Opening /Users/services/cassandra/data/data/system/schema_columns/syste
m-schema_columns-ib-6 (193 bytes)
 INFO [SSTableBatchOpen:2] 2013-02-12 17:50:11,722 SSTableReader.java (line 
164) Opening /Users/services/cassandra/data/data/system/schema_columns/syste
m-schema_columns-ib-5 (3840 bytes)
 INFO [main] 2013-02-12 17:50:11,725 AutoSavingCache.java (line 139) reading 
saved cache /Users/services/cassandra/data/saved_caches/system-schema_colum
ns-KeyCache-b.db
 WARN [main] 2013-02-12 17:50:11,725 AutoSavingCache.java (line 160) error 
reading saved cache /Users/services/cassandra/data/saved_caches/system-schema
_columns-KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 
org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:277)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:392)
at 
org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:364)
at org.apache.cassandra.db.Table.initCf(Table.java:337)
at org.apache.cassandra.db.Table.&lt;init&gt;(Table.java:280)
at org.apache.cassandra.db.Table.open(Table.java:110)
at org.apache.cassandra.db.Table.open(Table.java:88)
at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:421)
at 
org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:177)
at 
org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:370)
at 
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:413)
 INFO [SSTableBatchOpen:1] 2013-02-12 17:50:11,736 SSTableReader.java (line 
164) Opening /Users/services/cassandra/data/data/system/local/system-local-i
b-14 (458 bytes)
 INFO [main] 2013-02-12 17:50:11,738 AutoSavingCache.java (line 139) reading 
saved cache /Users/services/cassandra/data/saved_caches/system-local-KeyCac
he-b.db
 WARN [main] 2013-02-12 17:50:11,739 AutoSavingCache.java (line 160) error 
reading saved cache /Users/services/cassandra/data/saved_caches/system-local-
KeyCache-b.db
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
at 
org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
at 
org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
at 
org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
at 
org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:277)

[jira] [Commented] (CASSANDRA-4916) Starting Cassandra throws EOF while reading saved cache

2013-02-12 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13577276#comment-13577276
 ] 

Drew Kutcharian commented on CASSANDRA-4916:


I just saw this same exception happen on Cassandra 1.2.1. Was this part of the 
1.2.1 release?
I'm on Mac OS X 10.8.2, Oracle JDK 1.7.0_11, using snappy-java 1.0.5-M3 from 
Maven (not sure if that's the cause).

I'm attaching my data and log directory as data.zip.


 Starting Cassandra throws EOF while reading saved cache
 ---

 Key: CASSANDRA-4916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4916
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Michael Kjellman
Assignee: Dave Brosius
Priority: Minor
 Fix For: 1.2.1

 Attachments: 4916.txt, data.zip


 Currently seeing nodes throw an EOF while reading a saved cache on the system 
 schema when starting cassandra
  WARN 14:25:54,896 error reading saved cache 
 /ssd/saved_caches/system-schema_columns-KeyCache-b.db
 java.io.EOFException
   at java.io.DataInputStream.readInt(DataInputStream.java:392)
   at 
 org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:349)
   at 
 org.apache.cassandra.service.CacheService$KeyCacheSerializer.deserialize(CacheService.java:378)
   at 
 org.apache.cassandra.cache.AutoSavingCache.loadSaved(AutoSavingCache.java:144)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.&lt;init&gt;(ColumnFamilyStore.java:278)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:393)
   at 
 org.apache.cassandra.db.ColumnFamilyStore.createColumnFamilyStore(ColumnFamilyStore.java:365)
   at org.apache.cassandra.db.Table.initCf(Table.java:334)
   at org.apache.cassandra.db.Table.&lt;init&gt;(Table.java:272)
   at org.apache.cassandra.db.Table.open(Table.java:102)
   at org.apache.cassandra.db.Table.open(Table.java:80)
   at org.apache.cassandra.db.SystemTable.checkHealth(SystemTable.java:320)
   at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:203)
   at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:395)
   at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:438)
 to reproduce delete all data files, start a cluster, leave cluster up long 
 enough to build a cache. nodetool drain, kill cassandra process. start 
 cassandra process in foreground and note EOF thrown (see above for stack 
 trace)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-4916) Starting Cassandra throws EOF while reading saved cache

2013-02-12 Thread Drew Kutcharian (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-4916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Drew Kutcharian updated CASSANDRA-4916:
---

Attachment: data.zip

 Starting Cassandra throws EOF while reading saved cache
 ---

 Key: CASSANDRA-4916
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4916
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Michael Kjellman
Assignee: Dave Brosius
Priority: Minor
 Fix For: 1.2.1

 Attachments: 4916.txt, data.zip



--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (CASSANDRA-4400) Correctly catch exception when Snappy cannot be loaded

2012-12-08 Thread Drew Kutcharian (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13527231#comment-13527231
 ] 

Drew Kutcharian commented on CASSANDRA-4400:


Is there a reason Cassandra's not using the pure Java version of Snappy? 
https://github.com/dain/snappy
The performance numbers are very similar. 
https://github.com/ning/jvm-compressor-benchmark/wiki

 Correctly catch exception when Snappy cannot be loaded
 --

 Key: CASSANDRA-4400
 URL: https://issues.apache.org/jira/browse/CASSANDRA-4400
 Project: Cassandra
  Issue Type: Bug
Affects Versions: 1.1.1
Reporter: Sylvain Lebresne
Assignee: Sylvain Lebresne
Priority: Minor
 Fix For: 1.1.3

 Attachments: 4400.txt


 From the mailing list, on C* 1.1.1:
 {noformat}
 INFO 14:22:07,600 Global memtable threshold is enabled at 35MB
 java.lang.reflect.InvocationTargetException
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:616)
 at 
 org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
 at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
 at org.xerial.snappy.Snappy.&lt;clinit&gt;(Snappy.java:44)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.&lt;clinit&gt;(SnappyCompressor.java:37)
 at org.apache.cassandra.config.CFMetaData.&lt;clinit&gt;(CFMetaData.java:76)
 at 
 org.apache.cassandra.config.KSMetaData.systemKeyspace(KSMetaData.java:79)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:439)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:118)
 at 
 org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:126)
 at 
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:353)
 at 
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:106)
 Caused by: java.lang.UnsatisfiedLinkError: no snappyjava in java.library.path
 at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1681)
 at java.lang.Runtime.loadLibrary0(Runtime.java:840)
 at java.lang.System.loadLibrary(System.java:1047)
 at 
 org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
 ... 17 more
 ERROR 14:22:09,934 Exception encountered during startup
 org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
 at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
 at org.xerial.snappy.Snappy.&lt;clinit&gt;(Snappy.java:44)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.&lt;clinit&gt;(SnappyCompressor.java:37)
 at org.apache.cassandra.config.CFMetaData.&lt;clinit&gt;(CFMetaData.java:76)
 at 
 org.apache.cassandra.config.KSMetaData.systemKeyspace(KSMetaData.java:79)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:439)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.&lt;clinit&gt;(DatabaseDescriptor.java:118)
 at 
 org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:126)
 at 
 org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:353)
 at 
 org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:106)
 org.xerial.snappy.SnappyError: [FAILED_TO_LOAD_NATIVE_LIBRARY] null
 at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:229)
 at org.xerial.snappy.Snappy.&lt;clinit&gt;(Snappy.java:44)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
 at 
 org.apache.cassandra.io.compress.SnappyCompressor.&lt;clinit&gt;(SnappyCompressor.java:37)
 at org.apache.cassandra.config.CFMetaData.&lt;clinit&gt;(CFMetaData.java:76)
 at 
 org.apache.cassandra.config.KSMetaData.systemKeyspace(KSMetaData.java:79)
 at 
 org.apache.cassandra.config.DatabaseDescriptor.loadYaml(DatabaseDescriptor.java:439)