[jira] [Commented] (CASSANDRA-7296) Add CL.COORDINATOR_ONLY

2016-10-10 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563258#comment-15563258
 ] 

 Brian Hess commented on CASSANDRA-7296:


Consistency Level does feel like the right approach. The THIS_ prefix is in 
line with LOCAL_ in that it identifies the locus of nodes that are 
available for consistency. With LOCAL_ONE, we need just one replica from the 
data center of this coordinator. If no replicas exist (as when RF=0 for that 
DC) then you get an UnavailableException. Namely, you don't reach out to other 
nodes and proxy for another DC, etc. Also note that while the client can 
certainly see that it's talking to a DC with no replicas by looking at system 
tables or driver API calls, we still throw the UnavailableException. 

In the THIS_ONE case, we are saying that the locus of available nodes for the 
consistency level is just the coordinator itself. If that node is not a 
replica, then it should also throw an UnavailableException. It should not 
silently go ask the actual replicas, just as in the LOCAL_ONE case we don't 
ask other DCs. While it is true that the client could know that this node is 
not a replica, it is the same situation as LOCAL_ONE with RF=0. 
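A hypothetical sketch of these semantics (plain Python stand-ins, not Cassandra internals; the names `read_this_one`, `replicas_for_key`, and `read_local` are illustrative):

```python
class UnavailableException(Exception):
    """Raised when no node in the consistency locus holds a replica."""

def read_this_one(coordinator, replicas_for_key, read_local):
    # THIS_ONE: the locus of nodes eligible to serve the read is the
    # coordinator itself. If the coordinator is not a replica for the key,
    # fail fast instead of silently proxying to the real replicas --
    # mirroring how LOCAL_ONE refuses to proxy to another DC.
    if coordinator not in replicas_for_key:
        raise UnavailableException(
            f"{coordinator} is not a replica; required=1, alive-in-locus=0")
    return read_local(coordinator)

# Example: node 'n1' coordinates a read for a key replicated on n2/n3.
try:
    read_this_one("n1", {"n2", "n3"}, lambda node: f"row-from-{node}")
except UnavailableException as e:
    print("unavailable:", e)
```

The client could avoid this error by routing only to replica nodes, but as with LOCAL_ONE, the server still enforces the locus.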

> Add CL.COORDINATOR_ONLY
> ---
>
> Key: CASSANDRA-7296
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7296
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Tupshin Harper
>
> For reasons such as CASSANDRA-6340 and similar, it would be nice to have a 
> read that never gets distributed, and only works if the coordinator you are 
> talking to is an owner of the row.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7296) Add CL.COORDINATOR_ONLY

2016-10-10 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15562407#comment-15562407
 ] 

 Brian Hess commented on CASSANDRA-7296:


Just my 2 cents, but having this be a per-table option is not really a great 
solution for the debugging issue.  I'd be okay if we had that as a default, but 
we'd certainly need to support this behavior even if that table option isn't 
set (it would be unfortunate to have to ALTER the table to get this behavior).






[jira] [Commented] (CASSANDRA-12216) TTL Reading And Writing is Asymmetric

2016-07-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384557#comment-15384557
 ] 

 Brian Hess commented on CASSANDRA-12216:
-

This looks like it should apply to previous versions, going back to 2.1 even.  
The only behavior change is an undocumented behavior of throwing an 
InvalidQueryException when providing a null TTL.

> TTL Reading And Writing is Asymmetric 
> --
>
> Key: CASSANDRA-12216
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12216
> Project: Cassandra
>  Issue Type: Bug
>  Components: CQL
>Reporter: Russell Spitzer
>Assignee: Russell Spitzer
>Priority: Minor
> Attachments: 12216-3.7-2.txt, 12216-3.7.txt
>
>
> There is an inherent asymmetry in the way TTLs are read and written. 
> A `TTL` of 0 when written becomes a `null` in C*.
> When read, this `TTL` comes back as a `null`.
> The `null` cannot be written back to C* as a `TTL`.
> This means that end users attempting to copy tables with TTLs have to 
> manually map the null TTL values to 0 to avoid NPEs. This is a bit 
> onerous when C* seems to have internal logic that 0 == NULL. I don't think 
> C* should return values which are not directly insertable back into C*. 
> Even with the advent of CASSANDRA-7304 this remains a problem that the 
> user needs to be aware of and take care of.
> The following prepared statement
> {code}
> INSERT INTO test.table2 (k, v) VALUES (?, ?) USING TTL ?
> {code}
> Will throw NPEs unless we specifically check that the value to be bound to 
> TTL is not null.
> I think we should discuss whether `null` should be treated as 0 in TTL for 
> prepared statements. 





[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-17 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15335986#comment-15335986
 ] 

 Brian Hess commented on CASSANDRA-11873:
-

That largely covers my thoughts, with a couple of things:
1. Change "us" to "u" to be consistent with InfluxDB, and to keep the 
abbreviations consistently one letter.
2. Change "ns" to "n" to keep all the abbreviations consistently one letter. 
3. Change "y" to "Y" so year and month are both capital. I'm not sure exactly, 
but I would be inclined to change "d" to "D" and "w" to "W" so units smaller 
than a day are lower-case and the others are upper-case, for user-friendliness. 

Two other suggestions/questions:
a) Is it possible to change Month or Minute to be something other than M/m so 
that we could make these units case-insensitive? I don't have a great 
suggestion there (maybe M for month and N for minute? Like I said, not great).
b) What CQL type has nanosecond resolution? Is there any way someone can do 
arithmetic on a CQL type where it would do anything with "4ns"? now() is a 
timestamp, which has millisecond resolution. So, I actually don't see a data 
type that is fine-grained enough here. If that's the case we can get 
rid of "ns". I'm not sure what has microsecond resolution either now that I 
look at our docs, so perhaps the same goes for "u". 

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.





[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15334052#comment-15334052
 ] 

 Brian Hess commented on CASSANDRA-11873:
-

I will save the discussion/debate on the relationship between CQL and SQL to 
another venue. The reason to bring it up is in the context of user/developer  
experience and usability. If SQL has an approach then we should consider it, 
but if we can do better then by all means we should do that instead (which I 
think nobody is debating). 

A few comments:
1. We should certainly consider the month and year durations. These are common 
uses and we should at least sketch out how we would support that (if not also 
implement it in this ticket - which I think we should do). 
2. How would we abbreviate the example that Sylvain proposes "1 year 2 months 3 
days 4 hours 5 minutes 6 seconds"? Specifically, what is the abbreviation for 
months and minutes? ISO 8601 has M for both, but the P/T format allows for 
disambiguation. 
3. With respect to ISO 8601 that Postgres does also support, if someone bothers 
to read the CQL documentation on Date formats for Timestamp types he will find 
that it states "A timestamp type can be entered as an integer for CQL input, or 
as a string literal in any of the following ISO 8601 formats" 
(https://docs.datastax.com/en/cql/3.3/cql/cql_reference/timestamp_type_r.html). 
So, C* already chose ISO 8601 for Date formats. For consistency with CQL itself 
we should seriously consider making the same choice for durations. 
4. According to the C* documentation, the TIMESTAMP data type, which is what is 
returned from the now() call, is the "number of milliseconds since the standard 
base time known as the epoch". How are we going to support microseconds and 
nanoseconds? Even Version 1 UUIDs (the UUID/TimeUUID format for C*) don't 
support nanosecond resolution. 
5. If we choose to stick with the current bespoke syntax, I suggest moving at 
least to the Influx format. That leaves 2 items:
a) change microseconds from "us" to "u", which is what Influx uses 
b) support weeks with the "w" abbreviation. 
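The Influx-style table suggested above, with "u" for microseconds and "w" for weeks, might be sketched as follows (hypothetical Python; the unit table and `parse_duration` are illustrative, not any shipped grammar):

```python
import re

# Single-letter units per the Influx convention, values in microseconds.
# "ms" is kept as a two-letter exception, as Influx does.
US = {"u": 1, "ms": 1000, "s": 10**6, "m": 60 * 10**6,
      "h": 3600 * 10**6, "d": 86400 * 10**6, "w": 7 * 86400 * 10**6}

def parse_duration(text):
    """Parse a duration like '2h30m' into microseconds."""
    total, pos = 0, 0
    for m in re.finditer(r"(\d+)(ms|[usmhdw])", text):
        if m.start() != pos:
            raise ValueError(f"bad duration: {text!r}")
        total += int(m.group(1)) * US[m.group(2)]
        pos = m.end()
    if pos != len(text) or pos == 0:
        raise ValueError(f"bad duration: {text!r}")
    return total

print(parse_duration("2h"))  # 7200000000
print(parse_duration("1w"))  # 604800000000
```

Note the parser rejects "10us": it reads "u" as microseconds and then finds a stray "s", which is exactly the kind of subtle inconsistency with Influx's format being flagged here.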

> Add duration type
> -
>
> Key: CASSANDRA-11873
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11873
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>  Labels: client-impacting, doc-impacting
> Fix For: 3.x
>
>
> For CASSANDRA-11871 or to allow queries with {{WHERE}} clause like:
> {{... WHERE reading_time < now() - 2h}}, we need to support some duration 
> type.
> In my opinion, it should be represented internally as a number of 
> microseconds.





[jira] [Comment Edited] (CASSANDRA-11873) Add duration type

2016-06-16 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15333529#comment-15333529
 ] 

 Brian Hess edited comment on CASSANDRA-11873 at 6/16/16 10:27 AM:
---

Being subtly different on syntax is in some cases worse than being very 
different. So, if we are thinking we will go with the ISO 8601 format (an 
option that could make sense - it is a widely recognized format and present in 
more than a few systems (not just databases, I mean)), then we should make sure 
we include the "P" and the "T". 

While Postgres does support ISO 8601 formats (of course I bothered to read it), 
in that format the highest resolution is seconds. There is a good reason to 
want milliseconds and microseconds (and maybe nanoseconds). The standard 
Postgres format supports all of these (with the exception of nanoseconds, 
though that addition to the format would be straightforward to understand). If 
you want to shorten the Postgres format to save typing, what abbreviation do 
you propose for "minute" and "month"?

I will certainly agree that the Oracle syntax is not user-friendly. I think 
arguing it is desirable is a stretch. 

I have a reservation about the Influx syntax here, though. Influx does not 
support month or year; it only goes up to weeks 
(https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time).
 So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 
year, what would you do? "now() - 365d"? What about leap years? What about 
going back one month?
In fact, if one bothers to read 
https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#time-syntax-in-queries,
 he would find that this patch supports only a subset of Influx's format. 
I don't see a week unit. Moreover, Influx doesn't use "us"; it uses just "u". 
So, our proposed syntax isn't even consistent (in subtle ways) with Influx's 
format, let alone that Influx's format is itself incomplete (specifically, no 
support for months and years). 

Of the formats I've seen here, Postgres native format is the most user 
friendly, and accomplishes the goals of durations for us. I'm (non-PMC, 
non-binding) -1 on the currently proposed format from a 
usability/product/developer POV. 
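The P/T disambiguation argued for above can be made concrete with a small sketch (hypothetical Python; `parse_iso8601_duration` is illustrative and deliberately loose - for instance it accepts W alongside other units, which strict ISO 8601 does not):

```python
import re

def parse_iso8601_duration(text):
    """Split an ISO 8601 duration into named parts. 'M' before the 'T'
    separator means months; after it, minutes -- the disambiguation the
    P/T format provides for the two meanings of M."""
    m = re.fullmatch(
        r"P(?:(\d+)Y)?(?:(\d+)M)?(?:(\d+)W)?(?:(\d+)D)?"
        r"(?:T(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?)?", text)
    if not m or text in ("P", "PT"):
        raise ValueError(f"bad ISO 8601 duration: {text!r}")
    names = ("years", "months", "weeks", "days", "hours", "minutes", "seconds")
    return {n: int(v) for n, v in zip(names, m.groups()) if v is not None}

print(parse_iso8601_duration("P1Y2M3DT4H5M6S"))
# {'years': 1, 'months': 2, 'days': 3, 'hours': 4, 'minutes': 5, 'seconds': 6}
```

So "P5M" is five months while "PT5M" is five minutes; without the P/T markers, any single-letter scheme has to resolve that collision some other way.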


was (Author: brianmhess):
Being subtlety different on syntax is in some cases worse than being very 
different. So, if we are thinking we will go with ISO 8601 format (an option 
that could make sense - it is a widely recognized format and present in more 
than a few systems (not just databases, I mean)) then we should make sure we 
include the "P" and the "T". 

While Postgres does support ISO 8601 formats (of course I bothered to read it), 
in that format the highest resolution is seconds. There is a good reason to 
want milliseconds and microseconds (and maybe nanoseconds). The standard 
Postgres format support all of these (with the exception of nanoseconds, though 
that addition to their format would be straightforward to understand). If you 
want to shorten the Postgres format to save typing, what abbreviation do you 
propose for "minute" and "month"?

I will certainly agree that the Oracle syntax is not user-friendly. I think 
arguing it is desirable is a stretch. 

I have a reservation on the Influx syntax here, though. Influx does not support 
month or year. They only have up to week 
(https://docs.influxdata.com/influxdb/v0.13/query_language/data_exploration/#relative-time).
 So, it is not possible to say "now() - 2 months" or "now() - 1 year". To do 1 
year, what would you do? "now() - 365d"? What about leap year? What about going 
back one month?
In fact, this patch only had a subset of Influx's supported format. I don't see 
a week unit. Moreover, Influx doesn't use "us", it uses just "u". So, our 
proposed syntax isn't even consistent (in subtle ways) with Influx's format. 
Let alone that Influx's format is incomplete (specifically, no support for 
months and years). 

Of the formats I've seen here, Postgres native format is the most user 
friendly, and accomplishes the goals of durations for us. I'm (non-PMC, 
non-binding) -1 on the currently proposed format from a 
usability/product/developer POV. 






[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-15 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15332685#comment-15332685
 ] 

 Brian Hess commented on CASSANDRA-11873:
-

So, one question about syntax.  SQL has a time interval type with its own 
syntax.  Instead of inventing additional syntax, what about adopting the 
Postgres or Oracle/SQL-standard syntax?

For example, Postgres uses "1 day ago" or "-1 day" (or hour, minute, etc): 
https://www.postgresql.org/docs/current/static/datatype-datetime.html

Oracle uses "INTERVAL '1' HOUR": 
https://docs.oracle.com/cd/B28359_01/server.111/b28286/sql_elements003.htm#i38598

I suggest we choose something similar to one that exists, rather than create 
our own new syntax.
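The Postgres verbose form referenced above might be sketched as a parser (hypothetical Python; `parse_pg_interval` and its unit list are illustrative, not Postgres' actual interval grammar):

```python
import re

UNITS = {"year": "years", "month": "months", "week": "weeks", "day": "days",
         "hour": "hours", "minute": "minutes", "second": "seconds"}

def parse_pg_interval(text):
    """Parse Postgres-style verbose intervals like '1 year 2 months 3 days'.
    Spelled-out unit names sidestep the M-for-month vs. M-for-minute
    abbreviation clash entirely."""
    parts = {}
    for qty, unit in re.findall(
            r"(-?\d+)\s*(year|month|week|day|hour|minute|second)s?\b", text):
        parts[UNITS[unit]] = parts.get(UNITS[unit], 0) + int(qty)
    if not parts:
        raise ValueError(f"bad interval: {text!r}")
    return parts

print(parse_pg_interval("1 year 2 months 3 days"))
# {'years': 1, 'months': 2, 'days': 3}
```

The verbosity costs keystrokes, but every unit is unambiguous, which is the usability argument made for this format later in the thread.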







[jira] [Commented] (CASSANDRA-11873) Add duration type

2016-06-14 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15330154#comment-15330154
 ] 

 Brian Hess commented on CASSANDRA-11873:
-

This raises an interesting question.  Will you be able to bind to the duration? 
 As in:
session.prepare("SELECT * FROM myTable WHERE pkey=5 AND clustCol > now() - ?")
or
session.prepare("SELECT * FROM myTable WHERE pkey=5 AND clustCol > now() - ?h")

Because you can do the following, right?
session.prepare("SELECT * FROM myTable WHERE pkey=5 AND clustCol > 
DateDiff(now, ?)")






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320976#comment-15320976
 ] 

 Brian Hess commented on CASSANDRA-7622:


+1

> Implement virtual tables
> 
>
> Key: CASSANDRA-7622
> URL: https://issues.apache.org/jira/browse/CASSANDRA-7622
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Tupshin Harper
>Assignee: Jeff Jirsa
> Fix For: 3.x
>
>
> There are a variety of reasons to want virtual tables, which would be any 
> table that would be backed by an API, rather than data explicitly managed and 
> stored as sstables.
> One possible use case would be to expose JMX data through CQL as a 
> resurrection of CASSANDRA-3527.
> Another is a more general framework to implement the ability to expose yaml 
> configuration information. So it would be an alternate approach to 
> CASSANDRA-7370.
> A possible implementation would be in terms of CASSANDRA-7443, but I am not 
> presupposing.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320933#comment-15320933
 ] 

 Brian Hess commented on CASSANDRA-7622:


For metrics, Postgres appears to use a SELECT syntax. For system settings they 
use SHOW/SET. 






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320913#comment-15320913
 ] 

 Brian Hess commented on CASSANDRA-7622:


I think this is thinking a bit too narrowly. I can see many use cases for 
programmatic access to these values. If SHOW VARIABLE is useful in cqlsh, it 
could have special cqlsh commands that convert into this. 
If you want to mix regular CQL and these, you'd need different Sessions, as 
you'd need separate Cluster objects (so you can have a WhiteListPolicy for the 
one node of interest). Or if you want to look at the metrics for multiple 
nodes, you would need one Cluster object per node.  






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320876#comment-15320876
 ] 

 Brian Hess commented on CASSANDRA-7622:


Jinx






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320874#comment-15320874
 ] 

 Brian Hess commented on CASSANDRA-7622:


If the plan is to enable writing to these virtual tables, then we should think 
more about a SELECT-type syntax. That would make future UPDATE operations make 
a lot more sense. And if we are thinking that eventually we will enable 
user-defined VTs, then INSERT and DELETE operations may make sense. There 
really is no need for special syntax here, so I'd caution against that. 

For that reason, I'd suggest we stick with SELECT syntax even in this first use 
case - in a forward-thinking way. 






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15320846#comment-15320846
 ] 

 Brian Hess commented on CASSANDRA-7622:


I'm still curious how you would get the JMX values (SHOW VARIABLES or whatever) 
for a particular node. That is, how would this look syntactically (WHERE 
clause? Special CQL?)? And if it's something like a WHERE clause, then how will 
the driver route the query correctly? Would it be through a custom 
LoadBalancingPolicy? If so, then it could be hard to mix normal CQL with these 
(as you'd want a different LBP for the normal CQL queries). 






[jira] [Commented] (CASSANDRA-7622) Implement virtual tables

2016-06-06 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15316875#comment-15316875
 ] 

 Brian Hess commented on CASSANDRA-7622:


It appears there are a few different things that could be accomplished with 
Virtual Tables, and I'm curious which one this ticket will focus on - and I'd 
suggest that we consider breaking out the full set of things below into 
separate tickets as I think some deserve more discussion/debate than others.
1. CQL access to internal metrics and management.  This would be access to 
metrics (such as JMX) and management settings (such as cassandra.yaml 
settings).  These could be covered by read-only access to built-in virtual 
tables (update/write access is below).
2. CQL access to updating management settings.  This is different than 
management actions (such as `nodetool flush` or repair).
3. CQL access to management actions.  This is the category of things such as 
`nodetool flush`, repair, drain, etc.
4. CQL access to arbitrary classes that implement user-defined behavior.

I think that 1 is the least controversial, and clearly of high value.  My 
suggestion is that this be built-in Virtual Tables, perhaps in system._vt_*.  
For example, system._vt_metrics as SELECT access to the JMX metrics, or 
system._vt_settings as SELECT access to yaml/command-line/etc settings.  These 
would be predefined virtual tables.  I am concerned about how we will specify 
which node's metrics we are querying, and how we route the query to that node 
(more on this below - it applies to all 4 scenarios, and more if we come up 
with more).

I also think that 2 is not very controversial.  This would allow updating 
settings - for example, via a CQL UPDATE command (e.g., `UPDATE 
system._vt_settings SET max_hint_window_in_ms = 1080 WHERE node='1.2.3.4'` 
- for illustrative purposes).  The reason this is different from 3 is that 
after doing the update, a read will return the same value, since it was 
changed.  Would we want to support `UPDATE system._vt_settings SET 
max_hint_window_in_ms = 1080` to update all nodes, which is not valid CQL 
since we don't specify a partition key?  

I'm not sure about 3, and I think it should be discussed further.  The thing 
is, for something like `ALTER system_virtual.actions SET drain=true` (or really 
`UPDATE system_virtual.actions SET drain=true`), what would the value be if we 
SELECT it after the ALTER/UPDATE?  It feels like it would be set back to FALSE 
after the drain, which semantically feels strange.  I like [~snazy]'s 
suggestion of completely separate CQL syntax for these actions.  Maybe 
something like `EXECUTE ACTION drain ON '1.2.3.4'` - again, syntax just for 
illustrative purposes.  The semantics here is that the system is being altered, 
but outside a table, so a SELECT doesn't have the confusion above.  As 
[~snazy] pointed out, we would also need to make sure we are careful about 
access to this capability (perhaps user/role permissions are sufficient, and we 
just need to add new permission types).

4 is probably the one that causes the most pause, IMO, as there would be 
arbitrary user code running in the read and write path.  We need to take care 
with that - as we have with UDFs.  It isn't exactly clear to me what the goal 
is for arbitrary virtual tables.  What are we trying to accomplish with this 
extension point?

As I mentioned above, I'm curious how we will route the query (SELECT, UPDATE, 
EXECUTE, whatever) to the right node.  In the simple case of SELECTing JMX 
metrics, how would we query the JMX metrics of node 1.2.3.4?  These 
queries/operations are per-node, not cluster-wide, so we need to think about 
that.  Is the node the partition key of the virtual table (`PARTITION BY 
((node), ...)`)?  How would we route that internally in CQL (and with 
awareness on the driver side)?  

It sounds like it would require a special partitioner (like system tables have 
- or maybe ByteOrderedPartitioner could work with a custom replication 
strategy, but with Murmur3Partitioner things are hard (it's essentially hash 
inversion)) and possibly a replication strategy (I don't think a replication 
strategy alone would cover this, right [~jjordan]?).  If these are fully 
user-defined (as [~jjirsa] suggests), then we need a way to specify 
partitioners at the table (or at least keyspace) level, as opposed to 
cluster-wide (which would actually be a nice change, IMO).  If they are 
built-in, then we can cover this case like we do with system keyspace tables.

Another approach may be to have the drivers deal with the routing.  The Virtual 
Tables could be special cases for routing purposes and have an API call to 
specify which node to query - or the LoadBalancingPolicy could inspect the 
Statement and route appropriately, which could work for any/all partitioners 
and/or re
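The driver-side idea could be sketched like this; note this is a toy stand-in in plain Python, not the real driver's LoadBalancingPolicy API, and all names are illustrative:

```python
# Toy stand-in for a driver load-balancing policy that routes virtual-table
# statements to an explicitly requested node; not a real driver API.
class VirtualTableRoutingPolicy:
    def __init__(self, hosts):
        self.hosts = list(hosts)

    def query_plan(self, statement):
        # statement: dict with an optional "target_node" for virtual tables
        target = statement.get("target_node")
        if target is not None and target in self.hosts:
            # try the requested node first, then fall back to the rest
            return [target] + [h for h in self.hosts if h != target]
        return list(self.hosts)

policy = VirtualTableRoutingPolicy(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
plan = policy.query_plan(
    {"cql": "SELECT * FROM system._vt_metrics", "target_node": "10.0.0.2"})
# plan tries 10.0.0.2 first, keeping the other hosts as fallbacks
```

The appeal of this approach is that it works regardless of partitioner, since the routing decision never touches token ownership.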

[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2016-06-03 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314193#comment-15314193
 ] 

 Brian Hess commented on CASSANDRA-10783:
-

[~slebresne] - okay, but let's open that ticket, then.

With respect to function resolution with literals (or non-literals, for that 
matter), and the function-overloading topic you point to (with your Add 
example), Postgres has some guidance on what can/should be done.  It would be 
better to match *something* than to create something completely different.

> Allow literal value as parameter of UDF & UDA
> -
>
> Key: CASSANDRA-10783
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10783
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: DOAN DuyHai
>Assignee: Robert Stupp
>Priority: Minor
>  Labels: CQL3, UDF, client-impacting, doc-impacting
> Fix For: 3.x
>
>
> I have defined the following UDF
> {code:sql}
> CREATE OR REPLACE FUNCTION  maxOf(current int, testValue int) RETURNS NULL ON 
> NULL INPUT 
> RETURNS int 
> LANGUAGE java 
> AS  'return Math.max(current,testValue);'
> CREATE TABLE maxValue(id int primary key, val int);
> INSERT INTO maxValue(id, val) VALUES(1, 100);
> SELECT maxOf(val, 101) FROM maxValue WHERE id=1;
> {code}
> I got the following error message:
> {code}
> SyntaxException: <ErrorMessage code=2000 [Syntax error in CQL query] 
> message="line 1:19 no viable alternative at input '101' (SELECT maxOf(val1, 
> [101]...)">
> {code}
>  It would be nice to allow literal value as parameter of UDF and UDA too.
>  I was thinking about a use case for a UDA groupBy() function where the end 
> user can *inject* at runtime a literal value to select which aggregation he 
> wants to display, something similar to GROUP BY ... HAVING 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10783) Allow literal value as parameter of UDF & UDA

2016-06-03 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15314150#comment-15314150
 ] 

 Brian Hess commented on CASSANDRA-10783:
-

So, I think it would be good to pick up this topic in this ticket.  
Furthermore, I think there is some direction to take from SQL in terms of 
implicit types.

Looking at Postgres' documentation (here: 
https://www.postgresql.org/docs/current/static/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS),
 they do the following

{quote}
A numeric constant that contains neither a decimal point nor an exponent is 
initially presumed to be type integer if its value fits in type integer (32 
bits); otherwise it is presumed to be type bigint if its value fits in type 
bigint (64 bits); otherwise it is taken to be type numeric. Constants that 
contain decimal points and/or exponents are always initially presumed to be 
type numeric.
{quote}
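The quoted rule can be sketched directly; the function name and the returned type labels below are illustrative, not Postgres internals:

```python
# Sketch of the Postgres rule quoted above: an integer-looking constant is
# typed integer (32-bit) if it fits, else bigint (64-bit), else numeric;
# anything with a decimal point or exponent is numeric from the start.
def literal_type(text):
    try:
        value = int(text)  # fails for '4.5' or '1e3'
    except ValueError:
        return "numeric"
    if -2**31 <= value < 2**31:
        return "integer"
    if -2**63 <= value < 2**63:
        return "bigint"
    return "numeric"

print(literal_type("101"))        # integer
print(literal_type("4.5"))        # numeric
```

Under such a rule, the `maxOf(val, 101)` example would resolve the literal to an int argument without any explicit cast.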

As this pertains to functions and function overloading, the appropriate 
Postgres documentation is here: 
https://www.postgresql.org/docs/current/static/typeconv-func.html







[jira] [Commented] (CASSANDRA-11871) Allow to aggregate by time intervals

2016-05-31 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307722#comment-15307722
 ] 

 Brian Hess commented on CASSANDRA-11871:
-

I also agree with [~slebresne] that it would be a shame if only the one 
function were supported, and am +1 on adding a property to built-in and 
user-defined functions that indicates whether the function is monotonic.

> Allow to aggregate by time intervals
> 
>
> Key: CASSANDRA-11871
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11871
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
> Fix For: 3.x
>
>
> For time series data it can be useful to aggregate by time intervals.
> The idea would be to add support for one or several functions in the {{GROUP 
> BY}} clause.
> Regarding the implementation, even if in general I also prefer to follow the 
> SQL syntax, I do not believe it will be a good fit for Cassandra.
> If we have a table like:
> {code}
> CREATE TABLE trades
> {
> symbol text,
> date date,
> time time,
> priceMantissa int,
> priceExponent tinyint,
> volume int,
> PRIMARY KEY ((symbol, date), time)
> };
> {code}
> The trades will be inserted with an increasing time and sorted in the same 
> order. As we may have to process a large amount of data, we want to try to 
> limit ourselves to the cases where we can build the groups on the fly (which 
> is not a requirement in the SQL world).
> If we want to get the number of trades per minutes with the SQL syntax we 
> will have to write:
> {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
> AND date = '2016-01-11' GROUP BY hour(time), minute(time);}}
> which is fine. The problem is that if the user inverts the functions by 
> mistake, like this:
> {{SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
> AND date = '2016-01-11' GROUP BY minute(time), hour(time);}}
> the query will return weird results.
> The only way to prevent that would be to check the function order and make 
> sure that we do not allow skipping functions (e.g. {{GROUP BY hour(time), 
> second(time)}}).
> In my opinion a function like {{floor(<column>, <duration>)}} would be 
> much better, as it does not allow for this type of mistake and is much more 
> flexible (you can create 5-minute buckets if you want to).
> {{SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = 
> '2016-01-11' GROUP BY floor(time, m);}}
> An important aspect to keep in mind with a function like {{floor}} is the 
> starting point. For a query like:  {{SELECT floor(time, m), count() FROM 
> Trades WHERE symbol = 'AAPL' AND date = '2016-01-11' AND time >= '01:30:00' 
> AND time <= '07:30:00' GROUP BY floor(time, 2h);}}, I think that ideally the 
> result should return 3 groups: {{01:30:00}}, {{03:30:00}} and {{05:30:00}}.  
>  





[jira] [Commented] (CASSANDRA-11871) Allow to aggregate by time intervals

2016-05-31 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307720#comment-15307720
 ] 

 Brian Hess commented on CASSANDRA-11871:
-

It may not be totally important here, but the following are not the same thing:
1. SELECT hour(time), minute(time), count() FROM Trades WHERE symbol = 'AAPL' 
AND date = '2016-01-11' GROUP BY minute(time);
2. SELECT floor(time, m), count() FROM Trades WHERE symbol = 'AAPL' AND date = 
'2016-01-11' GROUP BY floor(time, m);

In the first we would get exactly 60 groups (one for every minute of the hour, 
assuming data for every minute of the hour).  That is, data for 12:01, 1:01, 
2:01, etc. would all be in the same bucket.
In the second we would get 1440 groups (one for every minute of the day, 
assuming data for every minute of the day).  That is, data for 12:01, 1:01, and 
2:01 would each be in their own group.

Sometimes people will want to group by both the Hour and the Minute (and other 
combinations, but I'm just following the example here) and get 1440 groups, and 
other times they really will want to group by the Minute and get 60 groups.  
Both use cases are necessary.

It is worth noting that the Minute (and Hour and Day, etc.) functions are *not* 
monotonic.  When the time wraps around, the value drops from 59 to 0 (or 11 to 
0, or 28/30/31 to 1, etc.).  The Floor function, however, is monotonic, because 
what you get is the full date/time truncated to the desired resolution (no 
wrapping).
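The wrap-around versus truncation point can be demonstrated in a few lines; `floor_minute` here is just a stand-in for the proposed floor function at one-minute resolution:

```python
from datetime import datetime, timedelta

# minute() wraps from 59 back to 0 across an hour boundary, so it is not
# monotonic; truncating the full timestamp (floor) never decreases.
def floor_minute(ts):
    return ts.replace(second=0, microsecond=0)

times = [datetime(2016, 1, 11, 12, 58) + timedelta(minutes=i) for i in range(5)]
minutes = [t.minute for t in times]        # [58, 59, 0, 1, 2] -- wraps at 13:00
floors = [floor_minute(t) for t in times]  # strictly increasing timestamps
assert minutes != sorted(minutes)          # not monotonic
assert floors == sorted(floors)            # monotonic
```

This is exactly why a streaming GROUP BY can rely on floor but not on minute: the floor values arrive in the same order as the rows.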

With respect to the starting ranges for things like "WHERE time > now - 10h 
GROUP BY Floor(time, 1h)", I think Benjamin is right in saying that if now is 
12:21, the starting point is 2:21 and the first bucket starts at 2:00.  The 
strange thing is that this first time bucket (2:00) will only be partially 
filled, since it will only have the data from 2:21-2:59, whereas all the other 
buckets (except the 12:00 one, which only has data for 12:00-12:21) will be 
full hours.  Of course, the proper way to specify that is probably "WHERE time 
> Floor(now - 10h, 1h) GROUP BY Floor(time, 1h)" - will that be supported 
[~blerer]?






[jira] [Commented] (CASSANDRA-8488) Filter by UDF

2016-04-29 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15264523#comment-15264523
 ] 

 Brian Hess commented on CASSANDRA-8488:


There is a way around CASSANDRA-8273 for UDFs, but it might not be the best 
(though, then again, there doesn't appear to be a solution for 8273).  We could 
skip replica-based filtering when the filter is a UDF.  That is, you would 
bring all rows that meet the *other* filters back to the coordinator and do the 
filtering of the UDF(s) on the coordinator after the ConsistencyLevel is 
applied.
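A minimal sketch of that split, with purely illustrative names (replica_scan and coordinator_filter are not real Cassandra functions):

```python
# Sketch of the proposed split: replicas apply the ordinary filters, and the
# coordinator applies UDF filters only after rows are merged at the
# requested consistency level. All names here are illustrative.
def replica_scan(rows, other_filters):
    # pushed down to each replica: cheap, indexable predicates
    return [r for r in rows if all(f(r) for f in other_filters)]

def coordinator_filter(merged_rows, udf_filters):
    # applied once, after the CL merge, so user code never runs on replicas
    return [r for r in merged_rows if all(f(r) for f in udf_filters)]

rows = [{"pk": 1, "x": 5}, {"pk": 2, "x": 12}, {"pk": 3, "x": 20}]
from_replicas = replica_scan(rows, [lambda r: r["x"] > 4])
result = coordinator_filter(from_replicas, [lambda r: r["x"] % 2 == 0])
```

The trade-off is clear from the shape of the code: more rows cross the network, but arbitrary user code stays off the replicas' read path.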

> Filter by UDF
> -
>
> Key: CASSANDRA-8488
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8488
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL, Local Write-Read Paths
>Reporter: Jonathan Ellis
>  Labels: client-impacting, cql, udf
> Fix For: 3.x
>
>
> Allow user-defined functions in WHERE clause with ALLOW FILTERING.





[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-04-21 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15251904#comment-15251904
 ] 

 Brian Hess commented on CASSANDRA-8844:


I think that restricting a user from ALTERing any table inside a 
keyspace that has CDC enabled is a bit too much.  Additionally, I see use cases 
where the keyspace already exists and is in use, and then a user layers in CDC - 
namely, CDC is enabled after the keyspace exists.  So, I'm -1 on saying 
that CDC needs to be there at keyspace creation time.  One big reason is that 
if I want to add CDC later, there isn't a good way in Cassandra to create a new 
keyspace (with CDC enabled) and then copy all the data from the first one to 
the second one (no INSERT-SELECT in Cassandra).  So, I'd be stuck.

As for the durability question, I think we should throw an error if someone 
wants to set the keyspace to have DURABLE_WRITES=false and CDC enabled - or to 
alter the keyspace to put it in that position.  I do not like the idea of 
automagically changing the DURABLE_WRITES=true for the user.  I don't like that 
surprise behavior.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to guage 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Casandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles w

[jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests

2016-04-11 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15235653#comment-15235653
 ] 

 Brian Hess commented on CASSANDRA-11521:
-

This is configurable, but the default for reads is LOCAL_ONE and for writes is 
LOCAL_QUORUM. 

> Implement streaming for bulk read requests
> --
>
> Key: CASSANDRA-11521
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11521
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Local Write-Read Paths
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 3.x
>
>
> Allow clients to stream data from a C* host, bypassing the coordination layer 
> and eliminating the need to query individual pages one by one.





[jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests

2016-04-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232771#comment-15232771
 ] 

 Brian Hess commented on CASSANDRA-11521:
-

Ah - that's a good point (about internal things for other CLs).






[jira] [Commented] (CASSANDRA-11521) Implement streaming for bulk read requests

2016-04-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15232715#comment-15232715
 ] 

 Brian Hess commented on CASSANDRA-11521:
-

[~slebresne] - I like this new approach better.  I think it simplifies things a 
bit, and I'm worried about the server easily overpowering the client, which I 
think could be really easy to do (then we'd have to think about things like 
back-pressure, etc.).  There could be a way to tell the server that the client 
is going to ask for all (or a lot) of the pages, so keep this stuff ready to 
flow, etc.  Additionally, we could have a setting that tells the server "if 
you haven't heard me ask for the next page (or send some heartbeat) within X 
time, then feel free to clean things up and throw an error if I ask for the 
next page later", or something, so that we don't have resources tied up if 
the client dies.
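A sketch of that heartbeat/timeout safeguard, with hypothetical names throughout:

```python
import time

# Sketch of the suggested server-side safeguard: each streaming session
# records its last heartbeat, and a reaper frees sessions idle longer than
# a deadline; later page requests for a reaped session would then fail.
class SessionRegistry:
    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_seen = {}

    def heartbeat(self, session_id, now=None):
        self.last_seen[session_id] = now if now is not None else time.time()

    def reap(self, now=None):
        now = now if now is not None else time.time()
        dead = [s for s, t in self.last_seen.items() if now - t > self.timeout_s]
        for s in dead:
            del self.last_seen[s]  # free the session's resources
        return dead

registry = SessionRegistry(timeout_s=30)
registry.heartbeat("bulk-read-1", now=100.0)
registry.heartbeat("bulk-read-2", now=125.0)
reaped = registry.reap(now=140.0)  # only bulk-read-1 is idle > 30s
```

Passing `now` explicitly keeps the sketch deterministic; a real server would just use the clock.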

WRT CL, with this approach, I don't quite see why you would have to stick to 
CL_ONE here.  That said, starting with CL_ONE and "growing" to other CLs is 
probably okay.  I'm just not entirely sure what it gains given this new 
approach.






[jira] [Commented] (CASSANDRA-2527) Add ability to snapshot data as input to hadoop jobs

2016-03-18 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-2527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202252#comment-15202252
 ] 

 Brian Hess commented on CASSANDRA-2527:


[~iamaleksey] - this ticket isn't only applicable to Hadoop, but to any 
analytics engine, including Spark.  I think we should either reopen this ticket 
and retarget to Spark (though the same mechanism would be interesting to 
Hadoop, Flink, etc) or open a new one with the word "Spark" replacing the word 
"Hadoop", but otherwise the exact same idea.

> Add ability to snapshot data as input to hadoop jobs
> 
>
> Key: CASSANDRA-2527
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2527
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jeremy Hanna
>Priority: Minor
>  Labels: hadoop
>
> It is desirable to have immutable inputs to hadoop jobs for the duration of 
> the job.  That way re-execution of individual tasks do not alter the output.  
> One way to accomplish this would be to snapshot the data that is used as 
> input to a job.





[jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement

2016-03-04 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15179992#comment-15179992
 ] 

 Brian Hess commented on CASSANDRA-10707:
-

A quick question/clarification on the GROUP BY and ORDER BY discussion from 
above.  The following are valid SQL:

{code:SQL}
SELECT pkey, ccol1, Max(x) As maxx FROM myTable GROUP BY pkey, ccol1 ORDER BY 
ccol1, pkey;
SELECT pkey, ccol1, Max(x) As maxx FROM myTable GROUP BY pkey, ccol1 ORDER BY 
maxx;
{code}

I think you are suggesting that the only real ordering that is allowed is the 
native ordering in the CQL tables.  Specifically, these 2 queries would not be 
supported.  Is that correct?

I think the logic is more that the following CQL

{code:SQL}
SELECT pkey, ccol1, Max(x) AS maxx FROM myTable GROUP BY pkey, ccol1 ORDER BY 
pkey, ccol1
{code}

turns into

{code:SQL}
SELECT pkey, ccol1, Max(x) AS maxx FROM (
  SELECT pkey, ccol1, x FROM myTable ORDER BY pkey, ccol1
) AS sub1 GROUP BY pkey, ccol1;
{code}

That means that the ORDER BY clause must work in that inner query as valid CQL. 
 More generally:

{code:SQL}
SELECT [grouping columns], [aggregate function]([aggregate columns]) FROM 
[table] GROUP BY [grouping columns] ORDER BY [ordering columns]
{code}

Must satisfy the transformation to:

{code:SQL}
SELECT [grouping columns], [aggregate function]([aggregate columns]) FROM (
  SELECT [grouping columns], [aggregate columns] FROM [table] ORDER BY 
[ordering columns]
) AS sub1 GROUP BY [grouping columns]
{code}

And specifically that the inner sub-query is valid CQL, namely:

{code:SQL}
SELECT [grouping columns], [aggregate columns] FROM [table] ORDER BY [ordering 
columns]
{code}

That is certainly different from SQL, which does not have this restriction.  
I'm +0.5 on having the syntax be the same as SQL, as I think it is slightly 
better than the alternative.  I'm just noting that the semantics really are a 
bit different, and there are more restrictions on the ORDER BY clause in CQL 
with this ticket than in SQL.  That nuance needs to be called out in the 
documentation or folks will certainly run into the error.

I would also add that if someone uses an incorrect ORDER BY, the error should 
not only call out that it is an error, but also indicate what sorts of ORDER BY 
clauses are supported.
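The restriction can be seen with a streaming group-by, which only works when rows arrive pre-sorted on the grouping key (toy code, not Cassandra's implementation):

```python
from itertools import groupby

# The restriction discussed above in miniature: a streaming GROUP BY can
# only be built "on the fly" if rows arrive already sorted on the grouping
# key, which is why CQL ties GROUP BY to the table's native ordering.
def grouped_max(rows, key_cols, agg_col):
    keyfn = lambda r: tuple(r[c] for c in key_cols)
    assert rows == sorted(rows, key=keyfn), "input must be pre-sorted"
    # groupby only merges *adjacent* equal keys, hence the sort requirement
    return [(k, max(r[agg_col] for r in grp))
            for k, grp in groupby(rows, key=keyfn)]

rows = [{"pkey": 1, "ccol1": 1, "x": 7}, {"pkey": 1, "ccol1": 1, "x": 9},
        {"pkey": 1, "ccol1": 2, "x": 3}]
result = grouped_max(rows, ["pkey", "ccol1"], "x")  # [((1, 1), 9), ((1, 2), 3)]
```

An ORDER BY that the storage layout cannot satisfy would break the pre-sorted precondition, which is exactly the sub-query validity constraint described above.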

> Add support for Group By to Select statement
> 
>
> Key: CASSANDRA-10707
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> Now that Cassandra support aggregate functions, it makes sense to support 
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the 
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP 
> BY partitionKey, clustering0, clustering1; 
> {code}





[jira] [Commented] (CASSANDRA-11053) COPY FROM on large datasets: fix progress report and debug performance

2016-02-03 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15130553#comment-15130553
 ] 

 Brian Hess commented on CASSANDRA-11053:
-

[~Stefania] - I believe that is the correct invocation.  However, that 
performance is about half of what I saw when I ran against the same setup 
(r3.2xlarge client writing to a cluster of 8 i2.2xlarge machines).  The rate I 
had was 98K writes/sec (total time was 209 seconds).

> COPY FROM on large datasets: fix progress report and debug performance
> --
>
> Key: CASSANDRA-11053
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11053
> Project: Cassandra
>  Issue Type: Bug
>  Components: Tools
>Reporter: Stefania
>Assignee: Stefania
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.x
>
> Attachments: copy_from_large_benchmark.txt, 
> copy_from_large_benchmark_2.txt, parent_profile.txt, parent_profile_2.txt, 
> worker_profiles.txt, worker_profiles_2.txt
>
>
> Running COPY from on a large dataset (20G divided in 20M records) revealed 
> two issues:
> * The progress report is incorrect, it is very slow until almost the end of 
> the test at which point it catches up extremely quickly.
> * The performance in rows per second is similar to running smaller tests with 
> a smaller cluster locally (approx 35,000 rows per second). As a comparison, 
> cassandra-stress manages 50,000 rows per second under the same set-up, 
> therefore resulting 1.5 times faster. 
> See attached file _copy_from_large_benchmark.txt_ for the benchmark details.





[jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement

2016-01-21 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15111571#comment-15111571
 ] 

 Brian Hess commented on CASSANDRA-10707:
-

Can a UDF be used as the GROUP BY key, such as:
SELECT partitionKey, clusteringCol1, Udf(clusteringCol2), Max( x ) FROM myData 
WHERE partitionKey=5 GROUP BY partitionKey, clusteringCol1, Udf(clusteringCol2);






[jira] [Comment Edited] (CASSANDRA-10707) Add support for Group By to Select statement

2016-01-20 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109181#comment-15109181
 ] 

 Brian Hess edited comment on CASSANDRA-10707 at 1/20/16 7:07 PM:
--

Correct, what [~iamaleksey] said.  In fact, pushing the aggregate computation 
to the replicas is troublesome with RF>1.

Quick follow up - will this ticket also cover:
SELECT clusterCol, Max( x ) FROM myData GROUP BY clusterCol;

That is, you group on a clustering column, but not on a partition key?

Second question - consider a table with schema myData(partitionKey INT, 
clusteringCol1 INT, clusteringCol2 INT, x INT, PRIMARY KEY ((partitionKey), 
clusteringCol1, clusteringCol2)).  Now, will the following query be supported:
SELECT partitionKey, clusteringCol2, Sum( x ) FROM myData GROUP BY 
partitionKey, clusteringCol2;

The reason I ask is that the following is not supported:
SELECT partitionKey, clusteringCol2, x FROM myData WHERE partitionKey=5 ORDER 
BY clusteringCol2;
Because you cannot order by clusteringCol2, only clusteringCol1.  So, the 
assumption that the data will be sorted when it arrives to the coordinator 
might not be true in all cases.
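The sorted-arrival concern can be made concrete with a small Python sketch (illustrative only, not Cassandra internals): a streaming merge-style group-by is only correct when the input is sorted by the group key, which clustering order does not guarantee for clusteringCol2.

```python
from itertools import groupby
from collections import defaultdict

# Rows arrive in the table's clustering order (clusteringCol1, clusteringCol2),
# NOT sorted by clusteringCol2 alone.
rows = [  # (clusteringCol1, clusteringCol2, x)
    (1, 1, 10), (1, 2, 20), (2, 1, 30), (2, 2, 40),
]

# A streaming group-by on clusteringCol2 (what a merge-based coordinator would
# do if it assumed sorted input) emits the same group key more than once:
streamed = [(k, sum(x for _, _, x in g))
            for k, g in groupby(rows, key=lambda r: r[1])]
# groups 1 and 2 each appear twice: [(1, 10), (2, 20), (1, 30), (2, 40)]

# A hash-based group-by handles the unsorted group key correctly:
totals = defaultdict(int)
for _, c2, x in rows:
    totals[c2] += x
# {1: 40, 2: 60}
```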






[jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement

2016-01-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107583#comment-15107583
 ] 

 Brian Hess commented on CASSANDRA-10707:
-

I think that supporting grouping by a clustering column (or perhaps even a 
regular column) with a partition key predicate is a good idea.

I think that supporting grouping by the partition key (either in part or in 
toto) is a bad idea.  For that query, all the data in the cluster would stream 
to the coordinator, which would then be responsible for doing a *lot* of 
processing.  In other distributed systems that support GROUP BY queries, the 
groups are split up among the nodes, and each node is responsible for rolling 
up the data for the groups it was assigned.  This is a common way to get all 
the nodes in the system to help with a significant computation, with the 
results then streamed out (potentially via a single node) to the client.  In 
the approach here, however, all the data streams to a single node, and that 
node does all the work for all the groups.  This feels like either a ton of 
work to orchestrate the computation (which would start to mimic other systems 
- e.g., Spark) or a lot of work that risks being very inefficient and slow.  
I am also concerned about what this would do in the face of 
QueryTimeoutException - would we really be able to compute a GROUP BY 
partitionKey aggregate under the QTE limit?
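The distributed plan described above - per-node partial rollups merged at the coordinator - can be sketched in a few lines of Python (an illustration of the general technique, not Cassandra code; node_a/node_b and the keys are invented):

```python
from collections import defaultdict

def partial_max(rows):
    """Per-node rollup: group key -> local max over that node's rows."""
    partial = defaultdict(lambda: float("-inf"))
    for key, value in rows:
        partial[key] = max(partial[key], value)
    return dict(partial)

def merge(partials):
    """Coordinator merges one small partial per node, instead of raw rows."""
    merged = defaultdict(lambda: float("-inf"))
    for p in partials:
        for key, value in p.items():
            merged[key] = max(merged[key], value)
    return dict(merged)

node_a = [("k1", 3), ("k2", 7)]
node_b = [("k1", 9), ("k3", 1)]
result = merge([partial_max(node_a), partial_max(node_b)])
# {'k1': 9, 'k2': 7, 'k3': 1}
```

The coordinator's work scales with the number of groups rather than the number of rows, which is the scaling concern raised above.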




[jira] [Commented] (CASSANDRA-9778) CQL support for time series aggregation

2016-01-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15107407#comment-15107407
 ] 

 Brian Hess commented on CASSANDRA-9778:


[~blerer] - that really isn't an example of window functions (a.k.a. window 
aggregates, window aggregate functions, etc).  It's really an example of a 
grouped aggregate with time functions (Floor, Minute, Hour, etc).  The 
cardinality of the output of that query is one row per group, whereas with 
window functions the cardinality of the output is one row per row of input.

Let me simplify your trades example to daily stock prices with a schema of 
(symbol TEXT, transDate DATE, closePrice DOUBLE).  For each stock you'd like 
the sliding 3-day average of the closing price.  You would do that with the 
following SQL-99 syntax:
{code}
SELECT symbol, transDate, closePrice,
       Avg(closePrice) OVER (PARTITION BY symbol ORDER BY transDate
                             ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
           AS threeDayAverage
FROM stocks WHERE symbol = 'XYZ';
{code}

Here, each day has a "window" of rows that stretches from 2 rows before (if 
they exist) to the current row, and the value is the average of those 
closePrice values.  Thus, there is an output row for every input row.  For 
Thursday's threeDayAverage for stock XYZ we use the closePrice from Tuesday, 
Wednesday, and Thursday; for Friday's we use Wednesday, Thursday, and Friday; 
and so on.

For what it's worth, this is not hard to do if there is a partition key 
predicate.  We are simply making a pass over the rows to be returned to the 
client and rolling things up.  We may need to sort the data depending on the 
ORDER BY clause, but otherwise the aggregation is a simple rollup.  It should 
be noted that SQL allows very flexible window specifications that can cause 
trouble, such as
{code}
OVER (PARTITION BY symbol ORDER BY transDate
      ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
{code}
which runs from the current row to the end of the partition.  That can be a 
tricky case.  SQL-99 also supports RANGE window specifications in addition to 
ROWS specifications, which can also be tricky.

That said, window functions would be a nice addition (especially with a 
partition key predicate).
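The sliding 3-day average, and the cardinality point, can be sketched in Python (illustrative only; the five sample rows are invented):

```python
# ROWS BETWEEN 2 PRECEDING AND CURRENT ROW, computed by hand:
# one output row per input row, unlike a grouped aggregate.
prices = [  # (transDate, closePrice) for one symbol, ordered by transDate
    ("Mon", 10.0), ("Tue", 12.0), ("Wed", 14.0), ("Thu", 16.0), ("Fri", 18.0),
]

out = []
for i, (day, close) in enumerate(prices):
    # Window: up to 2 preceding rows plus the current row.
    window = [p for _, p in prices[max(0, i - 2): i + 1]]
    out.append((day, close, sum(window) / len(window)))

# Thursday's average uses Tue, Wed, Thu: (12 + 14 + 16) / 3 = 14.0,
# and len(out) == len(prices): the window preserves cardinality.
```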


> CQL support for time series aggregation
> ---
>
> Key: CASSANDRA-9778
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9778
> Project: Cassandra
>  Issue Type: New Feature
>  Components: CQL
>Reporter: Jonathan Ellis
>Assignee: Benjamin Lerer
> Fix For: 3.x
>
>
> Along with MV (CASSANDRA-6477), time series aggregation or "rollups" are a 
> common design pattern in cassandra applications.  I'd like to add CQL support 
> for this along these lines:
> {code}
> CREATE MATERIALIZED VIEW stocks_by_hour AS
> SELECT exchange, day, day_time(1h) AS hour, symbol, avg(price), sum(volume)
> FROM stocks
> GROUP BY exchange, day, symbol, hour
> PRIMARY KEY  ((exchange, day), hour, symbol);
> {code}





[jira] [Commented] (CASSANDRA-9552) COPY FROM times out after 110000 inserts

2016-01-15 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102092#comment-15102092
 ] 

 Brian Hess commented on CASSANDRA-9552:


[~slebresne] - it was different from the 9302, 9303, and 9304 improvements.  
This was a case where COPY failed outright.  Do the improvements to COPY FROM 
also happen to fix this problem?  Maybe [~Stefania] knows.

> COPY FROM times out after 110000 inserts
> 
>
> Key: CASSANDRA-9552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9552
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>  Labels: cqlsh
> Fix For: 2.1.x
>
>
> I am trying to test out performance of COPY FROM on various schemas.  I have 
> a 100-BIGINT-column table defined as:
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '3'}  AND durable_writes = true;
> CREATE TABLE test.test100 (
> pkey bigint,ccol bigint,col0 bigint,col1 bigint,col10 
> bigint,
> col11 bigint,col12 bigint,col13 bigint,col14 bigint,col15 
> bigint,
> col16 bigint,col17 bigint,col18 bigint,col19 bigint,col2 
> bigint,
> col20 bigint,col21 bigint,col22 bigint,col23 bigint,col24 
> bigint,
> col25 bigint,col26 bigint,col27 bigint,col28 bigint,col29 
> bigint,
> col3 bigint,col30 bigint,col31 bigint,col32 bigint,col33 
> bigint,
> col34 bigint,col35 bigint,col36 bigint,col37 bigint,col38 
> bigint,
> col39 bigint,col4 bigint,col40 bigint,col41 bigint,col42 
> bigint,
> col43 bigint,col44 bigint,col45 bigint,col46 bigint,col47 
> bigint,
> col48 bigint,col49 bigint,col5 bigint,col50 bigint,col51 
> bigint,
> col52 bigint,col53 bigint,col54 bigint,col55 bigint,col56 
> bigint,
> col57 bigint,col58 bigint,col59 bigint,col6 bigint,col60 
> bigint,
> col61 bigint,col62 bigint,col63 bigint,col64 bigint,col65 
> bigint,
> col66 bigint,col67 bigint,col68 bigint,col69 bigint,col7 
> bigint,
> col70 bigint,col71 bigint,col72 bigint,col73 bigint,col74 
> bigint,
> col75 bigint,col76 bigint,col77 bigint,col78 bigint,col79 
> bigint,
> col8 bigint,col80 bigint,col81 bigint,col82 bigint,col83 
> bigint,
> col84 bigint,col85 bigint,col86 bigint,col87 bigint,col88 
> bigint,
> col89 bigint,col9 bigint,col90 bigint,col91 bigint,col92 
> bigint,
> col93 bigint,col94 bigint,col95 bigint,col96 bigint,col97 
> bigint,
> PRIMARY KEY (pkey, ccol)
> ) WITH CLUSTERING ORDER BY (ccol ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> I then try to load the linked file of 120,000 rows of 100 BIGINT columns via:
> {code}
> cqlsh -e "COPY 
> test.test100(pkey,ccol,col0,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25,col26,col27,col28,col29,col30,col31,col32,col33,col34,col35,col36,col37,col38,col39,col40,col41,col42,col43,col44,col45,col46,col47,col48,col49,col50,col51,col52,col53,col54,col55,col56,col57,col58,col59,col60,col61,col62,col63,col64,col65,col66,col67,col68,col69,col70,col71,col72,col73,col74,col75,col76,col77,col78,col79,col80,col81,col82,col83,col84,col85,col86,col87,col88,col89,col90,col91,col92,col93,col94,col95,col96,col97)
>  FROM 'data120K.csv'"
> {code}
> Data file here: 
> https://drive.google.com/file/d/0B87-Pevy14fuUVcxemFRcFFtRjQ/view?usp=sharing
> After 110000 rows, it errors and hangs:
> {code}
> :1:11 rows; Write: 19848.21 rows/s
> Connection heartbeat failure
> :1:Aborting import at record #1196. Previously inserted records are 
> still present, and some records after that may be present as well.
> {code}





[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-09 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090643#comment-15090643
 ] 

 Brian Hess commented on CASSANDRA-8844:


But this would allow you to have 2 CDC keyspaces, one with ERROR_ON_OVERFLOW 
and one with DISCARD_OLD_ON_OVERFLOW. As it's designed (as I understand it), 
mutations from both keyspaces end up in the same CDC log files. So, if you hit 
overflow on the keyspace that has DISCARD_OLD_ON_OVERFLOW you need to discard 
some files, but you probably shouldn't discard mutations from the keyspace that 
has ERROR_ON_OVERFLOW. 

That's the situation I'm thinking about. 

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similar to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> Cleaning up consumed logfiles would be the client daemon's responsibility
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when

[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-09 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090626#comment-15090626
 ] 

 Brian Hess commented on CASSANDRA-8844:


As I understand it, making it per-keyspace could be problematic with 
DISCARD_OLD_ON_OVERFLOW since each CDC file will have data from all keyspaces. 
So, if you say to discard some but not others, then you would need to process 
the CDC file to only discard the ones that have the DISCARD_OLD_ON_OVERFLOW but 
keep the others. Is that feasible?
The other 2 seem fine, though. 


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088257#comment-15088257
 ] 

 Brian Hess commented on CASSANDRA-8844:


The semantics don't change.  There are additional semantics - mutations have to 
be in the same keyspace, have the same partition, have the same CDC-enabled 
flag, and can be in different tables.  That is the same as today, as all tables 
have the CDC-enabled flag set to false.

Changing the whole keyspace would mean my CDC consumer is now processing a lot 
of unneeded data, just to get to the one table of interest.

> Change Data Capture (CDC)
> -
>
> Key: CASSANDRA-8844
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8844
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Coordination, Local Write-Read Paths
>Reporter: Tupshin Harper
>Assignee: Joshua McKenzie
>Priority: Critical
> Fix For: 3.x
>
>
> "In databases, change data capture (CDC) is a set of software design patterns 
> used to determine (and track) the data that has changed so that action can be 
> taken using the changed data. Also, Change data capture (CDC) is an approach 
> to data integration that is based on the identification, capture and delivery 
> of the changes made to enterprise data sources."
> -Wikipedia
> As Cassandra is increasingly being used as the Source of Record (SoR) for 
> mission critical data in large enterprises, it is increasingly being called 
> upon to act as the central hub of traffic and data flow to other systems. In 
> order to try to address the general need, we (cc [~brianmhess]), propose 
> implementing a simple data logging mechanism to enable per-table CDC patterns.
> h2. The goals:
> # Use CQL as the primary ingestion mechanism, in order to leverage its 
> Consistency Level semantics, and in order to treat it as the single 
> reliable/durable SoR for the data.
> # To provide a mechanism for implementing good and reliable 
> (deliver-at-least-once with possible mechanisms for deliver-exactly-once ) 
> continuous semi-realtime feeds of mutations going into a Cassandra cluster.
> # To eliminate the developmental and operational burden of users so that they 
> don't have to do dual writes to other systems.
> # For users that are currently doing batch export from a Cassandra system, 
> give them the opportunity to make that realtime with a minimum of coding.
> h2. The mechanism:
> We propose a durable logging mechanism that functions similarly to a commitlog, 
> with the following nuances:
> - Takes place on every node, not just the coordinator, so RF number of copies 
> are logged.
> - Separate log per table.
> - Per-table configuration. Only tables that are specified as CDC_LOG would do 
> any logging.
> - Per DC. We are trying to keep the complexity to a minimum to make this an 
> easy enhancement, but most likely use cases would prefer to only implement 
> CDC logging in one (or a subset) of the DCs that are being replicated to.
> - In the critical path of ConsistencyLevel acknowledgment. Just as with the 
> commitlog, failure to write to the CDC log should fail that node's write. If 
> that means the requested consistency level was not met, then clients *should* 
> experience UnavailableExceptions.
> - Be written in a Row-centric manner such that it is easy for consumers to 
> reconstitute rows atomically.
> - Written in a simple format designed to be consumed *directly* by daemons 
> written in non JVM languages
> h2. Nice-to-haves
> I strongly suspect that the following features will be asked for, but I also 
> believe that they can be deferred for a subsequent release, and to gauge 
> actual interest.
> - Multiple logs per table. This would make it easy to have multiple 
> "subscribers" to a single table's changes. A workaround would be to create a 
> forking daemon listener, but that's not a great answer.
> - Log filtering. Being able to apply filters, including UDF-based filters 
> would make Cassandra a much more versatile feeder into other systems, and 
> again, reduce complexity that would otherwise need to be built into the 
> daemons.
> h2. Format and Consumption
> - Cassandra would only write to the CDC log, and never delete from it. 
> - Cleaning up consumed logfiles would be the client daemon's responsibility.
> - Logfile size should probably be configurable.
> - Logfiles should be named with a predictable naming schema, making it 
> trivial to process them in order.
> - Daemons should be able to checkpoint their work, and resume from where they 
> left off. This means they would have to leave some file artifact in the CDC 
> log's directory.
> - A sophisticated daemon should be able to be written that could 
> -- Catch up, in written-order, even when it is multiple logfiles behind in 
> processing
> -- Be able to continuously "tail" the most recent logfile and get 
> low-latency(ms?) access to the data as it is written
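The checkpoint/resume behavior described for a "sophisticated daemon" can be 
sketched as a minimal consumer loop.  The file-naming scheme, the 
one-mutation-per-line format, and the checkpoint artifact are all assumptions 
for illustration; the ticket only requires that logfile names sort predictably.

```python
# Sketch of the daemon behavior above: process CDC logfiles in name order,
# checkpoint progress to a file artifact in the log directory, resume from it.
import os

CHECKPOINT = "consumer.checkpoint"  # hypothetical "file artifact" name

def next_logfiles(log_dir, last_done):
    """Return unprocessed logfiles in written (lexicographic) order."""
    return sorted(f for f in os.listdir(log_dir)
                  if f.endswith(".log") and f > (last_done or ""))

def run_once(log_dir, handle_mutation):
    ckpt_path = os.path.join(log_dir, CHECKPOINT)
    last_done = None
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            last_done = f.read().strip() or None
    for name in next_logfiles(log_dir, last_done):
        with open(os.path.join(log_dir, name)) as f:
            for line in f:                  # one serialized mutation per line (assumed)
                handle_mutation(line.rstrip("\n"))
        with open(ckpt_path, "w") as f:     # checkpoint after each completed file
            f.write(name)
```

Restart safety falls out of checkpointing only after a whole file is consumed: 
re-running after a crash replays at most one file, which matches the 
at-least-once goal stated above.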

[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088210#comment-15088210
 ] 

 Brian Hess commented on CASSANDRA-8844:


The only supported language for a CDC consumer is Java, then?  Will there be an 
example consumer leveraging the CommitLogReplayer?  Will that program need to 
have Cassandra running on that machine to process the logs?  As in, will I be 
able to copy off the CDC logs to be processed elsewhere?


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088208#comment-15088208
 ] 

 Brian Hess commented on CASSANDRA-8844:


With respect to atomicity, splitting into CDC and non-CDC commit logs would not 
change existing behavior, since prior to this ticket all tables effectively have 
CDC disabled.
If I have a table in a non-CDC keyspace and I choose to enable CDC on that 
table, how would I do that here?  Or vice versa?  I expect that turning on CDC 
after the fact is likely to come up (possibly often).


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2016-01-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15088155#comment-15088155
 ] 

 Brian Hess commented on CASSANDRA-8844:


A couple of comments/questions:
1. You seem to have changed from specifying CDC at the table level to the 
Keyspace level.  Is that correct?  That sort of contradicts one of the 
requirements in your document.
2. The CDC consumer needs to process a serialized mutation.  How will a 
developer write that?  Is there a commitlog serialized mutation format document 
that someone can use?  Do they need to write that in Java and link to the full 
Cassandra jars (and dependencies)?


[jira] [Commented] (CASSANDRA-8519) Mechanism to determine which nodes own which token ranges without Thrift

2016-01-05 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083826#comment-15083826
 ] 

 Brian Hess commented on CASSANDRA-8519:


[~snazy] I do still think this should be something you can query from the 
SYSTEM tables somewhere.  I do agree that the Java driver does provide this 
information, though I'm not sure about the other drivers.  It feels like a good 
thing to be able to simply look up the information in system tables, versus 
having each driver implement the same logic (and have to change if/when it 
changes).

> Mechanism to determine which nodes own which token ranges without Thrift
> 
>
> Key: CASSANDRA-8519
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8519
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>Priority: Minor
>
> Right now the only way to determine which nodes own which token ranges is via 
> the Thrift interface.  There is not a Java/CQL driver mechanism to determine 
> this.  Applications that make multiple connections to Cassandra to extract 
> data in parallel need this ability so they can split the data into pieces, 
> and it is reasonable to want those splits to be on token range boundaries.  
> Of course, once you split this way, you would want to route those queries to 
> nodes that own that token range / split, for efficiency.
> This applies for both Hadoop and Spark, but other applications, too.  Hadoop 
> and Spark currently use Thrift to determine this topology.
> Additionally, different replication strategies and replication factors result 
> in different token range ownership, so there will have to be a different 
> answer based on which keyspace is used. 
> It would be useful if this data was stored in a CQL table and could be simply 
> queried.  A suggestion would be to add a column to the 
> SYSTEM.SCHEMA_KEYSPACES table (maybe a complex Map of Host to a UDT that has 
> a List of (beginRange, endRange) pairs - as an example).  This table would 
> need to be updated on an ALTER KEYSPACE command or on a topology change 
> event.  This would allow the server(s) to hold this information and the 
> drivers could simply query it (as opposed to having each driver manage this 
> separately).
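For illustration, the split computation that drivers currently re-implement can 
be sketched client-side.  The Murmur3Partitioner token bounds below are real; 
assigning splits to hosts is omitted, since that ownership data is exactly what 
this ticket asks Cassandra to expose in a system table.

```python
# Sketch: divide the full Murmur3 token ring [-2**63, 2**63 - 1] into n
# contiguous (beginRange, endRange] pairs, as a parallel reader would.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

def token_splits(n):
    """Split the ring into n contiguous (begin, end] ranges covering it fully."""
    step = (MAX_TOKEN - MIN_TOKEN) // n
    bounds = [MIN_TOKEN + i * step for i in range(n)] + [MAX_TOKEN]
    return [(bounds[i], bounds[i + 1]) for i in range(n)]
```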



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-10707) Add support for Group By to Select statement

2015-12-31 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076099#comment-15076099
 ] 

 Brian Hess commented on CASSANDRA-10707:
-

[~blerer] - will this GROUP BY require a partitionKey predicate?  That is, will 
the following be supported:
{code}
SELECT partitionKey, clusteringColumn1, max(value) FROM myTable 
GROUP BY partitionKey, clusteringColumn1;
{code}
Or will it also require a partitionKey predicate, like:
{code}
SELECT partitionKey, clusteringColumn1, max(value) FROM myTable 
WHERE partitionKey=5 GROUP BY partitionKey, clusteringColumn1;
{code}
Or will both be supported?
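Until server-side GROUP BY exists, the aggregation above can be emulated 
client-side after fetching rows.  A minimal sketch, with column names mirroring 
the comment:

```python
# Client-side emulation of:
#   SELECT partitionKey, clusteringColumn1, max(value) ... GROUP BY partitionKey, clusteringColumn1
# Rows are assumed already fetched (e.g. via a driver) as plain tuples.
from collections import defaultdict

def group_max(rows):
    """rows: iterable of (partition_key, clustering1, value) tuples."""
    best = defaultdict(lambda: float("-inf"))
    for pk, c1, value in rows:
        key = (pk, c1)
        if value > best[key]:
            best[key] = value
    return dict(best)
```

The obvious cost of this workaround is shipping every row to the client, which 
is why pushing the aggregation to the server matters.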

> Add support for Group By to Select statement
> 
>
> Key: CASSANDRA-10707
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10707
> Project: Cassandra
>  Issue Type: Improvement
>  Components: CQL
>Reporter: Benjamin Lerer
>Assignee: Benjamin Lerer
>
> Now that Cassandra support aggregate functions, it makes sense to support 
> {{GROUP BY}} on the {{SELECT}} statements.
> It should be possible to group either at the partition level or at the 
> clustering column level.
> {code}
> SELECT partitionKey, max(value) FROM myTable GROUP BY partitionKey;
> SELECT partitionKey, clustering0, clustering1, max(value) FROM myTable GROUP 
> BY partitionKey, clustering0, clustering1; 
> {code}





[jira] [Issue Comment Deleted] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2015-12-30 Thread Brian Hess (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hess updated CASSANDRA-10050:
---
Comment: was deleted

(was: I'm a little confused here.  [~beobal], why would the performance get 
slower as the tokens got larger, though?  Wouldn't it be just as inefficient to 
read the first bit of the token range rather than the last bit of the token 
range?  Seems like something else is going on here, no?)

> Secondary Index Performance Dependent on TokenRange Searched in Analytics
> -
>
> Key: CASSANDRA-10050
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10050
> Project: Cassandra
>  Issue Type: Improvement
> Environment: Single node, macbook, 2.1.8
>Reporter: Russell Alexander Spitzer
> Fix For: 3.x
>
>
> In doing some test work on the Spark Cassandra Connector I saw some odd 
> performance when pushing down range queries with Secondary Index filters. 
> When running the queries we see huge amount of time when the C* server is not 
> doing any work and the query seem to be hanging. This investigation led to 
> the work in this document
> https://docs.google.com/spreadsheets/d/1aJg3KX7nPnY77RJ9ZT-IfaYADgJh0A--nAxItvC6hb4/edit#gid=0
> The Spark Cassandra Connector builds up token range specific queries and 
> allows the user to push down relevant fields to C*. Here we have two indexed 
> fields (size) and (color) being pushed down to C*. 
> {code}
> SELECT count(*) FROM ks.tab WHERE token("store") > $min AND token("store") <= 
> $max AND color = 'red' AND size = 'P' ALLOW FILTERING;{code}
> These queries will have different token ranges inserted and executed as 
> separate spark tasks. Spark tasks with token ranges near the Min(token) end 
> up executing much faster than those near Max(token), which also happen to 
> throw errors.
> {code}
> Coordinator node timed out waiting for replica nodes' responses] 
> message="Operation timed out - received only 0 responses." 
> info={'received_responses': 0, 'required_responses': 1, 'consistency': 'ONE'}
> {code}
> I took the queries and ran them through CQLSH to see the difference in time. 
> A linear relationship is seen based on where the tokenRange being queried is 
> starting with only 2 second for queries near the beginning of the full token 
> spectrum and over 12 seconds at the end of the spectrum. 
> The question is, can this behavior be improved? or should we not recommend 
> using secondary indexes with Analytics workloads?





[jira] [Commented] (CASSANDRA-10050) Secondary Index Performance Dependent on TokenRange Searched in Analytics

2015-12-30 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-10050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15075572#comment-15075572
 ] 

 Brian Hess commented on CASSANDRA-10050:
-

I'm a little confused here.  [~beobal], why would the performance get slower as 
the tokens got larger, though?  Wouldn't it be just as inefficient to read the 
first bit of the token range rather than the last bit of the token range?  
Seems like something else is going on here, no?






[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-12-24 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15071175#comment-15071175
 ] 

 Brian Hess commented on CASSANDRA-8844:


I certainly think that Cassandra should not transparently "upgrade" or alter 
the requested consistency level of the write from the application.  Doing so is 
"surprise behavior" and I think should be avoided, especially since the client 
requested a specific consistency level (which we'd be then ignoring and doing 
something different).  So, I'm -1 on that approach (do we do anything like that 
anywhere else in the system?).

I'm also -1 on the choice that the resiliency of the CDC log is only up to the 
original consistency level of the write.  CDC is a layer that should be 
additive to a table without needing to alter the application that is writing to 
that table.  The semantics of RF and CL for the CDC log itself should be 
governed by separate settings, though having the RF be equivalent to the base 
table and the CL equivalent to the original write make sense - however, the 
semantics should allow for the CDC log to achieve the full RF just as the base 
table does (e.g., via repair, etc).

Essentially, I think I disagree with this Assumption in the design doc:

bq. Don’t have to worry about repair - guarantee is CL # of nodes have the 
update and it will be consumed by client within ms. at-least-once semantics and 
time horizon mean we don’t have to worry about repairing CDC data.

I think the second goal/requirement you laid out above (a record of all 
mutations seen by the cluster) is an important part of the purpose of this 
ticket (and the original intent I had when [~tupshin] and I discussed this).  
Many uses stem from this interpretation of CDC logs, especially when the goal 
is for consumers other than other Cassandra clusters are processing the CDC 
log, but in some cases even Cassandra clusters would need to see every 
mutation, not just the final delta (e.g., Cassandra triggers on the destination 
system that provide for side-effects to mutations).

The issue of capturing every mutation in a durable/distributed/resilient way is 
intertwined with my belief that we should not have to process all RF copies of 
the CDC log.  Needing to process all RF copies of the CDC log puts a processing 
burden on the consuming system that is at least undesirable (and at most 
prohibitive).  There does not appear to be a way to guarantee at-least-once 
delivery (rather than at-least-RF deliveries) without processing all RF 
copies.  Moreover, the resiliency approach depends on 
requiring the consumer to process all RF copies of the logs in order to ensure 
processing every mutation.  In addition, processing all RF copies of the CDC 
log will incur a large processing overhead (200% extra in the RF=3 case, for 
example).  So, I'm also -1 on needing to process all the CDC logs on all 
replicas.  
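To make the overhead concern concrete: a consumer that reads all RF copies of 
the CDC log must deduplicate mutations it has already applied.  This sketch 
keys deduplication on a per-mutation id; how such an id would be derived from 
the serialized mutation is an assumption.

```python
# Sketch: at-least-once delivery from RF redundant copies of one CDC log.
# Every duplicate detected is work the consumer did only because it had to
# read a redundant replica copy (200% extra in the RF=3 case).
def deliver_once(streams):
    """streams: list of per-replica mutation-id lists (RF copies of one log)."""
    seen = set()
    delivered = []
    duplicates = 0
    for stream in streams:              # one pass per replica's copy
        for mutation_id in stream:
            if mutation_id in seen:
                duplicates += 1         # redundant work: already applied
            else:
                seen.add(mutation_id)
                delivered.append(mutation_id)
    return delivered, duplicates
```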



[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-12-23 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15070494#comment-15070494
 ] 

 Brian Hess commented on CASSANDRA-8844:


This makes the consistency level of the CDC log completely defined by the CL of 
the mutation as it is written. That impacts anyone using CDC, all the way up to 
the application. The CDC log should be transparently additive to the table. That 
is, the user should not have to write to the table in any special or prescribed 
way in order to get the mutations captured. Requiring the client to write at 
CL_QUORUM if CDC is enabled is not okay. 

The thing that concerns me is that the consistency/replication of the CDC log 
now has a completely different guarantee than the base table (system, 
actually). The fact that a user can use Cassandra with any CL on writes and the 
system will eventually get to a consistent state is a good thing. The CDC log 
should be similarly eventually consistent, via any CL for the mutations. 

The situation described above is one that Cassandra covers just fine, resulting 
in an eventually consistent state. However, it will also result in CDC loss, 
which is not acceptable. The goal of CDC is not to compute snapshot 
differences, but to capture every mutation (including the intermediate 
mutations). The solution should tolerate nodes being offline without losing any 
mutations, and without requiring the application producing the mutations to be 
modified. 


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-12-22 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068897#comment-15068897
 ] 

 Brian Hess commented on CASSANDRA-8844:


[~JoshuaMcKenzie] can you explain what happens in the following case? I do a 
CL_ONE write while only one replica is available; the other two (assume RF=3) 
are down. The other two come back online and a repair operation brings those 
nodes up to date. Then the original node that got the write goes down. At this 
point, I want to process the changes, but the mutation is not in the commitlog 
of either of the online nodes. Is that correct?

I'm worried that there is a mutation that made it into this cluster but is now 
not in any CDC log and that the CDC log is not resilient in this case. Do I 
have this correct? 
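For illustration only, here is a toy model of that sequence (plain Python, not Cassandra internals). It assumes, as the scenario above describes, that repair streams data between replicas without replaying the write path, so nothing is appended to the repaired nodes' CDC logs:

```python
# Hedged sketch: a mutation can exist on every replica's table data yet be
# absent from every reachable CDC log, because repair bypasses the write path.

class Node:
    def __init__(self, name):
        self.name = name
        self.data = set()      # mutations applied to the table
        self.cdc_log = []      # mutations captured via the write path
        self.online = True

    def write(self, mutation):
        # Normal write path: updates the table AND appends to the CDC log.
        self.data.add(mutation)
        self.cdc_log.append(mutation)

    def repair_from(self, source):
        # Repair streams sstable data directly; it does NOT replay the
        # write path, so nothing is appended to this node's CDC log.
        self.data |= source.data

a, b, c = Node("A"), Node("B"), Node("C")

a.write("m1")            # CL_ONE write lands only on A (B and C were down)
b.repair_from(a)         # B and C come back and are repaired
c.repair_from(a)
a.online = False         # then A, the only node that logged m1, goes down

reachable_cdc = [m for n in (b, c) if n.online for m in n.cdc_log]
print("m1" in b.data and "m1" in c.data)   # True  -- data is consistent
print("m1" in reachable_cdc)               # False -- no reachable CDC copy
```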


[jira] [Commented] (CASSANDRA-8844) Change Data Capture (CDC)

2015-12-22 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15068893#comment-15068893
 ] 

 Brian Hess commented on CASSANDRA-8844:


I share [~rustyrazorblade]'s concerns here. It would be good to think about how 
to consume the CDC logs without having to process each mutation RF times. That 
is redundant work that may be prohibitive in many cases. At-least-once 
processing should be the goal. If there is a way to layer at-least-once (but 
not at-least-RF) semantics on top, then okay, but it is not clear how to do 
that here. 

As to [~tupshin]'s Kafka suggestion: you cannot specify the IDs/offsets when 
writing into the queue, so you would end up enqueueing RF times and not really 
solve the problem. 
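One hedged sketch of what at-least-once (rather than at-least-RF) consumption could look like, assuming each logged mutation carries some stable identifier the consumer can deduplicate on (that identifier is hypothetical here, not something the proposal defines):

```python
# Illustrative only: collapse RF copies of the same mutation into a single
# delivery by remembering which mutation ids have already been processed.

def dedup(replica_logs):
    seen = set()
    for log in replica_logs:          # one CDC log per replica
        for mutation_id, payload in log:
            if mutation_id in seen:   # already processed a sibling copy
                continue
            seen.add(mutation_id)
            yield mutation_id, payload

logs = [
    [("m1", "x=1"), ("m2", "x=2")],   # replica 1
    [("m1", "x=1"), ("m2", "x=2")],   # replica 2
    [("m2", "x=2")],                  # replica 3 missed m1
]
processed = list(dedup(logs))
print(len(processed))  # 2 -- each mutation handled once despite RF copies
```

Note the `seen` set would need to be durable (and eventually pruned) in a real consumer; that checkpointing problem is exactly where the at-least-once guarantee gets hard.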


[jira] [Commented] (CASSANDRA-6696) Partition sstables by token range

2015-12-08 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15047130#comment-15047130
 ] 

 Brian Hess commented on CASSANDRA-6696:


Will this change do anything for the case where there is only one disk?  Will 
each sstable contain a single token range (or at least not all token ranges)?

> Partition sstables by token range
> -
>
> Key: CASSANDRA-6696
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6696
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: sankalp kohli
>Assignee: Marcus Eriksson
>  Labels: compaction, correctness, dense-storage, 
> jbod-aware-compaction, performance
> Fix For: 3.2
>
>
> In JBOD, when someone gets a bad drive, the bad drive is replaced with a new 
> empty one and repair is run. 
> This can cause deleted data to come back in some cases. The same is true for 
> corrupt sstables, where we delete the corrupt sstable and run repair. 
> Here is an example:
> Say we have 3 nodes A,B and C and RF=3 and GC grace=10days. 
> row=sankalp col=sankalp is written 20 days back and successfully went to all 
> three nodes. 
> Then a delete/tombstone was written successfully for the same row column 15 
> days back. 
> Since this tombstone is more than gc grace, it got compacted in Nodes A and B 
> since it got compacted with the actual data. So there is no trace of this row 
> column in node A and B.
> Now in node C, say the original data is in drive1 and tombstone is in drive2. 
> Compaction has not yet reclaimed the data and tombstone.  
> Drive2 becomes corrupt and was replaced with new empty drive. 
> Due to the replacement, the tombstone is now gone and row=sankalp col=sankalp 
> has come back to life. 
> Now after replacing the drive we run repair. This data will be propagated to 
> all nodes. 
> Note: This is still a problem even if we run repair every gc grace. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-02 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035771#comment-15035771
 ] 

 Brian Hess edited comment on CASSANDRA-9303 at 12/2/15 1:22 PM:
-

3 questions:
1. If I set SKIPROWS to 10 say, and also set HEADER to true, will I skip 10 or 
11 rows?
2. Why do you disable the ERRFILE for stdin?
3. If MAXERRORS/MAXINSERTERRORS is >1, where are the errors kept? Are they 
captured anywhere so someone can look back at what type of error occurred? 
NoHostAvailableException, WriteTimeoutException, bad date format, etc.


was (Author: brianmhess):
2 questions:
1. If I set SKIPROWS to 10 say, and also set HEADER to true, will I skip 10 or 
11 rows?
2. Why do you disable the ERRFILE for stdin?

> Match cassandra-loader options in COPY FROM
> ---
>
> Key: CASSANDRA-9303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9303
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Critical
> Fix For: 2.1.x
>
>
> https://github.com/brianmhess/cassandra-loader added a bunch of options to 
> handle real world requirements, we should match those.





[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-12-02 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035771#comment-15035771
 ] 

 Brian Hess commented on CASSANDRA-9303:


2 questions:
1. If I set SKIPROWS to 10 say, and also set HEADER to true, will I skip 10 or 
11 rows?
2. Why do you disable the ERRFILE for stdin?



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014280#comment-15014280
 ] 

 Brian Hess commented on CASSANDRA-9303:


I'm curious - how do you set the following in CQLSH COPY FROM:
- numFutures (the number of concurrent asynchronous requests "in flight" at a 
time)
- batchSize (the number of INSERTs to batch and send as one request)
- queryTimeout (the amount of time to wait on queries)
- numRetries (the number of times to retry failed/timed-out queries)
- progressRate (the rate at which progress is reported)

All of these are marked as "already available", but it isn't clear how to set 
them (nor is it clear from the documentation).



[jira] [Commented] (CASSANDRA-9303) Match cassandra-loader options in COPY FROM

2015-11-19 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15014261#comment-15014261
 ] 

 Brian Hess commented on CASSANDRA-9303:


Are there no plans to support loading a directory of files?  I would say that 
that is one of the bigger options leveraged by users of cassandra-loader.

I'm +1 on not doing the things that CQLSH already handles (username, password, 
etc).



[jira] [Commented] (CASSANDRA-9304) COPY TO improvements

2015-10-23 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970961#comment-14970961
 ] 

 Brian Hess commented on CASSANDRA-9304:


[~Stefania] have you re-run the benchmark that [~dkua] performed above (or 
something similar) - specifically, comparing the "old version" to this new 
version (and to the cassandra-unloader 
(https://github.com/brianmhess/cassandra-loader))?  What performance 
improvement do we see?

> COPY TO improvements
> 
>
> Key: CASSANDRA-9304
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9304
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Core
>Reporter: Jonathan Ellis
>Assignee: Stefania
>Priority: Minor
>  Labels: cqlsh
> Fix For: 3.x, 2.1.x, 2.2.x
>
>
> COPY FROM has gotten a lot of love.  COPY TO not so much.  One obvious 
> improvement could be to parallelize reading and writing (write one page of 
> data while fetching the next).





[jira] [Commented] (CASSANDRA-8234) CTAS (CREATE TABLE AS SELECT)

2015-08-06 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660640#comment-14660640
 ] 

 Brian Hess commented on CASSANDRA-8234:


Another question:
6. If we include Spark in Cassandra, what are we going to do about the Spark 
Master?  That is, a feature of Cassandra (CTAS) will depend on a system that 
does not have the same HA profile (namely, always-on) as Cassandra.  Will we 
modify Spark to be "masterless"?  Will this ticket have a dependency on a 
"masterless" OSS Spark?  Or will Cassandra have this as a SPOF?

> CTAS (CREATE TABLE AS SELECT)
> -
>
> Key: CASSANDRA-8234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8234
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Robin Schumacher
> Fix For: 3.x
>
>
> Continuous request from users is the ability to do CREATE TABLE AS SELECT.  
> The simplest form would be copying the entire table.  More advanced would 
> allow specifying the columns and UDFs to call, as well as filtering rows out in 
> WHERE.
> More advanced still would be to get all the way to allowing JOIN, for which 
> we probably want to integrate Spark.





[jira] [Commented] (CASSANDRA-8234) CTAS (CREATE TABLE AS SELECT)

2015-08-06 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14660443#comment-14660443
 ] 

 Brian Hess commented on CASSANDRA-8234:


A few questions:
1. For OSS C* (as opposed to DSE), will the Spark Master be visible to users 
other than C* itself?  As in, will Cassandra be the only process/user able to 
execute Spark jobs?  Or will users be able to submit jobs, start the SparkSQL 
thriftserver, etc?
2. How current will Spark be kept with Cassandra?  Will there be any guidance 
(or guarantees) about how stale the Spark is that is being included?  Or how 
often Cassandra will be upgraded to incorporate a new Spark?  Same for the OSS 
spark-cassandra-connector.
3. Will there be a load-sharing system in place so that multiple CTAS queries 
can run simultaneously (Spark in stand-alone mode will by default "reserve" all 
available cores and "lock out" another spark job)?
4. Will there be some "sandboxing" of Spark so that C* and Spark play nicely 
(with respect to RAM, CPU, etc)?
5. My assumption is that "CREATE TABLE b(x INT, y INT, z INT) AS SELECT x, y, z 
FROM a WITH PRIMARY KEY ((x), y)" [syntax for illustrative purposes only] will 
be an asynchronous operation.  That is, it will return "success" to the client, 
but the operation will be a background operation.  First, is that correct?  If 
so, I think there will have to be a status like in MVs and 2Is, correct?  If 
not, what will we do about this query timing out?




[jira] [Comment Edited] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659385#comment-14659385
 ] 

 Brian Hess edited comment on CASSANDRA-9302 at 8/6/15 4:17 AM:


There is also https://issues.apache.org/jira/browse/CASSANDRA-9048. That ticket 
has been tagged as "Later", but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to having more options, cassandra-loader is 4x (or more) faster 
than COPY FROM in 2.1. It is also more stable and can handle wider rows 
(see https://issues.apache.org/jira/browse/CASSANDRA-9552)


was (Author: brianmhess):
There is also https://issues.apache.org/jira/browse/CASSANDRA-9048 That ticket 
has been tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to more options, the performance of cassandra-loader is 4x (or 
more) than COPY FROM in 2.1. Also, it is more stable and can handle wider rows 
(see https://issues.apache.org/jira/browse/CASSANDRA-9552)

> Optimize cqlsh COPY FROM, part 3
> 
>
> Key: CASSANDRA-9302
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9302
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Jonathan Ellis
>Assignee: David Kua
> Fix For: 2.1.x
>
>
> We've had some discussion moving to Spark CSV import for bulk load in 3.x, 
> but people need a good bulk load tool now.  One option is to add a separate 
> Java bulk load tool (CASSANDRA-9048), but if we can match that performance 
> from cqlsh I would prefer to leave COPY FROM as the preferred option to which 
> we point people, rather than adding more tools that need to be supported 
> indefinitely.
> Previous work on COPY FROM optimization was done in CASSANDRA-7405 and 
> CASSANDRA-8225.





[jira] [Issue Comment Deleted] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brian Hess updated CASSANDRA-9302:
--
Comment: was deleted

(was: There is also https://issues.apache.org/jira/browse/CASSANDRA-9048 That 
ticket has been tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to more options, the performance of cassandra-loader is 4x (or 
more) than COPY FROM in 2.1. Also, it is more stable and can handle wider rows 
(see https://issues.apache.org/jira/browse/CASSANDRA-9552))



[jira] [Commented] (CASSANDRA-9302) Optimize cqlsh COPY FROM, part 3

2015-08-05 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14659387#comment-14659387
 ] 

 Brian Hess commented on CASSANDRA-9302:


There is also https://issues.apache.org/jira/browse/CASSANDRA-9048. That ticket 
has been tagged as "Later" for some reason, but the work continues at 
https://github.com/brianmhess/cassandra-loader
In addition to having more options, cassandra-loader is 4x (or more) faster 
than COPY FROM in 2.1. It is also more stable and can handle wider rows 
(see https://issues.apache.org/jira/browse/CASSANDRA-9552)



[jira] [Commented] (CASSANDRA-9892) Add support for unsandboxed UDF

2015-07-27 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642912#comment-14642912
 ] 

 Brian Hess commented on CASSANDRA-9892:


Yeah - I agree with [~tjake] and [~snazy].  TRUSTED should mean that the 
function can be run without the extra safeguards.  You trust the function.

> Add support for unsandboxed UDF
> ---
>
> Key: CASSANDRA-9892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>Priority: Minor
>
> From discussion on CASSANDRA-9402,
> The approach postgresql takes is to distinguish between "trusted" (sandboxed) 
> and "untrusted" (anything goes) UDF languages. 
> Creating an untrusted language always requires superuser mode. Once that is 
> done, creating functions in it requires nothing special.
> Personally I would be fine with this approach, but I think it would be more 
> useful to have the extra permission on creating the function, and also 
> wouldn't require adding explicit CREATE LANGUAGE.
> So I'd suggest just providing different CQL permissions for trusted and 
> untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
> create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
> UNTRUSTED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9892) Add support for unsandboxed UDF

2015-07-24 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640610#comment-14640610
 ] 

 Brian Hess commented on CASSANDRA-9892:


I don't particularly like changing the meaning of FENCED/UNFENCED, but we 
could use them - FENCED meaning safe and UNFENCED meaning unsafe.  This is what 
DB2 and Netezza do.

> Add support for unsandboxed UDF
> ---
>
> Key: CASSANDRA-9892
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9892
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
>Priority: Minor
>
> From discussion on CASSANDRA-9402,
> The approach postgresql takes is to distinguish between "trusted" (sandboxed) 
> and "untrusted" (anything goes) UDF languages. 
> Creating an untrusted language always requires superuser mode. Once that is 
> done, creating functions in it requires nothing special.
> Personally I would be fine with this approach, but I think it would be more 
> useful to have the extra permission on creating the function, and also 
> wouldn't require adding explicit CREATE LANGUAGE.
> So I'd suggest just providing different CQL permissions for trusted and 
> untrusted, i.e. if you have CREATE FUNCTION permission that allows you to 
> create sandboxed UDF, but you can only create unsandboxed if you have CREATE 
> UNTRUSTED.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9481) FENCED UDFs

2015-07-24 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640437#comment-14640437
 ] 

 Brian Hess commented on CASSANDRA-9481:


See CASSANDRA-9892.

> FENCED UDFs
> ---
>
> Key: CASSANDRA-9481
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9481
> Project: Cassandra
>  Issue Type: New Feature
>Reporter:  Brian Hess
>
> Related to security/sandboxing of UDFs (CASSANDRA-9042)
> Essentially, the UDF will run in a separate process when it is registered as 
> FENCED, and run in-process when it is registered as UNFENCED.
> This doesn't necessarily remove all the issues, but it does help mitigate 
> some of them - especially since it would (optionally) run as another user.
> This could look like the following with Cassandra:
> - FENCED is a GRANTable privilege
> - In cassandra.yaml you can specify the user to use when launching the 
> separate process (so that it is not the same user that is running the 
> database - or optionally is)
>   - This is good so that the UDF can't stop the database, delete database 
> files, etc.
> - For FENCED UDFs, IPC would be used to transfer rows to the UDF and to 
> return results. We could use CQL rows for the data. This could be shared 
> memory or sockets (Unix or TCP - slight preference for sockets for some 
> follow-on ideas).
> - Ideally, switching from FENCED to UNFENCED would be just a DDL change. That 
> is, the API would work such that a simple "ALTER FUNCTION myFunction(DOUBLE, 
> DOUBLE) UNFENCED" would change it.
> - If you wanted, because this is a separate process you could use a separate 
> class loader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14632059#comment-14632059
 ] 

 Brian Hess commented on CASSANDRA-6477:


Let's go back to the basic use case that this is supposed to replace/help/make 
better.  The case where we want two query tables for the same data.  That is, 
they have the same primary keys, but different partition keys (and clustering 
column orders).

Today, I would do this by having a logged batch for the insert and that batch 
would insert into each of the two query tables.  With this I get some data 
consistency guarantees.  For example, if the client returns "success", I know 
that *both* of the inserts were accepted at the desired consistency level.  So, 
if I did 2 writes at CL_QUORUM, and I receive a "success", then I know I can 
then do a CL_QUORUM read of *either* table and see the most recent data.

However, with this "asynchronous" MV approach, I no longer get this behavior.  
I write to the base table at CL_QUORUM and get the "success" return.  At that 
point, I can do a CL_QUORUM read from the base table and see the most recent 
insert.  However, if I do a CL_QUORUM read from the MV, I have no guarantees at 
all.

This approach does not address the basic situation that we are trying to cover. 
 That concerns me greatly.
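The batch-based guarantee described above comes down to quorum-overlap arithmetic; a minimal sketch (plain Python, not Cassandra code, with RF=3 assumed for illustration):

```python
# Why a CL_QUORUM write followed by a CL_QUORUM read must observe the write:
# any two quorums of the same replica set necessarily intersect.
from itertools import combinations

RF = 3
QUORUM = RF // 2 + 1  # 2 when RF=3

replicas = range(RF)
overlaps = all(
    set(w) & set(r)
    for w in combinations(replicas, QUORUM)   # every possible write quorum
    for r in combinations(replicas, QUORUM))  # every possible read quorum
print(overlaps)  # → True
```

No such intersection exists between the base table's replica set and the view's replica set, which is why an asynchronously updated MV cannot offer the same read-your-write guarantee.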


> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-17 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631267#comment-14631267
 ] 

 Brian Hess commented on CASSANDRA-6477:


+1  I think that is the "promise" of the MV.  

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-16 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629870#comment-14629870
 ] 

 Brian Hess commented on CASSANDRA-6477:


Just for clarity [~slebresne] - you'd have to make the first name the Partition 
Key and make the rest of the base table's Primary Key columns Clustering 
Columns, correct?


> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9778) CQL support for time series aggregation

2015-07-15 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628427#comment-14628427
 ] 

 Brian Hess commented on CASSANDRA-9778:


Did you mean to have DAY be an argument to these functions: "day_time(dtime, 
1h)"?  Or is it implied to be the "hour" of Now()?

Do we want to support "Day_time(dtime, 2h) AS doublehour"?  Or are the options 
really Day_time(dtime, 1h), Day_time(dtime, 1d), Day_time(dtime, 1w), etc?

If so, MySQL has syntax for extracting the parts of a date or datetime type: 
https://docs.oracle.com/cd/E17952_01/refman-5.1-en/date-and-time-functions.html#function_hour
For example "Day(dtime)", "Hour(dtime)", etc.

Postgres also has this, though their syntax is a bit fancier: 
http://www.postgresql.org/docs/8.2/static/functions-datetime.html
For example "Extract(hour FROM dtime)", "Extract(day FROM dtime)", etc.

So, while windowed aggregates are overkill here (and really, they are), if we 
want this functionality in Cassandra, we should consider the similar syntax in 
Postgres or Oracle (or DB2 (Day(), Month(), etc), MySQL (Day(), Month(), etc), 
SQL Server (DatePart()), etc).
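As a rough illustration (using Python's standard datetime module, not a proposal for CQL syntax), the extraction and bucketing styles discussed above look like:

```python
# Mimicking MySQL/DB2-style HOUR()/DAY() and Postgres-style
# EXTRACT(... FROM ...); day_time-style bucketing via integer division.
from datetime import datetime

dtime = datetime(2015, 7, 15, 14, 37)

hour = dtime.hour  # HOUR(dtime) / EXTRACT(hour FROM dtime)
day = dtime.day    # DAY(dtime)  / EXTRACT(day FROM dtime)

# A hypothetical day_time(dtime, 2h) would bucket into 2-hour windows:
two_hour_bucket = (dtime.hour // 2) * 2

print(hour, day, two_hour_bucket)  # → 14 15 14
```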

> CQL support for time series aggregation
> ---
>
> Key: CASSANDRA-9778
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9778
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
> Fix For: 3.x
>
>
> Along with MV (CASSANDRA-6477), time series aggregation or "rollups" are a 
> common design pattern in cassandra applications.  I'd like to add CQL support 
> for this along these lines:
> {code}
> CREATE MATERIALIZED VIEW stocks_by_hour AS
> SELECT exchange, day, day_time(1h) AS hour, symbol, avg(price), sum(volume)
> FROM stocks
> GROUP BY exchange, day, symbol, hour
> PRIMARY KEY  ((exchange, day), hour, symbol);
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-14 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626668#comment-14626668
 ] 

 Brian Hess commented on CASSANDRA-6477:


A question about behavior here.  If we do the following:
{noformat}
INSERT INTO t (k, t, v) VALUES (0, 0, 0) USING TTL 1000;
UPDATE t SET v = null WHERE k = 0 AND t = 0;
INSERT INTO t (k, t) VALUES (1, 1) USING TTL 1000;
{noformat}

Then, doing a 
{noformat}
SELECT * FROM t;
{noformat}
would result in getting both rows: (0, 0, null) and (1, 1, null).

However, if the view has a row with a null for the first row - i.e., (0, null, 
0) - wouldn't we also expect to see a row (1, null, 1) for the second?
The fact that a cell was non-null and is now null shouldn't lead to any 
different behavior than if it had been null from the start.
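The symmetry being argued for can be stated as a tiny model (an assumption about the intended semantics, not the actual MV implementation):

```python
# Toy model: base rows keyed by (k, t) with value v; the view keys on v.
# A v that was UPDATEd to null and a v that was never set should produce
# equivalent view entries, per the argument above.
base = {
    (0, 0): None,  # v inserted, then set to null
    (1, 1): None,  # v never supplied
}
view = {(v, k, t) for (k, t), v in base.items()}
print(sorted(view))  # both rows appear under the null view key
```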

Another point - I think the primary use case will be to mimic query tables.  
That is, the MV will have the same columns in its primary key as the base 
table's, but with different partition keys and different orders for clustering 
columns.  I could be wrong, but that seems like the biggest use case.  And in 
that use case the keys are all non-null.

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-07-14 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14626368#comment-14626368
 ] 

 Brian Hess commented on CASSANDRA-6477:


Moreover, if you were to handle NULLs in the application tier (say by the 
application using the string "NULL" or the INT -1 or whatever) you end up with 
exactly the same hot spot.  This is a known issue for distributed systems in 
general (not just Cassandra), so I would note it and move on - just like we 
note that GENDER is a bad partition key today.

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
> Attachments: test-view-data.sh, users.yaml
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9415) Implicit use of Materialized Views on SELECT

2015-07-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616747#comment-14616747
 ] 

 Brian Hess commented on CASSANDRA-9415:


We already silently rewrite queries for users when they use a secondary index.  
The user doesn't specify that C* should consult the secondary index when they 
query; it is implicit.

> Implicit use of Materialized Views on SELECT
> 
>
> Key: CASSANDRA-9415
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9415
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>  Labels: ponies
>
> CASSANDRA-6477 introduces Materialized Views.  This greatly simplifies the 
> write path for the best-practice of "query tables".  But it does not simplify 
> the read path as much as our users want/need.
> We suggest to folks to create multiple copies of their base table optimized 
> for certain queries - hence "query table".  For example, we may have a USER 
> table with two type of queries: lookup by userid and lookup by email address. 
>  We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL.  Both 
> would have the exact same schema, with the same PRIMARY KEY columns, but 
> different PARTITION KEY - the first would be USERID and the second would be 
> EMAIL.
> One complicating thing with this approach is that the application now needs 
> to know that when it INSERT/UPDATE/DELETEs from the base table it needs to 
> INSERT/UPDATE/DELETE from all of the query tables as well.  CASSANDRA-6477 
> covers this nicely.
> However, the other side of the coin is that the application needs to know 
> which query table to leverage based on the selection criteria.  Using the 
> example above, if the query has a predicate such as "WHERE userid = 'bhess'", 
> then USERS_BY_USERID is the better table to use.  Similarly, when the 
> predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is 
> appropriate.
> On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" 
> to the collection of tables.  You do operations just on the base table.  It 
> is very attractive for the SELECT side as well.  It would be very good to 
> allow an application to simply do "SELECT * FROM users WHERE userid = 
> 'bhess'" and have that query implicitly leverage the USERS_BY_USERID 
> materialized view.
> For additional use cases, especially analytics use cases like in Spark, this 
> allows the Spark code to simply push down the query without having to know 
> about all of the MVs that have been set up.  The system will route the query 
> appropriately.  And if additional MVs are necessary to make a query run 
> better/faster, then those MVs can be set up and Spark will implicitly 
> leverage them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9415) Implicit use of Materialized Views on SELECT

2015-07-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616746#comment-14616746
 ] 

 Brian Hess commented on CASSANDRA-9415:


For #1, that would be part of choosing the MV.  If it didn't contain all the 
columns in the projection list then you couldn't use that MV.
For #2, that would also be understandable by C*.  That is, you can see if the 
predicate of the query "matches" the predicate of the MV.  For example, if the 
MV had a predicate like "WHERE x > 100", then if the query had a predicate like 
"WHERE x=200" then you know you could use the MV.
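Predicate containment of that kind can be checked mechanically; a hypothetical sketch (the function name and shape are illustrative, not a proposed API):

```python
# Can an MV defined with "WHERE x > bound" answer a query "WHERE x = value"?
# Only when every row the query needs falls inside the MV's predicate.
def mv_covers_equality(mv_lower_bound, query_value):
    return query_value > mv_lower_bound

print(mv_covers_equality(100, 200))  # → True: MV usable for WHERE x = 200
print(mv_covers_equality(100, 50))   # → False: MV would miss rows
```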

As to the driver being able to route the query - in order for the driver to 
route the query you need to prepare the statement.  In preparing the statement, 
you would choose the MV, and the driver gets that anyway.

> Implicit use of Materialized Views on SELECT
> 
>
> Key: CASSANDRA-9415
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9415
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>  Labels: ponies
>
> CASSANDRA-6477 introduces Materialized Views.  This greatly simplifies the 
> write path for the best-practice of "query tables".  But it does not simplify 
> the read path as much as our users want/need.
> We suggest to folks to create multiple copies of their base table optimized 
> for certain queries - hence "query table".  For example, we may have a USER 
> table with two type of queries: lookup by userid and lookup by email address. 
>  We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL.  Both 
> would have the exact same schema, with the same PRIMARY KEY columns, but 
> different PARTITION KEY - the first would be USERID and the second would be 
> EMAIL.
> One complicating thing with this approach is that the application now needs 
> to know that when it INSERT/UPDATE/DELETEs from the base table it needs to 
> INSERT/UPDATE/DELETE from all of the query tables as well.  CASSANDRA-6477 
> covers this nicely.
> However, the other side of the coin is that the application needs to know 
> which query table to leverage based on the selection criteria.  Using the 
> example above, if the query has a predicate such as "WHERE userid = 'bhess'", 
> then USERS_BY_USERID is the better table to use.  Similarly, when the 
> predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is 
> appropriate.
> On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" 
> to the collection of tables.  You do operations just on the base table.  It 
> is very attractive for the SELECT side as well.  It would be very good to 
> allow an application to simply do "SELECT * FROM users WHERE userid = 
> 'bhess'" and have that query implicitly leverage the USERS_BY_USERID 
> materialized view.
> For additional use cases, especially analytics use cases like in Spark, this 
> allows the Spark code to simply push down the query without having to know 
> about all of the MVs that have been set up.  The system will route the query 
> appropriately.  And if additional MVs are necessary to make a query run 
> better/faster, then those MVs can be set up and Spark will implicitly 
> leverage them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9415) Implicit use of Materialized Views on SELECT

2015-07-07 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616743#comment-14616743
 ] 

 Brian Hess commented on CASSANDRA-9415:


Wouldn't the preparing of the statement handle that?

> Implicit use of Materialized Views on SELECT
> 
>
> Key: CASSANDRA-9415
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9415
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>  Labels: ponies
>
> CASSANDRA-6477 introduces Materialized Views.  This greatly simplifies the 
> write path for the best-practice of "query tables".  But it does not simplify 
> the read path as much as our users want/need.
> We suggest to folks to create multiple copies of their base table optimized 
> for certain queries - hence "query table".  For example, we may have a USER 
> table with two type of queries: lookup by userid and lookup by email address. 
>  We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL.  Both 
> would have the exact same schema, with the same PRIMARY KEY columns, but 
> different PARTITION KEY - the first would be USERID and the second would be 
> EMAIL.
> One complicating thing with this approach is that the application now needs 
> to know that when it INSERT/UPDATE/DELETEs from the base table it needs to 
> INSERT/UPDATE/DELETE from all of the query tables as well.  CASSANDRA-6477 
> covers this nicely.
> However, the other side of the coin is that the application needs to know 
> which query table to leverage based on the selection criteria.  Using the 
> example above, if the query has a predicate such as "WHERE userid = 'bhess'", 
> then USERS_BY_USERID is the better table to use.  Similarly, when the 
> predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is 
> appropriate.
> On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" 
> to the collection of tables.  You do operations just on the base table.  It 
> is very attractive for the SELECT side as well.  It would be very good to 
> allow an application to simply do "SELECT * FROM users WHERE userid = 
> 'bhess'" and have that query implicitly leverage the USERS_BY_USERID 
> materialized view.
> For additional use cases, especially analytics use cases like in Spark, this 
> allows the Spark code to simply push down the query without having to know 
> about all of the MVs that have been set up.  The system will route the query 
> appropriately.  And if additional MVs are necessary to make a query run 
> better/faster, then those MVs can be set up and Spark will implicitly 
> leverage them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8986) Major cassandra-stress refactor

2015-06-30 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14608369#comment-14608369
 ] 

 Brian Hess commented on CASSANDRA-8986:


It would be good if you could leverage asynchronous execution instead of 
spawning so many threads yourself.  You can have flags/options to limit the 
number of futures in flight (see FutureManager at 
https://github.com/brianmhess/cassandra-loader/blob/master/src/main/java/com/datastax/loader/futures/FutureManager.java
 for example) and the number of threads, in addition to limiting the rate 
(which is already an option).
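A minimal sketch of that pattern - bounding in-flight asynchronous work with a semaphore instead of spawning many threads - might look like the following. The doubling task is a stand-in for a real async driver call; the class and method names are illustrative, not taken from cassandra-loader or cassandra-stress:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;

// Sketch: cap the number of in-flight asynchronous operations with a
// semaphore. The producer blocks in submit() once maxInFlight futures
// are outstanding, and a slot is released when each future completes.
public class BoundedAsyncRunner {
    private final Semaphore inFlight;
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public BoundedAsyncRunner(int maxInFlight) {
        this.inFlight = new Semaphore(maxInFlight);
    }

    public CompletableFuture<Integer> submit(int value) {
        try {
            inFlight.acquire();  // block while too many futures are outstanding
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return CompletableFuture.supplyAsync(() -> value * 2, pool)  // stand-in for an async request
                .whenComplete((r, t) -> inFlight.release());         // free a slot on completion
    }

    public int runAll(int n) {
        List<CompletableFuture<Integer>> futures = new ArrayList<>();
        for (int i = 0; i < n; i++) futures.add(submit(i));
        int sum = 0;
        for (CompletableFuture<Integer> f : futures) sum += f.join();
        pool.shutdown();
        return sum;
    }
}
```

Rate limiting (already an option in stress) composes naturally with this: acquire a rate-limiter permit before the semaphore.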

> Major cassandra-stress refactor
> ---
>
> Key: CASSANDRA-8986
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8986
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Benedict
>Assignee: Benedict
>
> We need a tool for both stressing _and_ validating more complex workloads 
> than stress currently supports. Stress needs a raft of changes, and I think 
> it would be easier to deliver many of these as a single major endeavour which 
> I think is justifiable given its audience. The rough behaviours I want stress 
> to support are:
> * Ability to know exactly how many rows it will produce, for any clustering 
> prefix, without generating those prefixes
> * Ability to generate an amount of data proportional to the amount it will 
> produce to the server (or consume from the server), rather than proportional 
> to the variation in clustering columns
> * Ability to reliably produce near identical behaviour each run
> * Ability to understand complex overlays of operation types (LWT, Delete, 
> Expiry, although perhaps not all implemented immediately, the framework for 
> supporting them easily)
> * Ability to (with minimal internal state) understand the complete cluster 
> state through overlays of multiple procedural generations
> * Ability to understand the in-flight state of in-progress operations (i.e. 
> if we're applying a delete, understand that the delete may have been applied, 
> and may not have been, for potentially multiple conflicting in flight 
> operations)
> I think the necessary changes to support this would give us the _functional_ 
> base to support all the functionality I can currently envisage stress 
> needing. Before embarking on this (which I may attempt very soon), it would 
> be helpful to get input from others as to features missing from stress that I 
> haven't covered here that we will certainly want in the future, so that they 
> can be factored in to the overall design and hopefully avoid another refactor 
> one year from now, as its complexity is scaling each time, and each time it 
> is a higher sunk cost. [~jbellis] [~iamaleksey] [~slebresne] [~tjake] 
> [~enigmacurry] [~aweisberg] [~blambov] [~jshook] ... and @everyone else :) 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9550) Support fixed precision data type

2015-06-04 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573703#comment-14573703
 ] 

 Brian Hess commented on CASSANDRA-9550:


[~snazy] - I'm not sure what you mean by "you lose the value of the 
BigDecimal.precision field during serialization", but losing the precision is 
certainly an issue.

> Support fixed precision data type
> -
>
> Key: CASSANDRA-9550
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9550
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>
> SQL databases support a fixed precision data type.  When converting data 
> models from SQL DBs to Cassandra, the choices are not clear.  DOUBLE or FLOAT 
> are possibilities, but with UDFs and Aggregates this is actually a lot less 
> clear, as you really do need fixed precision arithmetic.  That is, one flaw 
> with floating-point math is that the order of the operations can change the 
> final result.  That is not an issue with fixed precision arithmetic (though 
> other trade-offs are there).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9550) Support fixed precision data type

2015-06-04 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573612#comment-14573612
 ] 

 Brian Hess commented on CASSANDRA-9550:


There is a difference between arbitrary precision numbers (namely BigDecimal) 
and fixed precision numbers (namely SQL's NUMERIC).  Are you suggesting that we 
use DECIMAL in C* instead of NUMERIC(p,s) in SQL systems?  

The fixed precision is frequently needed in SQL - especially in financial 
situations (which we specifically call out in the C* docs).  Scaling by a 
factor in the app does not work well with the advent of UDFs and UDAs.  For 
example, the scale changes on multiplication and division, and without 
NUMERIC(p,s) you would not be able to convey that the scale changed.
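The scale drift under multiplication and division can be seen directly with `java.math.BigDecimal`; the values below are arbitrary examples:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustration of the point above: with BigDecimal the scale of a result
// changes under multiplication, and division forces you to pick a scale
// and rounding mode yourself - exactly what NUMERIC(p,s) would pin down.
public class ScaleDemo {
    public static void main(String[] args) {
        BigDecimal price = new BigDecimal("19.99");   // scale 2
        BigDecimal rate  = new BigDecimal("0.075");   // scale 3

        BigDecimal tax = price.multiply(rate);        // result scale = 2 + 3 = 5
        System.out.println(tax + " scale=" + tax.scale());  // 1.49925 scale=5

        // Division requires an explicit scale and rounding mode, or it may
        // throw ArithmeticException on a non-terminating expansion like 1/3.
        BigDecimal third = BigDecimal.ONE.divide(new BigDecimal("3"), 2, RoundingMode.HALF_UP);
        System.out.println(third);                    // 0.33
    }
}
```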

> Support fixed precision data type
> -
>
> Key: CASSANDRA-9550
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9550
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>
> SQL databases support a fixed precision data type.  When converting data 
> models from SQL DBs to Cassandra, the choices are not clear.  DOUBLE or FLOAT 
> are possibilities, but with UDFs and Aggregates this is actually a lot less 
> clear, as you really do need fixed precision arithmetic.  That is, one flaw 
> with floating-point math is that the order of the operations can change the 
> final result.  That is not an issue with fixed precision arithmetic (though 
> other trade-offs are there).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9552) COPY FROM times out after 110000 inserts

2015-06-04 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14573605#comment-14573605
 ] 

 Brian Hess commented on CASSANDRA-9552:


Ah - thank you - this is version 2.1.5.469 (according to CQLSH).

> COPY FROM times out after 110000 inserts
> 
>
> Key: CASSANDRA-9552
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9552
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>  Labels: cqlsh
>
> I am trying to test out performance of COPY FROM on various schemas.  I have 
> a 100-BIGINT-column table defined as:
> {code}
> CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
> 'replication_factor': '3'}  AND durable_writes = true;
> CREATE TABLE test.test100 (
> pkey bigint,ccol bigint,col0 bigint,col1 bigint,col10 
> bigint,
> col11 bigint,col12 bigint,col13 bigint,col14 bigint,col15 
> bigint,
> col16 bigint,col17 bigint,col18 bigint,col19 bigint,col2 
> bigint,
> col20 bigint,col21 bigint,col22 bigint,col23 bigint,col24 
> bigint,
> col25 bigint,col26 bigint,col27 bigint,col28 bigint,col29 
> bigint,
> col3 bigint,col30 bigint,col31 bigint,col32 bigint,col33 
> bigint,
> col34 bigint,col35 bigint,col36 bigint,col37 bigint,col38 
> bigint,
> col39 bigint,col4 bigint,col40 bigint,col41 bigint,col42 
> bigint,
> col43 bigint,col44 bigint,col45 bigint,col46 bigint,col47 
> bigint,
> col48 bigint,col49 bigint,col5 bigint,col50 bigint,col51 
> bigint,
> col52 bigint,col53 bigint,col54 bigint,col55 bigint,col56 
> bigint,
> col57 bigint,col58 bigint,col59 bigint,col6 bigint,col60 
> bigint,
> col61 bigint,col62 bigint,col63 bigint,col64 bigint,col65 
> bigint,
> col66 bigint,col67 bigint,col68 bigint,col69 bigint,col7 
> bigint,
> col70 bigint,col71 bigint,col72 bigint,col73 bigint,col74 
> bigint,
> col75 bigint,col76 bigint,col77 bigint,col78 bigint,col79 
> bigint,
> col8 bigint,col80 bigint,col81 bigint,col82 bigint,col83 
> bigint,
> col84 bigint,col85 bigint,col86 bigint,col87 bigint,col88 
> bigint,
> col89 bigint,col9 bigint,col90 bigint,col91 bigint,col92 
> bigint,
> col93 bigint,col94 bigint,col95 bigint,col96 bigint,col97 
> bigint,
> PRIMARY KEY (pkey, ccol)
> ) WITH CLUSTERING ORDER BY (ccol ASC)
> AND bloom_filter_fp_chance = 0.01
> AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
> AND comment = ''
> AND compaction = {'min_threshold': '4', 'class': 
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
> 'max_threshold': '32'}
> AND compression = {'sstable_compression': 
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99.0PERCENTILE';
> {code}
> I then try to load the linked file of 120,000 rows of 100 BIGINT columns via:
> {code}
> cqlsh -e "COPY 
> test.test100(pkey,ccol,col0,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25,col26,col27,col28,col29,col30,col31,col32,col33,col34,col35,col36,col37,col38,col39,col40,col41,col42,col43,col44,col45,col46,col47,col48,col49,col50,col51,col52,col53,col54,col55,col56,col57,col58,col59,col60,col61,col62,col63,col64,col65,col66,col67,col68,col69,col70,col71,col72,col73,col74,col75,col76,col77,col78,col79,col80,col81,col82,col83,col84,col85,col86,col87,col88,col89,col90,col91,col92,col93,col94,col95,col96,col97)
>  FROM 'data120K.csv'"
> {code}
> Data file here: 
> https://drive.google.com/file/d/0B87-Pevy14fuUVcxemFRcFFtRjQ/view?usp=sharing
> After 110000 rows, it errors and hangs:
> {code}
> :1:110000 rows; Write: 19848.21 rows/s
> Connection heartbeat failure
> :1:Aborting import at record #1196. Previously inserted records are 
> still present, and some records after that may be present as well.
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9552) COPY FROM times out after 110000 inserts

2015-06-04 Thread Brian Hess (JIRA)
 Brian Hess created CASSANDRA-9552:
--

 Summary: COPY FROM times out after 110000 inserts
 Key: CASSANDRA-9552
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9552
 Project: Cassandra
  Issue Type: Improvement
Reporter:  Brian Hess


I am trying to test out performance of COPY FROM on various schemas.  I have a 
100-BIGINT-column table defined as:

{{
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '3'}  AND durable_writes = true;

CREATE TABLE test.test100 (
pkey bigint,ccol bigint,col0 bigint,col1 bigint,col10 
bigint,
col11 bigint,col12 bigint,col13 bigint,col14 bigint,col15 
bigint,
col16 bigint,col17 bigint,col18 bigint,col19 bigint,col2 
bigint,
col20 bigint,col21 bigint,col22 bigint,col23 bigint,col24 
bigint,
col25 bigint,col26 bigint,col27 bigint,col28 bigint,col29 
bigint,
col3 bigint,col30 bigint,col31 bigint,col32 bigint,col33 
bigint,
col34 bigint,col35 bigint,col36 bigint,col37 bigint,col38 
bigint,
col39 bigint,col4 bigint,col40 bigint,col41 bigint,col42 
bigint,
col43 bigint,col44 bigint,col45 bigint,col46 bigint,col47 
bigint,
col48 bigint,col49 bigint,col5 bigint,col50 bigint,col51 
bigint,
col52 bigint,col53 bigint,col54 bigint,col55 bigint,col56 
bigint,
col57 bigint,col58 bigint,col59 bigint,col6 bigint,col60 
bigint,
col61 bigint,col62 bigint,col63 bigint,col64 bigint,col65 
bigint,
col66 bigint,col67 bigint,col68 bigint,col69 bigint,col7 
bigint,
col70 bigint,col71 bigint,col72 bigint,col73 bigint,col74 
bigint,
col75 bigint,col76 bigint,col77 bigint,col78 bigint,col79 
bigint,
col8 bigint,col80 bigint,col81 bigint,col82 bigint,col83 
bigint,
col84 bigint,col85 bigint,col86 bigint,col87 bigint,col88 
bigint,
col89 bigint,col9 bigint,col90 bigint,col91 bigint,col92 
bigint,
col93 bigint,col94 bigint,col95 bigint,col96 bigint,col97 
bigint,
PRIMARY KEY (pkey, ccol)
) WITH CLUSTERING ORDER BY (ccol ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'min_threshold': '4', 'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
}}

I then try to load the linked file of 120,000 rows of 100 BIGINT columns via:
{{
cqlsh -e "COPY 
test.test100(pkey,ccol,col0,col1,col2,col3,col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16,col17,col18,col19,col20,col21,col22,col23,col24,col25,col26,col27,col28,col29,col30,col31,col32,col33,col34,col35,col36,col37,col38,col39,col40,col41,col42,col43,col44,col45,col46,col47,col48,col49,col50,col51,col52,col53,col54,col55,col56,col57,col58,col59,col60,col61,col62,col63,col64,col65,col66,col67,col68,col69,col70,col71,col72,col73,col74,col75,col76,col77,col78,col79,col80,col81,col82,col83,col84,col85,col86,col87,col88,col89,col90,col91,col92,col93,col94,col95,col96,col97)
 FROM 'data120K.csv'"
}}

Data file here: 
https://drive.google.com/file/d/0B87-Pevy14fuUVcxemFRcFFtRjQ/view?usp=sharing

After 110000 rows, it errors and hangs:
{{
:1:110000 rows; Write: 19848.21 rows/s
Connection heartbeat failure
:1:Aborting import at record #1196. Previously inserted records are 
still present, and some records after that may be present as well.
}}




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9550) Support fixed precision data type

2015-06-04 Thread Brian Hess (JIRA)
 Brian Hess created CASSANDRA-9550:
--

 Summary: Support fixed precision data type
 Key: CASSANDRA-9550
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9550
 Project: Cassandra
  Issue Type: Improvement
Reporter:  Brian Hess


SQL databases support a fixed precision data type.  When converting data models 
from SQL DBs to Cassandra, the choices are not clear.  DOUBLE or FLOAT are 
possibilities, but with UDFs and Aggregates this is actually a lot less clear, 
as you really do need fixed precision arithmetic.  That is, one flaw with 
floating-point math is that the order of the operations can change the final 
result.  That is not an issue with fixed precision arithmetic (though other 
trade-offs are there).
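The order-of-operations issue can be demonstrated in a few lines of Java; the magnitudes below are chosen to make the absorption obvious:

```java
import java.math.BigDecimal;

// Double addition is not associative: a small term can be absorbed by a
// large one depending on grouping, so the order of operations changes the
// result. The same values in decimal arithmetic are order-independent.
public class AssocDemo {
    public static void main(String[] args) {
        double a = 1e16, b = -1e16, c = 1.0;
        System.out.println((a + b) + c);  // 1.0
        System.out.println(a + (b + c));  // 0.0 -- c is absorbed into -1e16

        BigDecimal da = new BigDecimal("1e16");
        BigDecimal db = new BigDecimal("-1e16");
        BigDecimal dc = BigDecimal.ONE;
        System.out.println(da.add(db).add(dc));   // 1
        System.out.println(da.add(db.add(dc)));   // 1
    }
}
```

This matters for UDAs in particular, since an aggregate over a distributed table gives no guarantee about the order in which partial sums are combined.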



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-9402) Implement proper sandboxing for UDFs

2015-05-26 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559420#comment-14559420
 ] 

 Brian Hess edited comment on CASSANDRA-9402 at 5/26/15 5:17 PM:
-

I do think that you may want to leverage FENCED UDFs to accomplish some of the 
objections/objectives here.  
Either way, I created CASSANDRA-9481 for FENCED UDFs


was (Author: brianmhess):
I'm do think that you may want to leverage FENCED UDFs to accomplish some of 
the objections/objectives here.  
Either way, I created CASSANDRA-9481 for FENCED UDFs

> Implement proper sandboxing for UDFs
> 
>
> Key: CASSANDRA-9402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9402
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>Assignee: Robert Stupp
>Priority: Critical
>  Labels: doc-impacting, security
> Fix For: 2.2.0 rc1
>
>
> We want to avoid a security exploit for our users.  We need to make sure we 
> ship 2.2 UDFs with good defaults so someone exposing it to the internet 
> accidentally doesn't open themselves up to having arbitrary code run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9402) Implement proper sandboxing for UDFs

2015-05-26 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559420#comment-14559420
 ] 

 Brian Hess commented on CASSANDRA-9402:


I do think that you may want to leverage FENCED UDFs to accomplish some of 
the objections/objectives here.  
Either way, I created CASSANDRA-9481 for FENCED UDFs

> Implement proper sandboxing for UDFs
> 
>
> Key: CASSANDRA-9402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9402
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>Assignee: Robert Stupp
>Priority: Critical
>  Labels: doc-impacting, security
> Fix For: 2.2.0 rc1
>
>
> We want to avoid a security exploit for our users.  We need to make sure we 
> ship 2.2 UDFs with good defaults so someone exposing it to the internet 
> accidentally doesn't open themselves up to having arbitrary code run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9481) FENCED UDFs

2015-05-26 Thread Brian Hess (JIRA)
 Brian Hess created CASSANDRA-9481:
--

 Summary: FENCED UDFs
 Key: CASSANDRA-9481
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9481
 Project: Cassandra
  Issue Type: New Feature
Reporter:  Brian Hess


Related to security/sandboxing of UDFs (CASSANDRA-9402)
Essentially, the UDF will run in a separate process when it is registered as 
FENCED, and run in-process when it is registered as UNFENCED.

This doesn't necessarily remove all the issues, but it does help mitigate 
them/some - especially since it would (optionally) run as another user.

This could look like the following with Cassandra:
- FENCED is a GRANTable privilege
- In cassandra.yaml you can specify the user to use when launching the separate 
process (so that it is not the same user that is running the database - or 
optionally is)
  - This is good so that the UDF can't stop the database, delete database 
files, etc.
- For FENCED UDFs, IPC would be used to transfer rows to the UDF and to return 
results. We could use CQL rows for the data. This could be shared memory or 
sockets (Unix or TCP - slight preference for sockets for some follow-on ideas).
- Ideally, switching from FENCED to UNFENCED would be just a DDL change. That 
is, the API would work such that a simple "ALTER FUNCTION myFunction(DOUBLE, 
DOUBLE) UNFENCED" would change it.
- If you wanted, because this is a separate process you could use a separate 
class loader.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9402) Implement proper sandboxing for UDFs

2015-05-26 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14559291#comment-14559291
 ] 

 Brian Hess commented on CASSANDRA-9402:


There is another alternative - FENCED UDFs.  DB2 and Netezza had these options. 
 Essentially, the UDF will run in a separate process when it is registered as 
FENCED, and run in-process when it is registered as UNFENCED.

This doesn't necessarily remove all the issues, but it does help mitigate 
them/some - especially since it would (optionally) run as another user.

This could look like the following with Cassandra:
- FENCED is a GRANTable privilege
- In cassandra.yaml you can specify the user to use when launching the separate 
process (so that it is *not* the same user that is running the database - or 
optionally is)
  - This is good so that the UDF can't stop the database, delete database 
files, etc.
- For FENCED UDFs, IPC would be used to transfer rows to the UDF and to return 
results.  We could use CQL rows for the data.  This could be shared memory or 
sockets (Unix or TCP - slight preference for sockets for some follow-on ideas).
- Ideally, switching from FENCED to UNFENCED would be just a DDL change.  That 
is, the API would work such that a simple "ALTER FUNCTION myFunction(DOUBLE, 
DOUBLE) UNFENCED" would change it.
- If you wanted, because this is a separate process you could use a separate 
class loader.
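The IPC idea in the list above can be sketched as follows. This is a toy: a thread stands in for the fenced process so the example is self-contained, and the wire format (two doubles in, one double out) is invented for illustration - a real implementation would fork a separate process, optionally as another user, and ship CQL rows:

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;

// Toy sketch of a fenced UDF call: the "UDF" runs outside the caller and
// exchanges arguments/results over a local socket. A thread stands in for
// the separate process here purely to keep the sketch self-contained.
public class FencedUdfSketch {
    public static double callFenced(double x, double y) {
        try (ServerSocket server = new ServerSocket(0)) {
            Thread worker = new Thread(() -> {
                try (Socket s = new Socket("localhost", server.getLocalPort());
                     DataInputStream in = new DataInputStream(s.getInputStream());
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    out.writeDouble(in.readDouble() + in.readDouble());  // the "UDF body"
                } catch (IOException ignored) {
                }
            });
            worker.start();
            try (Socket s = server.accept();
                 DataOutputStream out = new DataOutputStream(s.getOutputStream());
                 DataInputStream in = new DataInputStream(s.getInputStream())) {
                out.writeDouble(x);   // ship the arguments to the fenced side
                out.writeDouble(y);
                out.flush();
                return in.readDouble();  // read back the result
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Switching between FENCED and UNFENCED would then just swap this transport for a direct in-process call, which is why a pure DDL change suffices.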

> Implement proper sandboxing for UDFs
> 
>
> Key: CASSANDRA-9402
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9402
> Project: Cassandra
>  Issue Type: Task
>Reporter: T Jake Luciani
>Assignee: Robert Stupp
>Priority: Critical
>  Labels: doc-impacting, security
> Fix For: 2.2.0 rc1
>
>
> We want to avoid a security exploit for our users.  We need to make sure we 
> ship 2.2 UDFs with good defaults so someone exposing it to the internet 
> accidentally doesn't open themselves up to having arbitrary code run.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-05-21 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14554333#comment-14554333
 ] 

 Brian Hess commented on CASSANDRA-6477:


Can someone clarify whether, based on the current implementation, the 
Materialized View (or some of the Materialized Views, if there are more than 
one) can contain the mutation but the base table does not - or vice versa?  Can 
you clarify what happens in each of the failure scenarios?  It will be very 
important for us to know for sure when we discuss this with users/customers.

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.0 beta 1
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-6477) Materialized Views (was: Global Indexes)

2015-05-18 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-6477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549088#comment-14549088
 ] 

 Brian Hess commented on CASSANDRA-6477:


I really don't need to re-ignite the nomenclature debate, but another name that 
might be more fitting with how we've talked about these tables in the past is 
DB2's name: Materialized Query Tables
Anyone like or hate that name more?

> Materialized Views (was: Global Indexes)
> 
>
> Key: CASSANDRA-6477
> URL: https://issues.apache.org/jira/browse/CASSANDRA-6477
> Project: Cassandra
>  Issue Type: New Feature
>  Components: API, Core
>Reporter: Jonathan Ellis
>Assignee: Carl Yeksigian
>  Labels: cql
> Fix For: 3.x
>
>
> Local indexes are suitable for low-cardinality data, where spreading the 
> index across the cluster is a Good Thing.  However, for high-cardinality 
> data, local indexes require querying most nodes in the cluster even if only a 
> handful of rows is returned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9415) Implicit use of Materialized Views on SELECT

2015-05-18 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548608#comment-14548608
 ] 

 Brian Hess commented on CASSANDRA-9415:


[~jbellis] Oracle, DB2, and SQL Server do this (at least - maybe others).

In Oracle it is Materialized Views (see 
http://docs.oracle.com/cd/B28359_01/server.111/b28313/basicmv.htm):
bq. The end user queries the tables and views at the detail data level. The 
query rewrite mechanism in the Oracle server automatically rewrites the SQL 
query to use the summary tables. This mechanism reduces response time for 
returning results from the query. Materialized views within the data warehouse 
are transparent to the end user or to the database application.

In DB2 it is Materialized Query Tables (see 
http://www.ibm.com/developerworks/data/library/techarticle/dm-0509melnyk/)
bq. Materialized query tables can significantly improve the performance of 
queries, especially complex queries. If the optimizer determines that a query 
or part of a query could be resolved using an MQT, the query might be rewritten 
to take advantage of the MQT.

In SQL Server it is Indexed Views (see 
https://msdn.microsoft.com/en-us/library/dd171921(SQL.100).aspx):
bq. The indexed view can be used in a query execution in two ways. The query 
can reference the indexed view directly, or, more importantly, the query 
optimizer can select the view if it determines that the view can be substituted 
for some or all of the query in the lowest-cost query plan. In the second case, 
the indexed view is used instead of the underlying tables and their ordinary 
indexes. The view does not need to be referenced in the query for the query 
optimizer to use it during query execution. This allows existing applications 
to benefit from the newly created indexed views without changing those 
applications.


> Implicit use of Materialized Views on SELECT
> 
>
> Key: CASSANDRA-9415
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9415
> Project: Cassandra
>  Issue Type: Improvement
>Reporter:  Brian Hess
>  Labels: ponies
>
> CASSANDRA-6477 introduces Materialized Views.  This greatly simplifies the 
> write path for the best-practice of "query tables".  But it does not simplify 
> the read path as much as our users want/need.
> We suggest to folks to create multiple copies of their base table optimized 
> for certain queries - hence "query table".  For example, we may have a USER 
> table with two types of queries: lookup by userid and lookup by email address. 
>  We would recommend creating 2 tables USER_BY_USERID and USER_BY_EMAIL.  Both 
> would have the exact same schema, with the same PRIMARY KEY columns, but 
> different PARTITION KEY - the first would be USERID and the second would be 
> EMAIL.
> One complicating thing with this approach is that the application now needs 
> to know that when it INSERT/UPDATE/DELETEs from the base table it needs to 
> INSERT/UPDATE/DELETE from all of the query tables as well.  CASSANDRA-6477 
> covers this nicely.
> However, the other side of the coin is that the application needs to know 
> which query table to leverage based on the selection criteria.  Using the 
> example above, if the query has a predicate such as "WHERE userid = 'bhess'", 
> then USERS_BY_USERID is the better table to use.  Similarly, when the 
> predicate is "WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is 
> appropriate.
> On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" 
> to the collection of tables.  You do operations just on the base table.  It 
> is very attractive for the SELECT side as well.  It would be very good to 
> allow an application to simply do "SELECT * FROM users WHERE userid = 
> 'bhess'" and have that query implicitly leverage the USERS_BY_USERID 
> materialized view.
> For additional use cases, especially analytics use cases like in Spark, this 
> allows the Spark code to simply push down the query without having to know 
> about all of the MVs that have been set up.  The system will route the query 
> appropriately.  And if additional MVs are necessary to make a query run 
> better/faster, then those MVs can be set up and Spark will implicitly 
> leverage them.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-8234) CTAS for COPY

2015-05-18 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548484#comment-14548484
 ] 

 Brian Hess commented on CASSANDRA-8234:


What will the user-experience be for CTAS?  Will it come under the same query 
timeout as with a normal SELECT query?  Will it be synchronous or asynchronous? 

> CTAS for COPY
> -
>
> Key: CASSANDRA-8234
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8234
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
>Reporter: Robin Schumacher
> Fix For: 3.x
>
>
> Continuous request from users is the ability to do CREATE TABLE AS SELECT... 
> The COPY command can be enhanced to perform simple and customized copies of 
> existing tables to satisfy the need. 
> - Simple copy is COPY table a TO new table b.
> - Custom copy can mimic Postgres: (e.g. COPY (SELECT * FROM country WHERE 
> country_name LIKE 'A%') TO …)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9200) Sequences

2015-05-18 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548436#comment-14548436
 ] 

 Brian Hess commented on CASSANDRA-9200:


[~jjordan] - I was responding to this comment "numbers that a human should be 
able to read" from [~snazy]. It would at least cover that space.

Hence asking for the real customer need for increasing/decreasing values.

> Sequences
> -
>
> Key: CASSANDRA-9200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9200
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> UUIDs are usually the right choice for surrogate keys, but sometimes 
> application constraints dictate an increasing numeric value.
> We could do this by using LWT to reserve "blocks" of the sequence for each 
> member of the cluster, which would eliminate paxos contention at the cost of 
> not being strictly increasing.
> PostgreSQL syntax: 
> http://www.postgresql.org/docs/9.4/static/sql-createsequence.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9415) Implicit use of Materialized Views on SELECT

2015-05-18 Thread Brian Hess (JIRA)
 Brian Hess created CASSANDRA-9415:
--

 Summary: Implicit use of Materialized Views on SELECT
 Key: CASSANDRA-9415
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9415
 Project: Cassandra
  Issue Type: Improvement
Reporter:  Brian Hess


CASSANDRA-6477 introduces Materialized Views.  This greatly simplifies the 
write path for the best-practice of "query tables".  But it does not simplify 
the read path as much as our users want/need.

We suggest that folks create multiple copies of their base table, each 
optimized for certain queries - hence "query tables".  For example, we may have 
a USER table with two types of queries: lookup by userid and lookup by email 
address.  We would recommend creating 2 tables, USER_BY_USERID and 
USER_BY_EMAIL.  Both would have the exact same schema, with the same PRIMARY 
KEY columns, but different PARTITION KEYs - the first would be USERID and the 
second would be EMAIL.

One complicating thing with this approach is that the application now needs to 
know that when it INSERT/UPDATE/DELETEs from the base table it needs to 
INSERT/UPDATE/DELETE from all of the query tables as well.  CASSANDRA-6477 
covers this nicely.

However, the other side of the coin is that the application needs to know which 
query table to leverage based on the selection criteria.  Using the example 
above, if the query has a predicate such as "WHERE userid = 'bhess'", then 
USERS_BY_USERID is the better table to use.  Similarly, when the predicate is 
"WHERE email = 'bhess@company.whatever'", USERS_BY_EMAIL is appropriate.

On INSERT/UPDATE/DELETE, Materialized Views essentially give a single "name" to 
the collection of tables.  You do operations just on the base table.  It is 
very attractive for the SELECT side as well.  It would be very good to allow an 
application to simply do "SELECT * FROM users WHERE userid = 'bhess'" and have 
that query implicitly leverage the USERS_BY_USERID materialized view.

For additional use cases, especially analytics use cases like in Spark, this 
allows the Spark code to simply push down the query without having to know 
about all of the MVs that have been set up.  The system will route the query 
appropriately.  And if additional MVs are necessary to make a query run 
better/faster, then those MVs can be set up and Spark will implicitly leverage 
them.
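To make the routing idea concrete, here is a minimal sketch in Python using the 
hypothetical table names from the USERS example above.  In practice this 
dispatch would live inside Cassandra's query planning, not in application code; 
this is only an illustration of the mapping from predicate column to query 
table:

```python
# Hypothetical router: pick the query table whose partition key matches
# the query's equality predicate, mirroring the USERS example above.
QUERY_TABLES = {
    "userid": "users_by_userid",
    "email": "users_by_email",
}

def route(predicate_column: str) -> str:
    """Return the query table to use for an equality predicate on the
    given column, or raise if no suitable table exists."""
    try:
        return QUERY_TABLES[predicate_column]
    except KeyError:
        raise ValueError(f"no query table keyed by {predicate_column!r}")

# "WHERE userid = 'bhess'" implicitly routes to users_by_userid:
table = route("userid")
```

With implicit MV selection, the application (or Spark) would always name the 
base table and the system would perform the equivalent of this lookup.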



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9200) Sequences

2015-05-18 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14548364#comment-14548364
 ] 

 Brian Hess commented on CASSANDRA-9200:


Can you not get a lot of what you want with a UDF that maps a UUID to a BIGINT 
hash?  That will not be increasing/decreasing numbers, but will be an 8-byte 
integer.  Can someone explain more on how the increasing/decreasing quality is 
needed/leveraged/used by customers?
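As an illustration of the UDF idea (not Cassandra UDF syntax - just the hash 
itself, sketched in Python with an MD5-based hash standing in for whatever 
function the UDF would use):

```python
import hashlib
import uuid

def uuid_to_bigint(u: uuid.UUID) -> int:
    """Map a UUID deterministically to a signed 64-bit integer
    (BIGINT-sized).  The result is compact but NOT increasing, which
    is exactly the trade-off discussed above."""
    digest = hashlib.md5(u.bytes).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

u = uuid.uuid4()
h = uuid_to_bigint(u)
```

The mapping is stable for a given UUID, so it can serve as an 8-byte surrogate 
key wherever monotonicity is not actually required.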

> Sequences
> -
>
> Key: CASSANDRA-9200
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9200
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: Jonathan Ellis
>Assignee: Robert Stupp
> Fix For: 3.x
>
>
> UUIDs are usually the right choice for surrogate keys, but sometimes 
> application constraints dictate an increasing numeric value.
> We could do this by using LWT to reserve "blocks" of the sequence for each 
> member of the cluster, which would eliminate paxos contention at the cost of 
> not being strictly increasing.
> PostgreSQL syntax: 
> http://www.postgresql.org/docs/9.4/static/sql-createsequence.html



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (CASSANDRA-9259) Bulk Reading from Cassandra

2015-04-28 Thread Brian Hess (JIRA)
 Brian Hess created CASSANDRA-9259:
--

 Summary: Bulk Reading from Cassandra
 Key: CASSANDRA-9259
 URL: https://issues.apache.org/jira/browse/CASSANDRA-9259
 Project: Cassandra
  Issue Type: Improvement
  Components: Core
Reporter:  Brian Hess


This ticket is following on from the 2015 NGCC.  This ticket is designed to be 
a place for discussing and designing an approach to bulk reading.

The goal is to have a bulk reading path for Cassandra.  That is, a path 
optimized to grab a large portion of the data for a table (potentially all of 
it).  This is a core element in the Spark integration with Cassandra, and the 
speed at which Cassandra can deliver bulk data to Spark is limiting the 
performance of Spark-plus-Cassandra operations.  This is especially of 
importance as Cassandra will (likely) leverage Spark for internal operations 
(for example CASSANDRA-8234).

The core CQL to consider is the following:
SELECT a, b, c FROM myKs.myTable WHERE Token(partitionKey) > X AND 
Token(partitionKey) <= Y

Here, we choose X and Y to be contained within one token range (perhaps 
considering the primary range of a node without vnodes, for example).  This 
query pushes 50K-100K rows/sec, which is not very fast if we are doing bulk 
operations via Spark (or other processing frameworks - ETL, etc).  There are a 
few causes (e.g., inefficient paging).
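For reference, choosing the X/Y pairs amounts to carving the Murmur3 token 
space (a signed 64-bit range) into contiguous (X, Y] subranges, one per 
parallel reader.  A sketch, assuming the Murmur3Partitioner's full range:

```python
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1

def split_token_range(n: int):
    """Split the full Murmur3 token space into n contiguous (X, Y]
    subranges, each usable in a query of the form:
      SELECT ... WHERE Token(partitionKey) > X AND Token(partitionKey) <= Y
    """
    total = MAX_TOKEN - MIN_TOKEN
    ranges = []
    start = MIN_TOKEN
    for i in range(1, n + 1):
        end = MIN_TOKEN + (total * i) // n
        ranges.append((start, end))
        start = end
    return ranges

rs = split_token_range(8)
```

(With vnodes, one would instead enumerate the cluster's actual token ranges, 
but the query shape per subrange is the same.)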

There are a few approaches that could be considered.  First, we consider a new 
"Streaming Compaction" approach.  The key observation here is that a bulk read 
from Cassandra is a lot like a major compaction, though instead of outputting a 
new SSTable we would output CQL rows to a stream/socket/etc.  This would be 
similar to a CompactionTask, but would strip out some unnecessary things in 
there (e.g., some of the indexing, etc). Predicates and projections could also 
be encapsulated in this new "StreamingCompactionTask", for example.

Another approach would be an alternate storage format.  For example, we might 
employ Parquet (just as an example) to store the same data as in the primary 
Cassandra storage (aka SSTables).  This is akin to Global Indexes (an alternate 
storage of the same data optimized for a particular query).  Then, Cassandra 
can choose to leverage this alternate storage for particular CQL queries (e.g., 
range scans).

These are just 2 suggestions to get the conversation going.

One thing to note is that it will be useful to have this storage segregated by 
token range so that when you extract via these mechanisms you do not get 
replications-factor numbers of copies of the data.  That will certainly be an 
issue for some Spark operations (e.g., counting).  Thus, we will want 
per-token-range storage (even for single disks), so this will likely leverage 
CASSANDRA-6696 (though, we'll want to also consider the single disk case).

It is also worth discussing what the success criteria are here.  It is unlikely 
to be as fast as EDW or HDFS performance (though, that is still a good goal), 
but being within some percentage of that performance should be set as success.  
For example, 2x as long as doing bulk operations on HDFS with similar node 
count/size/etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader

2015-03-27 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384764#comment-14384764
 ] 

 Brian Hess commented on CASSANDRA-9048:


[~thobbs] - do you mean CASSANDRA-8234?  That's CTAS for COPY.  I think we all 
agree that using Spark for that use case is very appropriate.  This is about 
bulk loading files from outside Cassandra.

> Delimited File Bulk Loader
> --
>
> Key: CASSANDRA-9048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter:  Brian Hess
> Fix For: 3.0
>
> Attachments: CASSANDRA-9048.patch
>
>
> There is a strong need for bulk loading data from delimited files into 
> Cassandra.  Starting with delimited files means that the data is not 
> currently in the SSTable format, and therefore cannot immediately leverage 
> Cassandra's bulk loading tool, sstableloader, directly.
> A tool that accepts delimited files matches the format the data arrives in 
> far more often than the SSTable format does, and a tool that loads from 
> delimited files is very useful.
> In order for this bulk loader to be more generally useful to customers, it 
> should handle a number of options at a minimum:
> - support specifying the input file or to read the data from stdin (so other 
> command-line programs can pipe into the loader)
> - supply the CQL schema for the input data
> - support all data types other than collections (collections is a stretch 
> goal/need)
> - an option to specify the delimiter
> - an option to specify comma as the decimal delimiter (for international use 
> cases)
> - an option to specify how NULL values are specified in the file (e.g., the 
> empty string or the string NULL)
> - an option to specify how BOOLEAN values are specified in the file (e.g., 
> TRUE/FALSE or 0/1)
> - an option to specify the Date and Time format
> - an option to skip some number of rows at the beginning of the file
> - an option to only read in some number of rows from the file
> - an option to indicate how many parse errors to tolerate
> - an option to specify a file that will contain all the lines that did not 
> parse correctly (up to the maximum number of parse errors)
> - an option to specify the CQL port to connect to (with 9042 as the default).
> Additional options would be useful, but this set of options/features is a 
> start.
> A word on COPY.  COPY comes via CQLSH which requires the client to be the 
> same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, 
> etc).  This tool should be able to connect to any version of Cassandra 
> (within reason).  For example, it should be able to handle 2.0.x and 2.1.x.  
> Moreover, CQLSH's COPY command does not support a number of the options 
> above.  Lastly, the performance of COPY in 2.0.x is not high enough to be 
> considered a bulk ingest tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader

2015-03-27 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384448#comment-14384448
 ] 

 Brian Hess commented on CASSANDRA-9048:


To answer [~jbellis]'s and [~iamaleksey]'s question about performance.  I took 
the latest 2.1 (including [~thobbs]'s latest improvements to COPY) and set up a 
5-node i2.xlarge cluster in EC2.  I had another i2.xlarge node in the same EC2 
data center be the client.  I compared the COPY command with the java tool in 
the patch.

There were 2 tests I did.  The first was 13107200 rows of 10 BIGINTs (so about 
1GB of data) from a single file (on disk about 630MB).  COPY was able to load 
this in 16:46 (1006 seconds).  The java tool did it in 8:24 (504 seconds).  So, 
the java tool was twice as fast as COPY.

The second test was 10M rows of 6 BIGINTs, 2 DOUBLEs, and 2 TEXT fields (random 
date strings).  COPY failed to complete this load (I tried a number of times - 
it failed with "" - which is also weird since CQLSH defaults to CL_ONE), but the 
java tool completed it in 6:58 (418 seconds).  I dropped the number of rows to 
4M so COPY could complete.  COPY was able to load in 5:17 (317 seconds).  The 
java tool completed the load in 2:44 (164 seconds).  Again, the java tool is 
about twice as fast as COPY.
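A quick back-of-envelope check on the first test's throughput, from the numbers 
above:

```python
# First test: 13,107,200 rows of 10 BIGINTs.
# COPY took 1006 seconds; the java tool took 504 seconds.
rows = 13_107_200
copy_rps = rows / 1006   # roughly 13K rows/sec
tool_rps = rows / 504    # roughly 26K rows/sec
speedup = tool_rps / copy_rps
```

The speedup works out to just about 2.0x, consistent with the "twice as fast" 
observation.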

> Delimited File Bulk Loader
> --
>
> Key: CASSANDRA-9048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter:  Brian Hess
> Fix For: 3.0
>
> Attachments: CASSANDRA-9048.patch
>
>
> There is a strong need for bulk loading data from delimited files into 
> Cassandra.  Starting with delimited files means that the data is not 
> currently in the SSTable format, and therefore cannot immediately leverage 
> Cassandra's bulk loading tool, sstableloader, directly.
> A tool that accepts delimited files matches the format the data arrives in 
> far more often than the SSTable format does, and a tool that loads from 
> delimited files is very useful.
> In order for this bulk loader to be more generally useful to customers, it 
> should handle a number of options at a minimum:
> - support specifying the input file or to read the data from stdin (so other 
> command-line programs can pipe into the loader)
> - supply the CQL schema for the input data
> - support all data types other than collections (collections is a stretch 
> goal/need)
> - an option to specify the delimiter
> - an option to specify comma as the decimal delimiter (for international use 
> cases)
> - an option to specify how NULL values are specified in the file (e.g., the 
> empty string or the string NULL)
> - an option to specify how BOOLEAN values are specified in the file (e.g., 
> TRUE/FALSE or 0/1)
> - an option to specify the Date and Time format
> - an option to skip some number of rows at the beginning of the file
> - an option to only read in some number of rows from the file
> - an option to indicate how many parse errors to tolerate
> - an option to specify a file that will contain all the lines that did not 
> parse correctly (up to the maximum number of parse errors)
> - an option to specify the CQL port to connect to (with 9042 as the default).
> Additional options would be useful, but this set of options/features is a 
> start.
> A word on COPY.  COPY comes via CQLSH which requires the client to be the 
> same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, 
> etc).  This tool should be able to connect to any version of Cassandra 
> (within reason).  For example, it should be able to handle 2.0.x and 2.1.x.  
> Moreover, CQLSH's COPY command does not support a number of the options 
> above.  Lastly, the performance of COPY in 2.0.x is not high enough to be 
> considered a bulk ingest tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader

2015-03-26 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382724#comment-14382724
 ] 

 Brian Hess commented on CASSANDRA-9048:


[~carlyeks] and [~jjirsa] - this tool (as implemented, at least), does not 
depend on the version of Cassandra.  It leverages the Java driver, much like 
cassandra-stress, and can be pointed at C* 2.0 or 2.1, etc.  Along similar 
lines, it aligns with the tree in the same way that cassandra-stress does.

[~iamaleksey] - The use case here is that a client machine has a pile of 
delimited files that need to be loaded in bulk to Cassandra - a common use case 
we see.  In the Spark-based tool, you would have to have Spark on the client 
(perhaps that's okay) but moreover it would be reading files from the local 
filesystem (or stdin), not from a distributed file system, so there would be no 
parallelism from Spark. 

I think that a Spark-based tool to do COPY for files in a distributed file 
system might make sense, but I don't think it applies for files on the one 
machine.  Spark certainly also makes sense for the CREATE-TABLE-AS-SELECT 
(which is what CASSANDRA-8234 is about, not about loading).

> Delimited File Bulk Loader
> --
>
> Key: CASSANDRA-9048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter:  Brian Hess
> Fix For: 3.0
>
> Attachments: CASSANDRA-9048.patch
>
>
> There is a strong need for bulk loading data from delimited files into 
> Cassandra.  Starting with delimited files means that the data is not 
> currently in the SSTable format, and therefore cannot immediately leverage 
> Cassandra's bulk loading tool, sstableloader, directly.
> A tool that accepts delimited files matches the format the data arrives in 
> far more often than the SSTable format does, and a tool that loads from 
> delimited files is very useful.
> In order for this bulk loader to be more generally useful to customers, it 
> should handle a number of options at a minimum:
> - support specifying the input file or to read the data from stdin (so other 
> command-line programs can pipe into the loader)
> - supply the CQL schema for the input data
> - support all data types other than collections (collections is a stretch 
> goal/need)
> - an option to specify the delimiter
> - an option to specify comma as the decimal delimiter (for international use 
> cases)
> - an option to specify how NULL values are specified in the file (e.g., the 
> empty string or the string NULL)
> - an option to specify how BOOLEAN values are specified in the file (e.g., 
> TRUE/FALSE or 0/1)
> - an option to specify the Date and Time format
> - an option to skip some number of rows at the beginning of the file
> - an option to only read in some number of rows from the file
> - an option to indicate how many parse errors to tolerate
> - an option to specify a file that will contain all the lines that did not 
> parse correctly (up to the maximum number of parse errors)
> - an option to specify the CQL port to connect to (with 9042 as the default).
> Additional options would be useful, but this set of options/features is a 
> start.
> A word on COPY.  COPY comes via CQLSH which requires the client to be the 
> same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, 
> etc).  This tool should be able to connect to any version of Cassandra 
> (within reason).  For example, it should be able to handle 2.0.x and 2.1.x.  
> Moreover, CQLSH's COPY command does not support a number of the options 
> above.  Lastly, the performance of COPY in 2.0.x is not high enough to be 
> considered a bulk ingest tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (CASSANDRA-9048) Delimited File Bulk Loader

2015-03-26 Thread Brian Hess (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-9048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14382474#comment-14382474
 ] 

 Brian Hess commented on CASSANDRA-9048:


I have created a version of this as a Java program via executeAsync().  Some 
testing has shown that for bulk writing to Cassandra, if you are starting with 
delimited files (not SSTables), Java's executeAsync() is more 
efficient/performant than creating SSTables and then calling sstableloader.

This implementation provides for the options above, as well as a way to specify 
the parallelism of the asynchronous writing (the number of futures "in 
flight").  In addition to the Java implementation, I created a command-line 
utility a la cassandra-stress called cassandra-loader to invoke the Java 
classes with the appropriate CLASSPATH.  As such, I also modified build.xml and 
tools/bin/cassandra.in.sh as appropriate.
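The "futures in flight" pattern itself is straightforward to sketch.  Below is 
an illustration in Python (the actual tool is Java against the driver's 
executeAsync; here write_fn is a stand-in for the async write, and the 
semaphore plays the role of the -numFutures bound):

```python
import concurrent.futures
import threading

def bounded_async_writes(rows, write_fn, num_futures=1000):
    """Submit writes asynchronously while keeping at most num_futures
    of them 'in flight' at once, mirroring the -numFutures knob."""
    sem = threading.Semaphore(num_futures)
    futures = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=32) as pool:
        for row in rows:
            sem.acquire()  # block when the in-flight window is full
            fut = pool.submit(write_fn, row)
            fut.add_done_callback(lambda _f: sem.release())
            futures.append(fut)
        # Collect results in submission order; surfaces any write errors.
        return [fut.result() for fut in futures]

out = bounded_async_writes(range(100), lambda r: r * 2, num_futures=8)
```

Bounding the window keeps client memory flat and avoids overwhelming the 
coordinators, while still overlapping many requests.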

The patch is attached for review.

The command-line usage statement is:

{{Usage: -f <filename> -host <ipaddress> -schema <schema> [OPTIONS]
OPTIONS:
  -delim <delimiter>            Delimiter to use [,]
  -delmInQuotes true            Set to 'true' if delimiter can be inside quoted fields [false]
  -dateFormat <dateFormat>      Date format [default for Locale.ENGLISH]
  -nullString <nullString>      String that signifies NULL [none]
  -skipRows <skipRows>          Number of rows to skip [0]
  -maxRows <maxRows>            Maximum number of rows to read (-1 means all) [-1]
  -maxErrors <maxErrors>        Maximum number of parse errors to endure [10]
  -badFile <badFilename>        Filename for where to place badly parsed rows [none]
  -port <portNumber>            CQL port number [9042]
  -numFutures <numFutures>      Number of CQL futures to keep in flight [1000]
  -decimalDelim <decimalDelim>  Decimal delimiter [.] Other option is ','
  -boolStyle <boolStyle>        Style for booleans [TRUE_FALSE]}}


> Delimited File Bulk Loader
> --
>
> Key: CASSANDRA-9048
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9048
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Tools
>Reporter:  Brian Hess
> Attachments: CASSANDRA-9048.patch
>
>
> There is a strong need for bulk loading data from delimited files into 
> Cassandra.  Starting with delimited files means that the data is not 
> currently in the SSTable format, and therefore cannot immediately leverage 
> Cassandra's bulk loading tool, sstableloader, directly.
> A tool that accepts delimited files matches the format the data arrives in 
> far more often than the SSTable format does, and a tool that loads from 
> delimited files is very useful.
> In order for this bulk loader to be more generally useful to customers, it 
> should handle a number of options at a minimum:
> - support specifying the input file or to read the data from stdin (so other 
> command-line programs can pipe into the loader)
> - supply the CQL schema for the input data
> - support all data types other than collections (collections is a stretch 
> goal/need)
> - an option to specify the delimiter
> - an option to specify comma as the decimal delimiter (for international use 
> cases)
> - an option to specify how NULL values are specified in the file (e.g., the 
> empty string or the string NULL)
> - an option to specify how BOOLEAN values are specified in the file (e.g., 
> TRUE/FALSE or 0/1)
> - an option to specify the Date and Time format
> - an option to skip some number of rows at the beginning of the file
> - an option to only read in some number of rows from the file
> - an option to indicate how many parse errors to tolerate
> - an option to specify a file that will contain all the lines that did not 
> parse correctly (up to the maximum number of parse errors)
> - an option to specify the CQL port to connect to (with 9042 as the default).
> Additional options would be useful, but this set of options/features is a 
> start.
> A word on COPY.  COPY comes via CQLSH which requires the client to be the 
> same version as the server (e.g., 2.0 CQLSH does not work with 2.1 Cassandra, 
> etc).  This tool should be able to connect to any version of Cassandra 
> (within reason).  For example, it should be able to handle 2.0.x and 2.1.x.  
> Moreover, CQLSH's COPY command does not support a number of the options 
> above.  Lastly, the performance of COPY in 2.0.x is not high enough to be 
> considered a bulk ingest tool.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

