Re: Triggers

2020-12-15 Thread Brian Hess
This can be accomplished with batches (logged or unlogged), but that means that 
all apps need to be updated to dual-write. With a Trigger, an admin can tee the 
data without having to change a line of app code. 
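
For illustration, a rough sketch of the two approaches (keyspace, table, and 
class names are hypothetical); the batch route means every writing application 
issues something like the batch below, while the trigger route is a one-time 
DDL statement whose tee logic lives in a server-side class implementing 
org.apache.cassandra.triggers.ITrigger:

  -- App-side dual-write: every writer must name both tables itself.
  BEGIN BATCH
    INSERT INTO mykeyspace.mytable           (id, ts, value) VALUES (?, ?, ?);
    INSERT INTO mykeyspace.mytable_analytics (id, ts, value) VALUES (?, ?, ?);
  APPLY BATCH;

  -- Admin-side tee: a single DDL change, no application changes.
  -- 'com.example.TeeTrigger' is a hypothetical ITrigger implementation whose
  -- jar has been deployed into each node's triggers directory.
  CREATE TRIGGER tee_to_analytics ON mykeyspace.mytable
    USING 'com.example.TeeTrigger';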

>Brian

> On Dec 15, 2020, at 7:59 AM, Stefan Miklosovic 
>  wrote:
> 
> Hi,
> 
> why can't this be achieved with batches? Am I missing something fundamental
> here? Batches may write to different tables, right ... I am just
> missing the point of using triggers for this.
> 
> To add specifics to Brian's first paragraph: this is covered by
> CASSANDRA-13985 -
> https://github.com/apache/cassandra/commit/54de771e643e9cc64d1f5dd28b5de8a9a91a219e
> This will first be introduced in 4.0.
> 
> Stefan
> 
>> On Tue, 15 Dec 2020 at 13:49, Brian Hess  wrote:
>> 
>> One challenge to be aware of is that when you use multiple data centers, the 
>> users can make changes in either data center and those changes will 
>> propagate to the other data center. That is, there is no concept of a 
>> “read-only data center” in Cassandra. That may be fine, but some 
>> organizations want to grant access to the data for analytics but don’t want 
>> those teams to be able to modify the original data. You can, in some cases, 
>> restrict the write access through user/role permissions (the analytics team 
>> only has read access to that table), but that may not work depending on your 
>> use case (but it usually does work).
>> 
>> One comment on Benjamin’s comment below. There is one scenario where the 
>> Trigger could guarantee the data makes it to both tables: specifically, if 
>> both tables reside in the same keyspace and have the same partition key(s). 
>> Mutations in the same keyspace on tables that share a partition key are 
>> merged internally by Cassandra into a single Mutation and always applied 
>> atomically. So, if your second table had exactly the same schema and resided 
>> in the same keyspace (mytable and mytable_analytics, say, both in mykeyspace), 
>> your trigger could duplicate the mutation against the source table into an 
>> exact copy for the second table, and Cassandra would apply both atomically 
>> (they both succeed or they both fail - never just one). In this scenario, the 
>> analytics team could modify data in the second table without affecting the 
>> data in the source table.
>> 
>> >Brian
>> 
>>>> On Dec 15, 2020, at 7:38 AM, pauloricardomg  
>>>> wrote:
>>> 
>>> To extend Paul's point, datacenters in cassandra are logical concepts which
>>> may be useful for your use case and do not necessarily need to be
>>> represented by physical data centers.
>>> 
>>> The presentation mentioned by Andrew, while helpful, covers some concepts
>>> which are specific to Hadoop and may be outdated in more recent versions of
>>> Cassandra.
>>> 
>>> I'd recommend two more recent presentations on the multi-DC topic:
>>> -
>>> https://www.slideshare.net/DataStax/apache-cassandra-multidatacenter-essentials-julien-anguenot-iland-internet-solutions-c-summit-2016
>>> -
>>> https://www.slideshare.net/DataStax/operations-consistency-failover-for-multidc-clusters-alexander-dejanovski-the-last-pickle-cassandra-summit-2016
>>> 
>>> Finally, if you have any more questions on this I'd recommend you send them
>>> to the u...@cassandra.apache.org mailing list as this mailing list (
>>> dev@cassandra.apache.org) is related to the project development of
>>> Cassandra.
>>> 
>>>> On Tue, Dec 15, 2020 at 09:28, Greg Oliver
>>>>  wrote:
>>>> 
>>>> Can't see it in the email. What's the slide #?
>>>> 
>>>> From: Andrew Cobley (Staff) 
>>>> Sent: Tuesday, December 15, 2020 12:26 PM
>>>> To: dev@cassandra.apache.org
>>>> Subject: [EXTERNAL] Re: Triggers
>>>> 
>>>> Yes that's right.  I remember this illustration:
>>>> 
>>>> [diagram omitted; see the linked presentation below]
>>>> 
>>>> 
>>>> From this presentation:
>>>> 
>>>> https://www.slideshare.net/rastrick/presentation-12982302

Re: Triggers

2020-12-15 Thread Brian Hess
One challenge to be aware of is that when you use multiple data centers, the 
users can make changes in either data center and those changes will propagate 
to the other data center. That is, there is no concept of a “read-only data 
center” in Cassandra. That may be fine, but some organizations want to grant 
access to the data for analytics but don’t want those teams to be able to 
modify the original data. You can, in some cases, restrict the write access 
through user/role permissions (the analytics team only has read access to that 
table), but that may not work depending on your use case (but it usually does 
work). 

One comment on Benjamin’s comment below. There is one scenario where the 
Trigger could guarantee the data makes it to both tables: specifically, if both 
tables reside in the same keyspace and have the same partition key(s). 
Mutations in the same keyspace on tables that share a partition key are merged 
internally by Cassandra into a single Mutation and always applied atomically. 
So, if your second table had exactly the same schema and resided in the same 
keyspace (mytable and mytable_analytics, say, both in mykeyspace), your trigger 
could duplicate the mutation against the source table into an exact copy for 
the second table, and Cassandra would apply both atomically (they both succeed 
or they both fail - never just one). In this scenario, the analytics team could 
modify data in the second table without affecting the data in the source table. 
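
To make that scenario concrete, a hypothetical schema sketch: the two tables 
below share a keyspace and a partition key, which is what allows a 
trigger-generated update to be merged into the same Mutation as the original 
write and applied atomically.

  CREATE TABLE mykeyspace.mytable (
    id    uuid,
    ts    timestamp,
    value text,
    PRIMARY KEY ((id), ts)
  );

  -- Same keyspace, same partition key, same schema: updates to this table can
  -- be carried in the same Mutation as updates to mykeyspace.mytable.
  CREATE TABLE mykeyspace.mytable_analytics (
    id    uuid,
    ts    timestamp,
    value text,
    PRIMARY KEY ((id), ts)
  );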

>Brian

> On Dec 15, 2020, at 7:38 AM, pauloricardomg  wrote:
> 
> To extend Paul's point, datacenters in cassandra are logical concepts which
> may be useful for your use case and do not necessarily need to be
> represented by physical data centers.
> 
> The presentation mentioned by Andrew, while helpful, covers some concepts
> which are specific to Hadoop and may be outdated in more recent versions of
> Cassandra.
> 
> I'd recommend two more recent presentations on the multi-DC topic:
> -
> https://www.slideshare.net/DataStax/apache-cassandra-multidatacenter-essentials-julien-anguenot-iland-internet-solutions-c-summit-2016
> -
> https://www.slideshare.net/DataStax/operations-consistency-failover-for-multidc-clusters-alexander-dejanovski-the-last-pickle-cassandra-summit-2016
> 
> Finally, if you have any more questions on this I'd recommend you send them
> to the u...@cassandra.apache.org mailing list as this mailing list (
> dev@cassandra.apache.org) is related to the project development of
> Cassandra.
> 
>> On Tue, Dec 15, 2020 at 09:28, Greg Oliver
>>  wrote:
>> 
>> Can't see it in the email. What's the slide #?
>> 
>> From: Andrew Cobley (Staff) 
>> Sent: Tuesday, December 15, 2020 12:26 PM
>> To: dev@cassandra.apache.org
>> Subject: [EXTERNAL] Re: Triggers
>> 
>> Yes that's right.  I remember this illustration:
>> 
>> [diagram omitted; see the linked presentation below]
>> 
>> 
>> From this presentation:
>> 
>> https://www.slideshare.net/rastrick/presentation-12982302
>> 
>> Might help.
>> 
>> Andy
>> 
>> Andy Cobley
>> Senior Lecturer, Program Director Data Science and Data Engineering MSc
>> School of Science and Engineering, University of Dundee
>> +44 (0)1382 385078 (Not at present) | a.e.cob...@dundee.ac.uk

Re: Why isn't there a separate JVM per table?

2018-02-23 Thread Brian Hess
Something folks haven't raised, but which would be another impediment here: in 
Cassandra, if you submit a batch (logged or unlogged) for two tables in the same 
keyspace with the same partition key, Cassandra collapses them into the same 
Mutation and the two INSERTs are processed atomically. There are a few (maybe 
more than a few) things that take advantage of this fact. 

If you move each table to its own JVM then you cannot really achieve this 
atomicity. So, at most you would want to consider a JVM per keyspace (or 
consider touching a lot of code or changing a pretty fundamental/deep contract 
in Cassandra). 

>Brian

Sent from my iPhone

> On Feb 22, 2018, at 7:10 PM, J. D. Jordan  wrote:
> 
> I would be careful with anything per table for memory sizing. We used to have 
> many caches and things that could be tuned per table, but they have all since 
> changed to being per node, as it was a real PITA to get them right.  Having 
> to do per table heap/gc/memtable/cache tuning just sounds like a usability 
> nightmare.
> 
> -Jeremiah 
> 
> On Feb 22, 2018, at 6:59 PM, kurt greaves  wrote:
> 
>>> 
>>> ... compaction on its own jvm was also something I was thinking about, but
>>> then I realized even more JVM sharding could be done at the table level.
>> 
>> 
>> Compaction in its own JVM makes sense. At the table level I'm not so sure.
>> Gotta be some serious overheads from running that many JVMs.
>> Keyspace might be reasonable purely to isolate bad tables, but for the most
>> part I'd think isolating every table isn't that beneficial and pretty
>> complicated. In most cases people just fix their modelling so that they
>> don't generate large amounts of GC, and hopefully test enough so they know
>> how it will behave in production.
>> 
>> If we did it at the table level we would inevitably have to make each
>> individual table incredibly tunable, which would be a bit tedious IMO.
>> There's no way for us to smartly decide how much heap/memtable space/etc
>> each table should use (not without some decent AI, anyway).
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Getting partition min/max timestamp

2018-01-16 Thread Brian Hess
Jeremiah, this might be the exception, since the value being aggregated is
exactly the same value that determines liveness of the data, and more so since
the aggregation requested is the *max* of the timestamp, given that Cassandra
is Last-Write-Wins (so it looks at the maximum timestamp).  So, you could
actually record the timestamp of the last mutation on each partition and have
an aggregation you can read at consistency levels greater than ONE.
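
A minimal sketch of that idea (table and column names are hypothetical): the
writer records the timestamp of each mutation in a one-row-per-partition side
table, using that same value as the write timestamp so that last-write-wins
keeps the maximum; the "aggregate" is then just a read of that row at QUORUM
or higher.

  CREATE TABLE mykeyspace.partition_last_write (
    id      uuid PRIMARY KEY,   -- same partition key as the data table
    last_ts bigint              -- microseconds, matching the mutation's timestamp
  );

  -- Issued alongside (or batched with) every write to the data table;
  -- bind the same microsecond value to both USING TIMESTAMP and last_ts.
  UPDATE mykeyspace.partition_last_write USING TIMESTAMP ?
    SET last_ts = ? WHERE id = ?;

  -- Read back at QUORUM (or higher): last-write-wins keeps the row at the max.
  SELECT last_ts FROM mykeyspace.partition_last_write WHERE id = ?;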

That said, it will be the timestamp of the last mutation.  That is, it has
to include tombstones, range tombstones, partition tombstones, etc. In other
words, it's quite a bit harder to record the timestamp of the last "live"
value of the data in the partition.

Minimum timestamp is quite a bit harder, especially in the face of Time to
Live operations.  Once the "oldest" timestamped mutation ages off, it's
essentially a full partition scan to find the new minimum timestamp.  It's
also difficult to "break the tie" if two replicas come back with different
minimum timestamps.  The issue is that if some other replica has deleted the
mutation that holds the minimum timestamp, then you would want to discard
this timestamp value, but the only way to do that is a second lookup to
determine what value corresponds to the minimum timestamp and see if the
value is still live.  If it is not live, then how will you determine the
actual minimum?  Also, assume this is the case for the arbitrarily large *N*
minimum timestamps on the replica(s).

TL/DR, while maximum might be doable, minimum does fall into the category
that Jeremiah calls out (it's hard to do aggregations on an eventually
consistent system).

On Sun, Jan 14, 2018 at 5:37 PM, Benedict Elliott Smith  wrote:

> It's a long time since I looked at the code, but I'm pretty sure that
> comment is explaining why we translate *no* timestamp to *epoch*, to save
> space when serializing the encoding stats.  Not stipulating that the data
> may be inaccurate.
>
> However, being such a long time since I looked, I forgot we still only
> apparently store these stats per sstable.  It's not actually immediately
> clear to me if storing them per partition would help tremendously (wrt
> compression, as this data was intended) given you would expect a great deal
> of correlation between partitions.  But they would also be extremely cheap
> to persist per partition, so only a modestly positive impact on compression
> would be needed to justify (or permit) them.
>
> I don't think this use case would probably drive development, but if you
> were to write the patch and demonstrate it approximately maintained present
> data sizes, it's likely such a patch would be accepted.
>
> On 14 January 2018 at 20:33, arhel...@gmail.com 
> wrote:
>
> > First of all, thx for all the ideas.
> >
> > Benedict Elliott Smith, in code comments I found a notice that data in
> > EncodingStats can be wrong, so I'm not sure it's a good idea to use it for
> > accurate results. As I understand it, incorrect data is not a problem for
> > the current use case, but it is for mine. Currently, I added fields to
> > every AtomicBTreePartition. Those fields I update in the addAllWithSizeDelta
> > call, but I now realize I should also think about the case of data removal.
> >
> > I currently don't really care about TTLs, but it's a case I should think
> > about, thx.
> >
> > Jeremiah Jordan, thx for the notice, but I don't really get what you mean
> > about replica aggregation optimizations. Can you please explain it for me?
> >
> > On 2018-01-14 17:16, Benedict Elliott Smith  wrote:
> > > (Obviously, not to detract from the points that Jon and Jeremiah make, i.e.
> > > that if TTLs or tombstones are involved the metadata we have, or can add,
> > > is going to be worthless in most cases anyway)
> > >
> > > On 14 January 2018 at 16:11, Benedict Elliott Smith <
> bened...@apache.org
> > >
> > > wrote:
> > >
> > > > We already store the minimum timestamp in the EncodingStats of each
> > > > partition, to support more efficient encoding of atom timestamps.  This
> > > > just isn't exposed beyond UnfilteredRowIterator, though it probably could
> > > > be.
> > > >
> > > > Storing the max alongside would still require justification, though its
> > > > cost would actually be fairly nominal (probably only a few bytes; it
> > > > depends on how far apart min/max are).
> > > >
> > > > I'm not sure (IMO) that even a fairly nominal cost could be justified
> > > > unless there were widespread benefit though, which I'm not sure this
> > > > would provide.  Maintaining a patched variant of your own that stores
> > > > this probably wouldn't be too hard, though.
> > > >
> > > > In the meantime, exposing and utilising the minimum timestamp from
> > > > EncodingStats is probably a good place to start to explore the viability
> > > > of the approach.
> > > >
> > > > On 14 January 2018 at 15:34, 

Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Brian Hess
This is not true. 

You cannot provide a ConsistencyLevel for the Materialized Views on a table 
when you do a write. That is, you do not explicitly write to a Materialized 
View, but implicitly write to it via the base table. There is no consistency 
guarantee other than eventual consistency between the base table and the 
Materialized View. That is, the coordinator only acknowledges the write when 
the proper number of replicas in the base table have acknowledged successful 
writing. There is no waiting or acknowledgement for any Materialized Views on 
that table. 

Therefore, while you can specify a Consistency Level on read (since you are 
reading directly from the Materialized View as a table), you cannot specify a 
Consistency Level on write for the Materialized View. So, you cannot apply the 
R+W>RF formula. 
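
A small sketch of what that means in practice (schema hypothetical): the view 
below is only ever written through its base table, so a client can pick a 
consistency level for reads against the view, but there is no knob for how the 
view itself gets updated.

  CREATE TABLE mykeyspace.users (
    id    uuid PRIMARY KEY,
    email text
  );

  CREATE MATERIALIZED VIEW mykeyspace.users_by_email AS
    SELECT email, id FROM mykeyspace.users
    WHERE email IS NOT NULL AND id IS NOT NULL
    PRIMARY KEY (email, id);

  -- The write consistency level applies only to the base-table replicas;
  -- the view is updated asynchronously afterwards.
  INSERT INTO mykeyspace.users (id, email) VALUES (?, ?);

  -- The read consistency level applies to the view's replicas, but R+W>RF
  -- cannot be reasoned about because no write CL was ever applied to the view.
  SELECT id FROM mykeyspace.users_by_email WHERE email = ?;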

>Brian

> On Feb 10, 2017, at 3:17 AM, Kant Kodali  wrote:
> 
> thanks!
> 
> On Thu, Feb 9, 2017 at 8:51 PM, Benjamin Roth 
> wrote:
> 
>> Yes it is
>> 
>> Am 10.02.2017 00:46 schrieb "Kant Kodali" :
>> 
>>> If reading from a materialized view with a consistency level of quorum, am I
>>> guaranteed to have the most recent view? In other words, is the w + r > n
>>> contract maintained for MVs as well, for both reads and writes?
>>> 
>>> Thanks!
>>> 
>> 


Re: [VOTE] Release Apache Cassandra 3.8

2016-07-21 Thread Brian Hess
I have no vote here, but I think that #2 is not a good idea here.  We would
be implicitly releasing an "odd" release with new features, which is more
than a little confusing.  I do think that CDC is important, so I don't like
#4 (but for less important reasons than #2).  So, I'm good with options 3,
5, or 6 (and perhaps 1, but I don't know enough about its severity to
endorse/not endorse option 1).  Basically, please don't do #2 (and I'd like
it if you didn't do #4)  :)

On Thu, Jul 21, 2016 at 10:58 AM, Aleksey Yeschenko 
wrote:

> What we’d usually do is revert the offending ticket and push it to the
> next release, if this indeed were significant enough.
>
> So option 4 would be to revert CDC fast (painful) and ship.
> Option 5 would be to quickly fix the issue, retag, and revote, with 3.9
> still following up on schedule.
> Option 6 would be to ignore the calendar entirely. Fix or revert the issue
> eventually, and release 3.8 then. Have 3.9 and 3.0.9 out at whatever time
> we decide to, and go back to monthly cycles from there on.
>
> TBH I don’t think anybody is even going to notice, or care. So I’m fine
> with 1, 4, 5, 6, but not reverting my +1 so far.
>
> --
> AY
>
> On 21 July 2016 at 14:46:17, Sylvain Lebresne (sylv...@datastax.com)
> wrote:
>
> On Thu, Jul 21, 2016 at 3:21 PM, Jonathan Ellis  wrote:
>
> > I see the alternatives as:
> >
> > 1. Release this as 3.8
> > 2. Skip 3.8 and release 3.9 next month on schedule
> > 3. Skip this month and release 3.8 next month instead
> >
>
> I've hopefully made it clear I don't really like 1. I'm totally fine with
> either 2 or 3 though (with a very very small preference for 3. because I
> suspect skipping a release might confuse a few users, but also knowing that
> 2. has the small advantage of keeping the 3.0.x and 3.x versions released
> more or less in lockstep).
>
>
>
> >
> > On Thu, Jul 21, 2016 at 8:19 AM, Aleksey Yeschenko 
> > wrote:
> >
> > > I still think the issue is minor enough, and with 3.8 being extremely
> > > delayed, and being a non-odd release, at that, we’d be better off just
> > > pushing it.
> > >
> > > Also, I know we’ve been easy on -1s when voting on releases, but I want
> > to
> > > remind people in general that release votes can not be vetoed and only
> > > require a majority of binding votes,
> > > http://www.apache.org/foundation/voting.html#ReleaseVotes
> > >
> > > --
> > > AY
> > >
> > > On 21 July 2016 at 08:57:22, Sylvain Lebresne (sylv...@datastax.com)
> > > wrote:
> > >
> > > Sorry but I'm (binding) -1 on this because of
> > > https://issues.apache.org/jira/browse/CASSANDRA-12236.
> > >
> > > I disagree that knowingly releasing a version that will temporarily break
> > > in-flight queries during upgrade, even if it's for a very short time-frame
> > > until re-connection, is ok. I'll note in particular that in the test
> > > report, there are 74(!) failures in the upgrade tests (for reference, the
> > > 3.7 test report had only 2 upgrade test failures, both with open tickets).
> > > Given that we have a known problem during upgrade, I don't really buy the
> > > "We are assuming these are due to a recent downsize in instance size that
> > > these tests run on", and that suggests to me the problem is not too minor.
> > >
> > >
> > > On Thu, Jul 21, 2016 at 6:18 AM, Dave Brosius <
> dbros...@mebigfatguy.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > >
> > > > On 07/20/2016 05:48 PM, Michael Shuler wrote:
> > > >
> > > >> I propose the following artifacts for release as 3.8.
> > > >>
> > > >> sha1: c3ded0551f538f7845602b27d53240cd8129265c
> > > >> Git:
> > > >>
> > > >>
> > >
> >
> http://git-wip-us.apache.org/repos/asf?p=cassandra.git;a=shortlog;h=refs/tags/3.8-tentative
> > > >> Artifacts:
> > > >>
> > > >>
> > >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1123/org/apache/cassandra/apache-cassandra/3.8/
> > > >> Staging repository:
> > > >>
> > > >>
> > >
> >
> https://repository.apache.org/content/repositories/orgapachecassandra-1123/
> > > >>
> > > >> The debian packages are available here:
> > > >> http://people.apache.org/~mshuler/
> > > >>
> > > >> The vote will be open for 72 hours (longer if needed).
> > > >>
> > > >> [1]: http://goo.gl/oGNH0i (CHANGES.txt)
> > > >> [2]: http://goo.gl/KjMtUn (NEWS.txt)
> > > >> [3]: https://goo.gl/TxVLKo (3.8 Test Summary)
> > > >>
> > > >>
> > > >
> > >
> >
> >
> >
> > --
> > Jonathan Ellis
> > Project Chair, Apache Cassandra
> > co-founder, http://www.datastax.com
> > @spyced
> >
>


Re: Does Cassandra CQL supports 'Create Table as Select'?

2015-05-20 Thread Brian Hess
Not as of yet. See https://issues.apache.org/jira/browse/CASSANDRA-8234

Also, that use case is a lot like Materialized Views. See 
https://issues.apache.org/jira/browse/CASSANDRA-6477
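
In the meantime, a common workaround (sketched below with hypothetical names) is 
to create the target table with the new partition key yourself and copy the data 
over externally, e.g. with cqlsh's COPY for small tables or Spark for larger ones:

  -- Existing table, partitioned by id:
  --   CREATE TABLE ks.users (id uuid PRIMARY KEY, email text, name text);

  -- Target table: same columns, different partition key.
  CREATE TABLE ks.users_by_email (
    email text,
    id    uuid,
    name  text,
    PRIMARY KEY ((email), id)
  );

  -- Client-side copy from cqlsh (not atomic, and not kept in sync afterwards):
  COPY ks.users (email, id, name) TO 'users.csv';
  COPY ks.users_by_email (email, id, name) FROM 'users.csv';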

Brian

Sent from my iPhone

 On May 20, 2015, at 1:23 AM, amit tewari amittewar...@gmail.com wrote:
 
 Hi
 
 We would like to have the ability to create new tables from
 existing tables, but with a new/different partition key.
 
 Can this be done from CQL?
 
 Thanks
 Amit


Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-11 Thread Brian Hess
One thing that does jump out at me, though, is about CQL2.  As much as we
have advised against using cassandra-jdbc, I have encountered a few that
actually have used that as an integration point.  I believe that
cassandra-jdbc is CQL2-based, which is the main reason we have been
advising folks against it.

Can we just confirm that there isn't in fact widespread use of CQL2-based
cassandra-jdbc?  That just jumps out at me.

On Mon, May 11, 2015 at 2:59 PM, Aleksey Yeschenko alek...@apache.org
wrote:

  So I think EOLing 2.0.x when 2.2 comes
  out is reasonable, especially considering that 2.2 is realistically a
 month
  or two away even if we can get a beta out this week.

 Given how long 2.0.x has been alive now, and the stability of 2.1.x at the
 moment, I’d say it’s fair enough to EOL 2.0 as soon as 2.2 gets out. Can’t
 argue here.

  If push comes to shove I'm okay being ambiguous here, but can we just
 say
  when 3.0 is released we EOL 2.1?

 Under our current projections, that’ll be exactly “a few months after 2.2
 is released”, so I’m again fine with it.

  P.S. The area I'm most concerned about introducing destabilizing changes
 in
  2.2 is commitlog

 So long as you don’t use compressed CL, you should be solid. You are
 probably solid even if you do use compressed CL.

 Here are my only concerns:

 1. New authz are not opt-in. If a user implements their own custom
 authenticator or authorizer, they’d have to upgrade them sooner. The test
 coverage for new authnz, however, is better than the coverage we used to
 have before.

 2. CQL2 is gone from 2.2. Might force those who use it to migrate faster. In
 practice, however, I highly doubt that anybody using CQL2 is also someone
 who’d already switch to 2.1.x or 2.2.x.


 --
 AY

 On May 11, 2015 at 21:12:26, Jonathan Ellis (jbel...@gmail.com) wrote:

 On Sun, May 10, 2015 at 2:42 PM, Aleksey Yeschenko alek...@apache.org
 wrote:

  3.0, however, will require a stabilisation period, just by the nature of
  it. It might seem like 2.2 and 3.0 are closer to each other than 2.1 and
  2.2 are, if you go purely by the feature list, but in fact the opposite
 is
  true.
 

 You are probably right. But let me push back on some of the extra work
 you're proposing just a little:

 1) 2.0.x branch goes EOL when 3.0 is out, as planned
 

 3.0 was, however unrealistically, planned for April. And it's moving the
 goalposts to say the plan was always to keep 2.0.x for three major
 releases; the plan was to EOL with the next major release after 2.1
 whether that was called 3.0 or not. So I think EOLing 2.0.x when 2.2 comes
 out is reasonable, especially considering that 2.2 is realistically a month
 or two away even if we can get a beta out this week.

 2) 3.0.x LTS branch stays, as planned, and helps us stabilise the new
  storage engine
 

 Yes.


  3) in a few months after 2.2 gets released, we EOL 2.1. Users upgrade to
  2.2, get the same stability as with 2.1.7, plus a few new features
 

 If push comes to shove I'm okay being ambiguous here, but can we just say
 when 3.0 is released we EOL 2.1?

 P.S. The area I'm most concerned about introducing destabilizing changes in
 2.2 is commitlog; I will follow up to make sure we have a solid QA plan
 there.

 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder, http://www.datastax.com
 @spyced



Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-11 Thread Brian Hess
Jeremiah - still need to worry about whether folks are doing CQL2 or CQL3
over cassandra-jdbc.

If it is not in much use, that's fine by me.  I just wanted to raise one
place where folks might be using CQL2 without realizing it.

On Mon, May 11, 2015 at 4:00 PM, Jeremiah Jordan jerem...@datastax.com
wrote:

 Cassandra-jdbc can do cql3 as well as cql2. The rub (and why I would never
 recommend it) is that it does cql3 over thrift. So you lose out on all the
 native protocol features.



  On May 11, 2015, at 2:53 PM, Brian Hess brianmh...@gmail.com wrote:
 
  One thing that does jump out at me, though, is about CQL2.  As much as we
  have advised against using cassandra-jdbc, I have encountered a few that
  actually have used that as an integration point.  I believe that
  cassandra-jdbc is CQL2-based, which is the main reason we have been
  advising folks against it.
 
  Can we just confirm that there isn't in fact widespread use of CQL2-based
  cassandra-jdbc?  That just jumps out at me.
 
  On Mon, May 11, 2015 at 2:59 PM, Aleksey Yeschenko alek...@apache.org
  wrote:
 
  So I think EOLing 2.0.x when 2.2 comes
  out is reasonable, especially considering that 2.2 is realistically a
  month
  or two away even if we can get a beta out this week.
 
  Given how long 2.0.x has been alive now, and the stability of 2.1.x at
 the
  moment, I’d say it’s fair enough to EOL 2.0 as soon as 2.2 gets out.
 Can’t
  argue here.
 
  If push comes to shove I'm okay being ambiguous here, but can we just
  say
  when 3.0 is released we EOL 2.1?
 
  Under our current projections, that’ll be exactly “a few months after
 2.2
  is released”, so I’m again fine with it.
 
  P.S. The area I'm most concerned about introducing destabilizing
 changes
  in
  2.2 is commitlog
 
  So long as you don’t use compressed CL, you should be solid. You are
  probably solid even if you do use compressed CL.
 
  Here are my only concerns:
 
  1. New authz are not opt-in. If a user implements their own custom
  authenticator or authorizer, they’d have to upgrade them sooner. The
 test
  coverage for new authnz, however, is better than the coverage we used to
  have before.
 
  2. CQL2 is gone from 2.2. Might force those who use it to migrate faster.
 In
  practice, however, I highly doubt that anybody using CQL2 is also
 someone
  who’d already switch to 2.1.x or 2.2.x.
 
 
  --
  AY
 
  On May 11, 2015 at 21:12:26, Jonathan Ellis (jbel...@gmail.com) wrote:
 
  On Sun, May 10, 2015 at 2:42 PM, Aleksey Yeschenko alek...@apache.org
  wrote:
 
  3.0, however, will require a stabilisation period, just by the nature
 of
  it. It might seem like 2.2 and 3.0 are closer to each other than 2.1
 and
  2.2 are, if you go purely by the feature list, but in fact the opposite
  is
  true.
 
  You are probably right. But let me push back on some of the extra work
  you're proposing just a little:
 
  1) 2.0.x branch goes EOL when 3.0 is out, as planned
 
  3.0 was, however unrealistically, planned for April. And it's moving the
  goalposts to say the plan was always to keep 2.0.x for three major
  releases; the plan was to EOL with the next major release after 2.1
  whether that was called 3.0 or not. So I think EOLing 2.0.x when 2.2
 comes
  out is reasonable, especially considering that 2.2 is realistically a
 month
  or two away even if we can get a beta out this week.
 
  2) 3.0.x LTS branch stays, as planned, and helps us stabilise the new
  storage engine
 
  Yes.
 
 
  3) in a few months after 2.2 gets released, we EOL 2.1. Users upgrade
 to
  2.2, get the same stability as with 2.1.7, plus a few new features
 
  If push comes to shove I'm okay being ambiguous here, but can we just
 say
  when 3.0 is released we EOL 2.1?
 
  P.S. The area I'm most concerned about introducing destabilizing
 changes in
  2.2 is commitlog; I will follow up to make sure we have a solid QA plan
  there.
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder, http://www.datastax.com
  @spyced